Tag Archives: Apache Hadoop

Hadoop setup on single node and multi node
We will describe how to set up Hadoop on a single node and on multiple nodes. The Hadoop environment setup and configuration will be described in detail. First you need to download the following ...

How Hadoop Works? HDFS case study
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed ...

Ganglia configuration for a small Hadoop cluster and some troubleshooting
Ganglia is an open-source, scalable and distributed monitoring system for large clusters. It collects, aggregates and provides time-series views of tens of machine-related metrics such ...

Hadoop Books Giveaway – Roundup
Fellow geeks, our giveaway of Packt Publishing’s books on Apache Hadoop has ended. You may find the original post for the competition here. The Prize Winners: The 6 lucky winners that ...

Spring meets Apache Hadoop
SpringSource has just announced the first GA release of Spring for Apache Hadoop. The goal of this project is to simplify the development of Hadoop-based applications. You may download ...

Hadoop Hangover: Launch a hadoop cluster CDH4 using Apache Whirr
This post is about how to launch a CDH4 MRv1 or CDH4 YARN cluster on EC2 instances. It’s said that, with the help of Whirr, you can launch a cluster in a matter of 5 minutes! ...

MapReduce Algorithms – Secondary Sorting
We continue with our series on implementing MapReduce algorithms found in the Data-Intensive Text Processing with MapReduce book. Other posts in this series: Working Through Data-Intensive ...

MapReduce Algorithms – Order Inversion
This post is another segment in the series presenting MapReduce algorithms as found in the Data-Intensive Text Processing with MapReduce book. Previous installments are Local Aggregation, ...

Calculating A Co-Occurrence Matrix with Hadoop
This post continues our series on implementing the MapReduce algorithms found in the Data-Intensive Text Processing with MapReduce book. This time we will be creating a word co-occurrence ...

Hadoop Single Node Set Up
With this post I hope to share the procedure for setting up Apache Hadoop on a single node. Hadoop is used for dealing with Big Data sets where deployment happens on low-cost commodity ...


