Tag Archives: Apache Hadoop

Hadoop Books Giveaway – Roundup

Fellow geeks, our giveaway of Packt Publishing’s books on Apache Hadoop has ended. You may find the original post for the competition here. The Prize Winners: the 6 lucky winners who will receive the book prizes are (names as they appeared in their emails): Hadoop Real-World Solutions Cookbook: Sellamuthu, Rudra Moorthy Josep Ventura Argerich. Hadoop Beginner’s Guide: Bhakti Rajdev Manuel ...

Spring meets Apache Hadoop

SpringSource has just announced the first GA release of Spring for Apache Hadoop. The goal of this project is to simplify the development of Hadoop-based applications. You may download the project here and check out the Maven artifacts here. Spring for Apache Hadoop was born to resolve the problem of poorly constructed Hadoop applications, which usually consist of ...

MapReduce Algorithms – Secondary Sorting

We continue our series on implementing the MapReduce algorithms found in the book Data-Intensive Text Processing with MapReduce. Other posts in this series: Working Through Data-Intensive Text Processing with MapReduce; Working Through Data-Intensive Text Processing with MapReduce – Local Aggregation Part II; Calculating A Co-Occurrence Matrix with Hadoop; MapReduce Algorithms – Order Inversion. This post covers the pattern ...

MapReduce Algorithms – Order Inversion

This post is another segment in the series presenting MapReduce algorithms as found in the book Data-Intensive Text Processing with MapReduce. Previous installments are Local Aggregation, Local Aggregation Part II and Creating a Co-Occurrence Matrix. This time we will discuss the order inversion pattern. The order inversion pattern exploits the sorting phase of MapReduce to push data needed for calculations to ...

Calculating A Co-Occurrence Matrix with Hadoop

This post continues our series on implementing the MapReduce algorithms found in the book Data-Intensive Text Processing with MapReduce. This time we will be creating a word co-occurrence matrix from a corpus of text. Previous posts in this series are: Working Through Data-Intensive Text Processing with MapReduce; Working Through Data-Intensive Text Processing with MapReduce – Local Aggregation Part II ...

Hadoop Single Node Set Up

With this post I hope to share the procedure for setting up Apache Hadoop on a single node. Hadoop is used for dealing with Big Data sets, with deployment on low-cost commodity hardware. It is a map-reduce framework which maps segments of a job among the nodes in a cluster for execution. Though we will not see the ...

Hadoop + Amazon EC2 – An updated tutorial

There is an old tutorial on Hadoop’s wiki page: http://wiki.apache.org/hadoop/AmazonEC2, but I recently had to follow it and noticed that it doesn’t cover some new Amazon functionality. To follow this tutorial, it is recommended that you are already familiar with the basics of Hadoop; a very useful ‘how to start’ tutorial can be found at Hadoop’s homepage: http://hadoop.apache.org/. ...

Testing Hadoop Programs with MRUnit

This post will take a slight detour from implementing the patterns found in Data-Intensive Text Processing with MapReduce to discuss something equally important: testing. I was inspired in part by a presentation by Tom Wheeler that I attended at the 2012 Strata/Hadoop World conference in New York. When working with large data sets, unit testing might not be the first ...

Distributed Apache Flume Setup With an HDFS Sink

I have recently spent a few days getting up to speed with Flume, Cloudera’s distributed log offering. If you deal with lots of logs and haven’t seen it, you are definitely missing out on a fantastic project. I’m not going to spend time talking about it because you can read more about it in the user guide or in the ...
