
Tag Archives: MapReduce

Couchbase 101: Create views (MapReduce) from your Java application


When you are developing a new application with Couchbase 2.0, you sometimes need to create views dynamically from your code. For example, you may need this when installing your application, writing tests, or building a framework that has to create views dynamically to query data. This post shows how to ...


MapReduce Algorithms – Order Inversion


This post is another segment in the series presenting MapReduce algorithms as found in the Data-Intensive Text Processing with MapReduce book. Previous installments are Local Aggregation, Local Aggregation Part II, and Creating a Co-Occurrence Matrix. This time we will discuss the order inversion pattern. The order inversion pattern exploits the sorting phase of MapReduce to push data needed for calculations to ...


Calculating A Co-Occurrence Matrix with Hadoop


This post continues our series implementing the MapReduce algorithms found in the Data-Intensive Text Processing with MapReduce book. This time we will be creating a word co-occurrence matrix from a corpus of text. Previous posts in this series are: Working Through Data-Intensive Text Processing with MapReduce; Working Through Data-Intensive Text Processing with MapReduce – Local Aggregation Part II ...


MapReduce: Working Through Data-Intensive Text Processing


It has been a while since I last posted, as I’ve been busy with some of the classes offered by Coursera. There are some very interesting offerings, and they are worth a look. Some time ago, I purchased Data-Intensive Text Processing with MapReduce by Jimmy Lin and Chris Dyer. The book presents several key MapReduce algorithms, but in pseudocode format. My ...


Processing 10 million messages with Akka


Akka Actors promise concurrency. What better way to test that than to see how much time it takes to process 10 million messages using commodity hardware and software, without any low-level tuning? I wrote the entire 10-million-message processing program in Java, and the overall results astonished me. When I ran the program on my iMac with an Intel ...


MapReduce Questions and Answers Part 2


4 Inverted Indexing for Text Retrieval: The chapter contains a lot of details about integer encoding and compression. Since these topics are not directly about MapReduce, I wrote no questions about them. 4.4 Inverted Indexing: Revised Implementation. Explain the inverted index retrieval algorithm. You may assume that each document fits into memory. Assume also that there is a huge ...


MapReduce Questions and Answers Part 1


With all the hype and buzz surrounding NoSQL, I decided to have a look at it. I quickly found that there is not one NoSQL I could learn. Rather, there are various solutions with different purposes and trade-offs. Those solutions tend to have one thing in common: processing of data in NoSQL storage is usually done using ...


MapReduce for dummies


Continuing our coverage of Hadoop components, we will go through the MapReduce component. MapReduce is a concept that comes from the programming model of LISP. But before we jump into MapReduce, let's start with an example to understand how MapReduce works. Given a couple of sentences, write a program that counts the number of words. Now, the traditional thinking when solving ...


Joins with Map Reduce


I have been reading about the join implementations available for Hadoop for the past few days. In this post I recap some techniques I learnt during the process. Joins can be done on either the map side or the reduce side, depending on the nature of the data sets to be joined. Reduce Side Join: Let’s take the following tables containing employee and ...
