About Pascal Alma

Pascal is a senior JEE Developer and Architect at 4Synergy in The Netherlands. Pascal has been designing and building J2EE applications since 2001. He is particularly interested in Open Source toolstack (Mule, Spring Framework, JBoss) and technologies like Web Services, SOA and Cloud technologies. Specialties: JEE, SOA, Mule ESB, Maven, Cloud Technology, Amazon AWS.

Unit testing a Java Hadoop job

In my previous post I showed how to setup a complete Maven based project to create a Hadoop job in Java. Of course it wasn’t complete because it is missing the unit test part . In this post I show how to add MapReduce unit tests to the project I started previously. For the unit test I make use of the MRUnit framework.

    • Add the necessary dependency to the pom

Add the following dependency to the pom:
 
 
 

<dependency>
   <groupId>org.apache.mrunit</groupId>
   <artifactId>mrunit</artifactId>
   <version>1.0.0</version>
   <classifier>hadoop1</classifier>
   <scope>test</scope>
</dependency>

This will made the MRunit framework available to the project.

    • Add Unit tests for testing the Map Reduce logic

The use of this framework is quite straightforward, especially in our business case. So I will just show the unit test code and some comments if necessary but I think it is quite obvious how to use it. The unit test for the Mapper ‘MapperTest’:

package net.pascalalma.hadoop;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Before;
import org.junit.Test;
import java.io.IOException;

/**
 * Created with IntelliJ IDEA.
 * User: pascal
 */
public class MapperTest {

    MapDriver<Text, Text, Text, Text> mapDriver;

    @Before
    public void setUp() {
        WordMapper mapper = new WordMapper();
        mapDriver = MapDriver.newMapDriver(mapper);
    }

    @Test
    public void testMapper() throws IOException {
        mapDriver.withInput(new Text("a"), new Text("ein"));
        mapDriver.withInput(new Text("a"), new Text("zwei"));
        mapDriver.withInput(new Text("c"), new Text("drei"));
        mapDriver.withOutput(new Text("a"), new Text("ein"));
        mapDriver.withOutput(new Text("a"), new Text("zwei"));
        mapDriver.withOutput(new Text("c"), new Text("drei"));
        mapDriver.runTest();
    }
}

This test class is actually even simpler than the Mapper implementation itself. You just define the input of the mapper and the expected output and then let the configured MapDriver run the test. In our case the Mapper doesn’t do anything specific but you see how easy it is to setup a testcase. For completeness here is the test class of the Reducer:

package net.pascalalma.hadoop;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.junit.Before;
import org.junit.Test;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/**
 * Created with IntelliJ IDEA.
 * User: pascal
 */
public class ReducerTest {

    ReduceDriver<Text, Text, Text, Text> reduceDriver;

    @Before
    public void setUp() {
        AllTranslationsReducer reducer = new AllTranslationsReducer();
        reduceDriver = ReduceDriver.newReduceDriver(reducer);
    }

    @Test
    public void testReducer() throws IOException {
        List<Text> values = new ArrayList<Text>();
        values.add(new Text("ein"));
        values.add(new Text("zwei"));
        reduceDriver.withInput(new Text("a"), values);
        reduceDriver.withOutput(new Text("a"), new Text("|ein|zwei"));
        reduceDriver.runTest();
    }
}
    • Run the unit tests it

With the Maven command “mvn clean test” we can run the tests:

screen-shot-2013-08-23-at-20-12-50

With the unit tests in place I would say we are ready to build the project and deploy it to an Hadoop cluster, which I will describe in the next post.
 

Reference: Unit testing a Java Hadoop job from our JCG partner Pascal Alma at the The Pragmatic Integrator blog.

Do you want to know how to develop your skillset to become a Java Rockstar?

Subscribe to our newsletter to start Rocking right now!

To get you started we give you two of our best selling eBooks for FREE!

JPA Mini Book

Learn how to leverage the power of JPA in order to create robust and flexible Java applications. With this Mini Book, you will get introduced to JPA and smoothly transition to more advanced concepts.

JVM Troubleshooting Guide

The Java virtual machine is really the foundation of any Java EE platform. Learn how to master it with this advanced guide!

Given email address is already subscribed, thank you!
Oops. Something went wrong. Please try again later.
Please provide a valid email address.
Thank you, your sign-up request was successful! Please check your e-mail inbox.
Please complete the CAPTCHA.
Please fill in the required fields.

Leave a Reply


4 − four =



Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy | Contact
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
Do you want to know how to develop your skillset and become a ...
Java Rockstar?

Subscribe to our newsletter to start Rocking right now!

To get you started we give you two of our best selling eBooks for FREE!

Get ready to Rock!
You can download the complementary eBooks using the links below:
Close