Ilias Tsagklis

About Ilias Tsagklis

Ilias Tsagklis is a senior software engineer working in the telecom domain. He develops a wide variety of applications and services. Ilias is co-founder and Executive Editor at Java Code Geeks.

Spring meets Apache Hadoop

SpringSource has just announced the first GA release of Spring for Apache Hadoop. The goal of this project is to simplify the development of Hadoop-based applications.

You may download the project here and check out the Maven artifacts here.

Spring for Apache Hadoop was created to address the problem of poorly constructed Hadoop applications, which usually consist of command-line utilities, scripts and pieces of code stitched together. As expected from a Spring project, it provides a consistent programming and configuration model across a wide range of Hadoop ecosystem projects.

The well-known Template API design pattern is also embraced here, so the framework includes Template-style helper classes for several of those ecosystem projects.
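To make the Template idea concrete, here is a minimal plain-JDK sketch. It is not Spring for Apache Hadoop's API: `FakeConnection` and `FakeTemplate` are hypothetical names invented for illustration. The point is only that the template owns the resource lifecycle (open, close) while the caller supplies the work as a callback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class TemplateSketch {

    /** Hypothetical stand-in for a client connection (not a Hadoop API). */
    static class FakeConnection implements AutoCloseable {
        private final List<String> events;

        FakeConnection(List<String> events) {
            this.events = events;
            events.add("open");
        }

        String query(String q) {
            events.add("query:" + q);
            return "result of " + q;
        }

        @Override
        public void close() {
            events.add("close");
        }
    }

    /** Hypothetical template: opens and closes the resource around the callback. */
    static class FakeTemplate {
        private final List<String> events;

        FakeTemplate(List<String> events) {
            this.events = events;
        }

        <T> T execute(Function<FakeConnection, T> action) {
            // try-with-resources guarantees close() even if the callback throws
            try (FakeConnection connection = new FakeConnection(events)) {
                return action.apply(connection);
            }
        }
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        FakeTemplate template = new FakeTemplate(events);

        // The caller writes only the interesting part: the query itself.
        String result = template.execute(c -> c.query("show tables"));

        System.out.println(result);
        System.out.println(events);
    }
}
```

The caller never touches open or close, which is the boilerplate a template is meant to absorb.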

Another embraced aspect is the approach of starting small and growing into complex solutions. To that end, Spring for Hadoop introduces various Runner classes that execute Hive and Pig scripts, vanilla Map/Reduce or Streaming jobs, and Cascading flows. They can also invoke generic JVM-based scripts before and after a job, all through the familiar JDK Callable contract.
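The value of building on the JDK `Callable` contract can be sketched with the standard library alone. The `SimpleRunner` class below is a hypothetical illustration, not one of the framework's Runner classes: it runs a list of pre-actions, then a main task, then a list of post-actions, and is itself a `Callable`, so it can be nested or handed to an `ExecutorService`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

public class RunnerSketch {

    /** Hypothetical runner: pre-actions, main task, post-actions, all plain Callables. */
    static class SimpleRunner<T> implements Callable<T> {
        private final List<Callable<?>> preActions;
        private final Callable<T> task;
        private final List<Callable<?>> postActions;

        SimpleRunner(List<Callable<?>> preActions, Callable<T> task, List<Callable<?>> postActions) {
            this.preActions = preActions;
            this.task = task;
            this.postActions = postActions;
        }

        @Override
        public T call() throws Exception {
            for (Callable<?> pre : preActions) {
                pre.call();
            }
            T result = task.call();
            // Post-actions run only if the task completed without throwing.
            for (Callable<?> post : postActions) {
                post.call();
            }
            return result;
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> log = new ArrayList<>();

        List<Callable<?>> pre = new ArrayList<>();
        pre.add(() -> log.add("clean output dir"));

        List<Callable<?>> post = new ArrayList<>();
        post.add(() -> log.add("publish results"));

        // The "job" here is a placeholder that just records that it ran.
        Callable<Integer> job = () -> {
            log.add("run job");
            return 0;
        };

        SimpleRunner<Integer> runner = new SimpleRunner<>(pre, job, post);
        Integer exitCode = runner.call();

        System.out.println(exitCode);
        System.out.println(log);
    }
}
```

Because every step shares one interface, swapping a script for a Map/Reduce job, or adding a cleanup step, changes the wiring and not the calling code.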

When things start to get more complex, upgrading to Spring Batch is straightforward. Spring Batch's rich functionality for handling the ETL processing of large files translates directly into Hadoop use cases for the ingestion and export of files from HDFS.

Also, using Spring Hadoop in combination with Spring Integration allows for rich processing of event streams, which can be transformed, enriched and filtered before being read from or written to HDFS or other storage such as NoSQL stores, for which Spring Data provides plenty of support.
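As a rough illustration of the transform, filter and enrich stages mentioned above, the following sketch uses only `java.util.stream`. It does not use Spring Integration or any Hadoop API; the `process` method, the `sourceTag` parameter and the sample events are assumptions made up for this example, and the final write to HDFS or a NoSQL store is left as a comment.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class EventPipelineSketch {

    /** Returns the lines a downstream writer would persist. */
    static List<String> process(List<String> rawEvents, String sourceTag) {
        return rawEvents.stream()
                .map(String::trim)               // transform: normalize whitespace
                .filter(e -> !e.isEmpty())       // filter: drop blank events
                .map(e -> sourceTag + "\t" + e)  // enrich: attach the origin of the event
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList(" login alice ", "", "logout bob");

        // A real pipeline would hand these lines to an HDFS or NoSQL writer.
        process(raw, "web-01").forEach(System.out::println);
    }
}
```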

To kick-start your applications, you can start with the sample apps provided (already compiled and ready for download). If you test drive Spring for Hadoop, let us know and share the knowledge.

Happy coding!
