Ilias Tsagklis

About Ilias Tsagklis

Ilias Tsagklis is a senior software engineer working in the telecom domain. He is an applications developer in a wide variety of applications/services. Ilias is co-founder and Executive Editor at Java Code Geeks.

Spring meets Apache Hadoop

SpringSource has just announced the first GA release of Spring for Apache Hadoop. The goal of this project is to simplify the development of Hadoop based applications.

You may download the project here and check out the Maven artifacts here.

Spring for Apache Hadoop was born to resolve the issue of having poorly constructed Hadoop applications, which usually consist of command line utilities, scripts and pieces of code stitched together. It provides a consistent programming and configuration model across a wide range of Hadoop ecosystem projects, as expected from a Spring project.

The well known Template API design pattern is also embraced here, so the framework includes classes like:

Another embraced aspect is the approach of starting small and growing into complex solutions. So, Spring for Hadoop introduces various Runner classes which allow the execution of Hive, Pig scripts, vanilla Map/Reduce or Streaming jobs, Cascading flows but also invocation of pre and post generic JVM-based scripting all through the familiar JDK Callable contract.

When things start to get more complex, upgrading to Spring Batch is straightforward and easy. Spring Batch’s rich functionality for handling the ETL processing of large file translates directly into Hadoop use cases for the ingestion and export of files form HDFS.

Also, the use of Spring Hadoop in combination with Spring Integration allows for rich processing of event streams that can be transformed, enriched, filtered, before being read and written from HDFS or other storages such as NoSQL stores, for which Spring Data provides plenty of support.

To kick-start your applications, you can start with the sample apps provided (already compiled and ready for download). If you test drive Spring for Hadoop, let us know and share the knowledge.

Happy coding!

Do you want to know how to develop your skillset to become a Java Rockstar?

Subscribe to our newsletter to start Rocking right now!

To get you started we give you two of our best selling eBooks for FREE!

JPA Mini Book

Learn how to leverage the power of JPA in order to create robust and flexible Java applications. With this Mini Book, you will get introduced to JPA and smoothly transition to more advanced concepts.

JVM Troubleshooting Guide

The Java virtual machine is really the foundation of any Java EE platform. Learn how to master it with this advanced guide!

Given email address is already subscribed, thank you!
Oops. Something went wrong. Please try again later.
Please provide a valid email address.
Thank you, your sign-up request was successful! Please check your e-mail inbox.
Please complete the CAPTCHA.
Please fill in the required fields.

Leave a Reply


× nine = 9



Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy | Contact
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
Do you want to know how to develop your skillset and become a ...
Java Rockstar?

Subscribe to our newsletter to start Rocking right now!

To get you started we give you two of our best selling eBooks for FREE!

Get ready to Rock!
You can download the complementary eBooks using the links below:
Close