Tag Archives: Hadoop

MapReduce Design Patterns Implemented in Apache Spark

This blog post is the first in a series that discusses some design patterns from the book MapReduce Design Patterns and shows how these patterns can be implemented in Apache Spark(R). When writing MapReduce or Spark programs, it is useful to think about the data flows needed to perform a job. Even if Pig, Hive, Apache Drill and Spark DataFrames make it ...

Drill into Your Big Data Today with Apache Drill

Apache Drill has been gaining significant user adoption and community momentum since its initial Beta availability in September 2014. The generally available version of Drill—Drill 1.0—was released in May 2015, and numerous customers have deployed and used Drill in production since then. In this blog post, I will briefly summarize some of the key capabilities that customers are finding immensely ...

Hadoop: HDFS – java.lang.NoSuchMethodError: org.apache.hadoop.fs.FSOutputSummer.<init>(Ljava/util/zip/Checksum;II)V

I wanted to write a little program to check that one machine could communicate with an HDFS server running on another, and adapted some code from the Hadoop wiki as follows: package org.playground; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.FSDataInputStream; import org.apache.hadoop.fs.FSDataOutputStream; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.fs.Path; import java.io.IOException; public class HadoopDFSFileReadWrite { static void printAndExit(String str) { System.err.println( str ...

Big Data: What about Security?

Ever since Hadoop first appeared, it has had a security problem. Apache Knox and Cloudera Manager have been solutions for providing authentication and authorization for basic database management functions. Also, the underlying Hadoop file system now incorporates Unix-like permissions. But the issue has not been solved, so usually the pattern followed is to “plunk the S-word after the name of a ...

What is Big Data – Theory to Implementation

What is Big Data, you may ask, and more importantly, why is it the latest trend in nearly every business domain? Is it just hype, or is it here to stay? As a matter of fact, “Big Data” is a pretty straightforward term: it is just what it says, a very large data set. How large? The exact answer is ...
