Home » Tag Archives: Apache Hadoop (page 2)

Tag Archives: Apache Hadoop

Data as a Service: JBoss Data Virtualization and Hadoop powering your Big Data solutions

apache-hadoop-logo

Red Hat and Cloudera, announce the formation of a strategic alliance. From JBoss perspective, the key objective of the alliance is to leverage big data enterprise-wide and not let Hadoop become another data silo. Cloudera combined with Red Hat JBoss Data Virtualization integrates Hadoop with existing information sources including data warehouses, SQL and NoSQL databases, enterprise and cloud applications, and ...

Read More »

Introducing Hadoop Development Tools

apache-hadoop-logo

Few days back Apache Hadoop Development Tools a.k.a. HDT was released.  The projects aims at bringing plugins in eclipse to simplify development on Hadoop platform. This blog aims to provide an overview of few great features of HDT. Single Endpoint The project can act as a single endpoint for your HDFS, Zookeeper and MR Cluster. You can connect to your HDFS/Zookeeper instance ...

Read More »

Graph Degree Distributions using R over Hadoop

software-development-2-logo

There are two common types of graph engines. One type is focused on providing real-time, traversal-based algorithms over linked-list graphs represented on a single-server. Such engines are typically called graph databases and some of the vendors include Neo4j, OrientDB, DEX, and InfiniteGraph. The other type of graph engine is focused on batch-processing using vertex-centric message passing within a graph represented ...

Read More »

Understanding the World using Tables and Graphs

apache-hadoop-logo

Organizations make use of data to drive their decision making, enhance their product features, and to increase the efficiency of their everyday operations. Data by itself is not useful. However, with data analysis, patterns such as trends, clusters, predictions, etc. can be distilled. The way in which data is analyzed is predicated on the way in which data is structured. ...

Read More »

ElasticSearch-Hadoop: Indexing product views count and customer top search query from Hadoop to ElasticSearch

apache-hadoop-logo

This post covers to use ElasticSearch-Hadoop to read data from Hadoop system and index that in ElasticSearch. The functionality it covers is to index product views count and top search query per customer in last n number of days. The analyzed data can further be used on website to display customer recently viewed, product views count and top search query string. ...

Read More »

Apache Hadoop 2.4.0

apache-hadoop-logo

The Apache community has voted to release Apache Hadoop 2.4.0, so the new release is now available and consists of important improvements. The improvements are related not only to HDFS but also to MapReduce. The important improvement in HDFS is about NameNodes. Multiple independent Namenodes and Namespaces are now used that do not require coordination with each other. Datanodes are ...

Read More »

Hadoop MapReduce Concepts

apache-hadoop-logo

What do you mean by Map-Reduce programming? MapReduce is a programming model designed for processing large volumes of data in parallel by dividing the work into a set of independent tasks. The MapReduce programming model is inspired by functional languages and targets data-intensive computations. The input data format is application-specific, and is specified by the user. The output is a ...

Read More »

MapReduce Algorithms – Understanding Data Joins Part II

apache-hadoop-logo

It’s been awhile since I last posted, and like last time I took a big break, I was taking some classes on Coursera. This time it was Functional Programming Principals in Scala and Principles of Reactive Programming. I found both of them to be great courses and would recommend taking either one if you have the time. In this post ...

Read More »

Coordination and service discovery with Apache Zookeeper

apache-hadoop-logo

Service-oriented design has proven to be a successful solution for a huge variety of different distributed systems. When used properly, it has a lot of benefits. But as number of services grows, it becomes more difficult to understand what is deployed and where. And because we are building reliable and highly-available systems, yet another question to ask: how many instances ...

Read More »
Want to take your Java Skills to the next level?
Grab our programming books for FREE!
  • Save time by leveraging our field-tested solutions to common problems.
  • The books cover a wide range of topics, from JPA and JUnit, to JMeter and Android.
  • Each book comes as a standalone guide (with source code provided), so that you use it as reference.
Last Step ...

Where should we send the free eBooks?

Good Work!
To download the books, please verify your email address by following the instructions found on the email we just sent you.