Tag Archives: Big Data

Drones and Big Data

Two weeks ago, I had a conversation with some colleagues in which I postulated a future bull market for drones, as I envisioned a number of commercial applications (food service, surveillance, etc.). Coincidentally, this topic has gained major momentum since Amazon’s disclosure of a drone R&D project for goods delivery on 60 Minutes this week. Suddenly, everyone has a drone ...

Creating an on-line recommender system with Apache Mahout

Recently we’ve been implementing a recommender system for Yap.TV: you can see it in action after installing the app and going to the “Just for you” tab. We’re using Apache Mahout as the base for our recommendations. Mahout is a “scalable machine learning library” and contains both local and distributed implementations of user- and item-based recommenders using collaborative filtering ...

Unit testing a Java Hadoop job

In my previous post I showed how to set up a complete Maven-based project to create a Hadoop job in Java. Of course, it wasn’t really complete, because it was missing the unit tests. In this post I show how to add MapReduce unit tests to the project I started previously. For the unit test I make use of ...

Broken Glass : Diagnosing Production Cassandra Issues

I just passed my two-year anniversary at Health Market Science (HMS), and we’ve been working with Cassandra for almost the entirety of my career here. In that time, we have had remarkably few problems with it. Like few other technologies I’ve worked with, Cassandra “just works”. But, as with *every* technology I’ve ever worked with, you eventually have ...

ReSQL?

The NoSQL moniker, coined circa 2009, marked a move away from the “traditional” relational model. There were quite a few non-relational databases around prior to 2009, but in the last few years we’ve seen an explosion of new offerings (you can see, for example, the “NoSQL landscape” in a previous post I made). Generally speaking, and everything here is a wild ...

Big Data Open Source Security

In security there have never (IMHO) been enough open source solutions. Bruce Schneier has written about this several times in the past, so there’s no need to repeat the arguments here. Now, with the “NoSQL” and “Big Data” open source trends in the marketplace, security finally has an intersection (a union, if I may) where new solutions to solve ...

Distributed System Development Considerations

There are a number of factors to take into account when developing distributed software systems. If you don’t know what I am talking about in that first sentence, let me give you some insight into, and examples of, what distributed systems are. Overview: A distributed system is one in which multiple physical hardware devices interact with separate and discrete users and collaborate through these ...

Setting up Apache Hadoop Multi – Node Cluster

We are sharing our experience installing Apache Hadoop on Linux-based machines (multi-node). We will also share our troubleshooting experience and post updates in the future. User creation and other configuration steps: we start by adding a dedicated Hadoop system user on each node in the cluster.
$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop ...

Running Map-Reduce Job in Apache Hadoop (Multinode Cluster)

We describe here the process of running a MapReduce job on a multi-node Apache Hadoop cluster. To set up the cluster itself, see Setting up Apache Hadoop Multi – Node Cluster. For this setup, Hadoop must be configured as follows on each machine: add the following property to conf/mapred-site.xml on all the nodes ...

Hadoop setup on single node and multi node

We will describe Hadoop setup on a single node and on multiple nodes, covering the environment setup and configuration in detail. First you need to download the following software (RPM):
  • Java JDK RPM
  • Apache Hadoop 0.20.204.0 RPM
A) Single-node Hadoop setup
1) Install the JDK on a Red Hat or CentOS 5+ system.
$ ./jdk-6u26-linux-x64-rpm.bin.sh
Java is ...
