
Tag Archives: Big Data

Big Data Open Source Security

In security there have never (IMHO) been enough open source solutions. Bruce Schneier has written about this several times in the past, and there’s no need to rewrite the arguments again. Now, with the “NoSQL” and “Big Data” open source trends in the marketplace, security finally has an intersection… a union, if I may, where new solutions to solve ...


Distributed System Development Considerations

There are a number of factors to take into account when developing distributed software systems. If that first sentence is unfamiliar territory, let me give you some insight into, and examples of, what distributed systems are. Overview: A distributed system is one in which multiple physical hardware devices interact with separate and discrete users and collaborate through these ...


Setting up Apache Hadoop Multi – Node Cluster

We are sharing our experience of installing Apache Hadoop on Linux-based machines (multi-node). We will also share our troubleshooting experience and update this post in the future. User creation and other configuration steps: we start by adding a dedicated Hadoop system user on each node of the cluster. $ sudo addgroup hadoop $ sudo adduser --ingroup hadoop ...


Running Map-Reduce Job in Apache Hadoop (Multinode Cluster)

We will describe here how to run a MapReduce job on a multi-node Apache Hadoop cluster. To set up the cluster itself, see Setting up Apache Hadoop Multi – Node Cluster. Before running a job, Hadoop has to be configured on each machine as follows: add the following property to conf/mapred-site.xml on all the nodes ...


Hadoop setup on single node and multi node

We will describe Hadoop setup on a single node and on multiple nodes, covering the Hadoop environment setup and configuration in detail. First, you need to download the following software (RPM): the Java JDK RPM and the Apache Hadoop 0.20.204.0 RPM. A) Single-node Hadoop setup: 1) Install the JDK on a Red Hat or CentOS 5+ system. $ ./jdk-6u26-linux-x64-rpm.bin.sh Java is ...


What is Big Data – Theory to Implementation

What is Big Data, you may ask, and more importantly, why is it the latest trend in nearly every business domain? Is it just hype, or is it here to stay? As a matter of fact, “Big Data” is a pretty straightforward term: it is just what it says, a very large data set. How large? The exact answer is ...


Monitoring S3 uploads for real-time data

If you are working on Big Data and its bleeding-edge technologies such as Hadoop, the first thing you need is a “dataset” to work on. This data can be reviews, blogs, news, social media data (Twitter, Facebook, etc.), domain-specific data, research data, forums, groups, feeds, firehose data, and so on. Generally, companies reach out to data vendors to fetch ...


Hope vs. Motivation: Why Big Data needs empathy and emotion

Because, says Om Malik, one of the most extraordinary thinkers in Silicon Valley and the founder of GIGAOM, “The problem with data is that the way it is used today, it lacks empathy and emotion. Data is used like a blunt instrument, a scythe trying to cut and tailor a cashmere sweater…” He concludes: The idea of combining data, ...


Big Data 2013 Predictions

If you just invested a lot of money in a Big Data solution from any of the traditional BI vendors (Teradata, IBM, Oracle, SAS, EMC, HP, etc.) then you are likely to see a sub-optimal ROI in 2013. Several innovations will come in 2013 that will change the value of Big Data exponentially. Other technology innovations are just waiting for ...
