
About Adrianos Dadis

Adrianos works as a senior software engineer in the telecom business domain. He is particularly interested in enterprise integration, multi-tier architecture and middleware services. He mainly works with Weblogic, JBoss, Java EE, Spring, Drools, Oracle SOA Suite and various ESBs.

Local installation of standalone HBase and Apache Storm simple cluster

We mainly use Apache Storm for stream processing and Apache HBase as a NoSQL wide-column database.

Although Apache Cassandra is a great NoSQL database, we mostly prefer HBase because it is part of the Cloudera distribution and because it favors consistency more than Cassandra does (see the CAP theorem).

HBase is built on top of HDFS, but it can easily be installed in standalone mode for testing purposes. You just need to download the latest version, extract the archive, start the standalone node, and then open an HBase shell and play.

$> tar zxvf hbase-1.1.2-bin.tar.gz
$> cd hbase-1.1.2/bin/
$> ./start-hbase.sh
$> ./hbase shell
hbase(main):001:0> create 'DummyTable', 'cf'
hbase(main):002:0> scan 'DummyTable'
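
To go one step further than an empty scan, the sketch below prints a few extra HBase shell commands that write and read a single cell in 'DummyTable'. The function name, the row key 'row1', the column 'cf:greeting' and the value are hypothetical, chosen only for illustration. It assumes the HBase shell reads commands from standard input when they are piped to it.

```shell
# Print a few HBase shell commands for 'DummyTable' (created above).
# Row key, column qualifier and value are made-up sample data.
dummy_table_commands() {
  cat <<'EOF'
put 'DummyTable', 'row1', 'cf:greeting', 'hello'
get 'DummyTable', 'row1'
scan 'DummyTable'
EOF
}

# Usage, from the hbase-1.1.2/bin directory (assumes piped input is accepted):
#   dummy_table_commands | ./hbase shell
```

You can also type the same three commands directly at the interactive prompt.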

When you start HBase in standalone mode, it automatically starts a local ZooKeeper node too (listening on the default port 2181).

$> netstat -anp|grep 2181
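
If you want to script the check above instead of reading the netstat output by eye, a small helper can look for a listening socket on a given port. This is only a sketch: the function name is hypothetical, and it assumes Linux-style `netstat -an` output where the state column contains LISTEN.

```shell
# Reads `netstat -an` style output on stdin.
# Succeeds (exit code 0) if some socket is in LISTEN state on port $1.
port_is_listening() {
  grep -Eq "[:.]${1}[[:space:]].*LISTEN"
}

# Usage:
#   netstat -an | port_is_listening 2181 && echo "ZooKeeper port is open"
```

The pattern requires a ":" or "." right before the port number, so a port such as 52181 does not count as a match for 2181.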

ZooKeeper is used by both HBase and Storm as a distributed coordination mechanism. Since a local ZooKeeper node is already running, you are ready to configure and run a local Storm cluster.

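  • Download the latest Storm release
  • Extract the archive
  • Configure “STORM_HOME/conf/storm.yaml” (check below)
  • Start the local cluster (each command runs in the foreground, so use a separate terminal for each):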
    • $> cd STORM_HOME/bin
    • $> ./storm nimbus
    • $> ./storm supervisor
    • $> ./storm ui
  • Logs are located in the “STORM_HOME/logs/” directory
  • Check the local Storm UI at: http://localhost:8080
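
If you would rather not keep three terminals open, the start commands above can be wrapped in a small helper that launches each daemon in the background. This is a minimal sketch, not part of Storm itself: the function name and the output file names are hypothetical, and it assumes the first argument is the directory where you extracted Storm.

```shell
# Launch the three Storm daemons in the background.
# $1 = Storm installation directory (STORM_HOME)
# $2 = directory for console output (hypothetical default: /tmp)
start_storm_daemons() {
  storm_home=$1
  out_dir=${2:-/tmp}
  for daemon in nimbus supervisor ui; do
    # nohup keeps the daemon alive after the terminal is closed
    nohup "$storm_home/bin/storm" "$daemon" > "$out_dir/storm-$daemon.out" 2>&1 &
    echo "started $daemon (pid $!)"
  done
}

# Usage:
#   start_storm_daemons /path/to/STORM_HOME
```

The printed process ids can later be passed to `kill` to stop the daemons.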

Contents of new “storm.yaml” configuration file:

storm.zookeeper.servers:
- "localhost"

nimbus.host: "localhost"

supervisor.slots.ports:
- 6701
- 6702

You can also set the “worker.childopts” parameter to define JVM options for each Worker (the processing nodes). Here is a simple example from my local setup, where I set the min/max heap size and the garbage collection strategy, and enable JMX and GC logs.

worker.childopts: "-server -Xms512m -Xmx2560m -XX:PermSize=128m -XX:MaxPermSize=512m -XX:+UseParallelOldGC -XX:ParallelGCThreads=3 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc -Xloggc:/tmp/gc-storm-worker-%ID%.log -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=1%ID% -XX:+PrintFlagsFinal -Djava.awt.headless=true -Djava.net.preferIPv4Stack=true"

The “worker.childopts” parameter is applied to every Worker JVM. The “%ID%” variable is replaced by the port (6701 or 6702) assigned to each Worker. As you can see, I have used it to give each Worker a different JMX port and a different GC log file.
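
To make the “%ID%” substitution concrete, the sketch below mimics it with sed for the two configured ports. It is only an illustration of the resulting values, not Storm's own implementation, and the function name is hypothetical.

```shell
# Illustration only: replace every %ID% in an options string ($1)
# with the given worker port ($2), as Storm does for worker.childopts.
expand_worker_opts() {
  printf '%s\n' "$1" | sed "s/%ID%/$2/g"
}

opts='-Xloggc:/tmp/gc-storm-worker-%ID%.log -Dcom.sun.management.jmxremote.port=1%ID%'
for port in 6701 6702; do
  expand_worker_opts "$opts" "$port"
done
# prints:
# -Xloggc:/tmp/gc-storm-worker-6701.log -Dcom.sun.management.jmxremote.port=16701
# -Xloggc:/tmp/gc-storm-worker-6702.log -Dcom.sun.management.jmxremote.port=16702
```

So the Worker on port 6701 exposes JMX on port 16701 and the Worker on port 6702 exposes it on 16702.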

We run Storm on JDK 7, but JDK 8 seems to be compatible too. The latest Storm release has switched from Logback to Log4j2 (check the full release notes).

Using the above instructions, you should be able to run a mini HBase and Storm cluster on your laptop without any problem.
