Guava’s EventBus – Simple Publisher/Subscriber

Looking over recent additions to Google’s Guava Libraries in Release 10, I noticed the addition of EventBus, a lightweight implementation of a publish-subscribe style messaging system. It is similar to the publish-subscribe model provided by JMS, except that the messages remain within the application rather than being broadcast externally. EventBus allows you to create streams within your program to which objects can subscribe; they will then receive messages published to those streams. Although this inter-object communication is not particularly difficult to recreate using patterns such as singletons, EventBus provides a particularly simple and lightweight mechanism. Singletons also make it harder to have multiple event buses of a single type, and they are hard to test.

As an example I am going to create a simple multi-user chat program using sockets that several people will connect to via telnet. We will simply create an EventBus which will serve as a channel. Any message that a user sends to the system will be published to all connected users.
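Before the Guava version, the publish-subscribe idea itself can be sketched with nothing but the JDK. The MiniBus class below is a hypothetical stand-in written for this illustration. It is not Guava's EventBus and does not mirror its API: subscribers are plain Consumer<String> callbacks rather than annotated methods, and delivery is synchronous.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical, JDK-only stand-in for an in-process event bus.
// This is NOT Guava's EventBus; it only illustrates the idea.
final class MiniBus {
    // CopyOnWriteArrayList tolerates subscribers coming and going during a post.
    private final List<Consumer<String>> subscribers = new CopyOnWriteArrayList<>();

    void register(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    void unregister(Consumer<String> subscriber) {
        subscribers.remove(subscriber);
    }

    // Delivers the message synchronously to every registered subscriber,
    // including the sender if the sender is registered.
    void post(String message) {
        for (Consumer<String> subscriber : subscribers) {
            subscriber.accept(message);
        }
    }

    public static void main(String[] args) {
        MiniBus channel = new MiniBus();
        List<String> alice = new ArrayList<>();
        List<String> bob = new ArrayList<>();
        Consumer<String> aliceInbox = alice::add;
        Consumer<String> bobInbox = bob::add;
        channel.register(aliceInbox);
        channel.register(bobInbox);

        channel.post("hello");
        channel.unregister(bobInbox);   // bob leaves the chat
        channel.post("anyone there?");

        System.out.println("alice: " + alice);
        System.out.println("bob: " + bob);
    }
}
```

The publisher never references a subscriber directly; it only knows the bus. That decoupling is what the chat example relies on.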
So here is our UserThread object:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

import com.google.common.eventbus.EventBus;
import com.google.common.eventbus.Subscribe;

class UserThread extends Thread {
    private Socket connection;
    private EventBus channel;
    private BufferedReader in;
    private PrintWriter out;

    public UserThread(Socket connection, EventBus channel) {
        this.connection = connection;
        this.channel = channel;
        try {
            in = new BufferedReader(new InputStreamReader(connection.getInputStream()));
            out = new PrintWriter(connection.getOutputStream(), true);
        } catch (IOException e) {
            e.printStackTrace();
            System.exit(1);
        }
    }

    // Called by the EventBus for every String posted to the channel.
    @Subscribe
    public void receiveMessage(String message) {
        if (out != null) {
            out.println(message);
        }
    }

    @Override
    public void run() {
        try {
            String input;
            while ((input = in.readLine()) != null) {
                channel.post(input); // publish the line to every subscriber
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

        // reached eof
        channel.unregister(this);
        try {
            connection.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
        in = null;
        out = null;
    }
}

As can be seen this is just a simple threaded object that contains the EventBus that serves as a channel, and the user’s Socket. The run method simply reads the socket and sends each message to the channel by calling the post method on the EventBus.

Receiving messages is implemented by adding a public method with the @Subscribe annotation (see above). This signals the EventBus to call this method upon receiving a message of the type given in the method argument. Here I am sending Strings, but other objects can be used.

GOTCHA: The method annotated with @Subscribe MUST be public.

The receive method takes the message and writes it out to the user’s connection. This will of course also echo the message back to the user who sent it, as the UserThread object will itself receive the message that it published.

All that is left is to create a simple server object that listens for connections and creates UserThread objects as needed.
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

import com.google.common.eventbus.EventBus;

public class EventBusChat {
    public static void main(String[] args) {
        EventBus channel = new EventBus();
        ServerSocket socket;
        try {
            socket = new ServerSocket(4444);
            while (true) {
                Socket connection = socket.accept();
                UserThread newUser = new UserThread(connection, channel);
                channel.register(newUser);
                newUser.start();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

As shown this creates the channel, accepts user connections and registers them with the EventBus. The important code to notice here is the call to the register method with the UserThread object as an argument. This call subscribes the object to the EventBus and indicates that it can process messages.

Once the server is started, users can connect to the chat server with the telnet command, giving the server’s host and port 4444. If you connect multiple instances you will see any message sent being relayed to the other instances.

Having viewed this example you may be wondering what use an EventBus has. A very good example is maintaining a very loose coupling between a user interface and backend code. User input would generate a message such as resize, lost focus or closing down. Backend components could then simply subscribe to these events and deal with them appropriately. The official documentation lists many other uses as well.

NB: EventBus isn’t meant for general purpose publisher-subscriber communication; this is just an example of how the API works.

Reference: Guava’s EventBus – Simple Publisher/Subscriber from our JCG partner Andriy Andrunevchyn at the Java User Group of Lviv blog.

Clojure macros for beginners

This article will guide you step-by-step (or even character-by-character) through the process of writing macros in Clojure. I will focus on fundamental macro characteristics while explaining what happens behind the scenes.

Imagine you are about to write an assertions library for Clojure, similar to FEST Assertions, ScalaTest assertions or Hamcrest. Of course such libraries already exist, but this is just for educational purposes. What we essentially need first is an assert-equals function used like this:

(assert-equals (count (filter even? primes)) 1)

Of course this is more than trivial:

(defn assert-equals [actual expected]
  (when-not (= actual expected)
    (throw (AssertionError. (str "Expected " expected " but was " actual)))))

A quick test with an incorrectly defined primes vector:

user=> (def primes [0 2 3 5 7 11])
#'user/primes
user=> (assert-equals (count (filter even? primes)) 1)
AssertionError Expected 1 but was 2

Cool, but imagine this test failing on a CI server or seeing this in your terminal. There is no context; maybe you’ll get the test name if you’re lucky. “Expected 1 but was 2” tells us nothing about the nature or root cause of the problem. Wouldn’t it be great to see:

AssertionError Expected '(count (filter even? primes))' to be 1 but was 2

You see this? The assertion error now gives us the full expression that yielded the incorrect result. We can see from the very first second what the issue might be. However, there is a problem. A big one. By the time we throw the AssertionError, the original expression is lost. We got the actual value as an argument and we have no idea where that value came from. It could have been a constant, the result of an expression like (count (filter even? primes)) or even a random value. Function arguments are computed eagerly and there is no way to access the code that produced them.

Entering macros

Macros and functions in Clojure are not independent or orthogonal.
In fact, they are almost the same:

Functions execute at run time; they take and produce data (values). Conceptually one can replace every (pure) function invocation with its value.
Macros execute at compile time; they take and produce code. Conceptually one can replace (expand) every occurrence of a macro with its value.

Not that much different? Moreover, since Clojure is homoiconic, Clojure code can be represented as Clojure data structures. In other words both functions and macros accept data, but in the case of macros that data is usually Clojure source represented using data structures like lists. What does it all mean and how can it help us? Let’s jump straight into writing our first (incorrect) macro and improve it step-by-step to finally achieve the desired result. To keep the samples focused I skip throwing an AssertionError and leave only the equality condition:

user=> (defmacro assert-equals [actual expected] (= expected actual))
#'user/assert-equals
user=> (assert-equals 2 2)
true
user=> (assert-equals 2 3)
false

Works? In fact we are very far from having a correct version:

user=> (assert-equals (inc 5) 6)
false
user=> (def x 1)
#'user/x
user=> (assert-equals (+ x 2) 3)
false

1 + 2 is definitely equal to 3, yet it returns false. In order to appreciate this behaviour and call it a “feature” rather than a “bug” we must deeply understand what just happened. Remember, macros are executed at compile time, right? And they are almost ordinary functions. So, the compiler executes assert-equals. However during compilation it can’t possibly know the values of variables like x, therefore it can’t eagerly evaluate macro arguments. We don’t even want that, as you will see later. Instead the compiler passes Clojure code, literally. The actual parameter is (inc 5), literally: a Clojure list holding two elements, the inc symbol and the number 5. That’s all there is to it. expected is just a number. This means that inside a macro we have full access to the Clojure source code enclosed by that macro.
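For readers more at home in Java, the difference between receiving a value and receiving code can be imitated with ordinary collections. In the sketch below a form such as (inc 5) is modelled as a plain List, roughly the way the macro sees it at compile time. The Forms class, its eval method and the two supported operators are assumptions invented for this illustration; none of this is how Clojure itself is implemented.

```java
import java.util.List;

// Hypothetical illustration: Clojure forms modelled as Java data.
final class Forms {
    // Evaluates a tiny subset of prefix forms: integers, (inc x) and (+ a b).
    static Object eval(Object form) {
        if (!(form instanceof List)) {
            return form; // a literal such as 6 evaluates to itself
        }
        List<?> list = (List<?>) form;
        String op = (String) list.get(0);
        switch (op) {
            case "inc":
                return ((Integer) eval(list.get(1))) + 1;
            case "+":
                return ((Integer) eval(list.get(1))) + ((Integer) eval(list.get(2)));
            default:
                throw new IllegalArgumentException("Unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        Object actual = List.of("inc", 5); // what the macro receives: code, not a value
        Object expected = 6;

        // Comparing the unevaluated form with a number, as the first macro did.
        System.out.println(actual.equals(expected));       // false
        // Evaluating the form first, which is what happens at run time.
        System.out.println(eval(actual).equals(expected)); // true
    }
}
```

A list of two elements is never equal to the number 6, which is the same comparison the first version of the macro performs during compilation.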
So maybe you can now guess what happens. The Clojure compiler executes the macro definition, that is (= expected actual). As far as the compiler is concerned, actual is the list (inc 5) while expected is the number 6. A list can never be equal to a number. Thus the macro returns false, just like any other function can return it. Later on the Clojure compiler replaces the (assert-equals (inc 5) 6) expression with the outcome of the macro, which happens to be… false. We said before that a macro should return valid Clojure code (represented using Clojure data structures). false is valid Clojure code!

Now we know that instead of letting the compiler evaluate (= expected actual) (after all, we don’t want the compiler to run our assertions, we only want it to compile them!) we simply want to return code that represents this assertion. It’s not that hard!

(defmacro assert-equals [actual expected]
  (list '= expected actual))

Now our macro returns the result of evaluating the (list '= expected actual) expression. The result happens to be… (= expected actual), with the argument forms substituted. That’s right, it looks like valid Clojure code, again. The extra quote ('=) was added so that = is interpreted as a raw symbol rather than a function reference. Let’s take it for a test drive:

user=> (assert-equals (inc 5) 6)
true
user=> (macroexpand '(assert-equals (inc 5) 6))
(= 6 (inc 5))

macroexpand and macroexpand-1 are your weapons of choice when debugging macros. Here you see that (assert-equals (inc 5) 6) is actually being replaced by (= 6 (inc 5)). This process happens at compile time; macros don’t exist at runtime. In your compiled code you are left with (= 6 (inc 5)).

OK, so let’s restore the full functionality of throwing an AssertionError. As you know by now, our macro should return Clojure code that includes the equality check and the throwing of an exception. This becomes a bit unwieldy:

(defmacro assert-equals [actual expected]
  (list 'when-not (list '= actual expected)
    (list 'throw
      (list 'AssertionError.
        (list 'str "Expected " expected " but was " actual)))))

Notice how every single symbol has to be escaped ('when-not, 'throw, 'AssertionError., …), otherwise the compiler will try to evaluate it at compile time. Moreover a list in Clojure denotes a function call, so we must precede every list literal with a (list ...) function call. If you are not that familiar with Clojure: (list 1 2) returns the list (1 2), while (1 2) will throw an exception since the number 1 is not a function. Ugly or not, it works:

user=> (assert-equals (inc 5) 6)
nil
user=> (assert-equals 5 6)
AssertionError Expected 6 but was 5

We have barely reproduced what the original assert-equals function was doing, and the first commandment of writing macros is: don’t write macros if a function is sufficient. But before we go further, let us clean up what we have so far. A typical macro definition consists of lots of Clojure code that has to be escaped and not that many live values like actual and expected in our case. So there is a smart default: instead of quoting everything except a few items, quote everything upfront and selectively unquote things. This is called syntax-quoting (using the ` character), and unquoting is done via the ~ operator. Look carefully: we syntax-quote the whole result and selectively unquote what was previously not quoted:

(defmacro assert-equals [actual expected]
  `(when-not (= ~actual ~expected)
     (throw (AssertionError. (str "Expected " ~expected " but was " ~actual)))))

This is equivalent to the previous definition but looks much better, almost entirely like valid Clojure code. Let’s employ macroexpand-1 to see how our macro is expanded during compilation. macroexpand would work as well, but since when-not is also a macro (!) it would be recursively expanded, cluttering the output:

user=> (macroexpand-1 '(assert-equals (inc 5) 6))
(when-not (= (inc 5) 6)
  (throw (java.lang.AssertionError. (str "Expected " 6 " but was " (inc 5)))))

It’s like a templating language embedded within the language itself!
Notice how the (inc 5) piece of code was inserted in place of ~actual twice. Keep that in mind. Also experiment by removing the unquote (~) symbol here or there. Use macroexpand-1 to figure out what is going on.

Remember, our ultimate goal was to show the actual expression in its full glory, not only its value:

(AssertionError. (str "Expected '???' to be " ~expected " but was " ~actual))

What should we put in place of ??? to print the “(inc 5)” string? We know that the value of actual is not 6 but a list with two items: (inc 5). Can we somehow quote that list again so that it is no longer evaluated at run time but instead is treated as a data structure? Of course, we know how to quote things!

(defmacro assert-equals [actual expected]
  `(when-not (= ~actual ~expected)
     (throw (AssertionError.
       (str "Expected '" '~actual "' to be " ~expected " but was " ~actual)))))

'~actual, oh dear! Quote unquote actual. This translates to '(inc 5). And that’s it! Look how descriptive the assertion error messages are:

user=> (assert-equals (inc 5) 5)
AssertionError Expected '(inc 5)' to be 5 but was 6
user=> (assert-equals (count (filter even? primes)) 1)
AssertionError Expected '(count (filter even? primes))' to be 1 but was 2

Expanding this macro manually reveals how it is translated by the compiler (edited to improve readability):

user=> (macroexpand-1 '(assert-equals (inc 5) 5))
(when-not (= (inc 5) 5)
  (throw (java.lang.AssertionError.
    (str "Expected '" (quote (inc 5)) "' to be " 5 " but was " (inc 5)))))

There is really no magic here; we could have written that ourselves. But macros avoid lots of repetitive work.

Bindings in macros

Our solution so far has one major issue.
Imagine we are testing an impure or slow function like this:

(def question "Answer to the Ultimate Question of Life, The Universe, and Everything")
(defn answer [q]
  (do (println "Computing for 7½ million years...") 41))

As you can see it returns the wrong result, which can easily be proved in a unit test:

user=> (assert-equals (answer question) 42)
Computing for 7½ million years...
Computing for 7½ million years...
AssertionError Expected '(answer question)' to be 42 but was 41

The error message is fine, but notice that the “Computing...” statement was printed twice, clearly because the impure answer function was called twice as well. Macro expansion reveals why:

user=> (macroexpand-1 '(assert-equals (answer question) 42))
(when-not (= (answer question) 42)
  (throw (java.lang.AssertionError.
    (str "Expected '" (quote (answer question)) "' to be " 42 " but was " (answer question)))))

(answer question) appears twice (not counting the quoted one), once during the comparison and a second time when we generate the assertion message. This is rarely desired, especially when the function under test has side effects. The solution is simple: precompute (answer question) once, store it somewhere and reference it when needed. But there is a twist: declaring let bindings inside macros is tricky. Sometimes you might hit unexpected name shadowing and overriding when names of variables inside the macro collide with the ones used in user code. Without going into much detail, using (gensym) or the convenient # suffix is enough to keep our macros safe. In both cases the Clojure compiler will produce unique names, making sure they don’t collide. Our final solution looks like this:

(defmacro assert-equals [actual expected]
  `(let [actual-value# ~actual]
     (when-not (= actual-value# ~expected)
       (throw (AssertionError.
         (str "Expected '" '~actual "' to be " ~expected " but was " actual-value#))))))

This time the actual-value# binding is used to compute actual only once:

user=> (macroexpand-1 '(assert-equals (answer question) 42))
(let [actual-value__264__auto__ (answer question)]
  (when-not (= actual-value__264__auto__ 42)
    (throw (java.lang.AssertionError.
      (str "Expected '" (quote (answer question)) "' to be " 42 " but was " actual-value__264__auto__)))))

The extra suffix replacing the # symbol makes sure actual-value does not collide with any other symbol.

Summary

Our assert-equals macro is not the most comprehensive one, just like this tutorial. But it gives you some impression of what macros can do and how they work. If you need further resources, check out this great macro tutorial (part 2 and 3). If you like the idea of enhanced assertions, Power Assertions in Groovy are even more comprehensive. But I bet this behaviour can be reproduced in Clojure macros!

Reference: Clojure macros for beginners from our JCG partner Tomasz Nurkiewicz at the Java and neighbourhood blog.

Setting up Apache Hadoop Multi – Node Cluster

We are sharing our experience of installing Apache Hadoop on Linux-based machines (multi-node). We will also share our troubleshooting experience here and update this post in the future.

User creation and other configuration steps

We start by adding a dedicated Hadoop system user on each node of the cluster.

$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser

Next we configure SSH (Secure Shell) on all the nodes to enable secure data communication.

user@node1:~$ su - hduser
hduser@node1:~$ ssh-keygen -t rsa -P ""

The output will be something like the following:

Generating public/private rsa key pair.
Enter file in which to save the key (/home/hduser/.ssh/id_rsa):
Created directory '/home/hduser/.ssh'.
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/
The key fingerprint is:
9b:82:ea:58:b4:e0:35:d7:ff:19:66:a6:ef:ae:0e:d2 hduser@ubuntu
.....

Next we need to enable SSH access to the local machine with this newly created key:

hduser@node1:~$ cat $HOME/.ssh/ >> $HOME/.ssh/authorized_keys

Repeat the above steps on all the cluster nodes and test by executing the following statement:

hduser@node1:~$ ssh localhost

This step is also needed to save the local machine’s host key fingerprint to the hduser user’s known_hosts file.

Next we need to edit the /etc/hosts file, in which we put the IP and name of each system in the cluster. In our scenario we have one master and one slave.

$ sudo vi /etc/hosts

We put the values into the hosts file as key-value pairs, one line per machine with its IP followed by its name (master and slave in our case).

Providing the SSH Access

The hduser user on the master node must be able to connect:

1. to its own user account on the master, via ssh master in this context and not necessarily ssh localhost;
2. to the hduser account of the slave(s) via a password-less SSH login.

So we distribute the SSH public key of hduser@master to all its slaves (in our case we have only one slave; if you have more, execute the following statement for each, changing the machine name, i.e. slave, slave1, slave2).

hduser@master:~$ ssh-copy-id -i $HOME/.ssh/ hduser@slave

Try connecting from master to master and from master to the slave(s), and check that everything is fine.

Configuring Hadoop

Let us edit conf/masters (only on the master node) and enter master into the file. Doing this tells Hadoop to start the secondary NameNode of our multi-node cluster on this machine. The primary NameNode and the JobTracker will always be on the machine where we run bin/ and bin/.

Let us now edit conf/slaves (only on the master node) with:

master
slave

This means that we also run a DataNode process on the master machine, where the NameNode is also running. If we have more machines available as DataNodes, we can leave master out of the slave list. If we have more slaves, we add one host per line like the following:

master
slave
slave2
slave3

etc.

Let us now edit two important files (on all the nodes in our cluster):

1. conf/core-site.xml
2. conf/hdfs-site.xml

1) conf/core-site.xml

We have to change the fs.default.parameter which specifies the NameNode host and port. (In our case this is the master machine.)

<property>
  <name></name>
  <value>hdfs://master:54310</value>
  …..[Other XML Values]
</property>

Create a directory into which Hadoop will store its data:

$ mkdir /app/hadoop

We have to ensure the directory is writeable by any user:

$ chmod 777 /app/hadoop

Modify core-site.xml once again to add the following property:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop</value>
</property>

2) conf/hdfs-site.xml

We have to change the dfs.replication parameter, which specifies the default block replication. It defines how many machines a single file should be replicated to before it becomes available.
If we set this to a value higher than the number of available slave nodes (more precisely, the number of DataNodes), we will start seeing a lot of “(Zero targets found, forbidden1.size=1)” type errors in the log files. The default value of dfs.replication is 3. However, as we have only two nodes available in our scenario, we set dfs.replication to 2.

<property>
  <name>dfs.replication</name>
  <value>2</value>
  …..[Other XML Values]
</property>

Let us format the HDFS file system via the NameNode. Run the following command on master:

bin/hadoop namenode -format

Let us start the multi-node cluster. Run the command (in our case we run it on the machine named master):

bin/

Checking of Hadoop Status

After everything has started, run the jps command on all the nodes to see whether everything is running well. On the master node the desired output will be:

$ jps
14799 NameNode
15314 Jps
14880 DataNode
14977 SecondaryNameNode

On the slave(s):

$ jps
15314 Jps
14880 DataNode

Of course the process IDs will vary from machine to machine.

Troubleshooting

It is possible that the DataNode does not start on all our nodes. In that case, check logs/hadoop-hduser-datanode-.log on the affected nodes for the exception:

Incompatible namespaceIDs

If it is there, we need to do the following:

1. Stop the full cluster, i.e. both the MapReduce and HDFS layers.
2. Delete the data directory on the problematic DataNode: the directory is specified in conf/hdfs-site.xml. In our case, the relevant directory is /app/hadoop/tmp/dfs/data
3. Reformat the NameNode. All HDFS data will be lost during the format process.
4. Restart the cluster.

Or we can manually update the namespaceID of the problematic DataNodes:

1. Stop the problematic DataNode(s).
2. Edit the value of namespaceID in ${}/current/VERSION to match the corresponding value of the current NameNode in ${}/current/VERSION.
3. Restart the fixed DataNode(s).

In Running Map-Reduce Job in Apache Hadoop (Multinode Cluster), we will share our experience of running a MapReduce job, following the Apache Hadoop example.

Reference: Setting up Apache Hadoop Multi – Node Cluster from our JCG partner Piyas De at the Phlox Blog blog.

Android ListView context menu: ActionMode.CallBack

In this post we want to analyze the context menu (contextual action bar). This is a menu that is related to a specific item. The contextual menu can be applied to almost all views but it is usually used with ListView. We talk a lot about ListView because it is one of the most important components. We can distinguish two different types of contextual menu:

Floating menu
Contextual action mode (ActionMode)

The floating menu is used with Android versions lower than 3.0 (API level 11). It is essentially a menu that appears when a user long clicks on a ListView item.

The contextual action mode was introduced in Android 3.0 and is available in later versions. It is essentially a contextual bar that appears at the top when the user long clicks an item. According to the Android guides this kind of menu is better than the floating menu. In this post we want to analyze how we can create this menu.

Create contextual action mode: Define ActionMode.Callback interface

To create a contextual menu we first have to define an ActionMode.Callback implementation. Its methods are called when a user long clicks on a ListView item. The code looks like:

private ActionMode.Callback modeCallBack = new ActionMode.Callback() {

    public boolean onPrepareActionMode(ActionMode mode, Menu menu) {
        return false;
    }

    public void onDestroyActionMode(ActionMode mode) {
        mode = null;
    }

    public boolean onCreateActionMode(ActionMode mode, Menu menu) {
        return true;
    }

    public boolean onActionItemClicked(ActionMode mode, MenuItem item) {
        return false; // the real handling is filled in later in this post
    }
};

We are interested in onCreateActionMode and onActionItemClicked. The first one is where we will create our contextual action bar at the top of the screen, and the second is where we handle the logic when the user chooses one of our menu items. The first thing we have to do is create our menu.
For simplicity we can suppose we have just two menu items, so we define a file under res/menu called activity_main.xml:

<menu xmlns:android="">
    <item android:id="@+id/edit" android:icon="@android:drawable/ic_menu_edit"/>
    <item android:id="@+id/delete" android:icon="@android:drawable/ic_menu_delete"/>
</menu>

Now we have our menu and we simply have to “inject” it in the onCreateActionMode method.

public boolean onCreateActionMode(ActionMode mode, Menu menu) {
    mode.setTitle("Options");
    mode.getMenuInflater().inflate(R.menu.activity_main, menu);
    return true;
}

Now we have to show this contextual action bar when the user long clicks on an item.

ActionMode and Long Click: onItemLongClickListener

If we want to show this contextual bar when the user long clicks, we simply have to set a listener on our ListView, which we call lv in the source code. So we have:

lv.setOnItemLongClickListener(new AdapterView.OnItemLongClickListener() {
    public boolean onItemLongClick(AdapterView<?> parent, View view, int position, long id) {
        System.out.println("Long click");
        startActionMode(modeCallBack);
        view.setSelected(true);
        return true;
    }
});

The call to startActionMode starts the contextual menu. As a result our contextual action bar appears at the top of the screen.

Contextual menu item selection

Now let’s suppose a user clicks on a menu item. How do we handle this event? If we come back to ActionMode.Callback, we have to implement another method, onActionItemClicked. So we have:

public boolean onActionItemClicked(ActionMode mode, MenuItem item) {
    int id = item.getItemId();
    switch (id) {
        case R.id.delete: {
            aAdpt.remove(aAdpt.getItem(aAdpt.currentSelection));
            mode.finish();
            break;
        }
        case R.id.edit: {
            System.out.println(" edit ");
            break;
        }
        default:
            return false;
    }
    return true;
}

In the delete case we simply remove the selected item from our adapter. To know the position of the selected item inside the ListView, we store it in the OnItemLongClickListener method:

aAdpt.currentSelection = position;

When we finish handling the user’s menu item selection we have to dismiss the contextual action bar by calling mode.finish().

Reference: Android ListView context menu: ActionMode.CallBack from our JCG partner Francesco Azzola at the Surviving w/ Android blog.

Running Map-Reduce Job in Apache Hadoop (Multinode Cluster)

We will describe here the process of running a MapReduce job in Apache Hadoop on a multinode cluster. To set up Apache Hadoop on a multinode cluster, one can read Setting up Apache Hadoop Multi – Node Cluster. For this setup we have to configure Hadoop with the following on each machine.

Add the following properties to conf/mapred-site.xml on all the nodes:

<property>
  <name>mapred.job.tracker</name>
  <value>master:54311</value>
  <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
</property>

<property>
  <name>mapred.local.dir</name>
  <value>${hadoop.tmp.dir}/mapred/local</value>
</property>

<property>
  <name></name>
  <value>20</value>
</property>

<property>
  <name>mapred.reduce.tasks</name>
  <value>2</value>
</property>

N.B. The last three are additional settings, so we can omit them.

The Gutenberg Project

For our MapReduce demo we will be using the WordCount example job, which reads text files and counts how often words occur. The input is text files and the output is text files, each line of which contains a word and the count of how often it occurred, separated by a tab. Download the example inputs from the following sites; all e-texts should be in plain text us-ascii encoding.

The Outline of Science, Vol. 1 (of 4) by J. Arthur Thomson
The Notebooks of Leonardo Da Vinci
Ulysses by James Joyce
The Art of War by 6th cent. B.C. Sunzi
The Adventures of Sherlock Holmes by Sir Arthur Conan Doyle
The Devil’s Dictionary by Ambrose Bierce
Encyclopaedia Britannica, 11th Edition, Volume 4, Part 3

Please google for those texts. Download each ebook as a text file in Plain Text UTF-8 encoding and store the files in a local temporary directory of choice, for example /tmp/gutenberg.
Check the files with the following command:

$ ls -l /tmp/gutenberg/

Next we start the dfs and mapred layers in our cluster:

$
$

Run the jps command to check that the DataNodes, NameNodes, TaskTrackers and JobTracker are all up and running fine on all the nodes.

Next we will copy the local files (here the text files) to Hadoop HDFS:

$ hadoop dfs -copyFromLocal /tmp/gutenberg /Users/hduser/gutenberg
$ hadoop dfs -ls /Users/hduser

If the files are successfully copied we will see something like the following:

Found 2 items
drwxr-xr-x - hduser supergroup 0 2013-05-21 14:48 /Users/hduser/gutenberg

Furthermore we check what our file system has in /Users/hduser/gutenberg:

$ hadoop dfs -ls /Users/hduser/gutenberg
Found 7 items
-rw-r--r-- 2 hduser supergroup 336705 2013-05-21 14:48 /Users/hduser/gutenberg/pg132.txt
-rw-r--r-- 2 hduser supergroup 581877 2013-05-21 14:48 /Users/hduser/gutenberg/pg1661.txt
-rw-r--r-- 2 hduser supergroup 1916261 2013-05-21 14:48 /Users/hduser/gutenberg/pg19699.txt
-rw-r--r-- 2 hduser supergroup 674570 2013-05-21 14:48 /Users/hduser/gutenberg/pg20417.txt
-rw-r--r-- 2 hduser supergroup 1540091 2013-05-21 14:48 /Users/hduser/gutenberg/pg4300.txt
-rw-r--r-- 2 hduser supergroup 447582 2013-05-21 14:48 /Users/hduser/gutenberg/pg5000.txt
-rw-r--r-- 2 hduser supergroup 384408 2013-05-21 14:48 /Users/hduser/gutenberg/pg972.txt

We start our MapReduce Job

Let us run the MapReduce WordCount example:

$ hadoop jar hadoop-examples-1.0.4.jar wordcount /Users/hduser/gutenberg /Users/hduser/gutenberg-output

N.B.: This assumes that you are already in the HADOOP_HOME dir.
If not then, $ hadoop jar ABSOLUTE/PATH/TO/HADOOP/DIR/hadoop-examples-1.0.4.jar wordcount /Users/hduser/gutenberg /Users/hduser/gutenberg-output Or if you have installed the Hadoop in /usr/local/hadoop then, hadoop jar /usr/local/hadoop/hadoop-examples-1.0.4.jar wordcount /Users/hduser/gutenberg /Users/hduser/gutenberg-output The output is follows something like: 13/05/22 13:12:13 INFO mapred.JobClient: map 0% reduce 0% 13/05/22 13:12:59 INFO mapred.JobClient: map 28% reduce 0% 13/05/22 13:13:05 INFO mapred.JobClient: map 57% reduce 0% 13/05/22 13:13:11 INFO mapred.JobClient: map 71% reduce 0% 13/05/22 13:13:20 INFO mapred.JobClient: map 85% reduce 0% 13/05/22 13:13:26 INFO mapred.JobClient: map 100% reduce 0% 13/05/22 13:13:43 INFO mapred.JobClient: map 100% reduce 50% 13/05/22 13:13:55 INFO mapred.JobClient: map 100% reduce 100% 13/05/22 13:13:59 INFO mapred.JobClient: map 85% reduce 100% 13/05/22 13:14:02 INFO mapred.JobClient: map 100% reduce 100% 13/05/22 13:14:07 INFO mapred.JobClient: Job complete: job_201305211616_0011 13/05/22 13:14:07 INFO mapred.JobClient: Counters: 26 13/05/22 13:14:07 INFO mapred.JobClient: Job Counters 13/05/22 13:14:07 INFO mapred.JobClient: Launched reduce tasks=3 13/05/22 13:14:07 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=118920 13/05/22 13:14:07 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0 13/05/22 13:14:07 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0 13/05/22 13:14:07 INFO mapred.JobClient: Launched map tasks=10 13/05/22 13:14:07 INFO mapred.JobClient: Data-local map tasks=10 13/05/22 13:14:07 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=54620 13/05/22 13:14:07 INFO mapred.JobClient: File Output Format Counters 13/05/22 13:14:07 INFO mapred.JobClient: Bytes Written=1267287 13/05/22 13:14:07 INFO mapred.JobClient: FileSystemCounters 13/05/22 13:14:07 INFO mapred.JobClient: FILE_BYTES_READ=4151123 13/05/22 13:14:07 INFO mapred.JobClient: 
HDFS_BYTES_READ=5882320 13/05/22 13:14:07 INFO mapred.JobClient: FILE_BYTES_WRITTEN=6937084 13/05/22 13:14:07 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=1267287 13/05/22 13:14:07 INFO mapred.JobClient: File Input Format Counters 13/05/22 13:14:07 INFO mapred.JobClient: Bytes Read=5881494 13/05/22 13:14:07 INFO mapred.JobClient: Map-Reduce Framework 13/05/22 13:14:07 INFO mapred.JobClient: Reduce input groups=114901 13/05/22 13:14:07 INFO mapred.JobClient: Map output materialized bytes=2597630 13/05/22 13:14:07 INFO mapred.JobClient: Combine output records=178795 13/05/22 13:14:07 INFO mapred.JobClient: Map input records=115251 13/05/22 13:14:07 INFO mapred.JobClient: Reduce shuffle bytes=1857123 13/05/22 13:14:07 INFO mapred.JobClient: Reduce output records=114901 13/05/22 13:14:07 INFO mapred.JobClient: Spilled Records=463427 13/05/22 13:14:07 INFO mapred.JobClient: Map output bytes=9821180 13/05/22 13:14:07 INFO mapred.JobClient: Total committed heap usage (bytes)=1567514624 13/05/22 13:14:07 INFO mapred.JobClient: Combine input records=1005554 13/05/22 13:14:07 INFO mapred.JobClient: Map output records=1005554 13/05/22 13:14:07 INFO mapred.JobClient: SPLIT_RAW_BYTES=826 13/05/22 13:14:07 INFO mapred.JobClient: Reduce input records=178795Retrieving the Job ResultTo read directly from hadoop without copying to local file system: $ hadoop dfs -cat /Users/hduser/gutenberg-output/part-r-00000 Let us copy the the results to the local file system though. $ mkdir /tmp/gutenberg-output$ bin/hadoop dfs -getmerge /Users/hduser/gutenberg-output /tmp/gutenberg-output$ head /tmp/gutenberg-output/gutenberg-output We will get a output as:   "'Ample.' 1 "'Arthur!' 1 "'As 1 "'Because 1 "'But,' 1 "'Certainly,' 1 "'Come, 1 "'DEAR 1 "'Dear 2 "'Dearest 1 "'Don't 1 "'Fritz! 1 "'From 1 "'Have 1 "'Here 1 "'How 2 The command fs -getmerge will simply concatenate any files it finds in the directory you specify. This means that the merged file might (and most likely will) not be sorted. 
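If a sorted file is needed, the merged output can be sorted locally after the getmerge step. The following is only a minimal sketch in plain Java, not part of the original walkthrough: the class name MergedOutputSorter is hypothetical, and it assumes each line has the usual WordCount shape of a word, a tab, and a count.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper: sorts merged WordCount lines ("word<TAB>count") by word. */
final class MergedOutputSorter {

    private MergedOutputSorter() {
    }

    /** Returns a new list ordered by the word in front of the first tab. */
    static List<String> sortByWord(List<String> lines) {
        List<String> sorted = new ArrayList<>(lines);
        sorted.sort(Comparator.comparing(MergedOutputSorter::wordOf));
        return sorted;
    }

    /** Extracts the sort key; a line without a tab is treated as all key (assumption). */
    private static String wordOf(String line) {
        int tab = line.indexOf('\t');
        return tab < 0 ? line : line.substring(0, tab);
    }
}
```

The input list could be read with Files.readAllLines(Paths.get("/tmp/gutenberg-output/gutenberg-output")). For outputs too large to hold in memory, an external sort would be the safer choice.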
Resources:  Reference: Running Map-Reduce Job in Apache Hadoop (Multinode Cluster) from our JCG partner Piyas De at the Phlox Blog blog. ...

Android: Navigation drawer with account picker – Google Drive SDK

This post describes how to create a navigation drawer with an account picker. The navigation drawer is a new UI pattern introduced at the last I/O. To build it I will combine the new Google Service API with the Drive SDK. There are a lot of docs explaining how to create a navigation drawer, so what I want to explain here is how to add an account picker to it. If you look at the Google Drive app you will notice that you can choose your account directly from the left drawer instead of picking it from the settings. In this post I will show how we can get the folders of a Google Drive account that is selected from the left navigation drawer.

Set up the navigation drawer layout

I won’t go into the details of how to use the navigation drawer because there are already plenty of docs about it, so this post assumes you are already familiar with this pattern. If you want to know more, have a look at “Navigation Drawer”. Let’s set up our layout first. We need:

- A spinner to select our account
- The navigation drawer items

As we already know, the drawer must be the first element in our layout, so we have:

    <android.support.v4.widget.DrawerLayout xmlns:android=""
        android:id="@+id/drawer_layout"
        android:layout_width="match_parent"
        android:layout_height="match_parent">

        <FrameLayout
            android:id="@+id/content_frame"
            android:layout_width="match_parent"
            android:layout_height="match_parent"/>

        <LinearLayout
            android:layout_height="match_parent"
            android:orientation="vertical"
            android:id="@+id/menu"
            android:layout_gravity="start"
            android:layout_width="240dp">

            <TextView
                android:layout_width="wrap_content"
                android:layout_height="wrap_content"
                android:text="Accounts"
                style="?android:attr/textAppearanceMedium" />

            <Spinner
                android:id="@+id/spinnerAccount"
                android:layout_width="wrap_content"
                android:layout_height="wrap_content"
                android:layout_alignParentLeft="true"
                android:layout_alignParentTop="true"
                android:layout_gravity="start" />

            <ListView
                android:id="@+id/left_drawer"
                android:layout_height="match_parent"
                android:choiceMode="singleChoice"
                android:divider="@android:color/transparent"
                android:dividerHeight="0dp"
                android:background="#AA111000"
                android:layout_gravity="start"
                android:layout_width="match_parent" />
        </LinearLayout>

    </android.support.v4.widget.DrawerLayout>

The FrameLayout lets us handle the UI content dynamically. Instead of placing the ListView directly after the FrameLayout (as in the official doc), we introduce a LinearLayout to hold the Spinner and the menu items.

Populate the spinner with accounts

The next step is populating the spinner with the accounts configured on the smartphone. To do it, we can use the AccountManager and get the list of accounts this way:

    AccountManager accMgr = AccountManager.get(this);

    Account[] accountList = accMgr.getAccounts();
    // One extra slot: index 0 holds the hint text shown before any selection
    final String[] accountNames = new String[accountList.length + 1];
    int i = 1;
    accountNames[0] = getResources().getString(R.string.infospinner);

    for (Account account : accountList) {
        String name = account.name;
        accountNames[i++] = name;
    }

Once we have our account list, we simply define a new adapter so that the spinner can be populated.

    ArrayAdapter<String> adp = new ArrayAdapter<String>(this, android.R.layout.simple_spinner_item, accountNames);
    spinner.setAdapter(adp);

User account selection

We have to handle the user account selection so that we can start the authorization process if needed and set up the Drive API correctly. Then we have:

    @Override
    public void onItemSelected(AdapterView<?> parent, View view, int position, long id) {
        System.out.println("Pos [" + position + "]");
        // Position 0 is the hint entry, not a real account
        if (position == 0)
            return;

        String currentAccount = accountNames[position];
        credential = GoogleAccountCredential.usingOAuth2(MainActivity.this, DriveScopes.DRIVE);
        credential.setSelectedAccountName(currentAccount);
        service = getDriveService(credential);
        AsyncAuth auth = new AsyncAuth();
        auth.execute("");
    }
    ....
    });

First we look up the chosen account name (currentAccount).
Then we save the account chosen by the user in the credential (setSelectedAccountName) and start the async process that retrieves the user folders (AsyncAuth).

Authorization and Google Drive Access

Now we want to access the Google Drive data and retrieve all the folders in the root directory. To do it we have to be authenticated and authorized, and this process consists of two different steps:

- Choose the account
- Authorize the account

The first step is already done when we select our account using the spinner (see above), so we have to focus on the second step (the authorization). The first thing we have to do is try to access the Drive data, so that we learn whether we are already authorized or need an authorization. Accessing the remote Drive data uses an HTTP connection, so we can’t do it in our main thread; we have to create an async process using AsyncTask.

    private class AsyncAuth extends AsyncTask<String, Void, List<File>> {

        @Override
        protected List<File> doInBackground(String... params) {
            List<File> fileList = new ArrayList<File>();
            try {
                // Ask Drive for folders only
                Files.List request = service.files().list().setQ("mimeType = '" + MIME_FOLDER + "'");

                FileList files = request.execute();
                fileList = files.getItems();

            } catch (UserRecoverableAuthIOException e) {
                // Not authorized yet: ask the user
                startActivityForResult(e.getIntent(), REQUEST_AUTHORIZATION);
            } catch (IOException e1) {
                e1.printStackTrace();
            }

            return fileList;
        }

    }

As we can see, we simply try to retrieve the user folders (the Files.List request) and we catch UserRecoverableAuthIOException to learn whether we need to be authorized.
If we have to be authorized, then we start a new Activity (the startActivityForResult call) that asks the user for the authorization.

Now we have to handle the authorization request result, so we implement onActivityResult in our main activity this way:

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        switch (requestCode) {
        case REQUEST_AUTHORIZATION:
            System.out.println("Auth request");
            if (resultCode == Activity.RESULT_OK) {
                AsyncAuth auth = new AsyncAuth();
                auth.execute("");
            }

        }
    }

If we have a positive result (RESULT_OK), then we start the async process again to retrieve the user folders.

Beautify the code

So far we haven’t used the FrameLayout, so it is time to use it. As we said at the beginning, this frame is used to display some content in the UI. We can change this content dynamically using Fragments. We want to show a “wait” message while we are accessing the user’s Drive account, and then the folders retrieved. To do it we need just two simple Fragments:

- WaitFragment, which shows the wait symbol while we access the Drive
- ListFragment, which shows the folder list

I won’t go into the details because they’re very simple; I want to underline how to change the content in the FrameLayout. If we come back to our AsyncAuth, we can notice that it implements just the doInBackground method.
We have two other methods to exploit. The first:

    @Override
    protected void onPreExecute() {
        WaitFragment wf = new WaitFragment();

        FragmentManager manager = MainActivity.this.getFragmentManager();
        FragmentTransaction trans = manager.beginTransaction();
        trans.replace(R.id.content_frame, wf);
        trans.commit();
    }

This fragment is shown at the beginning, informing the user to wait. And the second:

    @Override
    protected void onPostExecute(List<File> result) {
        FragmentManager manager = MainActivity.this.getFragmentManager();
        FragmentTransaction trans = manager.beginTransaction();

        ListFragment lf = new ListFragment();
        List<String> itemList = new ArrayList<String>();
        for (File f : result) {
            itemList.add(f.getTitle());
        }
        lf.setItemList(itemList);

        trans.replace(R.id.content_frame, lf);
        trans.commit();

    }

In the for loop we simply populate the ListView adapter with the folder names.

Source code available soon.

Reference: Android: Navigation drawer with account picker – Google Drive SDK from our JCG partner Francesco Azzola at the Surviving w/ Android blog. ...

Java EE CDI programmatic dependency disambiguation example – Injection Point inspection

In this tutorial we shall see how we can resolve dependency ambiguity programmatically when injecting Java EE CDI beans. We have already shown in the Java EE dependency disambiguation example how to avoid ambiguous dependencies in CDI beans. Here we shall show you how to disambiguate dependencies in a dynamic way. We will achieve this by inspecting the injection point of the bean that injects another bean’s implementation.

The programmatic disambiguation with injection point inspection will be examined by creating a simple service with two implementations. Then we will create a Producer method to produce and inject both implementations in an application.

Our preferred development environment is Eclipse. We are using Eclipse Juno (4.2) version, along with Maven Integration plugin version 3.1.0. You can download Eclipse from here and Maven Plugin for Eclipse from here. The installation of Maven plugin for Eclipse is out of the scope of this tutorial and will not be discussed. Tomcat 7 is the application server used.

Let’s begin.

1. Create a new Maven project

- Go to File -> Project -> Maven -> Maven Project.
- In the “Select project name and location” page of the wizard, make sure that the “Create a simple project (skip archetype selection)” option is unchecked, then hit “Next” to continue with default values.
- Here the Maven archetype for creating a web application must be added. Click on “Add Archetype” and add the archetype. Set the “Archetype Group Id” variable to "org.apache.maven.archetypes", the “Archetype artifact Id” variable to "maven-archetype-webapp" and the “Archetype Version” to "1.0". Click on “OK” to continue.
- In the “Enter an artifact id” page of the wizard, you can define the name and main package of your project. Set the “Group Id” variable to "com.javacodegeeks.snippets.enterprise" and the “Artifact Id” variable to "cdibeans".
The aforementioned selections compose the main project package as "com.javacodegeeks.snippets.enterprise.cdibeans" and the project name as "cdibeans". Set the “Package” variable to "war", so that a war file will be created to be deployed to the Tomcat server. Hit “Finish” to exit the wizard and to create your project.

The Maven project structure consists of the following folders:

- /src/main/java folder, which contains source files for the dynamic content of the application,
- /src/test/java folder, which contains all source files for unit tests,
- /src/main/resources folder, which contains configuration files,
- /target folder, which contains the compiled and packaged deliverables,
- /src/main/webapp/WEB-INF folder, which contains the deployment descriptors for the Web application,
- the pom.xml, which is the project object model (POM) file: the single file that contains all project related configuration.

2. Add all the necessary dependencies

You can add the dependencies in Maven’s pom.xml file, by editing it at the “Pom.xml” page of the POM editor, as shown below:

pom.xml:

    <project xmlns="" xmlns:xsi="" xsi:schemaLocation="">
        <modelVersion>4.0.0</modelVersion>
        <groupId>com.javacodegeeks.snippets.enterprise.cdi</groupId>
        <artifactId>cdibeans</artifactId>
        <packaging>war</packaging>
        <version>0.0.1-SNAPSHOT</version>
        <name>cdibeans Maven Webapp</name>
        <url></url>
        <dependencies>
            <dependency>
                <groupId>org.jboss.weld.servlet</groupId>
                <artifactId>weld-servlet</artifactId>
                <version>1.1.10.Final</version>
            </dependency>
            <dependency>
                <groupId>javax.servlet</groupId>
                <artifactId>jstl</artifactId>
                <version>1.2</version>
            </dependency>
            <dependency>
                <groupId>javax.servlet</groupId>
                <artifactId>javax.servlet-api</artifactId>
                <version>3.0.1</version>
                <scope>provided</scope>
            </dependency>
            <dependency>
                <groupId>org.glassfish</groupId>
                <artifactId>javax.faces</artifactId>
                <version>2.1.7</version>
            </dependency>
        </dependencies>

        <build>
            <finalName>cdibeans</finalName>
        </build>
    </project>

As you can see, Maven manages library dependencies declaratively. A local repository is created (by default under the {user_home}/.m2 folder) and all required libraries are downloaded from public repositories and placed there. Furthermore, inter-library dependencies are automatically resolved and managed.

3. Create a simple Service

GreetingCard is a simple service that creates a greeting message for the application that uses it. It is an interface with a method that produces the greeting message.

    package com.javacodegeeks.snippets.enterprise.cdibeans;

    public interface GreetingCard {

        void sayHello();
    }

The implementations of the service are shown below:

    package com.javacodegeeks.snippets.enterprise.cdibeans.impl;

    import com.javacodegeeks.snippets.enterprise.cdibeans.GreetingCard;

    public class GreetingCardImpl implements GreetingCard {

        public void sayHello() {
            System.out.println("Hello!!!");
        }

    }

    package com.javacodegeeks.snippets.enterprise.cdibeans.impl;

    import com.javacodegeeks.snippets.enterprise.cdibeans.GreetingCard;

    public class AnotherGreetingCardImpl implements GreetingCard {

        public void sayHello() {
            System.out.println("Have a nice day!!!");
        }

    }

4. Create a Producer method to inject the bean

In order to inject the service to another bean, we create our own annotation. CDI allows us to create our own Java annotation, that is the @GreetingType, and then use it in the injection point of our application to get the correct implementation of the GreetingCard according to the GreetingType of the bean.
Greetings is an enumeration parameterized with the implementations of the service, as shown below:

    package com.javacodegeeks.snippets.enterprise.cdibeans;

    import static java.lang.annotation.ElementType.FIELD;
    import static java.lang.annotation.ElementType.METHOD;
    import static java.lang.annotation.ElementType.TYPE;
    import static java.lang.annotation.RetentionPolicy.RUNTIME;

    import java.lang.annotation.Retention;
    import java.lang.annotation.Target;

    import com.javacodegeeks.snippets.enterprise.cdibeans.impl.AnotherGreetingCardImpl;
    import com.javacodegeeks.snippets.enterprise.cdibeans.impl.GreetingCardImpl;

    @Retention(RUNTIME)
    @Target({ FIELD, TYPE, METHOD })
    public @interface GreetingType {

        Greetings value();

        // Each constant carries the implementation class it stands for
        public enum Greetings {
            HELLO(GreetingCardImpl.class),
            HI(AnotherGreetingCardImpl.class);

            Class<? extends GreetingCard> clazz;

            private Greetings(Class<? extends GreetingCard> clazz) {
                this.clazz = clazz;
            }

            public Class<? extends GreetingCard> getClazz() {
                return clazz;
            }
        }
    }

Now we can create a Producer to provide applications with instances of the GreetingCard service implementations. The GreetingCardFactory class is a Producer that has a method, getGreetingCard. The method takes two parameters. The first parameter is a javax.enterprise.inject.Instance parameterized with the required bean type, that is the GreetingCard here. It is annotated with the @Any annotation, which allows an injection point to refer to all beans or all events of a certain bean type. The second parameter is the javax.enterprise.inject.spi.InjectionPoint, that is the field in the client application that will inject the bean using the @Inject annotation. So the method will return the correct implementation of the service according to the service type and the annotations in the injection point.
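The idea of inspecting the injection point can be tried out without a CDI container. What follows is only a rough sketch using plain reflection, not the tutorial's code and not how Weld works internally. All names in it (Greeter, GreeterKind, GreeterType, GreetingClient, ReflectiveInjector) are hypothetical stand-ins, and greet() returns a String only to make the result easy to check.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

interface Greeter {
    String greet();
}

final class HelloGreeter implements Greeter {
    public String greet() { return "Hello!!!"; }
}

final class NiceDayGreeter implements Greeter {
    public String greet() { return "Have a nice day!!!"; }
}

/** Enum constant -> implementation class, mirroring the Greetings enum. */
enum GreeterKind {
    HELLO(HelloGreeter.class), HI(NiceDayGreeter.class);

    private final Class<? extends Greeter> clazz;

    GreeterKind(Class<? extends Greeter> clazz) { this.clazz = clazz; }

    Class<? extends Greeter> getClazz() { return clazz; }
}

/** Marks a field and says which implementation it wants. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface GreeterType {
    GreeterKind value();
}

/** A client with two annotated "injection points". */
final class GreetingClient {
    @GreeterType(GreeterKind.HELLO) Greeter first;
    @GreeterType(GreeterKind.HI) Greeter second;
}

final class ReflectiveInjector {
    private ReflectiveInjector() {
    }

    /** Fills every @GreeterType field with a new instance of the class its annotation names. */
    static void inject(Object target) {
        for (Field field : target.getClass().getDeclaredFields()) {
            GreeterType type = field.getAnnotation(GreeterType.class);
            if (type == null) {
                continue; // not an injection point
            }
            try {
                Greeter impl = type.value().getClazz().getDeclaredConstructor().newInstance();
                field.setAccessible(true);
                field.set(target, impl);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot inject " + field.getName(), e);
            }
        }
    }
}
```

Calling ReflectiveInjector.inject(new GreetingClient()) assigns a HelloGreeter to first and a NiceDayGreeter to second. In the real application the container does this work, and the producer method only has to read the annotation from the InjectionPoint it is given.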
    package com.javacodegeeks.snippets.enterprise.cdibeans;

    import javax.enterprise.inject.Any;
    import javax.enterprise.inject.Instance;
    import javax.enterprise.inject.Produces;
    import javax.enterprise.inject.spi.Annotated;
    import javax.enterprise.inject.spi.InjectionPoint;

    public class GreetingCardFactory {

        @Produces
        @GreetingsProducer
        public GreetingCard getGreetingCard(@Any Instance<GreetingCard> instance, InjectionPoint ip) {
            // Read the @GreetingType annotation placed on the injection point
            Annotated gtAnnotated = ip.getAnnotated();
            GreetingType gtAnnotation = gtAnnotated.getAnnotation(GreetingType.class);
            Class<? extends GreetingCard> greetingCard = gtAnnotation.value().getClazz();
            // Narrow the available beans down to the requested implementation
            return instance.select(greetingCard).get();
        }
    }

Note that the method is annotated with an extra annotation, apart from the @Produces annotation that defines the method as a Producer. The @GreetingsProducer annotation is used at the injection point to declare that it makes use of the specified Producer method to inject a bean instance. It is actually a CDI Qualifier, shown below:

    package com.javacodegeeks.snippets.enterprise.cdibeans;

    import static java.lang.annotation.ElementType.FIELD;
    import static java.lang.annotation.ElementType.METHOD;
    import static java.lang.annotation.ElementType.TYPE;
    import static java.lang.annotation.RetentionPolicy.RUNTIME;

    import java.lang.annotation.Retention;
    import java.lang.annotation.Target;

    import javax.inject.Qualifier;

    @Qualifier
    @Retention(RUNTIME)
    @Target({ FIELD, TYPE, METHOD })
    public @interface GreetingsProducer {

    }

5. Run the application

In order to run the application, we have created a simple servlet. In the servlet below both implementations are injected. Each injection point in the servlet is a field, where the @Inject annotation is used. It is also annotated with the @GreetingsProducer annotation to specify the Producer that will be used, as well as with the @GreetingType annotation that specifies which implementation will be produced by the Producer.
    package com.javacodegeeks.snippets.enterprise.cdibeans.servlet;

    import java.io.IOException;

    import javax.inject.Inject;
    import javax.servlet.ServletException;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    import com.javacodegeeks.snippets.enterprise.cdibeans.GreetingCard;
    import com.javacodegeeks.snippets.enterprise.cdibeans.GreetingType;
    import com.javacodegeeks.snippets.enterprise.cdibeans.GreetingType.Greetings;
    import com.javacodegeeks.snippets.enterprise.cdibeans.GreetingsProducer;

    @WebServlet(name = "greetingServlet", urlPatterns = {"/sayHello"})
    public class GreetingServlet extends HttpServlet {

        private static final long serialVersionUID = 2280890757609124481L;

        @Inject
        @GreetingsProducer
        @GreetingType(Greetings.HELLO)
        private GreetingCard greetingCard;

        @Inject
        @GreetingsProducer
        @GreetingType(Greetings.HI)
        private GreetingCard anotherGreetingCard;

        public void init() throws ServletException {
        }

        public void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            response.setContentType("text/html");
            // sayHello() returns void and prints to the server console,
            // so it cannot be concatenated into the HTML response.
            greetingCard.sayHello();
            anotherGreetingCard.sayHello();
        }

        public void destroy() {
        }

    }

To run the example we must build the project with Maven, and then place the war file produced in the webapps folder of Tomcat. Then we can hit http://localhost:8080/cdibeans/sayHello and both greetings are printed to the server console.

This was a tutorial of Java EE CDI programmatic dependency disambiguation with injection point inspection.

Download the source code of this tutorial: ...

Reducing memory consumption by 20x

This is going to be another story sharing our recent experience with memory-related problems. The case is extracted from a recent customer support case, where we faced a badly behaving application repeatedly dying with OutOfMemoryError messages in production. After running the application with Plumbr attached we were sure we were not facing a memory leak this time. But something was still terribly wrong.

The symptoms were discovered by one of our experimental features monitoring the overhead on certain data structures. It gave us a signal pointing towards one particular location in the source code. In order to protect the privacy of the customer we have recreated the case using a synthetic sample, at the same time keeping it technically equivalent to the original problem. Feel free to download the source code.

We found ourselves staring at a set of objects loaded from an external source. The communication with the external system was implemented via an XML interface. Which is not bad per se. But the fact that the integration implementation details were scattered across the system – the documents received were converted to XMLBean instances and then used across the system – was maybe not the wisest thing.

Essentially we were dealing with a lazily-loaded caching solution. The objects cached were Persons:

    // Imports and methods removed to improve readability
    public class Person {
        private String id;
        private Date dateOfBirth;
        private String forename;
        private String surname;
    }

Not too memory-consuming, one might guess. But things start to look a bit more sour when we open up some more details. Namely, the implementation of this data was nothing like the simple class declaration above. Instead, the implementation used a model-generated data structure.
The model used was similar to the following simplified XSD snippet:

    <xs:schema targetNamespace="" xmlns:xs="" elementFormDefault="qualified">
        <xs:element name="person">
            <xs:complexType>
                <xs:sequence>
                    <xs:element name="id" type="xs:string"/>
                    <xs:element name="dateOfBirth" type="xs:dateTime"/>
                    <xs:element name="forename" type="xs:string"/>
                    <xs:element name="surname" type="xs:string"/>
                </xs:sequence>
            </xs:complexType>
        </xs:element>
    </xs:schema>

Using XMLBeans, the developer had generated the model used behind the scenes. Now let’s add the fact that the cache was supposed to hold up to 1.3M instances of Persons and we have created a strong foundation for failure.

Running a bundled testcase gave us an indication that 1.3M instances of the XMLBean-based solution would consume approximately 1.5GB of heap. We thought we could do better.

The first solution is obvious. Integration details should not cross system boundaries. So we changed the caching solution to a simple java.util.HashMap<Long, Person>, with the ID as the key and the Person object as the value. Immediately we saw the memory consumption reduced to 214MB. But we were not satisfied yet.

As the key in the Map was essentially a number, we had every reason to use Trove Collections to further reduce the overhead. A quick change in the implementation and we had replaced our HashMap with TLongObjectHashMap<Person>. Heap consumption dropped to 143MB.

We definitely could have stopped there, but engineering curiosity did not allow us to do so. We could not help noticing that the data used contained a redundant piece of information. Date of birth was actually encoded in the ID, so instead of duplicating it in an additional field, we could easily calculate the birthday from the given ID.
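The article does not say how the date of birth is encoded in the ID, so the following is only an illustrative sketch under an invented layout: it assumes the ID consists of yyyyMMdd followed by a four-digit serial number. The class name CompactPerson and that layout are assumptions, not the customer's actual format.

```java
import java.time.LocalDate;

/** Hypothetical slimmed-down person: the birth date is derived from the id, not stored. */
final class CompactPerson {
    // ASSUMPTION: id = yyyyMMdd followed by a 4-digit serial, e.g. 198503120042
    private static final long SERIAL_RANGE = 10_000L;

    private final long id;
    private final String forename;
    private final String surname;

    CompactPerson(long id, String forename, String surname) {
        this.id = id;
        this.forename = forename;
        this.surname = surname;
    }

    long id() {
        return id;
    }

    /** Computes the birth date on demand instead of keeping a Date field per instance. */
    LocalDate dateOfBirth() {
        long datePart = id / SERIAL_RANGE;          // strip the serial: yyyyMMdd
        int year = (int) (datePart / 10_000);       // leading four digits
        int month = (int) (datePart / 100 % 100);   // middle two digits
        int day = (int) (datePart % 100);           // last two digits
        return LocalDate.of(year, month, day);
    }
}
```

The trade-off is a little arithmetic on each call in exchange for one object reference and one date object less per cached instance.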
So we changed the layout of the Person object and now it contained just the following fields:

    // Imports and methods removed to improve readability
    public class Person {
        private long id;
        private String forename;
        private String surname;
    }

Re-running the tests confirmed our expectations. Heap consumption was down to 93MB. But we were still not satisfied.

The application was running on a 64-bit machine with an old JDK6 release, which did not compress ordinary object pointers by default. Switching on -XX:+UseCompressedOops gave us an additional win – now we were down to 73MB consumed.

We could go further and start interning strings or building a b-tree based on the keys, but this would already start impacting the readability of the code, so we decided to stop here. A 21.5x heap reduction should already be a good enough result.

Lessons learned?

- Do not let integration details cross system boundaries
- Redundant data will be costly. Remove the redundancy whenever you can.
- Primitives are your friends. Know thy tools and learn Trove if you haven’t already
- Be aware of the optimization techniques provided by your JVM

If you are curious about the experiment conducted, feel free to download the code used from here. The utility used for measurements is described and available in this blogpost.

Reference: Reducing memory consumption by 20x from our JCG partner Nikita Salnikov Tarnovski at the Plumbr Blog blog. ...

MongoDB to CSV

Every once in a while, I need to give a non-technical user (like a business analyst) data residing in MongoDB; consequently, I export the target data as a CSV file (which they can presumably slice and dice once they import it into Excel or some similar tool). Mongo has a handy export utility that takes a bevy of options; however, there is an outstanding bug and some general confusion as to how to properly export data in CSV format. Accordingly, if you need to export some specific data from MongoDB into CSV format, here’s how you do it.

The key parameters are connection information (including authentication), an output file, and most important, a list of fields to export. What’s more, you can provide a query in escaped JSON format. You can find the mongoexport utility in your Mongo installation bin directory.

I tend to favor verbose parameter names and explicit connection information (i.e. rather than a URL syntax, I prefer to spell out the host, port, db, etc. directly). As I’m targeting specific data, I’m going to specify the collection; what’s more, I’m going to further filter the data via a query. ObjectIds can be referenced via the $oid format; furthermore, you’ll need to escape all JSON quotes.
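The same escaping rule can be applied programmatically when the command line is generated rather than typed. This is a small sketch under stated assumptions: MongoExportQuery and both method names are hypothetical, and the escaping targets a double-quoted bash argument only.

```java
/** Hypothetical helper that builds the value of mongoexport's --query option. */
final class MongoExportQuery {

    private MongoExportQuery() {
    }

    /** Plain JSON filter on an ObjectId field, e.g. {"account_id": {"$oid": "..."}} */
    static String byObjectId(String field, String hexId) {
        return "{\"" + field + "\": {\"$oid\": \"" + hexId + "\"}}";
    }

    /** Escapes JSON so it can sit inside double quotes on a bash command line. */
    static String forDoubleQuotedShell(String json) {
        // String.replace works on literal text (no regex); backslashes must go first
        return json.replace("\\", "\\\\")
                   .replace("\"", "\\\"")
                   .replace("$", "\\$")
                   .replace("`", "\\`");
    }
}
```

If the export is started from Java with ProcessBuilder, arguments are passed without going through a shell, so the unescaped JSON from byObjectId could be handed over as is.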
For example, if my query is against a users collection and filtered by account_id (which is an ObjectId), the query via the mongo shell would be:

Mongo Shell Query

    db.users.find({account_id:ObjectId('5058ca07b7628c0002099006')})

Via the command line à la mongoexport, this translates to:

Collections and queries

    --collection users --query "{\"account_id\": {\"\$oid\": \"5058ca07b7628c0002099006\"}}"

Finally, if you want to export only a portion of the fields in a user document, for example, name, email, and created_at, you need to provide them via the fields parameter like so:

Fields declaration

    --fields name,email,created_at

Putting it all together yields the following command:

Puttin’ it all together

    mongoexport --host --port 10332 --username acmeman --password 12345 \
      --collection users --csv --fields name,email,created_at --out all_users.csv --db my_db \
      --query "{\"account_id\": {\"\$oid\": \"5058ca07b7628c0002099006\"}}"

Of course, you can throw this into a bash script and parameterize the collection, fields, output file, and query with bash’s handy $1, $2, etc. variables.

Reference: MongoDB to CSV from our JCG partner Andrew Glover at the The Disco Blog blog. ...

5 Reasons to use Guava

Guava is an open source library containing many classes for Java and written by Google. It’s a potentially useful source of miscellaneous utility functions and classes that I’m sure many developers have written themselves before, or maybe just wanted and never had time to write. Here are 5 good reasons to use it!

1. Collection Initializers and Utilities

Generic homogeneous collections are a great feature to have in Java, but sometimes their construction is a bit too verbose, for example:

    final Map<String, Map<String, Integer>> lookup = new HashMap<String, Map<String, Integer>>();

Java 7 solves this problem in a really generic way, by allowing a limited form of type inference informally referred to as the Diamond Operator. So we can rewrite the above example as:

    final Map<String, Map<String, Integer>> lookup = new HashMap<>();

It’s actually already possible to have this kind of inference for non-constructor methods in earlier Java releases, and Guava provides many ready-made factory methods for existing Java Collections. The above example can be written as:

    final Map<String, Map<String, Integer>> lookup = Maps.newHashMap();

Guava also provides many useful utility functions for collections under the Maps, Sets et al. classes. Particular favourites of mine are the Sets.union and Sets.intersection methods that return views on the sets, rather than recomputing the values.

2. Limited Functional-Style Programming

Guava provides some common methods for passing around methods in a functional style. For example, the map function that many functional programming languages have exists in the form of the Collections2.transform method. Collections2 also has a filter method that allows you to restrict what values are in a collection.
For example, to remove the elements from a collection that are null, and store the result in another collection, you can do the following:

    Collection<?> noNullsCollection = filter(someCollection, notNull());

It’s important to remember that in both cases the function returns a new collection, rather than modifying the existing one, and that the resulting collections are lazily computed.

3. Multimaps and Bimaps

A really common use case for a Map involves storing multiple values for a single key. Using the standard Java Collections, that’s usually accomplished by using another collection as the value type. This unfortunately ends up involving a lot of ceremony that needs to be repeated in terms of initializing the collection. Multimaps clear this up quite a bit, for example:

    Multimap<String, Integer> scores = HashMultimap.create();
    scores.put("Bob", 20);
    scores.put("Bob", 10);
    scores.put("Bob", 15);
    System.out.println(Collections.max(scores.get("Bob"))); // prints 20

There’s also a BiMap class which goes in the other direction – that is to say that it enforces uniqueness of values as well as keys. Since values are also unique, a BiMap can be used in reverse.

4. Easy Hashcodes and Comparators

It’s pretty common to want to generate a hashcode for a class in Java from the hashcodes of its fields. Guava provides a utility method for this in the Objects class. Here’s an example:

    int foo;
    String bar;

    @Override
    public int hashCode() {
        return Objects.hashCode(foo, bar);
    }

Don’t forget to maintain the equals contract if you’re defining a hashcode method. Comparators are another example where writing them frequently involves chaining together a sequence of operations. Guava provides a ComparisonChain class in order to ease this process. Here’s an example with an int and a String field:

    int foo;
    String bar;

    @Override
    public int compareTo(final GuavaExample o) {
        // Compare field by field; the first non-zero result wins
        return ComparisonChain.start().compare(foo, o.foo).compare(bar, o.bar).result();
    }

5. Defensive Coding

Do you ever find yourself writing certain preconditions for your methods regularly? Sometimes these can be unnecessarily verbose, or fail to convey intent as directly. Guava provides the Preconditions class with a series of common preconditions. For example, instead of an if statement and explicit exception throw …

    if (count <= 0) {
        throw new IllegalArgumentException("must be positive: " + count);
    }

… you can use an explicit precondition:

    checkArgument(count > 0, "must be positive: %s", count);

Conclusions

Being able to replace existing library classes with those from Guava helps you to reduce the amount of code you need to maintain and offers a potential productivity boost. There are alternatives, for example the Apache Commons project. It might be the case that you already use and know these libraries, or prefer their approach and API to the Guava approach. Guava does have an Idea Graveyard, which gives you some idea of what the Google engineers perceive to be the limits of the library, or a bad design decision. You may not individually agree with these choices, at which point you’re back to writing your own library classes. Overall though, Guava encourages a terser and less ceremonious style, and some appropriate application of Guava could help many Java projects.

Reference: 5 Reasons to use Guava from our JCG partner Andriy Andrunevchyn at the Java User Group of Lviv blog. ...
Java Code Geeks and all content copyright © 2010-2015, Exelixis Media Ltd | Terms of Use | Privacy Policy | Contact
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.