Featured FREE Whitepapers

What's New Here?


Anti cross-site scripting (XSS) filter for Java web apps

Here is a good and simple anti cross-site scripting (XSS) filter written for Java web applications. What it basically does is remove all suspicious strings from request parameters before returning them to the application. It’s an improvement over my previous post on the topic.You should configure it as the first filter in your chain (web.xml) and it’s generally a good idea to let it catch every request made to your site.The actual implementation consists of two classes, the actual filter is quite simple, it wraps the HTTP request object in a specialized HttpServletRequestWrapper that will perform our filtering. public class XSSFilter implements Filter {@Override public void init(FilterConfig filterConfig) throws ServletException { }@Override public void destroy() { }@Override public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException { chain.doFilter(new XSSRequestWrapper((HttpServletRequest) request), response); }}The wrapper overrides the getParameterValues(), getParameter() and getHeader() methods to execute the filtering before returning the desired field to the caller. The actual XSS checking and striping is performed in the stripXSS() private method. import java.util.regex.Pattern; import javax.servlet.http.HttpServletRequest; import javax.servlet.http.HttpServletRequestWrapper;public class XSSRequestWrapper extends HttpServletRequestWrapper {public XSSRequestWrapper(HttpServletRequest servletRequest) { super(servletRequest); }@Override public String[] getParameterValues(String parameter) { String[] values = super.getParameterValues(parameter);if (values == null) { return null; }int count = values.length; String[] encodedValues = new String[count]; for (int i = 0; i < count; i++) { encodedValues[i] = stripXSS(values[i]); }return encodedValues; }@Override public String getParameter(String parameter) { String value = super.getParameter(parameter);return stripXSS(value); }@Override public String getHeader(String name) { String value = super.getHeader(name); return stripXSS(value); }private String stripXSS(String value) { if (value != null) { // NOTE: It's highly recommended to use the ESAPI library and uncomment the following line to // avoid encoded attacks. // value = ESAPI.encoder().canonicalize(value);// Avoid null characters value = value.replaceAll("", "");// Avoid anything between script tags Pattern scriptPattern = Pattern.compile("<script>(.*?)</script>", Pattern.CASE_INSENSITIVE); value = scriptPattern.matcher(value).replaceAll("");// Avoid anything in a src='...' type of expression scriptPattern = Pattern.compile("src[\r\n]*=[\r\n]*\\\'(.*?)\\\'", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL); value = scriptPattern.matcher(value).replaceAll("");scriptPattern = Pattern.compile("src[\r\n]*=[\r\n]*\\\"(.*?)\\\"", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL); value = scriptPattern.matcher(value).replaceAll("");// Remove any lonesome </script> tag scriptPattern = Pattern.compile("</script>", Pattern.CASE_INSENSITIVE); value = scriptPattern.matcher(value).replaceAll("");// Remove any lonesome <script ...> tag scriptPattern = Pattern.compile("<script(.*?)>", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL); value = scriptPattern.matcher(value).replaceAll("");// Avoid eval(...) expressions scriptPattern = Pattern.compile("eval\\((.*?)\\)", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL); value = scriptPattern.matcher(value).replaceAll("");// Avoid expression(...) expressions scriptPattern = Pattern.compile("expression\\((.*?)\\)", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL); value = scriptPattern.matcher(value).replaceAll("");// Avoid javascript:... expressions scriptPattern = Pattern.compile("javascript:", Pattern.CASE_INSENSITIVE); value = scriptPattern.matcher(value).replaceAll("");// Avoid vbscript:... expressions scriptPattern = Pattern.compile("vbscript:", Pattern.CASE_INSENSITIVE); value = scriptPattern.matcher(value).replaceAll("");// Avoid onload= expressions scriptPattern = Pattern.compile("onload(.*?)=", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL); value = scriptPattern.matcher(value).replaceAll(""); } return value; } }Notice the comment about the ESAPI library, I strongly recommend you check it out and try to include it in your projects. If you want to dig deeper on the topic I suggest you check out the OWASP page about XSS and RSnake’s XSS (Cross Site Scripting) Cheat Sheet. Reference: Stronger anti cross-site scripting (XSS) filter for Java web apps from our JCG partner Ricardo Zuasti at the Ricardo Zuasti’s blog blog....

The Activiti Performance Showdown

The question everybody always asks when they learn about Activiti, is as old as software development itself: “How does it perform?”. Up till now, when you would ask me that same question, I would tell you about how Activiti minimizes database access in every way possible, how we break down the process structure into an ‘execution tree’ which allows for fast queries or how we leverage ten years of workflow framework development knowledge. You know, trying to get around the question without answering it. We knew it is fast, because of the theoretical foundation upon which we have built it. But now we have proof: real numbers …. Yes, it’s going to be a lengthy post. But trust me, it’ll be worth your time! Disclaimer: performance benchmarks are hard. Really hard. Different machines, slight different test setup … very small things can change the results seriously. The numbers here are only to prove that the Activiti engine has a very minimal overhead, while also integrating very easily into the Java eco-system and offering BPMN 2.0 process execution. The Activiti Benchmark Project To test process execution overhead of the Activiti engine, I created a little side project on github: https://github.com/jbarrez/activiti-benchmark The project contains currently 9 test processes, which we’ll analyse below. The logic in the project is pretty straightforward:A process engine is created for each test run Each of the processes are sequentially executed on this process engine, using a threadpool from 1 up to 10 threads. All the processes are thrown into a bag, of which a number of random executions are drawn. All the results are collected and a HTML report with some nice charts are generatedTo run the benchmark, simply follow the instructions on the github page to build and execute the jar. Benchmark Results The test machine I used for the results is my (fairly old) desktop machine: AMD Phenom II X4 940 3.0Ghz, 8 Gb 800Mhz RAM and an old-skool 7200 rpm HD running Ubuntu 11.10. The database used for the test runs on the same machine on which the tests also run. So keep in mind that in a ‘real’ server environment the results could even be better! The benchmark project I mentioned above, was executed on a default Ubuntu MySQL 5 database. I just switched to the ‘large.cnf’ setting (which throws more RAM at the db and stuff like that) instead the default config.Each of the test processes ran for 2500 times, using a threadpool going from one to ten threads. In simpleton language: 2500 process executions using just one thread, 2500 threads using two threads, 2500 process executions using three … yeah, you get it. Each benchmark run was done using a ‘default’ Activiti process engine. This basically means a ‘regular’ standalone Activiti engine, created in plain Java. Each benchmark run was also done in a ‘Spring’ config. Here, the process engine was constructed by wrapping it in the factory bean, the datasource is a Spring datasource and also the transactions and connection pool is managed by Spring (I’m actually using a tweaked BoneCP threadpool) Each benchmark run was executed with history on the default history level (ie. ‘audit’) and without history enabled (ie. history level ‘none’).The processes are in detail analyzed in the sections below, but here are the integral results of the test runs already:Activiti 5.9 – MySQL – default – history enabled Activiti 5.9 – MySQL – default – history disabled Activiti 5.9 – MySQL – Spring – history enabled Activiti 5.9 – MySQL – Spring – history disabledI ran all the tests using the latest public release of Activiti, being Activiti 5.9. However, my test runs brought some potential performance fixes to the surface (I also ran the benchmark project through a profiler). It was quickly clear that most of the process execution time was done actually cleaning up when a process ended. Basically, more than often queries were fired which were not necessary if we would save some more state in our execution tree. I sat together with Daniel Meyer from Camunda and my colleague Frederik Heremans, and they’ve managed to commit fixes for this! As such, the current trunk of Activiti, being Activiti 5.10-SNAPSHOT at the moment, is significantly faster than 5.9.Activiti 5.10 – MySQL – default – history enabled Activiti 5.10 – MySQL – default – history disabled Activiti 5.10 – MySQL – Spring – history enabled Activiti 5.10 – MySQL – Spring – history disabledFrom a high-level perspective (scroll down for detailed analysis), there are a few things to note:I had expected some difference between the default and Spring config, due to the more ‘professional’ connection pool being used. However, the results for both environments are quite alike. Sometimes the default is faster, sometimes Spring. It’s hard to really find a pattern. As such, I omitted the Spring results in the detailed analyses below. The best average timings are most of the times found when using four threads to execute the processes. This is probably due to having a quad-core machine. The best throughput numbers are most of the times found when using eight threads to execute the processes. I can only assume that is also has something to do with having a quad-core machine. When the number of threads in the threadpool go up, the throughput (processes executed / second) goes up, both it has a negative effect on the average time. Certainly with more than six or seven threads, you see this effect very clear. This basically means that while the processes on itself take a little longer to execute, but due to the multiple threads you can execute more of these ‘slower’ processes in the same amount of time. Enabling history does have an impact. Often, enabling history will double execution time. This is logical, given that many extra records are inserted when history is on the default level (ie. ‘audit’).There was one last test I ran, just out of curiosity: running the best performing setting on an Oracle XE 11.2 database. The Oracle XE is a free version of the ‘real’ Oracle database. No matter how hard, I tried, I couldn’t get it decently running on Ubuntu. As such, I used an old Windows XP install on that same machine. However, the OS is 32 bit, wich means the system only has 3.2 of the 8Gb of RAM available. Here are the results:Activiti 5.10 – Oracle on Windows – default – history disabledThe results speak for itself. Oracle blows away any of the (single-threaded) results on MySQL (and they are already very fast!). However, when going multi-threaded it is far worse than any of the MySQL results. My guess is that these are due to the limitations of the XE version: only one CPU is used, only 1 GB of RAM, etc. I would really like to run these test on a real Oracle-managed-by-a-real-DBA … Feel free to contact me if you are interested! In the next sections, we will take a detailed look into the performance numbers of each of the test processes. An Excel sheet containing all the the numbers and charts below can be downloaded for yourself. Process 1: The bare micromum (one transaction)The first process is not a very interesting one, business-wise at least. After starting the process, the end is immediately reached. Not very useful on itself, but its numbers learn us one essential thing: the bare overhead of the Activiti engine. Here are the average timings:This process runs in a single transaction, which means that nothing is saved to the database when the history is disabled due to Activiti’s optimizations. With history enabled, you’ll basically get the cost for inserting one row into the historical process instance table, which is around 4.44 ms here. It is also clear that our fix for Activiti 5.10 has an enormous impact here. In the previous version, 99% of the time was spent in the cleanup check of the process. Take a look at the best result here: 0.47 ms when using 4 threads to execute 2500 runs of this process. That’s only half a millisecond! It’s fair to say that the Activiti engine overhead is extremely small. The throughput numbers are equally impressive:In the best case here, 8741 processes are executed. Per second. By the time you arrive here reading the post, you could have executed a few millions of this process . You can also see that there is little difference between 4 or 8 threads here. Most of the execution time here is cpu time, and no potential collisions such as waiting for a database lock happens here. In these numbers, you can also easily see that the Oracle XE doesn’t scale well with multiple threads (which is explained above). You will see the same behavior in the following results. Process 2: The same, but a bit longer (one transaction)This process is pretty similar to the previous one. We have again only one transaction. After the process is started, we pass through seven no-op passthrough activities before reaching the end.Some things to note here:The best result (again 4 threads, with history disabled) is actually better than the simpler previous process. But also note that the single threaded execution is a tad slower. This means that the process on itself is a bit slower, which is logical as is has more activities. But using more threads and having more activities in the process does allow for more potential interleaving. In the previous case, the thread was barely born before it was killed again. The difference between history enabled/disabled is bigger than the previous process. This is logical, as more history is written here (for each activity one record in the database). Again, Activiti 5.10 is far more superior to Activiti 5.9.The throughput numbers follow these observations: there is more opportunity to use threading here. The best result lingers around 12000 process execution per second. Again, it demonstrates the very lightweight execution of the Activiti engine.Process 3: Parallelism in one transactionThis process executes a parallel gateway that forks and one that joins in the same transaction. You would expect something along the lines of the previous results, but you’d be surprised:Comparing these numbers with the previous process, you see that execution is slower. So why is this process slower, even if it has less activities? The reason lies with how the parallel gateway is implemented, especially the join behavior. The hard part, implementation-wise, is that you need to cope with the situation when multiple executions arrive at the join. To make sure that the behavior is atomic, we internally do some locking and fetch all child executions in the execution tree to find out whether the join activates or not. So it is quite a ‘costly’ operation, compared to the ‘regular’ activities. Do mind, we’re talking here about only 5 ms single threaded and 3.59 ms in the best case for MySQL. Given the functionality that is required for implementing the parallel gateway functionality, this is peanuts if you’d ask me. The throughput numbers:This is the first process which actually contains some ‘logic’. In the best case above, it means 1112 processes can be executed in a second. Pretty impressive, if you’d ask me!. Process 4: Now we’re getting somewhere (one transaction)This process already looks like something you’d see when modeling real business processes. We’re still running it in one database transaction though, as all the activities are automatic passthroughs. Here we also have two forks and two joins.Take a look at the lowest number: 6.88 ms on Oracle when running with one thread. That’s freaking fast, taking in account all that is happening here. The history numbers are at least doubled here (Activiti 5.10), which makes sense because there is quite a bit of activity audit logging going on here. You can also see that this causes to have a higher average time for four threads here, which is probably due to the implementation of the joining. If you know a bit about Activiti internals, you’ll understand this means there are quite a bit of executions in the execution tree. We have one big concurrent root, but also multiple children which are sometimes also concurrent roots. But while the average time rises, the throughput definitely benefits:Running this process with eight threads, allows you to do 411 runs of this process in a single second. There is also something peculiar here: the Oracle database performs better with more thread concurrency. This is completely contrary with all other measurements, where Oracle is always slower in that environment (see above for explanation). I assume it has something to do with the internal locking and forced update we are applying when forking/joining, which is better handled by Oracle it seems. Process 5: Adding some Java logic (single transaction)I added this process to see the influence of adding a Java service task in a process. In this process, the first activity generates a random value, stores it as a process variable and then goes up or down in the process depending on the random value. The chance is about 50/50 to go up or down.The average timings are very very good. Actually, the results are in the same range as those of process 1 and 2 above (which had no activities or only automatic passthroughs). This means that the overhead of integrating Java logic into your process is nearly non-existant (nothing is of course for free). Of course, you can still write slow code in that logic, but you can’t blame the Activiti engine for that Throughput numbers are comparable to those of process 1 and 2: very, very high. In the best case here, more than 9000 processes are executed per second. That indeed also means 9000 invocations of your own Java logic.Process 6, 7 and 8: adding wait states and transactions The previous processes demonstrated us the bare overhead of the Activiti engine. Here, we’ll take a look at how wait states and multiple transactions have influence on performance. For this, I added three test processes which contain user tasks. For each user task, the engine commits the current transaction and returns the thread to the client. Since the results are pretty much compatible for these processes, we’re grouping them here. These are the processes:Here are the average timings results, in order of the processes above. For the first process, containing just one user task:It is clear that having wait states and multiple transaction does have influence on the performance. This is also logical: before, the engine could optimize by not inserting the runtime state into the database, because the process was finished in one transaction. Now, the whole state, meaning the pointers to where you are currently, need to be saved into the database. The process could be ‘sleeping’ like this for many days, months, years now …. The Activiti engine doesn’t hold it into memory now anymore, and it is freed to give its full attention to other processes. If you check the results of the process with only one user task, you can see that in the best case (Oracle, single thread – the 4 threads on MySQL is pretty close) this is done in 6.27ms. This is really fast, if you take in account we have a few inserts (the execution tree, the task), a few updates (the execution tree) and deletes (cleaning up) going on here. The second process here, with 7 user tasks:The second chart learns us that logically, more transactions means more time. In the best case here the process is done in 32.12 ms. That is for seven transactions, which gives 4.6 ms for each transactions. So it is clear that average time scales in a linearly way when adding wait states. This makes of course sense, because transactions aren’t free. Also note that enabling history does add quite some overhead here. This is due to having the history level set to ‘audit’, which stores all the user task information in the history tables. This is also noticeable from the difference between Activiti 5.9 with history disabled and Activiti 5.10 with history enabled: this is a rare case where Activiti 5.10 with history enabled is slower than 5.9 with history disabled. But it is logical, given the volume of history stored here. And the third process learns us how user tasks and parallel gateways interact:The third chart learns us not much new. We have two user tasks now, and the more ‘expensive’ fork/join (see above). The average timings are how we expected them. The throughput charts are as you would expect given the average timings. Between 70 and 250 processes per second. Aw yeah! To save some space, you’ll need to click them to enlarge:Process 9: So what about scopes?For the last process, we’ll take a look at ‘scopes’. A ‘scope’ is how we call it internally in the engine, and it has to do with variable visibility, relationships between the pointers indicating process state, event catching, etc. BPMN 2.0 has quite some cases for those scopes, for example with embedded subprocesses as shown in the process here. Basically, every subprocess can have boundary events (catching an error, a message, etc) that only are applied on its internal activities when it’s scope is active. Without going into too much technical details: to get scopes implemented in the correct way, you need some not so trivial logic. The example process here has 4 subprocesses, nested in each other. The inner process is using concurrency, which is a scope on itself again for the Activiti engine. There are also two user tasks here, so that means two transactions. So let’s see how it performs:You can clearly see the big difference between Activiti 5.9 and 5.10. Scopes are indeed an area where the fixes around the ‘process cleanup’ at the end have a huge benefit, as many execution objects are created and persisted to represent the many different scopes. Single threaded performance is not so good on Activiti 5.9. Luckily, as you can see from the gap between the blue and the red bars, those scopes do allow for high concurrency. The numbers of Oracle, combined with the multi-threaded results of the 5.10 tests, do prove that scopes are now efficiently handled by the engine. The throughput charts prove that the process nicely scales with more threads, as you can see by the big gap between the red and green line in the second last block. In the best case, 64 processes of this more complex process are handled by the engine.Random execution If you have already clicked on the full reports at the beginning of the post, you probably have noticed also random execution is tested for each environment. In this setting, 2500 process executions were done, both the process was randomly chosen. As shown in those reports this meant that over 2500 executions, each process was executed almost the same number of times (normal distribution). This last chart shows the best setting (Activiti 5.10, history disabled) and how the throughput of those random process executions goes when adding more threads:As we’ve seen in many of the test above, once passed four threads things don’t change that much anymore. The numbers (167 processes/second) prove that in a realistic situation (ie. multiple processes executing at the same time), the Activiti engine nicely scales up. Conclusion The average timing charts show two things clearly:The Activiti engine is fast and overhead is minimal! The difference between history enabled or disabled is definitely noticeably. Sometimes it comes even down to half the time needed. All history tests were done using the ‘audit’ level, but there is a simpler history level (‘activity’) which might be good enough for the use case. Activiti is very flexible in history configuration, and you can tweak the history level for each process specifically. So do think about the level your process needs to have, if it needs to have history at all!The throughput charts prove that the engine scales very well when more threads are available (ie. any modern application server). Activiti is well designed to be used in high-throughput and availability (clustered) architectures. As I said in the introduction, the numbers are what they are: just numbers. My main point which I want to conclude here, is that the Activiti engine is extremely lightweight. The overhead of using Activiti for automating your business processes is small. In general, if you need to automate your business processes or workflows, you want top-notch integration with any Java system and you like all of that fast and scalable … look no further! Reference: The Activiti Performance Showdown from our JCG partner Joram Barrez at the Small steps with big feet blog....

Demeter Law

Hello, how are you?Let us talk today about the Demeter Law. It is a pattern of Object Orientation that helps us to lower our coupling, decrease our maintenance impact and the raise adaptability of our systems.What is utility for those “weird” words? If you have to do any maintenance in your application, it will have a lesser impact; your classes will only know the classes that it should know, and your code changes will be quicker and with less impact in your system.Just advantages in the Demeter law, right? Let us take it easy; in the end of this post we will see the disadvantage of this approach.If you want to see another post about OO just click in the link: “Tell, do not ask!”Take a look in the code bellow, what could we do to make it better? package com;public class Main {public static void main(String[] args) { IAddress address = new Address(); address.setName("01"); address.setZipCode("000001");IHouse house = new House(); house.setAddress(address);IPerson person = new Person(); person.setHouse(house);// Print the person zip code System.out.println(person.getHouse().getAddress().getZipCode()); } } The code above will run as expected; we are coding to interface, our code is wellindented and well formatted. What could we do to “upgrade” our code?The Demeter law says that a class may not know more then one friendly class. WHAT? Let us take small steps and analyze the code above; notice that our Main class wants to print the Person ZipCode, but to do this the Main class get to know two more classes. If you did not noticed, there is a coupling there.To print de ZipCode our class Main is going through the Person, House and finally Address class. What is this a bad approaching? Imagine if out Annalist decide to remove our Address class from the system and the House class will be responsible to keep the ZipCode.In our cod ewill be very easy to change; but imagine now if we had a huge system with the ZipCode printed for more than 100 code lines. You would have to change 100 lines of codes at your system.The Demeter law came to help us with this kind of situation, with a little change in our Person and House classes; we can avoid this huge impact when we remove the Address class. Take a look in our new code. package com;public class Main {public static void main(String[] args) { IAddress address = new Address(); address.setName("01"); address.setZipCode("000001");IHouse house = new House(); house.setAddress(address);IPerson person = new Person(); person.setHouse(house);// Print the person zip code System.out.println(person.getZipCode()); } } package com;public interface IPerson {void setHouse(IHouse house);IHouse getHouse();String getZipCode();} package com;public class Person implements IPerson { private IHouse house;@Override public void setHouse(IHouse house) { this.house = house; }@Override public IHouse getHouse() { return house; }@Override public String getZipCode() { return house.getZipCode(); } } package com;public interface IHouse {void setAddress(IAddress address);IAddress getAddress();String getZipCode();} package com;public class House implements IHouse {private IAddress address;@Override public void setAddress(IAddress address) { this.address = address; }@Override public IAddress getAddress() { return address; }@Override public String getZipCode() { return address.getZipCode(); } } package com;public interface IAddress {void setName(String string);void setZipCode(String string);String getZipCode();public abstract String getName();} package com;public class Address implements IAddress {private String name; private String zipCode;@Override public void setName(String name) { this.name = name; }@Override public void setZipCode(String zipCode) { this.zipCode = zipCode; }@Override public String getName() { return name; }@Override public String getZipCode() { return zipCode; } } Look at our new code, think now where you will need to change if you need to remove the Address class. Only the Home class will be edited, the rest of our code will remain the same.This is the greatest advantage of the Demeter law. When you have to do maintenance your project will have a small impact. The new features will be easily adapted, simpler code editions, and with a small cost. In code that we saw today only one class would be impacted. The other classes of your system would remain the same and your system will have a small coupling.The disadvantage of this approach is an impact on the system performance. You may have a low performance if you use this approach in loops like “While, For, …” .In this case you will have to see which code of your system will not have the performance impacted with the Demeter law.I believe that even with this disadvantage in the performance in some code pieces this approach is useful and worth of use it in our systems; Demeter law could be used in almost all code of our system.I hope this post might help youIf you have any doubt or question just post it.See you soon! \o_ Reference: Demeter Law from our JCG partner Hebert Coelho at the uaiHebert blog....

MongoDB with Java Kickstart

NoSQL databases due to their scalability are becoming increasingly popular. When used appropriately NoSQL databases can offer real benefits. MongoDB is such a highly scalable opensource NoSQL database written in C++. 1. Installing MongoDB Without much of a trouble you can install MongoDB using the instructions given in the official MongoDB site, according to whatever the OS you are using. 2. Starting the MongoDB server This is quite simple. Run the mongod.exe file inside bin folder(I am using windows OS here) to start the MongoDB server. By default the server will start on port 27017 and the data will be stored at /data/db directory which you’ll have to create during the installing process. 3. Starting MongoDB shell You can start the MongoBD shell by running the mongo.exe file. 4. Creating a database with MongoDB To create a database named ‘company’ using MongoDB type the following on MongoDB shell use company Mind that MangoDB will not create a database until you save something inside it. Use following command to view the available databases and that will show you that ‘company’ database hasn’t been created yet. show dbs; 5. Saving data in MongoDB Use following commands to save employee data to a collection called employees employee = {name : 'A', no : 1} db.employees.save(employee)To view the data inside the collection use following command, db.users.find();Do it with Java :) Following is a simple Java code which is doing the same thing we did above. You can get the mongo-java driver from here. Just go through the code, it’s very simple, hopefully you’ll get the idea. package com.eviac.blog.mongo;import java.net.UnknownHostException;import com.mongodb.BasicDBObject; import com.mongodb.DB; import com.mongodb.DBCollection; import com.mongodb.DBCursor; import com.mongodb.Mongo; import com.mongodb.MongoException;public class MongoDBClient {public static void main(String[] args) {try {Mongo mongo = new Mongo('localhost', 27017);DB db = mongo.getDB('company');DBCollection collection = db.getCollection('employees');BasicDBObject employee = new BasicDBObject(); employee.put('name', 'Hannah'); employee.put('no', 2);collection.insert(employee);BasicDBObject searchEmployee = new BasicDBObject(); searchEmployee.put('no', 2);DBCursor cursor = collection.find(searchEmployee);while (cursor.hasNext()) { System.out.println(cursor.next()); }System.out.println('The Search Query has Executed!');} catch (UnknownHostException e) { e.printStackTrace(); } catch (MongoException e) { e.printStackTrace(); }}}Result { '_id' : { '$oid' : '4fec74dc907cbe9445fd2d70'} , 'name' : 'Hannah' , 'no' : 2} The Search Query has Executed!Reference: MongoDB with Java from our JCG partner Pavithra Siriwardena at the EVIAC blog....

Apache Bigtop – Installing Hive, HBase and Pig

In the previous post we learnt how easy it was to install Hadoop with Apache Bigtop! We know its not just Hadoop and there are sub-projects around the table! So, lets have a look at how to install Hive, Hbase and Pig in this post. Before rowing your boat… Please follow the previous post and get ready with Hadoop installed! Follow the link for previous post: http://femgeekz.blogspot.in/2012/06/hadoop-hangover-introduction-to-apache.html also, the same can be found at DZone, developer site: http://www.dzone.com/links/hadoop_hangover_introduction_to_apache_bigtop_and.html All Set?? Great! Head On.. Make sure all the services of Hadoop are running. Namely, JobTracker, SecondaryNameNode, TaskTracker, DataNode and NameNode. [standalone mode] Hive with Bigtop: The steps here are almost the same as Installing Hive as a separate project. However, few steps are reduced. The Hadoop installed in the previous post is Release 1.0.1 We had installed Hadoop with the following command sudo apt-get install hadoop\* Step 1: Installing Hive We have installed Bigtop 0.3.0, and so issuing the following command installs all the hive components. ie. hive, hive-metastore, hive-server. The daemons names are different in Bigtop 0.3.0. sudo apt-get install hive\*This installs all the hive components. After installing, the scripts must be able to create /tmp and /usr/hive/warehouse and HDFS doesn’t allow these to be created while installing as it is unaware of the path to Java. So, create the directories if not created and grant the execute permissions. In the hadoop directory, ie. /usr/lib/hadoop/ bin/hadoop fs -mkdir /tmp bin/hadoop fs -mkdir /user/hive/warehouse bin/hadoop -chmod g+x /tmp bin/hadoop -chmod g+x /user/hive/warehouse Step 2: The alternative directories could be /var/run/hive and /var/lock/subsys sudo mkdir /var/run/hive sudo mkdir /var/lock/subsysStep 3: Start the hive server, a daemon sudo /etc/init.d/hive-server startImage:Step 4: Running Hive Go-to the directory /usr/lib/hive. See the Image below: bin/hiveStep 5: Operations on Hive Image:HBase with Bigtop: Installing Hbase is similar to Hive. Step 1: Installing HBase sudo apt-get install hbase\*Image:Step 2: Starting HMaster sudo service hbase-master startImage:Image:Step 3: Starting HBase shell hbase shell Image:Step 4: HBase Operations Image:Image:Pig with Bigtop: Installing Pig is similar too. Step 1: Installing Pig sudo apt-get install pigImage:Step 2: Moving a file to HDFS Image:Step 3: Installed Pig-0.9.2 Image:Step 4: Starting the grunt shell pigImage:Step 5: Pig Basic Operations Image:Image:We saw that is it possible to install the subprojects and work with Hadoop, with no issues. Apache Bigtop has its own spark! :) There is a release coming BIGTOP-0.4.0 which is supposedly to fix the following issues: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12318889&styleName=Html&projectId=12311420 Source and binary files: http://people.apache.org/~rvs/bigtop-0.4.0-incubating-RC0 Maven staging repo: https://repository.apache.org/content/repositories/orgapachebigtop-279 Bigtop’s KEYS file containing PGP keys we use to sign the release: http://svn.apache.org/repos/asf/incubator/bigtop/dist/KEYS Let us see how to install other sub-projects in the coming posts! Until then, Happy Learning! Reference: Hadoop Hangover: Introduction To Apache Bigtop and Installing Hive, HBase and Pig from our JCG partner Swathi V at the * Techie(S)pArK * blog....

JavaFX 2.0 Hello World

Before talking about the example itself, I want to show you how to create a JavaFX application in NetBeans. (If you haven´t installed JavaFX and NetBeans yet, please see my previous post Installing JavaFX 2.0 and NetBeans 7.7.1) Click on “New Project” in the “File” menu to open the project wizard. Then choose “JavaFX->JavaFX Application” and press “Next”.In the next dialog you can specify the name of your application and a destination folder, where it should be stored. If you have installed JavaFX correctly the “JavaFX Platform” should be specified already. Otherwise you can add the platform yourself by clicking on “Manage Platforms->Add Platform” and specifying the paths to your JavaFX installation.Note: By default the “Create Application Class” checkbox is checked. Please uncheck it because we´ll create our own application class. Click on “finish” to create your first JavaFX application. Hello World in JavaFX 2.0 – Example 1 Probably every single software developer knows the famous „HelloWorld“ example as it is often used to show the syntax of a (unknown) programming language and to give a first clue, of what the language looks like. I don´t want to break this tradition, so here are 2 different versions of a HelloWorld program in JavaFX 2.0. I´ll show the complete code first and then explain the individual parts. import javafx.application.Application; import javafx.event.ActionEvent; import javafx.event.EventHandler; import javafx.scene.Scene; import javafx.scene.control.Button; import javafx.scene.layout.StackPane; import javafx.stage.Stage;/** * * Created on: 17.03.2012 * @author Sebastian Damm */ public class HelloJavaFX extends Application { @Override public void start(Stage stage) throws Exception { Button bt = new Button('Print HelloWorld'); bt.setOnAction(new EventHandler<ActionEvent>() { @Override public void handle(ActionEvent arg0) { System.out.println('HelloWorld! :)'); } }); StackPane root = new StackPane(); Scene scene = new Scene(root, 300, 150); root.getChildren().add(bt);stage.setTitle('HelloWorld in JavaFX 2.0'); stage.setScene(scene); stage.show(); }public static void main(String[] args) { Application.launch(args); } } The first thing worth mentioning is that you have to extend from the Application class in order to create a working JavaFX application. This class provides several live-cycle methods and is the starting point for your application. It is an abstract class (which means, that you cannot instantiate it) with a single abstract method start, that you have to override. You are provided a stage object by the JavaFX runtime, which you can use to display your UI.To start your application you have to call the static method launch as seen in the main method in this example. After launching your application, it will call the start method. Here is the JavaDoc of the Application class, which shows the individual steps when starting a JavaFX application. The entry point for JavaFX applications is the Application class. The JavaFX runtime does the following, in order, whenever an application is launched: Constructs an instance of the specified Application classCalls the init() method Calls the start(javafx.stage.Stage) method Waits for the application to finish, which happens either when the last window has been closed, or the application calls Platform.exit() Calls the stop() methodLet´s start with the real source code inside the start method. First of all we create a simple Button and specify an action to be triggered when the button is clicked via the setOnAction method (compare JButton´s addActionListener). Next we create a StackPane object, which is one of the layout panes in JavaFX (One of the next blog posts will cover all different layout panes in JavaFX). I use a StackPane here, because it automatically takes all the available space provided by its surrounding parent and because it automatically centers its children inside. Note: The foundation of a JavaFX application is the Scene graph. Every single Node (which includes simple controls, groups and layout panes) is part of a hierarchical tree of nodes, which is called the Scene graph. The Scene graph and therefore your whole JavaFX application has always one single root node! As mentioned above, the start method has a Stage object parameter, which is provided by the JavaFX runtime. This Stage object is a kind of window. You have to give it a Scene object as its viewable content. You can create a Scene object by passing the root node of your application. Optional parameters are the width and the height of your scene as well as a Paint object, which includes simple colors and also complex color gradients. With root.getChildren().add(bt); you add the button to your root node, which is a stackpane. After that we set a title to the stage and apply the created scene object. Finally with the show method we tell the stage to show. (compare Swing´s setVisible Now your application should look like this:And it should print ‘HelloWorld’ to the command line if you click the button. Nothing spectacular yet, but it´s your first working JavaFX application, so congratulations! :) Hello World in JavaFX 2.0 – Example 2 Additionally a slightly changed example, which will show the text in the GUI. The code: import javafx.application.Application; import javafx.event.ActionEvent; import javafx.event.EventHandler; import javafx.scene.Group; import javafx.scene.Scene; import javafx.scene.control.Button; import javafx.scene.effect.DropShadow; import javafx.scene.paint.Color; import javafx.scene.text.Font; import javafx.scene.text.Text; import javafx.stage.Stage;/** * * Created on: 17.03.2012 * @author Sebastian Damm */ public class HelloJavaFX2 extends Application { @Override public void start(Stage stage) throws Exception { final Group root = new Group(); Scene scene = new Scene(root, 500, 200, Color.DODGERBLUE); final Text text = new Text(140, 120, 'Hello JavaFX 2.0!'); text.setFont(Font.font('Calibri', 35)); text.setFill(Color.WHITE); text.setEffect(new DropShadow()); Button bt = new Button('Show HelloWorld'); bt.setLayoutX(180); bt.setLayoutY(50); bt.setOnAction(new EventHandler<ActionEvent>() { @Override public void handle(ActionEvent arg0) { root.getChildren().add(text); } });root.getChildren().add(bt); stage.setTitle('HelloWorld in JavaFX 2.0'); stage.setScene(scene); stage.show(); }public static void main(String[] args) { Application.launch(args); } } Instead of using a layout pane we use a Group object here. Group is a subclass of Parent (which is a subclass of Node) and takes one or more children. A Group isn´t directly resizable and you can add transformations or effects to a Group which will then affect all children of the Group. (Note that we now also provided a Paint for the Scene.) Next we create a Text object. Because we have no layout pane, we specify the x and y coordinates directly. We specify a custom font, change the color to white and add a DropShadow. The Button also gets coordinates and instead of printing “HelloWorld” to the command line when we click the button, we add the created Text object to our root element (and therefore to the Scene Graph). After clicking the button, your application shoul look like this.Summary:A JavaFX Stage object is a kind of window and behaves similar to a JFrame or JDialog in Swing. A JavaFX Scene object is the viewable content of a Stage and has a single Parent root node. Node is one of the most important classes in JavaFX. Every control or layout pane is a kind of node. The Scene Graph is a hierarchical tree of nodes. It has one single root node and is the foundation of your application. It has to be passed to a Scene object In order to create and start a JavaFX application you have to complete the following steps:Extend the Application class Override the abstract start method Create a root node and add some elements to it Create a Scene object and pass the root node to it Apply the Scene to the Stage via setScene Tell the Stage to show with the show method Call Application.launch in your main methodReference: Hello World in JavaFX 2.0 from our JCG partner Sebastian Damm at the Just my 2 cents about Java blog....

All you need to know about QuickSort

It would be true to say that Quicksort is one of the most popular sorting algorithms. You can find it implemented on the most of the languages and it is present in almost any core library. In Java and Go Quicksort is default sorting algorithm for some data types and it is used in in C++ STL ( Introsoft which is used there begins with Quicksort). Such popularity can be explained by the fact that on average, Quicksort is one of the fastest known sorting algorithms. Interestingly that complexity of Quicksort is not less than it is for other algorithms like MergeSort or HeapSort. The best case performance is O(nlogn) and on the worst case it gives O(n^2). Latter, luckily, is exceptional case for the proper implementation. Quicksort performance is gained by the main loop which tends to make excellent usage of CPU caches. Another reason of popularity is that it doesn’t need allocation of additional memory.Personally for me Quicksort appeared as one of the most complex sorting algorithms. The basic idea is pretty simple and usually takes just a few minutes to implement. But that version, of course, if not practically usable. When it comes to details and to efficiency, it is getting more and more complicated.Quicksort was first discovered by C.A.R. Hoare in 1962 (see “Quicksort,” Computer Journal 5, 1, 1962) and in following years algorithm slightly mutated. The most known version is Three-way Quicksort. The most comprehensive of widely known ones is Dual-Pivot Quicksort. Both algorithms will be covered in that post.  The Java language was used to implement all algorithms. That post do not pretend to make adequate performance analysis. Test data used for performance comparison is incomplete and used just to show certain optimization techniques. Also, algorithm implementations are not necessary optimal. Just keep that in mind while you are reading. Basics The basic version of Quicksort is pretty simple and can be implemented just in few lines of code: public static void basicQuickSort(long arr[], int beginIdx, int len) { if ( len <= 1 ) return; final int endIdx = beginIdx+len-1;// Pivot selection final int pivotPos = beginIdx+len/2; final long pivot = arr[pivotPos]; Utils.swap(arr, pivotPos, endIdx);// partitioning int p = beginIdx; for(int i = beginIdx; i != endIdx; ++i) { if ( arr[i] <= pivot ) { Utils.swap(arr, i, p++); } } Utils.swap(arr, p, endIdx);// recursive call basicQuickSort(arr, beginIdx, p-beginIdx); basicQuickSort(arr, p+1, endIdx-p); } The code looks pretty simple and easily readable. Pivot selection is trivial and doesn’t require any explanation. The partitioning process can be illustrated using following figure:pointer “i” moves from the beginning to the end on array (note, that the last element of the array is skipped – we know that it the pivot). If i-th element is “<= pivot” then i-th and p-th elements are swapped and “p” pointer is moved to the next element. When partitioning is finished array will look like this: Remember, that in the code, at the end of array there is element with pivot value, and that element is excluded from the pivoting loop. That element is put on p-th position, which makes p-th element included in “<= pivot” area. If you need more details, have a look at Wikipedia. There is pretty good explanation with lots of references. I would just emphasize your attention that algorithm consists of three main sections. These sections are pivot selection, partitioning and recursive call to sort partition. To make separation clearer the algorithm can be written down as: public static void basicQuickSort(long arr[], int beginIdx, int len) { if ( len <= 1 ) return; final int endIdx = beginIdx + len - 1; final int pivotIdx = getPivotIdx(arr, beginIdx, len); final long pivot = arr[pivotIdx];Utils.swap(arr, pivotIdx, endIdx); int p = partition(arr, beginIdx, len, pivot); Utils.swap(arr, p, endIdx);basicQuickSort(arr, beginIdx, p-beginIdx); basicQuickSort(arr, p+1, endIdx-p); }public static int partition(long[] arr, int beginIdx, int len, long pivot) { final int endIdx = beginIdx + len - 1; int p = beginIdx; for(int i = beginIdx; i != endIdx; ++i) { if ( arr[i] <= pivot ) { Utils.swap(arr, i, p++); } } return p; }public static int getPivotIdx(long arr[], int beginIdx, int len) { return beginIdx+len/2; } Now let’s have a look how it performs vs Java 1.6 sort algorithm. For the test I will generate array using following loop: static Random rnd = new Random(); private static long[] generateData() { long arr[] = new long[5000000]; for(int i = 0; i != arr.length; ++i) { arr[i] = rnd.nextInt(arr.length); } return arr; } Then I run each JDK 6 Arrays.sort() and basicQuickSort() for 30 times and took the average run time as the result. New set of random data was generated for each run. The result if that exercise is this:arr[i]=rnd.nextInt(arr.length)Java 6 Arrays.sort 1654msbasicQuickSort 1431msNot that bad. Now look what would happen if input data has some more repeated elements. To generated that data, I just divided nextInt() argument by 100:arr[i]=rnd.nextInt(arr.length) arr[i]=rnd.nextInt(arr.length/100)Java 6 Arrays.sort 1654ms 935msbasicQuickSort 1431ms 2570msNow that is very bad. Obviously that simple algorithm doesn’t behave well in such cases. It can be assumed that the problem is in the quality of the pivot. The worst possible pivot is the biggest or the smallest element of the array. In that case, algorithm would has O(n^2) complexity. Ideally pivot should be chosen such as it splits an array into two parts with equal sizes. It means that ideal pivot is the median on all values of given array. Practically that is not good idea – too slow. Therefore, usually, implementation uses median of 3-5 elements. The decision on the number of elements used for pivot can be based on the size of partitioned array. The code for the pivot selection may look like this: public static int getPivotIdx(long arr[], int beginIdx, int len) { if ( len <= 512 ) { int p1 = beginIdx; int p2 = beginIdx+(len>>>1); int p3 = beginIdx+len-1;if ( arr[p1] > arr[p2] ) { int tmp = p1; p1 = p2; p2 = tmp; } if ( arr[p2] > arr[p3] ) { p2 = p3; } if ( arr[p1] > arr[p2] ) { p2 = p1; }return p2; } else { int p1 = beginIdx+(len/4); int p2 = beginIdx+(len>>1); int p3 = beginIdx+(len-len/4); int p4 = beginIdx; int p5 = beginIdx+len-1;if ( arr[p1] > arr[p2] ) { int tmp = p1; p1 = p2; p2 = tmp; } if ( arr[p2] > arr[p3] ) { int tmp = p2; p2 = p3; p3 = tmp; } if ( arr[p1] > arr[p2] ) { int tmp = p1; p1 = p2; p2 = tmp; } if ( arr[p3] > arr[p4] ) { int tmp = p3; p3 = p4; p4 = tmp; } if ( arr[p2] > arr[p3] ) { int tmp = p2; p2 = p3; p3 = tmp; } if ( arr[p1] > arr[p2] ) { p2 = p1; } if ( arr[p4] > arr[p5] ) { p4 = p5; } if ( arr[p3] > arr[p4] ) { p3 = p4; } if ( arr[p2] > arr[p3] ) { p3 = p2; } return p3; } } Here are results after improvements in pivot selection strategy:arr[i]=rnd.nextInt(arr.length) arr[i]=rnd.nextInt(arr.length/100)Java 6 Arrays.sort 1654ms 935msbasicQuickSort 1431ms 2570msbasicQuickSort with ‘better’ pivot 1365ms 2482msUnfortunately, the improvement is almost nothing. It appeared that pivot selection is not the root cause of the problem. But still let’s keep it, it doesn’t harm, even helps a little bit. It also significantly reduce possibility of O(n^2) behaviour. Another suspect is the algorithm itself. It seems like it’s not good enough. Obviously it doesn’t perform well, when collection has repeated elements. Therefore something has to be changed.Three-way partitioningThe way to get around that problem is three-way-partitioning. As a result of such partitioning, elements which are equal to the pivot are put in the middle of the array. Elements which are bigger than pivot are put in the right side of the array and ones which are smaller on the left side, appropriately. Implementation of that partitioning method consists of two stages. In the first stage arrays is scanned by two pointers (“i” and “j”) which are approaching in opposite directions. Elements which are equals to pivot are moved to the ends of array:  It can be seen that after the first stage elements which are equal to the pivot are located on the edges of the array. On the second stage these elements are moved to the middle. That is now their final position and they can be be excluded from the further sorting: After implementation of such algorithm partitioning function is getting much more complicated. In that implementation the result of the partitioning is lengths of two bound partitions: public static long partition(long[] arr, int beginIdx, int endIdx, long pivot) { int i = beginIdx-1; int l = i; int j = endIdx+1; int r = j; while ( true ) { while(arr[++i] <> pivot){}if ( i >= j ) break;Utils.swap(arr, i, j); if ( arr[i] == pivot ) { Utils.swap(arr, i, ++l); } if ( arr[j] == pivot ) { Utils.swap(arr, j, --r); } } // if i == j then arr[i] == arr[j] == pivot if ( i == j ) { ++i; --j; }final int lLen = j-l; final int rLen = r-i;final int pLen = l-beginIdx; final int exchp = pLen > lLen ? lLen: pLen; int pidx = beginIdx; for(int s = 0; s <= exchp; ++s) { Utils.swap(arr, pidx++, j--); } final int qLen = endIdx-r; final int exchq = rLen > qLen ? qLen : rLen; int qidx = endIdx; for(int s = 0; s <= exchq; ++s) { Utils.swap(arr, qidx--, i++); }return (((long)lLen)<<32)|rlen; } The pivot selection has to be changed as well, but more just for convenience, the idea remains absolutely the same. Now it returns actual value of pivot, instead of index: public static long getPivot(long arr[], int beginIdx, int len) { if ( len <= 512 ) { long p1 = arr[beginIdx]; long p2 = arr[beginIdx+(len>>1)]; long p3 = arr[beginIdx+len-1];return getMedian(p1, p2, p3); } else { long p1 = arr[beginIdx+(len/4)]; long p2 = arr[beginIdx+(len>>1)]; long p3 = arr[beginIdx+(len-len/4)]; long p4 = arr[beginIdx]; long p5 = arr[beginIdx+len-1];return getMedian(p1, p2, p3, p4, p5); } } And here is the main method, which is slightly changed as well: public static void threeWayQuickSort(long[] arr, int beginIdx, int len) { if ( len < 2 ) return;final int endIdx = beginIdx+len-1; final long pivot = getPivot(arr, beginIdx, len); final long lengths = threeWayPartitioning(arr, beginIdx, endIdx, pivot);final int lLen = (int)(lengths>>32); final int rLen = (int)lengths;threeWayQuickSort(arr, beginIdx, lLen); threeWayQuickSort(arr, endIdx-rLen+1, rLen); } now let’s compare it with Java 6 sort:arr[i]=rnd.nextInt(arr.length) arr[i]=rnd.nextInt(arr.length/100)Java 6 Arrays.sort 1654ms 935msbasicQuickSort 1431ms 2570msbasicQuickSort with ‘better’ pivot 1365ms 2482msThree-way partitioning Quicksort 1330ms 829msHuh, impressive! It is faster than standard library, which, by he way, implements the same algorithm. To be honest I was surprised, when found that it is such an easy task to beat standard library. But what about making it even faster? There is one trick which always helps and it works for all sorting algorithms which work with consecutive memory. That trick is Insertion sort. Although is has big chance of O(n^2), it appears to be very very effective on the small arrays and always gives some performance improvements. Especially that is noticeable when input data is not sorted and there are not many repeated elements. All you need to do is just add it at the beginning of sorting method: public static void threeWayQuickSort(long[] arr, int beginIdx, int len) { if ( len < 2 ) return;if ( len < 17 ) { InsertionSort.sort(arr, beginIdx, len); return; }final int endIdx = beginIdx+len-1; final long pivot = getPivot(arr, beginIdx, len); final long lengths = threeWayPartitioning(arr, beginIdx, endIdx, pivot);final int lLen = (int)(lengths>>32); final int rLen = (int)lengths;threeWayQuickSort(arr, beginIdx, lLen); threeWayQuickSort(arr, endIdx-rLen+1, rLen); } and run the test again:arr[i]=rnd.nextInt(arr.length) arr[i]=rnd.nextInt(arr.length/100)Java 6 Arrays.sort 1654ms 935msbasicQuickSort 1431ms 2570msbasicQuickSort with ‘better’ pivot 1365ms 2482msThree-way partitioning Quicksort 1330ms 829msThree-way partitioning Quicksort with Insertion sort 1155ms 818msnow standard library looks just awful. It looks now that all is said and done. But it in reality that’s not the end of the story and there is something else to talk about. Dual-pivot Quicksort Moving forward, I found that Java 7 is much more advanced and performs much faster than Java 6 version and outperforms all previous tests:arr[i]=rnd.nextInt(arr.length) arr[i]=rnd.nextInt(arr.length/100)Java 6 Arrays.sort 1654ms 935msJava 7 Arrays.sort 951ms 764msbasicQuickSort 1431ms 2570msbasicQuickSort with ‘better’ pivot 1365ms 2482msThree-way partitioning Quicksort 1330ms 829msThree-way partitioning Quicksort with Insertion sort 1155ms 818msAfter several seconds of very exciting research study it was found that Java 7 uses new version of Quicksort algorithm which was discovered just in 2009 by Vladimir Yaroslavskiy and named Dual-Pivot QuickSort. Interestingly that after some search in internet, I have found algorithm called “Multiple pivot sorting” which was published in 2007. It seems like generic case of “Dual-Pivot QuickSort” where is possible to have any number of pivots. As you may notice from the name, the main difference of that algorithm is that it is using two pivots, instead of one. Coding now is getting even more complicated. The simplest version of that algorithm may look like this: public static void dualPivotQuicksort(long arr[], int beginIdx, int len) { if ( len < 2 ) return;final int endIdx = beginIdx+len-1;long pivot1 = arr[beginIdx]; long pivot2 = arr[endIdx];if ( pivot1 == pivot2 ) { final long lengths = QuickSort.threeWayPartitioning(arr, beginIdx, endIdx, pivot1); final int lLen = (int)(lengths>>32); final int rLen = (int)lengths;dualPivotQuicksort3(arr, beginIdx, lLen); dualPivotQuicksort3(arr, endIdx-rLen+1, rLen); } else { if ( pivot1 > pivot2 ) { long tmp = pivot1; pivot1 = pivot2; pivot2 = tmp; Utils.swap(arr, beginIdx, endIdx); }int l = beginIdx; int r = endIdx; int p = beginIdx;while ( p <= r ) { if ( arr[p] < pivot1 ) { Utils.swap(arr, l++, p++); } else if ( arr[p] > pivot2 ) { while ( arr[r] > pivot2 && r > p ) { --r; } Utils.swap(arr, r--, p); } else { ++p; } } if ( arr[l] == pivot1 ) ++l; if ( arr[r] == pivot2 ) --r;dualPivotQuicksort3(arr, beginIdx, l-beginIdx); dualPivotQuicksort3(arr, l, r-l+1); dualPivotQuicksort3(arr, r+1, endIdx-r); } } First code picks up two pivots. If pivots are the same, it means we have just one pivot and in that case we can used three-way method for partitioning. If pivots are different, then partitioning process will look like this:Scanning pointer “p” is moving from the beginning of array. If current element is “<> pivot1″, then r-th element is swapped with p-th and “r” pointer is moved to next element backwards. All stops when “p” becomes less than “r”. After partitioning, array will look like this:When partition is finished, algorithms is called recursively for each partition.Reader shouldn’t expect good performance from the provided code, it is not fast and performs even worse than Java 6 Arrays.sort. I was provided just to illustrate the concept.To be honest I failed to make my implementation to perform any better than version from Java 7. I must admit, that Yaroslav made a very good job there. Therefore I do not think that there is any sense in discussing my implementation here in details.But, if someone wants to challenge Java 7 version I can point to some direction for optimizations. Firstly, which is seems obvious? is pivot selection. Another easy improvement is Insertions sort at the beginning. Also, I have noticed that that algorithm is very sensitive to inlining, so there is sense to inline Utils.swap(). As other option, you can decided to go thought the middle partition and move elements equals to pivot1 or pivot2 to their final positions which will exclide them from the further sorting. I found that it is effective for relatively (<=512 elements) small arrays. You can also have a look at source from the Java 7 and try to implement some tricks from there. Be ready to spend a lot of time :) All in all, it can be seen that over the years sorting is getting better and better. And that statement doesn’t only relate to Quicksort. Other sorting algorithms are improving as well. As examples can be considered Introsoft or Timsort. However, it would be true to say that nothing really new was discovered in that area since 1960s-1980s. Hopefully we will be lucky enough to see something completely new and radical in the future.For ones who want to dig deeper, as the starting point, I would suggest to visit following links:Quicksort Wikipedia article Dual-Pivot QuickSort Quicksort Is Optimal presentation by Robert Sedgewick & Jon Bentley MIT lecture about quicksortReference: All you need to know about QuickSort from our JCG partner Stanislav Kobylansky at the Stas’s blog blog....

Impressive first Apache Camel release

In preparation for the CamelOne conference next week, I took time to look back in the history of the Apache Camel project. So among others I had a look at the first official 1.0 release of Apache Camel.Apache Camel 1.0 – 5 years agoThe more I looked the more impressed I am with the fact of this release. Now you have to consider this was done 5 years ago, and in this release the Camel founders had already in the DNA of the projectJava DSL XML DSL (using Spring) OSGi on the roadmap camel-core JAR of 660kb 18 external components (+ whats in camel-core) 2 working examples full website with documentation included, incl FAQs Project logo and box The Camel Maven plugin to easily run Camel and its examples Test KitBelow is a screenshot of the tar ball distribution, of this release:Camel 1.0 distribution (hint the OSGi ambitions in the pom.xml)When you hear James talk about the past and how he created Camel, then his ambitions was that Camel should not restrain you. If you want to use Java and not XML then fine. If you are on the Spring XML wagon, then fine. If you are into Groovy, then fine, if you want to use Ruby, then hell yeah (Ruby supported was added in Camel 1.3). Lets take a look down the lane of the DSLs. Apache Camel is most likely the first integration project that offers multiple language DSLs out of the box in its very first release. It is simply in the projects DNA, and what makes IMHO Apache Camel stand out from the rest – The diverse and vibrant community and the DNA of the Camel project embracing “no shoe fits all”. So lets take a look at this example with the Java DSL. Todays people using the latest Camel release, eg 2.9.2 should instant be familiar with the DSL – Something just works from the very beginning!Java DSL in Camel 1.0And a sample of the XML DSL, which you can see in the source code as well.XML DSL in Camel 1.0And in this first release we also have the excellent Test Kit, for example notice the usage of mocks and setting up expectations in the screenshot below. Testing Camel was made easy from day-1. Yes its in the DNA of the Camel project.Camel Test Kit already in Camel 1.0And notice from the unit test above, the reference to the founders of Apache Camel.James Strachan Rob Davies Hiram Chirino Guillaume NodetThanks guys for creating this marvelous project. Impressive first release, you guys did 5 years ago. I will end this blog by running the camel-example-spring from the Apache Camel 1.0 release.  $cd examples $cd camel-example-spring $mvn compile $mvn camel:run Now you should have patience as Maven is downloading ancient JARs that are 5 years old. So it takes a while :)Camel 1.0 example runningThe screenshot above shows the Camel 1.0 example running. This example kicks off by consuming messages from a JMS queue and write those to a file. So we need to connect with jconsole, to send a message. I have highlighted the service url to use in jconsole.jconsole to send a message – Camel 1.0 rocksIn jconsole we expand the tree and find the test queue, and invoke the sentTextMessage operation with the text “Camel 1.0 rocks”. In the 2nd screenshot above, you may notice in the last line from the console, it says “Received Exchange”. This is Camel logging this, as the example uses the following route shown in the screenshot in the top of this blog. We can then see the file was written to the test directory as well, where we can see the file name is the message id, and the file content is what we sent from jconsole:This was 5 years ago, so lets fast forward to today. The last release of Apache Camel is 2.9.2, so lets migrate the old example to use this version instead. To do that you need to:Adjust the pom.xml to use Camel 2.9.2 and the camel-activemq component has been moved from Camel to ActiveMQ so you need to include that. And for logging we now use slf4j. The modified pom.xml is shown belowUpgrading the example from Camel 1.0 to 2.9.2, adjusting the pom.xml fileIn the Spring XML file you need to change the namespace of Camel, as when Camel graduated to become an Apache Top Level Project, the namespace was migrated from activemq to camel. Also we upgrade to use Spring 3.0 in the XSD. And the activemq component is now from ActiveMQ and not Camel. And the packages attribute is now a xml tag, so you need to use <packlage> in the <camelContext>. The updated file is shown below:Upgrading Spring XML from Camel 1.0 to Camel 2.9.2Okay we are now ready to go. There is no need for changes in the Java source code!!!The example migrated from Camel 1.0 to 2.9.2 without any Java code changes!!!And like before we use JConsole to send a text message. I must say James and the founders hit it in Camel 1.0 release, the DSL from the example is fully compatible with todays Camel release. Indeed a very impressive first release. Camel was off to a great start, and the project has grown from strength to strength ever since. Reference: Looking at the impressive first Apache Camel release from our JCG partner Claus Ibsen at the Claus Ibsen riding the Apache Camel blog....

ANTLR: Getting Started

This post drives you towards the basics of ANTLR. Previously, we had learnt about setting up of ANTLR as an external tool.RECAP! It’s here: ANTLR External Tool :) So, here we go…. What is ANTLR? • ANother Tool for Language Recognition, is a language tool that provides a framework for constructing recognizers, interpreters, compilers, and translators from grammatical descriptions containing actions. What can be the target languages?  • Action Script, Ada • C • C#; C#2 • C#3 • D • Emacs ELisp • Objective C • Java • Java Script • Python • Ruby • Perl6 • Perl • PHP • Oberon • Scala   What does ANTLR support? • Tree construction • Error recovery • Error handling • Tree walking • Translation   What environment does it support?  ANTLRWorks is the IDE for ANTLR. It is the graphical grammar editor and debugger, written by Jean Bovet using Swing.What for ANTLR can be used? • “”REAL”” programming languages • domain-specific languages [DSL]   Who is using ANTLR? • Programming languages :Boo, Groovy, Mantra, Nemerle, XRuby etc. • Other Tools: HIbernate, Intellij IDEA, Jazillian, JBoss Rules, Keynote(Apple), WebLogic(Oracle) etc.   Where is that you can look for ANTLR? You can always follow here http://www.antlr.org • to download ANTLR and ANTLRWorks, which are free and open source • docs,articles,wiki,mailing list,examples…. You can catch everything here! Row your Boat….  Basic terms• Lexer : converts a stream of characters to a stream of tokens. • Parser : processes of tokens, possibly creating AST • Abstract Syntax Tree(AST): an intermediate tree representation of the parsed input that is simpler to process than the stream of tokens. It can as well be processed multiple times. • Tree Parser: It processes an AST • String Template: a library that supports using templates with placeholders for outputting textGeneral Steps• Write Grammar in one or more files • Write string templates[optional] • Debug your grammar with ANTLRWorks • Generate classes from grammar • Write an application that uses generated classes • Feed the application text that conforms to the grammarA Bit Further…. Lets write a simple grammar which consists of • Lexer • Parser Lexer: Breaks the input stream into tokens Lets take the example of simple declaration type in C of the form “int a,b;” or “int a;” and same with float. As we see we can write lexer as follows: //TestLexer.g grammar TestLexer; ID : (‘a’..’z’|’A’..’Z’|’_’) (‘a’..’z’|’A’..’Z’|’0′..’9’|’_’|’.’|’a’..’z’|’A’..’Z’)*; COMMA: ‘,'; SEMICOLON:';'; DATATYPE: ‘int’ | ‘float';As we could see, these were the characters that were to be converted to tokens. So, now lets write some rules which processes these tokens generated and may it create a parse tree accordingly. //TestParser.g grammar TestParser; options {language : Java;} decl:DATATYPE ID (‘,’ ID)* ; Running ANTLR on the grammar just generates the lexer and parser,TestParser and TestLexer. To actually try the grammar on some input, we need a test rig with a main( ) method as follows: // Test.java import org.antlr.runtime.*; public class Test {public static void main(String[] args) throws Exception {// Create an input character stream from standard in ANTLRFileStream input = new ANTLRFileStream("input"); // give path to the file input // Create an ExprLexer that feeds from that stream TestLexer lexer = new TestLexer(input); // Create a stream of tokens fed by the lexer CommonTokenStream tokens = new CommonTokenStream(lexer); // Create a parser that feeds off the token stream TestParser parser = new TestParser(tokens); // Begin parsing at rule decl parser.decl(); }}We shall see how to create an AST and walk over the tree in the next blog post… Happy learning….! :) Reference: Getting Started With ANTLR:Basics from our JCG partner Swathi V at the * Techie(S)pArK * blog....

Spring 3.1 profiles and Tomcat configuration

Spring 3.1 introduced very useful feature called profiles. Thanks to that its easy to build one package that can be deployed in all environments (development, test, production and so on). By defining system property spring.profiles.active Spring allows us to create different beans depending on active profile name using XML configuration or @Profile annotation. As we all know system properties can be used in Spring XML files and we will take advantage of that. In this post I will show how to use Spring profiles to create one package for all environments and how to run it on Apache Tomcat. Example architecture I think the most common and wanted architecture is when applications deployed on dev, test and production differ only in used properties file containing configuration. WAR contains configuration for all environments and correct one is chosen during runtime. So it is the best if in application resources we have files like: src main resources - config_dev.properties - config_production.properties ...Configuring Spring property placeholder In order to load properties files in Spring we use <context:property-placeholder /> or @PropertySource annotation. In my example I will follow XML configuration approach for loading properties file: <?xml version='1.0' encoding='UTF-8'?> <beans xmlns='http://www.springframework.org/schema/beans' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns:context='http://www.springframework.org/schema/context' xsi:schemaLocation='http://www.springframework.org/schema/beanshttp://www.springframework.org/schema/beans/spring-beans-3.1.xsdhttp://www.springframework.org/schema/contexthttp://www.springframework.org/schema/context/spring-context-3.1.xsd'><context:property-placeholder location='classpath:config_${spring.profiles.active}.properties' /></beans>Configuring Tomcat Now its time to tell Tomcat which profile is active. There at least ways to do that:defining context param in web.xml – that breaks “one package for all environments” statement. I don’t recommend that defining system property -Dspring.profiles.active=your-active-profileI believe that defining system property is much better approach. So how to define system property for Tomcat? Over the internet i could find a lot of advices like “modify catalina.sh” because you will not find any configuration file for doing stuff like that. Modifying catalina.sh is a dirty unmaintable solution. There is a better way to do that. Just create file setenv.sh in Tomcat’s bin directory with content: JAVA_OPTS='$JAVA_OPTS -Dspring.profiles.active=dev' and it will be loaded automatically during running catalina.sh start or run.Conclusion Using Spring profiles we can create flexible applications that can be deployed in several environments. How is it different from Maven profiles approach? With Maven a person who was building application had to define in which environment it was supposed to run. With approach described above environment decides if its development, testing or production. Thanks to that we can use exactly the same WAR file and deploy it everywhere. Reference: Spring 3.1 profiles and Tomcat configuration from our JCG partner Maciej Walkowiak at the Software Development Journey blog....
Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy | Contact
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
Do you want to know how to develop your skillset and become a ...
Java Rockstar?

Subscribe to our newsletter to start Rocking right now!

To get you started we give you two of our best selling eBooks for FREE!

Get ready to Rock!
You can download the complementary eBooks using the links below: