Word Count MapReduce with Akka

In my ongoing work with Akka, I recently wrote a word count MapReduce example. This example implements the MapReduce model, which is a very good fit for a scale-out design approach.

Flow

- The client system (FileReadActor) reads a text file and sends each line of text as a message to the ClientActor.
- The ClientActor holds a reference to the remote actor (WCMapReduceActor) and passes the message on to it.
- The server (WCMapReduceActor) receives the message. The actor uses the PriorityMailBox to decide the priority of each message and orders the queue accordingly. In this case, the PriorityMailBox separates the MapReduce requests from the DISPLAY_LIST message, which asks the aggregate actor for the list of results.
- The WCMapReduceActor sends the messages to the MapActor (which uses a RoundRobinRouter dispatcher) for mapping the words.
- After the words are mapped, the message is sent to the ReduceActor (which also uses a RoundRobinRouter dispatcher) for reducing the words.
- The reduced results are sent to the AggregateActor, which does an in-memory aggregation of the result.

The following picture details how the program has been structured.

The code base for the program is available at the following location: https://github.com/write2munish/Akka-Essentials. For more information on MapReduce, read the post MapReduce for dummies.

Reference: Word Count MapReduce with Akka from our JCG partner Munish K Gupta at the Akka Essentials blog.
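To make the map, reduce and aggregate stages of the flow concrete, here is a minimal single-threaded sketch in plain Java. It is not the code from the Akka-Essentials repository and it uses no Akka API at all: the class name `WordCountSketch` and its method names are assumptions made for illustration, and each method stands in for the work one of the actors would do on a single message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the actor pipeline; no Akka involved.
final class WordCountSketch {

    /** Map stage: split one line into lower-cased words, one entry per occurrence. */
    static List<String> map(String line) {
        List<String> words = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    /** Reduce stage: collapse the mapped words of one line into per-word counts. */
    static Map<String, Integer> reduce(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    /** Aggregate stage: fold one reduced result into the running in-memory totals. */
    static void aggregate(Map<String, Integer> totals, Map<String, Integer> partial) {
        partial.forEach((word, count) -> totals.merge(word, count, Integer::sum));
    }

    /** Runs every line through map, reduce and aggregate in turn. */
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> totals = new HashMap<>();
        for (String line : lines) {
            aggregate(totals, reduce(map(line)));
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("the quick brown fox", "the lazy dog", "The end");
        System.out.println(countWords(lines));
    }
}
```

In the actor version each stage runs in its own actor and the lines are processed concurrently, which is why the aggregation step has to be able to merge partial results in any order, as `aggregate` does here.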

Frameworks vs Libraries as Inheritance vs Composition?

For quite some time inheritance was the dominant model of structuring programs in OO languages like Java. Very often it was used as a mechanism for reusing code: “common” functions were placed in an abstract class, so that subclasses could use them. However, this often proves to be very limiting, as you can only inherit from a single class. The code becomes constrained and tied to one particular framework-specific class. Not to mention testing: such base classes often depend on outside state, making tests hard to set up.

That’s why nowadays you hear time and again that you should prefer composition over inheritance (see for example this StackOverflow question). When using composition, you can leverage multiple reusable chunks of code and combine them in an arbitrary way. Using IoC/Dependency Injection also strongly favors composition.

I think the above inheritance-composition opposition strongly resembles the framework-library distinction. When using a framework, you are forced into a specific structure, where you must model your code in one specific way. Quite obviously it’s often hard or impossible to use two frameworks in one layer/module. That’s how hacks, ugly workarounds, reflection madness, etc. are born. Libraries, on the other hand (unless they are deliberately wrongly written), can be freely combined. Just like composition of classes, you can compose usage of many libraries in one module. Your code can be kept clean and only use the library functionality it really requires.

I’m not saying that frameworks are bad. Just like inheritance, they may be very useful if used in the correct places. However, next time you put your code into a framework, maybe it’s better to think twice: can this functionality be implemented using composition, with the help of a library? Won’t this make my code cleaner and more maintainable?

Reference: Frameworks vs Libraries as Inheritance vs Composition? from our W4G partner Adam Warski at the Blog of Adam Warski blog.
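As a small illustration of the composition side of this argument, the sketch below hands a class its collaborators through the constructor instead of having it inherit them from a base class. All names (`MessageFormatter`, `GreetingService`, `CompositionSketch`) are invented for this example and do not come from any framework or library.

```java
import java.util.function.Supplier;

/** A reusable piece of behaviour that can be given to any class that needs it. */
interface MessageFormatter {
    String format(String message);
}

/** Composition: the service receives its collaborators rather than inheriting them. */
final class GreetingService {
    private final MessageFormatter formatter;
    private final Supplier<String> userName;

    GreetingService(MessageFormatter formatter, Supplier<String> userName) {
        this.formatter = formatter;
        this.userName = userName;
    }

    String greet() {
        return formatter.format("Hello, " + userName.get());
    }
}

final class CompositionSketch {
    public static void main(String[] args) {
        // Any two implementations can be combined freely, much like two libraries.
        GreetingService service = new GreetingService(String::toUpperCase, () -> "world");
        System.out.println(service.greet()); // prints "HELLO, WORLD"
    }
}
```

Because the collaborators are plain constructor arguments, a test can pass in simple stand-ins without setting up any outside state, which is the testing benefit mentioned above.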

Runtime vs Compile-Time Classpath

This should really be a simple distinction, but I’ve been answering a slew of similar questions on Stack Overflow, and people often misunderstand the matter. So, what is a classpath? A set of all the classes (and jars with classes) that are required by your application. But there are two, or actually three, distinct classpaths:

- compile-time classpath. Contains the classes that you’ve added in your IDE (assuming you use an IDE) in order to compile your code. In other words, this is the classpath passed to “javac” (though you may be using another compiler).
- runtime classpath. Contains the classes that are used when your application is running. That’s the classpath passed to the “java” executable. In the case of web apps this is your WEB-INF/lib folder, plus any other jars provided by the application server/servlet container.
- test classpath. This is also a sort of runtime classpath, but it is used when you run tests. Tests do not run inside your application server/servlet container, so their classpath is a bit different.

Maven defines dependency scopes that are really useful for explaining the differences between the different types of classpaths. Read the short description of each scope.

Many people assume that if they successfully compiled the application with a given jar file present, the application will run fine. But it doesn’t follow: you need the same jars that you used to compile your application to be present on your runtime classpath as well. Well, not necessarily all of them, and not necessarily only them. A few examples:

- You compile the code with a given library on the compile-time classpath, but forget to add it to the runtime classpath. The JVM throws NoClassDefFoundError, which means that a class that was present when the code was compiled is missing at runtime. This error is a clear sign that you are missing a jar file on your runtime classpath that you have on your compile-time classpath. It is also possible that a jar you depend on in turn depends on a jar that you don’t have anywhere. That’s why libraries (must) have their dependencies declared, so that you know which jars to put on your runtime classpath.
- Containers (servlet containers, application servers) have some libraries built in. Normally you can’t override the built-in dependencies, and even when you can, it requires additional configuration. So, for example, you use Tomcat, which provides the servlet-api.jar. You compile your application with the servlet-api.jar on your compile-time classpath, so that you can use HttpServletRequest in your classes, but you do not include it in your WEB-INF/lib folder, because Tomcat will put its own jar on the runtime classpath. If you duplicate the dependency, you may get bizarre results, as classloaders get confused.
- A framework you are using (let’s say spring-mvc) relies on another library to do JSON serialization (usually Jackson). You don’t actually need Jackson on your compile-time classpath, because you are not referring to any of its classes, or even to the Spring classes that refer to them. But Spring needs Jackson internally, so the Jackson jar must be in WEB-INF/lib (runtime classpath) for JSON serialization to work.

The cases might be complicated even further when you consider compile-time constants and version mismatches, but the general point is this: the classpaths that you use for compiling and for running the application are different, and you should be aware of that.

Reference: Runtime Classpath vs Compile-Time Classpath from our JCG partner Bozhidar Bozhanov at the Bozho’s tech blog blog.
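One quick way to check whether a jar made it onto the runtime classpath is to ask the JVM to load one of its classes by name. The sketch below is only a diagnostic aid under that assumption: the class `ClasspathCheck` and the name `com.example.DoesNotExist` are hypothetical. Note that an explicit lookup like this fails with ClassNotFoundException, whereas NoClassDefFoundError is what you see when the JVM cannot resolve a class that compiled code refers to.

```java
// Hypothetical diagnostic helper; not part of any library.
final class ClasspathCheck {

    /** Returns true if the named class can be loaded from the runtime classpath. */
    static boolean isOnRuntimeClasspath(String className) {
        try {
            // initialize=false: only locate and load the class, do not run static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised for a missing dependency of the class
            return false;
        }
    }

    public static void main(String[] args) {
        // The classpath the "java" executable was actually started with
        System.out.println(System.getProperty("java.class.path"));
        System.out.println(isOnRuntimeClasspath("java.lang.String"));
        System.out.println(isOnRuntimeClasspath("com.example.DoesNotExist")); // made-up name
    }
}
```

Inside a servlet container the classloader hierarchy is more involved than in a plain `java` launch, so a check like this should be run from within the web application itself to be meaningful.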

TeamCity Build Dependencies

Introduction

The subject of build dependencies is neither a trivial nor a minor one. Various build tools approach this subject from different perspectives, contributing various solutions, each with its own strengths and weaknesses. Maven and Gradle users who are familiar with release and snapshot dependencies may not know about TeamCity snapshot dependencies, or may assume they’re somehow related to Maven (which isn’t true). TeamCity users who are familiar with artifact and snapshot dependencies may not know that adding an Artifactory plugin allows them to use artifact and build dependencies as well, on top of those provided by TeamCity.

Some of the names mentioned above seem not to be established enough, while others may require a discussion about their usage patterns. With this in mind I’ve decided to explore each solution in its own blog post, with the goal of providing enough information so that people can choose what works best. The first post explored Maven snapshot and release dependencies. This is the second post, which covers artifact and snapshot dependencies provided by TeamCity. The third and final part will cover the artifact and build dependencies provided by the TeamCity Artifactory plugin.

Non-Maven Dependencies

While Maven-based dependency management and artifact repositories are very common and widespread in Java, there are cases where you may still find them insufficient or inadequate for your needs. For starters, you may not be developing in Java, or perhaps your build tool does not provide built-in integration with Maven repositories, as is the case with Ant (or its Gant and NAnt spin-offs), SCons, Rake or MSBuild. Secondly, snapshot Maven dependencies come with their own set of challenges, covered in the previous blog post, which make it harder to ensure that the correct snapshot dependency is used in a chain of builds.
In order to address these scenarios, TeamCity provides two ways to connect dependent build configurations and their outcomes: artifact and snapshot dependencies.

TeamCity Artifact Dependencies

The idea of artifact dependencies in TeamCity is very simple: download the artifacts produced by another build before the current one begins. After the artifacts are downloaded to the specified folder (the checkout directory by default), your build script can use them to achieve its goals. You can find configuration details in the TeamCity documentation.

Naturally, this scheme is not suitable for build tools with automatic dependency management, but it works well with build or shell scripts that accept and expect local paths, relative to the checkout directory. Note that the copying works not only for the produced build binaries, but for any kind of binary or text files, like the TeamCity coverage report, as demonstrated on the screenshot above.

There is one important detail about specifying artifact dependencies, and that is the “Get artifacts from” setting, where you specify which type of build the files should be taken from. Possible values of this field are “last successful”, “finished”, “pinned”, or “tagged build”, as well as the build number or “Build from the same chain”. While most values should be trivial to understand, with “Last successful build” being the default and generally suitable option, the definition of a “same chain” build is directly related to TeamCity snapshot dependencies.

TeamCity Snapshot Dependencies

Imagine a monolithic multi-step build process (build, test, package, deploy) which you decide to split into multiple smaller builds, invoked sequentially, forming a chain of executions. Doing so allows one to configure or trigger every chain step separately and to run certain steps in parallel in order to speed up the process (like executing tests or building independent components). Most of all, it makes the overall maintenance significantly easier.
However, while doing so you need to ensure that every chain step uses the same consistent set of sources pulled from VCS, even if newer commits are made while the chain steps are running. That’s what TeamCity snapshot dependencies are for: they connect several build configurations into a single chain of execution, called a build chain, with every step using the same set of sources, regardless of VCS updates. Note that TeamCity’s use of the term “snapshot dependencies” may confuse people familiar with Maven snapshot dependencies; the two concepts are unrelated. Snapshot dependencies are configured similarly to artifact dependencies. You can find configuration details in the TeamCity documentation.

Using Artifact and Snapshot Dependencies Together

When applicable, it is recommended to define both kinds of dependencies between build configurations, as this ensures not only that a consistent set of sources is used throughout the chain steps, but also a consistent flow of the artifacts produced. Now the definition of “Build from the same chain” in the artifact dependency mentioned above becomes clear, as this is the only meaningful option in this scenario.

In a way, you can think of build chain steps as running in isolation from VCS updates after the first sources’ “snapshot” is taken. Chain artifacts are either re-created from the same sources or passed through chain steps with artifact dependencies. This makes chain steps consistent, reproducible and always up-to-date (with respect to the chain artifacts they use), something that can’t be easily achieved with Maven snapshot dependencies.

Build Chains Visibility in TeamCity 7.0

TeamCity 7.0 took the notion of build chains to a whole new level by giving build chains a new UI, making chain steps visible and re-runnable.
Once you have snapshot dependencies defined, a new “Build Chains” tab appears in project reports, providing a visual representation of all related build chains and a way to re-run any chain step manually, using the same set of sources pulled originally.

Build Chain Triggering

Connecting build configurations with snapshot dependencies, and thereby grouping their builds into build chains, not only makes them more consistent regarding the sources used; it also affects the way builds are added to the build queue. After a certain chain step is triggered, the default behavior is to add all preceding chain steps as well, keeping their respective order, in addition to the one that was triggered initially.

Let me repeat it for more clarity: triggering a certain chain configuration adds the preceding configurations (those to the left of it) and not the subsequent ones (to the right of it) to the build queue, although this may seem counterintuitive at first. The idea is to mark the location where chain execution stops, which is exactly the configuration that was triggered initially; it becomes the last execution step.

To trigger subsequent chain steps upon VCS changes found in a chain configuration, you can add a VCS trigger with the “Trigger on changes in snapshot dependencies” option to the configuration that should be the last execution step. This configuration is then triggered whenever any of the preceding chain steps is updated, which schedules the whole chain for execution.

With this behavior in mind, you need to decide which configurations are triggered automatically and which should be run manually. Usually, earlier chain steps that have no impact on the external environment can be triggered automatically by a VCS trigger, while the final chain steps, which potentially modify external systems, are invoked manually after a human has verified the previous chain results.
The process of running the final chain steps manually is usually referred to as “promoting” previously finished builds.

Sample Build Chain: Compile, Test, Deploy

Imagine three sample build configurations, "Compile", "Test" and "Deploy", connected into a build chain: "Deploy" is snapshot dependent on "Test", which is snapshot dependent on "Compile".

In this sample scenario the "Compile" and "Test" configurations are triggered automatically while "Deploy" is triggered manually, following the recommendations given above. VCS changes in the "Compile" configuration only trigger an execution of this chain step, while VCS changes in the "Test" configuration trigger "Compile" and "Test" execution (in that order). Once a "Compile" configuration is added to the build queue, its sources’ timestamp is recorded on the server to be used in all subsequent chain steps. If any of the chain steps is connected to a different VCS root, its sources are also pulled according to the same timestamp.

Promoting Finished Builds

As soon as the automatic chain execution stops (after running "Test"), you can continue it by clicking the corresponding “Run” button on the "Deploy" configuration that was not triggered (see the build chain screenshot above). Alternatively, it is possible to promote a finished "Test" build through its “Build Actions” and invoke the configurations which are snapshot dependent on it, which is the "Deploy" configuration in this case.

Summary

This article has provided an overview of TeamCity artifact and snapshot dependencies, build chains, how their steps are triggered and how finished builds are promoted. I hope you now have a good understanding of how they work and of when it is appropriate (or not) to use TeamCity build dependencies in addition to those provided by build tools such as Maven.
Please refer to the TeamCity documentation for more information about this subject:

- Dependent Build
- Build Chain

The final blog post in the series will uncover how you can use the TeamCity Artifactory plugin in order to achieve a behavior which is similar to build chains for projects with Maven-based dependency management. Stay tuned!

Reference: TeamCity Build Dependencies from our JCG partner Evgeny Goldin at the Goldin++ blog.
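The triggering rule described in this article (triggering a configuration queues everything it is snapshot dependent on, in order, and nothing that depends on it) can be modelled in a few lines of Java. This is only a toy model of the rule for an acyclic chain, not TeamCity code or its API; the class `BuildChainSketch` and its methods are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of snapshot-dependency triggering; assumes the chain has no cycles.
final class BuildChainSketch {
    // configuration name -> configurations it is snapshot dependent on
    private final Map<String, List<String>> snapshotDependencies = new LinkedHashMap<>();

    void dependsOn(String configuration, String dependency) {
        snapshotDependencies.computeIfAbsent(configuration, k -> new ArrayList<>()).add(dependency);
    }

    /** Build queue after triggering one configuration: preceding steps first, the triggered one last. */
    List<String> queueFor(String triggered) {
        Set<String> queue = new LinkedHashSet<>();
        enqueue(triggered, queue);
        return new ArrayList<>(queue);
    }

    private void enqueue(String configuration, Set<String> queue) {
        List<String> dependencies =
                snapshotDependencies.getOrDefault(configuration, Collections.<String>emptyList());
        for (String dependency : dependencies) {
            enqueue(dependency, queue); // everything this step depends on runs before it
        }
        queue.add(configuration);
    }

    public static void main(String[] args) {
        BuildChainSketch chain = new BuildChainSketch();
        chain.dependsOn("Test", "Compile");
        chain.dependsOn("Deploy", "Test");

        System.out.println(chain.queueFor("Compile")); // [Compile]
        System.out.println(chain.queueFor("Test"));    // [Compile, Test]
        System.out.println(chain.queueFor("Deploy"));  // [Compile, Test, Deploy]
    }
}
```

The model makes the "counterintuitive" part visible: triggering "Test" never queues "Deploy", because the triggered configuration is always the last element of the queue.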

HPROF – Memory leak analysis tutorial

This article will provide you with a tutorial on how you can analyze a JVM memory leak problem by generating and analyzing a Sun HotSpot JVM HPROF Heap Dump file. A real life case study will be used for that purpose: a Weblogic 9.2 memory leak affecting the Weblogic Admin server.

Environment specifications

- Java EE server: Oracle Weblogic Server 9.2 MP1
- Middleware OS: Solaris 10
- Java VM: Sun HotSpot 1.5.0_22
- Platform type: Middle tier

Monitoring and troubleshooting tools

- Quest Foglight (JVM and garbage collection monitoring)
- jmap (hprof / Heap Dump generation tool)
- Memory Analyzer 1.1 via IBM support assistant (hprof Heap Dump analysis)

Step #1 – WLS 9.2 Admin server JVM monitoring and leak confirmation

The Quest Foglight Java EE monitoring tool was quite useful in identifying a Java Heap leak in our Weblogic Admin server. As you can see below, the Java Heap memory is growing over time. If you are not using any monitoring tool for your Weblogic environment, my recommendation is to at least enable verbose:gc on your HotSpot VM. Please visit my Java 7 verbose:gc tutorial on this subject for more detailed instructions.

Step #2 – Generate a Heap Dump from your leaking JVM

Following the discovery of a JVM memory leak, the goal is to generate a Heap Dump file (binary format) by using the Sun JDK jmap utility.

** Please note that jmap Heap Dump generation will cause your JVM to become unresponsive, so please ensure that no more traffic is sent to your affected / leaking JVM before running the jmap utility **

    <JDK HOME>/bin/jmap -heap:format=b <Java VM PID>

This command will generate a Heap Dump binary file (heap.bin) of your leaking JVM. The size of the file and the elapsed time of the generation process will depend on your JVM size and machine specifications / speed.

For our case study, a binary Heap Dump file of ~2 GB was generated in about 1 hour of elapsed time.
A Sun HotSpot 1.5/1.6/1.7 Heap Dump file will also be generated automatically on an OutOfMemoryError if you add -XX:+HeapDumpOnOutOfMemoryError to your JVM start-up arguments.

Step #3 – Load your Heap Dump file in Memory Analyzer tool

It is now time to load your Heap Dump file in the Memory Analyzer tool. The loading process will take several minutes, depending on the size of your Heap Dump and the speed of your machine.

Step #4 – Analyze your Heap Dump

The Memory Analyzer provides you with many features, including a Leak Suspect report. For this case study, the Java Heap histogram was used as a starting point to analyze the leaking objects and their source.

For our case study, java.lang.String and char[] data were found to be the leaking objects. Now the question is: what is the source of the leak, i.e. what holds references to those leaking objects? Simply right click on your leaking objects and select List Objects > with incoming references.

As you can see, javax.management.ObjectName objects were found to be the source of the leaking String & char[] data. The Weblogic Admin server communicates with and pulls stats from its managed servers via MBeans / JMX, which creates a javax.management.ObjectName for any MBean object type. Now the question is why Weblogic 9.2 is not releasing such objects properly.

Root cause: Weblogic javax.management.ObjectName leak!

Following our Heap Dump analysis, a review of the Weblogic known issues was performed, which revealed the following Weblogic 9.2 bug:

- Weblogic Bug ID: CR327368
- Description: Memory leak of javax.management.ObjectName objects on the Administration Server used to cause OutOfMemory error on the Administration Server.
- Affected Weblogic version(s): WLS 9.2
- Fixed in: WLS 10 MP1

http://download.oracle.com/docs/cd/E11035_01/wls100/issues/known_resolved.html

This finding was quite conclusive given the perfect match between our Heap Dump analysis, the WLS version and this known problem description.
Conclusion

I hope this tutorial, along with the case study, has helped you understand how you can pinpoint the source of a Java Heap leak using jmap and the Memory Analyzer tool. Please don’t hesitate to post any comment or question. I also provide free Java EE consultation, so please simply email me with a download link to your Heap Dump file so I can analyze it for you and create an article on this Blog to describe your problem, root cause and resolution.

Reference: HPROF – Memory leak analysis tutorial from our JCG partner Pierre-Hugues Charbonneau at the Java EE Support Patterns & Java Tutorial blog.
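If no monitoring tool is available for the leak confirmation in Step #1, the JVM's own management API can give a rough picture of heap growth. The sketch below samples the used heap through the standard `java.lang.management` API; the class name `HeapUsageSampler`, the sample count and the one-second interval are arbitrary choices for illustration, and this is a complement to verbose:gc, not a replacement for it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper: prints the used Java Heap at fixed intervals.
final class HeapUsageSampler {

    /** Currently used Java Heap in bytes, as reported by the JVM. */
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) throws InterruptedException {
        // Five samples, one second apart; a real monitor would run far longer.
        for (int i = 0; i < 5; i++) {
            System.out.printf("used heap: %d KB%n", usedHeapBytes() / 1024);
            Thread.sleep(1000);
        }
    }
}
```

A single sample says little, because the used heap rises and falls with every garbage collection. What suggests a leak is a floor that keeps climbing over hours or days, which is the pattern the Foglight graph showed in this case study.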

Java Thread CPU analysis on Windows

This article will provide you with a tutorial on how you can quickly pinpoint the Java Thread contributors to a high CPU problem on the Windows OS. Windows, like other operating systems such as Linux, Solaris and AIX, allows you to monitor CPU utilization at the process level and also for each individual Thread executing a task within a process. For this tutorial, we created a simple Java program that will allow you to learn this technique in a step by step manner.

Troubleshooting tools

The following tools will be used in this tutorial:

- Windows Process Explorer (to pinpoint high CPU Thread contributors)
- JVM Thread Dump (for Thread correlation and root cause analysis at code level)

High CPU simulator Java program

The simple program below just loops and creates new String objects. It will allow us to perform this CPU per Thread analysis. I recommend that you import it in an IDE of your choice, e.g. Eclipse, and run it from there. You should observe an increase of CPU on your Windows machine as soon as you execute it.
    package org.ph.javaee.tool.cpu;

    /**
     * HighCPUSimulator
     * @author Pierre-Hugues Charbonneau
     * http://javaeesupportpatterns.blogspot.com
     *
     */
    public class HighCPUSimulator {

        private final static int NB_ITERATIONS = 500000000;

        // ~1 KB data footprint
        private final static String DATA_PREFIX = "datadatadatadatadatadatadatadatadatadatadatadatadatadatadatadata datadatadatadatadatadatadatadatadatadatadatadatadatadatadatadatad atadatadatadatadatadatadatadatadatadatadatadatadatadatadatadata datadatadatadatadatadatadatadatadatadatadatadatadatadatadatadata datadatadatadatadatadatadatadatadatadatadatadatadatadatadatadata datadatadatadatadatadatadatadatadatadatadatadatadatadatadatadata datadatadatadatadatadatadatadatadatadatadatadatadatadatadatadata datadatadatadatadatadatadata";

        /**
         * @param args
         */
        public static void main(String[] args) {

            System.out.println("HIGH CPU Simulator 1.0");
            System.out.println("Author: Pierre-Hugues Charbonneau");
            System.out.println("http://javaeesupportpatterns.blogspot.com/");

            try {

                for (int i = 0; i < NB_ITERATIONS; i++) {
                    // Perform some String manipulations to slowdown and expose looping process...
                    String data = DATA_PREFIX + i;
                }

            } catch (Throwable any) {
                System.out.println("Unexpected Exception! " + any.getMessage() + " [" + any + "]");
            }

            System.out.println("HighCPUSimulator done!");
        }
    }

Step #1 – Launch Process Explorer

The Process Explorer tool visually shows the CPU usage dynamically. It is good for live analysis. If you need historical data on CPU per Thread then you can also use Windows perfmon with the % Processor Time & Thread Id data counters. You can download Process Explorer from the link below:

http://technet.microsoft.com/en-us/sysinternals/bb896653

In our example, you can see that the Eclipse javaw.exe process is now using ~25% of total CPU utilization following the execution of our sample program.

Step #2 – Launch Process Explorer Threads view

The next step is to display the Threads view of the javaw.exe process.
Simply right click on the javaw.exe process and select Properties. The Threads view will be opened as per the snapshot below:

- The first column is the Thread Id (decimal format)
- The second column is the CPU utilization % used by each Thread
- The third column is another counter indicating if the Thread is running on the CPU

In our example, we can see that our primary culprit is Thread Id #5996, using ~25% of CPU.

Step #3 – Generate a JVM Thread Dump

At this point, Process Explorer will no longer be useful. The goal was to pinpoint one or multiple Java Threads consuming most of the Java process CPU utilization, which is what we achieved. In order to go to the next level in your analysis you will need to capture a JVM Thread Dump. This will allow you to correlate the Thread Id with the Thread Stack Trace so you can pinpoint what type of processing is consuming such high CPU.

JVM Thread Dump generation can be done in a few ways. If you are using the JRockit VM you can simply use the jrcmd tool as per the example below.

Once you have the Thread Dump data, simply search for the Thread Id and locate the Thread Stack Trace that you are interested in. For our example, the Thread “Main Thread”, which was fired from Eclipse, got exposed as the primary culprit, which is exactly what we wanted to demonstrate.

    Main Thread id=1 idx=0x4 tid=5996 prio=5 alive, native_blocked
        at org/ph/javaee/tool/cpu/HighCPUSimulator.main (HighCPUSimulator.java:31)
        at jrockit/vm/RNI.c2java(IIIII)V(Native Method)
        -- end of trace

Step #4 – Analyze the culprit Thread(s) Stack Trace and determine root cause

At this point you should have everything that you need to move forward with the root cause analysis. You will need to review each Thread Stack Trace and determine what type of problem you are dealing with. That final step is typically where you will spend most of your time, and the problem can be simple, such as infinite looping, or complex, such as garbage collection related problems.
In our example, the Thread Dump revealed that the high CPU originates from our sample Java program around line 31. As expected, it revealed the looping condition that we engineered on purpose for this tutorial.

    for (int i = 0; i < NB_ITERATIONS; i++) {
        // Perform some String manipulations to slowdown and expose looping process...
        String data = DATA_PREFIX + i;
    }

I hope this tutorial has helped you understand how you can analyze and help pinpoint the root cause of Java CPU problems on the Windows OS. Please stay tuned for more updates; the next article will provide you with a Java CPU troubleshooting guide, including how to tackle that last analysis step, along with common problem patterns.

Reference: Java Thread CPU analysis on Windows from our JCG partner Pierre-Hugues Charbonneau at the Java EE Support Patterns & Java Tutorial blog.
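As a complement to Process Explorer, the JVM can report CPU time per Java thread itself through the standard `java.lang.management.ThreadMXBean` API. The sketch below is an assumption-laden illustration, not part of the original tutorial: the class name `ThreadCpuReport` is invented, and the id it prints is the Java-level thread id, which is not the native Windows Thread Id shown by Process Explorer or the `tid` value in the Thread Dump above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the CPU time consumed by each live Java thread.
final class ThreadCpuReport {

    static List<String> report() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<>();
        // Not every JVM supports (or has enabled) per-thread CPU time measurement.
        if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            return lines;
        }
        for (long id : threads.getAllThreadIds()) {
            long cpuNanos = threads.getThreadCpuTime(id); // -1 if the thread is no longer alive
            ThreadInfo info = threads.getThreadInfo(id);   // null if the thread is no longer alive
            if (cpuNanos < 0 || info == null) {
                continue;
            }
            lines.add(String.format("%-30s javaId=%d cpu=%d ms",
                    info.getThreadName(), id, cpuNanos / 1_000_000));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : report()) {
            System.out.println(line);
        }
    }
}
```

Run inside the affected JVM (for example from a diagnostic servlet or a JMX client), the thread with the fastest growing CPU time between two reports plays the same role as the top entry in the Process Explorer Threads view, and its name can then be looked up in the Thread Dump.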

The True Story of the Grid Engine Dream

At Sun, they called me Mr. Grid Engine. I was the product manager of Sun Grid Engine for a decade, until January 2010. This blog is a story of my journey.

Where are we today? Defying many predictions, Grid Engine, formerly CODINE, is the most used Distributed Resource Management software in grid computing, particularly in High Performance Computing (HPC), and it refuses to yield to the cloud business model. Top supercomputers cannot deliver Infrastructure as a Service (HPC IaaS) yet. One day they will.

If you want a metaphor, the Volkswagen Beetle was the cult car that made the entire company successful. Similarly, Grid Engine can be viewed as a product that is not perfect, but fascinating. It can become the launching pad for something much bigger, with a much wider adoption. Grid Engine is (yes, this is not a typo) a cult product, an enviable positioning that defies logic. Apple, Java and Facebook are cult products. Grid Engine has the potential to grow from this reputation well beyond where it is today.

The commercially supported distributions are Univa Grid Engine and Oracle Grid Engine. Both claim to be the successor of Sun Grid Engine. The Open Source distributions are Son of Grid Engine, the latest release being 8.0.0e on April 19, 2012, and Open Grid Scheduler.

Here is a quote from Inc. magazine’s latest article, 8 Core Beliefs of Extraordinary Bosses:

“Average bosses see business as a conflict between companies, departments and groups. They… demonize competitors as “enemies,” and treat customers as “territory” to be conquered. Extraordinary bosses see business as a symbiosis where the most diverse firm is most likely to survive and thrive. They naturally create teams that adapt easily to new markets and can quickly form partnerships with other companies, customers … and even competitors.”

Grid Engine started with an extraordinary entrepreneur, Dr Wolfgang Gentzsch, who founded Genias Software in 1991, later renamed Gridware in 1999.
Wolfgang says there is only one Grid Engine community, which forms an ecosystem that he calls a symbiosis of diversity. It all originated at Genias. It implied a huge physical and mental effort, going through Sun’s acquisition of Gridware in 2000 and later, when Oracle took over Sun in 2010, the creation of the GE ecosystem.

After Wolfgang left Sun (many fine people at Sun had to leave at that time), it was frustrating to see how our efforts to have two Sun Grid Engine products (one available by subscription and one available as free Open Source) failed because of a management veto. On one hand we were under pressure to be profitable as a unit; on the other hand, our customers appeared to have no reason to pay even one cent for a subscription or license.

Oracle still has IP control of Grid Engine. Both Univa and Oracle decided to make no more contributions to the open source. While Oracle’s open source policies are clear, Univa, a champion of open source for many years, has surprised the community. This has created an agitated thread on the Grid Engine discussion group.

Quoting from Inc. again:

“Extraordinary bosses see change as an inevitable part of life. While they don’t value change for its own sake, they know that success is only possible if employees and organization embrace new ideas and new ways of doing business.”

The paradox is that the companies who make really big money do things as if they were not interested in money. Quora has a new thread, titled What is Mark Zuckerberg’s true attitude towards money? Here is a quote from the highest ranking answer:

“Mark’s main motivations were pretty clearly based around materially changing the world and building technology that was used by everyone on the planet…. My impression back then was that if he had to choose, he’d rather be the most important/influential person in the world rather than the richest.
And I think that’s visible in how he directed the company to focus on user growth and product impact rather than revenue or business considerations. Even today, while Facebook makes a ton of money, it could probably make magnitudes more if that were its primary goal.”

What do people usually say? “Well, I am not Zuckerberg.” or “I am not Steve Jobs.” or “We can never make Grid Engine a business of this magnitude.”

“Yes we can!” This is my answer.

Oracle is part of the Grid Engine ecosystem. They are one of the most powerful high tech companies in the world. In September 2008, Larry Ellison dismissed the concept of Cloud. In 2012, Cloud Computing is one of the main initiatives at Oracle.

The Univa web site points out that Oracle does not have a Grid Engine roadmap. This can change at any moment, as Big Data becomes the buzz of the decade. Since the Oracle takeover, Grid Engine has been part of the Ops Center, a business unit whose culture is not in sync with Grid Engine. This may change.

Rackspace announced at the OpenStack Design Summit and Conference that it’s ready to run its public cloud service on OpenStack, an open source software they own and made accessible to anyone. 55 companies worldwide, including IBM, support implementations. Oracle may get some inspiration here for Grid Engine.

The Grid Engine ecosystem has extremely gifted contributors. In addition to the Univa team (Gary Tyreman, Bill Bryce, Fritz Ferstl), we have a superb team from open source, including Chris Dagdigian from BioTeam (the creator of the legendary http://gridengine.info domain) and Daniel Templeton, now with Cloudera. We have Rayson Ho, Ron Chen, Dave Love, Chi Chan, Reuti and many more from the Son of Grid Engine project.

Wolfgang Gentzsch tops the list of those who can restore the soul of Grid Engine. He has done it before. He will do it again.

I believe in miracles. When Steve Jobs returned to Apple, they had a month or so to dismantle and liquidate the company.
But… Reference: The True Story of the Grid Engine Dream from our JCG partner Miha Ahronovitz at The memories of a Product Manager blog. (Copyright 2012 – Ahrono Associates)...

DBUnit, Spring and Annotations for Database testing

If you have ever tried writing database tests in Java you might have come across DBUnit. DBUnit allows you to set up and tear down your database so that it contains consistent rows that you can write tests against. You usually specify the rows that you want DBUnit to insert by writing a simple XML document, for example:

    <?xml version="1.0" encoding="UTF-8"?>
    <dataset>
        <Person id="0" title="Mr" firstName="Dave" lastName="Smith"/>
        <Person id="1" title="Mrs" firstName="Jane" lastName="Doe"/>
    </dataset>

You can also use XML files in the same format to assert that a database contains specific rows. DBUnit works especially well with in-memory databases, and if you work with Spring, setting them up is pretty straightforward. Here is a good article describing how to get started. Working directly with DBUnit is fine, but after a while it can become apparent how many of your tests follow the same pattern of setting up the database and then testing the result. To cut down on this duplication you can use the spring-test-dbunit project. This project is hosted on GitHub and provides a new set of annotations that can be added to your test methods. Version 1.0.0 has just been released and is now available in the Maven central repository:

    <dependency>
        <groupId>com.github.springtestdbunit</groupId>
        <artifactId>spring-test-dbunit</artifactId>
        <version>1.0.0</version>
        <scope>test</scope>
    </dependency>

Once installed, three new annotations are available for use in your tests: @DatabaseSetup, @DatabaseTearDown and @ExpectedDatabase. All three can be used either on the test class or on individual test methods. The @DatabaseSetup and @DatabaseTearDown annotations are used to put your database into a consistent state, either before the test runs or after it has finished.
You specify the dataset to use as the annotation value, for example:

    @Test
    @DatabaseSetup("sampleData.xml")
    public void testFind() throws Exception {
        // test code
    }

The @ExpectedDatabase annotation is used to verify the state of the database after the test has finished. As with the previous annotations you must specify the dataset to use.

    @Test
    @DatabaseSetup("sampleData.xml")
    @ExpectedDatabase("expectedData.xml")
    public void testRemove() throws Exception {
        // test code
    }

You can use @ExpectedDatabase in a couple of different modes depending on how strict the verification should be (see the JavaDocs for details). For the annotations to be processed you need to make sure that your test is using the DbUnitTestExecutionListener. See the project readme for full details. If you want to learn more there is an example project on GitHub and some walk-through instructions available here. Reference: Database testing using DBUnit, Spring and Annotations from our JCG partner Phillip Webb at the Phil Webb’s Blog blog....

Using the final keyword on method parameters

After some confusion of my own about what exactly declaring method parameters final means, this blog entry tries to clarify it. At the very least, the final keyword on a method parameter tells the Java compiler that the parameter cannot be reassigned to another reference. Java parameter handling is always Call by Value (yes, even when dealing with Objects), and here is why: it is true that Java handles a reference to the Object when dealing with non-primitive data types. The Object itself is not passed from the caller to the target function! Instead a reference is passed that points to the desired Object. But this reference is not the same as the one on the caller side, since it is just a copy. What is passed to a function is a copied reference as value – ok, everyone’s still on board? :-) Maybe Java should use the more fitting term Call by Copied Reference as Value. To sum up: Java exclusively passes ALL method parameters (primitive data types or references to objects) in Call by Value style! As proof, let’s have a look at the following demo code and its output.

    /**
     * Call by Value Test Application.
     *
     * @author Christopher Meyer
     * @version 0.1
     * Apr 21, 2012
     */
    public class CBVTest {

        public static void main(String[] args) {
            Integer mainInternInteger = new Integer(1);

            /*
             * Even references are copied during calls!
             *
             * Explanation: Objects are never passed, only references to them, BUT
             * references are copied! So only reference COPIES reach the method.
             * Neither changes to the reference inside nor outside the method will
             * influence the counterpart.
             *
             * Maybe it should be called "Call by Copied Reference as Value".
             */
            class RunMe implements Runnable {

                Integer runnerInternInteger;

                public RunMe(Integer i) {
                    runnerInternInteger = i;

                    /*
                     * The following operation will have no effect on the main
                     * thread, since the reference to "i" is a copied one.
                     * Interfering with the "caller" reference is prevented.
                     */
                    i = new Integer(3);
                }

                @Override
                public void run() {
                    while (true) {
                        System.out.println(runnerInternInteger.intValue()
                                + "\t (runner intern value)");
                    }
                }
            }

            Thread runner = new Thread(new RunMe(mainInternInteger));
            runner.start();

            // Create a new object and assign it to "mainInternInteger".
            mainInternInteger = new Integer(2);
            while (true) {
                System.out.println(
                        mainInternInteger.intValue() + "\t (main intern value)");
            }
        }
    }

The output of the code looks like this:

    ...
    2 (main intern value)
    2 (main intern value)
    2 (main intern value)
    2 (main intern value)
    1 (runner intern value)
    2 (main intern value)
    1 (runner intern value)
    2 (main intern value)
    1 (runner intern value)
    2 (main intern value)
    1 (runner intern value)
    1 (runner intern value)
    1 (runner intern value)
    1 (runner intern value)
    1 (runner intern value)
    ...

So neither the assignment to the handled parameter (i = new Integer(3)) nor the reassignment in the calling class (mainInternInteger = new Integer(2)) has any influence on the other. So what is final worth if it isn’t really necessary? If it is added to the constructor of RunMe (public RunMe(final Integer i)), the reassignment i = new Integer(3) no longer compiles: the compiler reports that final parameter i may not be assigned. An IDE that runs the code anyway may surface this at run time as: Exception in thread “main” java.lang.RuntimeException: Uncompilable source code – final parameter i may not be assigned. It prevents failures related to unintentional reassignment. Accidental assignments to the handled parameter will always fail! final forces a developer to produce accurate code. The final keyword is not part of the method signature. So whether a parameter is declared final or not, the compiled code will be identical (everyone can easily check this by using diff). This means that a method can’t be overloaded by declaring the method parameters once final and once not. Since the byte code remains identical, it also has absolutely no influence on performance.
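The two points above (references are copied, and final only forbids reassignment) can also be checked without threads or endless loops. The following is a minimal sketch, not the author's code; the class and method names (FinalParameterDemo, reassign, mutate) are made up for illustration.

```java
public class FinalParameterDemo {

    // Reassigning the parameter only changes this method's local copy
    // of the reference; the caller's variable is untouched.
    public static void reassign(StringBuilder text) {
        text = new StringBuilder("replaced");
    }

    // 'final' would turn 'text = ...' into a compile-time error here,
    // but the object the reference points to can still be modified.
    public static void mutate(final StringBuilder text) {
        text.append(" world");
    }

    public static void main(String[] args) {
        StringBuilder text = new StringBuilder("hello");

        reassign(text);
        System.out.println(text); // prints "hello"

        mutate(text);
        System.out.println(text); // prints "hello world"
    }
}
```

Note that final protects the reference, not the object: mutate still changes what the caller sees, because both copies of the reference point to the same StringBuilder.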
To add to the confusion, keep in mind that inner classes can only use local variables and parameters of the enclosing method if these are declared final (since Java 8, being effectively final, i.e. never reassigned, is enough). This comes up, for example, when dealing with anonymous inner classes for Threads. If the reason isn’t clear to you, consider multiple variables in the same context with identical names that can be altered. Reference: Using the final keyword on method parameters from our JCG partner Christopher Meyer at the Java security and related topics blog....
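The inner class remark can be illustrated with a short sketch. It assumes nothing beyond the JDK; the names CapturedVariableDemo and runWithCapturedValues are hypothetical. The anonymous Runnable receives a copy of the local variable start, which is why the language insists that the variable never changes, while a final reference to a mutable holder can still be used to hand a result back.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CapturedVariableDemo {

    public static int runWithCapturedValues() {
        // Captured by the anonymous class below; must not be reassigned.
        final int start = 40;
        // The reference is final, the object behind it is mutable.
        final AtomicInteger result = new AtomicInteger();

        Thread worker = new Thread(new Runnable() {
            @Override
            public void run() {
                // start = 1;  // would not compile: start is final
                result.set(start + 2);
            }
        });
        worker.start();
        try {
            worker.join(); // wait so the result is visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result.get();
    }

    public static void main(String[] args) {
        System.out.println(runWithCapturedValues()); // prints 42
    }
}
```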

Keep As Much Stuff As Possible In The Application Itself

There’s a lot of Ops work to every project. Setting up server machines, and clusters of them, managing the cloud instances, setting up the application servers, HAProxy, load balancers, database clusters, message queues, search engine, DNS, alerts, and whatnot. That’s why the DevOps movement is popular – there’s a lot more happening outside the application that is vital to its success. But Unix/Linux is tedious. I hate it, to be honest. Shell script is ugly and I would rather invent a new language and write a compiler for it than write a shell script. I know many “hackers” will gasp at this statement, but let’s face it – it should be used only as a really last resort, because it will most likely stay out of the application’s repository, it is not developer friendly, and it’s ugly (yes, you can version it, you can write it with good comments, and still…) But enough about my hate for shell scripting (and command-line executed Perl scripts for that matter). That’s not the only thing that should be kept to a minimum. (Btw, this is the “whining” paragraph, you can probably skip it). The “Getting Started” guide of all databases, message queues, search engines, servers, etc. says “easy to install”. Sure, you just apt-get install it, then go to /usr/lib/foo/bar and change a configuration file, then give permissions to a newly-created user that runs it, oh, and you customize the shell script to do something, and you’re there. Oh, and /usr/lib/foo/bar – that’s different depending on how you install it and who has installed it. I’ve seen Tomcat installed in at least 5 different ways. One time all of its folders (bin, lib, conf, webapps, logs, temp) were in a completely different location on the server. And of course somebody decided to use the built-in connection pool, so the configuration has to be done in the servlet container itself. Use the defaults. Put that application server there and leave it alone. But we need a message queue. And a NoSQL database in addition to MySQL.
And our architects say “no, this should not be run in embedded mode, it will couple the components”. So a whole new slew of configurations and installations for stuff that can very easily be run inside our main application virtual machine/process. And when you think the external variables are just too many – then comes URL rewriting. “Yes, that’s easy, we will just add another rewrite rule”. 6 months later some unlucky developer will be browsing through the code wondering for hours why the hell this doesn’t open. And then he finally realizes it is outside the application, opens the Apache configuration file, and he sees wicked signs all over. To summarize the previous paragraph – there’s just too much to do on the operations side, and it is (obviously) not programming. Ops people should be really strict about versioning configuration and even versioning whole environment setups (Amazon’s cloud gives a nice option to make snapshots and then deploy them on new instances). But then, when something “doesn’t work”, it’s back to the developers to find the problem in the code. And it’s just not there. That’s why I have always strived to keep as much stuff as possible in the application itself. NoSQL store? Embedded, please. Message queue? Again. URL rewrites – your web framework does that. Application server configurations? None, if possible, you can do them per-application. Modifications of the application server startup script? No, thanks. Caching? It’s in-memory anyway, why would you need a separate process. Every external configuration needed goes to a single file that resides outside the application, and Ops (or devs, or devops) can change that configuration. No more hidden stones to find in /usr/appconf, Apache or whatever. Consolidate as much as possible in the piece that you are familiar and experienced with – the code. Obviously, not everything can be there. Some databases you can’t run embedded, or you really want to have separate machines.
You need a load balancer, and it has to be in front of the application. You need to pass initialization parameters for the virtual machine / process, in the startup script. But stick to the bare minimum. If you need to make something transparent to the application, do it with a layer of code, not with scripts and configurations. I don’t know if that aligns somehow with the DevOps philosophy, because it is more “dev” and less “ops”, but it actually allows developers to do the ops part, because it is kept down to a minimum. And it does not involve ugly scripting languages and two-line long shell commands. I know I sound like a big *nix noob. And I truly am. But as most of these hacks can be put in the application and be more predictable and easier to read and maintain – I prefer to stay that way. If it is not possible – let them be outside it, but really version them, even in the same repository as the code, and document them. The main purpose of all that is to improve maintainability and manageability. You have a lot of tools, infrastructure and processes around your code, so make use of them for as much as possible. Reference: Keep As Much Stuff As Possible In The Application Itself from our JCG partner Bozhidar Bozhanov at the Bozho’s tech blog blog....
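The idea of a single external configuration file with everything else decided in code could look roughly like the sketch below. It is only an illustration under assumptions: the class name AppConfig, the keys queue.mode and db.url, and the default values are invented here and do not come from the post. It uses the standard java.util.Properties format, with defaults living in the application so that the file only needs to list what Ops really wants to change.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public final class AppConfig {

    private final Properties values = new Properties();

    private AppConfig() {
    }

    // Reads key=value pairs from any character source.
    public static AppConfig read(Reader source) {
        AppConfig config = new AppConfig();
        try {
            config.values.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return config;
    }

    // Reads the one external file that Ops is allowed to edit.
    public static AppConfig fromFile(Path file) {
        try (Reader reader = Files.newBufferedReader(file)) {
            return read(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Defaults live in code, so a missing key never breaks startup.
    public String get(String key, String defaultValue) {
        return values.getProperty(key, defaultValue);
    }

    public static void main(String[] args) {
        // Hypothetical file content, inlined so the example runs on its own.
        AppConfig config = read(new StringReader("queue.mode=embedded\n"));
        System.out.println(config.get("queue.mode", "external"));  // prints "embedded"
        System.out.println(config.get("db.url", "jdbc:h2:mem:app")); // prints "jdbc:h2:mem:app"
    }
}
```

In a real deployment the path passed to fromFile would itself be the one start-up parameter handed to the process, and the file would be versioned next to the code.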
Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy | Contact
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.