
Google Guava v07 examples

We have something called Weekly Technology Workshops at TouK: every Friday at 16:00 somebody gives a presentation for everyone willing to come. We present stuff we learn and work on at home, but we also have a bulletin board with topics that people would like to hear about. Last week Maciej Próchniak gave a talk about Clojure; this time a few folks asked for an introduction to the Google Guava libraries. Since this was a dead simple task, I was happy to deliver.

WTF is Guava?

It's a set of very simple, basic classes that you end up writing yourself anyway. Think Apache Commons, just by Google. Just to make your life a little bit easier.

There is an early (v04) presentation and there was a different one (in Polish) at Javarsovia 2010 by Wiktor Gworek. At the time of writing, the latest version is v07; it's been mavenized and is available in a public Maven repo.

Here's a quick review of a few interesting things. Don't expect anything fancy though, Guava is very BASIC.

@VisibleForTesting

A simple annotation that tells you why the access restriction on a particular property has been relaxed. A common trick in testing is to relax access to default (package) scope for a particular property, so that you can use it in a unit test that resides in the same package (though in a different directory). Whether you think it's good or bad, remember to give the next developer a hint about it.

Consider:

    public class User {
        private Long id;
        private String firstName;
        private String lastName;
        String login;

Why is login package scoped?

    public class User {
        private Long id;
        private String firstName;
        private String lastName;
        @VisibleForTesting String login;

Ah, that's why.

Preconditions

Guava has a few preconditions for defensive programming (Design by Contract), but they are not quite as good as what Apache Commons / Spring Framework has. One interesting thing is that the Guava solution returns the checked object, so the check can be inlined.
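To make the "returns the object, so it can be inlined" point concrete without pulling in a dependency, here is a minimal sketch using the JDK's own java.util.Objects.requireNonNull (Java 7+, so newer than this post), which follows the same return-the-argument convention as Guava's checkNotNull. The PreconditionSketch and Account names are hypothetical, chosen only for illustration; this is not the User class used in the rest of the post.

```java
import java.util.Objects;

class PreconditionSketch {
    // Hypothetical class for illustration only; not part of Guava or of this post's code.
    static class Account {
        private final Long id;
        private final String login;

        Account(Long id, String login) {
            // requireNonNull returns its argument, so the check and the assignment
            // fit on one line, the same way Guava's checkNotNull is used.
            this.id = Objects.requireNonNull(id, "id cannot be null");
            this.login = Objects.requireNonNull(login, "login cannot be null").toLowerCase();
        }

        Long getId() { return id; }
        String getLogin() { return login; }
    }

    public static void main(String[] args) {
        Account account = new Account(1L, "Rambo");
        System.out.println(account.getId() + " / " + account.getLogin());
    }
}
```

Note that, like checkNotNull, a failed check here throws a NullPointerException rather than an IllegalArgumentException.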
Consider:

Using hand-written preconditions:

    public User(Long id, String firstName, String lastName, String login) {
        validateParameters(id, firstName, lastName, login);
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
        this.login = login.toLowerCase();
    }

    private void validateParameters(Long id, String firstName, String lastName, String login) {
        if (id == null) {
            throw new IllegalArgumentException("id cannot be null");
        }
        if (firstName == null || firstName.length() == 0) {
            throw new IllegalArgumentException("firstName cannot be empty");
        }
        if (lastName == null || lastName.length() == 0) {
            throw new IllegalArgumentException("lastName cannot be empty");
        }
        if (login == null || login.length() == 0) {
            throw new IllegalArgumentException("login cannot be empty");
        }
    }

Using Guava preconditions:

    public void fullyImplementedGuavaConstructorWouldBe(Long id, String firstName, String lastName, String login) {
        // null checks first, so the length checks below cannot hit a null
        this.id = checkNotNull(id);
        this.firstName = checkNotNull(firstName);
        this.lastName = checkNotNull(lastName);
        this.login = checkNotNull(login);

        checkArgument(firstName.length() > 0);
        checkArgument(lastName.length() > 0);
        checkArgument(login.length() > 0);
    }

(Thanks Yom for noticing that checkNotNull must go before checkArgument, though it makes it a bit unintuitive.)

Using Spring or Apache Commons preconditions (the usage looks exactly the same for both libraries):

    public void springConstructorWouldBe(Long id, String firstName, String lastName, String login) {
        notNull(id);
        hasText(firstName);
        hasText(lastName);
        hasText(login);
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
        this.login = login;
    }

CharMatcher

For people who hate regexps or just want a simple, good-looking, object-style pattern matching solution.
Examples:

And/or ease of use:

    String input = "This invoice has an id of 192/10/10";
    CharMatcher charMatcher = CharMatcher.DIGIT.or(CharMatcher.is('/'));
    String output = charMatcher.retainFrom(input);

output is: 192/10/10

Negation:

    String input = "DO NOT scream at me!";
    CharMatcher charMatcher = CharMatcher.JAVA_LOWER_CASE.or(CharMatcher.WHITESPACE).negate();
    String output = charMatcher.retainFrom(input);

output is: DONOT!

Ranges:

    String input = "DO NOT scream at me!";
    CharMatcher charMatcher = CharMatcher.inRange('m', 's').or(CharMatcher.is('a').or(CharMatcher.WHITESPACE));
    String output = charMatcher.retainFrom(input);

output is: sram a m

Joiner / Splitter

As the names suggest, it's string joining/splitting done the right way, although I find the inversion of calls a bit... oh well, it's Java.

    String[] fantasyGenres = {"Space Opera", "Horror", "Magic realism", "Religion"};
    String joined = Joiner.on(", ").join(fantasyGenres);

Output: Space Opera, Horror, Magic realism, Religion

You can skip nulls:

    String[] fantasyGenres = {"Space Opera", null, "Horror", "Magic realism", null, "Religion"};
    String joined = Joiner.on(", ").skipNulls().join(fantasyGenres);

Output: Space Opera, Horror, Magic realism, Religion

You can fill nulls:

    String[] fantasyGenres = {"Space Opera", null, "Horror", "Magic realism", null, "Religion"};
    String joined = Joiner.on(", ").useForNull("NULL!!!").join(fantasyGenres);

Output: Space Opera, NULL!!!, Horror, Magic realism, NULL!!!, Religion

You can join maps:

    Map<Integer, String> map = newHashMap();
    map.put(1, "Space Opera");
    map.put(2, "Horror");
    map.put(3, "Magic realism");
    String joined = Joiner.on(", ").withKeyValueSeparator(" -> ").join(map);

Output: 1 -> Space Opera, 2 -> Horror, 3 -> Magic realism

Split returns an Iterable instead of a JDK array:

    String input = "Some very stupid data with ids of invoces like 121432, 3436534 and 8989898 inside";
    Iterable<String> splitted = Splitter.on(" ").split(input);

Splitter does fixed-length splitting, although you cannot give a different length for each "column", which makes its use a bit limited when parsing some badly exported Excel files.

    String input =
        "A 1 1 1 1\n" +
        "B 1 2 2 2\n" +
        "C 1 2 3 3\n" +
        "D 1 2 5 3\n" +
        "E 3 2 5 4\n" +
        "F 3 3 7 5\n" +
        "G 3 3 7 5\n" +
        "H 3 3 9 7";
    Iterable<String> splitted = Splitter.fixedLength(3).trimResults().split(input);

You can use a CharMatcher while splitting:

    String input = "Some very stupid data with ids of invoces like 123231/fv/10/2010, 123231/fv/10/2010 and 123231/fv/10/2010";
    Iterable<String> splitted = Splitter.on(CharMatcher.DIGIT.negate())
                                        .trimResults()
                                        .omitEmptyStrings()
                                        .split(input);

Predicates / Functions

Predicates alone are not much, it's just an interface with a method that returns true or false, but if you combine predicates with functions and Collections2 (a Guava class that simplifies working on collections), you get a nice tool in your toolbox.

But let's start with basic predicate use. Imagine we want to find whether there are users who have logins with digits inside.
The invocation would be (returns boolean):

    // true if at least one user fails the "no digits in login" predicate
    Iterables.any(users, Predicates.not(shouldNotHaveDigitsInLoginPredicate));

And the predicate looks like this:

    public class ShouldNotHaveDigitsInLoginPredicate implements Predicate<User> {
        @Override
        public boolean apply(User user) {
            checkNotNull(user);
            return CharMatcher.DIGIT.retainFrom(user.login).length() == 0;
        }
    }

Now let's add a function that will transform a user to his full name:

    public class FullNameFunction implements Function<User, String> {
        @Override
        public String apply(User user) {
            checkNotNull(user);
            return user.getFirstName() + " " + user.getLastName();
        }
    }

You can invoke it using the static method transform:

    List<User> users = newArrayList(new User(1L, "sylwek", "stall", "rambo"),
                                    new User(2L, "arnold", "schwartz", "commando"));

    List<String> fullNames = transform(users, new FullNameFunction());

And now let's combine predicates with functions to print the names of users whose logins do not contain digits:

    List<User> users = newArrayList(new User(1L, "sylwek", "stall", "rambo"),
                                    new User(2L, "arnold", "schwartz", "commando"),
                                    new User(3L, "hans", "kloss", "jw23"));

    Collection<User> usersWithoutDigitsInLogin = filter(users, new ShouldNotHaveDigitsInLoginPredicate());
    String names = Joiner.on("\n").join(
        transform(usersWithoutDigitsInLogin, new FullNameFunction())
    );

What we do not get: fold (reduce) and tuples. Oh well, you'd probably turn to Java Functional Library anyway, if you wanted functions in Java, right?

CaseFormat

Ever wanted to turn those ugly PHP Pear names into nice Java/C++ style with a one-liner? No? Well, anyway, you can:

    String pearPhpName = "Really_Fucked_Up_PHP_PearConvention_That_Looks_UGLY_because_of_no_NAMESPACES";
    String javaAndCPPName = CaseFormat.UPPER_UNDERSCORE.to(CaseFormat.UPPER_CAMEL, pearPhpName);

Output: ReallyFuckedUpPhpPearconventionThatLooksUglyBecauseOfNoNamespaces

But since Oracle has taken over Sun, you may actually want to turn those into SQL style, right?
    String sqlName = CaseFormat.UPPER_CAMEL.to(CaseFormat.LOWER_UNDERSCORE, javaAndCPPName);

Output: really_fucked_up_php_pearconvention_that_looks_ugly_because_of_no_namespaces

Collections

Guava includes a superset of the Google Collections Library 1.0, and this indeed is a very good reason to include this dependency in your poms. I won't even try to describe all the features, but just to point out a few nice things:

- you have an Immutable version of pretty much everything
- you get a few nice static and statically typed methods on common types like Lists, Sets, Maps, ObjectArrays, which include:
  - an easy way of creating collections based on the return type, e.g. newArrayList
  - transform (a way to apply functions; it returns a transformed view)
  - partition (paging)
  - reverse

And now for a few more interesting collections.

Multimaps

A Multimap is basically a map that can have many values for a single key. Ever had to create a Map<T1, Set<T2>> in your code? You don't have to anymore.

    Multimap<Integer, String> multimap = HashMultimap.create();
    multimap.put(1, "a");
    multimap.put(2, "b");
    multimap.put(3, "c");
    multimap.put(1, "a2");

There are of course immutable implementations as well: ImmutableListMultimap, ImmutableSetMultimap, etc. You can construct immutables either inline (up to 5 elements) or using a builder:

    // inline
    Multimap<Integer, String> multimap = ImmutableSetMultimap.of(1, "a", 2, "b", 3, "c", 1, "a2");

    // or with a builder
    Multimap<Integer, String> multimap = new ImmutableSetMultimap.Builder<Integer, String>()
            .put(1, "a")
            .put(2, "b")
            .put(3, "c")
            .put(1, "a2")
            .build();

BiMap

A BiMap is a map that has only unique values. Consider this:

    @Test(expected = IllegalArgumentException.class)
    public void biMapShouldOnlyHaveUniqueValues() {
        BiMap<Integer, String> biMap = HashBiMap.create();
        biMap.put(1, "a");
        biMap.put(2, "b");
        biMap.put(3, "a"); // argh! an exception
    }

That allows you to invert the map, so the values become keys and the other way around:

    BiMap<Integer, String> biMap = HashBiMap.create();
    biMap.put(1, "a");
    biMap.put(2, "b");
    biMap.put(3, "c");

    BiMap<String, Integer> invertedMap = biMap.inverse();

Not sure what I'd actually want to use it for.

Constraints

This allows you to add constraint checking on a collection, so that only values which pass the constraint may be added. Imagine we want a collection of users whose logins start with the letter 'r'.

    Constraint<User> loginMustStartWithR = new Constraint<User>() {
        @Override
        public User checkElement(User user) {
            checkNotNull(user);
            if (!user.login.startsWith("r")) {
                throw new IllegalArgumentException("GTFO, you are not Rrrrrrrrr");
            }
            return user;
        }
    };

And now for a test:

    @Test(expected = IllegalArgumentException.class)
    public void shouldConstraintCollection() {
        //given
        Collection<User> users = newArrayList(new User(1L, "john", "rambo", "rambo"));
        Collection<User> usersThatStartWithR = constrainedCollection(users, loginMustStartWithR);

        //when
        usersThatStartWithR.add(new User(2L, "arnold", "schwarz", "commando"));
    }

You also get a notNull constraint out of the box:

    //notice it's not an IllegalArgumentException :(
    @Test(expected = NullPointerException.class)
    public void notNullConstraintShouldWork() {
        //given
        Collection<Integer> users = newArrayList(1);
        Collection<Integer> notNullCollection = constrainedCollection(users, notNull());

        //when
        notNullCollection.add(null);
    }

Thing to remember: constraints do not check the data already present in a collection.

Tables

Just as expected, a table is a collection with columns, rows and values. No more Map<T1, Map<T2, T3>>, I guess. The usage is simple and you can transpose:

    Table<Integer, String, String> table = HashBasedTable.create();
    table.put(1, "a", "1a");
    table.put(1, "b", "1b");
    table.put(2, "a", "2a");
    table.put(2, "b", "2b");

    // rows and columns swap places, so the key types swap as well
    Table<String, Integer, String> transposedTable = Tables.transpose(table);

That's all, folks.
I didn't present the util.concurrent, primitives, io and net packages, but you probably already know what to expect. Happy coding and don't forget to share!

Reference: Google Guava v07 examples from our JCG partner Jakub Nabrdalik at the Solid Craft blog.
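For comparison with the Multimaps section, the hand-rolled Map<K, Set<V>> that a Multimap replaces might look like the sketch below. It uses only the JDK (Map.computeIfAbsent needs Java 8+, so it postdates this post), and the class and method names are hypothetical. The point is only to show the bookkeeping that HashMultimap.create() and put() hide.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class HandRolledMultimapSketch {
    // Hypothetical minimal stand-in for a set-valued multimap; this is not Guava's API.
    static class SimpleSetMultimap<K, V> {
        private final Map<K, Set<V>> map = new HashMap<>();

        // Creates the value set on first use, then adds to it.
        // Returns false when the key already holds that value.
        boolean put(K key, V value) {
            return map.computeIfAbsent(key, k -> new HashSet<>()).add(value);
        }

        // Returns an empty set rather than null for unknown keys.
        Set<V> get(K key) {
            return map.getOrDefault(key, Collections.<V>emptySet());
        }
    }

    public static void main(String[] args) {
        SimpleSetMultimap<Integer, String> multimap = new SimpleSetMultimap<>();
        multimap.put(1, "a");
        multimap.put(2, "b");
        multimap.put(1, "a2");
        System.out.println("values under key 1: " + multimap.get(1).size());
    }
}
```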

Business Agility Through DevOps and Continuous Delivery

The principles of Continuous Delivery and DevOps have been around for a few years. Developers and system administrators who follow the lean-startup movement are more than familiar with both. However, more often than not, implementing either or both within a traditional, large IT environment is a significant challenge compared to a new-age, Web 2.0 type organization (think Flickr) or a Silicon Valley startup (think Instagram). This is a case study of how the consultancy firm I work for delivered the largest software upgrade in the history of one blue-chip client, using both.

Background

The client is one of Australia's largest retailers. The firm I work for has been a trusted consultant to them for over a decade. During this time (thankfully), we have earned enough credibility to influence business decisions that depend heavily on IT infrastructure.

A massive IT infrastructure upgrade became imminent when our client decided to leverage their loyalty rewards program to fight competition head-on. With an existing user base of several million and our client looking to double this number with the new campaign, the expectations of the software were nothing short of spectacular. In addition to ramping up the existing software, a new set of software needed to be in place, capable of handling hundreds of thousands of new user registrations per hour. Maintenance downtime was not an option (is it ever?) once the system went live, especially during the marketing campaign period.

Why DevOps?

Our long relationship with this client and the way IT operations is organized meant that adopting DevOps was evolutionary rather than revolutionary. The good folk at operations have a healthy respect for and trust in our developers, and the feeling is mutual. Our consultants provided development and 24/7 support for the software. The software includes a Web Portal, back office systems, partner integration systems and customer support systems.
Adopting DevOps principles meant:

- Our developers have more control over the environments the software runs in, from build to production.
- Developers have a better understanding of the production environment the software eventually runs in, as opposed to their local machines.
- Developers are able to clearly explain to the infrastructure operations group what the software does in each environment.
- Simple, clear processes to manage the delivery of change.
- Better collaboration between developers and operations. No need to raise tickets.

Why Continuous Delivery?

The most important reason was the reduced risk to our client's new campaign. With a massive marketing campaign in full throttle, targeting millions of new user sign-ups, the software systems needed to maintain 100% uptime. Taking software offline for maintenance meant lost opportunity and money for the business. In a nutshell:

- A big bang approach would have been fine for the initial release. But when issues are found, we want to deliver fixes without downtime.
- While the marketing campaign is running, improvements and features will need to be made to the software based on analytics and metrics. Delivering them in large batches (taking months) doesn't deliver good business value.
- From a developer's perspective, delivering small changes frequently makes it easy to identify what went wrong and either roll back or re-deploy a fix.

Years of Agile practices followed by us at the client's site ensured that a proper culture was in place to adopt continuous delivery painlessly. We were already using Hudson/Jenkins for continuous integration. We only needed the 'last mile' of the deployment pipeline to be built, in order to upgrade the existing technical process to one that delivered continuously.

The process: keep it simple and transparent

The development process we follow is simple, and the culture is such that each developer is aware that at any given moment one or more of their commits can be released to production.
To keep the burden to a minimum, we use Subversion tags and branching, so that release candidate revisions are tagged before a release candidate is promoted to the test environment (more on that later). The advantage of tagging early is that we have more control over the changes we deliver into production, for instance bug fixes versus feature releases.

(Image credit – Wikipedia)

The production environment consists of a cluster of twenty nodes. Each node contains a Tomcat instance fronted by Apache. The load balancer provides functionality to release nodes from the cluster when required, although it is not as advanced as the API-level communication provided by Amazon's Elastic Load Balancer (this is an investment made by the client way back, so we opted to work with it rather than complain).

Jenkins CI is used as the foundation for our continuous delivery process. The deployment pipeline consists of several stages. We kept the process simple, just like the diagram above, to minimize confusion.

1. Build – At this stage the latest revision from Subversion is checked out by Jenkins on the build server, unit tests are run and, once they pass, the artifacts are bundled. The build environment is also equipped with infrastructure to test-deploy the software for verification. Every build is deployed to this test infrastructure by Jenkins.

[Figure: Creating a release candidate build with Subversion tagging.]
[Figure: Promotion tasks.]

2. Test (UAT) – Once a build is verified by developers, it's promoted to the Test environment using a Jenkins task. A promotion indicates that the developers are confident in a build and that it's ready for quality assurance. The automated promotion process creates a tag in Subversion using the revision information packaged into the artifacts. Automated integration tests written using Selenium are run against the Test deployment.
The QA team uses this environment to carry out their testing.

3. Production Verification – Once artifacts are tested by the test team and no failures are reported by the automated integration tests, a node is picked from the production cluster and, using a Jenkins job, prepared for smoke testing. This automated process will:

- Remove the elected node from the cluster.
- Deploy the tested artifacts to this node.

[Figure: Removing a node from the production cluster.]
[Figure: Nominating a node (or nodes) for production verification.]

4. Production (Cut-over) – Once the smoke tests are done, the artifacts are deployed to the cluster by a separate Jenkins task. The deployment follows a round-robin schedule, where each node is taken off the load balancer to deploy and refresh the software. The deployment time is highly predictable and almost constant. As soon as a node is returned to the cluster, verification begins.

5. Rollback (Disaster recovery) – In case of a bad deployment, despite all the testing and verification, we roll back to the last stable deployment. Just like the cut-over deployment above, the time for a full rollback is predictable.

[Figure: Preparing for rollback. The rollback process goes through the test server.]

Implementation: Our tools

Jenkins – Jenkins is the user interface to the whole process. We used parametrized builds whenever we required a developer to interact with a certain job.

Jenkins Batch Task plugin – We automated all repetitive tasks to minimize human error. The Task Plugin was used extensively so that we have the flexibility to write scripts to do exactly what we want.

Bash – Most of the hard work is done by a set of Bash scripts. We configured keyless login from the build server with appropriate permissions, so that these scripts can perform just like a human, once told what to do via Jenkins.

Ant – The build scripts for the software were written in Ant. Ant also couples nicely with Jenkins and can be easily called from a shell script when needed.
JUnit and Selenium – Automation is great, but without a good feedback loop it can lead to disaster. JUnit tests provide us with feedback for every single build, while Selenium does the same for builds that are promoted to the test environment. An error means immediate termination of the deployment pipeline for that build. This, coupled with the testing done by QA, keeps defects reaching production to a minimum.

Puppet – Puppet (http://puppetlabs.com) is used by the operations team to manage configurations across environments. Once the operations team builds a server for the developers, the developers have full access to go in and configure it to run the application. The most important part is to record everything we do while in there. Once a developer is satisfied that the configuration is working, they give a walk-through to the operations team, who in turn update their Puppet Recipes. These changes are rolled out to the cluster by Puppet immediately.

Monitoring – The logs from all production nodes are harvested to a single location for easy analysis. A health check page is built into the application itself, so that we can check the status of the application running on each node.

Conclusion

Neither DevOps nor Continuous Delivery is a silver bullet. However, nurturing a culture where developers and operations trust each other and work together can be very rewarding to a business. Cultivating such a culture allows a business to reap the full benefits of an Agile development process. Because of the mutual trust between us (the developers) and our client's operations team, we were able to implement a deployment pipeline that is capable of delivering features and fixes within hours if necessary, instead of months. During a crucial marketing campaign, this kind of agility allowed our client to keep the software infrastructure well in tune with the feedback received through their marketing analytics and KPIs.
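The round-robin cut-over from stage 4 of the pipeline can be sketched in a few lines. The real pipeline was driven by Bash scripts behind Jenkins jobs, which are not shown in this post, so everything below (the LoadBalancer and Deployer interfaces, the health check, the node and artifact names) is a hypothetical stand-in meant only to illustrate the control flow: take one node out, deploy, verify, return it, and stop at the first failure so the rest of the cluster keeps serving the last stable build.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RollingDeploySketch {
    // Hypothetical abstractions; the actual load balancer and scripts are not shown in the post.
    interface LoadBalancer {
        void remove(String node);
        void restore(String node);
    }

    interface Deployer {
        void deploy(String node, String artifact);
        boolean isHealthy(String node); // e.g. poll the application's health check page
    }

    // Deploys to one node at a time and returns the nodes that were upgraded.
    static List<String> rollOut(List<String> nodes, String artifact,
                                LoadBalancer lb, Deployer deployer) {
        List<String> upgraded = new ArrayList<>();
        for (String node : nodes) {
            lb.remove(node);                  // stop routing traffic to this node
            deployer.deploy(node, artifact);
            if (!deployer.isHealthy(node)) {
                // Leave the bad node out of the cluster and halt the roll-out;
                // a rollback job would take over from here.
                return upgraded;
            }
            lb.restore(node);                 // node serves traffic again
            upgraded.add(node);
        }
        return upgraded;
    }

    public static void main(String[] args) {
        List<String> inRotation = new ArrayList<>(Arrays.asList("node-1", "node-2", "node-3"));
        LoadBalancer lb = new LoadBalancer() {
            public void remove(String node) { inRotation.remove(node); }
            public void restore(String node) { inRotation.add(node); }
        };
        Deployer deployer = new Deployer() {
            public void deploy(String node, String artifact) { /* copy and restart would go here */ }
            public boolean isHealthy(String node) { return true; }
        };
        List<String> done = rollOut(Arrays.asList("node-1", "node-2", "node-3"), "app.war", lb, deployer);
        System.out.println("Upgraded " + done.size() + " nodes, " + inRotation.size() + " in rotation");
    }
}
```

Because only one node is out of rotation at any moment, the cluster loses at most one node's worth of capacity during the roll-out, which fits the post's remark that deployment time is predictable and almost constant per node.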
Further reading

A few articles you might find interesting:

- Four Principles of Low-Risk Software Releases
- On DVCS, continuous integration, and feature branches
- The Relationship Between Dev-Ops And Continuous Delivery

Reference: Business Agility Through DevOps and Continuous Delivery from our JCG partner Tyrell Perera at the Conundrum blog.

Observer Design Pattern in Java

'Don't call us, we'll call you'... that's the Hollywood OO (Object Oriented) Principle, and it's exactly what the Observer pattern is about. In this post we'll review this pattern and how it is used in Java; you may already have used it without knowing it.

According to the Head First Design Patterns book, this is the definition of the Observer pattern:

Defines a one-to-many dependency between objects so that when one object changes state, all its dependents are notified and updated automatically.

Sound familiar? Have you ever worked with Swing? Its event handling mechanism looks a lot like the Observer Pattern. Think about it: suppose you have a JButton and you want other objects to be notified when the button is pressed. Have you done that? Sure! The other objects do not keep asking the button whether it is pressed or not; they only wait for a notification from the button. Swing does it using java.awt.event.ActionListener, but in essence it is the Observer Pattern, even if we are not using interfaces like java.util.Observer or classes like java.util.Observable.

Now, talking about observers and observables, these two have existed since JDK 1.0 and you can use them to implement the Observer Pattern in your applications. Let's see how to do it.

The objects you want to be waiting for notifications are called Observers; they implement the interface java.util.Observer. This interface defines only one method, +update(Observable,Object):void, which is called whenever the observed object changes. The first parameter, an Observable, is the object that changed. The second parameter may be used as follows:

- If using PUSH notifications, the Object parameter contains the information needed by the observers about the change.
- If using PULL notifications, the Object parameter is null and you should use the Observable parameter to extract the information needed.

When to PUSH or PULL? It's up to your implementation.
The object you want to be observed is called the Observable, and it has to subclass the java.util.Observable class. Yes, subclass. That's the dark side of the built-in implementation of the Observer Pattern in Java: sometimes you simply can't subclass. We'll talk about this in a minute. Once subclassed, you will inherit the following methods, among others:

- +addObserver(Observer):void, which adds the Observer passed in as a parameter to the set of observers.
- +deleteObserver(Observer):void, which deletes the Observer passed in as a parameter from the set of observers.
- #setChanged():void, which marks the Observable as having been changed. This method is protected, so you can only call it if you subclass the java.util.Observable class. Call it before notifying your observers.
- +notifyObservers():void, which notifies the registered observers using PULL. It means that when +update(Observable,Object):void is invoked on the Observer, the Object parameter will be null.
- +notifyObservers(Object):void, which notifies the registered observers using PUSH. It means that when +update(Observable,Object):void is invoked on the Observer, the Object parameter will be the same object that was passed to +notifyObservers(Object):void.

So, what happens if the class you want to be the Observable is already subclassing another class? Well, then you have to write your own Observer Pattern implementation, because you can't use the one built into Java. The following diagram shows you the basic concepts of the Observer Pattern, so you can build your own implementation:

[Figure: basic concepts of the Observer Pattern.]

One last thing, remember the most important OO Principle of all: Always use the simplest solution that meets your needs, even if it doesn't include a pattern.

Reference: Observer Pattern and Java from our JCG partner Alexis Lopez at the Java and ME blog.
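When subclassing java.util.Observable is not an option, a home-grown version can be as small as the sketch below. All names here (ObserverSketch, Listener, Subject, Device, Thermometer) are made up for illustration. The idea is that the observable side becomes a small reusable helper held by composition, so the observed class stays free to extend something else. The sketch uses PUSH notifications and assumes Java 8 for the lambda in main.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class ObserverSketch {
    // Hypothetical observer interface: receives the new value (PUSH style).
    interface Listener<T> {
        void update(T newValue);
    }

    // Reusable helper that keeps the observer list; used by composition, not inheritance.
    static class Subject<T> {
        private final List<Listener<T>> listeners = new CopyOnWriteArrayList<>();

        void addListener(Listener<T> listener) { listeners.add(listener); }
        void removeListener(Listener<T> listener) { listeners.remove(listener); }

        void notifyListeners(T newValue) {
            for (Listener<T> listener : listeners) {
                listener.update(newValue);
            }
        }
    }

    // Stand-in for an existing superclass that rules out extending Observable.
    static class Device { }

    static class Thermometer extends Device {
        private final Subject<Double> subject = new Subject<>();
        private double celsius;

        void addListener(Listener<Double> listener) { subject.addListener(listener); }

        void setCelsius(double value) {
            if (value != celsius) {           // notify only on an actual change
                celsius = value;
                subject.notifyListeners(value);
            }
        }
    }

    public static void main(String[] args) {
        Thermometer thermometer = new Thermometer();
        thermometer.addListener(v -> System.out.println("Temperature is now " + v));
        thermometer.setCelsius(21.5);
        thermometer.setCelsius(21.5); // no change, so no notification
    }
}
```

The change check in setCelsius plays the role that setChanged() plays in the built-in class: observers are only called when there is something new to report.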

Java Code Quality Tools – Overview

Recently, I had a chance to present this subject at a local IT community meetup. Here is the basic presentation, Java Code Quality Tools, and a more meaningful mind map. But I think I need to cover this subject more deeply. This blog post should serve as a starting point for further investigation in this direction.

1. CodePro Analytix
It's a great tool (Eclipse plugin) for improving software quality. Its key features are: Code Analysis, JUnit Test Generation, JUnit Test Editor, Similar Code Analysis, Metrics, Code Coverage and Dependency Analysis.

2. PMD
It scans Java source code and looks for potential problems: possible bugs, dead code, suboptimal code, overcomplicated expressions and duplicate code.

3. FindBugs
It looks for bugs in Java programs. It can detect a variety of common coding mistakes, including thread synchronization problems, misuse of API methods, etc.

4. Cobertura
It's a free Java tool that calculates the percentage of code accessed by tests. It can be used to identify which parts of your Java program are lacking test coverage. It is based on jcoverage.

5. Emma
It is a fast Java code coverage tool based on bytecode instrumentation. It differs from the existing tools by enabling coverage profiling on large-scale enterprise software projects with simultaneous emphasis on fast individual development.

6. Checkstyle
It is a development tool to help programmers write Java code that adheres to a coding standard.

7. JBoss Tattletale
JBoss Tattletale is a tool that can help you get an overview of the project you are working on or a product that you depend on. The tool will recursively scan a directory for JAR files and generate linked and formatted HTML reports.

8. UCDetector
UCDetector (Unnecessary Code Detector) is an open source Eclipse plugin to find unnecessary (dead) Java code. It also tries to make code final, protected or private. UCDetector also finds cyclic dependencies between classes.

9. Sonar
Sonar is a continuous quality control tool for Java applications. Its basic purpose in life is to join your existing continuous integration tools to place all your development projects under quality control.

10. XRadar
XRadar is an open, extensible code report tool that produces HTML/SVG reports of the system's current state and its development over time. It uses DependencyFinder, JDepend, PMD, PMD-CPD, JavaNCSS, Cobertura, Checkstyle, XSource, JUnit, Java2HTML, Ant and Maven.

11. QALab
QALab consolidates data from Checkstyle, PMD, FindBugs and Simian and displays it in one consolidated view. QALab keeps track of the changes over time, thereby allowing you to see trends. You can tell whether the number of violations has increased or decreased, on a per-file basis or for the entire project. It also plots charts of this data. QALab plugs in to Maven or Ant.

12. Clirr
Clirr is a tool that checks Java libraries for binary and source compatibility with older releases. Basically you give it two sets of jar files and Clirr dumps out a list of changes in the public API. The Clirr Ant task can be configured to break the build if it detects incompatible API changes. In a continuous integration process Clirr can automatically prevent accidental introduction of binary or source compatibility problems.

13. JDiff
JDiff is a Javadoc doclet which generates an HTML report of all the packages, classes, constructors, methods, and fields which have been removed, added or changed in any way, including their documentation, when two APIs are compared. This is very useful for describing exactly what has changed between two releases of a product. Only the API (Application Programming Interface) of each version is compared. It does not compare what the source code does when executed.

14. JLint
It checks your Java code and finds bugs, inconsistencies and synchronization problems by doing data flow analysis and building the lock graph.

15. JDepend
JDepend traverses Java class file directories and generates design quality metrics for each Java package. JDepend allows you to automatically measure the quality of a design in terms of its extensibility, reusability, and maintainability to effectively manage and control package dependencies.

16. cloc
cloc counts blank lines, comment lines, and physical lines of source code in many programming languages.

17. Dependometer
Dependometer performs a static analysis of physical dependencies within a software system. Dependometer validates dependencies against the logical architecture structuring the system into classes, packages, subsystems, vertical slices and layers, and detects cycles between these structural elements. Furthermore, it calculates a number of quality metrics on the different abstraction layers and reports any violations of the configured thresholds.

18. Hammurapi
Hammurapi is an open source code inspection tool. Its release comes with more than 100 inspectors which inspect different aspects of code: compliance with the EJB specification, threading issues, coding standards, and much more.

19. JavaNCSS
JavaNCSS is a simple command line utility which measures two standard source code metrics for the Java programming language. The metrics are collected globally, for each class and/or for each function.

20. DCD
DCD finds dead code in your Java applications.

21. Classycle
Classycle's Analyser analyses the static class and package dependencies in Java applications or libraries. It is especially helpful for finding cyclic dependencies between classes or packages. Classycle is similar to JDepend, which also does a dependency analysis, but only on the package level.

22. ckjm
The program ckjm calculates Chidamber and Kemerer object-oriented metrics by processing the bytecode of compiled Java files. For each class it calculates the six metrics proposed by Chidamber and Kemerer (WMC, DIT, NOC, CBO, RFC and LCOM).

23. Jameleon
Jameleon is an automated testing framework that can be easily used by technical and non-technical users alike. One of the main concepts behind Jameleon is to create a group of keywords or tags that represent different screens of an application. All of the logic required to automate each particular screen can be defined in Java and mapped to these keywords. The keywords can then be organized with different data sets to form test scripts without requiring an in-depth knowledge of how the application works. The test scripts are then used to automate testing and to generate manual test case documentation.

24. DoctorJ
DoctorJ analyzes Java code in the following functional areas: documentation verification, statistics generation and syntax analysis.

25. Macker
Macker is a build-time architectural rule checking utility for Java developers. It's meant to model the architectural ideals programmers always dream up for their projects, and then break; it helps keep code clean and consistent. You can tailor a rules file to suit a specific project's structure, or write some general 'good practice' rules for your code. Macker doesn't try to shove anybody else's rules down your throat; it's flexible, and writing a rules file is part of the development process for each unique project.

26. Squale
Squale is a qualimetry platform for analyzing multi-language software applications in order to give a sharp and comprehensive picture of their quality: high-level factors for top managers and practical indicators for development teams.

27. SourceMonitor
The freeware program SourceMonitor lets you see inside your software source code to find out how much code you have and to identify the relative complexity of your modules. For example, you can use SourceMonitor to identify the code that is most likely to contain defects and thus warrants formal review.

28. Panopticode
The Panopticode project provides a set of open source tools for gathering, correlating, and displaying code metrics.

29. Eclipse Metrics plugin
Provides a metrics calculation and dependency analyzer plugin for the Eclipse platform. It measures various metrics with average and standard deviation, detects cycles in package and type dependencies, and graphs them.

30. QJ-Pro
QJ-Pro is a comprehensive software inspection tool targeted towards the software developer. Developers can automatically inspect their Java source code and improve their Java programming skills as they write their programs. QJ-Pro provides descriptive Java patterns explaining error-prone code constructs and providing solutions for them.

31. Byecycle
Byecycle is an auto-arranging dependency analysis plugin for Eclipse. Its goal is to make you feel sick when you see bad code and to make you feel happy when you see good code.

32. Coqua
Coqua measures 5 distinct Java code quality metrics, providing an overview and history for management, and down-to-the-code, detailed views for the developer. Metrics can be defined per team. Ideal for mid- to large-sized and/or offshore projects.

33. Dependency Finder
Extracts dependencies and OO metrics from Java class files produced by most Java compilers.

34. Jalopy
Jalopy is an easily configurable source code formatter that can detect, and fix, a number of code convention flaws that might appear in Java code. Jalopy is more of a code fixer than a code checker. Jalopy plug-ins are available for most IDEs and, in most cases, they gel quite seamlessly with the IDE.

35. JarAnalyzer
JarAnalyzer is a dependency management tool for .jar files. JarAnalyzer will analyze all .jar files in a given directory and identify the dependencies between them. Output formats include XML, with a stylesheet included to transform it to HTML, and GraphViz DOT, allowing you to produce a visual component diagram showing the relationships between .jar files.
The xml output includes important design metrics such as Afferent and Efferent coupling, Abstractness, Instability, and Distance. There is also an Ant task available that allows you to include JarAnalyzer as part of your build script. 36. Condenser Condenser is a tool for finding and removing duplicated Java code. Unlike tools that only locate duplicated code, the aim of Condenser is to also automatically remove duplicated code where it is safe to do so. 37. Relief Relief provides a new look on Java projects. Relying on our ability to deal with real objects by examining their shape, size or relative place in space it gives a ‘physical’ view on java packages, types and fields and their relationships, making them easier to handle. Lets discuss quickly how we interprete physical properties and how it can help us to grasp project characteristics. 38. JCSC JCSC is a powerful tool to check source code against a highly definable coding standard and potential bad code. The standard covers naming conventions for class, interfaces, fields, parameter, … . Also the structural layout of the type (class/interface) can be defined. Like where to place fields, either before or after the methods and in which order. The order can be defined through the visibility or by type (instance, class, constant). The same is applicable for methods. Each of those rules is highly customizable. Readability is enhanced by defining where to put white spaces in the code and when to use braces. The existence of correct JavaDoc can be enforced and various levels. Apart from that, it finds weaknesses in the the code — potential bugs — like empty catch/finally block, switch without default, throwing of type ‘Exception’, slow code. 39. Spoon Spoon is a Java program processor that fully supports Java 5. It provides a complete and fine-grained Java metamodel where any program element (classes, methods, fields, statements, expressions…) can be accessed both for reading and modification. 
Spoon can be used on validation purpose, to ensure that your programs respect some programming conventions or guidelines, or for program transformation, by using a pure-Java template engine. 40. Lint4j Lint4j (‘Lint for Java’) is a static Java source and byte code analyzer that detects locking and threading issues, performance and scalability problems, and checks complex contracts such as Java serialization by performing type, data flow, and lock graph analysis. 41. Crap4j Crap4j is a Java implementation of the CRAP (Change Risk Analysis and Predictions) software metric – a mildly offensive metric name to help protect you from truly offensive code. 42. PathFinder Java PathFinder (JPF) is a system to verify executable Java bytecode programs. In its basic form, it is a Java Virtual Machine (JVM) that is used as an explicit state software model checker, systematically exploring all potential execution paths of a program to find violations of properties like deadlocks or unhandled exceptions. Unlike traditional debuggers, JPF reports the entire execution path that leads to a defect. JPF is especially well-suited to finding hard-to-test concurrency defects in multithreaded program 43. Soot Soot can be used as a stand alone tool to optimize or inspect class files, as well as a framework to develop optimizations or transformations on Java bytecode. 44. ESC/Java2 The Extended Static Checker for Java version 2 (ESC/Java2) is a programming tool that attempts to find common run-time errors in JML-annotated Java programs by static analysis of the program code and its formal annotations. Users can control the amount and kinds of checking that ESC/Java2 performs by annotating their programs with specially formatted comments called pragmas. This list includes open sourced and free tools. I intentionally have excluded commercial tools. I’m sure there are much more tools. In case your know some of them which isn’t listed here please add comment to this post. Don’t forget to share! 
Reference: Java Code Quality Tools – Overview from our JCG partner Orest Ivasiv at the Knowledge Is Everything blog....

Spring Security using API Authentication

Background

While there are many blog posts that detail how to use Spring Security, I often still find it challenging to configure when a problem domain lies outside of the standard LDAP or database authentication. In this post, I’ll describe some simple customizations to Spring Security that enable it to be used with a REST-based API call. Specifically, the use case is where you have an API service that will return a user object that includes a SHA-256 password hash.

Setup

The prerequisites for running this sample are Git and Maven, and your choice of IDE (tested with both Eclipse and IntelliJ). The source code can be found at: https://github.com/dajevu/Spring3SecurityUsingAPI. After pulling down the code, perform the following steps:

1. In a terminal window, cd to the Shared directory located under the root where the source code resides. Issue the command mvn clean install. This will build the Shared sub-project and install the jar into your local Maven repository.
2. Within Eclipse or IntelliJ, import the project as a Maven project. In Eclipse, this will result in 3 projects being created: Shared, SpringWebApp, and RestfulAPI. In IntelliJ, this will be represented as sub-projects. No errors should exist after the compilation process is complete.
3. Change directory to RestfulAPI. Then, issue the command mvn jetty:run to run the API webapp. You can then request the following URL, which will bring back a User object represented in JSON: http://localhost:9090/RestfulAPI/api/v1/user/john
4. Open up a new terminal window and cd to the SpringWebApp directory located under the project root. Issue the command mvn jetty:run. This will launch a standard Spring webapp that incorporates Spring Security. You can access the single HTML page at: http://localhost:8080/SpringWebApp/.
5. After clicking the Login link, log in with the username john and the password doe. You should be redirected to a Hello Admin page.

In order to demonstrate the solution, three Maven modules are used:

SpringWebApp: This is a typical Spring webapp that serves up a single JSP page. The contents of the page will vary depending upon whether the user is currently logged in or not. When first visiting the page, a Login link will appear, which directs the user to the built-in Spring Security login form. When they attempt to log in, a RESTEasy client is used to place a call to the API service (described below), which returns a JSON string that is converted into a Java object via the RESTEasy client. The details of how Spring Security is configured are discussed in the following sections.
RestfulAPI: An API service that serves JSON requests. It is configured using RESTEasy (a JAX-RS implementation), and is described in more detail in the next section.
Shared: This contains a few Java classes that are shared between the other two projects. Specifically, the User object DTO, and the RESTEasy proxy definition (it’s shared because it can also be used by the RESTEasy client).

RestfulAPI Dissection

The API webapp is configured using RESTEasy’s Spring implementation. The RESTEasy documentation is very thorough, so I won’t go into a detailed explanation of its setup. A single API call is defined (in UserProxy in the Shared project) that returns a static JSON string. The API’s proxy (or interface) is defined as follows:

Resteasy API Proxy

    @Produces(MediaType.APPLICATION_JSON)
    @Consumes(MediaType.APPLICATION_JSON)
    @Path(UserProxy.Urls.BASE_URL)
    public interface UserProxy {

        public interface Urls {
            public static final String BASE_URL = "/api/v1";
            public static final String USER = "/user/{username}";
        }

        @GET
        @Produces( { MediaType.APPLICATION_JSON })
        @Path(UserProxy.Urls.USER)
        public User getUserByUsername(@PathParam("username") String username);
    }

For those of you familiar with JAX-RS, you’ll easily follow this configuration. It defines an API URI that will respond to requests sent to the URL path /api/v1/user/{username}, where {username} is replaced with an actual username value. The implementation of this service, which simply returns a static response, is shown below:

About the only thing remotely complicated is the use of SHA-256 hashing for the user’s password. We’ll see shortly how this gets interpreted by Spring Security. When the URL is accessed, the following JSON string is returned:

The webapp’s web.xml contains the setup configuration to service RESTEasy requests, so if you’re curious, take a look at that.

SpringWebApp Dissection

Now we can look at the Spring Security configuration. The web.xml file for the project configures it as a Spring application, and specifies the file applicationContext-security.xml as the initial Spring configuration file. Let’s take a closer look at this file, as this is where most of the magic occurs:

Let’s go through the line numbers to describe their functionality. Lines 3 through 5 instruct Spring to look for Spring-backed classes in the com.acme package and declare that Spring annotations will be supported. Line 7 is used to load the properties specified in the application.properties file (this is used to specify the API host). Lines 9 through 11 enable Spring Security for the application. Normally, as a child element to http, you would specify which pages should be protected using roles, but to keep this example simple, that wasn’t configured. Lines 13-17 are where the customizations to base Spring Security begin. We define a custom authentication-provider called userDetailsSrv through its bean ref. That bean is implemented through the custom class com.acme.security.UserDetailsService (line 19). Let’s take a closer look at this class:

As you can see, this class implements the Spring interface org.springframework.security.core.userdetails.UserDetailsService. This requires overriding the method loadUserByUsername. This method is responsible for retrieving the user from the authentication provider/source. The returned user must contain a password property in order for Spring Security to compare it against what was provided in the form (if no matching user is found, a UsernameNotFoundException is thrown, as on line 28). In this case, as we’ve seen previously, the password is returned as a SHA-256 hash. In our API implementation, the user lookup is performed using the APIHelper class, which we’ll cover next. The returned API data is then populated in the custom class called UserDetails. This implements the Spring interface with the same name. That interface requires a concrete implementation of the getUsername() and getPassword() methods. Spring will invoke those in the next processing step of Security to compare those values against what was entered in the web form.

How does Spring go about comparing the SHA-256 password hash returned by the API against the form password value? If you look back at the XML configuration, it contained this setting:

Notice the passwordEncoder: this reference points to the Spring class ShaPasswordEncoder. This class will compute a SHA-256 hash of the password provided through the web form, and then Spring will compare that computed value against what we returned via the API.

Let’s close this out by looking at the APIHelper class:

The first thing you’ll notice on lines 8 and 9 is the injection of the API.host property. As you recall, this was set in the application.properties file. This identifies the host to which the API call is sent (since it’s running locally, localhost is specified). Lines 17 through 20 use one of the RESTEasy client mechanisms to place a JSON RESTful call (RESTEasy also has what is called a client proxy implementation, which is easier to use and needs less code, but doesn’t provide as much low-level control). The resulting response from the API is then converted from JSON into the User Java object by way of Jackson in line 26. That Java object is then returned to the UserDetails service.

Summary/Wrap-up

As you can see, the actual work involved in customizing Spring Security to authenticate against an API call (or really any external service) is rather straightforward. Only a few classes have to be implemented, but it can be tricky trying to figure this out for the first time, which is why I included the complete end-to-end example.

Reference: Spring Security using API Authentication from our JCG partner Jeff Davis at the Jeff’s SOA Ruminations blog....
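The password comparison described above can be sketched in plain Java. This is only an illustration of the idea, not Spring’s actual ShaPasswordEncoder code: the class name Sha256PasswordCheck and its methods are hypothetical, and it assumes an unsalted, hex-encoded SHA-256 hash of the UTF-8 password bytes, which may differ from how a given deployment is configured.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of Spring Security) showing the comparison
// step: hash the form password, then compare it with the stored hash.
final class Sha256PasswordCheck {

    /** Returns the lowercase hex SHA-256 digest of the raw password (assumption: no salt, UTF-8). */
    static String sha256Hex(String rawPassword) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(rawPassword.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // Mask to 0..255 so negative bytes still format as two hex digits
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every compliant JVM ships SHA-256, so this should not happen
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    /** Compares the hash returned by the API with the hash of the password typed into the form. */
    static boolean matches(String storedHashHex, String formPassword) {
        byte[] expected = storedHashHex.toLowerCase().getBytes(StandardCharsets.US_ASCII);
        byte[] actual = sha256Hex(formPassword).getBytes(StandardCharsets.US_ASCII);
        // MessageDigest.isEqual is intended for comparing digests
        return MessageDigest.isEqual(expected, actual);
    }
}
```

With this sketch, matches(sha256Hex("doe"), "doe") returns true, while any other form password is rejected. In the real application the equivalent check is delegated to the configured passwordEncoder.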

Testing Custom Exceptions with JUnit’s ExpectedException and @Rule

Exception Testing

Why test exception flows? Just like with all of your code, test coverage writes a contract between your code and the business functionality that the code is supposed to produce, leaving you with living documentation of the code along with the added ability to stress the functionality early and often. I won’t go into the many benefits of testing; instead I will focus on just exception testing.

There are many ways to test an exception flow thrown from a piece of code. Let’s say that you have a guarded method that requires an argument to be not null. How would you test that condition? How do you keep JUnit from reporting a failure when the exception is thrown? This blog covers a few different methods, culminating with JUnit’s ExpectedException implemented with JUnit’s @Rule functionality.

The ‘old’ way

In the not so distant past, testing an exception required a dense amount of boilerplate code: you would start a try/catch block, report a failure if your code did not produce the expected behavior, and then catch the exception looking for the specific type. Here is an example:

    import static org.junit.Assert.assertEquals;
    import static org.junit.Assert.fail;

    import org.junit.Test;

    public class MyObjTest {

        @Test
        public void getNameWithNullValue() {
            try {
                MyObj obj = new MyObj();
                obj.setName(null);
                // reaching this line means no exception was thrown
                fail("This should have thrown an exception");
            } catch (IllegalArgumentException e) {
                assertEquals("Name must not be null", e.getMessage());
            }
        }
    }

As you can see from this old example, many of the lines in the test case exist only to make up for the lack of built-in support for testing exception handling. One good point to make for the try/catch method is the ability to test the specific message and any custom fields on the expected exception. We will explore this a bit further down with JUnit’s ExpectedException and @Rule annotation.

JUnit adds expected exceptions

JUnit responded to users’ need for exception handling by adding a @Test annotation field ‘expected’. The intention is that the entire test case will pass if the type of exception thrown matches the exception class present in the annotation.

    import org.junit.Test;

    public class MyObjTest {

        @Test(expected = IllegalArgumentException.class)
        public void getNameWithNullValue() {
            MyObj obj = new MyObj();
            obj.setName(null);
        }
    }

As you can see from the newer example, there is quite a bit less boilerplate code and the test is very concise; however, there are a few flaws. The main flaw is that the test condition is too broad. Suppose you have two variables in a signature and neither can be null; how do you know which variable the IllegalArgumentException was thrown for? What happens when you have extended a Throwable and need to check for the presence of a field? Keep these in mind as you read further; solutions will follow.

JUnit @Rule and ExpectedException

If you look at the previous example you might see that you are expecting an IllegalArgumentException to be thrown, but what if you have a custom exception? What if you want to make sure that the message contains a specific error code or message? This is where JUnit really excelled by providing a JUnit @Rule object specifically tailored to exception testing. If you are unfamiliar with JUnit @Rule, read the docs here.

ExpectedException

JUnit provides a class, ExpectedException, intended to be used as a @Rule. ExpectedException allows your test to declare that an exception is expected and gives you some basic built-in functionality to clearly express the expected behavior. Unlike the @Test(expected) annotation feature, the ExpectedException class allows you to test for specific error messages and custom fields via the Hamcrest matchers library.

An example of JUnit’s ExpectedException

    import org.junit.Rule;
    import org.junit.Test;
    import org.junit.rules.ExpectedException;

    public class MyObjTest {

        @Rule
        public ExpectedException thrown = ExpectedException.none();

        @Test
        public void getNameWithNullValue() {
            thrown.expect(IllegalArgumentException.class);
            thrown.expectMessage("Name must not be null");

            MyObj obj = new MyObj();
            obj.setName(null);
        }
    }

As I alluded to above, the framework allows you to test for specific messages, ensuring that the exception being thrown is the case that the test is specifically looking for. This is very helpful when the nullability of multiple arguments is in question.

Custom Fields

Arguably the most useful feature of the ExpectedException framework is the ability to use Hamcrest matchers to test your custom/extended exceptions. For example, suppose you have a custom/extended exception that is thrown in a method, and the exception carries an ‘errorCode’. How do you test that functionality without introducing the boilerplate code from the try/catch block listed above? How about a custom Matcher! This code is available at: https://github.com/mike-ensor/custom-exception-testing

Solution: First the test case

    import org.junit.Rule;
    import org.junit.Test;
    import org.junit.rules.ExpectedException;

    public class MyObjTest {

        @Rule
        public ExpectedException thrown = ExpectedException.none();

        @Test
        public void someMethodThatThrowsCustomException() {
            thrown.expect(CustomException.class);
            thrown.expect(CustomMatcher.hasCode("110501"));

            MyObj obj = new MyObj();
            obj.methodThatThrowsCustomException();
        }
    }

Solution: Custom matcher

    import com.thepixlounge.exceptions.CustomException;
    import org.hamcrest.Description;
    import org.hamcrest.TypeSafeMatcher;

    public class CustomMatcher extends TypeSafeMatcher<CustomException> {

        // static factory so the test reads as thrown.expect(CustomMatcher.hasCode(...))
        public static CustomMatcher hasCode(String item) {
            return new CustomMatcher(item);
        }

        private String foundErrorCode;
        private final String expectedErrorCode;

        private CustomMatcher(String expectedErrorCode) {
            this.expectedErrorCode = expectedErrorCode;
        }

        @Override
        protected boolean matchesSafely(final CustomException exception) {
            foundErrorCode = exception.getErrorCode();
            return foundErrorCode.equalsIgnoreCase(expectedErrorCode);
        }

        @Override
        public void describeTo(Description description) {
            description.appendValue(foundErrorCode)
                    .appendText(" was found instead of ")
                    .appendValue(expectedErrorCode);
        }
    }

NOTE: Please visit https://github.com/mike-ensor/custom-exception-testing to get a copy of a working Hamcrest Matcher, JUnit @Rule and ExpectedException.

And there you have it, a quick overview of different ways to test exceptions thrown by your code, along with the ability to test for specific messages and fields from within custom exception classes. Please be specific with your test cases and try to target the exact case you have set up for your test. Remember, tests can save you from introducing side-effect bugs! Happy coding and don’t forget to share!

Reference: Testing Custom Exceptions w/ JUnit’s ExpectedException and @Rule from our JCG partner Mike at the Mike’s site blog....
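To make the mechanics less magical, here is a minimal, dependency-free sketch of what an “expect this exception, then check a custom field” helper does under the hood. It is not JUnit’s or Hamcrest’s implementation; ExceptionExpectation, CodedException and expectThrows are hypothetical names invented for this illustration, with CodedException standing in for the CustomException used above.

```java
import java.util.function.Predicate;

// Hypothetical stand-in for the rule + matcher combination, using only the JDK.
final class ExceptionExpectation {

    /** Hypothetical custom exception that carries an error code. */
    static final class CodedException extends RuntimeException {
        private final String errorCode;

        CodedException(String errorCode, String message) {
            super(message);
            this.errorCode = errorCode;
        }

        String getErrorCode() {
            return errorCode;
        }
    }

    /**
     * Runs the body and fails with an AssertionError unless it throws an
     * exception of the given type that also satisfies the extra check.
     */
    static <T extends Throwable> T expectThrows(Class<T> type, Predicate<T> check, Runnable body) {
        try {
            body.run();
        } catch (Throwable thrown) {
            if (!type.isInstance(thrown)) {
                throw new AssertionError("Expected " + type.getName() + " but got " + thrown, thrown);
            }
            T typed = type.cast(thrown);
            if (!check.test(typed)) {
                throw new AssertionError("Exception did not satisfy the extra check: " + thrown, thrown);
            }
            return typed; // expected type and the custom-field check passed
        }
        // The body completed normally, so the expectation was not met
        throw new AssertionError("Expected " + type.getName() + " but nothing was thrown");
    }
}
```

A call such as expectThrows(CodedException.class, e -> "110501".equals(e.getErrorCode()), () -> obj.methodThatThrowsCustomException()) plays the same role as the two thrown.expect(...) lines in the test case, where obj is whatever object is under test. The @Rule approach is still nicer in real tests because the expectation is declared once and the framework does the catching.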

Log4j Thread Deadlock – A Case Study

This case study describes the complete root cause analysis and resolution of an Apache Log4j thread race problem affecting a Weblogic Portal 10.0 production environment. It will also demonstrate the importance of proper Java classloader knowledge when developing and supporting Java EE applications. This article is also another opportunity for you to improve your thread dump analysis skills and understand thread race conditions.

Environment specifications

Java EE server: Oracle Weblogic Portal 10.0
OS: Solaris 10
JDK: Oracle/Sun HotSpot JVM 1.5
Logging API: Apache Log4j 1.2.15
RDBMS: Oracle 10g
Platform type: Web Portal

Troubleshooting tools

Quest Foglight for Java (monitoring and alerting)
Java VM Thread Dump (thread race analysis)

Problem overview

Major performance degradation was observed in one of our Weblogic Portal production environments. Alerts were also sent from the Foglight agents indicating a significant surge in Weblogic thread utilization, up to the default upper limit of 400.

Gathering and validation of facts

As usual, a Java EE problem investigation requires gathering technical and non-technical facts so we can either derive other facts and/or conclude on the root cause. Before applying a corrective measure, the facts below were verified in order to conclude on the root cause:

What is the client impact? HIGH
Recent change of the affected platform? Yes, a recent deployment was performed involving minor content changes and some Java library changes and refactoring
Any recent traffic increase to the affected platform? No
How long has this problem been observed? New problem observed following the deployment
Did a restart of the Weblogic server resolve the problem? No, any restart attempt resulted in an immediate surge of threads
Did a rollback of the deployment changes resolve the problem? Yes

Conclusion #1: The problem appears to be related to the recent changes. However, the team was initially unable to pinpoint the root cause. This is what we will discuss for the rest of the article.

Weblogic hogging thread report

The initial thread surge problem was reported by Foglight. As you can see below, the thread utilization was significant (up to 400), leading to a high volume of pending client requests and ultimately major performance degradation.

As usual, thread problems require proper thread dump analysis in order to pinpoint the source of thread contention. Lacking this critical analysis skill will prevent you from going any further in the root cause analysis. For our case study, a few thread dump snapshots were generated from our Weblogic servers using the simple Solaris OS command kill -3 <Java PID>. Thread dump data was then extracted from the Weblogic standard output log files.

Thread Dump analysis

The first step of the analysis was to perform a fast scan of all stuck threads and pinpoint a problem “pattern”. We found 250 threads stuck in the following execution path:

    "[ACTIVE] ExecuteThread: '20' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=10 tid=0x03c4fc38 nid=0xe6 waiting for monitor entry [0x3f99e000..0x3f99f970]
        at org.apache.log4j.Category.callAppenders(Category.java:186)
        - waiting to lock <0x8b3c4c68> (a org.apache.log4j.spi.RootCategory)
        at org.apache.log4j.Category.forcedLog(Category.java:372)
        at org.apache.log4j.Category.log(Category.java:864)
        at org.apache.commons.logging.impl.Log4JLogger.debug(Log4JLogger.java:110)
        at org.apache.beehive.netui.util.logging.Logger.debug(Logger.java:119)
        at org.apache.beehive.netui.pageflow.DefaultPageFlowEventReporter.beginPageRequest(DefaultPageFlowEventReporter.java:164)
        at com.bea.wlw.netui.pageflow.internal.WeblogicPageFlowEventReporter.beginPageRequest(WeblogicPageFlowEventReporter.java:248)
        at org.apache.beehive.netui.pageflow.PageFlowPageFilter.doFilter(PageFlowPageFilter.java:154)
        at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
        at com.bea.p13n.servlets.PortalServletFilter.doFilter(PortalServletFilter.java:336)
        at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
        at weblogic.servlet.internal.RequestDispatcherImpl.invokeServlet(RequestDispatcherImpl.java:526)
        at weblogic.servlet.internal.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:261)
        at <App>.AppRedirectFilter.doFilter(RedirectFilter.java:83)
        at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
        at <App>.AppServletFilter.doFilter(PortalServletFilter.java:336)
        at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
        at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3393)
        at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
        at weblogic.security.service.SecurityManager.runAs(Unknown Source)
        at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2140)
        at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2046)
        at weblogic.servlet.internal.ServletRequestImpl.run(Unknown Source)
        at weblogic.work.ExecuteThread.execute(ExecuteThread.java:200)
        at weblogic.work.ExecuteThread.run(ExecuteThread.java:172)

As you can see, it appears that all the threads are waiting to acquire a lock on an Apache Log4j object monitor (org.apache.log4j.spi.RootCategory) when attempting to log debug information to the configured appender and log file. How did we figure that out from this thread stack trace? Let’s dissect this thread stack trace in order for you to better understand this thread race condition, i.e. 250 threads attempting to acquire the same object monitor concurrently.

At this point the main question is: why are we seeing this problem suddenly? An increase of the logging level or load was ruled out after proper verification. The fact that the rollback of the previous changes fixed the problem naturally led us to perform a deeper review of the promoted changes. Before we go to the final root cause section, we will perform a code review of the affected Log4j code, i.e. the code exposed to thread race conditions.

Apache Log4j 1.2.15 code review

    ## org.apache.log4j.Category

    /**
     * Call the appenders in the hierrachy starting at <code>this</code>. If no
     * appenders could be found, emit a warning.
     *
     * <p>
     * This method calls all the appenders inherited from the hierarchy
     * circumventing any evaluation of whether to log or not to log the
     * particular log request.
     *
     * @param event
     *            the event to log.
     */
    public void callAppenders(LoggingEvent event) {
        int writes = 0;

        for (Category c = this; c != null; c = c.parent) {
            // Protected against simultaneous call to addAppender,
            // removeAppender,...
            synchronized (c) {
                if (c.aai != null) {
                    writes += c.aai.appendLoopOnAppenders(event);
                }
                if (!c.additive) {
                    break;
                }
            }
        }

        if (writes == 0) {
            repository.emitNoAppenderWarning(this);
        }
    }

As you can see, Category.callAppenders() uses a synchronized block at the Category level, which can lead to a severe thread race condition under heavy concurrent load. In this scenario, the usage of a re-entrant read write lock would have been more appropriate (such a lock strategy allows concurrent “read” but single “write”). You can find a reference to this known Apache Log4j limitation below, along with some possible solutions.

https://issues.apache.org/bugzilla/show_bug.cgi?id=41214

Is the above Log4j behaviour the actual root cause of our problem? Not so fast… Let’s remember that this problem got exposed only following a recent deployment. The real question is: what application change triggered this problem and side effect from the Apache Log4j logging API?

Root cause: a perfect storm!

Deep dive analysis of the recently deployed changes revealed that some Log4j libraries at the child classloader level were removed along with the associated “child first” policy. This refactoring exercise ended up moving the delegation of both Commons Logging and Log4j to the parent classloader level. What is the problem?

Before this change, the logging events were split between Weblogic Beehive Log4j calls at the parent classloader and web application logging events at the child classloader. Since each classloader had its own copy of the Log4j objects, the thread race condition problem was split in half and not exposed (masked) under the current load conditions. Following the refactoring, all Log4j calls were moved to the parent classloader (Java EE app), adding a significant level of concurrency to Log4j components such as Category. This increased concurrency level, along with the known Category.java thread race / deadlock behaviour, was a perfect storm for our production environment.

In order to mitigate this problem, 2 immediate solutions were applied to the environment:

1. Roll back the refactoring and split Log4j calls back between the parent and child classloaders.
2. Reduce the logging level for some appenders from DEBUG to WARNING.

This problem case again reinforces the importance of performing proper testing and impact assessment when applying changes such as library and classloader related changes. Such changes can appear simple on the “surface” but can trigger some deep execution pattern changes, exposing your application(s) to known thread race conditions. A future upgrade to Apache Log4j 2 (or other logging APIs) will also be explored, as it is expected to bring some performance enhancements which may address some of these thread race and scalability concerns.

Please provide any comments or share your experience on thread race related problems with logging APIs. Happy coding and don’t forget to share!
Reference: Log4j Thread Deadlock – A Case Study from our JCG partner Pierre-Hugues Charbonneau at the Java EE Support Patterns & Java Tutorial blog....

JUnit Pass Test Case on Failures

Why create a mechanism to expect a test failure?

There comes a time when one would want and expect a JUnit @Test case to fail. Though this is pretty rare, it happens. I had the need to detect when a JUnit test fails and then, if expected, to pass instead of fail. The specific case was that I was testing a piece of code that could throw an Assert error inside of a call of the object. The code was written to be an enhancement to the popular new Fest Assertions framework, so in order to test the functionality, one would expect test cases to fail on purpose.

A Solution

One possible solution is to utilize the functionality provided by a JUnit @Rule in conjunction with a custom marker in the form of an annotation.

Why use a @Rule?

@Rule objects provide an AOP-like interface to a test class and each test case. Rules are reset prior to each test case being run, and they expose the workings of the test case in the style that an @Around AspectJ advice would.

Required code elements

- @Rule object to check the status of each @Test case
- @ExpectedFailure custom marker annotation
- Test cases proving the code works!
- Optional specific exception to be thrown if an annotated test case does not fail

NOTE: working code is available on my github page and has been added to Maven Central. Feel free to fork the project and submit a pull request.

Maven Usage

    <dependency>
        <groupId>com.clickconcepts.junit</groupId>
        <artifactId>expected-failure</artifactId>
        <version>0.0.9</version>
    </dependency>

Example Usage

In this example, the ‘exception’ object is a Fest assertion enhanced ExpectedException (look for my next post to expose this functionality).
The expected exception will make assertions, and in order to test those, the test case must be marked as @ExpectedFailure.

    public class ExceptionAssertTest {

        @Rule
        public ExpectedException exception = ExpectedException.none();

        @Rule
        public ExpectedTestFailureWatcher watcher = ExpectedTestFailureWatcher.instance();

        @Test
        @ExpectedFailure(reason = "The matcher should fail because exception is not a SimpleException")
        public void assertSimpleExceptionAssert_exceptionIsOfType() {
            // expected exception will be of type 'SimpleException'
            exception.instanceOf(SimpleException.class);
            // throw something other than SimpleException...expect failure
            throw new RuntimeException("this is an exception");
        }
    }

Implementation of Solution

Reminder, the latest code is available on my github page.

@Rule code (ExpectedTestFailureWatcher.java)

    import org.junit.rules.TestRule;
    import org.junit.runner.Description;
    import org.junit.runners.model.Statement;
    // YEAH Guava!!
    import static com.google.common.base.Strings.isNullOrEmpty;

    public class ExpectedTestFailureWatcher implements TestRule {

        /**
         * Static factory to an instance of this watcher
         *
         * @return New instance of this watcher
         */
        public static ExpectedTestFailureWatcher instance() {
            return new ExpectedTestFailureWatcher();
        }

        @Override
        public Statement apply(final Statement base, final Description description) {
            return new Statement() {
                @Override
                public void evaluate() throws Throwable {
                    boolean expectedToFail = description.getAnnotation(ExpectedFailure.class) != null;
                    boolean failed = false;
                    try {
                        // allow test case to execute
                        base.evaluate();
                    } catch (Throwable exception) {
                        failed = true;
                        if (!expectedToFail) {
                            throw exception; // did not expect to fail and failed...fail
                        }
                    }
                    // placed outside of catch
                    if (expectedToFail && !failed) {
                        throw new ExpectedTestFailureException(getUnFulfilledFailedMessage(description));
                    }
                }

                /**
                 * Extracts detailed message about why test failed
                 *
                 * @param description description of the test case being evaluated
                 * @return the reason given on the annotation, or a default message
                 */
                private String getUnFulfilledFailedMessage(Description description) {
                    String reason = null;
                    if (description.getAnnotation(ExpectedFailure.class) != null) {
                        reason = description.getAnnotation(ExpectedFailure.class).reason();
                    }
                    if (isNullOrEmpty(reason)) {
                        reason = "Should have failed but didn't";
                    }
                    return reason;
                }
            };
        }
    }

@ExpectedFailure custom annotation (ExpectedFailure.java)

    import java.lang.annotation.*;

    /**
     * Initially this is just a marker annotation to be used by a JUnit4 Test case in conjunction
     * with ExpectedTestFailure @Rule to indicate that a test is supposed to be failing
     */
    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @Target(value = ElementType.METHOD)
    public @interface ExpectedFailure {
        // TODO: enhance by adding specific information about what type of failure expected
        //Class assertType() default Throwable.class;

        /**
         * Text based reason for marking test as ExpectedFailure
         *
         * @return String
         */
        String reason() default "";
    }

Custom Exception (optional, you can easily just throw RuntimeException or an existing custom exception)

    public class ExpectedTestFailureException extends Throwable {
        public ExpectedTestFailureException(String message) {
            super(message);
        }
    }

Can’t one exploit the ability to mark a failure as expected?

With great power comes great responsibility. It is advised that you do not mark a test as being @ExpectedFailure if you do not understand exactly why the test is failing. It is recommended that this testing method be implemented with care. DO NOT use the @ExpectedFailure annotation as an alternative to @Ignore.

Possible future enhancements could include ways to specify the specific assertion or the specific message asserted during the test case execution.

Known issues

In its current state, the @ExpectedFailure annotation can cover up additional assertions, and until the future enhancements have been put into place, it is advised to use this methodology wisely.
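The known issue can be illustrated without JUnit. The sketch below mirrors only the pass/fail decision of the watcher's evaluate() method; the class ExpectedFailureLogic and the TestBody interface are hypothetical names introduced for this illustration and are not part of the library. Because any Throwable counts as "the test failed", an unrelated bug is accepted just as readily as the intended assertion error.

```java
/** Plain-Java mirror of the watcher's decision logic; hypothetical, not part of the library. */
final class ExpectedFailureLogic {

    /** Stand-in for a test method body that may throw anything. */
    interface TestBody {
        void run() throws Throwable;
    }

    /** Returns true when the outcome matches the expectation, as the rule would report a pass. */
    static boolean passes(TestBody body, boolean expectedToFail) {
        boolean failed = false;
        try {
            body.run();
        } catch (Throwable anyFailure) {
            // Every kind of failure is treated alike: this is what masks unrelated errors.
            failed = true;
        }
        return failed == expectedToFail;
    }
}
```

With this logic, a body that throws an AssertionError and a body that throws a NullPointerException both "pass" when a failure is expected, while a body that completes normally does not.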
Reference: Allowing JUnit Tests to Pass Test Case on Failures from our JCG partner Mike at the Mike’s site blog....

Does Immutability really means Thread Safety?

I have often read articles stating “If an object is immutable, it is thread safe”. Actually, I have never found an article that convinces me that immutable means thread safe. Even the book Java Concurrency in Practice by Brian Goetz, with its chapter on immutability, did not fully satisfy me. In this book we can read, word for word, in a frame: Immutable objects are always thread-safe. I think this sentence deserves more explanation. So I am going to try to define immutability and its relation to thread safety.

Definitions

Immutability

My definition is “An immutable object is an object whose state does not change after its construction”. I am deliberately vague, since no one really agrees on the exact definition.

Thread safety

You can find a lot of different definitions of “thread safe” on the internet. It’s actually very tricky to define. I would say that thread safe code is code which has an expected behaviour in a multi-threaded environment. I let you define “expected behaviour”…

The String example

Let’s have a look at the code of String (actually just a part of the code…):

    public class String {
        private final char value[];

        /** Cache the hash code for the string */
        private int hash; // Default to 0

        public String(char[] value) {
            this.value = Arrays.copyOf(value, value.length);
        }

        public int hashCode() {
            int h = hash;
            if (h == 0 && value.length > 0) {
                char val[] = value;

                for (int i = 0; i < value.length; i++) {
                    h = 31 * h + val[i];
                }
                hash = h;
            }
            return h;
        }
    }

String is considered immutable. Looking at its implementation, we can deduce one thing: an immutable object can change its internal state (in this case, the hash code, which is lazily computed) as long as the change is not externally visible.
Now I am going to rewrite the hashCode method in a non thread safe way:

    public int hashCode() {
        if (hash == 0 && value.length > 0) {
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                hash = 31 * hash + val[i];
            }
        }
        return hash;
    }

As you can see, I have removed the local variable h and assigned to the field hash directly instead. This implementation is NOT thread safe! If several threads call hashCode at the same time, the returned value could be different for each thread: a thread can read hash while another thread is still in the middle of the loop, see a non-zero partial result, and return it.

The question is, is this class immutable? Since two different threads can see a different hash code, from an external point of view we have a change of state, and so it is not immutable. We can therefore conclude that String is immutable because it is thread safe, and not the other way around. So… what’s the point of saying “Make your objects immutable, they will be thread-safe! But take care, you have to make your immutable objects thread-safe!”?

The ImmutableSimpleDateFormat example

Below, I have written a class similar to SimpleDateFormat.

    public class VerySimpleDateFormat {

        private final DateFormat formatter = SimpleDateFormat.getDateInstance(SimpleDateFormat.SHORT);

        public String format(Date d) {
            return formatter.format(d);
        }
    }

This code is not thread safe because SimpleDateFormat.format is not. Is this object immutable? Good question! We have done our best to make all fields unmodifiable, and we don’t use any setter or any method that suggests that the state of the object will change. Actually, SimpleDateFormat changes its state internally, and that’s what makes it not thread safe. Since something changes in the object graph, I would say that it’s not immutable, even if it looks like it… The problem is not even that SimpleDateFormat changes its internal state; the problem is that it does so in a non-thread-safe way. The conclusion of this example: it is not that easy to make an immutable class.
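One way around the shared mutable formatter, sketched here under stated assumptions, is to give every thread its own instance through a ThreadLocal. The class name PerThreadDateFormat is hypothetical, and the fixed pattern, Locale.ROOT and UTC time zone are choices made only so that the output is predictable; the class above uses the default short date style instead. ThreadLocal.withInitial requires Java 8 or later.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical variant of the formatter class: one SimpleDateFormat per thread. */
final class PerThreadDateFormat {

    // Each thread lazily creates and then reuses its own formatter,
    // so the mutable internals of SimpleDateFormat are never shared.
    private final ThreadLocal<DateFormat> formatter = ThreadLocal.withInitial(() -> {
        DateFormat f = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f;
    });

    String format(Date d) {
        return formatter.get().format(d);
    }
}
```

The object graph still contains mutable state, so whether this class deserves to be called immutable remains a matter of definition; it is simply arranged so that no two threads touch the same formatter.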
The final keyword is not enough: you have to make sure that the objects referenced by your fields don’t change their state either, which is sometimes impossible.

Immutable objects can have non thread-safe methods (No magic!)

Let’s have a look at the following code.

    import java.io.IOException;

    public class HelloAppender {

        private final String greeting;

        public HelloAppender(String name) {
            this.greeting = "hello " + name + "!\n";
        }

        public void appendTo(Appendable app) throws IOException {
            app.append(greeting);
        }
    }

The class HelloAppender is definitely immutable. The method appendTo accepts an Appendable. Since an Appendable is not guaranteed to be thread-safe (e.g. StringBuilder), appending to this Appendable will cause problems in a multi-threaded environment.

Conclusion

Making objects immutable is definitely a good practice in some cases, and it helps a lot in writing thread-safe code. But it bothers me when I read everywhere that immutable objects are thread safe, displayed as an axiom. I get the point, but I think it is always good to think a bit about it in order to understand what causes non-thread-safe code.

Thanks to the comment of Jose, I end this article with a different conclusion. It’s all about the definition of immutable. It needs clarification! An object is immutable if:

- All its fields are initialized before being used (which means you can do lazy initialization).
- The state of the fields does not change after their initialization (“does not change” means that the object graph doesn’t change, including the internal state of the children).

An immutable object will always be thread-safe unless it has to manipulate non-thread-safe objects.

Reference: Do Immutability really means Thread Safety? from our JCG partner Tibo Delor at the InvalidCodeException blog....

MapReduce: Working Through Data-Intensive Text Processing

It has been a while since I last posted, as I’ve been busy with some of the classes offered by Coursera. There are some very interesting offerings and they are worth a look. Some time ago, I purchased Data-Intensive Text Processing with MapReduce by Jimmy Lin and Chris Dyer. The book presents several key MapReduce algorithms, but in pseudo code format. My goal is to take the algorithms presented in chapters 3-6 and implement them in Hadoop, using Hadoop: The Definitive Guide by Tom White as a reference. I’m going to assume familiarity with Hadoop and MapReduce and not cover any introductory material. So let’s jump into chapter 3, MapReduce Algorithm Design, starting with local aggregation.

Local Aggregation

At a very high level, when Mappers emit data, the intermediate results are written to disk and then sent across the network to Reducers for final processing. Writing to disk and then transferring data across the network is an expensive operation in the processing of a MapReduce job. So it stands to reason that, whenever possible, reducing the amount of data sent from mappers would increase the speed of the MapReduce job. Local aggregation is a technique used to reduce the amount of data and improve the efficiency of our MapReduce job. Local aggregation cannot take the place of reducers, as we need a way to gather results with the same key from different mappers. We are going to consider 3 ways of achieving local aggregation:

- Using Hadoop Combiner functions.
- Two approaches to “in-mapper” combining presented in the Data-Intensive Text Processing with MapReduce book.

Of course, any optimization is going to have tradeoffs, and we’ll discuss those as well. To demonstrate local aggregation, we will run the ubiquitous word count job on a plain text version of A Christmas Carol by Charles Dickens (downloaded from Project Gutenberg) on a pseudo-distributed cluster installed on my MacBookPro, using the hadoop-0.20.2-cdh3u3 distribution from Cloudera.
I plan in a future post to run the same experiment on an EC2 cluster with more realistically sized data.

Combiners

A combiner function is an object that extends the Reducer class. In fact, for our examples here, we are going to re-use the same reducer used in the word count job. A combiner function is specified when setting up the MapReduce job like so:

    job.setCombinerClass(TokenCountReducer.class);

Here is the reducer code:

    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class TokenCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            // Sum all partial counts received for this token
            int count = 0;
            for (IntWritable value : values) {
                count += value.get();
            }
            context.write(key, new IntWritable(count));
        }
    }

The job of a combiner is to do just what the name implies: aggregate data, with the net result of less data being shuffled across the network, which gives us gains in efficiency. As stated before, keep in mind that reducers are still required to put together results with the same keys coming from different mappers.
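To see what the combiner buys us without starting a cluster, here is a plain-Java sketch with no Hadoop dependency. The class CombinerSketch and its two methods are hypothetical helpers written for this illustration; they only imitate what a word count mapper emits and what a summing combiner would do to one mapper's output before the shuffle.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

/** Hypothetical, Hadoop-free imitation of map output followed by local combining. */
final class CombinerSketch {

    /** What a plain word count mapper emits: one (token, 1) pair per token. */
    static List<Map.Entry<String, Integer>> mapOutput(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<>(tokenizer.nextToken(), 1));
        }
        return pairs;
    }

    /** Local aggregation: sums the values per key, as the summing reducer does. */
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> sums = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            sums.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return sums;
    }
}
```

For the input "to be or not to be", mapOutput yields six pairs, while combine reduces them to four records (to=2, be=2, or=1, not=1). Those four records are what would have to cross the network.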
Since combiner functions are an optimization, the Hadoop framework offers no guarantees on how many times a combiner will be called, if at all.

In Mapper Combining Option 1

The first alternative to using Combiners (figure 3.2 page 41) is very straightforward and makes a slight modification to our original word count mapper:

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;
    import java.util.StringTokenizer;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class PerDocumentMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            IntWritable writableCount = new IntWritable();
            Text text = new Text();
            Map<String, Integer> tokenMap = new HashMap<String, Integer>();
            StringTokenizer tokenizer = new StringTokenizer(value.toString());

            // Count every token of this input line locally
            while (tokenizer.hasMoreElements()) {
                String token = tokenizer.nextToken();
                Integer count = tokenMap.get(token);
                if (count == null) count = 0;
                count += 1;
                tokenMap.put(token, count);
            }

            // Emit one (token, count) pair per distinct token in the line
            Set<String> keys = tokenMap.keySet();
            for (String s : keys) {
                text.set(s);
                writableCount.set(tokenMap.get(s));
                context.write(text, writableCount);
            }
        }
    }

As we can see here, instead of emitting a word with a count of 1 for each word encountered, we use a map to keep track of each word already processed. Then, when all of the tokens are processed, we loop through the map and emit the total count for each word encountered in that line.

In Mapper Combining Option 2

The second option of in-mapper combining (figure 3.3 page 41) is very similar to the above example, with two distinctions: when the hash map is created, and when we emit the results contained in the map. In the above example, a map is created and has its contents dumped over the wire for each invocation of the map method. In this example we are going to make the map an instance variable and shift the instantiation of the map to the setup method in our mapper. Likewise, the contents of the map will not be sent out to the reducers until all of the calls to map have completed and the cleanup method is called.
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;
    import java.util.StringTokenizer;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class AllDocumentMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        private Map<String, Integer> tokenMap;

        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            tokenMap = new HashMap<String, Integer>();
        }

        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            // Accumulate counts across all map calls; nothing is emitted here
            StringTokenizer tokenizer = new StringTokenizer(value.toString());
            while (tokenizer.hasMoreElements()) {
                String token = tokenizer.nextToken();
                Integer count = tokenMap.get(token);
                if (count == null) count = 0;
                count += 1;
                tokenMap.put(token, count);
            }
        }

        @Override
        protected void cleanup(Context context) throws IOException, InterruptedException {
            // Emit the accumulated totals once, after the last map call
            IntWritable writableCount = new IntWritable();
            Text text = new Text();
            Set<String> keys = tokenMap.keySet();
            for (String s : keys) {
                text.set(s);
                writableCount.set(tokenMap.get(s));
                context.write(text, writableCount);
            }
        }
    }

As we can see from the above code example, the mapper is keeping track of unique word counts across all calls to the map method. By keeping track of unique tokens and their counts, there should be a substantial reduction in the number of records sent to the reducers, which in turn should improve the running time of the MapReduce job. This accomplishes the same effect as using the combiner function option provided by the MapReduce framework, but in this case you are guaranteed that the combining code will be called. But there are some caveats with this approach also. Keeping state across map calls could prove problematic and is definitely a violation of the functional spirit of a “map” function. Also, by keeping state across all map calls, depending on the data used in the job, memory could be another issue to contend with. Ultimately, one would have to weigh all of the trade-offs to determine the best approach.

Results

Now let’s take a look at some results of the different mappers.
Since the job was run in pseudo-distributed mode, actual running times are irrelevant, but we can still infer how using local aggregation could impact the efficiency of a MapReduce job running on a real cluster.

Per Token Mapper:

    12/09/13 21:25:32 INFO mapred.JobClient: Reduce shuffle bytes=366010
    12/09/13 21:25:32 INFO mapred.JobClient: Reduce output records=7657
    12/09/13 21:25:32 INFO mapred.JobClient: Spilled Records=63118
    12/09/13 21:25:32 INFO mapred.JobClient: Map output bytes=302886

In Mapper Combining Option 1:

    12/09/13 21:28:15 INFO mapred.JobClient: Reduce shuffle bytes=354112
    12/09/13 21:28:15 INFO mapred.JobClient: Reduce output records=7657
    12/09/13 21:28:15 INFO mapred.JobClient: Spilled Records=60704
    12/09/13 21:28:15 INFO mapred.JobClient: Map output bytes=293402

In Mapper Combining Option 2:

    12/09/13 21:30:49 INFO mapred.JobClient: Reduce shuffle bytes=105885
    12/09/13 21:30:49 INFO mapred.JobClient: Reduce output records=7657
    12/09/13 21:30:49 INFO mapred.JobClient: Spilled Records=15314
    12/09/13 21:30:49 INFO mapred.JobClient: Map output bytes=90565

Combiner Option:

    12/09/13 21:22:18 INFO mapred.JobClient: Reduce shuffle bytes=105885
    12/09/13 21:22:18 INFO mapred.JobClient: Reduce output records=7657
    12/09/13 21:22:18 INFO mapred.JobClient: Spilled Records=15314
    12/09/13 21:22:18 INFO mapred.JobClient: Map output bytes=302886
    12/09/13 21:22:18 INFO mapred.JobClient: Combine input records=31559
    12/09/13 21:22:18 INFO mapred.JobClient: Combine output records=7657

As expected, the mapper that did no combining had the worst results, followed closely by the first in-mapper combining option (although these results could have been made better had the data been cleaned up before running the word count). The second in-mapper combining option and the combiner function had virtually identical results. The significant fact is that both produced roughly 70% fewer reduce shuffle bytes than the first two options.
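The size of that saving can be checked directly from the counters in the logs above. The helper below is a hypothetical one-liner, added only to make the arithmetic explicit.

```java
/** Hypothetical helper that makes the shuffle-byte arithmetic explicit. */
final class ShuffleReduction {

    /** Percentage by which {@code after} is smaller than {@code before}. */
    static double reductionPercent(long before, long after) {
        return 100.0 * (before - after) / before;
    }
}
```

Going from 366010 shuffle bytes (per token mapper) to 105885 is a reduction of about 71%, and going from 354112 (in-mapper option 1) to 105885 is a reduction of about 70%.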
Reducing the number of bytes sent over the network to the reducers by that amount would surely have a positive impact on the efficiency of a MapReduce job. There is one point to keep in mind here: Combiners and in-mapper combining cannot be used in just any MapReduce job. In this case, the word count lends itself very nicely to such an enhancement, but that might not always be true.

Conclusion

As you can see, the benefits of using either in-mapper combining or the Hadoop combiner function deserve serious consideration when looking to improve the performance of your MapReduce jobs. As for which approach to use, it is up to you to weigh the trade-offs of each.

Related links

- Data-Intensive Text Processing with MapReduce by Jimmy Lin and Chris Dyer
- Hadoop: The Definitive Guide by Tom White
- Source Code from blog
- MRUnit for unit testing Apache Hadoop map reduce jobs
- Project Gutenberg, a great source of books in plain text format, great for testing Hadoop jobs locally.

Happy coding and don’t forget to share!

Reference: Working Through Data-Intensive Text Processing with MapReduce from our JCG partner Bill Bejeck at the Random Thoughts On Coding blog....
Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.