Template Method Pattern Example Using Java Generics

If you find that a lot of your routines are exactly the same except for certain sections, you might want to consider the Template Method to eliminate error-prone code duplication. Here's an example: below are two classes that do similar things:

1. Instantiate and initialize a Reader to read from a CSV file.
2. Read each line and break it up into tokens.
3. Unmarshal the tokens from each line into an entity, either a Product or a Customer.
4. Add each entity into a Set.
5. Return the Set.

As you can see, it's only in the third step that there's a difference – unmarshalling to one entity or another. All other steps are the same. The line that differs in each of the snippets is the one that constructs the entity.

ProductCsvReader.java

public class ProductCsvReader {
    Set<Product> getAll(File file) throws IOException {
        Set<Product> returnSet = new HashSet<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
            String line = reader.readLine();
            while (line != null && !line.trim().equals("")) {
                String[] tokens = line.split("\\s*,\\s*");
                Product product = new Product(Integer.parseInt(tokens[0]), tokens[1],
                        new BigDecimal(tokens[2]));
                returnSet.add(product);
                line = reader.readLine();
            }
        }
        return returnSet;
    }
}

CustomerCsvReader.java

public class CustomerCsvReader {
    Set<Customer> getAll(File file) throws IOException {
        Set<Customer> returnSet = new HashSet<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
            String line = reader.readLine();
            while (line != null && !line.trim().equals("")) {
                String[] tokens = line.split("\\s*,\\s*");
                Customer customer = new Customer(Integer.parseInt(tokens[0]), tokens[1],
                        tokens[2], tokens[3]);
                returnSet.add(customer);
                line = reader.readLine();
            }
        }
        return returnSet;
    }
}

For this example, there are only two entities, but a real system might have dozens of entities, so that's a lot of error-prone duplicate code. You might find a similar situation with DAOs, where the select, insert, update, and delete operations of each DAO would do the same thing, only working with different entities and tables.

Let's start refactoring this troublesome code. According to one of the design principles found in the first part of the GoF Design Patterns book, we should "encapsulate the concept that varies." Between ProductCsvReader and CustomerCsvReader, what varies is the line that creates the entity. So our goal is to encapsulate what varies into separate classes, while moving what stays the same into a single class. Let's start editing just one class first, ProductCsvReader. We use Extract Method to extract that line into its own method:

ProductCsvReader.java after Extract Method

public class ProductCsvReader {
    Set<Product> getAll(File file) throws IOException {
        Set<Product> returnSet = new HashSet<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
            String line = reader.readLine();
            while (line != null && !line.trim().equals("")) {
                String[] tokens = line.split("\\s*,\\s*");
                Product product = unmarshall(tokens);
                returnSet.add(product);
                line = reader.readLine();
            }
        }
        return returnSet;
    }

    Product unmarshall(String[] tokens) {
        Product product = new Product(Integer.parseInt(tokens[0]), tokens[1],
                new BigDecimal(tokens[2]));
        return product;
    }
}

Now that we have separated what varies from what stays the same, we will create a parent class that will hold the code that stays the same for both classes. Let's call this parent class AbstractCsvReader. Let's make it abstract since there's no reason for the class to be instantiated on its own.
We'll then use the Pull Up Method refactoring to move the method that stays the same to this parent class.

AbstractCsvReader.java

abstract class AbstractCsvReader {
    Set<Product> getAll(File file) throws IOException {
        Set<Product> returnSet = new HashSet<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
            String line = reader.readLine();
            while (line != null && !line.trim().equals("")) {
                String[] tokens = line.split("\\s*,\\s*");
                Product product = unmarshall(tokens);
                returnSet.add(product);
                line = reader.readLine();
            }
        }
        return returnSet;
    }
}

ProductCsvReader.java after Pull Up Method

public class ProductCsvReader extends AbstractCsvReader {
    Product unmarshall(String[] tokens) {
        Product product = new Product(Integer.parseInt(tokens[0]), tokens[1],
                new BigDecimal(tokens[2]));
        return product;
    }
}

The parent class won't compile since it calls an "unmarshall" method that's only found in the subclass, so we need to declare an abstract method called unmarshall.

AbstractCsvReader.java with abstract unmarshall method

abstract class AbstractCsvReader {
    Set<Product> getAll(File file) throws IOException {
        Set<Product> returnSet = new HashSet<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
            String line = reader.readLine();
            while (line != null && !line.trim().equals("")) {
                String[] tokens = line.split("\\s*,\\s*");
                Product product = unmarshall(tokens);
                returnSet.add(product);
                line = reader.readLine();
            }
        }
        return returnSet;
    }

    abstract Product unmarshall(String[] tokens);
}

Now at this point, AbstractCsvReader will make a great parent for ProductCsvReader, but not for CustomerCsvReader. CustomerCsvReader will not compile if you extend it from AbstractCsvReader. To fix this, we use generics.

AbstractCsvReader.java with Generics

abstract class AbstractCsvReader<T> {
    Set<T> getAll(File file) throws IOException {
        Set<T> returnSet = new HashSet<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
            String line = reader.readLine();
            while (line != null && !line.trim().equals("")) {
                String[] tokens = line.split("\\s*,\\s*");
                T element = unmarshall(tokens);
                returnSet.add(element);
                line = reader.readLine();
            }
        }
        return returnSet;
    }

    abstract T unmarshall(String[] tokens);
}

ProductCsvReader.java with Generics

public class ProductCsvReader extends AbstractCsvReader<Product> {
    @Override
    Product unmarshall(String[] tokens) {
        Product product = new Product(Integer.parseInt(tokens[0]), tokens[1],
                new BigDecimal(tokens[2]));
        return product;
    }
}

CustomerCsvReader.java with Generics

public class CustomerCsvReader extends AbstractCsvReader<Customer> {
    @Override
    Customer unmarshall(String[] tokens) {
        Customer customer = new Customer(Integer.parseInt(tokens[0]), tokens[1],
                tokens[2], tokens[3]);
        return customer;
    }
}

And that's it! No more duplicate code! The method in the parent class is the "template", which holds the code that stays the same. The things that change are left as abstract methods, which are implemented in the child classes. Remember that when you refactor, you should always have automated unit tests to make sure you don't break your code. I used JUnit for mine. You can find the code I've posted here, as well as a few other Design Patterns examples, at this Github repository. Before I go, I'd like to leave a quick note on the disadvantage of the Template Method. The Template Method relies on inheritance, which suffers from the Fragile Base Class Problem.
In a nutshell, the Fragile Base Class Problem describes how changes in base classes get inherited by subclasses, often causing undesired effects. In fact, one of the underlying design principles found at the beginning of the GoF book is "favor composition over inheritance", and many of the other design patterns show how to avoid code duplication, complexity or other error-prone code with less dependence on inheritance. Please give me feedback so I can continue to improve my articles.

Reference: Template Method Pattern Example Using Java Generics from our JCG partner Calen Legaspi at the Calen Legaspi blog.

Camel on JBoss EAP with Custom Modules

Apache Camel — the best open source integration library

Apache Camel is an awesome, open-source integration library that can be used as the backbone of an ESB, or in standalone applications to do routing, transformation, or mediation of systems (read: integrating multiple systems). Camel is quite versatile and does not force users to deploy into any particular container or JVM technology. Deploy into OSGi for flexible modularity, deploy into Java EE when you use the Java EE stack, or deploy into Plain Jane Java Main if you're doing lightweight microservices-style deployments.

Running Camel on EAP

I've had a few people ask questions recently about running Camel on JBoss Enterprise Application Platform, and I can usually say "well, look at this awesome blog someone did about doing just that." However, for some of the folks at large companies that prefer to curate their usage of third-party libraries and put them into a globally accessible classpath, packaging the Camel libs into their WAR/EAR is not an option.

Here are some reasons why you might want to package Camel on EAP as a global library:

- Golden image, curated list
- Reduce bloated WAR deployments
- Can patch/update libs at a single source location
- Assure all applications are using the approved versions

Why you might NOT want to do this:

- Java EE containers are intended to be multi-tenant
- Not flexible in deployment options/versions
- Possible classpath issues/collisions depending on the third-party library and transitive dependencies
- Complicates the management of the Java EE container

EAP Modules

Regardless of the pro/con approaches, what's the best way to go about getting Camel packaged as a module on JBoss EAP so that you can use it from the global classpath? The answer is to use JBoss EAP's native modular system called, fittingly, "Modules." We can create custom modules for EAP and enable them for our skinny WARs.

Step by Step

For this blog, I'll use the previously created Camel example deployed as a simple WAR project. However, instead of including all of the Camel jars as <scope>compile</scope>, we will change the scope to provided:

<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-core</artifactId>
    <version>${camel.version}</version>
    <scope>provided</scope>
</dependency>

Just a refresher, the Maven scope options help you finely control how your dependencies are packaged and presented to the classpath:

- compile — the default scope, used for compiling the project and packaged onto the classpath as part of the package phase
- provided — the dependency is required at compile time, but is NOT packaged in the artifact produced by the build in the package phase
- runtime — the dependency must be on the classpath when it's run, but is not required for compilation and is also not packaged

There are a couple of others, but you may wish to check the docs to get a complete understanding.
So now that we’ve changed the scope to provided, if we do a build, we should be able to inspect our WAR and verify there are no Camel jars: Build the project from $SOURCE_ROOT ceposta@postamachat$ mvn clean install [INFO] ------------------------------------------------------------------------ [INFO] BUILD SUCCESS [INFO] ------------------------------------------------------------------------ [INFO] Total time: 3.324s [INFO] Finished at: Wed Jul 16 14:16:53 MST 2014 [INFO] Final Memory: 29M/310M [INFO] ------------------------------------------------------------------------ List the contents of the WAR ceposta@postamachat$ unzip -l target/camel-cxf-contract-first-1.0.0-SNAPSHOT.war Archive: target/camel-cxf-contract-first-1.0.0-SNAPSHOT.war Length Date Time Name -------- ---- ---- ---- 0 07-16-14 14:15 META-INF/ 132 07-16-14 14:15 META-INF/MANIFEST.MF 0 07-16-14 14:15 WEB-INF/ 0 07-16-14 14:15 WEB-INF/classes/ 0 07-16-14 14:15 WEB-INF/classes/camelinaction/ 0 07-16-14 14:15 WEB-INF/classes/camelinaction/order/ 0 07-16-14 14:15 WEB-INF/classes/META-INF/ 0 07-16-14 14:15 WEB-INF/classes/META-INF/spring/ 0 07-16-14 14:15 WEB-INF/classes/wsdl/ 1927 07-16-14 14:15 WEB-INF/classes/camelinaction/order/ObjectFactory.class 992 07-16-14 14:15 WEB-INF/classes/camelinaction/order/OrderEndpoint.class 1723 07-16-14 14:15 WEB-INF/classes/camelinaction/order/OrderEndpointImpl.class 2912 07-16-14 14:15 WEB-INF/classes/camelinaction/order/OrderEndpointService.class 604 07-16-14 14:15 WEB-INF/classes/log4j.properties 1482 07-16-14 14:15 WEB-INF/classes/META-INF/spring/camel-cxf.xml 1935 07-16-14 14:15 WEB-INF/classes/META-INF/spring/camel-route.xml 3003 07-16-14 14:15 WEB-INF/classes/wsdl/order.wsdl 1193 05-23-14 04:22 WEB-INF/web.xml 0 07-16-14 14:15 META-INF/maven/ 0 07-16-14 14:15 META-INF/maven/com.redhat.demos/ 0 07-16-14 14:15 META-INF/maven/com.redhat.demos/camel-cxf-contract-first/ 8070 07-16-14 14:03 META-INF/maven/com.redhat.demos/camel-cxf-contract-first/pom.xml 134 07-16-14 14:15 META-INF/maven/com.redhat.demos/camel-cxf-contract-first/pom.properties -------- ------- 24107 23 files If we try to deploy this project to EAP, we would surely run into classpath issues because Camel is not included by default on the classpath in EAP. So let’s build the modules ourselves. First, get access to EAP by downloading from the Red Hat support portal. (Note, these steps may work in Wildfly, but I’m using EAP for this discussion). NOTE: I will use JBoss EAP 6.2 for this example as well as the Red Hat distribution of Apache Camel which comes from JBoss Fuse 6.1 For each of the dependencies in your pom that you’d like to create a custom module for, you’ll have to repeat these steps (Note these steps are formalized in the EAP knowledge base on the Red Hat support portal): create a folder under $EAP_HOME/modules to store your new module ceposta@postamachat(jboss-eap-6.2) $ cd modules ceposta@postamachat(modules) $ mkdir -p org/apache/camel/core create a folder named main under the module folder, as this is where we’ll place the jars for the module ceposta@postamachat(modules) $ mkdir org/apache/camel/core/main Now we’ll need to find out which dependencies/jars need to go into this module. If you use Maven’s Dependency Plugin this should help out tremendously. NOTE: these steps are a one-time effort, however, it’s probably worth a little bit of time to automate these steps with perl/python/bash script. 
for this demo, I didn’t create a script, but if you do, I’d appreciate you sharing it with everyone either let me know on twitter @christianposta or do a pull request on the github project associated with this blog.. thanks! show the dependencies for the project and each artifact: ceposta@postamachat$ mvn dependency:tree[INFO] ------------------------------------------------------------------------ [INFO] Building [TODO]Camel CXF Contract First Example 1.0.0-SNAPSHOT [INFO] ------------------------------------------------------------------------ [INFO] [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ camel-cxf-contract-first --- [INFO] com.redhat.demos:camel-cxf-contract-first:war:1.0.0-SNAPSHOT [INFO] +- org.apache.camel:camel-core:jar:2.12.0.redhat-610379:provided [INFO] | \- com.sun.xml.bind:jaxb-impl:jar:2.2.6:provided [INFO] +- org.apache.camel:camel-cxf:jar:2.12.0.redhat-610379:provided [INFO] | +- org.apache.camel:camel-spring:jar:2.12.0.redhat-610379:provided [INFO] | | \- org.springframework:spring-tx:jar:3.2.8.RELEASE:provided [INFO] | +- org.apache.camel:camel-cxf-transport:jar:2.12.0.redhat-610379:provided [INFO] | +- org.apache.cxf:cxf-rt-frontend-jaxrs:jar:2.7.0.redhat-610379:provided [INFO] | | +- javax.ws.rs:javax.ws.rs-api:jar:2.0-m10:provided [INFO] | | \- org.apache.cxf:cxf-rt-bindings-xml:jar:2.7.0.redhat-610379:provided [INFO] | +- org.apache.cxf:cxf-rt-frontend-jaxws:jar:2.7.0.redhat-610379:provided [INFO] | | +- xml-resolver:xml-resolver:jar:1.2:provided [INFO] | | +- asm:asm:jar:3.3.1:provided [INFO] | | +- org.apache.cxf:cxf-rt-frontend-simple:jar:2.7.0.redhat-610379:provided [INFO] | | \- org.apache.cxf:cxf-rt-ws-addr:jar:2.7.0.redhat-610379:provided [INFO] | | \- org.apache.cxf:cxf-rt-ws-policy:jar:2.7.0.redhat-610379:provided [INFO] | | \- org.apache.neethi:neethi:jar:3.0.3:provided [INFO] | +- org.springframework:spring-core:jar:3.2.8.RELEASE:provided [INFO] | | \- commons-logging:commons-logging:jar:1.1.3:provided [INFO] | +- org.springframework:spring-beans:jar:3.2.8.RELEASE:provided [INFO] | +- org.springframework:spring-context:jar:3.2.8.RELEASE:provided [INFO] | | \- org.springframework:spring-expression:jar:3.2.8.RELEASE:provided [INFO] | +- org.apache.cxf:cxf-rt-features-clustering:jar:2.7.0.redhat-610379:provided [INFO] | \- org.apache.cxf:cxf-rt-bindings-soap:jar:2.7.0.redhat-610379:provided [INFO] | \- org.apache.cxf:cxf-rt-databinding-jaxb:jar:2.7.0.redhat-610379:provided [INFO] +- log4j:log4j:jar:1.2.16:provided [INFO] +- org.slf4j:slf4j-api:jar:1.6.6:provided [INFO] +- org.slf4j:slf4j-log4j12:jar:1.6.6:provided [INFO] +- org.apache.cxf:cxf-rt-transports-http-jetty:jar:2.7.0.redhat-610379:provided [INFO] | +- org.apache.cxf:cxf-api:jar:2.7.0.redhat-610379:provided [INFO] | | +- org.codehaus.woodstox:woodstox-core-asl:jar:4.2.0:provided [INFO] | | | \- org.codehaus.woodstox:stax2-api:jar:3.1.1:provided [INFO] | | +- org.apache.ws.xmlschema:xmlschema-core:jar:2.1.0:provided [INFO] | | +- org.apache.geronimo.specs:geronimo-javamail_1.4_spec:jar:1.7.1:provided [INFO] | | +- wsdl4j:wsdl4j:jar:1.6.3:provided [INFO] | | \- org.osgi:org.osgi.compendium:jar:4.2.0:provided [INFO] | +- org.apache.cxf:cxf-rt-transports-http:jar:2.7.0.redhat-610379:provided [INFO] | +- org.apache.cxf:cxf-rt-core:jar:2.7.0.redhat-610379:provided [INFO] | +- org.eclipse.jetty:jetty-server:jar:8.1.14.v20131031:provided [INFO] | | +- org.eclipse.jetty:jetty-continuation:jar:8.1.14.v20131031:provided [INFO] | | \- org.eclipse.jetty:jetty-http:jar:8.1.14.v20131031:provided 
[INFO] | | \- org.eclipse.jetty:jetty-io:jar:8.1.14.v20131031:provided [INFO] | | \- org.eclipse.jetty:jetty-util:jar:8.1.14.v20131031:provided [INFO] | +- org.eclipse.jetty:jetty-security:jar:8.1.14.v20131031:provided [INFO] | \- org.apache.geronimo.specs:geronimo-servlet_3.0_spec:jar:1.0:provided [INFO] +- org.apache.camel:camel-test-spring:jar:2.12.0.redhat-610379:provided [INFO] | +- org.apache.camel:camel-test:jar:2.12.0.redhat-610379:provided [INFO] | \- org.springframework:spring-test:jar:3.2.8.RELEASE:provided [INFO] +- junit:junit:jar:4.11:test [INFO] | \- org.hamcrest:hamcrest-core:jar:1.3:test [INFO] \- org.springframework:spring-web:jar:3.2.5.RELEASE:provided [INFO] +- aopalliance:aopalliance:jar:1.0:provided [INFO] \- org.springframework:spring-aop:jar:3.2.5.RELEASE:provided [INFO] ------------------------------------------------------------------------ [INFO] BUILD SUCCESS [INFO] ------------------------------------------------------------------------ [INFO] Total time: 1.450s [INFO] Finished at: Wed Jul 16 15:03:08 MST 2014 [INFO] Final Memory: 17M/310M [INFO] ------------------------------------------------------------------------ This gives you the complete list of dependencies for your project and each of the top-level and transitive dependencies. Now you know what jars should go into each module. The next step is to download all of these jars to make it easy to copy them to the module folder: Copy all project dependencies to target/dependency ceposta@postamachat$ mvn dependency:copy-dependenciesceposta@postamachat$ ls -l target/dependencytotal 32072 -rw-r--r-- 1 ceposta staff 4467 Jul 16 14:50 aopalliance-1.0.jar -rw-r--r-- 1 ceposta staff 43581 Jul 16 14:50 asm-3.3.1.jar -rw-r--r-- 1 ceposta staff 2592519 Jul 16 14:50 camel-core-2.12.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 207482 Jul 16 14:43 camel-cxf-2.12.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 64726 Jul 16 14:50 camel-cxf-transport-2.12.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 244731 Jul 16 14:50 camel-spring-2.12.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 43947 Jul 16 14:50 camel-test-2.12.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 71455 Jul 16 14:50 camel-test-spring-2.12.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 62050 Jul 16 14:50 commons-logging-1.1.3.jar -rw-r--r-- 1 ceposta staff 1115924 Jul 16 14:50 cxf-api-2.7.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 204287 Jul 16 14:50 cxf-rt-bindings-soap-2.7.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 38847 Jul 16 14:50 cxf-rt-bindings-xml-2.7.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 408403 Jul 16 14:50 cxf-rt-core-2.7.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 129306 Jul 16 14:50 cxf-rt-databinding-jaxb-2.7.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 34276 Jul 16 14:50 cxf-rt-features-clustering-2.7.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 654099 Jul 16 14:50 cxf-rt-frontend-jaxrs-2.7.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 388669 Jul 16 14:50 cxf-rt-frontend-jaxws-2.7.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 67426 Jul 16 14:50 cxf-rt-frontend-simple-2.7.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 260274 Jul 16 14:50 cxf-rt-transports-http-2.7.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 97071 Jul 16 14:50 cxf-rt-transports-http-jetty-2.7.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 80014 Jul 16 14:50 cxf-rt-ws-addr-2.7.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 207480 Jul 16 14:50 cxf-rt-ws-policy-2.7.0.redhat-610379.jar -rw-r--r-- 1 ceposta staff 223298 Jul 16 14:50 
geronimo-javamail_1.4_spec-1.7.1.jar -rw-r--r-- 1 ceposta staff 96323 Jul 16 14:50 geronimo-servlet_3.0_spec-1.0.jar -rw-r--r-- 1 ceposta staff 45024 Jul 16 14:50 hamcrest-core-1.3.jar -rw-r--r-- 1 ceposta staff 110928 Jul 16 14:50 javax.ws.rs-api-2.0-m10.jar -rw-r--r-- 1 ceposta staff 1112659 Jul 16 14:50 jaxb-impl-2.2.6.jar -rw-r--r-- 1 ceposta staff 21162 Jul 16 14:50 jetty-continuation-8.1.14.v20131031.jar -rw-r--r-- 1 ceposta staff 96122 Jul 16 14:50 jetty-http-8.1.14.v20131031.jar -rw-r--r-- 1 ceposta staff 104219 Jul 16 14:50 jetty-io-8.1.14.v20131031.jar -rw-r--r-- 1 ceposta staff 89923 Jul 16 14:50 jetty-security-8.1.14.v20131031.jar -rw-r--r-- 1 ceposta staff 357704 Jul 16 14:50 jetty-server-8.1.14.v20131031.jar -rw-r--r-- 1 ceposta staff 287680 Jul 16 14:50 jetty-util-8.1.14.v20131031.jar -rw-r--r-- 1 ceposta staff 245039 Jul 16 14:50 junit-4.11.jar -rw-r--r-- 1 ceposta staff 481535 Jul 16 14:50 log4j-1.2.16.jar -rw-r--r-- 1 ceposta staff 71487 Jul 16 14:50 neethi-3.0.3.jar -rw-r--r-- 1 ceposta staff 614152 Jul 16 14:50 org.osgi.compendium-4.2.0.jar -rw-r--r-- 1 ceposta staff 26176 Jul 16 14:50 slf4j-api-1.6.6.jar -rw-r--r-- 1 ceposta staff 9711 Jul 16 14:50 slf4j-log4j12-1.6.6.jar -rw-r--r-- 1 ceposta staff 335679 Jul 16 14:50 spring-aop-3.2.5.RELEASE.jar -rw-r--r-- 1 ceposta staff 612569 Jul 16 14:50 spring-beans-3.2.8.RELEASE.jar -rw-r--r-- 1 ceposta staff 866273 Jul 16 14:50 spring-context-3.2.8.RELEASE.jar -rw-r--r-- 1 ceposta staff 873608 Jul 16 14:50 spring-core-3.2.8.RELEASE.jar -rw-r--r-- 1 ceposta staff 196367 Jul 16 14:50 spring-expression-3.2.8.RELEASE.jar -rw-r--r-- 1 ceposta staff 457987 Jul 16 14:50 spring-test-3.2.8.RELEASE.jar -rw-r--r-- 1 ceposta staff 242436 Jul 16 14:50 spring-tx-3.2.8.RELEASE.jar -rw-r--r-- 1 ceposta staff 627339 Jul 16 14:50 spring-web-3.2.5.RELEASE.jar -rw-r--r-- 1 ceposta staff 182112 Jul 16 14:50 stax2-api-3.1.1.jar -rw-r--r-- 1 ceposta staff 482245 Jul 16 14:50 woodstox-core-asl-4.2.0.jar -rw-r--r-- 1 ceposta staff 186758 Jul 16 14:50 wsdl4j-1.6.3.jar -rw-r--r-- 1 ceposta staff 84091 Jul 16 14:50 xml-resolver-1.2.jar -rw-r--r-- 1 ceposta staff 165787 Jul 16 14:50 xmlschema-core-2.1.0.jar Now we find what jars go to what dependency and create modules. For example, looking above we see camel-core has a dependency on com.sun.xml.bind:jaxb-impl:jar:2.2.6 Luckily enough, that’s the only dependency and it’s a system dependency that JBoss EAP already provides. So all we need to copy to our JBoss Module directory is the org.apache.camel:camel-core:jar:2.12.0.redhat-610379 dependency. But where do we get that!? Well, since we used dependency:copy-dependencies, it should just be in your target/dependency folder. But the official answer is the Camel jars Red Hat curates are shipped as part of JBoss Fuse. So if you download the distribution for JBoss Fuse, and unpack it, you should see an /extras folder in that distribution. Inside that distribution is an archive file named apache-camel-2.12.0.redhat-610379.zip. If you unpack this archive and check the /lib folder, you will have all of the Camel components and jars that Red Hat supports. 
Now that we know camel-core is the only jar we’ll need for the camel-core module, let’s copy that over to our module folder on EAP: Copy all of the dependencies and transitive dependencies to module folder ceposta@postamachat(contract-first-camel-eap) $ cp target/dependency/camel-core-2.12.0.redhat-610379.jar $EAP_HOME/modules/org/apache/camel/core/main/ Create module.xml Now we’ll need to add a simple xml descriptor to let EAP know this is a valid module: <?xml version="1.0" encoding="UTF-8"?> <module xmlns="urn:jboss:module:1.1" name="org.apache.camel.core"> <resources> <resource-root path="camel-core-2.12.0.redhat-610379.jar"/> </resources> </module> And now you have a camel-core EAP module! If you have dependencies on other modules, you can add them like this for example, but not necessary for camel-core module (it’s just a sample of what it would look like for other modules that will need this): <dependencies> <module name="org.apache.commons.lang"/> <module name="org.apache.commons.logging" /> <module name="org.apache.commons.collections" /> <module name="org.apache.commons.io" /> <module name="org.apache.commons.configuration" /> </dependencies> Enable the camel-core module: The last thing to do is to enable the module in the global classpath. To do this, find the standalone configuration file and add it to the <global-modules> section of the “EE subsystem”: .... bunch of other stuff here....<subsystem xmlns="urn:jboss:domain:ee:1.1"> <global-modules> <module name="org.apache.camel.core" slot="main" /> </global-modules> </subsystem>.... bunch of other stuff here.... Now do this for the camel-cxf component (hint, these are the jars).. OR if already have some of your custom modules and you want to further split this out into reusable modules, split them by technology (spring, cxf, cxf-transport, etc): [INFO] +- org.apache.camel:camel-cxf:jar:2.12.0.redhat-610379:provided [INFO] | +- org.apache.camel:camel-spring:jar:2.12.0.redhat-610379:provided [INFO] | | \- org.springframework:spring-tx:jar:3.2.8.RELEASE:provided [INFO] | +- org.apache.camel:camel-cxf-transport:jar:2.12.0.redhat-610379:provided [INFO] | +- org.apache.cxf:cxf-rt-frontend-jaxrs:jar:2.7.0.redhat-610379:provided [INFO] | | +- javax.ws.rs:javax.ws.rs-api:jar:2.0-m10:provided [INFO] | | \- org.apache.cxf:cxf-rt-bindings-xml:jar:2.7.0.redhat-610379:provided [INFO] | +- org.apache.cxf:cxf-rt-frontend-jaxws:jar:2.7.0.redhat-610379:provided [INFO] | | +- xml-resolver:xml-resolver:jar:1.2:provided [INFO] | | +- asm:asm:jar:3.3.1:provided [INFO] | | +- org.apache.cxf:cxf-rt-frontend-simple:jar:2.7.0.redhat-610379:provided [INFO] | | \- org.apache.cxf:cxf-rt-ws-addr:jar:2.7.0.redhat-610379:provided [INFO] | | \- org.apache.cxf:cxf-rt-ws-policy:jar:2.7.0.redhat-610379:provided [INFO] | | \- org.apache.neethi:neethi:jar:3.0.3:provided [INFO] | +- org.springframework:spring-core:jar:3.2.8.RELEASE:provided [INFO] | | \- commons-logging:commons-logging:jar:1.1.3:provided [INFO] | +- org.springframework:spring-beans:jar:3.2.8.RELEASE:provided [INFO] | +- org.springframework:spring-context:jar:3.2.8.RELEASE:provided [INFO] | | \- org.springframework:spring-expression:jar:3.2.8.RELEASE:provided [INFO] | +- org.apache.cxf:cxf-rt-features-clustering:jar:2.7.0.redhat-610379:provided [INFO] | \- org.apache.cxf:cxf-rt-bindings-soap:jar:2.7.0.redhat-610379:provided [INFO] | \- org.apache.cxf:cxf-rt-databinding-jaxb:jar:2.7.0.redhat-610379:provided Note, you may want to split out the different third party dependencies here into their own 
modules (for example, Spring Framework, Camel Spring, etc.).

Deploy our project to EAP

Now from the command line, go to the root of the source code for the sample project and do a build and deploy:

ceposta@postamachat$ mvn clean install
ceposta@postamachat$ mvn jboss-as:deploy-only

Where to go next?

If you have issues with the above I'd be happy to assist, or contact Red Hat Support for a quicker response!

Reference: Camel on JBoss EAP with Custom Modules from our JCG partner Christian Posta at the Christian Posta – Software Blog.

Agile Outsourcing is Like Marriage – A 5-Step Agile Outsourcing How-To

There are a lot of horror stories in offshore outsourcing of software development, but every once in a while you hear of excellent partnerships. Sounds a lot like marriages, doesn't it? There are actually parallels to marriage: you have to spend a lot of time getting to know a lot of potential partners until you find one. Then you're still not sure, so you need to go through an extended period of "getting to know each other". Finally, you make a commitment, but you need to invest a lot of time and energy to sustain and grow the relationship. And just like marriage, if it goes bad, it's just a big drain of energy, and you can't end it without feeling you've wasted so much time.

Step 1: The Checklist

All single people have an initial list of "must-haves" for people they're willing to date. You should have the same in the search for an outsourcing partner. Here's an initial checklist of things that the outsourcing firm should have; otherwise you should just cross them off your list.

Direct Video Contact to Individual Team Members

There's a dirty trick in outsourcing where you think you're emailing or chatting with a particular person, but the person behind that email address or chat ID may have already been replaced several times. Outsourcing companies that have problems with attrition try to hide this from their clients by pretending it's still the same person that the client is communicating with. In the age of free video conferencing, this danger is easily avoided. Of course, cross off any firm that insists on a strict "single point of contact" for all communication.

Training and Mentoring

The firm should have a training and mentoring program that reassures you that its people are consistently grounded in the practices that you are looking for. Don't hire a firm that just hires people and then lets them loose on your project without any training or mentoring, since such inconsistency in their practices can leave landmines in your code.

Sourcing of Talent

The company has to have some sort of competitive advantage in sourcing talent, otherwise it provides no advantage over others. Does it have any special relationships with universities or developer communities? Are the company leaders prominent personalities in the tech community?

Methodology

Almost all outsourcing companies say they're Agile now. Find out if they really know what they're talking about by speaking with someone. Don't talk to salespeople, talk to the operations people.

Step 2: Screening ("First Dates")

After short-listing some potential partners, it's important that you actually vet the individual developers who would potentially be assigned to your team. Before scheduling an interview, you can have them take a programming exam – after all, if they can't code, there's no point in wasting your time on an interview. We use the Codility online service for our programming exams. If the developers pass the exam, set up time to interview them, again ideally over video conference. It's fine to interview them all at the same time – it saves time and you also get an idea of the group dynamics. Group interviews also help give confidence to those developers who are actually very good, but tend to get nervous during interviews. Ask the questions you would normally ask if you were hiring developers in-house.

Step 3: Starting the Relationship ("Going Steady")

Signing the first contract isn't marriage yet, it's just "going steady". It's important to start small, so that both sides have time to learn about each other and adjust.
There are no set formulas for the group dynamics of geographically separated teams, and each relationship will need to come up with its own ways of doing things as it goes (I'm starting to feel like a relationship counselor here).

Small Team

Start with a team of just two or three people, for a project that will last around three months. Make sure at least one of the members of the team has a minimum of two years' experience – you don't want to be hand-holding an entire team of fresh grads, no matter how talented or trained.

High Interaction

Have as much interaction as possible during the start, ideally to the point of engaging in daily video scrums or even remote pair programming! There are already numerous articles and tools on remote pair programming, so I won't elaborate on it here, but please do search for them and check them out. These daily interactions can seem stressful, especially if the remote team's timezone is not a match for yours, but this is just for the first few iterations. You can gradually and safely reduce the intensity of the interactions towards the end of the "going steady" phase. Please remember the concept of "shared inconvenience" when dealing with timezones – if you can't find a common convenient time, take turns sharing times that are inconvenient.

Engage the Management

Be sure to be engaged with the management of the company at this time. Meet with the company's management weekly to give feedback, so that they can perform the necessary interventions on their team, or give deserving performers a pat on the back. This is also an opportunity to request that specific resources you are not happy with be replaced.

Step 4: Committing ("Marriage")

Now that the "going steady" phase is over, the "marriage" phase can begin. This is typically where you negotiate longer contracts with a larger team size, in exchange for some flexibility with rates. As a side note, you can probably get more concessions from your partner if you agree to hire more junior developers onto your team, since these are usually easier for your partner to source. You can scale down the intensity of the interactions during this phase, but as in marriage, you still need to put in effort to keep the relationship rich and growing.

Product Demos and Retrospectives

If daily scrums or remote pair programming are too intense, then at a minimum all core stakeholders from your side need to participate in product demos and retrospectives at the end of each sprint. It's also advisable for the team to have retrospectives on their own so that they can speak freely among each other, but they need to share their feedback with the stakeholders on your end.

Visits

If schedule and budget permit, a great way to kick off the "marriage" phase is a visit to your partner. There's no better way to build mutual trust and confidence than being face-to-face. This is a good time to conduct product training or domain training. Some team-building activities between yourselves and your outsourced team would also be a good investment – anything from playing laser tag to going to a beach resort together. You'd be surprised at how cheap recreational activities are in most outsourced destinations. This is also a good time for people on your side to unwind. Also take the time to get to know the company's management. Schedule at least one lunch or dinner with them. Get to know their philosophy and vision for their company so you'll have a better idea of how to collaborate with them as your relationship progresses.
You might even discover business synergies that you hadn't considered.

Allow Paid Time for Training

One of the top things that motivates talented people is the opportunity to learn. You can help your outsourcing partner retain the people on your team, as well as make your team sharper, by allowing them to be trained on billable time. You can negotiate in your contract how many training hours per year, and what types of training, will be billable, so it doesn't go too far.

Continue to Engage Management

The management of the firm are your partners in managing your team. Engage them early and often, not just when there's a problem. Share both positive and negative feedback. If there are problem team members, let management know early so they can plan interventions – don't wait until you need to have someone swapped out. If you have high performers, let the management know so they can be properly acknowledged. Give the management advance warning of team composition changes, either up or down. If you need to scale up, you need to give them time to recruit, train, or at least earmark certain people for transfer to your team. If you need to scale down or change people, don't hand them the problem of what to do with people on the bench – give them an opportunity to plan where to reassign these people before you release them from your team.

Step 5: Allowing for Growth

The two people who enter into a marriage don't just stay the same people. They continue to grow as individuals, such as growth in their careers and career ambitions. This often puts a strain on a marriage, but as long as the core values of the couple remain intact, each person in the couple needs to adjust and support the growth of the other.

Promotions

I often encounter resistance from clients when we tell them that one or more of their well-performing team members will likely be promoted. They resent that it comes with an increase in billing rates. We tell them in advance, of course, so they can either make changes in budgets or decide to interview more junior people to take the soon-to-be-promoted person's place, but it's still never taken well. Keeping a well-performing person in a team without promoting him is a situation that can't last long. Eventually, the person will leave to find employment more deserving of his skill and professionalism. When that happens, both you and the outsourcing company lose. Therefore, don't make it so hard for the outsourcing firm to promote its people. Even if your budget doesn't allow you to keep the person on your team, the person is still within the company, and can be called on by his colleagues in your team for advice.

Rotation

Even more difficult for clients to accept is the need for outsourcing and consulting firms to rotate their people between projects. As I mentioned earlier, talented people want learning and growth. If they can't get it in the company they're working for, they'll get it somewhere else, and then both client and provider lose. The frequency of rotation is inversely related to the seniority of the resource. Clients should expect movement of junior developers around once a year, mid-level developers around every two years, and senior developers around every three years. I hope you found this primer on Agile outsourcing helpful. If you have any questions, just drop me a line, and I'll be happy to answer.

Reference: Agile Outsourcing is Like Marriage – A 5-Step Agile Outsourcing How-To from our JCG partner Calen Legaspi at the Calen Legaspi blog.

Grouping, sampling and batching – custom collectors in Java 8

Continuing the first article, this time we will write some more useful custom collectors: for grouping by a given criterion, sampling input, batching and sliding over it with a fixed-size window.

Grouping (counting occurrences, histogram)

Imagine you have a collection of some items and you want to calculate how many times each item (with respect to equals()) appears in this collection. This can be achieved using CollectionUtils.getCardinalityMap() from Apache Commons Collections. This method takes an Iterable<T> and returns Map<T, Integer>, counting how many times each item appeared in the collection. However, sometimes instead of using equals() we would like to group by an arbitrary attribute of the input T. For example, say we have a list of Person objects and we would like to compute the number of males vs. females (i.e. Map<Sex, Integer>) or maybe an age distribution. There is a built-in collector Collectors.groupingBy(Function<T, K> classifier) – however, it returns a map from key to all items mapped to that key. See:

import static java.util.stream.Collectors.groupingBy;

//...

final List<Person> people = //...

final Map<Sex, List<Person>> bySex = people
        .stream()
        .collect(groupingBy(Person::getSex));

It's valuable, but in our case it unnecessarily builds two List<Person> instances. I only want to know the number of people. There is no such collector built in, but we can compose it in a fairly simple manner:

import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;

//...

final Map<Sex, Long> bySex = people
        .stream()
        .collect(
                groupingBy(Person::getSex, HashMap::new, counting()));

This overloaded version of groupingBy() takes three parameters. The first one is the key (classifier) function, as previously. The second argument creates a new map; we'll see shortly why it's useful. counting() is a nested collector that takes all people with the same sex and combines them together – in our case simply counting them as they arrive. Being able to choose the map implementation is useful, e.g. when building an age histogram. We would like to know how many people we have at a given age, but the age values should be sorted:

final TreeMap<Integer, Long> byAge = people
        .stream()
        .collect(
                groupingBy(Person::getAge, TreeMap::new, counting()));

byAge
        .forEach((age, count) -> System.out.println(age + ":\t" + count));

We ended up with a TreeMap from age (sorted) to the count of people having that age.

Sampling, batching and sliding window

The IterableLike.sliding() method in Scala allows viewing a collection through a sliding fixed-size window. This window starts at the beginning and in each iteration moves by a given number of items. Such functionality, missing in Java 8, allows several useful operators like computing a moving average, splitting a big collection into batches (compare with Lists.partition() in Guava) or sampling every n-th element. We will implement a collector for Java 8 providing similar behaviour.
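As a quick preview of the API we are aiming for, here is a usage sketch of my own (it assumes the CustomCollectors factory methods and the SlidingCollector introduced below); the expected results are taken from the test cases that follow:

import java.util.List;
import java.util.stream.IntStream;

import com.nurkiewicz.CustomCollectors;

public class SlidingPreview {
    public static void main(String[] args) {
        // Overlapping windows: [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7]]
        List<List<Integer>> windows = IntStream.rangeClosed(1, 7).boxed()
                .collect(CustomCollectors.sliding(3));

        // Non-overlapping batches: [[1, 2, 3], [4, 5, 6], [7]]
        List<List<Integer>> batches = IntStream.rangeClosed(1, 7).boxed()
                .collect(CustomCollectors.sliding(3, 3));

        // Sampling every 5th element: [[1], [6]]
        List<List<Integer>> samples = IntStream.rangeClosed(1, 10).boxed()
                .collect(CustomCollectors.sliding(1, 5));

        System.out.println(windows);
        System.out.println(batches);
        System.out.println(samples);
    }
}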
Let's start from unit tests, which should describe briefly what we want to achieve:

import static com.nurkiewicz.CustomCollectors.sliding

import spock.lang.Specification
import spock.lang.Unroll

@Unroll
class CustomCollectorsSpec extends Specification {

    def "Sliding window of #input with size #size and step of 1 is #output"() {
        expect:
        input.stream().collect(sliding(size)) == output

        where:
        input  | size | output
        []     | 5    | []
        [1]    | 1    | [[1]]
        [1, 2] | 1    | [[1], [2]]
        [1, 2] | 2    | [[1, 2]]
        [1, 2] | 3    | [[1, 2]]
        1..3   | 3    | [[1, 2, 3]]
        1..4   | 2    | [[1, 2], [2, 3], [3, 4]]
        1..4   | 3    | [[1, 2, 3], [2, 3, 4]]
        1..7   | 3    | [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7]]
        1..7   | 6    | [1..6, 2..7]
    }

    def "Sliding window of #input with size #size and no overlapping is #output"() {
        expect:
        input.stream().collect(sliding(size, size)) == output

        where:
        input | size | output
        []    | 5    | []
        1..3  | 2    | [[1, 2], [3]]
        1..4  | 4    | [1..4]
        1..4  | 5    | [1..4]
        1..7  | 3    | [1..3, 4..6, [7]]
        1..6  | 2    | [[1, 2], [3, 4], [5, 6]]
    }

    def "Sliding window of #input with size #size and some overlapping is #output"() {
        expect:
        input.stream().collect(sliding(size, 2)) == output

        where:
        input | size | output
        []    | 5    | []
        1..4  | 5    | [[1, 2, 3, 4]]
        1..7  | 3    | [1..3, 3..5, 5..7]
        1..6  | 4    | [1..4, 3..6]
        1..9  | 4    | [1..4, 3..6, 5..8, 7..9]
        1..10 | 4    | [1..4, 3..6, 5..8, 7..10]
        1..11 | 4    | [1..4, 3..6, 5..8, 7..10, 9..11]
    }

    def "Sliding window of #input with size #size and gap of #gap is #output"() {
        expect:
        input.stream().collect(sliding(size, size + gap)) == output

        where:
        input | size | gap | output
        []    | 5    | 1   | []
        1..9  | 4    | 2   | [1..4, 7..9]
        1..10 | 4    | 2   | [1..4, 7..10]
        1..11 | 4    | 2   | [1..4, 7..10]
        1..12 | 4    | 2   | [1..4, 7..10]
        1..13 | 4    | 2   | [1..4, 7..10, [13]]
        1..13 | 5    | 1   | [1..5, 7..11, [13]]
        1..12 | 5    | 3   | [1..5, 9..12]
        1..13 | 5    | 3   | [1..5, 9..13]
    }

    def "Sampling #input taking every #nth th element is #output"() {
        expect:
        input.stream().collect(sliding(1, nth)) == output

        where:
        input  | nth | output
        []     | 1   | []
        []     | 5   | []
        1..3   | 5   | [[1]]
        1..6   | 2   | [[1], [3], [5]]
        1..10  | 5   | [[1], [6]]
        1..100 | 30  | [[1], [31], [61], [91]]
    }
}

Using data-driven tests in Spock I managed to write almost 40 test cases in no time, succinctly describing all requirements. I hope these are clear to you, even if you haven't seen this syntax before. I already assumed the existence of handy factory methods:

public class CustomCollectors {

    public static <T> Collector<T, ?, List<List<T>>> sliding(int size) {
        return new SlidingCollector<>(size, 1);
    }

    public static <T> Collector<T, ?, List<List<T>>> sliding(int size, int step) {
        return new SlidingCollector<>(size, step);
    }

}

The fact that collectors receive items one after another makes our job harder. Of course, first collecting the whole list and sliding over it would have been easier, but somewhat wasteful. Let's build the result iteratively.
I am not even pretending this task can be parallelized in general, so I'll leave combiner() unimplemented:

public class SlidingCollector<T> implements Collector<T, List<List<T>>, List<List<T>>> {

    private final int size;
    private final int step;
    private final int window;
    private final Queue<T> buffer = new ArrayDeque<>();
    private int totalIn = 0;

    public SlidingCollector(int size, int step) {
        this.size = size;
        this.step = step;
        this.window = max(size, step);
    }

    @Override
    public Supplier<List<List<T>>> supplier() {
        return ArrayList::new;
    }

    @Override
    public BiConsumer<List<List<T>>, T> accumulator() {
        return (lists, t) -> {
            buffer.offer(t);
            ++totalIn;
            if (buffer.size() == window) {
                dumpCurrent(lists);
                shiftBy(step);
            }
        };
    }

    @Override
    public Function<List<List<T>>, List<List<T>>> finisher() {
        return lists -> {
            if (!buffer.isEmpty()) {
                final int totalOut = estimateTotalOut();
                if (totalOut > lists.size()) {
                    dumpCurrent(lists);
                }
            }
            return lists;
        };
    }

    private int estimateTotalOut() {
        return max(0, (totalIn + step - size - 1) / step) + 1;
    }

    private void dumpCurrent(List<List<T>> lists) {
        final List<T> batch = buffer.stream().limit(size).collect(toList());
        lists.add(batch);
    }

    private void shiftBy(int by) {
        for (int i = 0; i < by; i++) {
            buffer.remove();
        }
    }

    @Override
    public BinaryOperator<List<List<T>>> combiner() {
        return (l1, l2) -> {
            throw new UnsupportedOperationException("Combining not possible");
        };
    }

    @Override
    public Set<Characteristics> characteristics() {
        return EnumSet.noneOf(Characteristics.class);
    }

}

I spent quite some time writing this implementation, especially getting finisher() right, so don't be frightened. The crucial part is a buffer that collects items until it can form one sliding window. Then the "oldest" items are discarded and the window slides forward by step. I am not particularly happy with this implementation, but the tests are passing. sliding(N) (a synonym for sliding(N, 1)) allows calculating a moving average of N items. sliding(N, N) splits the input into batches of size N. sliding(1, N) takes every N-th element (samples). I hope you'll find this collector useful, enjoy!

Reference: Grouping, sampling and batching – custom collectors in Java 8 from our JCG partner Tomasz Nurkiewicz at the Java and neighbourhood blog.

Use Cases for Elasticsearch: Flexible Query Cache

In the previous two posts on use cases for Elasticsearch we have seen that Elasticsearch can be used to store even large amounts of documents and that we can access those using the full-text features of Lucene via the Query DSL. In this shorter post we will put both use cases together to see how read-heavy applications can benefit from Elasticsearch.

Search Engines in Classic Applications

In classic applications, search engines were a specialized thing that was only responsible for helping with one feature, the search page. Most of the application's functionality is built by querying the database; the search engine only plays a minor part and is responsible for rendering the search page. Databases are well suited for lots of types of applications, but it turns out that often it is not that easy to scale them. Websites with high traffic peaks often have some problems scaling database access. Indexing and scaling machines up can help, but that often requires specialized knowledge and can become rather expensive.

As with other search features, ecommerce providers especially started doing something different. They started to employ the search engine not only for full-text search but also for other parts of the page that require no direct keyword input by the user. Let's have a look at a category page at Amazon, the kind that can be accessed using the navigation. We can already see that the interface looks very similar to a search result page: there is a result list, and we can sort and filter the results using the facets. Though of course I have no insight into how exactly Amazon is doing this, a common approach is to use the search engine for pages like this as well.

Scaling Read Requests

A common problem for ecommerce websites is that there are huge traffic spikes. Depending on your kind of business you might have a lot more traffic just before Christmas. Or you might have to fight spikes when there are TV commercials for your service or any special discounts. Flash sale sites are at the extreme end of those kinds of sites, with very high spikes at a certain point in time when a sale starts.

It turns out that search engines are good at being queried a lot. The immutable data sets, the segments, are very cache friendly. When it comes to filters, those can be cached by the engine as well most of the time. On a warm index most of the data will be in RAM, so it is lightning fast. Back to our example of talks that can be accessed online. Imagine a navigation where the user can choose the city she wants to see events for. You can then issue a query like this to Elasticsearch:

curl -XPOST "http://localhost:9200/conferences/_search" -d'
{
    "filter": {
        "term": {
            "conference.city": "stuttgart"
        }
    }
}'

There is no query part, but only a filter that limits the results to the talks that are in Stuttgart. The whole filter will be cached, so if a lot of users are accessing the data there can be a huge performance gain for you and especially your users. Additionally, as we have seen, new nodes can be added to Elasticsearch without a lot of hassle. If we need more query capacity we can easily add more machines and more replicas, even temporarily. When we can identify some pages that can be moved to the search engine, the database doesn't need to handle that much traffic anymore. Especially for getting the huge spikes under control, it is best to try not to access the database at all for read-heavy pages and to deliver all of the content from the search engine.
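For illustration only (a sketch of my own, not from the original post), this is roughly how an application page could fetch that cached filter result over HTTP using nothing but the JDK; the index name and field come from the curl example above, everything else is an assumption:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ConferencePageSearch {

    public static void main(String[] args) throws Exception {
        // Same request as the curl example: a pure filter, no scoring query, so it can be cached.
        String body = "{ \"filter\": { \"term\": { \"conference.city\": \"stuttgart\" } } }";

        URL url = new URL("http://localhost:9200/conferences/_search");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        connection.setDoOutput(true);
        connection.setRequestProperty("Content-Type", "application/json");
        try (OutputStream out = connection.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }

        // Dump the raw JSON response; a real page would parse the hits and render them.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}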
Conclusion

Though in this post we have looked at ecommerce, the same strategy can be applied to different domains. Content management systems can push the editorial content to search engines and let those be responsible for scaling. Classifieds, social media aggregation, and more: all of those can benefit from the cache-friendly nature of a search engine. Maybe you will even notice that parts of your data don't need to be in the database at all and you can migrate them to Elasticsearch as a primary data store. A first step to polyglot persistence.

Reference: Use Cases for Elasticsearch: Flexible Query Cache from our JCG partner Florian Hopf at the Dev Time blog.

Test Attribute #5 – Differentiation

This is the 5th post about Test Attributes, a series that started off with the celebrity-level "How to test your tests" post. Differentiation is not an attribute of a single test. Differentiation does not ride alone, because it requires multiple tests. Tests allow us to (a) know something is wrong and (b) help us locate that something. We want to plant lots of clues for our future selves (or another code victim of ours) who will need to analyze and solve the problem.

For us to do this, and I really hate doing it, I'll raise the ghost of our fallen enemy: Waterfall. Years ago, when I visited water-world, I used to write SDDs. These were the dreaded Software Detailed Design documents, and we wrote them for both features and components. Of course, nobody liked them, their templates, the weird words in the beginning, and they even smelled funny. But… they had one thing going for them: in order to write one, you had to think about what you were going to do. (Sounds like the biggest benefit of TDD, right?) Once we reviewed the documents, they were a good starting point to ask "what if" questions. What happens in the case of a disconnect? What if the component is not initialized in time? As part of our learning, at one point we even added a test-case description to the doc, so the writer needed to also think up front about all the cases he needed to check, and we could review those too. The list also served as a checklist for the implementer to test.

Back to the future

That's right, waterfall was evil, but sometimes it had some good parts in its heart. We usually give BDUF (big design up front) a bad rep, but really, it's the effort in documentation that bothers us, not the thinking up front. Scientists have proven that thinking about something before doing it correlates with its success. Imagine that. TDD tells us to focus on the current test. The hardcore guys take that to the extreme, but in fact, it's really hard to do. While we're doing one scenario, we're still thinking about the other "what ifs". If we're not doing TDD, and writing code first, as we code we're thinking about those "what ifs". And we should embrace the way we think, and make the most of it.

Baking Differentiation In

We're already doing the thinking about the scenarios, and what makes them different from each other. All we have to do now is make sure we leave the breadcrumb trail of our thoughts.

- Group the test cases. Put all related cases in one place, separate from others. Put them in a separate class/file and give it a distinct group name. Yes, even if there are other tests for that method – remember, convention should help us be effective, not restrict us because it's there.
- Review the test names as a group. First, look for missing cases, and if there are any, write tests for them. Review the names in the group individually and see if they complement each other. If the names overlap, move the distinction to the left, so you can differentiate between them if the test runner does not show the entire name. (A small sketch of grouped, differentiated names appears at the end of this post.)
- Review the test body. Sometimes we "cover" the code as part of the setup for the test, and what differentiates tests are the actual settings that differ between them. Make the tests reflect that: separate the common setup lines from the differentiating setting and action. You can also try (but may not always succeed) to extract a common setup, and have the remaining, distinct lines remain in the test.
- Review the covered code. You may even leave hints in the code itself, by matching names of variables and functions in the tested code to the naming used in the test. However, much like stale comments, this can go bad if things don't get updated when refactoring. Use at your own risk.

In order to analyze a problem when tests fail, we need to get into detective mode. The more evidence we have, the better. With enough differentiation, we can get a mental model of what works and what doesn't, and better – where the problem might lurk, so we can go on and fix it.

Reference: Test Attribute #5 – Differentiation from our JCG partner Gil Zilberfeld at the Geek Out of Water blog.
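As promised above, here is a small illustrative JUnit sketch (my own example, not from the original post) of one way to group related cases in a single, distinctly named class and push the differing part of each name to the left:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

// All parsing cases live together under one group name, and each test name
// leads with the input variation that makes it different from its neighbours.
public class ParseIntOfStringTest {

    @Test
    public void plainDigits_returnsValue() {
        assertEquals(42, Integer.parseInt("42"));
    }

    @Test
    public void leadingMinus_returnsNegativeValue() {
        assertEquals(-7, Integer.parseInt("-7"));
    }

    @Test
    public void leadingPlus_returnsPositiveValue() {
        assertEquals(7, Integer.parseInt("+7"));
    }

    @Test(expected = NumberFormatException.class)
    public void embeddedSpace_throws() {
        Integer.parseInt("4 2");
    }
}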

BDD (Behavior-Driven Development): Missing Piece in the Continuous Integration Puzzle

Behavior-Driven Development (BDD) is a process, or it can be a tool. In many cases, BDD is both. However, it should not be a goal in itself. The goal of software development is to deliver quality as fast and as cheaply as possible. The only real measure of quality is whether the software fulfills user needs in a reliable manner. The best path we can take to accomplish that goal is through continuous integration, deployment and delivery. For the sake of this article I will ignore the differences between those three and refer to all of them as continuous integration or CI.

CI is often misunderstood, and BDD can provide a missing piece of the puzzle. CI is usually implemented as a series of steps that are initiated with a commit to the repository, followed by the software being built, statically checked, unit tested, integration tested and, finally, delivered. With those steps we are confirming that the software always does what the team expects it to do. The only way to accomplish this goal is to have the team work as a single unified body. Even though there is always some type of specialization and different profiles might have some level of autonomy (front-end and back-end developers, testers…), they must all work together from start to end. An often overlooked element in this picture is the client and the users. Having the software always working as expected cannot be accomplished unless those who set the expectations are involved throughout the whole process.

Who sets the expectations? Users do. They are the only ones who can say whether the application we're building is a success or not. They define what should be built, because it is their needs that we are trying to fulfill. This is where BDD comes in and creates a wrapper around our CI process. With CI and BDD we can have software that is always integrated in a way that fulfills the expectations of our users, instead of doing what we think it should do. That sentence presents a small but very important difference. Whether software works as we expect it to work is not a goal we should aim for. It should do what users expect it to do. We do not set the expectations. Users do.

BDD replaces traditional requirements with executable specifications written by, or in cooperation with, customers and users, and provides continuous feedback when executed as part of our CI process. While narrative and scenarios are a substitute for traditional requirements or user stories, automation of those scenarios is required for BDD to be fully integrated into the CI process. Narrative and scenarios form a process that, through different tools, can provide the automation that we require. It is both a process and a tool.

Sprint 0 should be used to set up our tools and high-level design (IDE, servers, architecture design…), the CI server and the BDD framework. From there on we can start writing our BDD stories. Each of them, once written, should be taken by developers and implemented. If the story is pushed together with the implementation code, the feedback obtained from CI is almost immediate. That feedback is the piece often missing for a successful implementation of the CI process. Having Jenkins (or any other similar framework) is not sufficient by itself. If we're seeking to build reliable software continuously, the final verification in the process must be based on some kind of integration and functional tests that confirm that user expectations are met. Otherwise, we'll never have the confidence required for the decision to implement continuous deployment or delivery.
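To make "executable specification" more tangible, here is a minimal sketch of what a scenario and its glue code could look like in a Java project, assuming Cucumber-JVM as the BDD tool; the scenario wording, the step names and the in-memory registration logic are my own illustrative inventions, not part of the original article:

// The scenario itself would live in a plain-text .feature file, readable by customers and users:
//
//   Scenario: Registered user logs in
//     Given a registered user named "alice"
//     When "alice" logs in with the correct password
//     Then the login is accepted
//
// The step definitions below glue that scenario to the application and run on every CI build.
import static org.junit.Assert.assertTrue;

import java.util.HashMap;
import java.util.Map;

import cucumber.api.java.en.Given;
import cucumber.api.java.en.Then;
import cucumber.api.java.en.When;

public class LoginSteps {

    // Stand-in for the real application service; a real project would call production code here.
    private final Map<String, String> registeredUsers = new HashMap<>();
    private boolean loginAccepted;

    @Given("^a registered user named \"([^\"]*)\"$")
    public void a_registered_user_named(String name) {
        registeredUsers.put(name, "secret");
    }

    @When("^\"([^\"]*)\" logs in with the correct password$")
    public void logs_in_with_the_correct_password(String name) {
        loginAccepted = "secret".equals(registeredUsers.get(name));
    }

    @Then("^the login is accepted$")
    public void the_login_is_accepted() {
        assertTrue(loginAccepted);
    }
}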
The question might arise why the feedback from unit tests is not good enough to tell us whether our software is working as expected. Unit tests are a must because they are fast to write and to execute. However, they only tell us whether each of our units of code works properly; they cannot assure us that all those units are integrated into the functionality they compose. How about other types of integration tests? If they are based on pure code, they can neither be written nor understood by the customer or users. Without their involvement, integration tests are just our assumption of what they want, which may or may not be true. Moreover, since such tests must work in conjunction with requirements, they represent a duplication of work by providing two forms of the same concept. Requirements are tests, often written in a different format. If requirements become executable, there is no need for separate artifacts.

If BDD is the replacement for requirements and integration tests, can we get rid of unit tests? We can, but we should not. Even though one can write BDD at all levels, using it instead of unit tests would drastically increase the amount of work. Moreover, it would complicate the communication with the customer and users. Keeping unit tests as a way to verify all the combinations the software can go through at the unit level frees us to write BDD scenarios in a compact way that confirms the integration of those units, while providing a good communication tool that acts as a final verification of the functionalities we are developing.

Requirements themselves should be executable, and that is what BDD is trying to accomplish. If integrated into the CI process, it provides the missing piece by converting a process that continuously gives feedback about what we think should be developed into one that gives feedback about what the customer and users think should be developed. It is present throughout the whole process: it starts as a way to capture requirements, guides the development, and acts as the final verification of the CI. It is the missing piece required to have a reliable delivery to production on a continuous basis.

Reference: BDD (Behavior-Driven Development): Missing Piece in the Continuous Integration Puzzle from our JCG partner Viktor Farcic at the Technology conversations blog....

SonarQube As An Education Platform

I’ve been using the SonarQube [1] platform for more than four years. I remember the time when it was making its first baby steps as a code quality management tool. Back then it looked more like a system that integrated various third-party static analysis tools (like PMD, FindBugs etc.) and provided a few, but important, code quality metrics. Many things have changed over the years. SonarQube today is considered a mature software eco-system (in my humble opinion the best) that provides a set of features for successfully applying the process of continuous inspection to any development methodology. In this article I’m not going to discuss SonarQube’s star features that help you manage and control your Technical Debt. I will give a different point of view and explain how you can use it as an educational platform.

Teaching developers with coding rules

Since release 4.0, the integration of external tools has gradually been dropped, and several of the coding rules provided by those tools have been replaced by rules written using an in-house developed (but still open-sourced) language parsing library called the SonarSource [2] Language Recognizer (SSLR) [3]. One of the great benefits of this rule re-writing is that each rule includes a very explanatory description of its purpose as well as several code examples – if applicable – that present the right and wrong way of writing code. Consider, for example, a Java coding rule that checks whether Object.equals is overridden when Object.compareTo is overridden. The rule is not only backed up by a very detailed and well-argued explanation, but it also contains two code snippets: a compliant and a non-compliant one.
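To give a feel for what such a rule page contains, here is roughly the shape of the non-compliant and compliant snippets for that compareTo/equals rule. The Rating class below is made up for illustration; it is not SonarQube’s exact example.

// Non-compliant: compareTo is overridden but equals is not, so
// "a.compareTo(b) == 0" and "a.equals(b)" can disagree, e.g. in sorted collections.
class Rating implements Comparable<Rating> {
    int stars;

    @Override
    public int compareTo(Rating other) {
        return Integer.compare(stars, other.stars);
    }
}

// Compliant: equals (and hashCode) are kept consistent with compareTo.
class FixedRating implements Comparable<FixedRating> {
    int stars;

    @Override
    public int compareTo(FixedRating other) {
        return Integer.compare(stars, other.stars);
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof FixedRating)) {
            return false;
        }
        return compareTo((FixedRating) obj) == 0;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(stars);
    }
}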
Developers are able to read all this information when they are looking at an issue [4] that violated the rule. They are supposed to understand what they did wrong, fix it and, hopefully, not make the same mistake again in the future. But hey! You don’t have to sit down and wait for SonarQube to raise an issue before developers read about the correct way of writing code. You can send the developers to study the rules anytime they want. In other words, educate them before a quality flaw appears. In the company I work with, we have filtered out the rules that are not aligned with our coding style and then grouped the rest using the tagging mechanism provided by SonarQube [5]. Then we organized training sessions where we walked through every rule of a specific tag (group) and discussed the details of each rule and the suggested way of coding. That’s all! We noticed that the developers started writing better code from the very next day, and SonarQube raised very few issues for the coding rules we had already discussed.

Learning from code reviews

If you don’t have enough time to allocate to the previous suggestion, you might consider an alternative approach. Most of you are probably familiar with code reviews, or at least know the basics and the benefits of applying such a practice. SonarQube provides a built-in tool that facilitates the code review process. In a few words, each issue can be assigned to a developer and can also be planned in an action plan. Code reviewers are able to confirm the issue, mark it as a false positive (providing some additional reasoning), or just comment on it with suggestions or possible solutions to fix the problematic code. All this issue interactivity can be viewed as a way to teach people, especially young developers.

As in the previous section, you can ask developers to read the comments or study the raised issues. A nice way of doing this, without cutting time from your development tasks, would be the following. First, prioritize SonarQube issues and plan them using action plans; for instance, you might have an action plan that includes all issues that should be fixed during the current iteration and another one for future iterations. Hold short meetings during the iteration where you review all new SonarQube issues and prioritize them. As soon as you have planned the required issues, let’s say for the current iteration, ask the developers to work as a team and come up with a solution, especially for those issues where a one-line fix is not enough. Finally, document the solution by commenting on the relevant issues so that everyone can see it. The benefit of this approach is that developers are required to understand the underlying broken coding rules for all issues (not only the ones they created) and then figure out the fix.

Conclusion

Educating developers should be constant and continuous, but this is something that most companies forget, intentionally (lack of budget) or not (lack of time). If we try not to regard it as a necessary evil, but as something that can take place during everyday development tasks, then we might have a better chance. SonarQube’s coding rules and its easy-to-use code review mechanism come to the rescue and can be put to use to teach developers how to write better code and, eventually, make them better professionals.

This article was originally published at NDC Oslo Magazine 2014.

References
[1] http://www.sonarqube.org
[2] http://www.sonarsource.com
[3] http://docs.codehaus.org/display/SONAR/SSLR
[4] http://docs.codehaus.org/display/SONAR/Issues
[5] http://docs.codehaus.org/display/SONAR/Configuring+Rules#ConfiguringRules-TaggingRules

Reference: SonarQube As An Education Platform from our JCG partner Patroklos Papapetrou at the Only Software matters blog....

Why you should build an Immutable Infrastructure

Some of the major challenges today when building infrastructure are predictability, scalability and automated recovery. A predictable system will promote the exact same artifact that you tested into your production system, so no intermittent failure can cause any trouble. A scalable system makes it trivial to deal with any rise in traffic, ideally automatically. And automated recovery will make sure your team can focus on building a better product and sleep at night instead of constantly maintaining infrastructure. At Codeship we’ve found that an infrastructure made up of immutable components has helped us tremendously with these goals.

Julian Dunn from Chef recently released a blog post about their stance on immutable infrastructure. Chad Fowler summed it up very well in a tweet:

@flomotlik pretty weak IMO. It conflates "containerisation" & "immutable infrastructure" then harps on a rigid definition of "immutable" — Chad Fowler (@chadfowler), June 30, 2014

Instead of going over every piece of the article, I want to present an overview of the experience we – and others – have had in making parts of our infrastructure immutable.

What is Immutable Infrastructure

Immutable infrastructure is made up of immutable components that are replaced for every deployment, rather than being updated in place. Those components are started from a common image that is built once per deployment and can be tested and validated. The common image can be built through automation, but doesn’t have to be. Immutability is independent of any tool or workflow for building the images. Its best use case is in a cloud or virtualized environment. While it’s possible in non-virtualized environments, the benefit doesn’t outweigh the effort.

State Isolation

The main criticism of immutable infrastructure – as stated in the Chef blog post – is that there is always state somewhere in the system and, therefore, the whole system isn’t immutable. That misses the point of immutable components. The main advantage when it comes to state in immutable infrastructure is that it is siloed. The boundaries between the layers that store state and the layers that are ephemeral are clearly drawn, and no leakage can possibly happen between those layers. There simply is no way to mix state into different components when you can’t expect them to be up and running the next minute.

Atomic Deployments and Validation

Updating an existing server can easily have unintended consequences. That’s why Chef, Puppet, CFEngine and other such tools exist – to take care of consistency across your infrastructure. A central system is necessary to manage the expected state of each server and to take action to ensure compliance. Deployment is not an atomic action but a transition that can go wrong and lead to an unknown state, which becomes very hard and complex to debug because the exact state you are in is hard to know. Chef, Puppet and CFEngine are very complex systems because they have to deal with an overly complex problem. Another solution to that problem is to build completely new images and servers that contain the application and the environment every time you want to deploy. In that case the deployment doesn’t depend on the state the servers were in before, so the result is much more predictable and repeatable. Any third-party issues that might cause the deployment to fail can be caught by validating the new image and ensuring no production system is impacted.
This one image can then be used to start any number of servers and switch atomically from the old machines to the new ones, for example by changing the load balancer. There are, of course, downsides to rebuilding your images with every deployment. A full rebuild of the system takes a lot longer than simply updating and restarting the application. By layering your deployment you can optimize this – e.g. build a base image and add only your application on top of it for the deployment image – but it will still be a slower process.

Another problem is that you introduce dependencies on third parties during deployment. If you install packages in the system and your apt repository is slow or down, the deployment can fail. While this could be a problem in a non-immutable infrastructure as well, you typically interact less with third-party systems when you just push new code into an already provisioned system. By deploying from a pre-provisioned base image and updating that base image regularly you can soften the problem, but it’s still there and might fail a deployment from time to time.

Building the automation currently takes more time at the beginning of the project, as the tools for building immutable infrastructure are still new or need to be developed. It is definitely more investment up front, but it pays off quickly. You can still use Chef, Puppet, CFEngine or Ansible to build your images, but as they aren’t built for an immutable infrastructure workflow they tend to be more complex than necessary.

Fast Recovery by preserving History

As all deployments are done by building new images, history is preserved automatically for rollback when necessary. The same process and automation used to deploy the next version can be used to roll back, which ensures that the rollback process will work. By automating the creation of the images, you can even recreate historical images and branch off from earlier points in the history of the infrastructure. Data schema changes are a potential problem, but that’s a general issue with rollbacks. Backwards compatibility and zero-downtime deployments are a way to make sure rollbacks will work regardless of the changes.

Simple Experimentation

As you control the whole environment and application, any experiments with new versions of the language, operating system or dependencies are easy. With strict testing and validation in place, and the ability to roll back if necessary, all the fear of upgrading any dependency is removed. Experimentation becomes an integral and trivial part of building your infrastructure.

Makes you collect your logs and metrics in a central location

With immutable components in place, it’s easy to simply kill a misbehaving server. While errors are often simply a product of the environment – for example, a third-party system misbehaving – and can be ignored, some will keep coming up. Not having access to the servers puts the right incentive on the team to collect and store logs and system metrics externally. This way, debugging can happen while the server is long gone. If the logs and metrics needed to properly debug an issue are missing, it’s easy to add more data collection to the infrastructure and replace all existing servers. Then, once the error comes up again, you can debug it fully from the data stored in an external system.

Conclusions

Immutable components as part of your infrastructure are a way to reduce inconsistency in your infrastructure and improve trust in your deployment process.
Atomic deployments, combined with validation of the image and easy rollback, make managing your infrastructure a lot easier. This approach forces teams to silo data and to expect the failures that are inherent when building on top of a cloud infrastructure, or when building systems in general. It increases resilience and trains you in a process to withstand any problems, especially in an automated fashion. Furthermore, it helps with building simple and independent components that are easy to deploy and scale.

And it’s not a theoretical idea. At Codeship, we’ve built our infrastructure this way for a long time. Heroku and other PaaS providers are built on immutable components, and lots of companies – small and very large – have used immutability as a core concept of their infrastructure. Tools like Packer have made building immutable components very easy. Together with existing cloud infrastructure, they are a powerful concept to help you build better and safer infrastructure. Let me know in the comments if you have any questions or interesting insights to share.

Reference: Why you should build an Immutable Infrastructure from our JCG partner Florian Motlik at the Codeship Blog blog....

How to Instantly Improve Your Java Logging With 7 Logback Tweaks

The benchmark tests to help you discover how Logback performs under pressure. Logging is essential for server-side applications, but it comes at a cost. It’s surprising, though, how much impact small changes and configuration tweaks can have on an app’s logging throughput. In this post we will benchmark Logback’s performance in terms of log entries per minute. We’ll find out which appenders perform best, what prudent mode is, and what some of the awesome side effects of async methods, sifting and console logging are. Let’s get to it.

The groundwork for the benchmark

At its core, Logback is based on Log4j, with tweaks and improvements under Ceki Gülcü’s vision. Or, as they say, a better Log4j. It features a native slf4j API, a faster implementation, XML configuration, prudent mode, and a set of useful appenders which I will elaborate on shortly. Having said that, there are quite a few ways to log with the different sets of appenders, patterns and modes available in Logback. We took a set of commonly used combinations and put them to the test on 10 concurrent threads to find out which runs faster. The more log entries written per minute, the more efficient the method is and the more resources are free to serve users. It’s not exact science, but to be more precise we ran each test 5 times, removed the top and bottom outliers and took the average of the results. To try to be fair, all log lines written also had an equal length of 200 characters.

** All code is available on GitHub right here. The test was run on a Debian Linux machine with an Intel i7-860 (4 cores @ 2.80 GHz) and 8GB of RAM.

First Benchmark: What’s the cost of synchronous log files?

First we took a look at the difference between synchronous and asynchronous logging. Both write to a single log file: the FileAppender writes entries directly to the file, while the AsyncAppender feeds them to a queue which is then written to the file. The default queue size is 256, and when it’s 80% full it stops letting in new entries of lower levels (except WARN and ERROR). The table compares the FileAppender with different queue sizes for the AsyncAppender. Async came out on top with the 500 queue size.

Tweak #1: AsyncAppender can be 3.7x faster than the synchronous FileAppender. Actually, it’s the fastest way to log across all appenders.

It performed way better than the default configuration, which even trails behind the sync FileAppender that was supposed to finish last. So what might have happened? Since we’re writing INFO messages, and doing so from 10 concurrent threads, the default queue size might have been too small and messages could have been lost to the default threshold. Looking at the results of the 500 and 1,000,000 queue sizes, you’ll notice that their throughput was similar, so queue size and threshold weren’t an issue for them.

Tweak #2: The default AsyncAppender can cause a 5-fold performance cut and even lose messages. Make sure to customize the queueSize and discardingThreshold according to your needs.

<appender name="ASYNC500" class="ch.qos.logback.classic.AsyncAppender">
    <queueSize>500</queueSize>
    <discardingThreshold>0</discardingThreshold>
    <appender-ref ref="FILE" />
</appender>
** Setting an AsyncAppender’s queueSize and discardingThreshold
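The full benchmark code is on GitHub; purely as an illustration of the setup described above, a stripped-down harness along the following lines could drive 10 threads writing fixed-length INFO entries for a minute through whatever appender the logback.xml under test configures. The class name, timing and message construction here are assumptions made for this sketch, not the actual benchmark code.

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingBenchmark {

    private static final Logger LOGGER = LoggerFactory.getLogger(LoggingBenchmark.class);
    private static final int THREADS = 10;
    private static final String MESSAGE = buildMessage(200); // fixed 200-character payload

    public static void main(String[] args) throws InterruptedException {
        long end = System.currentTimeMillis() + 60_000; // log for one minute
        Thread[] workers = new Thread[THREADS];
        for (int i = 0; i < THREADS; i++) {
            workers[i] = new Thread(() -> {
                while (System.currentTimeMillis() < end) {
                    LOGGER.info(MESSAGE); // the appender under test is picked up from logback.xml
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        // Throughput is then the number of lines in the target log file(s) after the run.
    }

    private static String buildMessage(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append('x');
        }
        return sb.toString();
    }
}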
Second Benchmark: Do message patterns really make a difference?

Now we want to see the effect of log entry patterns on the speed of writing. To make this fair, we kept the log line’s length equal (200 characters) even when using different patterns. The default Logback entry includes the date, thread, level, logger name and message; by playing with these fields we tried to see what the effects on performance might be. This benchmark demonstrates, up close, the benefit of logger naming conventions. Just remember to name the logger according to the class you use it in.

Tweak #3: Naming the logger by class name provides a 3x performance boost.

Taking the logger or the thread name off added some 40k-50k entries per minute. No need to write information you’re not going to use. Going minimal also proved to be a bit more effective.

Tweak #4: Compared to the default pattern, using only the Level and Message fields provided 127k more entries per minute.

Third Benchmark: Dear prudence, won’t you come out to play?

In prudent mode a single log file can be accessed from multiple JVMs. This, of course, takes a hit on performance because of the need to handle another lock. We tested prudent mode on 2 JVMs writing to a single file, using the same benchmark we ran earlier. Prudent mode takes a hit as expected, although my first guess was that the impact would be stronger.

Tweak #5: Use prudent mode only when you absolutely need it, to avoid a throughput decrease.

<appender name="FILE_PRUDENT" class="ch.qos.logback.core.FileAppender">
    <file>logs/test.log</file>
    <prudent>true</prudent>
</appender>
** Configuring Prudent mode on a FileAppender

Fourth Benchmark: How to speed up synchronous logging?

Let’s see how synchronous appenders other than the FileAppender perform. The ConsoleAppender writes to System.out or System.err (defaulting to System.out) and, of course, can also be piped to a file. That’s how we were able to count the results. The SocketAppender writes to a specified network resource over a TCP socket. If the target is offline, the message is dropped; otherwise, it’s received as if it were generated locally. For the benchmark, the socket was sending data to the same machine, so we avoided network issues and concerns. To our surprise, explicit file access through FileAppender is more expensive than writing to the console and piping it to a file. The same result, a different approach, and some 200k more log entries per minute. SocketAppender performed similarly to FileAppender in spite of adding serialization in between; the network resource, had it existed, would have borne most of the overhead.

Tweak #6: Piping ConsoleAppender to a file provided 13% higher throughput than using FileAppender.

Fifth Benchmark: Now can we kick it up a notch?

Another useful method we have in our tool belt is the SiftingAppender. Sifting allows breaking the log into multiple files. Our logic here was to create 4 separate logs, each holding the logs of 2 or 3 of the 10 threads we ran in the test. This is done by indicating a discriminator, in our case logid, which determines the file name of the logs:

<appender name="SIFT" class="ch.qos.logback.classic.sift.SiftingAppender">
    <discriminator>
        <key>logid</key>
        <defaultValue>unknown</defaultValue>
    </discriminator>
    <sift>
        <appender name="FILE-${logid}" class="ch.qos.logback.core.FileAppender">
            <file>logs/sift-${logid}.log</file>
            <append>false</append>
        </appender>
    </sift>
</appender>
** Configuring a SiftingAppender

Once again our FileAppender takes a beating. The more output targets, the less stress on the locks and the less context switching. The main bottleneck in logging, same as in the async example, proves to be synchronizing on a file.
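For completeness: by default the SiftingAppender resolves the logid key from the MDC, so each thread has to set it before logging. A minimal sketch of that wiring might look like the following; the worker class and the way logid values are assigned are assumptions for illustration, not the benchmark’s actual code.

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class SiftingWorker implements Runnable {

    // Logger named by class, per Tweak #3 above.
    private static final Logger LOGGER = LoggerFactory.getLogger(SiftingWorker.class);

    private final String logId;

    public SiftingWorker(String logId) {
        this.logId = logId;
    }

    @Override
    public void run() {
        MDC.put("logid", logId); // routes this thread's entries to logs/sift-<logid>.log
        try {
            LOGGER.info("worker {} doing its share of the logging", logId);
        } finally {
            MDC.remove("logid"); // avoid leaking the key to pooled threads
        }
    }
}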
Tweak #7: Using a SiftingAppender can allow a 3.1x improvement in throughput.

Conclusion

We found that the way to achieve the highest throughput is by using a customized AsyncAppender. If you must use synchronous logging, it’s better to sift the output and write to multiple files by some logic. I hope you’ve found the insights from the Logback benchmark useful, and I look forward to hearing your thoughts in the comments below.

Reference: How to Instantly Improve Your Java Logging With 7 Logback Tweaks from our JCG partner Alex Zhitnitsky at the Takipi blog....