The Experience Paradox–How to Get a Job Without Experience

One of the most difficult things about becoming a software developer is the experience paradox: you need a job to get experience, and you need experience to get a job. This problem is of course not limited to the field of software development, but many new software developers struggle with getting that first job, especially if they come into software development from another field, most notoriously quality assurance. It seems especially difficult to transition from the role of quality analyst to software developer, even if you can competently write code.

So, how exactly do you get experience when you can't get a job without it? It's an interesting problem to ponder, and there don't seem to be many good solutions. Most software developers just assume you have to get lucky to land that first job without experience. The other popular alternative is to simply lie about your previous experience until you have enough that you don't have to. I'm not a big fan of making up a fake history in order to get a job. It's pretty likely you'll get found out, and it's not a great way to start a relationship with an employer. I'm also not that fond of leaving things up to luck. Counting on luck isn't a good way to build a successful career; I'd much rather work with things that are directly under my control than rely on chance.

So, that brings us back to the question we started with, or rather a modification of it: without experience, lying or dumb luck, how can I get a job? One of the best ways to gain experience without a job is to create your own job. What's that you say? Create my own job? Yes, you heard me right. There is basically nothing stopping you from creating your own company, hiring yourself as the only employee and doing real work that will count as valid experience. Now, of course, it's not quite that simple. You need to create some real, valid experience. You can't just create a sham company, call yourself the lead software developer and use it to get a job. But what you can do is work on creating a few simple apps and do it under the name of a company that you create. There is nothing dishonest or fishy about that approach. Today it is easier than ever to do, because it is possible to create a simple mobile or web application as a solo developer. It is even easy to sell an application you create, although getting customers might be the hard part.

I usually recommend that developers starting out, who are trying to get experience, start off by developing a few mobile applications. The reason I recommend this approach is that mobile applications are generally expected to be fairly small projects, and they are easy to sell and distribute. Actually having an application that is being used or sold brings a bit more credibility than just building something for "fun." But you don't have to build a mobile application. Really, you just have to build something useful that is a real application, not just a demo project. This means building something end-to-end. The barrier to entry is extremely low today. Just about anyone can build their own application and distribute it. That means you can build a real, legit software company all by yourself. With a few applications created, not only will you be able to claim some real, valid experience, but you'll also be able to show the source code for the applications you built at a job interview.
You might even find that you will be ahead of some developers who have 4-to-5 years of experience but have never successfully built an application end-to-end. Many software developers start out getting experience maintaining existing systems, but never learn how to actually build a complete application. If you can show some experience building a complete application, even if you did it for your own company, you can put yourself way ahead. If one of your applications takes off, you might even find that you don't need to get a job working for someone else; your own software company might become successful itself.

The key is getting into the interview. You might be thinking that creating your own company and building a few apps is not the same as having a real job and real experience. I tend to disagree; I think it is actually more valuable and shows more practical ability, but I realize that some employers and some developers will disagree with me. It doesn't matter though, because the point of gaining this experience is to get into the job interview. It is very difficult to get a job interview without any experience on your resume, and it is fairly easy to make a job at your own company look just the same as a job at any other company on your resume.

Of course, once you get into the interview, you need to be completely honest about the situation. Don't try to pretend that the company you worked for was anything other than your own creation. Instead, use this information to your advantage. Talk about how you created your own job and took initiative instead of waiting for a job to come to you. Talk about how you learned a great deal by building and distributing your own applications. Turn everything that might be seen as a negative into a positive. Now, this technique of gaining your first experience might not get you a top-level software development position, but it should at least help you get your foot in the door, which is arguably the hardest part.

Reference: The Experience Paradox–How to Get a Job Without Experience from our JCG partner John Sonmez at the Making the Complex Simple blog.

Hibernate Statistics with Hawtio and Jolokia

A huge part of enterprise Java deals with data. Among all the different ways of working with data in enterprise settings, there is still the proven and widely taught approach of using O/R mapping of one kind or another. The JPA standard makes this comparably easy to use for everybody, and it should also be portable. But let's not talk about migration details. The biggest drawback of O/R mapping is that developers tend to lose contact with what's happening on the database, or even with which exact SQL statements get issued against it. This is the number one reason those projects run into performance issues. If you're there, you need to analyze the root causes and drill down to the problems. I recently found a nice feature of Hibernate which makes this comparably easy.

Available Statistics And Ways To Get Them

Hibernate up to 3.5.x ships with a statistics and metrics API that allows you to figure out a lot about what is happening under the covers. All available counters are described in the Statistics interface API, in three categories:

- metrics related to general Session usage, such as the number of open sessions, retrieved JDBC connections, etc.
- metrics related to entities, collections, queries and caches as a whole (aka global metrics)
- detailed metrics related to a particular entity, collection, query or cache region

For example, you can check the cache hit, miss and put ratio of entities, collections and queries, and the average time a query needs. Be aware that the number of milliseconds is subject to approximation in Java. Hibernate is tied to the JVM precision, and on some platforms this might only be accurate to 10 seconds. Simple getters are used to access the global metrics (i.e. those not tied to a particular entity, collection, cache region, etc.). You can access the metrics of a particular entity, collection or cache region through its name, and through its HQL or SQL representation for queries. Please refer to the Statistics, EntityStatistics, CollectionStatistics, SecondLevelCacheStatistics and QueryStatistics API Javadoc for more information.

All you have to do is enable statistics for the session factory you're interested in and retrieve the statistics to analyze them. There are plenty of examples out there showing how to use this feature with Spring. The reason is pretty simple: Spring comes with the legendary MBeanExporter, which exposes Java objects as JMX MBeans. And guess what: Hibernate Statistics provides an easy way of exposing them through JMX. But there is no need to use Spring if you just put together some more RedHat magic! You basically have two different ways of enabling the statistics in your configured setting. The easiest way is to add a property to your persistence-unit configuration:

<property name="hibernate.generate_statistics" value="true"/>

But it is also possible to enable them manually. More details on how to do that can be found on the community wiki and in the performance-monitoring section of the Hibernate documentation.

Enabling and Exposing Statistics By Example

I created a little example standalone Hibernate application with two entities and a main class that works with Hibernate and initializes everything you need to know. Get your hands on it instantly by forking it on GitHub. Here is the little walk-through: there are the two mandatory entities (Department and Employee) and the META-INF/persistence.xml. This is the basic setting; there is not much magic in here. You can see where to enable the statistics (potentially) in the persistence.xml.
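If you go the manual route instead, it boils down to flipping the flag on the SessionFactory's Statistics object. Here is a minimal sketch, assuming a Hibernate-backed JPA EntityManager (variable and method names are illustrative, not from the example project):

import org.hibernate.Session;
import org.hibernate.stat.Statistics;

public static void enableStatistics(javax.persistence.EntityManager entityManager) {
    // unwrap the native Hibernate Session from the JPA EntityManager
    Session session = entityManager.unwrap(Session.class);
    // fetch the Statistics object of the underlying SessionFactory and switch it on
    Statistics statistics = session.getSessionFactory().getStatistics();
    statistics.setStatisticsEnabled(true);
}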
The example enables them in the main class JpaTest. But let's start at the beginning. The main method performs the following steps in order:

- create the EntityManager to use
- register the Statistics MBean we need
- initialize the Jolokia server to expose JMX via JSON for Hawtio
- do something with the entities

The magic starts in step two, in the registerHibernateMBeans(EntityManager manager) method. It gets hold of the PlatformMBeanServer, registers the relevant Hibernate JMX MBean, sets the session factory we're interested in on it and enables the statistics. That is easy. Now you have a JMX MBean "Hibernate" with the attribute "statistics" registered. If you have access to the server via JConsole, Mission Control or VisualVM, you can simply connect to the process and browse through the statistics.

In production environments this typically isn't possible at all, so you need to find a way to access this via http/https. This is where I found it handy to try out Hawtio as a modular web console for managing your Java stuff. Boiled down to the basics, it is a web console with plugins. It has a ton of plugins and can be customized and extended to fit your needs. Today we're looking at a very simple plugin, the JMX plugin. It gives you a raw view of the underlying JMX metric data, allowing access to the entire JMX domain tree of MBeans. But in order to make this happen, we first need to find a way to expose the JMX features to Hawtio. This is where Jolokia comes in: it contains a JVM agent that can expose JMX MBeans via JSON. All you have to do is init and start the server like this:

JolokiaServerConfig config = new JolokiaServerConfig(new HashMap<String, String>());
JolokiaServer jolokiaServer = new JolokiaServer(config, true);
jolokiaServer.start();

Now you're ready to try out the Hawtio console. Have a look at the quickstart to see what is possible. For this example I just use the Google Chrome extension, which you only have to download and drag into your extensions page in Chrome. If you configure host "localhost", port "8778" and path "jolokia", you're all set to start browsing your results. After you click "connect" you can look through the dashboard or switch to the JMX view and navigate to the Hibernate MBean.

There is a more comprehensive introduction to Hawtio by Stan Lewis from DevNation 2014 waiting for you to watch. That was the short example. Go ahead and look at the GitHub source code and feel free to look into Hawtio a bit more:

- Read the getting started guide to find out how to download and install Hawtio in your own environment.
- Read up on how to configure Hawtio in various environments, such as configuring security and where Hawtio stores stuff.
- See how to configure Hawtio on WildFly.
- We prefer to use the issue tracker for dealing with ideas and issues, but if you just want to chat about all things Hawtio, please join us on the mailing list.
- Find the Hawtio source code on GitHub.

Reference: Hibernate Statistics with Hawtio and Jolokia from our JCG partner Markus Eisele at the Enterprise Software Development with Java blog.

Deploying a Spring boot application to Cloud Foundry with Spring-Cloud

I have a small Spring Boot based application that uses a Postgres database as a datastore. I wanted to document the steps involved in deploying this sample application to Cloud Foundry. Some of the steps are described in the Spring Boot reference guide, but the guides do not sufficiently explain how to integrate with the datastore provided in a cloud-based environment. Spring Cloud provides the glue for Spring-based applications deployed on a cloud to discover and connect to bound services, so the first step is to pull the Spring Cloud libraries into the project with the following pom entries:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-spring-service-connector</artifactId>
    <version>1.0.0.RELEASE</version>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-cloudfoundry-connector</artifactId>
    <version>1.0.0.RELEASE</version>
</dependency>

Once this dependency is pulled in, connecting to a bound service is easy; just define a configuration along these lines:

@Configuration
public class PostgresCloudConfig extends AbstractCloudConfig {

    @Bean
    public DataSource dataSource() {
        return connectionFactory().dataSource();
    }
}

Spring Cloud understands that the application is deployed on a specific cloud (currently Cloud Foundry and Heroku, by looking for certain characteristics of the deployed cloud platform), discovers the bound services, recognizes that there is a bound service from which a Postgres-based datasource can be created, and returns the datasource as a Spring bean. This application can now deploy cleanly to a Cloud Foundry based cloud. The sample application can be tried out on a version of Cloud Foundry deployed with bosh-lite; this is how the steps look on my machine once Cloud Foundry is up and running with bosh-lite. The following command creates a user-provided service in Cloud Foundry:

cf create-user-provided-service psgservice -p '{"uri":"postgres://postgres:p0stgr3s@bkunjummen-mbp.local:5432/hotelsdb"}'

Now push the app, but don't start it up yet. We can do that once the service above is bound to the app:

cf push spring-boot-mvc-test -p target/spring-boot-mvc-test-1.0.0-SNAPSHOT.war --no-start

Bind the service to the app and restart the app:

cf bind-service spring-boot-mvc-test psgservice
cf restart spring-boot-mvc-test

That is essentially it. Spring Cloud should ideally take over at this point, cleanly parse the credentials from the bound service (which within Cloud Foundry translates to an environment variable called VCAP_SERVICES) and create the datasource from it. There is, however, an issue with this approach: once the datasource bean is created using the Spring Cloud approach, it does not work in a local environment anymore. The potential fix is to use Spring profiles: assume that there is a separate "cloud" Spring profile active in the cloud environment, in which the Spring Cloud based datasource gets returned:

@Profile("cloud")
@Configuration
public class PostgresCloudConfig extends AbstractCloudConfig {

    @Bean
    public DataSource dataSource() {
        return connectionFactory().dataSource();
    }
}

and let Spring Boot auto-configuration create the datasource in the default local environment. This way the configuration works both locally and in the cloud.
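For the default (local) profile, Spring Boot's auto-configuration can build that datasource from plain properties. A minimal, illustrative application.properties matching the sample's database might look like this (the values are assumptions derived from the connection URI above, not taken from the original post):

spring.datasource.url=jdbc:postgresql://localhost:5432/hotelsdb
spring.datasource.username=postgres
spring.datasource.password=p0stgr3s
spring.datasource.driverClassName=org.postgresql.Driver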
Where does this "cloud" profile come from? It can be created using an ApplicationContextInitializer and looks this way:

public class SampleWebApplicationInitializer implements
        ApplicationContextInitializer<AnnotationConfigEmbeddedWebApplicationContext> {

    private static final Log logger = LogFactory.getLog(SampleWebApplicationInitializer.class);

    @Override
    public void initialize(AnnotationConfigEmbeddedWebApplicationContext applicationContext) {
        Cloud cloud = getCloud();
        ConfigurableEnvironment appEnvironment = applicationContext.getEnvironment();

        if (cloud != null) {
            appEnvironment.addActiveProfile("cloud");
        }

        logger.info("Cloud profile active");
    }

    private Cloud getCloud() {
        try {
            CloudFactory cloudFactory = new CloudFactory();
            return cloudFactory.getCloud();
        } catch (CloudException ce) {
            return null;
        }
    }
}

This initializer makes use of Spring Cloud's scanning capabilities to activate the "cloud" profile. One last thing I wanted to try was to make my local environment behave like the cloud, at least in the eyes of Spring Cloud. This can be done by adding the environment variables from which Spring Cloud determines the type of cloud the application is deployed on; the following is my local startup script for the app to pretend it is deployed in Cloud Foundry:

read -r -d '' VCAP_APPLICATION <<'ENDOFVAR'
{"application_version":"1","application_name":"spring-boot-mvc-test","application_uris":[""],"version":"1.0","name":"spring-boot-mvc-test","instance_id":"abcd","instance_index":0,"host":"0.0.0.0","port":61008}
ENDOFVAR

export VCAP_APPLICATION=$VCAP_APPLICATION

read -r -d '' VCAP_SERVICES <<'ENDOFVAR'
{"postgres":[{"name":"psgservice","label":"postgresql","tags":["postgresql"],"plan":"Standard","credentials":{"uri":"postgres://postgres:p0stgr3s@bkunjummen-mbp.local:5432/hotelsdb"}}]}
ENDOFVAR

export VCAP_SERVICES=$VCAP_SERVICES

mvn spring-boot:run

This entire sample is available at this GitHub location: https://github.com/bijukunjummen/spring-boot-mvc-test

Conclusion

Spring Boot, along with the Spring Cloud project, now provides an excellent toolset to create Spring-powered, cloud-ready applications. Hopefully these notes are useful in integrating Spring Boot with Spring Cloud and in using them for seamless local and cloud deployments.

Reference: Deploying a Spring boot application to Cloud Foundry with Spring-Cloud from our JCG partner Biju Kunjummen at the all and sundry blog.

Test Attribute #7 – Footprint

When we talk footprint, we're really talking about isolation. Isolation is key to trust.

Wait, What?

The "checking" part of testing is really about trust. We check because we want to make sure our system works as we anticipated. Therefore, we build a suite of tests that confirm our assumptions about the system. And every time we look at the test results, we want to be 100% sure these tests are not lying to us. We need to trust our tests, because then we won't need to recheck every time. We'll know a failure points at a real problem, and that the mass of tests we've accumulated over the years was not an utter waste of our time. We need to know that no matter:

- where in the world the test runs,
- when the test runs,
- on which kind of machine the test runs,
- who runs the test,
- how many times we run it,
- in what order we run it, whether alone or in sequence,
- and under which environmental conditions we run it,

the result will not be affected. Isolation means we can put a good chunk of trust in our tests, because we eliminate the effect of outside interference. If we ensure total isolation, we'll know that not only does Test XYZ have reliable results, it also doesn't affect the results of any other test. There's only one small problem: we cannot ensure total isolation! Is the memory status the same every time we run the test? Did our browser leave temporary files around the last time that might impact how full the disk is? Did the almighty garbage collector clear all the unused objects? Was it the same length of time since system reboot? We don't know. Usually these things don't matter. Like in real life, we're good at filtering out the un-risky stuff that can have an effect, but usually doesn't. So we need good-enough isolation, and that means a minimal, controllable footprint:

- All memory allocated by the test should be freed.
- Every file the test created should be deleted.
- Every file the test deleted should be restored.
- Every changed registry key, environment variable, log entry, etc. should be reverted.

(A minimal JUnit sketch of this kind of cleanup follows at the end of this article.) I'm saying Test, but I actually mean Test AND Code. So if the tested code does something that requires rollback, the test needs to do it as well.

Mister, You Are A Terrible Isolationist!

It's not the first time I've been called that. Sounds a bit extreme, doesn't it? I mean, if I test against a "dirty" database, and don't rely on any previous state, am I doing things wrong? Do I need to always start from the same database? Well, yes and no. If you've analyzed the situation and have written a test that doesn't rely on previous state, that means you've taken isolation into account already. So a suite of tests that piles data on the database and doesn't clean it up is in a context that doesn't care about footprint. The question is: what if the test fails? Since you've allowed the tests to mark their territory, you now have tests that are hard to reproduce. That will cost you in debugging, and maybe even in resolving the problem. As always, it's an ROI balance of risk analysis and mitigation. The thing is, you need to be aware of the balance when making the decision.

Reference: Test Attribute #7 – Footprint from our JCG partner Gil Zilberfeld at the Geek Out of Water blog.
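To make the cleanup checklist above concrete, here is a minimal JUnit 4 sketch (the class and file names are illustrative, not from the original article): a file is allocated in setup and its removal is guaranteed in teardown, so even a failing test leaves no footprint behind.

import java.io.File;
import java.io.IOException;
import org.junit.After;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;

public class ExportFootprintTest {

    private File exportFile;

    @Before
    public void setUp() throws IOException {
        // every resource the test allocates is tracked...
        exportFile = File.createTempFile("export-under-test", ".csv");
    }

    @Test
    public void exportFileIsWritable() {
        // exercise the code under test against the tracked file
        // (this assertion is a placeholder for the real check)
        Assert.assertTrue(exportFile.canWrite());
    }

    @After
    public void tearDown() {
        // ...and released even when the test fails,
        // so the next run starts from a clean state
        if (exportFile != null) {
            exportFile.delete();
        }
    }
}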

Tutorial – REST API design and implementation in Java with Jersey and Spring

Looking to REST in Java? Then you've come to the right place, because in this blog post I will show you how to "beautifully" design a REST API and also how to implement it in Java with the Jersey framework. The RESTful API developed in this tutorial will demonstrate complete Create, Read, Update and Delete (CRUD) functionality for podcast resources stored in a MySQL database.

1. The example

1.1. Why?

Before we start, let me tell you why I've written this post. Well, my intention is to offer in the future a REST API for Podcastpedia.org. Of course I could use Spring's own REST implementation, as I currently do for the AJAX calls, but I also wanted to see how the "official" implementation looks. So, the best way to get to know the technology is to build a prototype with it. That is what I did and what I am presenting here, and I can say that I am pretty damn satisfied with Jersey. Read along to understand why!!! Note: You can visit my post Autocomplete search box with jQuery and Spring MVC to see how Spring handles REST requests.

1.2. What does it do?

The resource managed in this tutorial is podcasts. The REST API will allow creation, retrieval, update and deletion of such resources.

1.3. Architecture and technologies

The demo application uses a multi-layered architecture, based on the "Law of Demeter (LoD), or principle of least knowledge" [16]:

- the first layer is the REST support implemented with Jersey; it has the role of a facade and delegates the logic to the business layer
- the business layer is where the logic happens
- the data access layer is where the communication with the persistence storage (in our case the MySQL database) takes place

A few words on the technologies/frameworks used:

1.3.1. Jersey (facade)

The Jersey RESTful Web Services framework is an open source, production quality framework for developing RESTful web services in Java. It provides support for the JAX-RS APIs and serves as the JAX-RS (JSR 311 & JSR 339) reference implementation.

1.3.2. Spring (business layer)

I like gluing stuff together with Spring, and this example is no exception. In my opinion there's no better way to make POJOs with different functionalities. You'll find out in the tutorial what it takes to integrate Jersey 2 with Spring.

1.3.3. JPA 2 / Hibernate (persistence layer)

For the persistence layer I still use a DAO pattern, even though I am implementing it with JPA 2, which, as some people say, should make DAOs superfluous (I, for one, don't like my service classes cluttered with EntityManager/JPA specific code). As supporting framework for JPA 2 I am using Hibernate. See my post Java Persistence Example with Spring, JPA2 and Hibernate for an interesting discussion around the persistence theme in Java.

1.3.4. Web container

Everything gets packaged with Maven as a .war file and can be deployed on any web container. I used Tomcat and Jetty, but it could also be GlassFish, WebLogic, JBoss or WebSphere.

1.3.5. MySQL

The sample data is stored in a MySQL table.

1.3.6. Technology versions

- Jersey 2.9
- Spring 4.0.3
- Hibernate 4
- Maven 3
- Tomcat 7
- Jetty 9
- MySQL 5.6

Note: The main focus in this post will be on the REST API design and its implementation with the Jersey JAX-RS implementation; all the other technologies/layers are considered as enablers.

1.4. Source code

The source code for the project presented here is available on GitHub, with complete instructions on how to install and run the project: Codingpedia / demo-rest-jersey-spring
2. Configuration

Before I start presenting the design and implementation of the REST API, we need to do a little configuration so that all these wonderful technologies can come together and play.

2.1. Project dependencies

The Jersey Spring extension must be present in your project's classpath. If you are using Maven, add it to the pom.xml file of your project:

Jersey-spring dependency in the pom.xml

<dependency>
    <groupId>org.glassfish.jersey.ext</groupId>
    <artifactId>jersey-spring3</artifactId>
    <version>${jersey.version}</version>
    <exclusions>
        <exclusion>
            <groupId>org.springframework</groupId>
            <artifactId>spring-core</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.springframework</groupId>
            <artifactId>spring-web</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.springframework</groupId>
            <artifactId>spring-beans</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.glassfish.jersey.media</groupId>
    <artifactId>jersey-media-json-jackson</artifactId>
    <version>2.4.1</version>
</dependency>

Note: The jersey-spring3.jar uses its own version of the Spring libraries, so to use the ones you want (Spring 4.0.3.RELEASE in this case) you need to exclude these libraries manually. Code alert: If you want to see what other dependencies are needed in the project (e.g. Spring, Hibernate, the Jetty maven plugin, testing etc.), have a look at the complete pom.xml file available on GitHub.

2.2. web.xml

Web Application Deployment Descriptor

<?xml version="1.0" encoding="UTF-8"?>
<web-app version="3.0" xmlns="http://java.sun.com/xml/ns/javaee"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_3_0.xsd">

    <display-name>Demo - Restful Web Application</display-name>

    <listener>
        <listener-class>
            org.springframework.web.context.ContextLoaderListener
        </listener-class>
    </listener>

    <context-param>
        <param-name>contextConfigLocation</param-name>
        <param-value>classpath:spring/applicationContext.xml</param-value>
    </context-param>

    <servlet>
        <servlet-name>jersey-serlvet</servlet-name>
        <servlet-class>
            org.glassfish.jersey.servlet.ServletContainer
        </servlet-class>
        <init-param>
            <param-name>javax.ws.rs.Application</param-name>
            <param-value>org.codingpedia.demo.rest.RestDemoJaxRsApplication</param-value>
        </init-param>
        <load-on-startup>1</load-on-startup>
    </servlet>

    <servlet-mapping>
        <servlet-name>jersey-serlvet</servlet-name>
        <url-pattern>/*</url-pattern>
    </servlet-mapping>

    <resource-ref>
        <description>Database resource rest demo web application</description>
        <res-ref-name>jdbc/restDemoDB</res-ref-name>
        <res-type>javax.sql.DataSource</res-type>
        <res-auth>Container</res-auth>
    </resource-ref>
</web-app>

2.2.1. Jersey servlet

Notice the Jersey servlet configuration [lines 18-33]. The javax.ws.rs.core.Application class defines the components (root resource and provider classes) of the JAX-RS application. I used ResourceConfig, which is Jersey's own implementation of the Application class and which provides advanced capabilities to simplify the registration of JAX-RS components. Check out the JAX-RS Application Model in the documentation for more possibilities.
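As a side note, if you prefer not to maintain an Application/ResourceConfig subclass at all, Jersey 2 can also discover resources and providers by package scanning, configured directly on the servlet. A minimal, illustrative init-param replacing the javax.ws.rs.Application parameter above would be (the tutorial sticks with ResourceConfig because it also registers filters and features explicitly):

<init-param>
    <param-name>jersey.config.server.provider.packages</param-name>
    <param-value>org.codingpedia.demo.rest</param-value>
</init-param>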
My implementation of the ResourceConfig class, org.codingpedia.demo.rest.RestDemoJaxRsApplication, registers application resources, filters, exception mappers and features:

org.codingpedia.demo.rest.service.RestDemoJaxRsApplication

package org.codingpedia.demo.rest.service;

//imports omitted for brevity

/**
 * Registers the components to be used by the JAX-RS application
 *
 * @author ama
 */
public class RestDemoJaxRsApplication extends ResourceConfig {

    /**
     * Register JAX-RS application components.
     */
    public RestDemoJaxRsApplication() {
        // register application resources
        register(PodcastResource.class);
        register(PodcastLegacyResource.class);

        // register filters
        register(RequestContextFilter.class);
        register(LoggingResponseFilter.class);
        register(CORSResponseFilter.class);

        // register exception mappers
        register(GenericExceptionMapper.class);
        register(AppExceptionMapper.class);
        register(NotFoundExceptionMapper.class);

        // register features
        register(JacksonFeature.class);
        register(MultiPartFeature.class);
    }
}

Please note:

- org.glassfish.jersey.server.spring.scope.RequestContextFilter, which is a Spring filter that provides a bridge between JAX-RS and Spring request attributes
- org.codingpedia.demo.rest.resource.PodcastsResource, which is the "facade" component that exposes the REST API via annotations and will be thoroughly presented later in the post
- org.glassfish.jersey.jackson.JacksonFeature, which is a feature that registers Jackson JSON providers; you need it for the application to understand JSON data

2.2.2. Spring application context configuration

The Spring application context configuration is located in the classpath under spring/applicationContext.xml:

Spring application context configuration

<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:context="http://www.springframework.org/schema/context"
    xmlns:tx="http://www.springframework.org/schema/tx"
    xsi:schemaLocation="http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans.xsd
        http://www.springframework.org/schema/tx
        http://www.springframework.org/schema/tx/spring-tx.xsd
        http://www.springframework.org/schema/context
        http://www.springframework.org/schema/context/spring-context.xsd">

    <context:component-scan base-package="org.codingpedia.demo.rest.*" />

    <!-- ************ JPA configuration *********** -->
    <tx:annotation-driven transaction-manager="transactionManager" />
    <bean id="transactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
        <property name="entityManagerFactory" ref="entityManagerFactory" />
    </bean>
    <bean id="transactionManagerLegacy" class="org.springframework.orm.jpa.JpaTransactionManager">
        <property name="entityManagerFactory" ref="entityManagerFactoryLegacy" />
    </bean>
    <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="persistenceXmlLocation" value="classpath:config/persistence-demo.xml" />
        <property name="persistenceUnitName" value="demoRestPersistence" />
        <property name="dataSource" ref="restDemoDS" />
        <property name="packagesToScan" value="org.codingpedia.demo.*" />
        <property name="jpaVendorAdapter">
            <bean class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter">
                <property name="showSql" value="true" />
                <property name="databasePlatform" value="org.hibernate.dialect.MySQLDialect" />
            </bean>
        </property>
    </bean>
    <bean id="entityManagerFactoryLegacy" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="persistenceXmlLocation" value="classpath:config/persistence-demo.xml" />
        <property name="persistenceUnitName" value="demoRestPersistenceLegacy" />
        <property name="dataSource" ref="restDemoLegacyDS" />
        <property name="packagesToScan" value="org.codingpedia.demo.*" />
        <property name="jpaVendorAdapter">
            <bean class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter">
                <property name="showSql" value="true" />
                <property name="databasePlatform" value="org.hibernate.dialect.MySQLDialect" />
            </bean>
        </property>
    </bean>

    <bean id="podcastDao" class="org.codingpedia.demo.rest.dao.PodcastDaoJPA2Impl"/>
    <bean id="podcastService" class="org.codingpedia.demo.rest.service.PodcastServiceDbAccessImpl" />
    <bean id="podcastsResource" class="org.codingpedia.demo.rest.resource.PodcastsResource" />
    <bean id="podcastLegacyResource" class="org.codingpedia.demo.rest.resource.PodcastLegacyResource" />

    <bean id="restDemoDS" class="org.springframework.jndi.JndiObjectFactoryBean" scope="singleton">
        <property name="jndiName" value="java:comp/env/jdbc/restDemoDB" />
        <property name="resourceRef" value="true" />
    </bean>
    <bean id="restDemoLegacyDS" class="org.springframework.jndi.JndiObjectFactoryBean" scope="singleton">
        <property name="jndiName" value="java:comp/env/jdbc/restDemoLegacyDB" />
        <property name="resourceRef" value="true" />
    </bean>
</beans>

Nothing special here; it just defines the beans that are needed throughout the demo application (e.g. podcastsResource, which is the entry point class for our REST API).

3. The REST API (design & implementation)

3.1. Resources

3.1.1. Design

As mentioned earlier, the demo application manages podcasts, which represent the resource in our REST API. Resources are the central concept in REST and are characterized by two main things:

- each is referenced with a global identifier (e.g. a URI in HTTP)
- each has one or more representations that are exposed to the outer world and can be manipulated (we'll be working mostly with JSON representations in this example)

Resources are usually represented in REST by nouns (podcasts, customers, users, accounts etc.) and not verbs (getPodcast, deleteUser etc.). The endpoints used throughout the tutorial are:

- /podcasts (notice the plural) – URI identifying a resource representing a collection of podcasts
- /podcasts/{id} – URI identifying a podcast resource by the podcast's id

3.1.2. Implementation

For the sake of simplicity, a podcast will have only the following properties:

- id – uniquely identifies the podcast
- feed – URL feed of the podcast
- title – title of the podcast
- linkOnPodcastpedia – where you can find the podcast on Podcastpedia.org
- description – a short description of the podcast

I could have used only one Java class for the representation of the podcast resource in the code, but in that case the class and its properties/methods would have gotten cluttered with both JPA and XML/JAXB/JSON annotations. I wanted to avoid that, so I used two representations which have pretty much the same properties instead:

- PodcastEntity.java – JPA-annotated class used in the DB and business layers
- Podcast.java – JAXB/JSON-annotated class used in the facade and business layers

Note: I am still trying to convince myself that this is the better approach, so if you have a suggestion on this please leave a comment.
The Podcast.java class looks something like the following:

Podcast.java

package org.codingpedia.demo.rest.resource;

//imports omitted for brevity

/**
 * Podcast resource placeholder for json/xml representation
 *
 * @author ama
 */
@SuppressWarnings("restriction")
@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
public class Podcast implements Serializable {

    private static final long serialVersionUID = -8039686696076337053L;

    /** id of the podcast */
    @XmlElement(name = "id")
    private Long id;

    /** title of the podcast */
    @XmlElement(name = "title")
    private String title;

    /** link of the podcast on Podcastpedia.org */
    @XmlElement(name = "linkOnPodcastpedia")
    private String linkOnPodcastpedia;

    /** url of the feed */
    @XmlElement(name = "feed")
    private String feed;

    /** description of the podcast */
    @XmlElement(name = "description")
    private String description;

    /** insertion date in the database */
    @XmlElement(name = "insertionDate")
    @XmlJavaTypeAdapter(DateISO8601Adapter.class)
    @PodcastDetailedView
    private Date insertionDate;

    public Podcast(PodcastEntity podcastEntity) {
        try {
            BeanUtils.copyProperties(this, podcastEntity);
        } catch (IllegalAccessException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (InvocationTargetException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }

    public Podcast(String title, String linkOnPodcastpedia, String feed,
            String description) {
        this.title = title;
        this.linkOnPodcastpedia = linkOnPodcastpedia;
        this.feed = feed;
        this.description = description;
    }

    public Podcast() {}

    //getters and setters not shown for brevity
}

It translates into the following JSON representation, which is actually the de facto media type used with REST nowadays:

{
    "id": 1,
    "title": "Quarks & Co - zum Mitnehmen-modified",
    "linkOnPodcastpedia": "http://www.podcastpedia.org/podcasts/1/Quarks-Co-zum-Mitnehmen",
    "feed": "http://podcast.wdr.de/quarks.xml",
    "description": "Quarks & Co: Das Wissenschaftsmagazin",
    "insertionDate": "2014-05-30T10:26:12.00+0200"
}

Even though JSON is becoming more and more the preferred representation in REST APIs, you shouldn't neglect the XML representation, as most systems still use XML for communication with other parties. The good thing is that in Jersey you can kill two birds with one stone: with JAXB beans (as used above) you will be able to use the same Java model to generate JSON as well as XML representations. Another advantage is the simplicity of working with such a model and the availability of the API in the Java SE platform. Note: Most of the methods defined in this tutorial will also produce and consume the application/xml media type, with application/json being the preferred one.

3.2. Methods

Before I present the API, let me tell you that

- Create = POST
- Read = GET
- Update = PUT
- Delete = DELETE

is not a strict 1:1 mapping. Why? Because you can also use PUT for creation and POST for update. This will be explained and demonstrated in the coming paragraphs. Note: For Read and Delete it is pretty clear; they indeed map one-to-one to the GET and DELETE HTTP operations.
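For orientation, here is how the operations map onto concrete requests for this tutorial's endpoints (illustrative request lines only; host and context path assume the local Jetty deployment used later in this post):

POST   /demo-rest-jersey-spring/podcasts     (create a new podcast)
GET    /demo-rest-jersey-spring/podcasts/1   (read the podcast with id 1)
PUT    /demo-rest-jersey-spring/podcasts/1   (create or fully update podcast 1)
POST   /demo-rest-jersey-spring/podcasts/1   (partially update podcast 1)
DELETE /demo-rest-jersey-spring/podcasts/1   (delete podcast 1)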
Anyway, REST is an architectural style, not a specification, and you should adapt the architecture to your needs. But if you want to make your API public and have somebody willing to use it, you should follow some "best practices". As already mentioned, the PodcastResource class is the one handling all the REST requests:

package org.codingpedia.demo.rest.resource;

//imports ......................

@Component
@Path("/podcasts")
public class PodcastResource {

    @Autowired
    private PodcastService podcastService;

    .....................
}

Notice the @Path("/podcasts") before the class definition: everything related to podcast resources will occur under this path. The @Path annotation's value is a relative URI path. In the example above, the Java class will be hosted at the URI path /podcasts. The PodcastService interface exposes the business logic to the REST facade layer. Code alert: You can find the entire content of the class on GitHub – PodcastResource.java. We'll go through the file step by step and explain the different methods corresponding to the different operations.

3.2.1. Create podcast(s)

3.2.1.1. Design

While the best-known way to create a resource is via POST, as mentioned before I could use both the POST and PUT methods, and I did just that:

Description                                  URI              HTTP method   HTTP status response
Add new podcast                              /podcasts/       POST          201 Created
Add new podcast (all values must be sent)    /podcasts/{id}   PUT           201 Created

The big difference between using POST (not idempotent),

"The POST method is used to request that the origin server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI in the Request-Line [...] If a resource has been created on the origin server, the response SHOULD be 201 (Created) and contain an entity which describes the status of the request and refers to the new resource, and a Location header" [1]

and PUT (idempotent),

"The PUT method requests that the enclosed entity be stored under the supplied Request-URI [...] If the Request-URI does not point to an existing resource, and that URI is capable of being defined as a new resource by the requesting user agent, the origin server can create the resource with that URI. If a new resource is created, the origin server MUST inform the user agent via the 201 (Created) response." [1]

is that for PUT you should know beforehand the location where the resource will be created and send all the possible values of the entry.

3.2.1.2. Implementation
3.2.1.2.1. Create a single resource with POST

Create a single podcast resource from JSON

/**
 * Adds a new resource (podcast) from the given json format (at least title
 * and feed elements are required at the DB level)
 *
 * @param podcast
 * @return
 * @throws AppException
 */
@POST
@Consumes({ MediaType.APPLICATION_JSON })
@Produces({ MediaType.TEXT_HTML })
public Response createPodcast(Podcast podcast) throws AppException {
    Long createPodcastId = podcastService.createPodcast(podcast);
    return Response.status(Response.Status.CREATED) // 201
            .entity("A new podcast has been created")
            .header("Location",
                    "http://localhost:8888/demo-rest-jersey-spring/podcasts/"
                            + String.valueOf(createPodcastId)).build();
}

Annotations

- @POST – indicates that the method responds to HTTP POST requests
- @Consumes({MediaType.APPLICATION_JSON}) – defines the media type the method accepts, in this case "application/json"
- @Produces({MediaType.TEXT_HTML}) – defines the media type the method can produce, in this case "text/html"

Response

- on success: a text/html document, with an HTTP status of 201 Created and a Location header specifying where the resource has been created
- on error:
  - 400 Bad Request if not enough data is provided
  - 409 Conflict if the server determines that a podcast with the same feed already exists

3.2.1.2.2. Create a single resource ("podcast") with PUT

This will be treated in the Update podcast section below.

3.2.1.2.3. Bonus – create a single resource ("podcast") from a form

Create a single podcast resource from form

/**
 * Adds a new podcast (resource) from "form" (at least title and feed
 * elements are required at the DB level)
 *
 * @param title
 * @param linkOnPodcastpedia
 * @param feed
 * @param description
 * @return
 * @throws AppException
 */
@POST
@Consumes({ MediaType.APPLICATION_FORM_URLENCODED })
@Produces({ MediaType.TEXT_HTML })
@Transactional
public Response createPodcastFromApplicationFormURLencoded(
        @FormParam("title") String title,
        @FormParam("linkOnPodcastpedia") String linkOnPodcastpedia,
        @FormParam("feed") String feed,
        @FormParam("description") String description) throws AppException {

    Podcast podcast = new Podcast(title, linkOnPodcastpedia, feed, description);
    Long createPodcastid = podcastService.createPodcast(podcast);

    return Response
            .status(Response.Status.CREATED) // 201
            .entity("A new podcast/resource has been created at /demo-rest-jersey-spring/podcasts/"
                    + createPodcastid)
            .header("Location",
                    "http://localhost:8888/demo-rest-jersey-spring/podcasts/"
                            + String.valueOf(createPodcastid)).build();
}

Annotations

- @POST – indicates that the method responds to HTTP POST requests
- @Consumes({MediaType.APPLICATION_FORM_URLENCODED}) – defines the media type the method accepts, in this case "application/x-www-form-urlencoded"
- @FormParam – present before the input parameters of the method, this annotation binds the value(s) of a form parameter contained within a request entity body to a resource method parameter; values are URL-decoded unless this is disabled using the @Encoded annotation
- @Produces({MediaType.TEXT_HTML}) – defines the media type the method can produce, in this case "text/html"; the response will be an HTML document with a status of 201, indicating to the caller that the request has been fulfilled and resulted in a new resource being created

Response

- on success: a text/html document, with an HTTP status of 201 Created and a Location header specifying where the resource has been created
- on error:
  - 400 Bad Request if not enough data is provided
  - 409 Conflict if the server determines that a podcast with the same feed already exists
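To make the create operation concrete, an illustrative request/response exchange against a local deployment might look like this (headers trimmed; the payload values are taken from the sample data shown earlier, and the id in the Location header is made up):

POST http://localhost:8888/demo-rest-jersey-spring/podcasts/ HTTP/1.1
Content-Type: application/json
Accept: text/html

{
    "title": "Quarks & Co - zum Mitnehmen",
    "feed": "http://podcast.wdr.de/quarks.xml"
}

HTTP/1.1 201 Created
Location: http://localhost:8888/demo-rest-jersey-spring/podcasts/1

A new podcast has been created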
3.2.2. Read podcast(s)

3.2.2.1. Design

The API supports two read operations:

- return a collection of podcasts
- return a podcast identified by id

Description                          URI                                                                     HTTP method   HTTP status response
Return all podcasts                  /podcasts/?orderByInsertionDate={ASC|DESC}&numberDaysToLookBack={val}   GET           200 OK
Return the podcast identified by id  /podcasts/{id}                                                          GET           200 OK

Notice the query parameters for the collection resource, orderByInsertionDate and numberDaysToLookBack. It makes perfect sense to add filters as query parameters in the URI rather than making them part of the path.

3.2.2.2. Implementation

3.2.2.2.1. Read all podcasts ("/")

Read all resources

/**
 * Returns all resources (podcasts) from the database
 *
 * @return
 * @throws IOException
 * @throws JsonMappingException
 * @throws JsonGenerationException
 * @throws AppException
 */
@GET
@Produces({ MediaType.APPLICATION_JSON, MediaType.APPLICATION_XML })
public List<Podcast> getPodcasts(
        @QueryParam("orderByInsertionDate") String orderByInsertionDate,
        @QueryParam("numberDaysToLookBack") Integer numberDaysToLookBack)
        throws JsonGenerationException, JsonMappingException, IOException,
        AppException {
    List<Podcast> podcasts = podcastService.getPodcasts(
            orderByInsertionDate, numberDaysToLookBack);
    return podcasts;
}

Annotations

- @GET – indicates that the method responds to HTTP GET requests
- @Produces({MediaType.APPLICATION_JSON, MediaType.APPLICATION_XML}) – defines the media types the method can produce, in this case either "application/json" or "application/xml" (you need the @XmlRootElement in front of the Podcast class); the response will be a list of podcasts in either JSON or XML format

Response

- the list of podcasts from the database, with an HTTP status of 200 OK

3.2.2.2.2. Read one podcast

Read one resource by id

@GET
@Path("{id}")
@Produces({ MediaType.APPLICATION_JSON, MediaType.APPLICATION_XML })
public Response getPodcastById(@PathParam("id") Long id)
        throws JsonGenerationException, JsonMappingException, IOException,
        AppException {
    Podcast podcastById = podcastService.getPodcastById(id);
    return Response.status(200).entity(podcastById)
            .header("Access-Control-Allow-Headers", "X-extra-header")
            .allow("OPTIONS").build();
}

Annotations

- @GET – indicates that the method responds to HTTP GET requests
- @Path("{id}") – identifies the URI path that the class method will serve requests for; the "id" value is an embedded variable making the URI a path template, used in combination with the @PathParam variable
- @PathParam("id") – binds the value of a URI template parameter ("id") to the resource method parameter; the value is URL-decoded unless this is disabled using the @Encoded annotation; a default value can be specified using the @DefaultValue annotation
- @Produces({MediaType.APPLICATION_JSON, MediaType.APPLICATION_XML}) – defines the media types the method can produce, in this case "application/json" or "application/xml" (you need the @XmlRootElement in front of the Podcast class)

Response

- on success: the requested podcast with a 200 OK HTTP status; the format is either XML or JSON, depending on the Accept header value sent by the client (application/xml or application/json)
- on error: 404 Not Found if the podcast with the given id does not exist in the database
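As an illustration of the content negotiation just described, these two requests hit the same resource but yield the two different representations (sample request lines only, assuming the local deployment used throughout this tutorial):

GET http://localhost:8888/demo-rest-jersey-spring/podcasts/1 HTTP/1.1
Accept: application/json

GET http://localhost:8888/demo-rest-jersey-spring/podcasts/1 HTTP/1.1
Accept: application/xml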
3.2.3. Update podcast

3.2.3.1. Design

Description                   URI              HTTP method   HTTP status response
Update podcast (fully)        /podcasts/{id}   PUT           200 OK
Update podcast (partially)    /podcasts/{id}   POST          200 OK

In the REST arena you will be doing two kinds of updates:

- full updates – where you provide all the properties
- partial updates – where only some properties are sent over the wire for update

For full updates it's pretty clear: you can use the PUT method and you conform to the method's specification in RFC 2616. For partial updates there is a bunch of proposals/debate on what to use:

- via PUT
- via POST
- via PATCH

Let me tell you why I consider the first option (PUT) a no-go. Well, according to the specification, "If the Request-URI refers to an already existing resource, the enclosed entity SHOULD be considered as a modified version of the one residing on the origin server." [1] If I wanted to update just the title property of the podcast with id 2:

PUT command for partial update

PUT http://localhost:8888/demo-rest-jersey-spring/podcasts/2 HTTP/1.1
Accept-Encoding: gzip,deflate
Content-Type: application/json
Content-Length: 155
Host: localhost:8888
Connection: Keep-Alive
User-Agent: Apache-HttpClient/4.1.1 (java 1.5)

{
    "title":"New Title"
}

then, according to the specification, the resource "stored" at the location should have only id and title, and that was clearly not my intent. The second option, via POST... well, we can "abuse" this one, and that is exactly what I did in the implementation, but it does not seem conformant to me, because the spec for POST states: "The posted entity is subordinate to that URI in the same way that a file is subordinate to a directory containing it, a news article is subordinate to a newsgroup to which it is posted, or a record is subordinate to a database." [1] That does not look like a partial update case to me... The third option is to use PATCH, and I guess this is the main reason the method came to life: "Several applications extending the Hypertext Transfer Protocol (HTTP) require a feature to do partial resource modification. The existing HTTP PUT method only allows a complete replacement of a document. This proposal adds a new HTTP method, PATCH, to modify an existing HTTP resource." [2] I am pretty sure PATCH will be used for partial updates in the future, but since it is not yet part of the specification and not yet implemented in Jersey, I chose the second option with POST for this demo. If you really want to implement partial updates in Java with PATCH, check out this post – Transparent PATCH support in JAX-RS 2.0.

3.2.3.2. Implementation
3.2.3.2.1. Full update

Create or fully update resource implementation method

@PUT
@Path("{id}")
@Consumes({ MediaType.APPLICATION_JSON })
@Produces({ MediaType.TEXT_HTML })
public Response putPodcastById(@PathParam("id") Long id, Podcast podcast)
        throws AppException {

    Podcast podcastById = podcastService.verifyPodcastExistenceById(id);

    if (podcastById == null) {
        // resource not existent yet, and should be created under the
        // specified URI
        Long createPodcastId = podcastService.createPodcast(podcast);
        return Response
                .status(Response.Status.CREATED) // 201
                .entity("A new podcast has been created AT THE LOCATION you specified")
                .header("Location",
                        "http://localhost:8888/demo-rest-jersey-spring/podcasts/"
                                + String.valueOf(createPodcastId)).build();
    } else {
        // resource is existent and a full update should occur
        podcastService.updateFullyPodcast(podcast);
        return Response
                .status(Response.Status.OK) // 200
                .entity("The podcast you specified has been fully updated AT THE LOCATION you specified")
                .header("Location",
                        "http://localhost:8888/demo-rest-jersey-spring/podcasts/"
                                + String.valueOf(id)).build();
    }
}

Annotations

- @PUT – indicates that the method responds to HTTP PUT requests
- @Path("{id}") – identifies the URI path that the class method will serve requests for; the "id" value is an embedded variable making the URI a path template, used in combination with the @PathParam variable
- @PathParam("id") – binds the value of a URI template parameter ("id") to the resource method parameter; the value is URL-decoded unless this is disabled using the @Encoded annotation; a default value can be specified using the @DefaultValue annotation
- @Consumes({MediaType.APPLICATION_JSON}) – defines the media type the method accepts, in this case "application/json"
- @Produces({MediaType.TEXT_HTML}) – defines the media type the method can produce, in this case "text/html"; the response will be an HTML document containing different messages and statuses depending on the action taken

Response

- on creation:
  - on success: 201 Created and, in the Location header, the specified location where the resource was created
  - on error: 400 Bad Request if the minimum required properties are not provided for insertion
- on full update:
  - on success: 200 OK
  - on error: 400 Bad Request if not all properties are provided

3.2.3.2.2. Partial update

Partial update

//PARTIAL update
@POST
@Path("{id}")
@Consumes({ MediaType.APPLICATION_JSON })
@Produces({ MediaType.TEXT_HTML })
public Response partialUpdatePodcast(@PathParam("id") Long id,
        Podcast podcast) throws AppException {
    podcast.setId(id);
    podcastService.updatePartiallyPodcast(podcast);
    return Response.status(Response.Status.OK) // 200
            .entity("The podcast you specified has been successfully updated")
            .build();
}

Annotations

- @POST – indicates that the method responds to HTTP POST requests
- @Path("{id}") – identifies the URI path that the class method will serve requests for; the "id" value is an embedded variable making the URI a path template, used in combination with the @PathParam variable
- @PathParam("id") – binds the value of a URI template parameter ("id") to the resource method parameter; the value is URL-decoded unless this is disabled using the @Encoded annotation; a default value can be specified using the @DefaultValue annotation
- @Consumes({MediaType.APPLICATION_JSON}) – defines the media type the method accepts, in this case "application/json"
- @Produces({MediaType.TEXT_HTML}) – defines the media type the method can produce, in this case "text/html"

Response

- on success: 200 OK
- on error: 404 Not Found if there is no resource available anymore at the provided location

3.2.4. Delete podcast

3.2.4.1. Design

Description                                      URI              HTTP method   HTTP status response
Removes all podcasts                             /podcasts/       DELETE        204 No Content
Removes the podcast at the specified location    /podcasts/{id}   DELETE        204 No Content

3.2.4.2. Implementation

3.2.4.2.1. Delete all resources

Delete all resources

@DELETE
@Produces({ MediaType.TEXT_HTML })
public Response deletePodcasts() {
    podcastService.deletePodcasts();
    return Response.status(Response.Status.NO_CONTENT) // 204
            .entity("All podcasts have been successfully removed").build();
}

Annotations

- @DELETE – indicates that the method responds to HTTP DELETE requests
- @Produces({MediaType.TEXT_HTML}) – defines the media type the method can produce, in this case "text/html"

Response

- a text/html document with a status of 204 No Content, indicating to the caller that the request has been fulfilled

3.2.4.2.2. Delete one resource

Delete one resource

@DELETE
@Path("{id}")
@Produces({ MediaType.TEXT_HTML })
public Response deletePodcastById(@PathParam("id") Long id) {
    podcastService.deletePodcastById(id);
    return Response.status(Response.Status.NO_CONTENT) // 204
            .entity("Podcast successfully removed from database").build();
}

Annotations

- @DELETE – indicates that the method responds to HTTP DELETE requests
- @Path("{id}") – identifies the URI path that the class method will serve requests for; the "id" value is an embedded variable making the URI a path template, used in combination with the @PathParam variable
- @PathParam("id") – binds the value of a URI template parameter ("id") to the resource method parameter; the value is URL-decoded unless this is disabled using the @Encoded annotation; a default value can be specified using the @DefaultValue annotation
- @Produces({MediaType.TEXT_HTML}) – defines the media type the method can produce, in this case "text/html"

Response

- on success: if the podcast is removed, a 204 No Content success status is returned
- on error: if the podcast is not available anymore, a status of 404 Not Found is returned

4. Logging

Every request's path and the response's entity will be logged when the logging level is set to DEBUG. It is developed as wrapper-like, AOP-style functionality with the help of Jersey filters. See my post How to log in Spring with SLF4J and Logback for more details on the matter.

5. Exception handling

In case of errors, I decided to respond with a unified error message structure. Here's an example of how an error response might look:

Example – error message response

{
    "status": 400,
    "code": 400,
    "message": "Provided data not sufficient for insertion",
    "link": "http://www.codingpedia.org/ama/tutorial-rest-api-design-and-implementation-with-jersey-and-spring",
    "developerMessage": "Please verify that the feed is properly generated/set"
}

Note: Stay tuned, because a following post will present more details about error handling in REST with Jersey.

6. Add CORS support on the server side

I extended the capabilities of the API developed for this tutorial to support Cross-Origin Resource Sharing (CORS) on the server side.
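A CORSResponseFilter of the kind registered in the ResourceConfig earlier might, in its simplest form, look like the following sketch (a minimal illustration, not the exact filter from the project; the allowed methods and headers here are assumptions):

import java.io.IOException;
import javax.ws.rs.container.ContainerRequestContext;
import javax.ws.rs.container.ContainerResponseContext;
import javax.ws.rs.container.ContainerResponseFilter;

public class CORSResponseFilter implements ContainerResponseFilter {

    @Override
    public void filter(ContainerRequestContext requestContext,
            ContainerResponseContext responseContext) throws IOException {
        // add CORS headers to every response so that browsers
        // allow cross-origin calls to the API
        responseContext.getHeaders().add("Access-Control-Allow-Origin", "*");
        responseContext.getHeaders().add("Access-Control-Allow-Methods",
                "GET, POST, PUT, DELETE, OPTIONS");
        responseContext.getHeaders().add("Access-Control-Allow-Headers",
                "Content-Type, X-extra-header");
    }
}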
Please see my post How to add CORS support on the server side in Java with Jersey for more details on the matter.

7. Testing

7.1. Integration tests in Java

To test the application I will use the Jersey Client and execute requests against a running Jetty server with the application deployed on it. For that I will use the Maven Failsafe Plugin.

7.1.1. Configuration

7.1.1.1. Jersey client dependency

To build a Jersey client, the jersey-client jar is required on the classpath. With Maven you can add it as a dependency to the pom.xml file:

Jersey Client Maven dependency

<dependency>
    <groupId>org.glassfish.jersey.core</groupId>
    <artifactId>jersey-client</artifactId>
    <version>${jersey.version}</version>
    <scope>test</scope>
</dependency>

7.1.1.2. Failsafe plugin

The Failsafe Plugin is used during the integration-test and verify phases of the build lifecycle to execute the integration tests of the application. The Failsafe Plugin will not fail the build during the integration-test phase, thus enabling the post-integration-test phase to execute. To use the Failsafe Plugin, you need to add the following configuration to your pom.xml:

Maven Failsafe Plugin configuration

<plugins>
    [...]
    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-failsafe-plugin</artifactId>
        <version>2.16</version>
        <executions>
            <execution>
                <id>integration-test</id>
                <goals>
                    <goal>integration-test</goal>
                </goals>
            </execution>
            <execution>
                <id>verify</id>
                <goals>
                    <goal>verify</goal>
                </goals>
            </execution>
        </executions>
    </plugin>
    [...]
</plugins>

7.1.1.3. Jetty Maven Plugin

The integration tests will be executed against a running Jetty server that is started only for the execution of the tests. For that you have to configure the following executions in the jetty-maven-plugin:

Jetty Maven Plugin configuration for integration tests

<plugins>
    <plugin>
        <groupId>org.eclipse.jetty</groupId>
        <artifactId>jetty-maven-plugin</artifactId>
        <version>${jetty.version}</version>
        <configuration>
            <jettyConfig>${project.basedir}/src/main/resources/config/jetty9.xml</jettyConfig>
            <stopKey>STOP</stopKey>
            <stopPort>9999</stopPort>
            <stopWait>5</stopWait>
            <scanIntervalSeconds>5</scanIntervalSeconds>
            [...]
        </configuration>
        <executions>
            <execution>
                <id>start-jetty</id>
                <phase>pre-integration-test</phase>
                <goals>
                    <!-- stop any previous instance to free up the port -->
                    <goal>stop</goal>
                    <goal>run-exploded</goal>
                </goals>
                <configuration>
                    <scanIntervalSeconds>0</scanIntervalSeconds>
                    <daemon>true</daemon>
                </configuration>
            </execution>
            <execution>
                <id>stop-jetty</id>
                <phase>post-integration-test</phase>
                <goals>
                    <goal>stop</goal>
                </goals>
            </execution>
        </executions>
    </plugin>
    [...]
</plugins>

Note: In the pre-integration-test phase the Jetty server will be started, after stopping any running instance to free up the port, and in the post-integration-test phase it will be stopped. The scanIntervalSeconds has to be set to 0, and daemon to true.

Code alert: Find the complete pom.xml file on GitHub

7.1.2. Build the integration tests

I am using JUnit as the testing framework. By default, the Failsafe Plugin will automatically include all test classes with the following wildcard patterns:

"**/IT*.java" – includes all of its subdirectories and all Java filenames that start with "IT".
"**/*IT.java" – includes all of its subdirectories and all Java filenames that end with "IT".
"**/*ITCase.java" – includes all of its subdirectories and all java filenames that end with “ITCase”.I have created a single test class – RestDemoServiceIT – that will test the read (GET) methods, but the procedure should be the same for all the other: public class RestDemoServiceIT {[....] @Test public void testGetPodcast() throws JsonGenerationException, JsonMappingException, IOException {ClientConfig clientConfig = new ClientConfig(); clientConfig.register(JacksonFeature.class);Client client = ClientBuilder.newClient(clientConfig);WebTarget webTarget = client .target("http://localhost:8888/demo-rest-jersey-spring/podcasts/2");Builder request = webTarget.request(MediaType.APPLICATION_JSON);Response response = request.get(); Assert.assertTrue(response.getStatus() == 200);Podcast podcast = response.readEntity(Podcast.class);ObjectMapper mapper = new ObjectMapper(); System.out .print("Received podcast from database *************************** " + mapper.writerWithDefaultPrettyPrinter() .writeValueAsString(podcast));} } Note:I had to register the JacksonFeature for the client too so that I can marshall the podcast response in JSON format – response.readEntity(Podcast.class) I am testing against a running Jetty on port 8888 – I will show you in the next section how to start Jetty on a desired port I am expecting a 200 status for my request With the help org.codehaus.jackson.map.ObjectMapper I am displaying the JSON response pretty formatted7.1.3. Running the integration tests The Failsafe Plugin can be invoked by calling the verify phase of the build lifecycle. Maven command to invoke the integration tests mvn verify To start jetty on port 8888 you need to set the jetty.port property to 8888. In Eclipse I use the following configuration:  7.2. Integration tests with SoapUI Recently I’ve rediscovered SoapUI after using it heavily for testing SOAP based web services. With the recent versions (at the time of writing latest is 5.0.0) it offers pretty good functionality to test REST based web services, and coming versions should improve on this. So unless you develop your own framework/infrastructure to test REST services, why not give it a try to SoapUI. I did, I was satisfied with the results so far and I’ve decided to do a video tutorial, that you can now find on YouTube on our channel:8. Versioning There are three major possibilitiesURL:  “/v1/podcasts/{id}” Accept/Content-type header: application/json; version=1Because I am a developer and not a RESTafarian yet I would do the URL option. All I would have to do on the implementation side for this example, would be to modify the @Path‘s value annotation on the PodcastResource class from to Versioning in the path @Component @Path("/v1/podcasts") public class PodcastResource {...} Of course on a production application, you wouldn’t want every resource class preprefixed with the version number,  you’d want the version somehow treated through a filter in a AOP manner. Maybe something like this will come in a following post… Here are some great resources from people that understand better the matter:[Video] REST+JSON API Design – Best Practices for Developers Your API versioning is wrong, which is why I decided to do it 3 different wrong ways by @troyhunt Versioning REST Services Best practices for API versioning? – interesting discussion on Stackoverflow9. Summary Well, that’s it. 
9. Summary

Well, that's it. I have to congratulate you if you've made it this far, and I hope you could learn something from this tutorial about REST – designing a REST API, implementing a REST API in Java, testing a REST API, and much more. If you did, I'd be very grateful if you helped it spread by leaving a comment or sharing it on Twitter, Google+ or Facebook. Thank you! Don't forget also to check out Podcastpedia.org – you're sure to find interesting podcasts and episodes there. We are grateful for your support. If you liked this article, we would really appreciate a small contribution for our work! Donate now with Paypal.

10. Resources

10.1. Source code
GitHub – Codingpedia/demo-rest-jersey-spring (instructions on how to install and run the project)

10.2. Web resources
HTTP – Hypertext Transfer Protocol — HTTP/1.1 – RFC 2616
RFC 5789 – PATCH Method for HTTP
Jersey User Guide
HTTP Status Code Definitions
REST – http://en.wikipedia.org/wiki/Representational_State_Transfer
CRUD – http://en.wikipedia.org/wiki/Create,_read,_update_and_delete
Java API for RESTful Services (JAX-RS)
Jersey – RESTful Web Services in Java
HTTP PUT, PATCH or POST – Partial updates or full replacement?
Transparent PATCH support in JAX-RS 2.0
Maven Failsafe Plugin
Maven Failsafe Plugin Usage
SoapUI 5.0 released today!
SoapUI – Using Script Assertions
[Video] REST+JSON API Design – Best Practices for Developers
[Video] RESTful API Design – Second Edition
Law of Demeter

10.3. Codingpedia related resources
Java Persistence Example with Spring, JPA2 and Hibernate
http://www.codingpedia.org/ama/spring-mybatis-integration-example/
http://www.codingpedia.org/ama/tomcat-jdbc-connection-pool-configuration-for-production-and-development/
http://www.codingpedia.org/ama/error-when-executing-jettyrun-with-jetty-maven-plugin-version-9-java-lang-unsupportedclassversionerror-unsupported-major-minor-version-51-0/
http://www.codingpedia.org/ama/autocomplete-search-box-with-jquery-and-spring-mvc/

Reference: Tutorial – REST API design and implementation in Java with Jersey and Spring from our JCG partner Adrian Matei at the Codingpedia.org blog....
java-logo

Why I distrust wildcards and why we need them anyway

In any programming language that combines subtype polymorphism (object orientation) with parametric polymorphism (generics), the question of variance arises. Suppose I have a list of strings, of type List<String>. Can I pass that to a function which accepts List<Object>? Let's start with this definition:

interface List<T> {
    void add(T element);
    Iterator<T> iterator();
    ...
}

Broken covariance

Intuitively, we might at first think that this should be allowed. This looks OK:

void iterate(List<Object> list) {
    Iterator<Object> it = list.iterator();
    ...
}
iterate(ArrayList<String>());

Indeed, certain languages, including Eiffel and Dart, do accept this code. Sadly, it's unsound, as can be seen in the following example:

//Eiffel/Dart-like language with
//broken covariance:
void put(List<Object> list) {
    list.add(10);
}
put(ArrayList<String>());

Here we pass a List<String> to a function accepting List<Object>, which attempts to add an Integer to the list. Java makes this same mistake with arrays. The following code compiles:

//Java:
void put(Object[] list) {
    list[0]=10;
}
put(new String[1]);

It fails at runtime with an ArrayStoreException.

Use-site variance

Java takes a different approach, however, for generic class and interface types. By default, a class or interface type is invariant, which is to say that:

L<U> is assignable to L<V> if and only if U is exactly the same type as V.

Since this is extremely inconvenient much of the time, Java supports something called use-site variance, where:

L<U> is assignable to L<? extends V> if U is a subtype of V, and
L<U> is assignable to L<? super V> if U is a supertype of V.

The ugly syntax ? extends V or ? super V is called a wildcard. We also say that:

L<? extends V> is covariant in V, and that
L<? super V> is contravariant in V.

Since Java's wildcard notation is so ugly, we're not going to use it anymore in this discussion. Instead, we'll write wildcards using the keywords in and out for contravariance and covariance respectively. Thus:

L<out V> is covariant in V, and
L<in V> is contravariant in V.

A given V is called the bound of the wildcard:

out V is an upper-bounded wildcard, and V is its upper bound, and
in V is a lower-bounded wildcard, and V is its lower bound.

In theory, we could have a wildcard with both an upper and lower bound, for example, L<out X in Y>. We can express multiple upper bounds or multiple lower bounds using an intersection type, for example, L<out U&V> or L<in U&V>. Note that the type expressions L<out Anything> and L<in Nothing> refer to exactly the same type, and this type is a supertype of all instantiations of L. You'll often see people refer to wildcarded types as existential types. What they mean by this is that if I know that list is of type List<out Object>:

List<out Object> list;

Then I know that there exists an unknown type T, a subtype of Object, such that list is of type List<T>. Alternatively, we can take a more Ceylonic point of view, and say that List<out Object> is the union of all types List<T> where T is a subtype of Object. In a system with use-site variance, the following code does not compile:

void iterate(List<Object> list) {
    Iterator<Object> it = list.iterator();
    ...
}
iterate(ArrayList<String>()); //error: List<String> not a List<Object>

But this code does:

void iterate(List<out Object> list) {
    Iterator<out Object> it = list.iterator();
    ...
}
iterate(ArrayList<String>());
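For comparison (and since many readers will come from Java), here are the same two functions written in Java's own wildcard notation, with java.util.List standing in for the List interface above – a plain-Java sketch:

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class UseSiteVarianceInJava {

    // Reading through a covariant ("out") view is allowed...
    static void iterate(List<? extends Object> list) {
        Iterator<? extends Object> it = list.iterator();
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }

    // ...but writing is rejected, exactly as in the examples above:
    static void put(List<? extends Object> list) {
        // list.add(10); // compile error: nothing (except null) may be added
    }

    public static void main(String[] args) {
        iterate(new ArrayList<String>()); // fine: a List<String> is a List<? extends Object>
    }
}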
Correctly, this code does not compile:

void put(List<out Object> list) {
    list.add(10); //error: Integer is not a Nothing
}
put(ArrayList<String>());

Now we're at the entrance to the rabbit hole. In order to integrate wildcarded types into the type system, while rejecting unsound code like the above example, we need a much more complicated algorithm for type argument substitution.

Member typing in use-site variance

That is, when we have a generic type like List<T>, with a method void add(T element), instead of just straightforwardly substituting Object for T, like we do with ordinary invariant types, we need to consider the variance of the location in which the type parameter occurs. In this case, T occurs in a contravariant location of the type List, namely, as the type of a method parameter. The complicated algorithm, which I won't write down here, tells us that we should substitute Nothing, the bottom type, in this location. Now imagine that our List interface has a partition() method with this signature:

interface List<T> {
    List<List<T>> partition(Integer length);
    ...
}

What is the return type of partition() for a List<out Y>? Well, without losing precision, it is:

List<in List<in Y out Nothing> out List<in Nothing out Y>>

Ouch. Since nobody in their right mind wants to have to think about types like this, a sensible language would throw away some of those bounds, leaving something like this:

List<out List<out Y>>

Which is vaguely acceptable. Sadly, even in this very simple case, we're already well beyond the point where the programmer can easily follow along with what the typechecker is doing. So here's the essence of why I distrust use-site variance:

A strong principle in the design of Ceylon is that the programmer should always be able to reproduce the reasoning of the compiler. It is very difficult to reason about some of the complex types that arise with use-site variance.
It has a viral effect: once those wildcard types get a foothold in the code, they start to propagate, and it's quite hard to get back to my ordinary invariant types.

Declaration-site variance

A much saner alternative to use-site variance is declaration-site variance, where we specify the variance of a generic type when we declare it. This is the system we use in Ceylon. Under this system, we need to split List into three interfaces:

interface List<out T> {
    Iterator<T> iterator();
    List<List<T>> partition(Integer length);
    ...
}

interface ListMutator<in T> {
    void add(T element);
}

interface MutableList<T> satisfies List<T>&ListMutator<T> {}

List is declared to be a covariant type, ListMutator a contravariant type, and MutableList an invariant subtype of both. It might seem that the requirement for multiple interfaces is a big disadvantage of declaration-site variance, but it often turns out to be useful to separate mutation from read operations, and:

mutating operations are very often invariant, whereas
read operations are very often covariant.

Now we can write our functions like this:

void iterate(List<Object> list) {
    Iterator<Object> it = list.iterator();
    ...
}
iterate(ArrayList<String>());

void put(ListMutator<Integer> list) {
    list.add(10);
}
put(ArrayList<String>()); //error: List<String> is not a ListMutator<Integer>

You can read more about declaration-site variance here.

Why we need use-site variance in Ceylon

Sadly, Java doesn't have declaration-site variance, and clean interoperation with Java is something that is very important to us.
I don't like adding a major feature to the type system of our language purely for the purposes of interoperation with Java, and so I've resisted adding wildcards to Ceylon for years. In the end, reality and practicality won, and my stubbornness lost. So Ceylon 1.1 now features use-site variance with single-bounded wildcards. I've tried to keep this feature as tightly constrained as possible, with just the minimum required for decent Java interop. That means that, like in Java:

there are no double-bounded wildcards, of the form List<in X out Y>, and
a wildcarded type cannot occur in the extends or satisfies clause of a class or interface definition.

Furthermore, unlike Java:

there are no implicitly-bounded wildcards; upper bounds must always be written in explicitly, and
there is no support for wildcard capture.

Wildcard capture is a very clever feature of Java, which makes use of the "existential" interpretation of a wildcard type. Given a generic function like this one:

List<T> unmodifiableList<T>(List<T> list) => ... ;

Java would let me call unmodifiableList(), passing a wildcarded type like List<out Object>, returning another wildcarded List<out Object>, reasoning that there is some unknown X, a subtype of Object, for which the invocation would be well-typed. That is, this code is considered well-typed, even though the type List<out Object> is not assignable to List<T> for any T:

List<out Object> objects = .... ;
List<out Object> unmodifiable = unmodifiableList(objects);

In Java, typing errors involving wildcard capture are almost impossible to understand, since they involve the unknown, and undenotable, type. I have no plans to add support for wildcard capture to Ceylon.

Try it out

Use-site variance is already implemented and already works in Ceylon 1.1, which you can get from GitHub, if you're super-motivated. Even though the main motivation for this feature was great Java interop, there will be other, hopefully rare, occasions where wildcards will be useful. That doesn't, however, indicate any significant shift in our approach. We will continue using declaration-site variance in the Ceylon SDK except in extreme cases.

UPDATE: I just realized I forgot to say thanks to Ross Tate for helping me with the finer points of the member typing algorithm for use-site variance. Very tricky stuff that Ross knows off the top of his head!

Reference: Why I distrust wildcards and why we need them anyway from our JCG partner Gavin King at the Ceylon Team blog....
jboss-wildfly-logo

HawtIO on JBoss Wildfly 8.1

HawtIO gives awesome eye candy to your JVM-based middleware. It's a unifying console for applications that would otherwise have to build out their own crappy web console; and let's be honest, they're all built differently, with differing technology, different UX, and all around a terrible way to try to manage middleware in QA/PROD environments… I can hear the operations folks with the "amen brotha".

So HawtIO is a nice solution to this problem. It's open source, Apache 2.0 licensed, and has a great community behind it. Written using AngularJS and a nice plugin architecture, you can extend it to your heart's content for your own personal applications. You may have noticed that it's also the awesome console for Fabric8, which is the open DevOps platform for JVM middleware — it makes managing your deployments, configuration, versioning, discovery, load balancing, etc. easier for your middleware. But what options do you have for using HawtIO today? Many! HawtIO is really just a web application that runs in a JVM. So here are your options:

Deploy it as a WAR in your favorite servlet container (Tomcat, Jetty, JBoss Wildfly/EAP)
Deploy it standalone as an executable Java application
Use the HawtIO Chrome extension to plug into your apps directly from your browser

Take a look at the Getting Started page for more details about using HawtIO deployed in the different configurations. HawtIO has excellent plugins for configuring, managing and visualizing Apache ActiveMQ brokers, Apache Camel routes, Apache Karaf OSGi bundles/services/config, and a lot more, like Tomcat, Wildfly, Jetty, ElasticSearch, jclouds, etc., etc. For example, to manage ActiveMQ brokers, take a look at my buddy Dejan's blog post. As we at Red Hat roll out JBoss Fuse and JBoss Fuse Service Works, we're getting better at integrating the individual components. For example, a Fuse Service Works subscription gives you full access to A-MQ, Fuse, and all of its components, including HawtIO. Unfortunately, HawtIO isn't "officially" supported in EAP as of today, but that will be fixed in upcoming releases. It's not a limitation of the technology; it's just that there's so much there, and Red Hat has stringent testing/compatibility requirements, so we need to have all of the testing/certification done before we "support it" fully. BUT… there's really no reason not to use it anyway (at least in development and QA), while we wait for support. And there are lots of people already doing that. Just remember, it's not officially supported yet!

So the rest of this blog is a step-by-step guide with best practices for getting HawtIO deployed and secured on your JBoss Wildfly 8.1 application server. The next entry (Part II) will show the same for the JBoss EAP 6.2 distribution. I will use HawtIO 1.4.11 (the latest release from the community) for this guide.

Getting Started

First of all, the assumption is that you know where to download Wildfly 8.1. But to get started here, we will want to get the latest HawtIO distribution (1.4.11 at the time of this writing). We will be using hawtio-default-1.4.11.war to be specific. Once you've downloaded the distro, consider this next step:

1. Remove the log4j.properties file

We will want to remove the log4j.properties file that comes with the distro, because we will want to use JBoss Wildfly's built-in logging facility, which automatically plugs into the log4j logs that HawtIO writes to. If we didn't remove the log4j.properties, we'd want to turn off per-deployment logging instead.
But since it's not that difficult, let's just remove the log4j.properties (NOTE: see the Wildfly documentation on its logging subsystem to get more information about the flexibility of the logging configuration):

ceposta@postamachat(renamed) $ ll
total 50936
-rw-r--r--@ 1 ceposta staff 25M Jul 25 14:00 hawtio-default-1.4.11.war

ceposta@postamachat(renamed) $ unzip -l hawtio-default-1.4.11.war | grep log4j.properties
1268 07-13-14 17:23 WEB-INF/classes/log4j.properties

ceposta@postamachat(renamed) $ zip -d hawtio-default-1.4.11.war WEB-INF/classes/log4j.properties
deleting: WEB-INF/classes/log4j.properties

2. Rename the distro

We will want to rename the distro to make it easier to go to the webapp once it's deployed. Note, this is not a mandatory step, but a nicety that makes it easy to use:

ceposta@postamachat(renamed) $ mv hawtio-default-1.4.11.war hawtio.war

Now when we deploy the WAR file, we'll be able to hit the context like this: http://localhost:8080/hawtio instead of having to worry about the version number.

3. Relax the CDI subsystem

HawtIO does use some CDI annotations (@Inject, for example) but by default doesn't include a beans.xml file. Wildfly 8.1 does not like this by default, per the CDI 1.1 spec, which introduces implicit bean archives. We can tell Wildfly to ignore this webapp as a CDI app, since it doesn't have the beans.xml included, and we can effectively disable implicit bean archives. To do this, edit your configuration file (we'll use standalone.xml, but if using domain mode, edit the appropriate config files for that):

....
    <subsystem xmlns="urn:jboss:domain:weld:2.0" require-bean-descriptor="true"/>
</profile>

4. Purposefully disable security

We want to make sure the webapp deployed correctly and that you can access all of the HawtIO goodies. So we'll temporarily disable security on the webapp so we can access it. To do this, add this section after the <extensions/> section:

<system-properties>
    <property name="hawtio.authenticationEnabled" value="false" />
</system-properties>

We will restore security in a later section.

5. Deploy HawtIO

Now you're ready to deploy HawtIO! If you've just freshly unpacked the Wildfly distro, you'll want to add some users to your Management and Application realms:

ceposta@postamachat(wildfly-8.1.0.Final) $ ./bin/add-user.sh

What type of user do you wish to add?
a) Management User (mgmt-users.properties)
b) Application User (application-users.properties)
(a):

Enter the details of the new user to add.
Using realm 'ManagementRealm' as discovered from the existing property files.
Username : admin
The username 'admin' is easy to guess
Are you sure you want to add user 'admin' yes/no? yes
Password recommendations are listed below. To modify these restrictions edit the add-user.properties configuration file.
- The password should not be one of the following restricted values {root, admin, administrator}
- The password should contain at least 8 characters, 1 alphabetic character(s), 1 digit(s), 1 non-alphanumeric symbol(s)
- The password should be different from the username
Password :
Re-enter Password :
What groups do you want this user to belong to? (Please enter a comma separated list, or leave blank for none)[ ]: admin
About to add user 'admin' for realm 'ManagementRealm'
Is this correct yes/no?
yes
Added user 'admin' to file '/Users/ceposta/dev/eap/wildfly-8.1.0.Final/standalone/configuration/mgmt-users.properties'
Added user 'admin' to file '/Users/ceposta/dev/eap/wildfly-8.1.0.Final/domain/configuration/mgmt-users.properties'
Added user 'admin' with groups admin to file '/Users/ceposta/dev/eap/wildfly-8.1.0.Final/standalone/configuration/mgmt-groups.properties'
Added user 'admin' with groups admin to file '/Users/ceposta/dev/eap/wildfly-8.1.0.Final/domain/configuration/mgmt-groups.properties'
Is this new user going to be used for one AS process to connect to another AS process?
e.g. for a slave host controller connecting to the master or for a Remoting connection for server to server EJB calls.
yes/no? no

You can now start up Wildfly and deploy HawtIO! Fire up Wildfly:

ceposta@postamachat(wildfly-8.1.0.Final) $ ./bin/standalone.sh

And navigate to the web console. Use the username and password you set up in the add-user section above to gain access to the web console, which you reach by navigating to http://localhost:9990/. Now, click on the Runtime tab and then Manage Deployments. Click "Add", and navigate to where you downloaded, and renamed, the HawtIO distro. Once you've added it, you should click the "Enable" button to enable it. You should have a screen that looks like this:

6. Use HawtIO!

Now you should be able to go to http://localhost:8080/hawtio and start using HawtIO!

NOTE: There seem to be some issues with security/login being respected in Safari on a Mac – it keeps prompting for a username/password. Just try Chrome or another web browser.

7. Set up Security

So in an enterprise situation, we'll want to secure HawtIO, regardless of whether it's a dev or QA environment. To do this, we'll want to tie into Wildfly's security subsystem. First, let's start by stopping Wildfly and editing the standalone configuration file again. In the same spot where we disabled security, let's re-enable it and add a couple more options. Your <system-properties> section should look like this:

<system-properties>
    <property name="hawtio.authenticationEnabled" value="true" />
    <property name="hawtio.realm" value="jboss-web-policy" />
    <property name="hawtio.role" value="admin" />
</system-properties>

Awesome! Now let's add a user to be able to log in. We'll again use ./bin/add-user.sh for this guide, but most likely in your environments you use more sophisticated security mechanisms (database, LDAP, etc.) than the properties files that are used by default. But nevertheless, let's add a new user to the ApplicationRealm:

ceposta@postamachat(wildfly-8.1.0.Final) $ ./bin/add-user.sh

What type of user do you wish to add?
a) Management User (mgmt-users.properties)
b) Application User (application-users.properties)
(a): b

Enter the details of the new user to add.
Using realm 'ApplicationRealm' as discovered from the existing property files.
Username : ceposta
Password recommendations are listed below. To modify these restrictions edit the add-user.properties configuration file.
- The password should not be one of the following restricted values {root, admin, administrator}
- The password should contain at least 8 characters, 1 alphabetic character(s), 1 digit(s), 1 non-alphanumeric symbol(s)
- The password should be different from the username
Password :
Re-enter Password :
What groups do you want this user to belong to? (Please enter a comma separated list, or leave blank for none)[ ]: admin
About to add user 'ceposta' for realm 'ApplicationRealm'
Is this correct yes/no?
yes
Added user 'ceposta' to file '/Users/ceposta/dev/eap/wildfly-8.1.0.Final/standalone/configuration/application-users.properties'
Added user 'ceposta' to file '/Users/ceposta/dev/eap/wildfly-8.1.0.Final/domain/configuration/application-users.properties'
Added user 'ceposta' with groups admin to file '/Users/ceposta/dev/eap/wildfly-8.1.0.Final/standalone/configuration/application-roles.properties'
Added user 'ceposta' with groups admin to file '/Users/ceposta/dev/eap/wildfly-8.1.0.Final/domain/configuration/application-roles.properties'
Is this new user going to be used for one AS process to connect to another AS process?
e.g. for a slave host controller connecting to the master or for a Remoting connection for server to server EJB calls.
yes/no? no

Now let's start up the app server again:

ceposta@postamachat(wildfly-8.1.0.Final) $ ./bin/standalone.sh

When we navigate to the http://localhost:8080/hawtio endpoint again, we should be greeted with a login page.

What about EAP?

There you have it! You have HawtIO running and secured on Wildfly! You can now check out all the awesome things you can do with HawtIO, especially what you can do with managing, debugging, tracing, profiling, and monitoring Apache Camel routes. But what about doing the same on JBoss EAP? Stay tuned for the next part… I'll show you exactly how to do that!

Reference: HawtIO on JBoss Wildfly 8.1 from our JCG partner Christian Posta at the Christian Posta – Software Blog blog....
software-development-2-logo

Top 10 Very Very VERY Important Topics to Discuss

Some things are just very very very VERY very important. Such as John Cleese. The same is true for Whitespace: Yes. 1080 Reddit Karma points (so urgently needed!) in only 23 hours. That's several orders of magnitude better than anything our – what we wrongfully thought to be – very deep and interesting technical insights about Java and SQL have ever produced. The topic of interest was a humorous treatise about whether this:

for (int i=0; i<LENGTH; i++)

… or this:

for (int i = 0; i < LENGTH; i++)

… should be preferred. Obviously both options are completely wrong. The right answer is:

for ( int i = 0 ; i < LENGTH ; i++ )

Read the full treatise here. But at some point, the whitespace discussion is getting stale. We need new very very very important topics to discuss instead of fixing them bugs. After all, the weekend is imminent, and we don't know what else to talk about. This is why we are now publishing…

Top 10 Very Very VERY Important Topics to Discuss

Here we go…

0. Whitespace

OK, that was a no-brainer. We've already had that. Want to participate? The very interesting Reddit discussion is still hot.

1. The Vietnam of Computer Science

In case you haven't heard of this highly interesting discussion, there are some people who believe that ORMs are outdated, because ORMs don't work as promised. And they're totally right. And the best thing is, all the others are totally right as well. Why is that great? Because that means we get to discuss it. Endlessly! While everyone keeps talking about ORMs like that, no one cares what Gavin King (creator of Hibernate) had said from the beginning. Why should we care about his opinion? We have our own, far superior opinion! Let's have another discussion about why ORMs are evil!

2. Case-sensitivity

Unfortunately, us Java folks cannot have any of those very very very very very important discussions about casing, because unfortunately, Java is a case-sensitive language. But take SQL (or PL/SQL, T-SQL for that sake). When writing SQL, we can have awesome discussions about whether we should:

-- Upper case it all
SELECT TAB.COL1, TAB.COL2 FROM TAB

-- Upper case keywords, lower case identifiers
SELECT tab.col1, tab.col2 FROM tab

-- Lower case keywords, upper case identifiers
select TAB.COL1, TAB.COL2 from TAB

-- Lower case it all
select tab.col1, tab.col2 from tab

-- Add some PascalCase (awesome SQL Server!)
SELECT Tab.Col1, Tab.Col2 FROM Tab

-- Mix case-sensitivity with case-insensitivity
-- (Protip to piss off your coworkers: Name your
-- table "FROM" or "INTO" and let them figure out
-- how to query that table)
SELECT TAB."COL1", TAB."col2" FROM "FROM"

-- PascalCase keywords (wow, MS Access)
Select TAB.COL1, TAB.COL2 From TAB

Now that is really incredibly interesting. And because this is so interesting and important, you can only imagine the number of interesting discussions we've had on the jOOQ User Group, for instance, about how to best generate meta data from the database. With jOOQ, we promise that you can extend these enticing discussions from the SQL area to the Java area by overriding the code generator's default behaviour:

Should classes be PascalCased and literals be UPPER_CASED?
Should everything be PascalCased and camelCased as in Java?
Should everything be generated as named in the database?

Endless interesting discussions! We have so many options for SQL casing, which brings us to 3.
SQL formatting

Unlike C-style general-purpose languages such as C, Java, Scala, C#, or even keyword-heavy ones like Delphi, Pascal, and Ada, SQL offers one more awesome grounds for numerous discussions. It is not only keyword-heavy, but it also has a very complex and highly irregular syntax. So we're lucky enough to get to choose (after long discussions and settlements) between:

-- All on one line. Don't tell me about print margins,
-- Or I'll telefax you my SQL!
SELECT table1.col1, table1.col2 FROM table1 JOIN table2 ON table1.id = table2.id WHERE id IN (SELECT x FROM other_table)

-- "Main" keywords on new line
SELECT table1.col1, table1.col2
FROM table1 JOIN table2 ON table1.id = table2.id
WHERE id IN (SELECT x FROM other_table)

-- (almost) all keywords on new line
SELECT table1.col1, table1.col2
FROM table1
JOIN table2
ON table1.id = table2.id
WHERE id
IN (SELECT x FROM other_table)

-- "Main" keywords on new line, others indented
SELECT table1.col1, table1.col2
FROM table1
  JOIN table2 ON table1.id = table2.id
WHERE id IN (
  SELECT x FROM other_table
)

-- "Main" keywords on new line, others heavily indented
SELECT table1.col1, table1.col2
  FROM table1
  JOIN table2 ON table1.id = table2.id
 WHERE id IN (SELECT x FROM other_table)

-- Doge formatting
SUCH table1.col1, table1.col2
MUCH table1
JOIN table2 WOW table1.id = table2.id
WHICH id IN (
  SUCH x
  WOW other_table
)

And so on and so forth. Now any project manager should reserve at least 10 man-weeks in every project to agree on rules about SQL formatting.

4. The end of the DBA

Now THAT is a very interesting topic that is not only interesting for developers who are so knowledgeable about productive systems, no, it's also very interesting for operations teams. Because as we all know, the DBA is dead (again). For those of you who have been missing out on this highly interesting topic, do know that all of this started (again) when the great NoSQL vs. SQL debate was initiated by brilliant minds and vendors of truly alternative systems. Which are now starting to implement SQL, because apparently, well… SQL isn't all that bad:

History of NoSQL according to @markmadsen #strataconf pic.twitter.com/XHXMJsXHjV
— Edd Dumbill (@edd) November 12, 2013

Please, do engage in some more discussions about the best and only true way to tackle database problems. Because your opinion counts!

5. New lines and comments

Remember our own blog post about putting some keywords on new lines? Yes, we prefer:

// If this
if (something) {
    ...
}

// Else something else
else {
    ...
}

Exactly. Because this allows comments to be written where they belong: next to the appropriate keyword, and always aligned at the same column. This leads us to the next very interesting question: Should we put comments in code at all? Or is clean code self-documenting? And we say, why yes, of course we should comment. How on earth will anyone ever remember the rationale behind something like this??

// [#2744] DB2 knows the SELECT .. FROM FINAL
// TABLE (INSERT ..) syntax
case DB2:

// Firebird and Postgres can execute the INSERT
// .. RETURNING clause like a select clause. JDBC
// support is not implemented in the Postgres JDBC
// driver
case FIREBIRD:
case POSTGRES: {
    try {
        listener.executeStart(ctx);
        rs = ctx.statement().executeQuery();
        listener.executeEnd(ctx);
    }
    finally {
        consumeWarnings(ctx, listener);
    }

    break;
}

Taken from our "hacking JDBC" page.

6. JSON is totally better than XML

Of course it is! Because… because… errr. Because it allows me to structure data hierarchically. Waaaait a second…

I love JSON.
It's giving people opportunities to recreate the dumbest ideas of the XML era but with curly braces instead of angle brackets.
— Tom Morris (@tommorris) July 21, 2014

Dayum. You're saying JSON and XML are the SAME THING!? But MongoDB and PostgreSQL allow me to store JSON. Oh wait. They tried to store XML in databases, back in the 90s, too!? And it failed? Well, of course it failed, because XML sucks, right? (Which is essentially another way of saying that I've never understood XSLT or XQuery or XPath, or didn't even hear about XProc, and I'm just ranting about angle brackets and namespaces.) Let's further discuss this matter. I feel that we're close to the very ultimate solution on that topic. Speaking of JSON…

7. Curly braces

OMG! This is the most interesting of all topics. Should we put the opening brace:

On the same line?
On a NEW line??
NO BRACE AT ALL???

The right answers are 1) and 3). 1) only if we absolutely have to, as in try or switch statements. We're not paid by the number of lines of code, so we don't add meaningless lines with only opening braces. And if we can omit the braces entirely, fine. Here's an awesome statement, if you ask me:

if (something)
  outer:
  for (String thing : list)
    inner:
    do
      if (somethingElse)
        break inner;
      else
        continue outer;
    while (true);

That ought to teach them juniors not to touch my code. Which brings us to:

8. Labels

Nothing wrong with them. I'll break out of my loops any time I want. Don't tell me labels are Java's way of saying GOTO; they're much more sophisticated than that. (Besides, goto is a reserved word in Java, and it is an actual bytecode instruction). So I'll happily do my jumping forward:

label: {
  // do stuff
  if (check) break label;
  // do more stuff
}

Or my jumping backward:

label: do {
  // do stuff
  if (check) continue label;
  // do more stuff
  break label;
} while(true);

(observe how the above example used two spaces for indentation instead of four (or tabs). Another great topic for further discussion)

9. emacs vs. vim vs. vi vs. Eclipse vs. IntelliJ vs. Netbeans

Can we please, PLEASE, have another very interesting discussion about which one of these is better? Please!

10. Last but not Least: Is Haskell better than [your language]?

According to TIOBE, Haskell ranks 38. And as we all know, the actual market share (absolutely none, in the case of Haskell) of any programming language is inversely proportional to the amount of time spent on Reddit discussing the importance of said language, and how said language is totally superior to the one ranking 1-2 above on TIOBE, for instance. Which would be Lua. So, I would love to invite you to join our blogging friends below in a very very interesting discussion about…

Daniel Lyons – Smalltalk, Haskell and Lisp
John Wiegley – Hello Haskell, Goodbye Lisp
J Cooper – Haskell, Lisp, and verbosity
Maybe we can iterate over … flatMap that s***!
Robert Smith – The Perils of Lisp in Production
Manuel Simon – Why Lisp is a Big Hack (And Haskell is Doomed to Succeed)

Now, of course, we could enlarge the debate and compare functional programming with OO programming in general, before delving into why Scala is NOT a functional programming language, let alone Java 8. Oh, and you think your dialect of Haskell or Lisp is not good enough, so you should roll your own? Go ahead (right after checking this checklist!) Such great topics. So little time.
Conclusion

The great thing about these social networks like Reddit, Hackernews, and all the others is the fact that we can finally spend all day discussing really really interesting topics instead of fixing them boring bugs our boss wants us to fix. After all, this is IMPORTANT. Or as Randall Munroe would say: "Duty calls!"

Further reading

If you're now all hot and ready to discuss things, please consider also reading these very interesting and insightful articles on how to best format and style code:

Tom Wijsman – Should curly braces appear on their own line?
Michael Breen – Rational Java Indentation and Style
Daniel W. Dyer – No, your code is not so great that it doesn't need comments
Adam Davis – What is self-documenting code and can it replace well documented code?
Carlo Pescio – Your coding conventions are hurting you
Jason Lengstorf – Why You're a Bad PHP Programmer
Richard Rodger – Why I Have Given Up on Coding Standards
Are `break` and `continue` bad programming practices?
Douglas Crockford – Code Conventions for the JavaScript Programming Language

Or add your own. There's still much much important writing to do!

Reference: Top 10 Very Very VERY Important Topics to Discuss from our JCG partner Lukas Eder at the JAVA, SQL, AND JOOQ blog....
software-development-2-logo

Code is NOT poetry, it is just code

One can read in the footer of WordPress.org: Code is Poetry. This is quite a bold statement, and whoever believes in that slogan might be led to think that some extraordinary code must reside in the WordPress repositories. I took the time to look at a random "quote" of the WordPress poem, i.e. a single line of code:

$mode = ( empty( $_REQUEST['mode'] ) ) ? 'list' : $_REQUEST['mode']; // – WordPress.org

If William Shakespeare had become a coder, he might have written documentation like this:

We know what we are, but know not what we may be. – Shakespeare

I love Shakespeare for his ability to create emotions, dreams, visions and even imaginary problems by using the right words at the right time. The common programmer, on the other hand, tries to solve a specific problem, and to do this using as few expressions as possible. We consider code "elegant" when it's done like that, and when you can easily read and maintain it. With all my love for Shakespeare, I definitely cannot say I understand him without any problems. I often have to read his texts multiple times. Here are some more observations:

Programmers shouldn't use the 32,000 words Shakespeare used [1], and they shouldn't even dare to try to include the other 35,000 words Shakespeare knew but never felt the need to use.
Programmers shouldn't try to create emotions in their audience. If your colleague got emotional about your code, you most likely have messed something up.
Programmers don't need to write code that sounds nice when you read it aloud.
Programming is about solving a problem on time and within budget. Most poems don't have a budget, and never solve their problems (see: Edgar Allan Poe, though at least it looks like he wasn't suffering too much from his problems).
Programmers need to write code which can be maintained by others. Some poets would throw an (empty) bottle of wine at you if you tried to "maintain" their work.
Programmers shouldn't care about philosophical problems. Poets are allowed, and asked, to do exactly that.
Programmers need to express something as directly as possible. Many good poems are a matter of interpretation.

While we are at interpretation: poetic interpretation is also very different from how the PHP interpreter, for example, does its job. Interpretation in poetry is based on your own mind, your own thinking and your own beliefs. Try to analyse this very good poem by William Blake:

Tyger! Tyger! burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

You can read one analysis here. On the other hand, interpret this:

prnt_r)('Hello World!');

Sure, sure, you can infer that the programmer suffers from dyslexia, didn't test his code, or that his keyboard was broken. But aside from that, there is nothing more in this line of code. If this code worked, it would have only a single thing to tell: print a message to the screen. Why? No idea. Read the specs for a detailed and accurate interpretation, or ask your customer to interpret it for you. There is nothing else to find in this line of code. If there is… we usually consider it a bug. Code is code, but definitely not poetry. It's also not music. Nor is it painting. It's code. At best, you could say it has something to do with math. But code and poems have different goals. So why should one call his code poetry?
Honestly, coding actually has a creative aspect: you have a problem, and you need to craft something which solves it using just your mind and a limited vocabulary. This isn't work that many can do. Some people say "everybody can code", but that isn't true. This is a very specific way to think, and it is ignorant to say everybody is able to do that. If a person tried and failed, the phrase would mean: "everybody except you can code". It sounds as if the person were not clever enough. Saying things like that means ignoring the fact that people think differently: some people have more physical competence, and others can think in functions and objects. For example, I am not exactly athletic, and I am also not good with statistics. Still, people consider me a good programmer, but well, I fail miserably at other things. Being able to code and craft things by mind allows some people to think they are somehow special; that their power to express commands in a clever way is art, or in the case of WordPress: "poetry". If somebody on my team were that proud of their code, all my alarm bells would go off. Would this person actually be able to accept improvements from colleagues? Or would he act like E. A. Poe and throw a wine bottle? Is the "coder-artist" able to improve his piece of code if he believes it is art? Can he actually write code which solves the problem, or does he need to find the "golden way"? That is the way which is most likely the best solution for a musical problem, as my Jazz friend told me when he explained how he composes music. As coders, we don't always need the golden way. Developers have a constraint which is usually missing in art: time. My Jazz friend has been working for 20 years on his first album. That's OK in art; he just needs to earn money a different way. But try telling your customer that a shop feature will take 20 years until you find the best code for solving the problem. Time changes things. It changes how we think. It changes our tools. It changes our needs. When Blake describes the Tyger, he described the Tyger in one moment of his time. Maybe some kind of period. But there is no need to write a Tyger 2.0. Blake wrote the Tyger, and when he finished, he finished. The Tyger was there, and will not change. Our software will change. We will, and should, change. For that reason we cannot consider code as art, because it doesn't survive time in any case. We can call ourselves skillful. But seriously, would you call yourself a fantastic developer? There are quite a few problems which come with that, as I wrote in "The Zen Programmer". Why should you call your code "art" or "poetry"? It must change over time. And this is the only art to master as a developer: to accept that things change, and to adopt these changes without being attached to them.

[1]: http://kottke.org/10/04/how-many-words-did-shakespeare-know

[Go back]

The Zen Programmer

This book will teach you that there are more than just emails, phone calls, and urgent issues. What Zen teachers told us hundreds of years ago is still true today: we can say "No" and have our lives in our own hands. Zen is not only for famous corporate leaders like Steve Jobs. It is for you. It is not for weekends. You can practice Zen at any time, even right now, this second. Programmers are wanted people. But a lot of us are caught up in social networks, phone calls, and people who just get on our nerves.
We believe we need to take every job we can get just because somebody told us we will end up poor and alone if we don't. By the end of the day we don't achieve our goals. We try to relax on weekends, but our phones ring with something urgent. In the end we are lost in chaos, day after day, and it's almost impossible to find our way out.

Get the book now!...
java-logo

Testing code for excessively large inputs

When writing unit tests we mostly focus on business correctness. We do our best to exercise the happy path and all edge cases. We sometimes microbenchmark and measure throughput. But one aspect that is often missed is how our code behaves when the input is excessively large. We test how we handle normal input files, malformed files, empty files, missing files… but what about insanely large input files? Let's start with a real-life use case. You were given a task to implement GPX (GPS Exchange Format, basically XML) to JSON transformation. I chose GPX for no particular reason; it's just another XML format that you might have come across, e.g. when recording your hike or bicycle ride with a GPS receiver. Also, I thought it would be nice to use some standard rather than yet another "people database" in XML. Inside a GPX file there are hundreds of flat <wpt/> entries, each one representing one point in space-time:

<gpx>
    <wpt lat="42.438878" lon="-71.119277">
        <ele>44.586548</ele>
        <time>2001-11-28T21:05:28Z</time>
        <name>5066</name>
        <desc><![CDATA[5066]]></desc>
        <sym>Crossing</sym>
        <type><![CDATA[Crossing]]></type>
    </wpt>
    <wpt lat="42.439227" lon="-71.119689">
        <ele>57.607200</ele>
        <time>2001-06-02T03:26:55Z</time>
        <name>5067</name>
        <desc><![CDATA[5067]]></desc>
        <sym>Dot</sym>
        <type><![CDATA[Intersection]]></type>
    </wpt>
    <!-- ...more... -->
</gpx>

Full example: www.topografix.com/fells_loop.gpx. Our task is to extract each individual <wpt/> element, discard those without lat or lon attributes, and store back JSON in the following format:

[
  {"lat": 42.438878, "lon": -71.119277},
  {"lat": 42.439227, "lon": -71.119689}
  ...more...
]

That's easy! First of all I started with generating JAXB classes using the xjc utility from the JDK and the GPX 1.0 XSD schema. Please note that GPX 1.1 is the most recent version as of this writing, but the examples I got use 1.0. For JSON marshalling I used Jackson.
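For reference, the xjc-generated classes are plain JAXB beans. Trimmed down to the fields used in this article, Gpx looks approximately like this (an approximation on my part – the real generated code is considerably more verbose):

import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.annotation.*;

// Approximate shape of the xjc-generated class, reduced to what's used below.
@XmlRootElement(name = "gpx")
@XmlAccessorType(XmlAccessType.FIELD)
public class Gpx {

    @XmlElement(name = "wpt")
    protected List<Wpt> wpt = new ArrayList<>();

    public List<Wpt> getWpt() { return wpt; }

    @XmlAccessorType(XmlAccessType.FIELD)
    public static class Wpt {
        @XmlAttribute protected BigDecimal lat;
        @XmlAttribute protected BigDecimal lon;

        public BigDecimal getLat() { return lat; }
        public void setLat(BigDecimal lat) { this.lat = lat; }
        public BigDecimal getLon() { return lon; }
        public void setLon(BigDecimal lon) { this.lon = lon; }
    }
}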
The complete, working and tested program looks like this:

import java.io.File;
import java.io.IOException;
import java.io.StringReader;
import java.util.List;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Unmarshaller;

import org.apache.commons.io.FileUtils;
import org.codehaus.jackson.map.ObjectMapper;

import com.topografix.gpx._1._0.Gpx;

import static java.nio.charset.StandardCharsets.UTF_8;
import static java.util.stream.Collectors.toList;

public class GpxTransformation {

    private final ObjectMapper jsonMapper = new ObjectMapper();
    private final JAXBContext jaxbContext;

    public GpxTransformation() throws JAXBException {
        jaxbContext = JAXBContext.newInstance("com.topografix.gpx._1._0");
    }

    public void transform(File inputFile, File outputFile) throws JAXBException, IOException {
        final List<Gpx.Wpt> waypoints = loadWaypoints(inputFile);
        final List<LatLong> coordinates = toCoordinates(waypoints);
        dumpJson(coordinates, outputFile);
    }

    private List<Gpx.Wpt> loadWaypoints(File inputFile) throws JAXBException, IOException {
        String xmlContents = FileUtils.readFileToString(inputFile, UTF_8);
        final Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();
        final Gpx gpx = (Gpx) unmarshaller.unmarshal(new StringReader(xmlContents));
        return gpx.getWpt();
    }

    private static List<LatLong> toCoordinates(List<Gpx.Wpt> waypoints) {
        return waypoints
                .stream()
                .filter(wpt -> wpt.getLat() != null)
                .filter(wpt -> wpt.getLon() != null)
                .map(LatLong::new)
                .collect(toList());
    }

    private void dumpJson(List<LatLong> coordinates, File outputFile) throws IOException {
        final String resultJson = jsonMapper.writeValueAsString(coordinates);
        FileUtils.writeStringToFile(outputFile, resultJson);
    }
}

class LatLong {
    private final double lat;
    private final double lon;

    LatLong(Gpx.Wpt waypoint) {
        this.lat = waypoint.getLat().doubleValue();
        this.lon = waypoint.getLon().doubleValue();
    }

    public double getLat() { return lat; }

    public double getLon() { return lon; }
}

Looks fairly good, despite a few traps I left intentionally. We load the GPX XML file, extract the waypoints into a List, transform that list into lightweight LatLong objects, first filtering out broken waypoints. Finally we dump the List<LatLong> back to disk. However, one day an extremely long bicycle ride crashed our system with an OutOfMemoryError. Do you know what happened? The GPX file uploaded to our application was huge, much bigger than we ever expected to receive. Now look again at the implementation above and count in how many places we allocate more memory than necessary. But if you want to refactor immediately, stop right there! We want to practice TDD, right? And we want to limit the WTF/minute factor in our code? I have a theory that many "WTFs" are not caused by careless and inexperienced programmers. Often it's because of those late-Friday production issues, totally unexpected inputs and unpredicted side effects. Code gets more and more workarounds, hard-to-understand refactorings, logic more complex than one might anticipate. Sometimes bad code was not intended, but required given circumstances we have long forgotten. So if one day you see a null check that can't possibly happen, or hand-written code that could've been replaced by a library – think about the context. That being said, let's start by writing tests proving that our future refactorings are needed. If one day someone "fixes" our code, assuming "this stupid programmer" complicated things without good reason, automated tests will tell precisely why. Our test will simply try to transform insanely big input files.
But before we begin we must refactor the original implementation a bit, so that it accepts an InputStream and OutputStream rather than input and output Files – there is no reason to limit our implementation to the file system only:

Step 0a: Make it testable

import org.apache.commons.io.IOUtils;

public class GpxTransformation {

    //...

    public void transform(File inputFile, File outputFile) throws JAXBException, IOException {
        try (
                InputStream input = new BufferedInputStream(new FileInputStream(inputFile));
                OutputStream output = new BufferedOutputStream(new FileOutputStream(outputFile))) {
            transform(input, output);
        }
    }

    public void transform(InputStream input, OutputStream output) throws JAXBException, IOException {
        final List<Gpx.Wpt> waypoints = loadWaypoints(input);
        final List<LatLong> coordinates = toCoordinates(waypoints);
        dumpJson(coordinates, output);
    }

    private List<Gpx.Wpt> loadWaypoints(InputStream input) throws JAXBException, IOException {
        String xmlContents = IOUtils.toString(input, UTF_8);
        final Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();
        final Gpx gpx = (Gpx) unmarshaller.unmarshal(new StringReader(xmlContents));
        return gpx.getWpt();
    }

    //...

    private void dumpJson(List<LatLong> coordinates, OutputStream output) throws IOException {
        final String resultJson = jsonMapper.writeValueAsString(coordinates);
        output.write(resultJson.getBytes(UTF_8));
    }
}

Step 0b: Writing input (stress) test

Input will be generated from scratch using the repeat(byte[] sample, int times) utility developed earlier (a minimal stand-in is sketched right below, in case you don't have it handy). We will basically repeat the same <wpt/> item millions of times, wrapping it with a GPX header and footer so that it is well-formed.
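Here is a minimal version of that utility (a sketch, not the original post's implementation). It streams `times` repetitions of `sample` without ever materializing them in memory:

import java.io.InputStream;

class RepeatedInputStream extends InputStream {

    private final byte[] sample;
    private long remainingRepeats;
    private int pos;

    private RepeatedInputStream(byte[] sample, long times) {
        this.sample = sample;
        this.remainingRepeats = times;
    }

    static InputStream repeat(byte[] sample, long times) {
        return new RepeatedInputStream(sample, times);
    }

    @Override
    public int read() {
        if (remainingRepeats <= 0) {
            return -1; // all repetitions consumed
        }
        final int b = sample[pos++] & 0xFF;
        if (pos == sample.length) {
            pos = 0;
            remainingRepeats--;
        }
        return b;
    }
}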
Normally I would consider placing samples in src/test/resources, but I wanted this code to be self-contained. Notice that we care about neither the actual input nor the output – that is already tested. If the transformation succeeds (we can add some timeout if we want), it's OK. If it fails with any exception, most likely an OutOfMemoryError, it's a test failure (error):

import org.apache.commons.io.FileUtils
import org.apache.commons.io.output.NullOutputStream
import spock.lang.Specification
import spock.lang.Unroll

import static java.nio.charset.StandardCharsets.UTF_8
import static org.apache.commons.io.FileUtils.ONE_GB
import static org.apache.commons.io.FileUtils.ONE_KB
import static org.apache.commons.io.FileUtils.ONE_MB

@Unroll
class LargeInputSpec extends Specification {

    final GpxTransformation transformation = new GpxTransformation()

    final byte[] header = """<?xml version="1.0"?>
<gpx version="1.0" creator="ExpertGPS 1.1 - http://www.topografix.com"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns="http://www.topografix.com/GPX/1/0"
     xsi:schemaLocation="http://www.topografix.com/GPX/1/0 http://www.topografix.com/GPX/1/0/gpx.xsd">
<time>2002-02-27T17:18:33Z</time>
""".getBytes(UTF_8)

    final byte[] gpxSample = """
<wpt lat="42.438878" lon="-71.119277">
    <ele>44.586548</ele>
    <time>2001-11-28T21:05:28Z</time>
    <name>5066</name>
    <desc><![CDATA[5066]]></desc>
    <sym>Crossing</sym>
    <type><![CDATA[Crossing]]></type>
</wpt>
""".getBytes(UTF_8)

    final byte[] footer = """</gpx>""".getBytes(UTF_8)

    def "Should not fail with OOM for input of size #readableBytes"() {
        given:
        int repeats = size / gpxSample.length
        InputStream xml = withHeaderAndFooter(
                RepeatedInputStream.repeat(gpxSample, repeats))

        expect:
        transformation.transform(xml, new NullOutputStream())

        where:
        size << [ONE_KB, ONE_MB, 10 * ONE_MB, 100 * ONE_MB, ONE_GB, 8 * ONE_GB, 32 * ONE_GB]
        readableBytes = FileUtils.byteCountToDisplaySize(size)
    }

    private InputStream withHeaderAndFooter(InputStream samples) {
        InputStream withHeader = new SequenceInputStream(
                new ByteArrayInputStream(header), samples)
        return new SequenceInputStream(
                withHeader, new ByteArrayInputStream(footer))
    }
}

There are actually 7 tests here, running the GPX to JSON transformation for inputs of size: 1 KiB, 1 MiB, 10 MiB, 100 MiB, 1 GiB, 8 GiB and 32 GiB. I ran these tests on JDK 8u11 x64 with the following options: -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xmx1g. 1 GiB of memory is a lot, but it clearly can't fit the whole input file in memory. While the small tests pass, inputs at 1 GiB and above fail fast.

Step 1: Avoid keeping whole files in Strings

The stack trace reveals where the problem lies:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3326)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:569)
    at java.lang.StringBuilder.append(StringBuilder.java:190)
    at org.apache.commons.io.output.StringBuilderWriter.write(StringBuilderWriter.java:138)
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2002)
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1980)
    at org.apache.commons.io.IOUtils.copy(IOUtils.java:1957)
    at org.apache.commons.io.IOUtils.copy(IOUtils.java:1907)
    at org.apache.commons.io.IOUtils.toString(IOUtils.java:778)
    at com.nurkiewicz.gpx.GpxTransformation.loadWaypoints(GpxTransformation.java:56)
    at com.nurkiewicz.gpx.GpxTransformation.transform(GpxTransformation.java:50)

loadWaypoints eagerly loads the input GPX file into a String (see IOUtils.toString(input, UTF_8)) in order to parse it later. That's kind of dumb, especially since the JAXB Unmarshaller can easily read an InputStream directly.
Step 1: Avoid keeping whole files in Strings

The stack trace reveals where the problem lies:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3326)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:569)
    at java.lang.StringBuilder.append(StringBuilder.java:190)
    at org.apache.commons.io.output.StringBuilderWriter.write(StringBuilderWriter.java:138)
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2002)
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1980)
    at org.apache.commons.io.IOUtils.copy(IOUtils.java:1957)
    at org.apache.commons.io.IOUtils.copy(IOUtils.java:1907)
    at org.apache.commons.io.IOUtils.toString(IOUtils.java:778)
    at com.nurkiewicz.gpx.GpxTransformation.loadWaypoints(GpxTransformation.java:56)
    at com.nurkiewicz.gpx.GpxTransformation.transform(GpxTransformation.java:50)

loadWaypoints eagerly loads the input GPX file into a String (see IOUtils.toString(input, UTF_8)) only to parse it later. That's kind of dumb, especially since the JAXB Unmarshaller can easily read an InputStream directly. Let's fix it:

private List<Gpx.Wpt> loadWaypoints(InputStream input) throws JAXBException, IOException {
    final Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();
    final Gpx gpx = (Gpx) unmarshaller.unmarshal(input);
    return gpx.getWpt();
}

private void dumpJson(List<LatLong> coordinates, OutputStream output) throws IOException {
    jsonMapper.writeValue(output, coordinates);
}

Similarly we fixed dumpJson, as it was first dumping the JSON into a String and only later copying that String into the OutputStream. The results are slightly better, but the 1 GiB test fails again, this time by going into an endless loop of Full GCs and finally throwing:

java.lang.OutOfMemoryError: Java heap space
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.LeafPropertyLoader.text(LeafPropertyLoader.java:50)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext.text(UnmarshallingContext.java:527)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.SAXConnector.processText(SAXConnector.java:208)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.SAXConnector.endElement(SAXConnector.java:171)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:609)
    [...snap...]
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:649)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:243)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:214)
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:157)
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:204)
    at com.nurkiewicz.gpx.GpxTransformation.loadWaypoints(GpxTransformation.java:54)
    at com.nurkiewicz.gpx.GpxTransformation.transform(GpxTransformation.java:47)

Step 2: (Poorly) replacing JAXB with StAX

We can suspect that the main issue now is XML parsing with JAXB, which always eagerly maps the whole XML file onto Java objects. It's easy to imagine why turning a 1 GiB file into an object graph fails. We would like to somehow take more control over reading the XML and consume it in chunks. SAX was traditionally used in such circumstances; however, the push programming model of the SAX API is very inconvenient. SAX uses a callback mechanism, which is invasive and not very readable. StAX (Streaming API for XML), working on a slightly higher level, exposes a pull model instead: the client code decides when, and how much, input to consume. This gives us better control over the input and allows more flexibility.
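For contrast, here is a rough sketch of what the SAX push model forces on us (this handler is my illustration, not part of the final solution): the parser calls us back for every element, so all parsing state has to live in mutable fields of the handler.

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical SAX handler, reusing the article's Gpx.Wpt type and
// assuming a namespace-aware parser (hence localName).
class WaypointHandler extends DefaultHandler {

    final List<Gpx.Wpt> waypoints = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("wpt".equals(localName)) {
            final Gpx.Wpt wpt = new Gpx.Wpt();
            final String lat = attributes.getValue("lat");
            if (lat != null) {
                wpt.setLat(new BigDecimal(lat));
            }
            final String lon = attributes.getValue("lon");
            if (lon != null) {
                wpt.setLon(new BigDecimal(lon));
            }
            waypoints.add(wpt);
        }
    }
}

Note how the handler, not the caller, dictates the flow: we cannot simply ask for "the next waypoint", which is exactly what the pull model gives us.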
To familiarize you with the StAX API, here is code almost equivalent to loadWaypoints(), except that I skip the contents of <wpt/> which aren't needed later:

private List<Gpx.Wpt> loadWaypoints(InputStream input) throws JAXBException, IOException, XMLStreamException {
    final XMLInputFactory factory = XMLInputFactory.newInstance();
    final XMLStreamReader reader = factory.createXMLStreamReader(input);
    final List<Gpx.Wpt> waypoints = new ArrayList<>();
    while (reader.hasNext()) {
        switch (reader.next()) {
            case XMLStreamConstants.START_ELEMENT:
                if (reader.getLocalName().equals("wpt")) {
                    waypoints.add(parseWaypoint(reader));
                }
                break;
        }
    }
    return waypoints;
}

private Gpx.Wpt parseWaypoint(XMLStreamReader reader) {
    final Gpx.Wpt wpt = new Gpx.Wpt();
    final String lat = reader.getAttributeValue("", "lat");
    if (lat != null) {
        wpt.setLat(new BigDecimal(lat));
    }
    final String lon = reader.getAttributeValue("", "lon");
    if (lon != null) {
        wpt.setLon(new BigDecimal(lon));
    }
    return wpt;
}

See how we explicitly ask XMLStreamReader for more data? However, the fact that we are using a lower-level API (and a lot more code) doesn't mean it has to be better if used incorrectly. We keep building a huge waypoints list, so it's no surprise that we again see an OutOfMemoryError:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3204)
    at java.util.Arrays.copyOf(Arrays.java:3175)
    at java.util.ArrayList.grow(ArrayList.java:246)
    at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:220)
    at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:212)
    at java.util.ArrayList.add(ArrayList.java:443)
    at com.nurkiewicz.gpx.GpxTransformation.loadWaypoints(GpxTransformation.java:65)
    at com.nurkiewicz.gpx.GpxTransformation.transform(GpxTransformation.java:52)

Exactly where we anticipated. The good news is that the 1 GiB test passed (with a 1 GiB heap), so we are sort of going in the right direction. But it took a minute to complete, due to excessive GC.

Step 3: StAX implemented properly

Notice that the implementation using StAX in the previous example would be just as good with SAX. However, the reason I chose StAX is that we can now turn the XML file into an Iterator<Gpx.Wpt>. This iterator will consume the XML file in chunks, lazily and only when asked. We can later consume that iterator lazily as well, which means we no longer keep the whole file in memory. Iterators, while clumsy to work with, are still much better than working with XML directly or via SAX callbacks:

import com.google.common.collect.AbstractIterator;

private Iterator<Gpx.Wpt> loadWaypoints(InputStream input) throws JAXBException, IOException, XMLStreamException {
    final XMLInputFactory factory = XMLInputFactory.newInstance();
    final XMLStreamReader reader = factory.createXMLStreamReader(input);
    return new AbstractIterator<Gpx.Wpt>() {

        @Override
        protected Gpx.Wpt computeNext() {
            try {
                return tryPullNextWaypoint();
            } catch (XMLStreamException e) {
                throw Throwables.propagate(e);
            }
        }

        private Gpx.Wpt tryPullNextWaypoint() throws XMLStreamException {
            while (reader.hasNext()) {
                int event = reader.next();
                switch (event) {
                    case XMLStreamConstants.START_ELEMENT:
                        if (reader.getLocalName().equals("wpt")) {
                            return parseWaypoint(reader);
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        if (reader.getLocalName().equals("gpx")) {
                            return endOfData();
                        }
                        break;
                }
            }
            throw new IllegalStateException("XML file didn't finish with </gpx> element, malformed?");
        }
    };
}

This is getting complex! I'm using AbstractIterator from Guava to handle the tedious hasNext() state.
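If AbstractIterator is new to you, here is a tiny standalone illustration of its contract (my example, unrelated to GPX): computeNext() either returns the next element or calls endOfData() to signal the end, while Guava takes care of the hasNext()/next() bookkeeping.

import com.google.common.collect.AbstractIterator;

import java.util.Iterator;

class CountingExample {

    // An iterator over 0..9; no manual hasNext() state to maintain.
    static Iterator<Integer> firstTen() {
        return new AbstractIterator<Integer>() {
            private int next = 0;

            @Override
            protected Integer computeNext() {
                return next < 10 ? next++ : endOfData();
            }
        };
    }
}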
Every time someone tries to pull the next Gpx.Wpt item from the iterator (or calls hasNext()), we consume a little bit of XML – just enough to return one entry. If XMLStreamReader encounters the end of the XML (the </gpx> tag), we signal the end of iteration by returning endOfData(). This is a very handy pattern where XML is read lazily and served via a convenient iterator. This implementation alone consumes a very small, constant amount of memory. However, we changed the API from List<Gpx.Wpt> to Iterator<Gpx.Wpt>, which forces changes in the rest of our implementation:

private static List<LatLong> toCoordinates(Iterator<Gpx.Wpt> waypoints) {
    final Spliterator<Gpx.Wpt> spliterator =
            Spliterators.spliteratorUnknownSize(waypoints, Spliterator.ORDERED);
    return StreamSupport
            .stream(spliterator, false)
            .filter(wpt -> wpt.getLat() != null)
            .filter(wpt -> wpt.getLon() != null)
            .map(LatLong::new)
            .collect(toList());
}

toCoordinates() was previously accepting a List<Gpx.Wpt>. Iterators can't be turned into a Stream directly, so we need this clunky transformation through Spliterator. Do you think it's over? The 1 GiB test passes a little bit faster, but the more demanding ones fail just like before:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3175)
    at java.util.ArrayList.grow(ArrayList.java:246)
    at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:220)
    at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:212)
    at java.util.ArrayList.add(ArrayList.java:443)
    at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:512)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:502)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
    at com.nurkiewicz.gpx.GpxTransformation.toCoordinates(GpxTransformation.java:118)
    at com.nurkiewicz.gpx.GpxTransformation.transform(GpxTransformation.java:58)
    at com.nurkiewicz.LargeInputSpec.Should not fail with OOM for input of size #readableBytes(LargeInputSpec.groovy:49)

Remember that an OutOfMemoryError is not always thrown from the place that actually consumes most of the memory. Luckily, it's not the case this time. Look carefully near the bottom: collect(toList()).

Step 4: Avoiding streams and collectors

This is disappointing. Streams and collectors were designed from the ground up to support laziness. However, it's virtually impossible to implement a collector (see also: Introduction to writing custom collectors in Java 8 and Grouping, sampling and batching – custom collectors) that effectively turns a stream into an iterator, which is a big design flaw. Therefore we must forget about streams altogether and use plain iterators all the way down. Iterators aren't very elegant, but they allow consuming input item by item, with full control over memory consumption. We need a way to filter() the input iterator, discarding broken items, and to map() entries to another representation.
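To appreciate how much boilerplate that means by hand, here is a hypothetical hand-rolled filtering iterator (my sketch, not from the original article); the trick is having to look one element ahead so that hasNext() can answer truthfully:

import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Lazily skips elements that don't match the predicate, reading exactly
// one element ahead of the consumer.
class FilteringIterator<T> implements Iterator<T> {

    private final Iterator<T> delegate;
    private final Predicate<T> predicate;
    private T next;
    private boolean hasNext;

    FilteringIterator(Iterator<T> delegate, Predicate<T> predicate) {
        this.delegate = delegate;
        this.predicate = predicate;
        advance();
    }

    // Pull from the delegate until a matching element (or the end) is found.
    private void advance() {
        hasNext = false;
        while (delegate.hasNext()) {
            final T candidate = delegate.next();
            if (predicate.test(candidate)) {
                next = candidate;
                hasNext = true;
                return;
            }
        }
    }

    @Override
    public boolean hasNext() {
        return hasNext;
    }

    @Override
    public T next() {
        if (!hasNext) {
            throw new NoSuchElementException();
        }
        final T result = next;
        advance();
        return result;
    }
}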
Guava, again, provides a few handy utilities for that, replacing stream() completely:

private static Iterator<LatLong> toCoordinates(Iterator<Gpx.Wpt> waypoints) {
    final Iterator<Gpx.Wpt> filtered = Iterators
            .filter(waypoints, wpt -> wpt.getLat() != null && wpt.getLon() != null);
    return Iterators.transform(filtered, LatLong::new);
}

Iterator<Gpx.Wpt> in, Iterator<LatLong> out. No processing has been done yet, the XML file has barely been touched, and memory consumption is marginal. We are lucky: Jackson accepts iterators and reads them transparently, producing JSON iteratively, so memory consumption stays low there as well. Guess what, we made it!

Memory consumption is low and stable; I think we can safely assume it's constant. Our code processes about 40 MiB/s, so don't be surprised by the almost 14 minutes it took to process 32 GiB. Oh, and did I mention that I ran the last test with -Xmx32M? That's right: processing 32 GiB succeeded without any performance loss, using a thousand times less memory – and 3000 times less compared to the initial implementation. As a matter of fact, the last solution, being based on iterators, is capable of handling even infinite streams of XML. This is not just a theoretical case – imagine some sort of streaming API that produces a never-ending flow of messages…

Final implementation

This is our code in its entirety:

package com.nurkiewicz.gpx;

import com.google.common.base.Throwables;
import com.google.common.collect.AbstractIterator;
import com.google.common.collect.Iterators;
import com.topografix.gpx._1._0.Gpx;
import org.codehaus.jackson.map.ObjectMapper;

import javax.xml.bind.JAXBException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.math.BigDecimal;
import java.util.Iterator;

public class GpxTransformation {

    private static final ObjectMapper jsonMapper = new ObjectMapper();

    public void transform(File inputFile, File outputFile) throws JAXBException, IOException, XMLStreamException {
        try (
                InputStream input =
                        new BufferedInputStream(new FileInputStream(inputFile));
                OutputStream output =
                        new BufferedOutputStream(new FileOutputStream(outputFile))) {
            transform(input, output);
        }
    }

    public void transform(InputStream input, OutputStream output) throws JAXBException, IOException, XMLStreamException {
        final Iterator<Gpx.Wpt> waypoints = loadWaypoints(input);
        final Iterator<LatLong> coordinates = toCoordinates(waypoints);
        dumpJson(coordinates, output);
    }

    private Iterator<Gpx.Wpt> loadWaypoints(InputStream input) throws JAXBException, IOException, XMLStreamException {
        final XMLInputFactory factory = XMLInputFactory.newInstance();
        final XMLStreamReader reader = factory.createXMLStreamReader(input);
        return new AbstractIterator<Gpx.Wpt>() {

            @Override
            protected Gpx.Wpt computeNext() {
                try {
                    return tryPullNextWaypoint();
                } catch (XMLStreamException e) {
                    throw Throwables.propagate(e);
                }
            }

            private Gpx.Wpt tryPullNextWaypoint() throws XMLStreamException {
                while (reader.hasNext()) {
                    int event = reader.next();
                    switch (event) {
                        case XMLStreamConstants.START_ELEMENT:
                            if (reader.getLocalName().equals("wpt")) {
                                return parseWaypoint(reader);
                            }
                            break;
                        case XMLStreamConstants.END_ELEMENT:
                            if (reader.getLocalName().equals("gpx")) {
                                return endOfData();
                            }
                            break;
                    }
                }
                throw new IllegalStateException("XML file didn't finish with </gpx> element, malformed?");
            }
        };
    }

    private Gpx.Wpt parseWaypoint(XMLStreamReader reader) {
        final Gpx.Wpt wpt = new Gpx.Wpt();
        final String lat = reader.getAttributeValue("", "lat");
        if (lat != null) {
            wpt.setLat(new BigDecimal(lat));
        }
        final String lon = reader.getAttributeValue("", "lon");
        if (lon != null) {
            wpt.setLon(new BigDecimal(lon));
        }
        return wpt;
    }

    private static Iterator<LatLong> toCoordinates(Iterator<Gpx.Wpt> waypoints) {
        final Iterator<Gpx.Wpt> filtered = Iterators
                .filter(waypoints, wpt -> wpt.getLat() != null && wpt.getLon() != null);
        return Iterators.transform(filtered, LatLong::new);
    }

    private void dumpJson(Iterator<LatLong> coordinates, OutputStream output) throws IOException {
        jsonMapper.writeValue(output, coordinates);
    }
}

Summary (TL;DR)

If you were not patient enough to follow all the steps, here are the three main takeaways:

1. Your first goal is simplicity. The initial JAXB implementation was perfectly fine (with minor modifications); keep it like that if your code doesn't have to handle large inputs.
2. Test your code against insanely large inputs, e.g. using a generated InputStream producing gigabytes of input. A huge data set is another example of an edge case. Don't test manually, once – one careless change or "improvement" might ruin your performance down the road.
3. Optimization is not an excuse for writing poor code. Notice that our implementation is still composable and easy to follow. If we had gone through SAX and simply inlined all the logic in SAX callbacks, maintainability would have suffered greatly.

Reference: Testing code for excessively large inputs from our JCG partner Tomasz Nurkiewicz at the Java and neighbourhood blog.