


EAGER fetching is a code smell

Introduction

Hibernate fetching strategies can really make the difference between an application that barely crawls and a highly responsive one. In this post I'll explain why you should prefer query-based fetching to global fetch plans.

Fetching 101

Hibernate defines four association retrieval strategies:

- Join: the association is OUTER JOINed in the original SELECT statement.
- Select: an additional SELECT statement is used to retrieve the associated entity (or entities).
- Subselect: an additional SELECT statement is used to retrieve the whole associated collection. This mode is meant for to-many associations.
- Batch: an additional number of SELECT statements is used to retrieve the whole associated collection, each SELECT retrieving a fixed number of associated entities. This mode is meant for to-many associations.

These fetching strategies are applied in the following scenarios:

- the association is always initialized along with its owner (e.g. the EAGER FetchType)
- an uninitialized association (e.g. the LAZY FetchType) is navigated, so the association must be retrieved with a secondary SELECT

The fetching information in the Hibernate mappings forms the global fetch plan. At query time we may override the global fetch plan, but only for LAZY associations; for this we can use the fetch HQL/JPQL/Criteria directive. EAGER associations cannot be overridden, therefore tying your application to the global fetch plan.

Hibernate 3 acknowledged that LAZY should be the default association fetching strategy: "By default, Hibernate3 uses lazy select fetching for collections and lazy proxy fetching for single-valued associations. These defaults make sense for most associations in the majority of applications." This decision was taken after noticing the many performance issues caused by Hibernate 2's default eager fetching.
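The per-query override mentioned above is worth a quick illustration. A LAZY association can still be fetched eagerly for one particular use case with the fetch directive, without touching the mapping. A JPQL sketch (the Product and importer names anticipate the entity model shown below):

```
select p
from Product p
join fetch p.importer
where p.id = :productId
```

The same association stays LAZY everywhere else; only this query pays the join.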
Unfortunately, JPA took a different approach and decided that to-many associations are LAZY while to-one relationships are fetched eagerly:

- @OneToMany: LAZY
- @ManyToMany: LAZY
- @ManyToOne: EAGER
- @OneToOne: EAGER

EAGER fetching inconsistencies

While it may be convenient to just mark associations as EAGER, delegating the fetching responsibility to Hibernate, it's advisable to resort to query-based fetch plans. An EAGER association will always be fetched, and the fetching strategy is not consistent across all querying techniques. Next, I'm going to demonstrate how EAGER fetching behaves with each Hibernate querying variant. I will reuse the entity model I previously introduced in my fetching strategies article. The Product entity has the following associations:

```java
@ManyToOne(fetch = FetchType.EAGER)
@JoinColumn(name = "company_id", nullable = false)
private Company company;

@OneToOne(fetch = FetchType.LAZY, cascade = CascadeType.ALL,
        mappedBy = "product", optional = false)
private WarehouseProductInfo warehouseProductInfo;

@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "importer_id")
private Importer importer;

@OneToMany(fetch = FetchType.LAZY, cascade = CascadeType.ALL,
        mappedBy = "product", orphanRemoval = true)
@OrderBy("index")
private Set<Image> images = new LinkedHashSet<Image>();
```

The company association is marked as EAGER, so Hibernate will always employ a fetching strategy to initialize it along with its owner entity.
Persistence Context loading

First we'll load the entity using the Persistence Context API:

```java
Product product = entityManager.find(Product.class, productId);
```

which generates the following SQL SELECT statement:

```sql
select product0_.id as id1_18_1_, product0_.code as code2_18_1_,
       product0_.company_id as company_6_18_1_, product0_.importer_id as importer7_18_1_,
       product0_.name as name3_18_1_, product0_.quantity as quantity4_18_1_,
       product0_.version as version5_18_1_,
       company1_.id as id1_6_0_, company1_.name as name2_6_0_
from Product product0_
inner join Company company1_ on product0_.company_id=company1_.id
where product0_.id=?
```

The EAGER company association was retrieved using an inner join. For M such associations, the owner entity table is going to be joined M times. Each extra join adds to the overall query complexity and execution time. If we don't even use all of these associations in every business scenario, we've paid the extra performance penalty for nothing in return.
Fetching using JPQL and Criteria

```java
Product product = entityManager.createQuery(
        "select p " +
        "from Product p " +
        "where p.id = :productId", Product.class)
    .setParameter("productId", productId)
    .getSingleResult();
```

or with

```java
CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<Product> cq = cb.createQuery(Product.class);
Root<Product> productRoot = cq.from(Product.class);
cq.where(cb.equal(productRoot.get("id"), productId));
Product product = entityManager.createQuery(cq).getSingleResult();
```

generates the following SQL SELECT statements:

```sql
select product0_.id as id1_18_, product0_.code as code2_18_,
       product0_.company_id as company_6_18_, product0_.importer_id as importer7_18_,
       product0_.name as name3_18_, product0_.quantity as quantity4_18_,
       product0_.version as version5_18_
from Product product0_
where product0_.id=?

select company0_.id as id1_6_0_, company0_.name as name2_6_0_
from Company company0_
where company0_.id=?
```

Both JPQL and Criteria queries default to select fetching, therefore issuing a secondary SELECT for each individual EAGER association. The larger the number of associations, the more additional individual SELECTs, and the more our application performance suffers.

Hibernate Criteria API

While JPA 2.0 added support for Criteria queries, Hibernate had long been offering its own dynamic query implementation. While the EntityManager implementation delegates method calls to the legacy Session API, the JPA Criteria implementation was written from scratch. That's the reason why the Hibernate and the JPA Criteria APIs behave differently in similar querying scenarios.
The Hibernate Criteria equivalent of the previous example looks like this:

```java
Product product = (Product) session.createCriteria(Product.class)
    .add(Restrictions.eq("id", productId))
    .uniqueResult();
```

and the associated SQL SELECT is:

```sql
select this_.id as id1_3_1_, this_.code as code2_3_1_,
       this_.company_id as company_6_3_1_, this_.importer_id as importer7_3_1_,
       this_.name as name3_3_1_, this_.quantity as quantity4_3_1_,
       this_.version as version5_3_1_,
       hibernatea2_.id as id1_0_0_, hibernatea2_.name as name2_0_0_
from Product this_
inner join Company hibernatea2_ on this_.company_id=hibernatea2_.id
where this_.id=?
```

This query uses the join fetch strategy, as opposed to the select fetching employed by JPQL/HQL and the Criteria API.

Hibernate Criteria and to-many EAGER collections

Let's see what happens when the images collection fetching strategy is set to EAGER:

```java
@OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL,
        mappedBy = "product", orphanRemoval = true)
@OrderBy("index")
private Set<Image> images = new LinkedHashSet<Image>();
```

The following SQL is going to be generated:

```sql
select this_.id as id1_3_2_, this_.code as code2_3_2_,
       this_.company_id as company_6_3_2_, this_.importer_id as importer7_3_2_,
       this_.name as name3_3_2_, this_.quantity as quantity4_3_2_,
       this_.version as version5_3_2_,
       hibernatea2_.id as id1_0_0_, hibernatea2_.name as name2_0_0_,
       images3_.product_id as product_4_3_4_, images3_.id as id1_1_4_,
       images3_.id as id1_1_1_, images3_.index as index2_1_1_,
       images3_.name as name3_1_1_, images3_.product_id as product_4_1_1_
from Product this_
inner join Company hibernatea2_ on this_.company_id=hibernatea2_.id
left outer join Image images3_ on this_.id=images3_.product_id
where this_.id=?
order by images3_.index
```

Hibernate Criteria doesn't automatically group the parent entity list.
Because of the JOIN against the one-to-many children table, we get a new parent entity object reference for each child entity (all pointing to the same object in our current Persistence Context):

```java
product.setName("TV");
product.setCompany(company);

Image frontImage = new Image();
frontImage.setName("front image");
frontImage.setIndex(0);

Image sideImage = new Image();
sideImage.setName("side image");
sideImage.setIndex(1);

product.addImage(frontImage);
product.addImage(sideImage);

List products = session.createCriteria(Product.class)
    .add(Restrictions.eq("id", productId))
    .list();
assertEquals(2, products.size());
assertSame(products.get(0), products.get(1));
```

Because we have two Image entities, we get two Product entity references, both pointing to the same first-level cache entry. To fix it, we need to instruct Hibernate Criteria to use distinct root entities:

```java
List products = session.createCriteria(Product.class)
    .add(Restrictions.eq("id", productId))
    .setResultTransformer(CriteriaSpecification.DISTINCT_ROOT_ENTITY)
    .list();
assertEquals(1, products.size());
```

Conclusion

The EAGER fetching strategy is a code smell. Most often it's used for simplicity's sake, without considering the long-term performance penalties. The fetching strategy should never be the entity mapping's responsibility. Each business use case has different entity load requirements, so the fetching strategy should be delegated to each individual query. The global fetch plan should only define LAZY associations, which are fetched on a per-query basis. Combined with the always-check-generated-queries strategy, query-based fetch plans can improve application performance and reduce maintenance costs.

Code available for Hibernate and JPA.

Reference: EAGER fetching is a code smell from our JCG partner Vlad Mihalcea at the Vlad Mihalcea blog.

Spring MVC 4 Quickstart Maven Archetype Improved

Spring Boot makes getting started with Spring extremely easy. But some people are still interested in not using Spring Boot and bootstrapping the application in a more classical way. Several years ago (long before Spring Boot) I created an archetype that simplifies bootstrapping Spring web applications. Although Spring Boot has been on the market for some time now, the Spring MVC 4 Quickstart Maven Archetype is still quite a popular project on GitHub. With some recent additions I hope it is even better.

Java 8

I have decided to switch the target platform to Java 8. There is no specific Java 8 code in the generated project yet, but I believe all new Spring projects should be started with Java 8. The adoption of Java 8 is ahead of forecasts. Have a look at: https://typesafe.com/company/news/survey-of-more-than-3000-developers-reveals-java-8-adoption-ahead-of-previous-forecasts

Introducing Spring IO Platform

Spring IO Platform brings together the core Spring APIs into a cohesive platform for modern applications. Its main advantage is that it simplifies dependency management by providing versions of Spring projects, along with their dependencies, that are tested and known to work together. Previously, all the dependencies were specified manually, and resolving version conflicts took some time.
With the Spring IO Platform we need to change only the platform version (and take care of dependencies outside the platform, of course):

```xml
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>io.spring.platform</groupId>
            <artifactId>platform-bom</artifactId>
            <version>${io.spring.platform-version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
```

The dependencies can now be used without specifying a version in the POM:

```xml
<!-- Spring -->
<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-webmvc</artifactId>
</dependency>
<!-- Security -->
<dependency>
    <groupId>org.springframework.security</groupId>
    <artifactId>spring-security-config</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.security</groupId>
    <artifactId>spring-security-web</artifactId>
</dependency>
```

Java Security configuration

When I first created the archetype, there was no way to configure Spring Security using Java code. Now there is, so I migrated the XML configuration to Java configuration. SecurityConfig now extends WebSecurityConfigurerAdapter and is marked with the @Configuration and @EnableWebMvcSecurity annotations.
Security configuration details

Restrict access to every URL apart from a few public ones. The XML configuration:

```xml
<security:intercept-url pattern="/" access="permitAll" />
<security:intercept-url pattern="/resources/**" access="permitAll" />
<security:intercept-url pattern="/signup" access="permitAll" />
<security:intercept-url pattern="/**" access="isAuthenticated()" />
```

became:

```java
http
    .authorizeRequests()
        .antMatchers("/", "/resources/**", "/signup").permitAll()
        .anyRequest().authenticated()
```

Login / logout. The XML configuration:

```xml
<security:form-login login-page="/signin" authentication-failure-url="/signin?error=1"/>
<security:logout logout-url="/logout" />
```

became:

```java
http
    .formLogin()
        .loginPage("/signin")
        .permitAll()
        .failureUrl("/signin?error=1")
        .loginProcessingUrl("/authenticate")
        .and()
    .logout()
        .logoutUrl("/logout")
        .permitAll()
        .logoutSuccessUrl("/signin?logout");
```

Remember me. The XML configuration:

```xml
<security:remember-me services-ref="rememberMeServices" key="remember-me-key"/>
```

became:

```java
http
    .rememberMe()
        .rememberMeServices(rememberMeServices())
        .key("remember-me-key");
```

CSRF enabled for production and disabled for tests. CSRF protection is now enabled by default, so no additional configuration is needed. But in integration tests I wanted to be sure that CSRF is disabled, and I could not find a good way of doing this. I started with a CSRF protection matcher passed to CsrfConfigurer, but I ended up with lots of code I did not like to have in SecurityConfig. In the end I settled on a NoCsrfSecurityConfig that extends the original SecurityConfig and disables CSRF:

```java
@Configuration
public class NoCsrfSecurityConfig extends SecurityConfig {
    @Override
    protected void configure(HttpSecurity http) throws Exception {
        super.configure(http);
        http.csrf().disable();
    }
}
```

Connection pooling

HikariCP is now used as the default connection pool in the generated application.
The default configuration is used:

```java
@Bean
public DataSource configureDataSource() {
    HikariConfig config = new HikariConfig();
    config.setDriverClassName(driver);
    config.setJdbcUrl(url);
    config.setUsername(username);
    config.setPassword(password);
    config.addDataSourceProperty("cachePrepStmts", "true");
    config.addDataSourceProperty("prepStmtCacheSize", "250");
    config.addDataSourceProperty("prepStmtCacheSqlLimit", "2048");
    config.addDataSourceProperty("useServerPrepStmts", "true");
    return new HikariDataSource(config);
}
```

More to come

The Spring MVC 4 Quickstart Maven Archetype is far from finished. As the Spring platform evolves, the archetype must adjust accordingly. I am looking forward to hearing what could be improved to make it a better project. If you have an idea or suggestion, drop a comment or create an issue on GitHub.

References: Spring MVC 4 Quickstart Maven Archetype

Reference: Spring MVC 4 Quickstart Maven Archetype Improved from our JCG partner Rafal Borowiec at the Codeleak.pl blog.

Playing With Java Concurrency

Recently I needed to transform some files that each contain a list (array) of objects in JSON format into files that each contain separate lines of the same data (objects). It was a one-time task and a simple one. I did the reading and writing using some features of Java NIO and used GSON in the simplest way. One thread runs over the files, converts and writes. The whole operation finished in a few seconds. However, I wanted to play a little bit with concurrency, so I enhanced the tool to work concurrently.

Threads

A Runnable for reading a file. The reader threads are submitted to an ExecutorService. The output, which is a list of objects (User in the example), is put in a BlockingQueue.

A Runnable for writing a file. Each runnable polls from the blocking queue and writes lines of data to a file. I don't add the writer Runnable to the ExecutorService; instead I just start a thread with it. The runnable follows a while (someBoolean) {...} pattern. More about that below.

Synchronizing Everything

The BlockingQueue is the interface between both types of threads. As the writer runnable runs in a while loop (consumer), I wanted to be able to make it stop so the tool can terminate. I used two objects for that:

Semaphore. The loop that reads the input files increments a counter. Once I finished traversing the input files and submitted the writers, I acquired that many permits in the main thread: semaphore.acquire(numberOfFiles). Each time a file is fully written, the semaphore is released: semaphore.release().

AtomicBoolean. The writers' while loop uses an AtomicBoolean. As long as the AtomicBoolean is true, the writer will continue. In the main thread, just after acquiring the semaphore permits, I set the AtomicBoolean to false. This enables the writer threads to terminate.

Using Java NIO

In order to scan, read and write the file system, I used some features of Java NIO.
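The shutdown handshake described above (a BlockingQueue with a timed poll, a Semaphore counting finished files, and an AtomicBoolean that stops the consumer) can be distilled into a small runnable sketch. The class and variable names here are made up for illustration; the real tool's code follows below:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class ShutdownHandshakeSketch {

    // Pushes `tasks` items through a queue, waits for the consumer via a
    // semaphore, then flips the AtomicBoolean so the consumer loop exits.
    static int runOnce(int tasks) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicBoolean stillWorking = new AtomicBoolean(true);
        Semaphore semaphore = new Semaphore(0);
        AtomicInteger written = new AtomicInteger();

        Thread writer = new Thread(() -> {
            while (stillWorking.get()) {
                try {
                    // Poll with a timeout so stillWorking is re-checked regularly.
                    String item = queue.poll(100, TimeUnit.MILLISECONDS);
                    if (item != null) {
                        written.incrementAndGet();   // "write" the item
                        semaphore.release();         // one permit per finished item
                    }
                } catch (InterruptedException ignored) {
                }
            }
        });
        writer.start();

        for (int i = 0; i < tasks; i++) {
            queue.put("item-" + i); // producers; file readers in the real tool
        }

        semaphore.acquire(tasks); // block until every item was consumed
        stillWorking.set(false);  // allow the consumer's while loop to finish
        writer.join();
        return written.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("items written: " + runOnce(3));
    }
}
```

The timed poll is the important detail: a blocking take() could leave the consumer stuck forever after the flag flips, while poll with a timeout guarantees the flag is observed.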
Scanning: Files.newDirectoryStream(inputFilesDirectory, "*.json")

Deleting the output directory before starting: Files.walkFileTree...

BufferedReader and BufferedWriter: Files.newBufferedReader(filePath) and Files.newBufferedWriter(fileOutputPath, Charset.defaultCharset())

One note: in order to generate random files for this example, I used Apache Commons Lang's RandomStringUtils.randomAlphabetic. All the code is on GitHub.

```java
public class JsonArrayToJsonLines {
    private final static Path inputFilesDirectory = Paths.get("src\\main\\resources\\files");
    private final static Path outputDirectory = Paths.get("src\\main\\resources\\files\\output");
    private final static Gson gson = new Gson();

    private final BlockingQueue<EntitiesData> entitiesQueue = new LinkedBlockingQueue<>();
    private AtomicBoolean stillWorking = new AtomicBoolean(true);
    private Semaphore semaphore = new Semaphore(0);
    int numberOfFiles = 0;

    private JsonArrayToJsonLines() {
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        new JsonArrayToJsonLines().process();
    }

    private void process() throws IOException, InterruptedException {
        deleteFilesInOutputDir();
        final ExecutorService executorService = createExecutorService();
        DirectoryStream<Path> directoryStream = Files.newDirectoryStream(inputFilesDirectory, "*.json");

        for (int i = 0; i < 2; i++) {
            new Thread(new JsonElementsFileWriter(stillWorking, semaphore, entitiesQueue)).start();
        }

        directoryStream.forEach(new Consumer<Path>() {
            @Override
            public void accept(Path filePath) {
                numberOfFiles++;
                executorService.submit(new OriginalFileReader(filePath, entitiesQueue));
            }
        });

        semaphore.acquire(numberOfFiles);
        stillWorking.set(false);
        shutDownExecutor(executorService);
    }

    private void deleteFilesInOutputDir() throws IOException {
        Files.walkFileTree(outputDirectory, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    private ExecutorService createExecutorService() {
        int numberOfCpus = Runtime.getRuntime().availableProcessors();
        return Executors.newFixedThreadPool(numberOfCpus);
    }

    private void shutDownExecutor(final ExecutorService executorService) {
        executorService.shutdown();
        try {
            if (!executorService.awaitTermination(120, TimeUnit.SECONDS)) {
                executorService.shutdownNow();
            }
            if (!executorService.awaitTermination(120, TimeUnit.SECONDS)) {
            }
        } catch (InterruptedException ex) {
            executorService.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }

    private static final class OriginalFileReader implements Runnable {
        private final Path filePath;
        private final BlockingQueue<EntitiesData> entitiesQueue;

        private OriginalFileReader(Path filePath, BlockingQueue<EntitiesData> entitiesQueue) {
            this.filePath = filePath;
            this.entitiesQueue = entitiesQueue;
        }

        @Override
        public void run() {
            Path fileName = filePath.getFileName();
            try {
                BufferedReader br = Files.newBufferedReader(filePath);
                User[] entities = gson.fromJson(br, User[].class);
                System.out.println("---> " + fileName);
                entitiesQueue.put(new EntitiesData(fileName.toString(), entities));
            } catch (IOException | InterruptedException e) {
                throw new RuntimeException(filePath.toString(), e);
            }
        }
    }

    private static final class JsonElementsFileWriter implements Runnable {
        private final BlockingQueue<EntitiesData> entitiesQueue;
        private final AtomicBoolean stillWorking;
        private final Semaphore semaphore;

        private JsonElementsFileWriter(AtomicBoolean stillWorking, Semaphore semaphore,
                BlockingQueue<EntitiesData> entitiesQueue) {
            this.stillWorking = stillWorking;
            this.semaphore = semaphore;
            this.entitiesQueue = entitiesQueue;
        }

        @Override
        public void run() {
            while (stillWorking.get()) {
                try {
                    EntitiesData data = entitiesQueue.poll(100, TimeUnit.MILLISECONDS);
                    if (data != null) {
                        try {
                            String fileOutput = outputDirectory.toString() + File.separator + data.fileName;
                            Path fileOutputPath = Paths.get(fileOutput);
                            BufferedWriter writer = Files.newBufferedWriter(fileOutputPath, Charset.defaultCharset());
                            for (User user : data.entities) {
                                writer.append(gson.toJson(user));
                                writer.newLine();
                            }
                            writer.flush();
                            System.out.println("=======================================>>>>> " + data.fileName);
                        } catch (IOException e) {
                            throw new RuntimeException(data.fileName, e);
                        } finally {
                            semaphore.release();
                        }
                    }
                } catch (InterruptedException e1) {
                }
            }
        }
    }

    private static final class EntitiesData {
        private final String fileName;
        private final User[] entities;

        private EntitiesData(String fileName, User[] entities) {
            this.fileName = fileName;
            this.entities = entities;
        }
    }
}
```

Reference: Playing With Java Concurrency from our JCG partner Eyal Golan at the Learning and Improving as a Craftsman Developer blog.

Running Java Mission Control and Flight Recorder against WildFly and EAP

Java Mission Control (JMC) enables you to monitor and manage Java applications without introducing the performance overhead normally associated with these types of tools. It uses data which is already being collected for normal dynamic optimization of the JVM, resulting in a very lightweight approach to observing and analyzing problems in the application code. JMC consists of three different types of tools: a JVM browser which lets you browse all available JVM instances on a machine, a JMX console which lets you browse through the JMX tree of a connected JVM, and, last but not least, the most interesting part, the Java Flight Recorder (JFR). This is exactly the part of the tooling which does the low-overhead profiling of JVM instances.

Disclaimer: A Word On Licensing

The tooling is part of the Oracle JDK downloads. In particular, JMC 5.4 is part of JDK 8u20 and JDK 7u71 and is distributed under the Oracle Binary Code License Agreement for Java SE Platform products and commercially available features for Java SE Advanced and Java SE Suite. IANAL, but as far as I know this allows using it for your personal education and potentially also as part of your developer tests. Make sure to check with someone who can actually answer this question. This blog post is a small how-to and assumes that you know what you are doing from a license perspective.

Adding Java Optional Parameters

Unlocking the JFR features requires you to add some optional parameters to your WildFly 8.x/EAP 6.x configuration. Find $JBOSS_HOME/bin/standalone.conf (or standalone.conf.bat) and add the following parameters:

```
-XX:+UnlockCommercialFeatures -XX:+FlightRecorder
```

You can now use the jcmd command, as described in this knowledge-base entry, to start a recording. Another way is to start a recording directly from JMC.

Starting A Recording From JMC

The first step is to start JMC. Find it in the %JAVA_HOME%/bin folder.
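As a quick reference, a jcmd session against a prepared JVM might look like the following sketch. The recording name, duration and file paths are invented, and the commands assume an Oracle JDK with the commercial features unlocked as described above:

```shell
# List the local JVMs to find the WildFly/EAP process id
jcmd -l

# Start a one-minute recording
jcmd <pid> JFR.start name=wildfly-rec settings=profile duration=60s filename=/tmp/wildfly-rec.jfr

# Inspect running recordings, or dump one that is still in flight
jcmd <pid> JFR.check
jcmd <pid> JFR.dump name=wildfly-rec filename=/tmp/wildfly-dump.jfr
```

The resulting .jfr file can then be opened in JMC for analysis, as shown below.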
After it has started, you can use the JVM Browser to find the WildFly/EAP instance you want to connect to. Right-click on it to see all the available options. You can either start the JMX Console or start a Flight Recording. The JMX console is a bit fancier than JConsole and offers a bunch of metrics and statistics. It also allows you to set triggers, browse MBeans, and whatnot. Please look at the documentation for all the details. What is really interesting is the function to start a Flight Recording. If you select this option, a new wizard pops up and lets you tweak the settings a bit. Besides having to select a folder where the recording gets stored, you also have the choice between different recording templates. A one-minute recording with the "Server Profiling" template, with barely any load on the server, results in a 1.5 MB file. So better keep an eye on the volume you're storing all that stuff at. You can also decide the profiling granularity for a bunch of parameters further down the dialogues. At the end, you click "Finish" and the recording session starts. You can decide to push it to the background and keep working while the data gets captured.

Analyzing Flight Recorder Files

This is pretty easy. You can open the recording with JMC and click through the results. If you enabled the default recording with the additional parameter:

```
-XX:FlightRecorderOptions=defaultrecording=true
```

you can also directly dump the recording via the JVM browser. It is easy to pick a time-frame for which you want to download the data, or alternatively you can download the complete recording.

Reference: Running Java Mission Control and Flight Recorder against WildFly and EAP from our JCG partner Markus Eisele at the Enterprise Software Development with Java blog.

Thread local storage in Java

One of the rarely known features among developers is thread-local storage. The idea is simple, and the need for it arises in scenarios where we need data that is, well, local to the thread: two threads that refer to the same global variable, but where we want each to have its own value, initialized independently of the other. Most major programming languages implement the concept. For example, C++11 even has the thread_local keyword, and Ruby has chosen an API approach. Java has also had an implementation of the concept with java.lang.ThreadLocal<T> and its subclass java.lang.InheritableThreadLocal<T> since version 1.2, so nothing new and shiny here. Let's say that for some reason we need a Long specific to our thread. Using a thread-local, that would simply be:

```java
public class ThreadLocalExample {

    public static class SomethingToRun implements Runnable {

        private ThreadLocal threadLocal = new ThreadLocal();

        @Override
        public void run() {
            System.out.println(Thread.currentThread().getName() + " " + threadLocal.get());

            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
            }

            threadLocal.set(System.nanoTime());
            System.out.println(Thread.currentThread().getName() + " " + threadLocal.get());
        }
    }

    public static void main(String[] args) {
        SomethingToRun sharedRunnableInstance = new SomethingToRun();

        Thread thread1 = new Thread(sharedRunnableInstance);
        Thread thread2 = new Thread(sharedRunnableInstance);

        thread1.start();
        thread2.start();
    }
}
```

One possible run of this code results in:

```
Thread-0 null
Thread-0 132466384576241
Thread-1 null
Thread-1 132466394296347
```

At the beginning the value is null in both threads, and each of them obviously works with a separate value, since setting the value to System.nanoTime() on Thread-0 has no effect on the value in Thread-1, exactly as we wanted: a thread-scoped long variable. One nice side effect shows up when the thread calls multiple methods from various classes.
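Since Java 8, a thread-local with a per-thread initial value can also be created with ThreadLocal.withInitial, which avoids the null first read seen in the output above. A small sketch (class and method names are mine, not from the original post):

```java
public class ThreadLocalInitDemo {

    // withInitial (Java 8) supplies a per-thread starting value lazily,
    // so get() never returns null.
    private static final ThreadLocal<Long> COUNTER = ThreadLocal.withInitial(() -> 0L);

    static long incrementAndGet() {
        COUNTER.set(COUNTER.get() + 1);
        return COUNTER.get();
    }

    public static void main(String[] args) throws InterruptedException {
        incrementAndGet();
        incrementAndGet(); // main thread's counter is now 2

        Thread other = new Thread(() ->
                // A fresh thread starts from the initial value again
                System.out.println("other thread sees " + incrementAndGet()));
        other.start();
        other.join();

        System.out.println("main thread sees " + COUNTER.get());
    }
}
```

Run as-is, this prints "other thread sees 1" followed by "main thread sees 2": each thread increments its own copy of the counter.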
They will all be able to use the same thread-scoped variable without major API changes. Since the value is not explicitly passed around, one might argue this is difficult to test and bad for design, but that is a separate topic altogether.

In what areas are popular frameworks using thread-locals?

Spring, being one of the most popular frameworks in Java, uses ThreadLocals internally in many parts, as easily shown by a simple GitHub search. Most of the usages are related to the current user's actions or information. This is actually one of the main uses for ThreadLocals in the Java EE world: storing information for the current request, as in RequestContextHolder:

```java
private static final ThreadLocal<RequestAttributes> requestAttributesHolder =
        new NamedThreadLocal<RequestAttributes>("Request attributes");
```

or the current JDBC connection user credentials in UserCredentialsDataSourceAdapter. Coming back to RequestContextHolder, we can use this class to access all of the current request's information from anywhere in our code. A common use case for this is LocaleContextHolder, which helps us store the current user's locale. Mockito uses a ThreadLocal to store the current "global" configuration, and if we take a look at almost any framework out there, there is a high chance we'll find one as well.

Thread Locals and Memory Leaks

We have learned about this awesome little feature, so let's use it all over the place! We can do that, but a few Google searches will show that most people out there say ThreadLocal is evil. That's not exactly true: it is a nice utility, but in some contexts it is easy to create a memory leak with it.

"Can you cause unintended object retention with thread locals? Sure you can. But you can do this with arrays too. That doesn't mean that thread locals (or arrays) are bad things. Merely that you have to use them with some care. The use of thread pools demands extreme care.
Sloppy use of thread pools in combination with sloppy use of thread locals can cause unintended object retention, as has been noted in many places. But placing the blame on thread locals is unwarranted." – Joshua Bloch

It is very easy to create a memory leak in your server code using ThreadLocal if it runs on an application server. A ThreadLocal's value is associated with the thread where it was set and will become garbage-collectable once the thread is dead. Modern app servers use a pool of threads instead of creating new ones on each request, meaning you can end up holding large objects indefinitely in your application. Since the thread pool belongs to the app server, our memory leak could remain even after we unload our application. The fix for this is simple: free up resources you do not need.

One other ThreadLocal misuse is in API design. Often I have seen RequestContextHolder (which holds a ThreadLocal) used all over the place, in the DAO layer for example. Later on, if one were to call the same DAO methods outside a request, from a scheduler for example, he would get a very bad surprise. This creates black magic and many maintenance developers who will eventually figure out where you live and pay you a visit. Even though the variables in a ThreadLocal are local to the thread, they are very much global in your code. So make sure you really need this thread scope before you use it.
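The "free up resources you do not need" advice above usually boils down to calling remove() in a finally block when the unit of work on a pooled thread ends. A sketch of the pattern (the user names and method are invented for illustration):

```java
public class ThreadLocalCleanupDemo {

    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Simulates one unit of work on a (possibly pooled) thread: set the
    // context, use it, and always remove() it so the next task reusing
    // this thread sees no stale data and the value can be collected.
    static String handle(String user) {
        CURRENT_USER.set(user);
        try {
            return "handled for " + CURRENT_USER.get();
        } finally {
            CURRENT_USER.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println(handle("alice"));
        // After the "request", the thread-local is clean again:
        System.out.println("after request: " + CURRENT_USER.get()); // null
    }
}
```

In a servlet container the same idea is typically applied in a filter, so the cleanup happens once per request no matter what the request handler does.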
More info on the topic:

- http://en.wikipedia.org/wiki/Thread-local_storage
- http://www.appneta.com/blog/introduction-to-javas-threadlocal-storage/
- https://plumbr.eu/blog/how-to-shoot-yourself-in-foot-with-threadlocals
- http://stackoverflow.com/questions/817856/when-and-how-should-i-use-a-threadlocal-variable
- https://plumbr.eu/blog/when-and-how-to-use-a-threadlocal
- https://weblogs.java.net/blog/jjviana/archive/2010/06/09/dealing-glassfish-301-memory-leak-or-threadlocal-thread-pool-bad-ide
- https://software.intel.com/en-us/articles/use-thread-local-storage-to-reduce-synchronization

Reference: Thread local storage in Java from our JCG partner Mite Mitreski at the Java Advent Calendar blog.

How and Why is Unsafe used in Java?

Overview

sun.misc.Unsafe has been in Java from at least as far back as Java 1.4 (2004). In Java 9, Unsafe will be hidden, along with many other internal-use classes, to improve the maintainability of the JVM. While it is still unclear exactly what will replace Unsafe (I suspect it will be more than one thing), it raises the question: why is it used at all?

Doing things which the Java language doesn't allow but which are still useful

Java doesn't allow many of the tricks which are available in lower-level languages. For most developers this is a very good thing; it not only saves you from yourself, it also saves you from your co-workers. It also makes it easier to import open source code, because you know there are limits to how much damage it can do. Or at least there are limits to how much it can do accidentally. If you try hard enough, you can still do damage. But why would you even try, you might wonder? When building libraries, many (but not all) of the methods in Unsafe are useful, and in some cases there is no other way to do the same thing without using JNI, which is even more dangerous and loses the "compile once, run anywhere" promise.

Deserialization of objects

When deserializing or building an object using a framework, you assume you want to reconstitute an object which existed before. You expect to use reflection to either call the setters of the class or, more likely, set the internal fields directly, even the final fields. The problem is that you want to create an instance of an object, but you don't really need a constructor, as it is likely to only make things more difficult and to have side effects.
public class A implements Serializable {
    private final int num;

    public A(int num) {
        System.out.println("Hello Mum");
        this.num = num;
    }

    public int getNum() {
        return num;
    }
}

In this class you should be able to rebuild it and set the final field, but if you have to call a constructor, it might do things which have nothing to do with deserialization. For these reasons many libraries use Unsafe to create instances without calling a constructor.

Unsafe unsafe = getUnsafe();
Class aClass = A.class;
A a = (A) unsafe.allocateInstance(aClass);

Calling allocateInstance avoids the need to call the appropriate constructor when we don't need one. Thread safe access to direct memory Another use for Unsafe is thread-safe access to off-heap memory. ByteBuffer gives you safe access to off-heap or direct memory; however, it doesn't have any thread-safe operations. This is particularly useful if you want to share data between processes.

import sun.misc.Unsafe;
import sun.nio.ch.DirectBuffer;

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.lang.reflect.Field;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class PingPongMapMain {
    public static void main(String... args) throws IOException {
        boolean odd;
        switch (args.length < 1 ? "usage" : args[0].toLowerCase()) {
            case "odd":
                odd = true;
                break;
            case "even":
                odd = false;
                break;
            default:
                System.err.println("Usage: java PingPongMain [odd|even]");
                return;
        }
        int runs = 10000000;
        long start = 0;
        System.out.println("Waiting for the other odd/even");
        File counters = new File(System.getProperty("java.io.tmpdir"), "counters.deleteme");
        counters.deleteOnExit();

        try (FileChannel fc = new RandomAccessFile(counters, "rw").getChannel()) {
            MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_WRITE, 0, 1024);
            long address = ((DirectBuffer) mbb).address();
            for (int i = -1; i < runs; i++) {
                for (; ; ) {
                    long value = UNSAFE.getLongVolatile(null, address);
                    boolean isOdd = (value & 1) != 0;
                    if (isOdd != odd)
                        // wait for the other side.
                        continue;
                    // make the change atomic, just in case there is more than one odd/even process
                    if (UNSAFE.compareAndSwapLong(null, address, value, value + 1))
                        break;
                }
                if (i == 0) {
                    System.out.println("Started");
                    start = System.nanoTime();
                }
            }
        }
        System.out.printf("... Finished, average ping/pong took %,d ns%n",
                (System.nanoTime() - start) / runs);
    }

    static final Unsafe UNSAFE;

    static {
        try {
            Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            UNSAFE = (Unsafe) theUnsafe.get(null);
        } catch (Exception e) {
            throw new AssertionError(e);
        }
    }
}

When you run this in two programs, one with odd and the other with even, you can see that each process is changing data via persisted shared memory. Each program maps the same area of the disk's cache into its process; there is actually only one copy of the file in memory. This means the memory can be shared, provided you use thread-safe operations such as the volatile and CAS operations. The output on an i7-3970X is

Waiting for the other odd/even
Started
... Finished, average ping/pong took 83 ns

That is an 83 ns round-trip time between two processes.
When you consider that System V IPC takes around 2,500 ns, and that its IPC is volatile rather than persisted, this is pretty quick. Is using Unsafe suitable for work? I wouldn't recommend you use Unsafe directly. It requires far more testing than natural Java development. For this reason I suggest you use a library where its usage has been tested already. If you want to use Unsafe yourself, I suggest you thoroughly test its usage in a standalone library. This limits how Unsafe is used in your application and gives you a safer Unsafe. Conclusion It is interesting that Unsafe exists in Java, and you might like to play with it at home. It has some work applications, especially in writing low-level libraries, but in general it is better to use a library which uses Unsafe and has been tested than to use it directly yourself. Reference: How and Why is Unsafe used in Java? from our JCG partner Peter Lawrey at the Java Advent Calendar blog....

Lightweight Integration with Java EE and Camel

Enterprise Java has different flavors and perspectives: starting with the plain platform technology, well known as Java EE, over to different frameworks and integration aspects, and finally use-cases which involve data-centric user interfaces or specific visualizations. The most prominent problem which isn't solved by Java EE itself is "integration". There are plenty of products out there from well-known vendors which solve all kinds of integration problems and promise to deliver complete solutions. As a developer, all you need from time to time is a solution that just works. This is the ultimate "Getting Started Resource" for Java EE developers when it comes to system integration. A Bit Of Integration Theory Integration challenges are nothing new. Ever since there have been different kinds of systems and a need to combine their data, this has been a central topic. Gregor Hohpe and Bobby Woolf started to collect a set of basic patterns they used to solve their customers' integration problems. These Enterprise Integration Patterns (EIPs) can be considered the bible of integration. They try to find a common vocabulary and body of knowledge around asynchronous messaging architectures by defining 65 integration patterns. Forrester calls those "The core language of EAI". What Is Apache Camel? Apache Camel offers you the interfaces for the EIPs, the base objects, commonly needed implementations, debugging tools, a configuration system, and many other helpers which will save you a ton of time when you want to implement a solution that follows the EIPs. It's a complete, production-ready framework. But it does not stop at those initially defined 65 patterns. It extends them with over 150 ready-to-use components which solve different problems around endpoints, systems, or technology integration. At a high level, Camel consists of a CamelContext which contains a collection of Component instances. A Component is essentially a factory of Endpoint instances.
You can explicitly configure Component instances in Java code or an IoC container like Spring, Guice or CDI, or they can be auto-discovered using URIs. Why Should A Java EE Developer Care? Enterprise projects require us to do so. Dealing with all sorts of system integrations has always been a challenging topic. You can either choose the complex road by using messaging systems, wiring them into your application and implementing everything yourself, or go the heavyweight road by using different products. I have always been a fan of more pragmatic solutions. And this is what Camel actually is: comparably lightweight, easy to bootstrap, and coming with a huge amount of pre-built integration components which let the developer focus on solving the business requirement behind it, without having to learn new APIs or tooling. Camel comes with either a Java-based fluent API, Spring or Blueprint XML configuration files, and even a Scala DSL. So no matter which base you jump off from, you'll always find something that you already know. How To Get Started? Did I get you? Want to give it a try? That's easy, too. You have different ways according to the frameworks and platform you use. Looking back at the post title, this is going to focus on Java EE. So, the first thing you can do is bootstrap Camel yourself. All you need is the camel-core dependency and the camel-cdi dependency. Setting up a plain Java EE 7 Maven project and adding those two is more than sufficient.

<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-core</artifactId>
    <version>${camel.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-cdi</artifactId>
    <version>${camel.version}</version>
</dependency>

The next thing you need to do is find a place to inject your first CamelContext.

@Inject
CdiCamelContext context;

After everything is injected, you can start adding routes to it.
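As a sketch of that step, a first route could be added from a startup bean. The timer and log endpoints ship with camel-core, but the bean wiring and names here are an assumption for illustration, not code from the original post:

```java
import javax.annotation.PostConstruct;
import javax.ejb.Singleton;
import javax.ejb.Startup;
import javax.inject.Inject;

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.cdi.CdiCamelContext;

@Singleton
@Startup
public class HelloRoute {

    @Inject
    CdiCamelContext context;

    @PostConstruct
    void setup() throws Exception {
        // fire a message every five seconds and log it; both the timer
        // and log components are part of camel-core, so no extra
        // dependencies beyond the two above are needed
        context.addRoutes(new RouteBuilder() {
            @Override
            public void configure() {
                from("timer:hello?period=5000")
                    .setBody().constant("Hello from Camel")
                    .to("log:hello");
            }
        });
    }
}
```

Deployed to a Java EE 7 server with the two dependencies above, a bean along these lines would log a greeting every five seconds, which is enough to confirm the context is wired up before moving on to real endpoints.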
A more complete example can be found in my CamelEE7 project on GitHub. Just fork it and go ahead. This one will work on any Java EE application server. If you are on WildFly already, you can also take full advantage of the WildFly-Camel subsystem. The WildFly Camel Subsystem The strategy of wildfly-camel is that a user can "just use" the Camel core/component APIs in deployments that WildFly supports already. In other words, Camel should "just work" in standard Java EE deployments. The binaries are provided by the platform. The deployment should not need to worry about module/wiring details. Defining and deploying Camel contexts can be done in different ways. You can either define a context directly in your standalone-camel.xml server configuration, or deploy it as part of your web app, either as a single XML file with a predefined -camel-context.xml file suffix or as part of another WildFly-supported deployment as a META-INF/jboss-camel-context.xml file. The WildFly Camel test suite uses the WildFly Arquillian managed container. This can connect to an already running WildFly instance or alternatively start up a standalone server instance when needed. A number of test enrichers have been implemented that allow you to have WildFly Camel specific types injected into your Arquillian test cases; you can inject a CamelContextFactory or a CamelContextRegistry as an @ArquillianResource. If you want to get started with that, you can take a look at my more detailed blog post. Finding Examples If you are excited and got everything up and running, it is time to dig into some examples. The first place to look is the example directory in the distribution. There is an example for everything that you might need. One of the most important use-cases is the tight integration with ActiveMQ. Assume that you have a bunch of JMS messages that need to be converted into files stored in a filesystem: this is a perfect Camel job.
You need to configure the ActiveMQ component in addition to what you've seen above; it allows messages to be sent to a JMS queue or topic, or consumed from a JMS queue or topic, using Apache ActiveMQ. The following code shows you what it takes to route JMS messages from the test.queue queue into the file component, which consumes them and stores them to disk.

context.addRoutes(new RouteBuilder() {
    public void configure() {
        from("test-jms:queue:test.queue").to("file://test");
    }
});

Imagine doing this yourself. Want more sophisticated examples? With Twitter integration? Or other technologies? There are plenty of examples out there to pick from. This is probably one of the most exciting aspects of Camel: it is lightweight, stable, and has been out there for years. Make sure to also follow the mailing lists and the discussion forums. Reference: Lightweight Integration with Java EE and Camel from our JCG partner Markus Eisele at the Java Advent Calendar blog....

Why Technical Resumes Need a Profile (because we’re dumb)

There is significant variation in résumé format across candidates. Name and contact information is always on top, but on any given day a recruiter might see the next section as Education, Skills, Experience, or even (gasp) an Objective. Career length influences which section comes first. Entry-level candidates usually choose Education, while veteran candidates gravitate towards experience and accomplishments. Unfortunately, going from a glance at contact information to dissecting intimate project details doesn't make for a smooth transition. It's jarring. What helps is a section that serves as a buffer, introducing the résumé content while subliminally instructing the reader on what content to pay attention to. And remember the résumé's audience. Most recruiters and HR personnel don't have the background of the candidates they assess, so reviewers benefit from any guidance (even subliminal) provided to help them understand the content. Since few grow up aspiring to the glamorous world of tech recruitment, the industry is typically stocked with C students. Since a résumé "states your case" to employers, let's look at lawyers… The Purpose of Opening Statements When trial attorneys present cases to juries, they don't immediately start questioning witnesses. They start with opening statements. Why? The opening statement informs the jury as to what evidence they will hear during the trial, and why that evidence is significant to the case. The statement is a roadmap on what the attorney wants jurors to listen for (or ignore) during the trial and which elements are critical to the case. It provides background information and announces what is forthcoming. Before trial, jurors know nothing about the case. Without opening statements, jurors may get lost in trivial details while missing the important elements the attorney wants them to hear. Attorneys can't trust a jury's ability to make those decisions independently, so opening statements influence the thought process.
It is paramount that attorneys present their case in a manner consistent with their opening statements. Diversions from that roadmap will cause the jury to distrust the attorney and detract from the attorney's credibility. Back to résumés… Just as jurors know nothing before trial, recruiters know nothing about applicants until they open the résumé. Job seekers today are less likely to provide a cover letter (and recruiters are less likely to read them), and résumés are often given a brief initial screening by untrained eyes. This creates a problem for qualified applicants who may be wrongfully passed over. What is the optimal strategy for expressing experience and ensuring that even novice reviewers will properly identify qualified candidates? An opening statement. The Purpose of the Profile Profiles are the opening statement in a case for an interview, with the résumé content that follows serving as the evidence. The Profile introduces experience to be detailed later in the document, which tacitly baits reviewers into seeking out evidence to specifically support (or refute) those claims. A résumé without a Profile makes no claims to be proven or disproven, and doesn't give the reader any additional instruction on what to seek. When Profile claims are corroborated by details of experience, it results in a "buy" from the reader. The Profile was a promise of sorts, later fulfilled by the supporting evidence. When a Profile doesn't reflect experience, it exposes the candidate as a potential fraud and detracts from any relevant experience the candidate does possess. Qualified candidates with overreaching Profiles put themselves in a precarious situation. Even well-written Profiles are a negative mark on applicants when the claims are inaccurate or unsupported. Just as attorneys must lay out cases in accordance with their opening statements, experience must match Profiles. Typical Profiles Are Noise, Not Signal The overwhelming majority of Profile statements are virtually identical.
Words and phrases like hard-working, intelligent, dedicated, career-minded, innovative, etc. are, in this context, mere self-assessments that are impossible to qualify. It's fluff, and it adds résumé noise that distracts readers' attention from the signal. Writing Profiles Useful Profiles clearly say what you have done and can do, and are ideally quantified for the reader to prevent any misunderstanding. If a temp at a startup is tasked with finding résumés of software engineers with Python and Django experience, he is unlikely to ignore résumés with Profiles stating "Software engineer with six years of experience building solutions with Python and Django". For candidates attempting to transition into new roles that might be less obvious to a reader, a Profile must double as a disguised Objective. These Profiles will first state the candidate's current experience and end with what type of work the applicant seeks. "Systems Administrator with three years of Python and Bash scripting experience seeks transition into dedicated junior DevOps role" provides background as well as future intent, but the last seven words are needed to get the average recruiter's attention. Just as Objectives are altered to match requirements, consider tweaking a Profile to highlight required skills. A candidate who identifies herself as a Mobile Developer in the Profile might not get selected for interview for a Web Developer position, even when the résumé demonstrates all the necessary qualifications. How a candidate self-identifies suggests their career interests, unless stated otherwise (see the paragraph above). Given the importance of Profile/experience agreement, it's suggested that the Profile be written last. Lawyers can't write an opening statement before knowing their case, and candidates should have all of their corroborating evidence in place before attempting to summarize it for clear interpretation.
Conclusion Assume your résumé reviewer knows little about what you do, and that they need to be explicitly told what you do without having to interpret it through your listed experience. Identify yourself in the Profile as closely to the job description (and perhaps even title) as possible. Make sure that all claims made in the Profile are supported by evidence somewhere else in the résumé, ideally early on.Reference: Why Technical Resumes Need a Profile (because we’re dumb) from our JCG partner Dave Fecak at the Job Tips For Geeks blog....

Use Cases for Elasticsearch: Analytics

In the last post in this series we saw how we can use Logstash, Elasticsearch and Kibana for doing logfile analytics. This week we will look at the general capabilities for doing analytics on any data using Elasticsearch and Kibana. Use Case We have already seen that Elasticsearch can be used to store large amounts of data. Instead of putting data into a data warehouse, Elasticsearch can be used to do analytics and reporting. Another use case is social media data: companies can look at what happens with their brand if they have the possibility to easily search it. Data can be ingested from multiple sources, e.g. Twitter and Facebook, and combined in one system. Visualizing data in tools like Kibana can help with exploring large data sets. Finally, mechanisms like Elasticsearch's aggregations can help with finding new ways to look at the data. Aggregations Aggregations provide what the now deprecated facets have been providing, and a lot more. They can combine and count values from different documents and therefore show you what is contained in your data. For example, if you have tweets indexed in Elasticsearch, you can use the terms aggregation to find the most common hashtags. For details on indexing tweets in Elasticsearch see this post on the Twitter River and this post on the Twitter input for Logstash.

curl -XGET "http://localhost:9200/devoxx/tweet/_search" -d'
{
  "aggs" : {
    "hashtags" : {
      "terms" : { "field" : "hashtag.text" }
    }
  }
}'

Aggregations are requested using the aggs keyword; hashtags is a name I have chosen to identify the result, and the terms aggregation counts the different terms for the given field (disclaimer: for a sharded setup the terms aggregation might not be totally exact). This request might result in something like this:

"aggregations": {
  "hashtags": {
    "buckets": [
      {
        "key": "dartlang",
        "doc_count": 229
      },
      {
        "key": "java",
        "doc_count": 216
      },
      [...]

The result is available under the name we have chosen.
Aggregations put the counts into buckets that consist of a value and a count. This is very similar to how faceting works, only the names are different. For this example we can see that there are 229 documents for the hashtag dartlang and 216 containing the hashtag java. This could also be done with facets alone, but there is more: aggregations can even be combined. You can now nest another aggregation in the first one that, for every bucket, will give you more buckets for another criterion.

curl -XGET "http://localhost:9200/devoxx/tweet/_search" -d'
{
  "aggs" : {
    "hashtags" : {
      "terms" : { "field" : "hashtag.text" },
      "aggs" : {
        "hashtagusers" : {
          "terms" : { "field" : "user.screen_name" }
        }
      }
    }
  }
}'

We still request the terms aggregation for the hashtag, but now we have another aggregation embedded: a terms aggregation that processes the user name. This will then result in something like this:

"key": "scala",
"doc_count": 130,
"hashtagusers": {
  "buckets": [
    {
      "key": "jaceklaskowski",
      "doc_count": 74
    },
    {
      "key": "ManningBooks",
      "doc_count": 3
    },
    [...]

We can now see the users that have used a certain hashtag. In this case one user used one hashtag a lot. This is information that is not available that easily with queries and facets alone. Besides the terms aggregation we have seen here, there are also lots of other interesting aggregations available, and more are added with every release. You can choose between bucket aggregations (like the terms aggregation) and metrics aggregations, which calculate values from the buckets, e.g. averages or other statistical values. Visualizing the Data Besides the JSON output we have seen above, the data can also be used for visualizations. This is something that can then be prepared even for a non-technical audience. Kibana is one of the options that is often used for logfile data but can be used for data of all kinds, e.g.
the Twitter data we have already seen above. There are two bar charts that display the term frequencies for the mentions and the hashtags. We can easily see which values are dominant. Also, the date histogram to the right shows at what time most tweets are sent. All in all, these visualizations can provide a lot of value when it comes to trends that are only seen when combining the data. The image shows Kibana 3, which still relies on the facet feature. Kibana 4 will instead provide access to the aggregations. Conclusion This post ends the series on use cases for Elasticsearch. I hope you enjoyed reading it and maybe you learned something new along the way. I can't spend that much time blogging anymore, but new posts will be coming. Keep an eye on this blog. Reference: Use Cases for Elasticsearch: Analytics from our JCG partner Florian Hopf at the Dev Time blog....

RabbitMQ – Processing messages serially using Spring integration Java DSL

If you ever have a need to process messages serially with RabbitMQ with a cluster of listeners processing the messages, the best way that I have seen is to use an "exclusive consumer" flag on a listener, with one thread on each listener processing the messages. The exclusive consumer flag ensures that only one consumer can read messages from the specific queue, and one thread on that consumer ensures that the messages are processed serially. There is a catch, however; I will go over it later. Let me demonstrate this behavior with a Spring Boot and Spring Integration based RabbitMQ message consumer. First, this is the configuration for setting up a queue using Spring Java configuration. Note that since this is a Spring Boot application, it automatically creates a RabbitMQ connection factory when the Spring AMQP library is added to the list of dependencies:

@Configuration
public class RabbitConfig {

    @Autowired
    private ConnectionFactory rabbitConnectionFactory;

    @Bean
    public Queue sampleQueue() {
        return new Queue("sample.queue", true, false, false);
    }
}

Given this sample queue, a listener which gets the messages from this queue and processes them looks like this; the flow is written using the excellent Spring Integration Java DSL library:

@Configuration
public class RabbitInboundFlow {
    private static final Logger logger = LoggerFactory.getLogger(RabbitInboundFlow.class);

    @Autowired
    private RabbitConfig rabbitConfig;

    @Autowired
    private ConnectionFactory connectionFactory;

    @Bean
    public SimpleMessageListenerContainer simpleMessageListenerContainer() {
        SimpleMessageListenerContainer listenerContainer = new SimpleMessageListenerContainer();
        listenerContainer.setConnectionFactory(this.connectionFactory);
        listenerContainer.setQueues(this.rabbitConfig.sampleQueue());
        listenerContainer.setConcurrentConsumers(1);
        listenerContainer.setExclusive(true);
        return listenerContainer;
    }

    @Bean
    public IntegrationFlow inboundFlow() {
        return IntegrationFlows.from(Amqp.inboundAdapter(simpleMessageListenerContainer()))
                .transform(Transformers.objectToString())
                .handle((m) -> {
                    logger.info("Processed {}", m.getPayload());
                })
                .get();
    }
}

The flow is very concisely expressed in the inboundFlow method: a message payload from RabbitMQ is transformed from a byte array to a String and finally processed by simply logging the message to the logs. The important part of the flow is the listener configuration; note the flag which sets the consumer to be an exclusive consumer, and that within this consumer the number of processing threads is set to one. Given this, even if multiple instances of the application are started up, only one of the listeners will be able to connect and process messages. Now for the catch. Consider a case where the processing of a message takes a while to complete and rolls back during processing. If the instance of the application handling the message were stopped in the middle of processing, then a different instance will start handling the messages in the queue; when the stopped instance rolls back the message, the rolled-back message is delivered to the new exclusive consumer, and a message is thus received out of order. If you are interested in exploring this further, here is a github project to play with this feature: https://github.com/bijukunjummen/test-rabbit-exclusive. Reference: RabbitMQ – Processing messages serially using Spring integration Java DSL from our JCG partner Biju Kunjummen at the all and sundry blog....
Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy | Contact
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.