
An open web application framework benchmark

Selecting a platform for your next application development project can be a complex and burdensome undertaking. It can also be very intriguing and a lot of fun. There's a wide range of approaches to take: at one end of the spectrum, The Architect attends conferences, purchases and studies analyst reports from established technology research companies such as Gartner, and bases the evaluation on analyst views. Another approach is to set up a cross-disciplinary evaluation committee that collects a wishlist of platform requirements from around the organization and makes its decision based on a consensus vote. The first approach is very autocratic, while the second can sometimes lead to a lack of focus. A clear, coherent vision of requirements and their prioritization is essential for the success of the evaluation. Due to these problems, a middle road and more pragmatic approach is becoming increasingly popular: a tightly-knit group of senior propellerheads uses a more empirical method of analysing requirements, studies and experiments with potential solution stack elements, and brainstorms to produce a short list of candidates to be validated using hands-on architecture exercises and smell tests. Though hands-on experimentation can lead to better results, the cost of this method can be prohibitive, so often only a handful of solutions that pass the first-phase screening can be evaluated this way.

Platform evaluation criteria depend on the project requirements and may include:

- developer productivity
- platform stability
- roadmap alignment with projected requirements
- tools support
- information security
- strategic partnerships
- developer ecosystem
- existing software license and human capital investments
- etc.

Performance and scalability are often high-priority concerns. They are also among those platform properties that can be formulated into quantifiable criteria, though the key challenge here is how to model the users and implement performance tests that accurately reflect your expected workloads. Benchmarking several different platforms yourself only adds to the cost of the evaluation.

A company called TechEmpower has started a project called TechEmpower Framework Benchmarks, or TFB for short, that aims to compare the performance of different web frameworks. The project publishes benchmark results that application developers can use to make more informed decisions when selecting frameworks. What's particularly interesting about FrameworkBenchmarks is that it's a collaborative effort conducted in an open manner. Development-related discussions take place in an online forum and the source code repository is publicly available on GitHub. Doing test implementation development in the open is important for enabling peer review, and it allows implementations to evolve and improve over time. The project implements performance tests for a wide variety of frameworks, and chances are that the ones you're planning to use are included. If not, you can create your own tests and submit them to be included in the project code base. You can also take the tests and run the benchmarks on your own hardware. Openly published test implementations are not only useful for producing benchmark data, but can also be used by framework developers to communicate framework performance related best practices to application developers. They also allow framework developers to receive reproducible performance benchmarking feedback and data for optimization purposes.
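To make the notion of a test implementation a bit more concrete, here is a rough sketch of what a minimal JSON-producing endpoint could look like in plain Java (servlet API plus Jackson). It is purely illustrative: it is not one of the project's actual test implementations, and the class name and URL pattern are made up.

import java.io.IOException;
import java.util.Collections;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import com.fasterxml.jackson.databind.ObjectMapper;

// Hypothetical, minimal "JSON test" style endpoint, for illustration only.
@WebServlet("/json")
public class JsonServlet extends HttpServlet {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        resp.setContentType("application/json");
        // Instantiate a small object and serialize it to JSON on every request.
        MAPPER.writeValue(resp.getOutputStream(), Collections.singletonMap("message", "Hello, World!"));
    }
}

The real test implementations in the repository are, of course, structured and tuned according to the project's own requirements.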
It's interesting to note that the test implementations have been designed and built by different groups and individuals, and some may have been more rigorously optimized than others. The benchmarks measure the performance of the framework as much as they measure the test implementation, and in some cases a suboptimal test implementation will result in poor overall performance. Framework torchbearers are expected to take their best shot at optimizing the test implementation, so the implementations should eventually converge toward optimal solutions, given enough active framework pundits.

Test types

In the project's parlance, the combination of programming language, framework and database used is termed a "framework permutation", or just permutation, and some test types have been implemented in 100+ different permutations. The different test types include:

- JSON serialization: "test framework fundamentals including keep-alive support, request routing, request header parsing, object instantiation, JSON serialization, response header generation, and request count throughput."
- Single database query: "exercise the framework's object-relational mapper (ORM), random number generator, database driver, and database connection pool."
- Multiple database queries: "This test is a variation of Test #2 and also uses the World table. Multiple rows are fetched to more dramatically punish the database driver and connection pool. At the highest queries-per-request tested (20), this test demonstrates all frameworks' convergence toward zero requests-per-second as database activity increases."
- Fortunes: "This test exercises the ORM, database connectivity, dynamic-size collections, sorting, server-side templates, XSS countermeasures, and character encoding."
- Database updates: "This test is a variation of Test #3 that exercises the ORM's persistence of objects and the database driver's performance at running UPDATE statements or similar. The spirit of this test is to exercise a variable number of read-then-write style database operations."
- Plaintext: "This test is an exercise of the request-routing fundamentals only, designed to demonstrate the capacity of high-performance platforms in particular. The response payload is still small, meaning good performance is still necessary in order to saturate the gigabit Ethernet of the test environment."

Notes on Round 9 results

Currently, the latest benchmark is Round 9 and the result data is published on the project web page. The data is not available in machine-readable form and it can't be sorted by column for analysing patterns. It can, however, be imported into a spreadsheet program fairly easily, so I took the data and analyzed it a bit. Some interesting observations could be made just by looking at the raw data. In addition to comparing throughput, it's also interesting to compare how well frameworks scale. One way of quantifying scalability is to take test implementation throughput figures for the lowest and highest concurrency level (for test types 1, 2, 4 and 6) per framework and plot them on a 2-D plane. A line can then be drawn between these two points, with the slope characterizing scalability. Well-scaling test implementations would be expected to have a positive, steep slope for test types 1, 2, 4 and 6, whereas for test types 3 and 5 the slope is expected to be negative. This model is not entirely without problems, since the scalability rating is not relative to the throughput, so e.g. a poorly performing framework can end up having a great scalability rating.
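As a rough illustration of the slope idea, the calculation could look like the following sketch. The class, method and throughput figures are made up for the example; they are not taken from the Round 9 data.

// Minimal sketch of the scalability "slope" described above (hypothetical numbers).
public class ScalabilitySlope {

    static double slope(int lowConcurrency, double lowThroughput, int highConcurrency, double highThroughput) {
        // Rise over run between the lowest and highest concurrency data points.
        return (highThroughput - lowThroughput) / (highConcurrency - lowConcurrency);
    }

    public static void main(String[] args) {
        // Placeholder figures: requests per second at concurrency levels 8 and 256.
        double frameworkA = slope(8, 50_000, 256, 900_000); // high throughput, steep slope
        double frameworkB = slope(8, 200, 256, 9_000);      // low throughput, yet still a positive slope
        System.out.printf("A: %.1f, B: %.1f%n", frameworkA, frameworkB);
    }
}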
As a result, you'd have to look at these figures together. To better visualize throughput against concurrency level ("Peak Hosting" environment data), I created a small web app that's available at http://tfb-kippo.rhcloud.com/ (the app is subject to removal without notice).

JSON serialization

The JSON serialization test aims to measure framework overhead. One could argue that it's a bit of a micro benchmark, but it should demonstrate how well the framework does with basic tasks like request routing, JSON serialization and response generation. The top 10 frameworks were based on the following programming languages: C++, Java, Lua, Ur and Go. The C++ based CPPSP was the clear winner, while the next 6 contestants were Java-based. No database is used in this test type. The top 7 frameworks with the highest throughput also have the highest scalability rating. After that, both figures start declining fairly rapidly. This is a very simple test, and it's a bit of a surprise to see such large variation in the results. In their commentary, TechEmpower attributes some of the differences to how well frameworks work on a NUMA-based system architecture. Quite a few frameworks are Java or JVM based, and rather large variations exist even within this group, so clearly neither the language nor the JVM is a limiting factor here. I was surprised by the Node.js and HHVM rankings. Unfortunately, the Scala-based Spray test implementation, as well as the JVM-based polyglot framework Vert.x implementation, were removed due to being outdated. I hope to see these included in a future benchmark round.

Single database query

This test type measures database access throughput and parallelizability. Again, a surprisingly large spread in performance can be observed for a fairly trivial test case. This would seem to suggest that framework or database access method overhead contributes significantly to the results. Is the database access technology (DB driver or ORM) the bottleneck, or is it the backend system? It would be interesting to look at the system activity reports from the test runs to analyze potential bottlenecks in more detail. Before seeing the results I would have expected the DB backend to be the bottleneck, but this doesn't appear to be clear-cut, given that the top as well as many of the bottom performing test implementations use the same DB. It was interesting to note that the top six test implementations use a relational database, with the first NoSQL-based implementation taking 7th place. This test runs DB read statements by ID, which NoSQL databases should be very good at. The top 10 performing frameworks were based on the Java, C++, Lua and PHP languages and use the MySQL, PostgreSQL and MongoDB databases. The Java-based Gemini leads, with CPPSP second. Both use the MySQL DB. The Spring-based test implementation's performance was a bit of a disappointment.

Multiple database queries

Where the previous test exercised a single database query per request, this test performs a variable number of database queries per request. Again, I would have assumed this test would measure backend database performance more than framework performance, but it seems that framework and database access method overhead can also contribute significantly. The top two performers in this test are Dart-based implementations that use MongoDB. The top 10 frameworks in this test are based on the Dart, Java, Clojure, PHP and C# languages and use the MongoDB and MySQL databases.
Fortunes

This is the most complex test, aiming to exercise the full framework stack from request routing through business logic execution, database access, templating and response generation. The top 10 frameworks are based on the C++, Java, Ur, Scala and PHP languages, with the full spectrum of databases being used (MySQL, PostgreSQL and MongoDB).

Database updates

In addition to reads, this test exercises database updates as well. HHVM wins this test, with 3 Node.js based frameworks coming next. Similar to the Single database query test, the top 13 implementations work with the relational MySQL DB, ahead of the NoSQL implementations. This test exercises simple read and write data access by ID which, again, should be one of the NoSQL databases' strong points. The top 10 performing frameworks were based on the PHP, JavaScript, Scala, Java and Go languages, all of which use the MySQL database.

Plaintext

The aim of this test is to measure how well the framework performs under extreme load conditions and massive client parallelism. Since there are no backend system dependencies involved, this test measures platform and framework concurrency limits. Throughput plateaus or starts degrading with the top-performing frameworks in this test before the client concurrency level reaches its maximum value, which seems to suggest that a bottleneck is being hit somewhere in the test setup, presumably hardware, OS and/or framework concurrency. Many frameworks are at their best with a concurrency level of 256, except CPPSP, which peaks at 1,024. CPPSP is the only one of the top-performing implementations that is able to significantly improve its performance as the concurrency level increases from 256, but even with CPPSP, throughput actually starts dropping after the concurrency level hits the 4,096 mark. Only 12 test implementations are able to exceed 1 M requests per second. Some well-known platforms, e.g. Spring, did surprisingly poorly. There seems to be something seriously wrong with the HHVM test run, as it generates only tens of responses per second at concurrency levels 256 and 1,024. The top 10 frameworks are based on the C++, Java, Scala and Lua languages. No database is used in this test.

Benchmark repeatability

In the scientific world, research must be repeatable in order to be credible. Similarly, the benchmark test methodology and relevant circumstances should be documented to make the results repeatable and credible. There are a few details that could be documented to improve repeatability. The benchmarking project source code doesn't seem to be tagged. Tagging would be essential for making benchmarks repeatable. A short description of the hardware and some other test environment parameters is available on the benchmark project web site. However, the environment setup (hardware + software) is expected to change over time, so this information should be documented per round. Also, neither the Linux distribution minor release nor the exact Linux kernel version appears to be identified. Detailed data about what goes on inside the servers could be published, so that outsiders could analyze the benchmark results in a more meaningful way. System activity reports, e.g. system resource usage (CPU, memory, IO), can provide valuable clues to possible scalability issues. Also, application, framework, database and other logs can be useful to test implementers. Resin was chosen as the Java application server over Apache Tomcat and other servlet containers for performance reasons.
While I'm not contesting this statement, there wasn't any mention of the software versions, and since performance attributes tend to change over time between releases, this premise is not repeatable. Neither the exact JVM version nor the JVM arguments are documented for JVM-based test implementation execution. Default JVM arguments are used if test implementations don't override the settings. Since the test implementations have very similar execution profiles by definition, it could be beneficial to explicitly configure and share some JVM flags that are commonly used with server-side applications. Also, due to JVM ergonomics, different GC parameters can be automatically selected based on the underlying server capacity and JVM version. Documenting these parameters per benchmark round would help with repeatability. Perhaps all the middleware software versions could be logged during test execution and the full test run logs could be made available.

A custom test implementation: Asynchronous Java + NoSQL DB

Since I've recently worked on implementing RESTful services based on the JAX-RS 2 API with asynchronous processing (based on the Jersey 2 implementation) and the Apache Cassandra NoSQL database, I got curious about how this combination would perform against the competition, so I started coding my own test implementation. I decided to drop JAX-RS in this case, however, to eliminate any non-essential abstraction layers that might have a negative impact on performance. One of the biggest hurdles in getting started with test development was that, at the time I started my project, there wasn't a way to test-run platform installation scripts in smaller pieces; you had to run the full installation, which took a very long time. Fortunately, since then the framework installation procedure has been compartmentalized, so it's possible to install just the framework that you're developing tests for. Also, the project has recently added support for a fully automated development environment setup with Vagrant, which is a great help. Another excellent addition is the Travis CI integration, which allows test implementation developers to gain additional assurance that their code is working as expected outside their sandbox as well. Unfortunately, Travis builds can take a very long time, so you might need to disable some of the tests that you're not actively working on. The Travis CI environment is also a bit different from the developer and actual benchmarking environments, so you could bump into issues with Travis builds that don't occur in the development environment, and vice versa. Travis build failures can sometimes be very obscure and tricky to troubleshoot. The actual test implementation code is easy enough to develop and test in isolation, outside of the real benchmark environment, but if you're adding support for new platform components such as databases, or testing platform installation scripts, it's easiest if you have an environment that's a close replica of the actual benchmarking environment. In this case, adding support for a new database involved creating a new DB schema, generating test data, and automating the database installation and configuration. Implementing the actual test permutation turned out to be interesting, but surprisingly laborious as well. I started occasionally seeing strange error responses when benchmarking my test implementation with ab and wrk, especially under higher loads.
TFB executes Java-based test implementations in the Resin web container, and after a while of puzzlement about the errors, I decided to test the code in other web containers, namely Tomcat and Jetty. It turned out that I had bumped into 1 Resin bug (5776) and 2 Tomcat bugs (56736, 56739) related to servlet asynchronous processing support. Architecturally, test types 1 and 6 have been implemented using the traditional synchronous Servlet API, while the rest of the test implementations leverage non-blocking request handling through Servlet 3 asynchronous processing support. The test implementations store their data in the Apache Cassandra 2 NoSQL database, which is accessed using the DataStax Java Driver. Asynchronous processing is also used in the data access tier in order to minimize resource consumption. JSON data is processed with the Jackson JSON library. In Java versions predating version 8, asynchronous processing requires passing around callbacks in the form of anonymous classes, which can at times be a bit high-ceremony syntactically. Java 8 lambda expressions do away with some of the ceremonial overhead, but unfortunately TFB doesn't yet fully support the latest Java version. I had previously used the JAX-RS 2 asynchronous processing API, but not the Servlet 3 async API. One thing I noticed during the test implementation was that the mechanism provided by the Servlet 3 async API for generating an error response to the client is much lower level, less intuitive and more cumbersome than its JAX-RS async counterpart. The test implementation code has been merged into the FrameworkBenchmarks code base, so it should be benchmarked in the next round. The code can be found here: https://github.com/TechEmpower/FrameworkBenchmarks/tree/master/frameworks/Java/servlet3-cass

Conclusions

TechEmpower's Framework Benchmarks is a really valuable contribution to the web framework developer and user community. It holds great potential for enabling friendly competition between framework developers, as well as framework users, and thus driving up the performance of popular frameworks and the adoption of framework performance best practices. As always, there's room for improvement. Some areas, from a framework user and test implementer point of view, include: make the benchmark tests and results more repeatable, publish raw benchmark data for analysis purposes, and work on making test development and adding new framework components even easier. Good job TFB team + contributors – can't wait to see the Round 10 benchmark data!

Reference: An open web application framework benchmark from our JCG partner Marko Asplund at the practicing techie blog.

Runtime Class Loading to Support a Changing API

I maintain an IntelliJ plugin that improves the experience of writing Spock specifications. A challenge of this project is supporting multiple, incompatible IntelliJ API versions in a single codebase. The solution is simple in retrospect (it's an example of the adapter pattern in the wild), but it originally took a bit of thought and example hunting. I was in the code again today to fix support for a new version, and I decided to document how I originally solved the problem. The fundamental issue is that my compiled code could be loaded in a JVM runtime environment with any of several different API versions present. My solution was to break up the project into four parts:

- A main project that doesn't depend on any varying API calls and is therefore compatible across all API versions. The main project also has code that loads the appropriate adapter implementation based on the runtime environment it finds itself in. In this case, I'm able to take advantage of the IntelliJ PicoContainer for service lookup, but the reflection API or dependency injection also have what's needed.
- A set of abstract adapters that provide an API for the main project to use. This project also doesn't depend on any code that varies across API versions.
- Sets of classes that implement the abstract adapters for each supported API version. Each set of adapters wraps changing API calls and is compiled against a specific API version.

The simplest case to deal with is a refactor where something in the API moves. This is also what actually broke this last version. My main code needs the Groovy instance of com.intellij.lang.Language. This instance moved in IntelliJ 14. This code was constant until 14, so in this case I'm adding a new adapter. In the adapter module, I have an abstract class LanguageLookup.java:

package com.cholick.idea.spock;

import com.intellij.lang.Language;
import com.intellij.openapi.components.ServiceManager;

public abstract class LanguageLookup {
    public static LanguageLookup getInstance() {
        return ServiceManager.getService(LanguageLookup.class);
    }

    public abstract Language groovy();
}

The lowest IntelliJ API version that I support is 11. Looking up the Groovy language instance is constant across 11-13, so the first concrete adapter lives in the module compiled against the IntelliJ 11 API. LanguageLookup11.java:

package com.cholick.idea.spock;

import com.intellij.lang.Language;
import org.jetbrains.plugins.groovy.GroovyFileType;

public class LanguageLookup11 extends LanguageLookup {
    public Language groovy() {
        return GroovyFileType.GROOVY_LANGUAGE;
    }
}

The newest API introduced the breaking change, so a second concrete adapter lives in a module compiled against version 14 of their API.
LanguageLookup14.java:

package com.cholick.idea.spock;

import com.intellij.lang.Language;
import org.jetbrains.plugins.groovy.GroovyLanguage;

public class LanguageLookup14 extends LanguageLookup {
    public Language groovy() {
        return GroovyLanguage.INSTANCE;
    }
}

Finally, the main project has a class SpockPluginLoader.java that registers the proper adapter class based on the runtime API that's loaded (I omitted several methods not specifically relevant to the example):

package com.cholick.idea.spock.adapter;

import com.cholick.idea.spock.LanguageLookup;
import com.cholick.idea.spock.LanguageLookup11;
import com.cholick.idea.spock.LanguageLookup14;
import com.intellij.openapi.application.ApplicationInfo;
import com.intellij.openapi.components.ApplicationComponent;
import com.intellij.openapi.components.impl.ComponentManagerImpl;
import org.jetbrains.annotations.NotNull;
import org.picocontainer.MutablePicoContainer;

public class SpockPluginLoader implements ApplicationComponent {
    private ComponentManagerImpl componentManager;

    SpockPluginLoader(@NotNull ComponentManagerImpl componentManager) {
        this.componentManager = componentManager;
    }

    @Override
    public void initComponent() {
        MutablePicoContainer picoContainer = componentManager.getPicoContainer();
        registerLanguageLookup(picoContainer);
    }

    private void registerLanguageLookup(MutablePicoContainer picoContainer) {
        if (isAtLeast14()) {
            picoContainer.registerComponentInstance(LanguageLookup.class.getName(), new LanguageLookup14());
        } else {
            picoContainer.registerComponentInstance(LanguageLookup.class.getName(), new LanguageLookup11());
        }
    }

    private IntelliJVersion getVersion() {
        int version = ApplicationInfo.getInstance().getBuild().getBaselineVersion();
        if (version >= 138) {
            return IntelliJVersion.V14;
        } else if (version >= 130) {
            return IntelliJVersion.V13;
        } else if (version >= 120) {
            return IntelliJVersion.V12;
        }
        return IntelliJVersion.V11;
    }

    private boolean isAtLeast14() {
        return getVersion().compareTo(IntelliJVersion.V14) >= 0;
    }

    enum IntelliJVersion {
        V11, V12, V13, V14
    }
}

Finally, in code where I need the Groovy com.intellij.lang.Language, I get a hold of the LanguageLookup service and call its groovy method:

...
Language groovy = LanguageLookup.getInstance().groovy();
if (PsiUtilBase.getLanguageAtOffset(file, offset).isKindOf(groovy)) {
...

This solution allows the same compiled plugin JAR to support IntelliJ's varying API across versions 11-14. I imagine that Android developers commonly implement solutions like this, but it's something I'd never had to write as a web application developer.

Reference: Runtime Class Loading to Support a Changing API from our JCG partner Matt Cholick at the Cholick.com blog.

Friday-Benchmarking Functional Java

Let's imagine our product owner goes crazy one day and asks you to do the following: from a set of Strings as follows:

"marco_8", "john_33", "marco_1", "john_33", "thomas_5", "john_33", "marco_4", ...

give me a comma-separated String with only marco's numbers, and the numbers need to be in order. Example of the expected result: "1,4,8"

I will implement this logic in 4 distinct ways and I will micro-benchmark each one of them. The ways I'm going to implement the logic are:

- Traditional Java with loops and all.
- Functional with Guava
- Functional with Java 8 stream
- Functional with Java 8 parallelStream

Code is below or in the gist:

package com.marco.brownbag.functional;

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

import com.google.common.base.Function;
import com.google.common.base.Joiner;
import com.google.common.base.Predicates;
import com.google.common.collect.Collections2;
import com.google.common.collect.Ordering;

public class MicroBenchMarkFunctional {

    // number of names to generate; vary this to reproduce the different runs below
    private static final int totStrings = 2;

    public static void main(String[] args) {
        Set<String> someNames = new HashSet<String>();
        init(someNames);
        for (int i = 1; i < totStrings; i++) {
            someNames.add("marco_" + i);
            someNames.add("someone_else_" + i);
        }
        System.out.println("start");
        run(someNames);
    }

    private static void run(Set<String> someNames) {
        System.out.println("========================");
        long start = System.nanoTime();
        int totalLoops = 20;
        for (int i = 1; i < totalLoops; i++) {
            classic(someNames);
        }
        System.out.println("Classic         : " + ((System.nanoTime() - start)) / totalLoops);

        start = System.nanoTime();
        for (int i = 1; i < totalLoops; i++) {
            guava(someNames);
        }
        System.out.println("Guava           : " + ((System.nanoTime() - start)) / totalLoops);

        start = System.nanoTime();
        for (int i = 1; i < totalLoops; i++) {
            stream(someNames);
        }
        System.out.println("Stream          : " + ((System.nanoTime() - start)) / totalLoops);

        start = System.nanoTime();
        for (int i = 1; i < totalLoops; i++) {
            parallelStream(someNames);
        }
        System.out.println("Parallel Stream : " + ((System.nanoTime() - start)) / totalLoops);

        System.out.println("========================");
    }

    private static void init(Set<String> someNames) {
        someNames.add("marco_1");
        classic(someNames);
        guava(someNames);
        stream(someNames);
        parallelStream(someNames);
        someNames.clear();
    }

    private static String stream(Set<String> someNames) {
        return someNames.stream().filter(element -> element.startsWith("m")).map(element -> element.replaceAll("marco_", "")).sorted()
                .collect(Collectors.joining(","));
    }

    private static String parallelStream(Set<String> someNames) {
        return someNames.parallelStream().filter(element -> element.startsWith("m")).map(element -> element.replaceAll("marco_", "")).sorted()
                .collect(Collectors.joining(","));
    }

    private static String guava(Set<String> someNames) {
        return Joiner.on(',').join(
                Ordering.from(String.CASE_INSENSITIVE_ORDER).immutableSortedCopy(
                        Collections2.transform(Collections2.filter(someNames, Predicates.containsPattern("marco")), REPLACE_MARCO)));
    }

    private static Function<String, String> REPLACE_MARCO = new Function<String, String>() {
        @Override
        public String apply(final String element) {
            return element.replaceAll("marco_", "");
        }
    };

    private static String classic(Set<String> someNames) {
        List<String> namesWithM = new ArrayList<String>();
        for (String element : someNames) {
            if (element.startsWith("m")) {
                namesWithM.add(element.replaceAll("marco_", ""));
            }
        }
        Collections.sort(namesWithM);
        StringBuilder commaSeparetedString = new StringBuilder();
        Iterator<String> namesWithMIterator = namesWithM.iterator();
        while (namesWithMIterator.hasNext()) {
            commaSeparetedString.append(namesWithMIterator.next());
            if (namesWithMIterator.hasNext()) {
                commaSeparetedString.append(",");
            }
        }
        return commaSeparetedString.toString();
    }
}

Two points before we dig into performance:

- Forget about the init() method; that one is just there to initialize objects in the JVM, otherwise the numbers are just crazy.
- The Java 8 functional style looks nicer and cleaner than Guava and than developing in a traditional way!

Performance: running that program on my Mac with 4 cores, the result is the following:

========================
Classic         : 151941400
Guava           : 238798150
Stream          : 151853850
Parallel Stream : 55724700
========================

Parallel Stream is 3 times faster. This is because Java will split the job into multiple tasks (the total number of tasks depends on your machine, cores, etc.) and run them in parallel, aggregating the result at the end. Classic Java and the Java 8 stream have more or less the same performance. Guava is the loser. That is amazing, so someone could think: "cool, I can just always use parallelStream and I will have a big bonus at the end of the year." But life is never easy. Here is what happens when you reduce that Set of strings from 200,000 to 20:

========================
Classic         : 36950
Guava           : 69650
Stream          : 29850
Parallel Stream : 143350
========================

Parallel Stream became damn slow. This is because parallelStream has a big overhead in terms of initializing and managing multitasking and assembling back the results. The Java 8 stream now looks like the winner compared to the other 2. Ok, at this point, someone could say something like: "for collections with lots of elements I use parallelStream, otherwise I use stream." That would be nice and simple, but what happens when I reduce that Set again from 20 to 2?
This:

========================
Classic         : 8500
Guava           : 20050
Stream          : 24700
Parallel Stream : 67850
========================

Classic Java loops are faster with very few elements. So at this point I can go back to my crazy product owner and ask how many Strings he expects to have in that input collection. 20? Less? More? Much more? Like the carpenter says: measure twice, cut once!!

Reference: Friday-Benchmarking Functional Java from our JCG partner Marco Castigliego at the Remove duplication and fix bad names blog.

Why You Should NOT Implement Layered Architecture

Abstraction layers in software are what architecture astronauts tell you to do. Instead, however, half of all applications out there would be so easy, fun, and most importantly: productive to implement if you just got rid of all those layers. Frankly, what do you really need? You need these two:

- Some data access
- Some UI

Because that's the two things that you inevitably have in most systems. Users, and data. Here's Kyle Boon's opinion on the possible choices that you may have:

"Really enjoying #ratpack and #jooq." — Kyle Boon (@kyleboon) September 2, 2014

Very nice choice, Kyle. Ratpack and jOOQ. You could choose any other APIs, of course. You could even choose to write JDBC directly in JSP. Why not. As long as you don't go pile up 13 layers of abstraction. That's all bollocks, you're saying? We need layers to abstract away the underlying implementation so we can change it? OK, let's give this some serious thought. How often do you really change the implementation? Some examples:

- SQL. You hardly change the implementation from Oracle to DB2.
- DBMS. You hardly change the model from relational to flat or XML or JSON.
- JPA. You hardly switch from Hibernate to EclipseLink.
- UI. You simply don't replace HTML with Swing.
- Transport. You just don't switch from HTTP to SOAP.
- Transaction layer. You just don't substitute JavaEE with Spring, or JDBC transactions.

Nope. Your architecture is probably set in stone. And if – by the incredible influence of entropy and fate – you happen to have made the wrong decision in one aspect, about 3 years ago, well you're in for a major refactoring anyway. If SQL was the wrong choice, well good luck to you migrating everything to MongoDB (which is per se the wrong choice again, so prepare for migrating back). If HTML was the wrong choice, well even more tough luck to you. Likelihood of your layers not really helping you when a concrete incident happens: 95% (because you missed an important detail).

Layers = Insurance

If you're still thinking about implementing an extremely nice layered architecture, ready to deal with pretty much every situation where you simply switch a complete stack with another, then what you're really doing is filing a dozen insurance policies. Think about it this way. You can get:

- Legal insurance
- Third party insurance
- Reinsurance
- Business interruption insurance
- Business overhead expense disability insurance
- Key person insurance
- Shipping insurance
- War risk insurance
- Payment protection insurance
- … pick a random category

You can pay and pay and pay in advance for things that probably won't ever happen to you. Will they? Yeah, they might. But if you buy all that insurance, you pay heavily up front. And let me tell you a secret. IF any incident ever happens, chances are that you:

- Didn't buy that particular insurance
- Aren't covered appropriately
- Didn't read the policy
- Got screwed

And you're doing exactly that in every application that would otherwise already be finished and would already be adding value to your customer, while you're still debating if on layer 37 between the business rules and transformation layers, you actually need another abstraction because the rule engine could be switched any time.

Stop doing that

You get the point. If you have infinite amounts of time and money, implement an awesome, huge architecture up front. Your competitor's time to market (and fun, on the way) is better than yours.
But for a short period of time, you were that close to the perfect, layered architecture!

Reference: Why You Should NOT Implement Layered Architecture from our JCG partner Lukas Eder at the JAVA, SQL, AND JOOQ blog.

When the Java 8 Streams API is not Enough

Java 8 was – as always – a release of compromises and backwards-compatibility. A release where the JSR-335 expert group might not have agreed upon the scope or feasibility of certain features with some of the audience. See some concrete explanations by Brian Goetz about why:

- … "final" is not allowed in Java 8 default methods
- … "synchronized" is not allowed in Java 8 default methods

But today we're going to focus on the Streams API's "shortcomings", or as Brian Goetz would probably put it: things out of scope given the design goals.

Parallel Streams?

Parallel computing is hard, and it used to be a pain. People didn't exactly love the new (now old) Fork/Join API when it was first shipped with Java 7. Conversely, and clearly, the conciseness of calling Stream.parallel() is unbeatable. But many people don't actually need parallel computing (not to be confused with multi-threading!). In 95% of all cases, people would have probably preferred a more powerful Streams API, or perhaps a generally more powerful Collections API with lots of awesome methods on various Iterable subtypes. Changing Iterable is dangerous, though. Even such a no-brainer as transforming an Iterable into a Stream via a potential Iterable.stream() method seems to risk opening Pandora's box!

Sequential Streams!

So if the JDK doesn't ship it, we create it ourselves! Streams are quite awesome per se. They're potentially infinite, and that's a cool feature. Mostly – and especially with functional programming – the size of a collection doesn't really matter that much, as we transform element by element using functions. If we admit Streams to be purely sequential, then we could have any of these pretty cool methods as well (some of which would also be possible with parallel Streams):

- cycle() – a guaranteed way to make every stream infinite
- duplicate() – duplicate a stream into two equivalent streams
- foldLeft() – a sequential and non-associative alternative to reduce()
- foldRight() – a sequential and non-associative alternative to reduce()
- limitUntil() – limit the stream to those records before the first one to satisfy a predicate
- limitWhile() – limit the stream to those records before the first one not to satisfy a predicate
- maxBy() – reduce the stream to the maximum mapped value
- minBy() – reduce the stream to the minimum mapped value
- partition() – partition a stream into two streams, one satisfying a predicate and the other not satisfying the same predicate
- reverse() – produce a new stream in inverse order
- skipUntil() – skip records until a predicate is satisfied
- skipWhile() – skip records as long as a predicate is satisfied
- slice() – take a slice of the stream, i.e. combine skip() and limit()
- splitAt() – split a stream into two streams at a given position
- unzip() – split a stream of pairs into two streams
- zip() – merge two streams into a single stream of pairs
- zipWithIndex() – merge a stream with its corresponding stream of indexes into a single stream of pairs (a plain-JDK sketch of this one follows below)

jOOλ's new Seq type does all that

All of the above is part of jOOλ. jOOλ (pronounced "jewel", or "dju-lambda", also written jOOL in URLs and such) is an ASL 2.0 licensed library that emerged from our own development needs when implementing jOOQ integration tests with Java 8. Java 8 is exceptionally well-suited for writing tests that reason about sets, tuples, records, and all things SQL.
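To get a feeling for why these methods are useful, here is roughly what one of them, zipWithIndex(), looks like when built with nothing but the plain JDK Streams API. This is a quick sketch for illustration, not jOOλ's actual implementation, and it is only correct for sequential streams because of the mutable counter.

import java.util.AbstractMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Stream;

public final class ZipWithIndexSketch {

    // Pair every element with its position in the stream.
    static <T> Stream<Map.Entry<T, Long>> zipWithIndex(Stream<T> stream) {
        AtomicLong index = new AtomicLong();
        return stream.sequential().map(t -> new AbstractMap.SimpleEntry<>(t, index.getAndIncrement()));
    }

    public static void main(String[] args) {
        // Prints: a=0, b=1, c=2
        zipWithIndex(Stream.of("a", "b", "c")).forEach(System.out::println);
    }
}

With jOOλ, the same thing is a single call on Seq, as the examples further down show.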
But the Streams API just slightly feels insufficient, so we have wrapped the JDK's Streams into our own Seq type (Seq for sequence / sequential Stream):

// Wrap a stream in a sequence
Seq<Integer> seq1 = seq(Stream.of(1, 2, 3));

// Or create a sequence directly from values
Seq<Integer> seq2 = Seq.of(1, 2, 3);

We've made Seq a new interface that extends the JDK Stream interface, so you can use Seq fully interoperably with other Java APIs – leaving the existing methods unchanged:

public interface Seq<T> extends Stream<T> {

    /**
     * The underlying {@link Stream} implementation.
     */
    Stream<T> stream();

    // [...]
}

Now, functional programming is only half the fun if you don't have tuples. Unfortunately, Java doesn't have built-in tuples, and while it is easy to create a tuple library using generics, tuples are still second-class syntactic citizens when comparing Java to Scala, for instance, or C# and even VB.NET. Nonetheless…

jOOλ also has tuples

We've run a code generator to produce tuples of degree 1-8 (we might add more in the future, e.g. to match Scala's and jOOQ's "magical" degree 22). And if a library has such tuples, the library also needs corresponding functions. The essence of these TupleN and FunctionN types is summarised as follows:

public class Tuple3<T1, T2, T3> implements Tuple, Comparable<Tuple3<T1, T2, T3>>, Serializable, Cloneable {

    public final T1 v1;
    public final T2 v2;
    public final T3 v3;

    // [...]
}

and

@FunctionalInterface
public interface Function3<T1, T2, T3, R> {

    default R apply(Tuple3<T1, T2, T3> args) {
        return apply(args.v1, args.v2, args.v3);
    }

    R apply(T1 v1, T2 v2, T3 v3);
}

There are many more features in the Tuple types, but let's leave them out for today. On a side note, I've recently had an interesting discussion with Gavin King (the creator of Hibernate) on reddit. From an ORM perspective, Java classes seem like a suitable implementation for SQL / relational tuples, and they are indeed. From an ORM perspective. But classes and tuples are fundamentally different, which is a very subtle issue with most ORMs – e.g. as explained here by Vlad Mihalcea. Besides, SQL's notion of row value expressions (i.e. tuples) is quite different from what can be modelled with Java classes. This topic will be covered in a subsequent blog post.

Some jOOλ examples

With the aforementioned goals in mind, let's see how the above API can be put to work by example.

Zipping

// (tuple(1, "a"), tuple(2, "b"), tuple(3, "c"))
Seq.of(1, 2, 3).zip(Seq.of("a", "b", "c"));

// ("1:a", "2:b", "3:c")
Seq.of(1, 2, 3).zip(
    Seq.of("a", "b", "c"),
    (x, y) -> x + ":" + y
);

// (tuple("a", 0), tuple("b", 1), tuple("c", 2))
Seq.of("a", "b", "c").zipWithIndex();

// tuple((1, 2, 3), (a, b, c))
Seq.unzip(Seq.of(
    tuple(1, "a"),
    tuple(2, "b"),
    tuple(3, "c")
));

This is already a case where tuples have become very handy. When we "zip" two streams into one, we want a wrapper value type that combines both values. Classically, people might have used Object[] for quick-and-dirty solutions, but an array doesn't indicate attribute types or degree. Unfortunately, the Java compiler cannot reason about the effective bound of the <T> type in Seq<T>. This is why we can only have a static unzip() method (instead of an instance one), whose signature looks like this:

// This works
static <T1, T2> Tuple2<Seq<T1>, Seq<T2>> unzip(Stream<Tuple2<T1, T2>> stream) { ... }

// This doesn't work:
interface Seq<T> extends Stream<T> {
    Tuple2<Seq<???>, Seq<???>> unzip();
}

Skipping and limiting

// (3, 4, 5)
Seq.of(1, 2, 3, 4, 5).skipWhile(i -> i < 3);

// (3, 4, 5)
Seq.of(1, 2, 3, 4, 5).skipUntil(i -> i == 3);

// (1, 2)
Seq.of(1, 2, 3, 4, 5).limitWhile(i -> i < 3);

// (1, 2)
Seq.of(1, 2, 3, 4, 5).limitUntil(i -> i == 3);

Other functional libraries probably use different terms than skip (e.g. drop) and limit (e.g. take). It doesn't really matter in the end. We opted for the terms that are already present in the existing Stream API: Stream.skip() and Stream.limit().

Folding

// "abc"
Seq.of("a", "b", "c").foldLeft("", (u, t) -> t + u);

// "cba"
Seq.of("a", "b", "c").foldRight("", (t, u) -> t + u);

The Stream.reduce() operations are designed for parallelisation. This means that the functions passed to it must have these important attributes:

- Associativity
- Non-interference
- Statelessness

But sometimes, you really want to "reduce" a stream with functions that do not have the above attributes, and consequently, you probably don't care about the reduction being parallelisable. This is where "folding" comes in. A nice explanation of the various differences between reducing and folding (in Scala) can be seen here.

Splitting

// tuple((1, 2, 3), (1, 2, 3))
Seq.of(1, 2, 3).duplicate();

// tuple((1, 3, 5), (2, 4, 6))
Seq.of(1, 2, 3, 4, 5, 6).partition(i -> i % 2 != 0);

// tuple((1, 2), (3, 4, 5))
Seq.of(1, 2, 3, 4, 5).splitAt(2);

The above functions all have one thing in common: they operate on a single stream in order to produce two new streams that can be consumed independently. Obviously, this means that internally, some memory must be consumed to keep buffers of partially consumed streams. E.g.

- duplication needs to keep track of all values that have been consumed in one stream, but not in the other
- partitioning needs to fast-forward to the next value that satisfies (or doesn't satisfy) the predicate, without losing all the dropped values
- splitting might need to fast-forward to the split index

For some real functional fun, let's have a look at a possible splitAt() implementation:

static <T> Tuple2<Seq<T>, Seq<T>> splitAt(Stream<T> stream, long position) {
    return seq(stream)
        .zipWithIndex()
        .partition(t -> t.v2 < position)
        .map((v1, v2) -> tuple(
            v1.map(t -> t.v1),
            v2.map(t -> t.v1)
        ));
}

… or with comments:

static <T> Tuple2<Seq<T>, Seq<T>> splitAt(Stream<T> stream, long position) {

    // Add jOOλ functionality to the stream
    // -> local Type: Seq<T>
    return seq(stream)

        // Keep track of stream positions
        // with each element in the stream
        // -> local Type: Seq<Tuple2<T, Long>>
        .zipWithIndex()

        // Split the streams at position
        // -> local Type: Tuple2<Seq<Tuple2<T, Long>>,
        //                       Seq<Tuple2<T, Long>>>
        .partition(t -> t.v2 < position)

        // Remove the indexes from zipWithIndex again
        // -> local Type: Tuple2<Seq<T>, Seq<T>>
        .map((v1, v2) -> tuple(
            v1.map(t -> t.v1),
            v2.map(t -> t.v1)
        ));
}

Nice, isn't it? A possible implementation for partition(), on the other hand, is a bit more complex. Here trivially with Iterator instead of the new Spliterator:

static <T> Tuple2<Seq<T>, Seq<T>> partition(Stream<T> stream, Predicate<? super T> predicate) {
    final Iterator<T> it = stream.iterator();
    final LinkedList<T> buffer1 = new LinkedList<>();
    final LinkedList<T> buffer2 = new LinkedList<>();

    class Partition implements Iterator<T> {

        final boolean b;

        Partition(boolean b) {
            this.b = b;
        }

        void fetch() {
            while (buffer(b).isEmpty() && it.hasNext()) {
                T next = it.next();
                buffer(predicate.test(next)).offer(next);
            }
        }

        LinkedList<T> buffer(boolean test) {
            return test ? buffer1 : buffer2;
        }

        @Override
        public boolean hasNext() {
            fetch();
            return !buffer(b).isEmpty();
        }

        @Override
        public T next() {
            return buffer(b).poll();
        }
    }

    return tuple(
        seq(new Partition(true)),
        seq(new Partition(false))
    );
}

I'll let you do the exercise and verify the above code.

Get and contribute to jOOλ, now!

All of the above is part of jOOλ, available for free from GitHub. There is already a partially Java-8-ready, full-blown library called functionaljava, which goes much further than jOOλ. Yet, we believe that all that's missing from Java 8's Streams API is really just a couple of methods that are very useful for sequential streams. In a previous post, we've shown how we can bring lambdas to String-based SQL using a simple wrapper for JDBC (of course, we still believe that you should use jOOQ instead). Today, we've shown how we can write awesome functional and sequential Stream processing very easily, with jOOλ. Stay tuned for even more jOOλ goodness in the near future (and pull requests are very welcome, of course!).

Reference: When the Java 8 Streams API is not Enough from our JCG partner Lukas Eder at the JAVA, SQL, AND JOOQ blog.

API Management in WildFly 8.1 with Overlord

I gave a brief introduction to the Overlord project family yesterday. Today it's time to test-drive a bit. The API Management sub-project released a 1.0.0.Alpha1 two days ago and introduces the first set of features according to the 18-month roadmap. What is APIMan exactly? It is an API management system which can either be embedded with existing frameworks or applications, or even run as a separate system. So far, so good. But what is API management and why should you care about it? Fact is, today's applications grow in size and complexity and get distributed more widely. Add more consumers to the mix, like mobile devices, TVs or the whole bunch of upcoming IoT devices, and think about how you would implement access control or usage consistently over a whole bunch of applications. A nightmare candidate. But don't worry too much. This is where API management comes in. APIMan provides flexible, policy-based runtime governance for your APIs. It allows API providers to offer the same API through multiple plans, allowing different levels of service to different API consumers. Sounds complicated still? Let's give it a try.

The Library REST-Service

Imagine that a public library has a nice RESTful service which lists books. It's running somewhere and usually is not really access restricted. Now someone came up with the idea to build an amazing mobile app which can find out if a book is in the library or not. A next step would be to add the option to reserve a book for a couple of hours, which the old system really can't do for now. Instead of heavily tweaking the older version of the library application, we're going to use APIMan to provide a consistent API to the mobile application and let it manage the authentication for now. The API I'm using here is a simple RESTEasy example. You can use whatever web-service endpoint you have to play around with.

Getting Started on WildFly 8.1

The project can be built and deployed on a variety of runtime platforms, but if you want to see it in action as quickly as possible, you just need to fork and clone the APIMan GitHub repository and simply build it with Maven 3.x. If you use the "run-all-wildfly8" profile, you're ready to instantly test-drive it, because it not only builds the project, but also downloads and configures the latest WildFly 8.1 and finally starts it for you. It takes a while to build and start up, so you'd better bring some patience. So, all you have to do to explore it is to fire up the admin console at http://localhost:8080/apiman-dt-ui/ and log in with one of the following users (the "!" is part of the password, btw):

- admin/admin123!
- bwayne/bwayne123!
- ckent/ckent123!
- dprince/dprince123!

Test-Driving the Quickstart

The documentation is a bit weak for now, so I will give you a short walk-through of the console. Open the console and log in with the admin user. Now you can "Create a new Organisation"; let's call it "Public Library" for now. The newly created organisation shows you some tabs (Applications, Services, Plans, Members). Switch to the Services tab and click the "New Service" button. Enter "BookListing" as the name, leave the 1.0 as the version, and you might give it a description for informational purposes. After you click the "Create Service" button you are redirected to the overview page. Switch to the "Implementation" tab and fill in the final API endpoint. In my case this would be:
In my case this would be:http://localhost:9080/jaxb-json/resteasy/library/books/badger (note: it is deployed on a different WildFly instance) Click “Save” when you’re done.If you switch back to the overview page, you see, that the service is in status “Created” and the Publish button is still grayed out. In order to reach this goal, we need to add some more information to APIMan. The next step is to add a so called Plan to the Organisation. Switch back to it and select the Plan tab and click the “New Plan” button. Plans basically allow to group individual policies and assign them to services. Call it “InternetBlackList” and create it by clicking the accompanying button. From the “Plan” overview select “Policies” and “Add Policy” by clicking the button. Define an “IP Blacklist Policy” and enter a potentially malicious IP address you don’t want the service to be accessed by.To be able to publish our service, we need to link the newly created Plan to the BookListing service. Navigate back there and select the Plans tab. Select the “InternetBlackList” plan and click “Save”. Reviewing the “Overview” page on the Service now finally shows the “Ready” state and let’s us publish it.Now that it is published, we can actually use it. But we’ll take one additional step here and link the service to an application via a contract. Creating a Contract allows you to connect an Application to a Service via a particular Plan offered by the Service. You would want to do this so that your Application can invoke the Service successfully. Create an application by navigating back to the Public Library Organization and clicking the “New App” button. Call it “Munich”, leave the 1.0 as a version and enter a description if you like to; Click “Create Application”. The one step left to do is to link the service and the application. This is done via a contract. Select the “Contracts” page and create a “New Contract” with the button. Enter “book” in the “Find a Service” field and search for our BookListing service. Select it. Now you can create the Contract.The last step is to register the newly created application in the “Overview” page.That was it. We now have a published service and a registered application. If you navigate to the API page of the application you can see the managed endpoints for the application. If you hover over the service, you get a “copy” button which let’s you copy the URL of the managed endpoint funneled through the APIMan gateway.If you try to access the service through the specified BlackListed IP address, you will now get an error. If not, you get proxied to the service by the gateway.Notice the apikey query-string? This is the key with which the gateway locates your service and proxies your call to the managed endpoint. If you don’t want to sent it as part of the query string you can also use a custom HTTP header called X-API-Key. What’s Next? That was a very quick and incomplete walk through. But you hopefully got an idea about the basic concepts behind it. APIMan and the other Overlord sub-projects are evolving quickly. They are happy to receive contributions and if you like what you’ve seen or have other feedback, don’t hesitate to get in touch with the project. If you want to see the more API like approach you can also watch and listen to the following screencast. It is a bit outdated, but still helpful.Reference: API Management in WildFly 8.1 with Overlord from our JCG partner Markus Eisele at the Enterprise Software Development with Java blog....

Deployment Script vs. Rultor

When I explain how Rultor automates deployment and release processes, I very often hear something like: "But I already have a script that deploys everything automatically." This response is very common, so I decided to summarize my three main arguments for automated Rultor deployment/release processes in one article: isolated Docker containers, visibility of logs, and security of credentials. Read about them and see what Rultor gives you on top of your existing deployment script(s). Before we start with the arguments, let me emphasize that Rultor is a useful interface to your custom scripts. When you decide to automate deployment with Rultor, you don't throw away any of your existing scripts. You just teach Rultor how to call them.

Isolated Docker Containers

The first advantage you get once you start calling your deployment scripts from Rultor is the usage of Docker. I'm sure you know what Docker is, but for those who don't: it is a manager of virtual Linux "machines". It's a command line tool that you call when you need to run some script in a new virtual machine (aka "container"). Docker starts the container almost immediately and runs your script. The beauty of Docker is that every container is a perfectly isolated Linux environment, with its own file system, memory, processes, etc. When you tell Rultor to run your deployment script, it starts a new Docker container and runs your script there. But what benefit does this give me, you ask? The main benefit is that the container gets destroyed right after your script is done. This means that you can do all pre-configuration inside the container without any fear of conflict with your main working platform. Let me give an example. I'm developing on a MacBook, where I install and remove packages which I need for development. At the same time, I have a project that, in order to be deployed, requires PHP 5.3, MySQL 5.6, phing, phpunit, phpcs and xdebug. Every MacOS version needs to be configured specifically to get these applications up and running, and it's a time-consuming job. I can change laptops, and I can change MacOS versions, but the project stays the same. It still requires the same set of packages in order to run its deployment script successfully. And the project is not in active development any more. I simply don't need these packages for my day-to-day work, since I'm working with Java more now. But, when I need to make a minor fix to that PHP project and deploy it, I have to install all the required PHP packages and configure them. Only after that can I deploy that minor fix. It is annoying, to say the least. Docker gives me the ability to automate all of this together. My existing deployment script will get a preamble, which will install and configure all necessary PHP-related packages in a clean Ubuntu container. This preamble will be executed on every run of my deployment script, inside a Docker container; an example follows below. My deployment script looked like this before I started to use Rultor:

#!/bin/bash
phing test
git ftp push --user ".." --passwd ".." --syncroot php/src ftp://ftp.example.com/

Just two lines. The first one is a full run of unit tests. The second one is an FTP deployment to the production server. Very simple. But this script will only work if PHP 5.3, MySQL, phing, xdebug, phpcs and phpunit are installed. Again, it's a lot of work to install and configure them every time I upgrade my MacOS or change laptops.
Needless to say, if/when someone joins the project and tries to run my scripts, he/she will have to do this pre-installation work again. So, here is the new script, which I’m using now. It is executed inside a new Docker container, every time:
#!/bin/bash
# First, we install all prerequisites
sudo apt-get install -y php5 php5-mysql mysql
sudo apt-get install php-pear
sudo pear channel-discover pear.phpunit.de
sudo pear install phpunit/PHPUnit
sudo pear install PHP_CodeSniffer
sudo pecl install xdebug
sudo pear channel-discover pear.phing.info
sudo pear install phing/phing
# And now the same script I had before
phing test
git ftp push --user ".." --passwd ".." --syncroot php/src ftp://ftp.example.com/
Obviously, running this script on my MacBook (without virtualization) would cause a lot of trouble. Well, I don’t even have apt-get here! Thus, the first benefit that Rultor gives you is isolation of your deployment script in its own virtual environment. We have this mostly thanks to Docker.
Visibility of Logs
Traditionally, we keep deployment scripts in some ~/deploy directory and run them with a magic set of parameters. In a small project, you do this yourself and this directory is on your own laptop. In a bigger project, there is a “deployment” server that has that magic directory with a set of scripts that can be executed only by a few trusted senior developers. I’ve seen this setup many times. The biggest issue here is traceability. It’s almost impossible to find out who deployed what and why some particular deployment failed. The senior deployment gurus simply SSH to the server and run those magic scripts with magic parameters. Logs are usually lost and problem tracking is very difficult or impossible. Rultor offers something different. With Rultor, there is no SSH access to deployment scripts any more. All scripts stay in the .rultor.yml configuration file, and you start them by posting messages in your issue tracking system (for example Github, JIRA or Trac). Rultor runs the script and publishes its full log right to your ticket. The log stays with your project forever. You can always get back to the ticket you were working with and check why deployment failed and what instructions were actually executed. For example, check out this Github issue, where I was deploying a new version of Rultor itself, and failed a few times: yegor256/rultor#563. All my failed attempts are logged. I can always get back to them and investigate. For a big project this information is vital. Thus, the second benefit of Rultor versus a standalone deployment script is visibility of every single operation.
Security of Credentials
When you have a custom script sitting on your laptop or on that secret team deployment server, your production credentials stay close to it. There is just no other way. If your software works with a database, it has to know the login credentials (user name, password, DB name, port number, etc.). Well, in the worst case, some people just hard code that information right into the source code. We aren’t even going to discuss this case, that’s how bad it is. But let’s say you separate your DB credentials from the source code. You will have something like a db.properties or db.ini file, which will be attached to the application right before deployment. You can also keep that file directly on the production server, which is even better, but not always possible, especially with PaaS deployments, for example. A similar problem exists with deployments of artifacts to repositories.
Say, you’re regularly deploying to RubyGems.org. Your ~/.gem/credentials will contain your secret API key. So, very often, your deployment scripts are accompanied by files with sensitive information. And these files hold this information in a plain, open format. No encryption, no protection. Just user names, passwords, codes and tokens in plain text. Why is this bad? Well, for a single developer with a single laptop this doesn’t sound like a problem, although I don’t like the idea of losing a laptop somewhere in an airport with all credentials open and ready to be used. You may argue that there are disc protection tools, like FileVault for MacOS or BestCrypt for Windows. Yes, maybe. But let’s see what happens when we have a team of developers working together and sharing those deployment scripts and files with credentials. Once you give access to your deployment scripts to a new member of the team, you have to share all that sensitive data. There is just no way around it. In order to use the scripts, he/she has to be able to open the files with credentials. This is a problem, if you care about the security of your data. Rultor solves this problem by offering on-the-fly GPG decryption of your sensitive data, right before it is used by your deployment scripts. In the .rultor.yml configuration file you just say:
decrypt:
  db.ini: "repo/db.ini.asc"
deploy:
  script: ftp put db.ini production
Then, you encrypt your db.ini using a Rultor GPG key, and fearlessly commit db.ini.asc to the repository. Nobody will be able to open and read that file, except the Rultor server itself, right before running the deployment script. Thus, the third benefit of Rultor versus a standalone deployment script is proper security of sensitive data.
Related Posts
You may also find these posts interesting:
How to Publish to Rubygems, in One Click
How to Deploy to CloudBees, in One Click
How to Release to Maven Central, in One Click
Rultor + Travis
Every Build in Its Own Docker Container
Reference: Deployment Script vs. Rultor from our JCG partner Yegor Bugayenko at the About Programming blog....

Akka Notes – Introducing Actors

Anyone who has done multithreading in the past won’t deny how hard and painful it is to manage multithreaded applications. I said manage because it starts out simple and it becomes a whole lot of fun once you start seeing performance improvements. However, it aches when you see that you don’t have an easy way to recover from errors in your sub-tasks, OR those zombie bugs that you find hard to reproduce, OR when your profiler shows that your threads are spending a lot of time blocking wastefully before writing to a shared state. I prefer not to talk about how the Java concurrency API and its collections made things better and easier, because I am sure that if you are here, you probably need more control over the sub-tasks, or you simply don’t like to write locks and synchronized blocks and would prefer a higher level of abstraction. In this series of Akka Notes, we will go through simple Akka examples to explore the various features that we have in the toolkit.
What are Actors?
Akka’s Actors follow the Actor Model (duh!). Treat Actors like people. People who don’t talk to each other in person. They just talk through mails. Let’s expand on that a bit.
1. Messaging
Consider two persons: a wise Teacher and a Student. The Student sends a mail every morning to the Teacher, and the wise Teacher sends a wise quote back. Points to note:
- The student sends a mail. Once sent, the mail cannot be edited. Talk about natural immutability.
- The Teacher checks his mailbox when he wishes to do so.
- The Teacher also sends a mail back (immutable again).
- The student checks the mailbox at his own time.
- The student doesn’t wait for the reply (no blocking).
That pretty much sums up the basic building block of the Actor Model: passing messages.
2. Concurrency
Now, imagine there are 3 wise teachers and 3 students; every student sends notes to every teacher. What happens then? Nothing changes, actually. Everybody has their own mailbox. One subtle point to note here: by default, mails in the mailbox are read/processed in the order they arrived. Internally, by default it is a ConcurrentLinkedQueue. And since nobody waits for the mail to be picked up, it is simply a non-blocking message. (There is a variety of built-in mailboxes, including bounded and priority-based ones. In fact, we could build one ourselves too.)
3. Failover
Imagine these 3 teachers are from three different departments: History, Geography and Philosophy. History teachers reply with a note on an event in the past, Geography teachers send an interesting place, and Philosophy teachers a quote. Each student sends messages to each teacher and gets responses. The student doesn’t care which teacher in the department sends the reply back. What if one day a teacher falls sick? There has to be at least one teacher handling the mails from the department. In this case, another teacher in the department steps up and does the job. Points to note:
- There could be a pool of Actors who do different things.
- An Actor could do something that causes an exception. It wouldn’t be able to recover by itself, in which case a new Actor could be created in place of the old one. Alternatively, the Actor could just ignore that one particular message and proceed with the rest of the messages. These are called Directives and we’ll discuss them later.
4. Multitasking
For a twist, let’s assume that each of these teachers also sends the exam score through mail, if the student asks for it. Similarly, an Actor can handle more than one type of message comfortably.
5. Chaining
What if the student would like to get only one final consolidated trivia mail instead of three? We could do that with Actors too: we could chain the teachers into a hierarchy. We’ll come back to that later when we talk about Supervisors, and revisit the same thought when we talk about Futures. As requested by Mohan, let’s just try to map the analogy components to the components in the Actor Model:
- The Students and the Teachers become our Actors.
- The email inbox becomes the Mailbox component.
- The request and the response can’t be modified. They are immutable objects.
- Finally, the MessageDispatcher component manages the mailboxes and routes the messages to the respective Mailbox.
Enough talk, let’s cook up some code….
Reference: Akka Notes – Introducing Actors from our JCG partner Arun Manivannan at the Rerun.me blog....
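To give a first taste of what that code can look like, here is a minimal sketch of the teacher/student analogy, assuming Akka's Java API (akka-actor 2.3.x on the classpath). The class names, the quote and the sleep-based shutdown are illustrative choices for this sketch, not code from the original series.

import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;
import akka.actor.UntypedActor;

public class QuoteApp {

    // The Teacher only reacts to mail that lands in its mailbox and replies with a quote.
    public static class Teacher extends UntypedActor {
        @Override
        public void onReceive(Object message) throws Exception {
            if ("QuoteRequest".equals(message)) {
                getSender().tell("An unexamined life is not worth living", getSelf());
            } else {
                unhandled(message);
            }
        }
    }

    // The Student fires a request and later finds the reply in its own mailbox; it never blocks.
    public static class Student extends UntypedActor {
        private final ActorRef teacher;

        public Student(ActorRef teacher) {
            this.teacher = teacher;
        }

        @Override
        public void preStart() {
            teacher.tell("QuoteRequest", getSelf()); // fire-and-forget
        }

        @Override
        public void onReceive(Object message) throws Exception {
            System.out.println("Quote received: " + message);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ActorSystem system = ActorSystem.create("UniversityMessageSystem");
        ActorRef teacher = system.actorOf(Props.create(Teacher.class), "teacher");
        system.actorOf(Props.create(Student.class, teacher), "student");
        Thread.sleep(1000); // give the mails a moment to be delivered and processed
        system.shutdown();
    }
}

Note how neither side waits on the other: the request and the reply are just immutable messages dropped into mailboxes.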

More metrics in Apache Camel 2.14

Apache Camel 2.14 is being released later this month. There is a slight holdup due to an Apache infrastructure issue which is being worked on. This blog post is to talk about one of the new functions we have added to this release. Thanks to Lauri Kimmel, who donated a camel-metrics component, we integrated with the excellent Codahale Metrics library. So I took this component one step further and integrated it with the Camel routes, so we have additional metrics about route performance using Codahale Metrics. This allows end users to seamlessly feed Camel routing information together with existing data they are gathering using Codahale Metrics. Also take note that we have a lot of existing metrics from camel-core, which of course are still around. What Codahale brings to the table is additional statistical data which we do not have in camel-core. To use the Codahale metrics all you need to do is: add the camel-metrics component, and enable route metrics in XML or Java code. To enable it in XML you declare a <bean> as shown below:
<bean id="metricsRoutePolicyFactory" class="org.apache.camel.component.metrics.routepolicy.MetricsRoutePolicyFactory"/>
And doing so in Java code is easy as well, by calling this method on your CamelContext:
context.addRoutePolicyFactory(new MetricsRoutePolicyFactory());
Now performance metrics are only usable if you have a way of displaying them, and for that you can use hawtio. Notice that you can use any kind of monitoring tooling which can integrate with JMX, as the metrics are available over JMX. The actual data is 100% Codahale JSON format, where a piece of the data is shown in the figure below. The next release of hawtio supports Camel 2.14 and automatically detects if you have enabled route metrics and, if so, shows a sub-page where the information can be seen in real time in graphical charts. The screenshot above is from the new camel-example-servlet-rest-tomcat which we ship out of the box. This example demonstrates another new functionality in Camel 2.14, which is the Rest DSL (I will do a blog post about that later). This example enables the route metrics out of the box, so what I did was to deploy this example together with hawtio (the hawtio-default WAR) in Apache Tomcat 8. With hawtio you can also build custom dashboards, so here at the end I have put together a dashboard with various screens from hawtio to have a custom view of a Camel application. Reference: More metrics in Apache Camel 2.14 from our JCG partner Claus Ibsen at the Claus Ibsen riding the Apache Camel blog....
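Putting the Java variant into a runnable context, here is a minimal, self-contained sketch. It assumes camel-core and camel-metrics 2.14 are on the classpath; the timer/log route and the one-minute run time are illustrative, not part of the original example.

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.component.metrics.routepolicy.MetricsRoutePolicyFactory;
import org.apache.camel.impl.DefaultCamelContext;

public class MetricsRouteDemo {
    public static void main(String[] args) throws Exception {
        DefaultCamelContext context = new DefaultCamelContext();

        // Enable the Codahale-based route metrics for every route in this context
        context.addRoutePolicyFactory(new MetricsRoutePolicyFactory());

        context.addRoutes(new RouteBuilder() {
            @Override
            public void configure() {
                // A trivial route; its timing statistics become visible over JMX
                from("timer:demo?period=1000").routeId("demo-route")
                    .to("log:metrics-demo");
            }
        });

        context.start();
        Thread.sleep(60000); // let the route run for a while, then shut down
        context.stop();
    }
}

With a JMX console or hawtio attached to this process, the route's statistics show up in the Codahale JSON format mentioned above.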

A classloading mystery solved

Facing a good old problem
I was struggling with some class loading issue on an application server. The libraries were defined as maven dependencies and therefore packaged into the WAR and EAR files. Some of these were also installed into the application server, unfortunately in different versions. When we started the application we faced the various exceptions that are related to these types of problems. There is a good IBM article about these exceptions if you want to dig deeper. Even though we knew that the error was caused by some double defined libraries on the classpath, it took more than two hours to investigate which version we really needed, and which JAR to remove.
Same topic by accident at the JUG the same week
A few days later we participated in the “Do you really get Classloaders?” session of the Java Users’ Society in Zürich. Simon Maple delivered an extremely good intro about class loaders and went into very deep details from the very start. It was an eye-opening session for many. I also have to note that Simon works for ZeroTurnaround and evangelizes JRebel. In such a situation a tutorial session is usually biased towards the product that is the bread and butter of the tutor. In this case my opinion is that Simon was an absolute gentleman, ethically keeping an appropriate balance.
Creating a tool to solve the mystery, just to create another one
A week later I had some time for hobby programming, which I had not had for a couple of weeks, and I decided to create a little tool that lists all the classes and JAR files that are on the classpath, so that finding duplicates becomes easier. I tried to rely on the fact that the classloaders are usually instances of URLClassLoader and thus the method getURLs() can be invoked to get all the directory names and JAR files. Unit testing in such a situation can be very tricky, since the functionality is strongly tied to the class loader behavior. To be pragmatic, I decided to just do some manual testing started from JUnit, as long as the code is experimental. First of all I wanted to see if the concept is worth developing further. I was planning to execute the test and look at the log statements reporting that there were no duplicate classes, and then execute the same run a second time, adding some redundant dependencies to the classpath. I was using JUnit 4.10. The version is important in this case. I executed the unit test from the command line and saw that there were no duplicate classes, and I was happy. After that I executed the same test from Eclipse and, surprise: I got 21 classes redundantly defined!
12:41:51.670 DEBUG c.j.c.ClassCollector - There are 21 redundantly defined classes.
12:41:51.670 DEBUG c.j.c.ClassCollector - Class org/hamcrest/internal/SelfDescribingValue.class is defined 2 times:
12:41:51.671 DEBUG c.j.c.ClassCollector - sun.misc.Launcher$AppClassLoader@7ea987ac:file:/Users/verhasp/.m2/repository/junit/junit/4.10/junit-4.10.jar
12:41:51.671 DEBUG c.j.c.ClassCollector - sun.misc.Launcher$AppClassLoader@7ea987ac:file:/Users/verhasp/.m2/repository/org/hamcrest/hamcrest-core/1.1/hamcrest-core-1.1.jar
...
Googling a bit I could easily discover that JUnit 4.10 has an extra dependency, as shown by maven:
$ mvn dependency:tree
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building clalotils 1.0.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ clalotils ---
[INFO] com.verhas:clalotils:jar:1.0.0-SNAPSHOT
[INFO] +- junit:junit:jar:4.10:test
[INFO] |  \- org.hamcrest:hamcrest-core:jar:1.1:test
[INFO] +- org.slf4j:slf4j-api:jar:1.7.7:compile
[INFO] \- ch.qos.logback:logback-classic:jar:1.1.2:compile
[INFO]    \- ch.qos.logback:logback-core:jar:1.1.2:compile
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.642s
[INFO] Finished at: Wed Sep 03 12:44:18 CEST 2014
[INFO] Final Memory: 13M/220M
[INFO] ------------------------------------------------------------------------
This is actually fixed in 4.11, so if I change the dependency to JUnit 4.11 I do not face the issue. Ok, half of the mystery solved. But why does the maven command line execution not report the double defined classes? Extending the logging, logging more and more, I could spot a line:
12:46:19.433 DEBUG c.j.c.ClassCollector - Loading from the jar file /Users/verhasp/github/clalotils/target/surefire/surefirebooter235846110768631567.jar
What is in this file? Let’s unzip it:
$ ls -l /Users/verhasp/github/clalotils/target/surefire/surefirebooter235846110768631567.jar
ls: /Users/verhasp/github/clalotils/target/surefire/surefirebooter235846110768631567.jar: No such file or directory
The file does not exist! Seemingly maven creates this JAR file and then deletes it when the execution of the test is finished. Googling again, I found the solution. Java loads the classes from the classpath. The classpath can be defined on the command line, but there are other sources from which the application class loaders fetch files. One such source is the manifest file of a JAR. The manifest file of a JAR can define what other JAR files are needed to execute the classes in the JAR. Maven creates a JAR file that contains nothing else but a manifest file defining the JARs and directories that make up the classpath. These JARs and directories are NOT returned by the method getURLs(), therefore the (first version) of my little tool did not find the duplicates.
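To make the approach concrete, here is a minimal sketch of the kind of scan the tool performs. This is not the actual clalotils code: the class and method names are made up, it only walks URLClassLoader instances, only looks inside JAR files (not directories), and prints to standard output instead of logging.

import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class DuplicateClassFinder {

    public static void main(String[] args) throws Exception {
        Map<String, List<String>> locations = new HashMap<String, List<String>>();

        // Walk the class loader chain; only URLClassLoaders expose their URLs,
        // which is exactly why the surefire booter JAR trick stays invisible here.
        ClassLoader loader = DuplicateClassFinder.class.getClassLoader();
        while (loader != null) {
            if (loader instanceof URLClassLoader) {
                for (URL url : ((URLClassLoader) loader).getURLs()) {
                    if (url.getFile().endsWith(".jar")) {
                        collect(url.getFile(), locations);
                    }
                }
            }
            loader = loader.getParent();
        }

        // Report every class name that shows up in more than one JAR
        for (Map.Entry<String, List<String>> entry : locations.entrySet()) {
            if (entry.getValue().size() > 1) {
                System.out.println(entry.getKey() + " is defined "
                        + entry.getValue().size() + " times: " + entry.getValue());
            }
        }
    }

    private static void collect(String jarPath, Map<String, List<String>> locations)
            throws Exception {
        JarFile jar = new JarFile(jarPath);
        try {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".class")) {
                    List<String> jars = locations.get(name);
                    if (jars == null) {
                        jars = new ArrayList<String>();
                        locations.put(name, jars);
                    }
                    jars.add(jarPath);
                }
            }
        } finally {
            jar.close();
        }
    }
}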
For demonstration purposes I was quick enough to make a copy of the surefire booter file while the mvn test command was running, and got the following output:
$ unzip /Users/verhasp/github/clalotils/target/surefire/surefirebooter5550254534465369201\ copy.jar
Archive: /Users/verhasp/github/clalotils/target/surefire/surefirebooter5550254534465369201 copy.jar
  inflating: META-INF/MANIFEST.MF
$ cat META-INF/MANIFEST.MF
Manifest-Version: 1.0
Class-Path: file:/Users/verhasp/.m2/repository/org/apache/maven/surefire/surefire-booter/2.8/surefire-booter-2.8.jar file:/Users/verhasp/.m2/repository/org/apache/maven/surefire/surefire-api/2.8/surefire-api-2.8.jar file:/Users/verhasp/github/clalotils/target/test-classes/ file:/Users/verhasp/github/clalotils/target/classes/ file:/Users/verhasp/.m2/repository/junit/junit/4.10/junit-4.10.jar file:/Users/verhasp/.m2/repository/org/hamcrest/hamcrest-core/1.1/hamcrest-core-1.1.jar file:/Users/verhasp/.m2/repository/org/slf4j/slf4j-api/1.7.7/slf4j-api-1.7.7.jar file:/Users/verhasp/.m2/repository/ch/qos/logback/logback-classic/1.1.2/logback-classic-1.1.2.jar file:/Users/verhasp/.m2/repository/ch/qos/logback/logback-core/1.1.2/logback-core-1.1.2.jar
Main-Class: org.apache.maven.surefire.booter.ForkedBooter
It really is nothing else than a manifest file defining the classpath. But why does maven do it? Sonatype people, some of whom I also know personally, are clever people. They don’t do such a thing just for nothing. The reason to create a temporary JAR file to start the tests is that the length of the command line is limited on some operating systems, and the classpath may exceed that limit. Even though Java itself (since Java 6) resolves wildcard characters in the classpath, this is not an option for maven: the JAR files are in different directories in the maven repo, each having a long name. Wildcard resolution is not recursive (there is a good reason for that), and even if it were, you just would not like to have your whole local repo on the classpath.
Conclusion
- Do not use JUnit 4.10! Use something older or newer, or be prepared for surprises.
- Understand what a classloader is, how it works and what it does.
- Use an operating system that has a huge limit for the maximum length of a command line. Or just live with the limitation.
Something else? Your ideas?
Reference: A classloading mystery solved from our JCG partner Peter Verhas at the Java Deep blog....