


Using Infinispan as a persistency solution

Cross-posted from https://vaadin.com/blog/-/blogs/using-infinispan-as-a-persistency-solution. Thanks Fredrik and Matti for your permission!

Various RDBMSs are the de-facto standard for persistency. Using them is such a safe bet by architects that I dare say they are used in too many places nowadays. To fight against this, I have recently been exploring alternative persistency options, like graph databases. This time I played with Infinispan.

In case you are not familiar with Infinispan, or distributed key/value data stores in general, you could think of it as a HashMap on steroids. Most essentially, the map is shared among all your cluster nodes. With clustering you can gain huge size, blazing fast access and redundancy, depending on how you configure it. There are several products that compete with Infinispan, like Ehcache and Hazelcast from the open source world and Oracle Coherence from the commercial side.

Actually, Infinispan is a technology that you might have used without noticing it at all. For example, the high availability features of WildFly rely heavily on Infinispan caches. It is also often used as a second-level cache for ORM libraries. But it can also be used directly as a persistency library as such. Why would you consider it as your persistency solution?

It is a lightning fast in-memory data storage.
The stored value can be any serializable object, no complex mapping libraries needed.
It is built from the ground up for a clustered environment – your data is safer and faster to access.
It is very easy to scale horizontally.
It has multiple optional cache store alternatives, for writing the state to e.g. disk for cluster-wide reboots.
Not all data needs to be stored forever; Infinispan has sophisticated built-in eviction rules.
Possibility to use transactional access for ACID changes.

Sounds pretty amazing, doesn't it? And it sure is for certain use cases, but all technologies have their weaknesses and so do key/value data stores. When comparing to RDBMSs, the largest drawback is with relations to other entities. You'll have to come up with a strategy for how to store references to other entities, and searching based on related features must also be tackled. If you end up wondering about these questions, be sure to check whether Hibernate OGM could help you. Also, doing some analysis on the data can be considered simpler, or at least more familiar, with traditional SQL queries. Especially if you end up having a lot of data, distributed on multiple nodes, you'll have to learn the basics of the MapReduce programming model to do any non-trivial queries.

Using Infinispan in a web application

Although Infinispan is not tied to WildFly, I decided to base my experiments on WildFly. Its built-in Infinispan version is available for web applications, if you explicitly request it. The easiest method to do this is to add the following MANIFEST.MF entry to your war file. If you don't want to spoil your project with obsolete files, just add it using a small war plugin config (a sketch of which follows below).

Dependencies: org.infinispan export

Naturally you'll still want to add an Infinispan dependency to your application, but you can leave it as provided. Be sure to use the same version provided by your server; in WildFly 8, the Infinispan version is 6.0.2.
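A minimal sketch of such a war plugin configuration, producing the MANIFEST.MF entry above (this assumes the standard maven-war-plugin archive/manifestEntries mechanism; adjust to your own build):

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-war-plugin</artifactId>
    <configuration>
        <archive>
            <manifestEntries>
                <Dependencies>org.infinispan export</Dependencies>
            </manifestEntries>
        </archive>
    </configuration>
</plugin>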
In a Maven project, add this kind of dependency declaration:

<dependency>
    <groupId>org.infinispan</groupId>
    <artifactId>infinispan-core</artifactId>
    <version>6.0.2.Final</version>
    <!-- Provided as we use the Infinispan provided by Wildfly -->
    <scope>provided</scope>
</dependency>

Before accessing Infinispan "caches", you need to configure them. There are both programmatic and XML configurations available. With WildFly, it is most natural to configure the Infinispan data store right into the server config. The "right" config file depends on how you are launching your WildFly server. If you are testing clustering locally, you probably want to add something like this into your domain.xml, under the <subsystem xmlns="urn:jboss:domain:infinispan:2.0"> section:

<cache-container name="myCache" default-cache="cachedb">
    <transport lock-timeout="60000"/>
    <replicated-cache name="cachedb" batching="true" mode="SYNC"/>
</cache-container>

Note that with this config, the data is only stored within the memory of the cluster nodes. To learn how to tweak cache settings or to set up disk "backup", refer to the extensive Infinispan documentation.

To remove all Infinispan references from the UI code, I created an EJB that does all the data access. There I inject the CacheContainer provided by WildFly and fetch the default cache in an init method:

@Resource(lookup = "java:jboss/infinispan/container/myCache")
CacheContainer cc;

Map<String, MyEntity> cache;

@PostConstruct
void init() {
    this.cache = cc.getCache();
}

I guess you are already wondering it: yes, the Map is the very familiar java.util.Map interface and the rest of the implementation is trivial to any Java developer. Infinispan caches extend the basic Map interface, but in case you need some more advanced features, you can also use the Cache or AdvancedCache types. The MyEntity in the previous code snippet is just a very simple POJO I created for the example. With Vaadin CDI usage, I can then inject the EJB into my UI class and do pretty much anything with it. The actual Vaadin code has no special tricks, just normal CDI-spiced Vaadin code.

Based on this exercise, would I use Infinispan directly for persistency in my next project? Probably not, but for certain apps, without hesitation. I can also imagine certain hybrid models where some of the data is only in an Infinispan cache and some in a traditional RDBMS, naturally behind an ORM, taking the best of both worlds. We'll also be using Infinispan in our upcoming joint webinar with Arun Gupta from Red Hat on September 8th, 2014. There we'll show you a simple Vaadin application and how easy it can be to cluster it using WildFly.

Reference: Using Infinispan as a persistency solution from our JCG partner Arun Gupta at the Miles to go 2.0 … blog....

Using Gradle to Build & Apply AST Transformations

Recently, I wanted to both build and apply local AST transformations in a Gradle project. While I could find several examples of how to write transformations, I couldn't find a complete example showing the full build process. A transformation has to be compiled separately and then put on the classpath, so its source can't simply sit in the rest of the Groovy source tree. This is the detail that tripped me up for a while. I initially set up a separate GroovyCompile task to process the annotation before the rest of the source (stemming from a helpful suggestion from Peter Niederwieser on the Gradle forums). While this worked, a much simpler solution for getting transformations to apply is to set up a multi-project build. The main project depends on a sub-project with the AST transformation source files.

Here's a minimal example's directory structure:

ast/build.gradle - ast build file
ast/src/main/groovy/com/cholick/ast/Marker.groovy - marker interface
ast/src/main/groovy/com/cholick/ast/Transform.groovy - ast transformation
build.gradle - main build file
settings.gradle - project hierarchy configuration
src/main/groovy/com/cholick/main/Main.groovy - source to transform

For the full working source (with simple tests and no * imports), clone https://github.com/cholick/gradle_ast_example

The root build.gradle file contains a dependency on the ast project:

dependencies {
    ...
    compile(project(':ast'))
}

The root settings.gradle defines the ast sub-project:

include 'ast'

The base project also has src/main/groovy/com/cholick/main/Main.groovy, with the source file to transform. In this example, the AST transformation I've written puts a method named 'added' onto the class.

package com.cholick.main

import com.cholick.ast.Marker

@Marker
class Main {
    static void main(String[] args) {
        new Main().run()
    }

    def run() {
        println 'Running main'
        assert this.class.declaredMethods.find { it.name == 'added' }
        added()
    }
}

In the ast sub-project, ast/src/main/groovy/com/cholick/ast/Marker.groovy defines an interface to mark classes for the AST transformation:

package com.cholick.ast

import org.codehaus.groovy.transform.GroovyASTTransformationClass

import java.lang.annotation.*

@Retention(RetentionPolicy.SOURCE)
@Target([ElementType.TYPE])
@GroovyASTTransformationClass(['com.cholick.ast.Transform'])
public @interface Marker {}

Finally, the AST transformation class processes source classes and adds a method:

package com.cholick.ast

import org.codehaus.groovy.ast.*
import org.codehaus.groovy.ast.builder.AstBuilder
import org.codehaus.groovy.control.*
import org.codehaus.groovy.transform.*

@GroovyASTTransformation(phase = CompilePhase.INSTRUCTION_SELECTION)
class Transform implements ASTTransformation {
    void visit(ASTNode[] astNodes, SourceUnit sourceUnit) {
        if (!astNodes) return
        if (!astNodes[0]) return
        if (!astNodes[1]) return
        if (!(astNodes[0] instanceof AnnotationNode)) return
        if (astNodes[0].classNode?.name != Marker.class.name) return

        ClassNode annotatedClass = (ClassNode) astNodes[1]
        MethodNode newMethod = makeMethod(annotatedClass)
        annotatedClass.addMethod(newMethod)
    }

    MethodNode makeMethod(ClassNode source) {
        def ast = new AstBuilder().buildFromString(CompilePhase.INSTRUCTION_SELECTION, false,
                "def added() { println 'Added' }"
        )
        return (MethodNode) ast[1].methods.find { it.name == 'added' }
    }
}

Thanks Hamlet D'Arcy for a great AST transformation example and Peter Niederwieser for answering my question on the forums.

Reference: Using Gradle to Build & Apply AST Transformations from our JCG partner Matt Cholick at the Cholick.com blog....

An Inconvenient Latency

Overview

Vendors typically publish numbers they are happy with, and avoid telling you about a product's weaknesses. However, behind the numbers is a dirty secret if you know where to look.

Why don't we use GPUs for everything?

Finding problems which naturally scale to thousands of data points/tasks is easy for some problems, and very hard for others. GPUs are designed for computing large vector operations. However, what is "large" and why does it matter?

Say you have a GPU with 1024 cores. This means it can process a vector of length 1024 all at once, and a vector of length 2048 in double the time. But what happens if we only have a vector of 100, or 10, or 1? The inconvenient answer is that it takes the same amount of time, because you can't make use of all of your cores at once. You get only 10%, 1% or just 0.1% of the peak performance. If you want the best efficiency, you want a problem which has many thousands of values which can be processed concurrently. If you don't have a problem which is so concurrent, you are not going to get maximum performance. Unlike a screen full of pixels, a lot of business logic deals with a small number of values at once, so you want a small number of fast cores which can perform independent tasks at once.

Why web servers favour horizontal scalability

Web services scale well up to one task per active user. To use multiple cores, you need a problem which naturally breaks into many independent tasks. When you provide a web service, you have many users and the work for each user is largely independent. If you have ten concurrent users, you can expect close to ten times the throughput of having one user at a time. If you have one hundred users, you can get up to ten times the throughput of ten users, etc. The limit of your concurrency is around the number of active/concurrent users you have.

Throughput, latency and concurrency

When a vendor benchmarks their product, a common benchmark is throughput. This is the total number of operations per second over a significant period of time; ideally this should be many minutes. Vendors are increasingly publishing average latency benchmarks. Average latency is a fairly optimistic number as it is good at hiding small numbers of particularly bad latencies. For example, if you want to hide long GC pauses, use average latency. The problem for vendors is that these two measures can be used in combination to determine the minimum concurrency required to achieve the throughput claimed.

A given problem has a "natural concurrency" the problem can be easily broken into. If you have 100 concurrent users, you may have a natural concurrency of about 100. There is also a "minimum concurrency" implied by a benchmark:

minimum concurrency = throughput * latency

To achieve the throughput a benchmark suggests, your natural concurrency should be greater than the minimum concurrency in the benchmark. If you have less natural concurrency, you might expect to get pro-rata throughput as well.

Consider these benchmarks for three key-value stores:

key-store   | throughput  | latency
Product 1   | 1,000,000/s | 46 ms or 0.046 s
Product 2   |   600,000/s | 10 ms or 0.01 s
Product 3   | 2,000,000/s | 0.002 ms or 0.000002 s

You might look at this table and say they all look good; they all have high throughputs and low enough latencies. The problem is: there is an implied natural concurrency requirement to achieve the throughput stated for the latency measured.
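To make the arithmetic concrete, here is a minimal Java sketch of the two calculations behind the tables below; the product figures are hypothetical, and the achievable-throughput estimate (roughly concurrency divided by latency, capped at the benchmarked peak) is an approximation inferred from the numbers, not a formula taken from the vendors' benchmarks:

public class ConcurrencyMath {
    public static void main(String[] args) {
        double throughput = 1_000_000; // requests/s claimed by a benchmark
        double latency = 0.046;        // average latency in seconds

        // minimum concurrency implied by the benchmark: throughput * latency
        double minimumConcurrency = throughput * latency;
        System.out.println("minimum concurrency: " + minimumConcurrency); // prints 46000.0

        // with only 10 naturally concurrent tasks, achievable throughput is roughly
        // concurrency / latency, capped at the benchmarked peak
        double naturalConcurrency = 10;
        double achievable = Math.min(naturalConcurrency / latency, throughput);
        System.out.printf("achievable throughput: %.0f/s%n", achievable); // prints 217/s (~220/s below)
    }
}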
Let's see the minimum concurrency needed to achieve that throughput at the measured latency:

key-store   | throughput  | latency                | minimum concurrency
Product 1   | 1,000,000/s | 46 ms or 0.046 s       | 46,000
Product 2   |   600,000/s | 10 ms or 0.01 s        | 6,000
Product 3   | 2,000,000/s | 0.002 ms or 0.000002 s | 4

Many problems have around 4 independent tasks, but 46K is pretty high. So what? What if you only have 10 concurrent tasks/users?

key-store   | concurrency | latency                | throughput achieved
Product 1   | 10          | 46 ms or 0.046 s       | 220/s
Product 2   | 10          | 10 ms or 0.01 s        | 1,000/s
Product 3   | 10          | 0.002 ms or 0.000002 s | 2,000,000/s

Note: having more concurrency than you need doesn't help throughput much, but a lack of natural concurrency in your problem will hurt your throughput (and horizontal scalability).

Conclusion

Next time you read a benchmark which includes throughput and average latency, multiply them together to see what level of concurrency would be required to achieve that throughput, and compare this with the natural concurrency of the problem you are trying to solve to see if the solution fits your problem. If you have more natural concurrency, you have more solutions you can consider; if you have a low natural concurrency, you need a solution with a low latency.

Reference: An Inconvenient Latency from our JCG partner Peter Lawrey at the Vanilla Java blog....

R: Calculating rolling or moving averages

I've been playing around with some time series data in R and since there's a bit of variation between consecutive points I wanted to smooth the data out by calculating the moving average. I struggled to find a built-in function to do this but came across Didier Ruedin's blog post which described the following function to do the job:

mav <- function(x,n=5){filter(x,rep(1/n,n), sides=2)}

I tried plugging in some numbers to understand how it works:

> mav(c(4,5,4,6), 3)
Time Series:
Start = 1
End = 4
Frequency = 1
[1]       NA 4.333333 5.000000       NA

Here I was trying to do a rolling average which took into account the last 3 numbers, so I expected to get just two numbers back – 4.333333 and 5 – and if there were going to be NA values I thought they'd be at the beginning of the sequence. In fact it turns out this is what the 'sides' parameter controls:

sides: for convolution filters only. If sides = 1 the filter coefficients are for past values only; if sides = 2 they are centred around lag 0. In this case the length of the filter should be odd, but if it is even, more of the filter is forward in time than backward.

So in our 'mav' function the rolling average looks at both sides of the current value rather than just at past values. We can tweak that to get the behaviour we want:

mav <- function(x,n=5){filter(x,rep(1/n,n), sides=1)}

> mav(c(4,5,4,6), 3)
Time Series:
Start = 1
End = 4
Frequency = 1
[1]       NA       NA 4.333333 5.000000

The NA values are annoying for any plotting we want to do, so let's get rid of them:

> na.omit(mav(c(4,5,4,6), 3))
Time Series:
Start = 3
End = 4
Frequency = 1
[1] 4.333333 5.000000

Having got to this point I noticed that Didier had referenced the zoo package in the comments, and it has a built-in function to take care of all this:

> library(zoo)
> rollmean(c(4,5,4,6), 3)
[1] 4.333333 5.000000

I also realised I can list all the functions in a package with the 'ls' function, so I'll be scanning zoo's list of functions next time I need to do something time series related – there'll probably already be a function for it!
> ls("package:zoo")
  [1] "as.Date"              "as.Date.numeric"      "as.Date.ts"
  [4] "as.Date.yearmon"      "as.Date.yearqtr"      "as.yearmon"
  [7] "as.yearmon.default"   "as.yearqtr"           "as.yearqtr.default"
 [10] "as.zoo"               "as.zoo.default"       "as.zooreg"
 [13] "as.zooreg.default"    "autoplot.zoo"         "cbind.zoo"
 [16] "coredata"             "coredata.default"     "coredata<-"
 [19] "facet_free"           "format.yearqtr"       "fortify.zoo"
 [22] "frequency<-"          "ifelse.zoo"           "index"
 [25] "index<-"              "index2char"           "is.regular"
 [28] "is.zoo"               "make.par.list"        "MATCH"
 [31] "MATCH.default"        "MATCH.times"          "median.zoo"
 [34] "merge.zoo"            "na.aggregate"         "na.aggregate.default"
 [37] "na.approx"            "na.approx.default"    "na.fill"
 [40] "na.fill.default"      "na.locf"              "na.locf.default"
 [43] "na.spline"            "na.spline.default"    "na.StructTS"
 [46] "na.trim"              "na.trim.default"      "na.trim.ts"
 [49] "ORDER"                "ORDER.default"        "panel.lines.its"
 [52] "panel.lines.tis"      "panel.lines.ts"       "panel.lines.zoo"
 [55] "panel.plot.custom"    "panel.plot.default"   "panel.points.its"
 [58] "panel.points.tis"     "panel.points.ts"      "panel.points.zoo"
 [61] "panel.polygon.its"    "panel.polygon.tis"    "panel.polygon.ts"
 [64] "panel.polygon.zoo"    "panel.rect.its"       "panel.rect.tis"
 [67] "panel.rect.ts"        "panel.rect.zoo"       "panel.segments.its"
 [70] "panel.segments.tis"   "panel.segments.ts"    "panel.segments.zoo"
 [73] "panel.text.its"       "panel.text.tis"       "panel.text.ts"
 [76] "panel.text.zoo"       "plot.zoo"             "quantile.zoo"
 [79] "rbind.zoo"            "read.zoo"             "rev.zoo"
 [82] "rollapply"            "rollapplyr"           "rollmax"
 [85] "rollmax.default"      "rollmaxr"             "rollmean"
 [88] "rollmean.default"     "rollmeanr"            "rollmedian"
 [91] "rollmedian.default"   "rollmedianr"          "rollsum"
 [94] "rollsum.default"      "rollsumr"             "scale_x_yearmon"
 [97] "scale_x_yearqtr"      "scale_y_yearmon"      "scale_y_yearqtr"
[100] "Sys.yearmon"          "Sys.yearqtr"          "time<-"
[103] "write.zoo"            "xblocks"              "xblocks.default"
[106] "xtfrm.zoo"            "yearmon"              "yearmon_trans"
[109] "yearqtr"              "yearqtr_trans"        "zoo"
[112] "zooreg"

Reference: R: Calculating rolling or moving averages from our JCG partner Mark Needham at the Mark Needham Blog blog....

Load Balance WebSockets using Apache HTTPD

JBoss EAP 6.3 provides a technology preview of WebSocket, and WildFly has supported it as part of Java EE 7 compliance. github.com/javaee-samples/javaee7-samples/tree/master/websocket provides tons of Java EE 7 samples that run on WildFly. If you are interested in similar functionality on JBoss EAP 6.3, then github.com/jboss-developer/jboss-eap-quickstarts/tree/6.4.x-develop/websocket-hello is a quickstart. In addition, there are a few more samples at github.com/arun-gupta/jboss-samples/tree/master/eap63.

One of the common questions asked related to WebSockets is how to load balance them. This Tech Tip will explain that for WildFly and JBoss EAP 6.3.

First, what are the main components?

At least Apache HTTPD 2.4.5 is required. HTTPD binaries are not available for Mac, but fortunately compiling instructions are explained clearly in Tech Tip #45.
mod_proxy_wstunnel is an Apache module that provides support for tunneling of WebSocket connections to a backend WebSocket server, such as WildFly or JBoss EAP. It is a support module to mod_proxy that provides support for a number of popular protocols as well as several different load balancing algorithms. The connection is automagically upgraded to a WebSocket connection, and all the modules are already included in the modules directory.
The mod_proxy_balancer module is also required; it provides load balancing for HTTP and other protocols.

Let's go!

Download and unzip WildFly 8.1. Start WildFly 8.1 in domain mode using ./bin/domain.sh. Download this chat sample, rename the file to "chat.war" and deploy it to "main-server-group" as:

~/tools/wildfly-8.1.0.Final/bin/jboss-cli.sh -c --command="deploy chat.war --server-groups=main-server-group"

The only difference from the original Java EE 7 WebSocket Chat sample is the addition of System.getProperty("jboss.node.name") to display the name of the WildFly instance serving the application (a rough sketch of such an endpoint is shown below). The source code is available at github.com/arun-gupta/wildfly-samples/tree/master/websocket-loadbalance.

Uncomment the following lines in /usr/local/apache2/conf/httpd.conf:

LoadModule lbmethod_byrequests_module modules/mod_lbmethod_byrequests.so
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_wstunnel_module modules/mod_proxy_wstunnel.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so

This will enable all the required modules. Add the following code fragment at the end of "httpd.conf":

<Proxy balancer://mycluster>
    BalancerMember ws://localhost:8080
    BalancerMember ws://localhost:8230
</Proxy>
ProxyPass /chat balancer://mycluster/chat

Proxy is a container for proxied resources and in this case creates a load balancing group using the balancer directive. BalancerMember adds a member to this load balancing group. ProxyPass is a standard directive that maps remote servers running on different ports into the space of the local server. In this case, WildFly is started in domain mode and so starts two instances, on ports 8080 and 8230. Both of these instances are mapped to localhost:80, which is where Apache HTTPD is running by default.

The deployed chat sample is now accessible at localhost:8080/chat (first instance in the managed domain), localhost:8230/chat (second WildFly instance in the managed domain), and localhost/chat (via Apache HTTPD). Now even if you kill one of the WildFly instances, the other instance will continue to serve the client. Note that this only gives application availability, as there is no session failover in this setup.
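For reference, here is a rough sketch of what such a chat endpoint might look like; this is a hypothetical example, not the actual sample code, which is linked above:

import javax.websocket.OnMessage;
import javax.websocket.server.ServerEndpoint;

@ServerEndpoint("/websocket")
public class ChatEndpoint {

    @OnMessage
    public String onMessage(String message) {
        // Echo the message back, tagged with the name of the WildFly instance that
        // served it, so you can see which cluster member handled the request.
        return message + " (served by " + System.getProperty("jboss.node.name") + ")";
    }
}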
This was also verified on JBoss EAP 6.3, with a few differences:

Use the sample from github.com/arun-gupta/jboss-samples/tree/master/eap63/websocket-chat instead. The generated archive name is different and so the command would look slightly different too:

~/tools/jboss-eap-6.3/bin/jboss-cli.sh -c --command="deploy websocket-chat-1.0-SNAPSHOT.war --server-groups=main-server-group"

Configure "httpd.conf" as:

<Proxy balancer://mycluster>
    BalancerMember ws://localhost:8080
    BalancerMember ws://localhost:8230
</Proxy>
ProxyPass /websocket-chat-1.0-SNAPSHOT balancer://mycluster/websocket-chat-1.0-SNAPSHOT

And that's it! Watch this live in action.

An important point to understand is that there is no concept of "sticky sessions" in WebSocket because, unlike HTTP, there is a direct and "permanent" connection between the client and the server in this case.

Enjoy!

Reference: Load Balance WebSockets using Apache HTTPD from our JCG partner Arun Gupta at the Miles to go 2.0 … blog....

JUnit Rules

Introduction

In this post I would like to show an example of how to use a JUnit Rule to make testing easier.

Recently I inherited a rather complex system, in which not everything is tested. And even the tested code is complex. Mostly I see a lack of test isolation. (I will write a different blog about working with legacy code.)

One of the tests (and the code) I am fixing actually tests several components together. It also connects to the DB. It tests some logic and the intersection between components. When the code did not compile in a totally different location, the test could not run because it loaded the whole Spring context. The structure was that before testing (any class) the whole Spring context was initiated. The tests extend BaseTest, which loads the full Spring context. BaseTest also cleans the DB in the @After method.

Important note: This article is about changing tests which are not structured entirely correctly. When creating new code and tests, they should be isolated, test one thing, etc. Better tests should use a mock DB / mock dependencies etc. After I fix the test and refactor, I'll have confidence making more changes.

Back to our topic... What I got was a slow run of the test suite, no isolation, and even problems running tests due to unrelated issues. So I decided to separate the context loading with the DB connection, and both of them from the cleaning up of the database.

Approach

In order to achieve that I did three things:

The first was to change the inheritance of the test class. It stopped inheriting BaseTest. Instead it inherits AbstractJUnit4SpringContextTests. Now I can create my own context per test and not load everything.

Next I needed two rules, a @ClassRule and a @Rule:

@ClassRule will be responsible for the DB connection.
@Rule will clean up the DB after / before each test.

But first, what are JUnit Rules? A short explanation would be that they provide a possibility to intercept test methods, similar to the AOP concept. @Rule allows us to intercept a method before and after its actual run. @ClassRule intercepts the test class run. A very well known @Rule is JUnit's TemporaryFolder. (Similar to @Before, @After and @BeforeClass.)

Creating the @Rule

The easy part was to create a Rule that cleans up the DB before and after a test method. You need to implement TestRule, which has one method:

Statement apply(Statement base, Description description);

You can do a lot with it. I found out that usually I will have an inner class that extends Statement. The rule I created did not create the DB connection, but got it in the constructor. Here's the full code:

public class DbCleanupRule implements TestRule {
    private final DbConnectionManager connection;

    public DbCleanupRule(DbConnectionManager connection) {
        this.connection = connection;
    }

    @Override
    public Statement apply(Statement base, Description description) {
        return new DbCleanupStatement(base, connection);
    }

    private static final class DbCleanupStatement extends Statement {
        private final Statement base;
        private final DbConnectionManager connection;

        private DbCleanupStatement(Statement base, DbConnectionManager connection) {
            this.base = base;
            this.connection = connection;
        }

        @Override
        public void evaluate() throws Throwable {
            try {
                cleanDb();
                base.evaluate();
            } finally {
                cleanDb();
            }
        }

        private void cleanDb() {
            connection.doTheCleanup();
        }
    }
}
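For comparison, the TemporaryFolder rule mentioned earlier follows the same interception pattern; here is a minimal usage sketch (a generic example, not part of the system described in this post):

import java.io.File;
import java.io.IOException;

import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;

public class TempFolderExampleTest {

    @Rule
    public TemporaryFolder folder = new TemporaryFolder();

    @Test
    public void writesToTempFile() throws IOException {
        File file = folder.newFile("data.txt");
        // the folder and everything created in it is deleted automatically after the test
    }
}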
Creating the @ClassRule

A ClassRule is actually also a TestRule. The only difference from a Rule is how we use it in our test code; I'll show that below. The challenge in creating this rule was that I wanted to use the Spring context to get the correct connection. Here's the code (ExternalResource is a TestRule):

public class DbConnectionRule extends ExternalResource {
    private DbConnectionManager connection;

    public DbConnectionRule() {
    }

    @Override
    protected void before() throws Throwable {
        ClassPathXmlApplicationContext ctx = null;
        try {
            ctx = new ClassPathXmlApplicationContext("/META-INF/my-db-connection-TEST-ctx.xml");
            connection = (DbConnectionManager) ctx.getBean("myDbConnection");
        } finally {
            if (ctx != null) {
                ctx.close();
            }
        }
    }

    @Override
    protected void after() {
    }

    public DbConnectionManager getDbConnection() {
        return connection;
    }
}

(Did you see that I could make DbCleanupRule inherit ExternalResource?)

Using it

The last part is how we use the rules. A @Rule must be a public field. A @ClassRule must be a public static field. And there it is:

@ContextConfiguration(locations = { "/META-INF/one-dao-TEST-ctx.xml", "/META-INF/two-TEST-ctx.xml" })
public class ExampleDaoTest extends AbstractJUnit4SpringContextTests {

    @ClassRule
    public static DbConnectionRule connectionRule = new DbConnectionRule();

    @Rule
    public DbCleanupRule dbCleanupRule = new DbCleanupRule(connectionRule.getDbConnection());

    @Autowired
    private ExampleDao classToTest;

    @Test
    public void foo() {
    }
}

That's all. Hope it helps.

Reference: JUnit Rules from our JCG partner Eyal Golan at the Learning and Improving as a Craftsman Developer blog....

An open web application framework benchmark

Selecting a platform for your next application development project can be a complex and burdensome undertaking. It can also be very intriguing and a lot of fun. There's a wide range of different approaches to take: at one end, The Architect will attend conferences, purchase and study analyst reports from established technology research companies such as Gartner, and base his evaluation on analyst views. Another approach is to set up a cross-disciplinary evaluation committee that will collect a wishlist of platform requirements from around the organization and make its decision based on a consensus vote. The first approach is very autocratic, while the second can sometimes lead to a lack of focus. A clear, coherent vision of requirements and prioritization is essential for the success of the evaluation. Due to these problems, a middle road and a more pragmatic approach is becoming increasingly popular: a tightly-knit group of senior propellerheads use a more empiric method of analysing requirements, study and experiment with potential solution stack elements, and brainstorm to produce a short list of candidates to be validated using hands-on architecture exercises and smell-tests. Though hands-on experimentation can lead to better results, the cost of this method can be prohibitive, so often only a handful of solutions that pass the first phase screening can be evaluated this way.

Platform evaluation criteria depend on the project requirements and may include:

developer productivity
platform stability
roadmap alignment with projected requirements
tools support
information security
strategic partnerships
developer ecosystem
existing software license and human capital investments
etc.

Performance and scalability are often high priority concerns. They are also among those platform properties that can be formulated into quantifiable criteria, though the key challenge here is how to model the user and implement performance tests that accurately model your expected workloads. Benchmarking several different platforms only adds to the cost.

A company called TechEmpower has started a project called TechEmpower Framework Benchmarks, or TFB for short, that aims to compare the performance of different web frameworks. The project publishes benchmark results that application developers can use to make more informed decisions when selecting frameworks. What's particularly interesting about Framework Benchmarks is that it's a collaborative effort conducted in an open manner. Development related discussions take place in an online forum and the source code repository is publicly available on GitHub. Doing test implementation development in the open is important for enabling peer review, and it allows implementations to evolve and improve over time. The project implements performance tests for a wide variety of frameworks, and chances are that the ones you're planning to use are included. If not, you can create your own tests and submit them to be included in the project code base. You can also take the tests and run the benchmarks on your own hardware.

Openly published test implementations are not only useful for producing benchmark data, but can also be used by framework developers to communicate framework performance related best practices to application developers. They also allow framework developers to receive reproducible performance benchmarking feedback and data for optimization purposes.
It's interesting to note that the test implementations have been designed and built by different groups and individuals, and some may have been more rigorously optimized than others. The benchmarks measure the performance of the framework as much as they measure the test implementation, and in some cases a suboptimal test implementation will result in poor overall performance. Framework torchbearers are expected to take their best shot at optimizing the test implementation, so the implementations should eventually converge to optimal solutions given enough active framework pundits.

Test types

In the project's parlance, the combination of programming language, framework and database used is termed a "framework permutation", or just permutation, and some test types have been implemented in 100+ different permutations. The different test types include:

JSON serialization: "test framework fundamentals including keep-alive support, request routing, request header parsing, object instantiation, JSON serialization, response header generation, and request count throughput."
Single database query: "exercise the framework's object-relational mapper (ORM), random number generator, database driver, and database connection pool."
Multiple database queries: "This test is a variation of Test #2 and also uses the World table. Multiple rows are fetched to more dramatically punish the database driver and connection pool. At the highest queries-per-request tested (20), this test demonstrates all frameworks' convergence toward zero requests-per-second as database activity increases."
Fortunes: "This test exercises the ORM, database connectivity, dynamic-size collections, sorting, server-side templates, XSS countermeasures, and character encoding."
Database updates: "This test is a variation of Test #3 that exercises the ORM's persistence of objects and the database driver's performance at running UPDATE statements or similar. The spirit of this test is to exercise a variable number of read-then-write style database operations."
Plaintext: "This test is an exercise of the request-routing fundamentals only, designed to demonstrate the capacity of high-performance platforms in particular. The response payload is still small, meaning good performance is still necessary in order to saturate the gigabit Ethernet of the test environment."

Notes on Round 9 results

Currently, the latest benchmark is Round 9 and the result data is published on the project web page. The data is not available in machine-readable form and it can't be sorted by column for analysing patterns. It can, however, be imported into a spreadsheet program fairly easily, so I took the data and analyzed it a bit. Some interesting observations could be made just by looking at the raw data.

In addition to comparing throughput, it's also interesting to compare how well frameworks scale. One way of quantifying scalability is to take the test implementation throughput figures for the lowest and highest concurrency level (for test types 1, 2, 4 and 6) per framework and plot them on a 2-D plane. A line can then be drawn between these two points, with the slope characterizing scalability. Well-scaling test implementations would be expected to have a positive, steep slope for test types 1, 2, 4 and 6, whereas for test types 3 and 5 the slope is expected to be negative. This model is not entirely without problems, since the scalability rating is not relative to the throughput, so e.g. a poorly performing framework can end up having a great scalability rating.
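As a rough illustration of that caveat, here is a small sketch (plain Java, hypothetical numbers, not the project's code) that computes the slope for a fast framework that barely scales and a slow framework that scales almost linearly:

public class ScalabilitySlope {

    static double slope(double lowConcurrency, double lowThroughput,
                        double highConcurrency, double highThroughput) {
        // slope of the line between the lowest- and highest-concurrency data points
        return (highThroughput - lowThroughput) / (highConcurrency - lowConcurrency);
    }

    public static void main(String[] args) {
        // fast framework: 900k req/s at concurrency 8, 1M req/s at concurrency 256
        double fast = slope(8, 900_000, 256, 1_000_000);
        // slow framework: 1k req/s at concurrency 8, 150k req/s at concurrency 256
        double slow = slope(8, 1_000, 256, 150_000);
        System.out.printf("fast: %.0f, slow: %.0f%n", fast, slow); // fast: 403, slow: 601
        // the slow framework gets the steeper slope despite far lower absolute throughput
    }
}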
As a result, you'd have to look at these figures together. To better visualize throughput against concurrency level ("Peak Hosting" environment data), I created a small web app that's available at http://tfb-kippo.rhcloud.com/ (the app is subject to removal without notice).

JSON serialization

The JSON serialization test aims to measure framework overhead. One could argue that it's a bit of a micro benchmark, but it should demonstrate how well the framework does with basic tasks like request routing, JSON serialization and response generation. The top 10 frameworks were based on the following programming languages: C++, Java, Lua, Ur and Go. The C++ based CPPSP was the clear winner, while the next 6 contestants were Java-based. No database is used in this test type. The top 7 frameworks with the highest throughput also have the highest scalability rating. After that, both figures start declining fairly rapidly. This is a very simple test and it's a bit of a surprise to see such large variation in results. In their commentary TechEmpower attributes some of the differences to how well frameworks work on a NUMA-based system architecture. Quite many frameworks are Java or JVM based and rather large variations exist even within this group, so clearly neither the language nor the JVM is an impeding factor in this group. I was surprised about the Node.js and HHVM rankings. Unfortunately, the Scala-based Spray test implementation, as well as the JVM-based polyglot framework Vert.x implementation, were removed due to being outdated. I hope to see these included in a future benchmark round.

Single database query

This test type measures database access throughput and parallelizability. Again, a surprisingly large spread in performance can be observed for a fairly trivial test case. This would seem to suggest that framework or database access method overhead contributes significantly to the results. Is the database access technology (DB driver or ORM) a bottleneck? Or is the backend system the bottleneck? It would be interesting to look at the system activity reports from the test runs to analyze potential bottlenecks in more detail. Before seeing the results I would've expected the DB backend to be the bottleneck, but this doesn't appear to be clear-cut, based on the fact that the top, as well as many of the bottom performing, test implementations use the same DB. It was interesting to note that the top six test implementations use a relational database, with the first NoSQL based implementation taking 7th place. This test runs DB read statements by ID, which NoSQL databases should be very good at. The top performing 10 frameworks were based on the Java, C++, Lua and PHP languages and use the MySQL, PostgreSQL and MongoDB databases. The Java based Gemini leads, with CPPSP being second; both use the MySQL DB. The Spring based test implementation's performance was a bit of a disappointment.

Multiple database queries

Where the previous test exercised a single database query per request, this test does a variable number of database queries per request. Again, I would've assumed this test would measure the backend database performance more than the framework performance, but it seems that framework and database access method overhead can also contribute significantly. The top two performers in this test are Dart based implementations that use MongoDB. The top 10 frameworks in this test are based on the Dart, Java, Clojure, PHP and C# languages, and they use the MongoDB and MySQL databases.
Fortunes

This is the most complex test, aiming to exercise the full framework stack from request routing through business logic execution, database access, templating and response generation. The top 10 frameworks are based on the C++, Java, Ur, Scala and PHP languages, with the full spectrum of databases being used (MySQL, PostgreSQL and MongoDB).

Database updates

In addition to reads, this test exercises database updates as well. HHVM wins this test, with 3 Node.js based frameworks coming next. Similar to the Single database query test, the top 13 implementations work with the relational MySQL DB, ahead of the NoSQL implementations. This test exercises simple read and write data access by ID which, again, should be one of NoSQL databases' strong points. The top performing 10 frameworks were based on the PHP, JavaScript, Scala, Java and Go languages, all of which use the MySQL database.

Plaintext

The aim of this test is to measure how well the framework performs under extreme load conditions and massive client parallelism. Since there are no backend system dependencies involved, this test measures platform and framework concurrency limits. Throughput plateaus or starts degrading with the top-performing frameworks in this test before the client concurrency level reaches the maximum value, which seems to suggest that a bottleneck is being hit somewhere in the test setup, presumably hardware, OS and/or framework concurrency. Many frameworks are at their best with a concurrency level of 256, except CPPSP which peaks at 1024. CPPSP is the only one of the top-performing implementations that is able to significantly improve its performance as the concurrency level increases from 256, but even with CPPSP, throughput actually starts dropping after the concurrency level hits the 4,096 mark. Only 12 test implementations are able to exceed 1 M requests per second. Some well-known platforms, e.g. Spring, did surprisingly poorly. There seems to be something seriously wrong with the HHVM test run, as it generates only tens of responses per second with concurrency levels 256 and 1024. The top 10 frameworks are based on the C++, Java, Scala and Lua languages. No database is used in this test.

Benchmark repeatability

In the scientific world, research must be repeatable in order to be credible. Similarly, the benchmark test methodology and relevant circumstances should be documented to make the results repeatable and credible. There are a few details that could be documented to improve repeatability.

The benchmarking project source code doesn't seem to be tagged. Tagging would be essential for making benchmarks repeatable. A short description of the hardware and some other test environment parameters is available on the benchmark project web site. However, the environment setup (hardware + software) is expected to change over time, so this information should be documented per round. Also, the Linux distribution minor release and the exact Linux kernel version don't appear to be identified. Detailed data about what goes on inside the servers could be published, so that outsiders could analyze benchmark results in a more meaningful way. System activity reports, e.g. system resource usage (CPU, memory, IO), can provide valuable clues to possible scalability issues. Also, application, framework, database and other logs can be useful to test implementers.

Resin was chosen as the Java application server over Apache Tomcat and other servlet containers due to performance reasons.
While I'm not contesting this statement, there wasn't any mention of the software versions used, and since performance attributes tend to change over time between releases, this premise is not repeatable. Neither the exact JVM version nor the JVM arguments are documented for JVM based test implementation execution. Default JVM arguments are used if test implementations don't override the settings. Since the test implementations have very similar execution profiles by definition, it could be beneficial to explicitly configure and share some JVM flags that are commonly used with server-side applications. Also, due to JVM ergonomics, different GC parameters can be automatically selected based on underlying server capacity and JVM version. Documenting these parameters per benchmark round would help with repeatability. Perhaps all the middleware software versions could be logged during test execution and the full test run logs could be made available.

A custom test implementation: Asynchronous Java + NoSQL DB

Since I've worked recently on implementing RESTful services based on the JAX-RS 2 API with asynchronous processing (based on the Jersey 2 implementation) and the Apache Cassandra NoSQL database, I got curious about how this combination would perform against the competition, so I started coding my own test implementation. I decided to drop JAX-RS in this case, however, to eliminate any non-essential abstraction layers that might have a negative impact on performance.

One of the biggest hurdles in getting started with test development was that, at the time I started my project, there wasn't a way to test run platform installation scripts in smaller pieces, and you had to run the full installation, which took a very long time. Fortunately, since then the framework installation procedure has been compartmentalized, so it's possible to install just the framework that you're developing tests for. Also, recently the project has added support for fully automated development environment setup with Vagrant, which is a great help. Another excellent addition is Travis CI integration, which allows test implementation developers to gain additional assurance that their code is working as expected also outside their sandbox. Unfortunately, Travis builds can take a very long time, so you might need to disable some of the tests that you're not actively working on. The Travis CI environment is also a bit different from the developer and the actual benchmarking environments, so you could bump into issues with Travis builds that don't occur in the development environment, and vice versa. Travis build failures can sometimes be very obscure and tricky to troubleshoot.

The actual test implementation code is easy enough to develop and test in isolation, outside of the real benchmark environment, but if you're adding support for new platform components such as databases, or testing platform installation scripts, it's easiest if you have an environment that's a close replica of the actual benchmarking environment. In this case, adding support for a new database involved creating a new DB schema, test data generation, and automating database installation and configuration. Implementing the actual test permutation turned out to be interesting, but surprisingly laborious as well.
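The implementation relies on Servlet 3 asynchronous processing, described in more detail below; as a rough, hypothetical sketch (not the actual benchmark code), the pattern looks something like this:

import java.io.IOException;

import javax.servlet.AsyncContext;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(urlPatterns = "/db", asyncSupported = true)
public class AsyncDbServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        // suspend the request; the container thread is released immediately
        final AsyncContext ctx = req.startAsync();
        // in the real implementation a database driver callback (e.g. the DataStax
        // async API) would complete the response on another thread
        ctx.start(new Runnable() {
            @Override
            public void run() {
                try {
                    ctx.getResponse().getWriter().write("{\"id\":1}");
                } catch (IOException e) {
                    // real code would map this to a proper error response
                } finally {
                    ctx.complete();
                }
            }
        });
    }
}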
I started seeing strange error responses occasionally when benchmarking my test implementation with ab and wrk, especially with higher loads. TFB executes Java based performance implementations in the Resin web container, and after a while of puzzlement about the errors, I decided to test the code in other web containers, namely Tomcat and Jetty. It turned out that I had bumped into 1 Resin bug (5776) and 2 Tomcat bugs (56736, 56739) related to servlet asynchronous processing support.

Architecturally, test types 1 and 6 have been implemented using the traditional synchronous Servlet API, while the rest of the test implementations leverage non-blocking request handling through Servlet 3 asynchronous processing support. The test implementations store their data in the Apache Cassandra 2 NoSQL database, which is accessed using the DataStax Java Driver. Asynchronous processing is also used in the data access tier in order to minimize resource consumption. JSON data is processed with the Jackson JSON library.

In Java versions predating version 8, asynchronous processing requires passing around callbacks in the form of anonymous classes, which can at times be a bit high-ceremony syntactically. Java 8 lambda expressions do away with some of the ceremonial overhead, but unfortunately TFB doesn't yet fully support the latest Java version. I've previously used the JAX-RS 2 asynchronous processing API, but not the Servlet 3 async API. One thing I noticed during the test implementation was that the mechanism provided by the Servlet 3 async API for generating an error response to the client is much lower level, less intuitive and more cumbersome than its JAX-RS async counterpart.

The test implementation code was merged in the FrameworkBenchmarks code base, so it should be benchmarked on the next round. The code can be found here: https://github.com/TechEmpower/FrameworkBenchmarks/tree/master/frameworks/Java/servlet3-cass

Conclusions

TechEmpower's Framework Benchmarks is a really valuable contribution to the web framework developer and user community. It holds great potential for enabling friendly competition between framework developers, as well as framework users, and thus driving up the performance of popular frameworks and the adoption of framework performance best practices. As always, there's room for improvement. Some areas from a framework user and test implementer point of view include: make the benchmark tests and results more repeatable, publish raw benchmark data for analysis purposes, and work on making test development and adding new framework components even easier. Good job TFB team + contributors – can't wait to see the Round 10 benchmark data!

Reference: An open web application framework benchmark from our JCG partner Marko Asplund at the practicing techie blog....

Runtime Class Loading to Support a Changing API

I maintain an IntelliJ plugin that improves the experience of writing Spock specifications. A challenge of this project is supporting multiple, incompatible IntelliJ API versions in a single codebase. The solution is simple in retrospect (it's an example of the adapter pattern in the wild), but it originally took a bit of thought and example hunting. I was in the code again today to fix support for a new version, and I decided to document how I originally solved the problem.

The fundamental issue is that my compiled code could be loaded in a JVM runtime environment with any of several different API versions present. My solution was to break up the project into these parts:

A main project that doesn't depend on any varying API calls and is therefore compatible across all API versions. The main project also has code that loads the appropriate adapter implementation based on the runtime environment it finds itself in. In this case, I'm able to take advantage of the IntelliJ PicoContainer for service lookup, but the reflection API or dependency injection also have what's needed.
A set of abstract adapters that provide an API for the main project to use. This project also doesn't depend on any code that varies across API versions.
Sets of classes that implement the abstract adapters for each supported API version. Each set of adapters wraps changing API calls and is compiled against a specific API version.

The simplest case to deal with is a refactor where something in the API moves. This is also what actually broke with this last version. My main code needs the Groovy instance of com.intellij.lang.Language. This instance moved in IntelliJ 14. This code was constant until 14, so in this case I'm adding a new adapter. In the adapter module, I have an abstract class LanguageLookup.java:

package com.cholick.idea.spock;

import com.intellij.lang.Language;
import com.intellij.openapi.components.ServiceManager;

public abstract class LanguageLookup {
    public static LanguageLookup getInstance() {
        return ServiceManager.getService(LanguageLookup.class);
    }

    public abstract Language groovy();
}

The lowest IntelliJ API version that I support is 11. Looking up the Groovy language instance is constant across 11-13, so the first concrete adapter lives in the module compiled against the IntelliJ 11 API.

LanguageLookup11.java:

package com.cholick.idea.spock;

import com.intellij.lang.Language;
import org.jetbrains.plugins.groovy.GroovyFileType;

public class LanguageLookup11 extends LanguageLookup {
    public Language groovy() {
        return GroovyFileType.GROOVY_LANGUAGE;
    }
}

The newest API introduced the breaking change, so a second concrete adapter lives in a module compiled against version 14 of their API.
LanguageLookup14.java:

package com.cholick.idea.spock;

import com.intellij.lang.Language;
import org.jetbrains.plugins.groovy.GroovyLanguage;

public class LanguageLookup14 extends LanguageLookup {
    public Language groovy() {
        return GroovyLanguage.INSTANCE;
    }
}

Finally, the main project has a class SpockPluginLoader.java that registers the proper adapter class based on the runtime API that's loaded (I omitted several methods not specifically relevant to the example):

package com.cholick.idea.spock.adapter;

import com.cholick.idea.spock.LanguageLookup;
import com.cholick.idea.spock.LanguageLookup11;
import com.cholick.idea.spock.LanguageLookup14;
import com.intellij.openapi.application.ApplicationInfo;
import com.intellij.openapi.components.ApplicationComponent;
import com.intellij.openapi.components.impl.ComponentManagerImpl;
import org.jetbrains.annotations.NotNull;
import org.picocontainer.MutablePicoContainer;

public class SpockPluginLoader implements ApplicationComponent {
    private ComponentManagerImpl componentManager;

    SpockPluginLoader(@NotNull ComponentManagerImpl componentManager) {
        this.componentManager = componentManager;
    }

    @Override
    public void initComponent() {
        MutablePicoContainer picoContainer = componentManager.getPicoContainer();
        registerLanguageLookup(picoContainer);
    }

    private void registerLanguageLookup(MutablePicoContainer picoContainer) {
        if (isAtLeast14()) {
            picoContainer.registerComponentInstance(LanguageLookup.class.getName(), new LanguageLookup14());
        } else {
            picoContainer.registerComponentInstance(LanguageLookup.class.getName(), new LanguageLookup11());
        }
    }

    private IntelliJVersion getVersion() {
        int version = ApplicationInfo.getInstance().getBuild().getBaselineVersion();
        if (version >= 138) {
            return IntelliJVersion.V14;
        } else if (version >= 130) {
            return IntelliJVersion.V13;
        } else if (version >= 120) {
            return IntelliJVersion.V12;
        }
        return IntelliJVersion.V11;
    }

    private boolean isAtLeast14() {
        return getVersion().compareTo(IntelliJVersion.V14) >= 0;
    }

    enum IntelliJVersion {
        V11, V12, V13, V14
    }
}

Finally, in code where I need the Groovy com.intellij.lang.Language, I get a hold of the LanguageLookup service and call its groovy method:

...
Language groovy = LanguageLookup.getInstance().groovy();
if (PsiUtilBase.getLanguageAtOffset(file, offset).isKindOf(groovy)) {
...

This solution allows the same compiled plugin JAR to support IntelliJ's varying API across versions 11-14. I imagine that Android developers commonly implement solutions like this, but it's something I'd never had to write as a web application developer.

Reference: Runtime Class Loading to Support a Changing API from our JCG partner Matt Cholick at the Cholick.com blog....

Friday-Benchmarking Functional Java

Let's imagine our product owner goes crazy one day and asks you to do the following: from a set of Strings such as

"marco_8", "john_33", "marco_1", "john_33", "thomas_5", "john_33", "marco_4", ....

give me a comma separated String with only the marco's numbers, and the numbers need to be in order. Example of expected result: "1,4,8"

I will implement this logic in 4 distinct ways and I will micro benchmark each one of them. The ways I'm going to implement the logic are:

Traditional Java with loops and all.
Functional with Guava.
Functional with Java 8 stream.
Functional with Java 8 parallelStream.

Code is below, or in a gist:

package com.marco.brownbag.functional;

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

import com.google.common.base.Function;
import com.google.common.base.Joiner;
import com.google.common.base.Predicates;
import com.google.common.collect.Collections2;
import com.google.common.collect.Ordering;

public class MicroBenchMarkFunctional {

    private static final int totStrings = 2;

    public static void main(String[] args) {
        Set<String> someNames = new HashSet<String>();
        init(someNames);
        for (int i = 1; i < totStrings; i++) {
            someNames.add("marco_" + i);
            someNames.add("someone_else_" + i);
        }
        System.out.println("start");
        run(someNames);
    }

    private static void run(Set<String> someNames) {
        System.out.println("========================");
        long start = System.nanoTime();
        int totalLoops = 20;
        for (int i = 1; i < totalLoops; i++) {
            classic(someNames);
        }
        System.out.println("Classic         : " + ((System.nanoTime() - start)) / totalLoops);

        start = System.nanoTime();
        for (int i = 1; i < totalLoops; i++) {
            guava(someNames);
        }
        System.out.println("Guava           : " + ((System.nanoTime() - start)) / totalLoops);

        start = System.nanoTime();
        for (int i = 1; i < totalLoops; i++) {
            stream(someNames);
        }
        System.out.println("Stream          : " + ((System.nanoTime() - start)) / totalLoops);

        start = System.nanoTime();
        for (int i = 1; i < totalLoops; i++) {
            parallelStream(someNames);
        }
        System.out.println("Parallel Stream : " + ((System.nanoTime() - start)) / totalLoops);

        System.out.println("========================");
    }

    private static void init(Set<String> someNames) {
        someNames.add("marco_1");
        classic(someNames);
        guava(someNames);
        stream(someNames);
        parallelStream(someNames);
        someNames.clear();
    }

    private static String stream(Set<String> someNames) {
        return someNames.stream().filter(element -> element.startsWith("m")).map(element -> element.replaceAll("marco_", "")).sorted()
                .collect(Collectors.joining(","));
    }
someNames.parallelStream().filter(element -> element.startsWith("m")).map(element -> element.replaceAll("marco_", "")).sorted()                                 .collect(Collectors.joining(","));         }        private static String guava(Set<String> someNames) {                 return Joiner.on(',').join(                                 Ordering.from(String.CASE_INSENSITIVE_ORDER).immutableSortedCopy(                                                 Collections2.transform(Collections2.filter(someNames, Predicates.containsPattern("marco")), REPLACE_MARCO)));        }        private static Function<String, String> REPLACE_MARCO = new Function<String, String>() {                 @Override                 public String apply(final String element) {                         return element.replaceAll("marco_", "");                 }         };        private static String classic(Set<String> someNames) {                List<String> namesWithM = new ArrayList<String>();                for (String element : someNames) {                         if (element.startsWith("m")) {                                 namesWithM.add(element.replaceAll("marco_", ""));                         }                 }                Collections.sort(namesWithM);                StringBuilder commaSeparetedString = new StringBuilder();                Iterator<String> namesWithMIterator = namesWithM.iterator();                 while (namesWithMIterator.hasNext()) {                         commaSeparetedString.append(namesWithMIterator.next());                         if (namesWithMIterator.hasNext()) {                                 commaSeparetedString.append(",");                         }                }                return commaSeparetedString.toString();        } } Two points before we dig into performance :Forget about the init() method, that one is just to initialize objects in the jvm otherwise numbers are just crazy. The java 8 functional style looks nicer and cleaner than guava and than developing in a traditional way!Performance: Running that program on my mac with 4 cores, the result is the following : ======================== Classic         : 151941400 Guava           : 238798150 Stream          : 151853850 Parallel Stream : 55724700 ======================== Parallel Stream is 3 times faster. This is because java will split the job in multiple tasks (total of tasks depends on your machine, cores, etc) and will run them in parallel, aggregating the result at the end. Classic Java and java 8 stream have more or less the same performance. Guava is the looser. That is amazing, so someone could think: “cool, I can just always use parallelStream and I will have big bonus at the end of the year”.  But life is never easy. Here is what happens when you reduce that Set of strings from 200.000 to 20: ======================== Classic         : 36950 Guava           : 69650 Stream          : 29850 Parallel Stream : 143350 ======================== Parallel Stream became damn slow. This because parallelStream has a big overhead in terms of initializing and managing multitasking and assembling back the results. Java 8 stream looks now the winner compare to the other 2. Ok, at this point, someone could say something like : “for collections with lots of elements I use parallelStream, otherwise I use stream.” That would be nice and simple to get, but what happens when I reduce that Set again from 20 to 2? 
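Purely as an illustration (this helper is not part of the original benchmark), that rule of thumb could be encoded as a small dispatch method dropped into the same class as the code above, so the existing imports apply. The PARALLEL_THRESHOLD value is an arbitrary placeholder; only measurement on your own data and hardware can tell you where the real break-even point is.

    // Hypothetical helper, not from the original post: choose the stream flavour
    // based on the size of the input. PARALLEL_THRESHOLD is an illustrative guess;
    // the real break-even point has to be measured.
    private static final int PARALLEL_THRESHOLD = 10_000;

    private static String adaptive(Set<String> someNames) {
        java.util.stream.Stream<String> source = someNames.size() >= PARALLEL_THRESHOLD
                ? someNames.parallelStream()
                : someNames.stream();
        return source.filter(element -> element.startsWith("m"))
                     .map(element -> element.replaceAll("marco_", ""))
                     .sorted()
                     .collect(Collectors.joining(","));
    }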
That would be nice and simple, but what happens when I reduce that Set again from 20 to 2? This:

========================
Classic         : 8500
Guava           : 20050
Stream          : 24700
Parallel Stream : 67850
========================

Classic Java loops are faster with very few elements. So at this point I can go back to my crazy product owner and ask how many Strings he expects to have in that input collection. 20? Fewer? More? Many more? As the carpenter says: measure twice, cut once!

Reference: Friday-Benchmarking Functional Java from our JCG partner Marco Castigliego at the Remove duplication and fix bad names blog.

Why You Should NOT Implement Layered Architecture

Abstraction layers in software are what architecture astronauts tell you to do. Instead, however, half of all applications out there would be so easy, fun, and most importantly productive to implement if you just got rid of all those layers. Frankly, what do you really need? You need these two:
- Some data access
- Some UI
Because those are the two things you inevitably have in most systems: users and data. Here's Kyle Boon's opinion on the possible choices you may have:

"Really enjoying #ratpack and #jooq." — Kyle Boon (@kyleboon), September 2, 2014

Very nice choice, Kyle. Ratpack and jOOQ. You could choose any other APIs, of course. You could even choose to write JDBC directly in JSP. Why not. As long as you don't go and pile up 13 layers of abstraction.

That's all bollocks, you're saying? We need layers to abstract away the underlying implementation so we can change it? OK, let's give this some serious thought. How often do you really change the implementation? Some examples:
- SQL. You hardly change the implementation from Oracle to DB2.
- DBMS. You hardly change the model from relational to flat or XML or JSON.
- JPA. You hardly switch from Hibernate to EclipseLink.
- UI. You simply don't replace HTML with Swing.
- Transport. You just don't switch from HTTP to SOAP.
- Transaction layer. You just don't substitute JavaEE with Spring, or JDBC transactions.

Nope. Your architecture is probably set in stone. And if, by the incredible influence of entropy and fate, you happen to have made the wrong decision in one aspect about 3 years ago, well, you're in for a major refactoring anyway. If SQL was the wrong choice, good luck migrating everything to MongoDB (which is per se the wrong choice again, so prepare to migrate back). If HTML was the wrong choice, even tougher luck to you. The likelihood of your layers not really helping you when a concrete incident happens: 95% (because you missed an important detail).

Layers = Insurance

If you're still thinking about implementing an extremely nice layered architecture, ready to deal with pretty much every situation where you simply swap a complete stack for another, then what you're really doing is filing a dozen insurance policies. Think about it this way. You can get:
- Legal insurance
- Third party insurance
- Reinsurance
- Business interruption insurance
- Business overhead expense disability insurance
- Key person insurance
- Shipping insurance
- War risk insurance
- Payment protection insurance
- … pick a random category

You can pay and pay and pay in advance for things that probably won't ever happen to you. Will they? Yeah, they might. But if you buy all that insurance, you pay heavily up front. And let me tell you a secret: IF any incident ever happens, chances are that you:
- Didn't buy that particular insurance
- Aren't covered appropriately
- Didn't read the policy
- Got screwed

And you're doing exactly that in every application that would otherwise already be finished and already adding value to your customer, while you're still debating whether, on layer 37 between the business rules and transformation layers, you actually need another abstraction because the rule engine could be switched at any time.

Stop doing that

You get the point. If you have infinite amounts of time and money, implement an awesome, huge architecture up front. Your competitor's time to market (and fun, on the way) will be better than yours. For contrast, the sketch below shows roughly what the bare "data access plus UI" end of the spectrum can look like.
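The following is a bare-bones sketch of that two-layer style, not taken from the original post: a servlet as the UI and plain JDBC as the data access, with nothing in between. The datasource JNDI name, table and column names are made-up placeholders, and a real application would at least HTML-escape the output.

import java.io.IOException;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.annotation.Resource;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.sql.DataSource;

@WebServlet("/users")
public class UserListServlet extends HttpServlet {

    // Container-managed connection pool; the JNDI name is an assumption.
    @Resource(lookup = "java:jboss/datasources/ExampleDS")
    private DataSource dataSource;

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        resp.setContentType("text/html");
        PrintWriter out = resp.getWriter();
        out.println("<ul>");
        // Data access and "view" in one place: query the (hypothetical) users table
        // and render each row as a list item.
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement("select name from users order by name");
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                out.println("<li>" + rs.getString("name") + "</li>");
            }
        } catch (SQLException e) {
            throw new ServletException(e);
        }
        out.println("</ul>");
    }
}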
But for a short period of time, you were that close to the perfect, layered architecture!

Reference: Why You Should NOT Implement Layered Architecture from our JCG partner Lukas Eder at the JAVA, SQL, AND JOOQ blog.