
Agile outside of software

Later this week I'm doing a presentation at Agile On The Beach entitled "Agile outside of software development". (I resisted the temptation to call it "Agile Beyond Software".) The presentation will attempt to answer a question which is often asked at, and around, Agile On The Beach: "Is Agile only for Software Development?" In truth it is not just at Agile On The Beach (AOTB) that this question is asked – I hear it more and more often elsewhere – but the origins of AOTB, and the target audience for AOTB, mean that the question is very much on topic. I'll post the presentation online after I have delivered it – until then I'll be tweaking it – but right now I'd like to (kind of by way of rehearsal) run down my main argument, so here goes….

Before the term Agile Software Development was coined there was "Agile Manufacturing". This was the term that inspired the software folks, so perhaps the first answer is: "Yes! Of course Agile works outside of software, because that is where it came from." But Agile Software Development has far and away surpassed Agile Manufacturing in writing and mindshare – so much so that others in the wider business community have looked at Agile Software Development as a model for other types of working. In a way, "Agile" has come full circle.

Perhaps it is worth pausing for a moment and asking: "What do we mean by Agile?" or "What do we want from an Agile company?" This could be a big, big debate; ultimately it is for each company, each team, to decide what Agile means to them. Rather than have that discussion let me quote MIT Professor Michael A. Cusumano: "I can't think of anything more important than building an agile company, because the world changes so quickly and unpredictably… [Agility] comes in different forms, but basically it's the ability to quickly adapt to or even anticipate and lead change. Agility in the broadest form affects strategic thinking, operations, technology innovation and the ability to innovate in products, processes and business models." (MIT Sloan Management Review, Summer 2011, interview with Hopkins)

Let's add to that a little bit by defining Agility from three views:

Agile Strategy: adaptability, listen to the customer, lead the market, use change competitively.
Agile Tactically: experimenting, "Expeditionary Marketing", live in the now while preparing for the future.
Agile Operations: deliver fast, deliver quality, deliver value.

We could go on but that's enough for now. Now that we have an approximate understanding of what we want, we can go back to the original question: "Is Agile only for Software Development?" There are three parts to this answer: Practices, Roots and Case Studies.

Practices

Lots of the practices associated with Agile actually come from elsewhere. Examples of stand-up meetings proliferate – bars, healthcare, the military, Japanese local government and so on. Many companies operate regular status or planning meetings; Agile just elevates this practice. WIP limits are well established in manufacturing. Agile picks up some practices directly – visual boards ("information radiators") are nothing new. Some practices it picks up and changes – retrospectives have long existed as "Lessons Learned" or "After Action Reviews". Some practices Agile (might have) invented – planning poker, but this is itself a version of wideband Delphi. And some Agile plays back to the business – TDD and BDD are writ large in Lean Startup (which is itself an extreme version of Expeditionary Marketing from 1991, Hamel & Prahalad).
Roots

As already mentioned, Agile Software Development was inspired by Agile Manufacturing. I've described my Agile Pyramid model before (Agile and Lean – the same but different, How do you make Lean Practical?) and my argument that "Agile is Lean thinking applied to software development", and I wrote a whole book arguing that software development, and Agile software development specifically, is an example of Organizational Learning. Well, Agile Software Development is not the only application built on these foundations. Obviously the Toyota Production System is. And so too is its close cousin the Ford Production System. Look further afield and you will find Last Planner in construction. I could continue but I think my point is proved.

While not every Agile practice can be taken out of software development and used someplace else, the roots of Agile mean that the principles, values and ideas which Agile is built on can be. In your domain Agile as now known might work quite well; in someone else's domain there may be more need to think deeper. One caveat: I believe the more deeply you look at Agile, and the more Agile is applied outside of software development, the more it looks like Lean. Ultimately the distinction between Lean and Agile breaks down, in my book.

Case Studies

Good news: there are case studies of teams using Agile outside of software – and if you know of any more please tell me! Add them in the comments on this blog post. Bad news: there are not that many case studies. Unless you are a software team you probably can't find one.

For the Agile on the Beach conference I have deliberately sought out Agile outside of software development in the last few years. This has resulted in two good examples. Two years ago Kate Sullivan described how the Lonely Planet legal department adopted Agile working – in fact Kate said all of Lonely Planet adopted Agile. Last year Martin Rowe talked about how he used Scrum to manage the Foundation Computer Science course at Plymouth University. Elsewhere, six years ago the MIT Sloan Management Review carried a piece by Keith R. McFarland entitled "Should you build strategy like you build software?" in which he described how Shamrock Foods of Arizona used Agile to plan and execute their business strategy. Myself, I have seen the marketing team at Sullivan Cuff Software Ltd use an Agile-like approach. And last year I helped the GSMA (the people responsible for SIM cards) use Agile on a project writing a document for cellphone manufacturers, mobile network operators and umpteen suppliers and partners to allow mobile phones to be used for loyalty coupons.

So, back to the original question: "Is Agile only for Software Development?" Answer: No. Question: Will Agile work outside software development? Answer: Yes – but the detail may vary.

Finally, and please excuse the plug: as a result of this discussion my company has adapted its successful Foundations of Agile Software Development 2-day course for companies outside software. Please take a look at Agile Kick-Off (for non-software teams) and let me know what you think.

Reference: Agile outside of software from our JCG partner Allan Kelly at the Agile, Lean, Patterns blog....

Really Dynamic Declarative Components

In this short post I am going to focus on ADF dynamic declarative components – I mean the well known ADF tag af:declarativeComponent. It can be used as a pretty convenient way to design a page as a composition of page fragments and components. For example, our page can contain the following code snippet:

<af:declarativeComponent viewId="PageFragment.jsff" id="dc1">
  <f:facet name="TheFacet">
    <af:button text="button 1" id="b1"/>
  </f:facet>
</af:declarativeComponent>

And PageFragment.jsff is a usual page fragment like this one:

<?xml version='1.0' encoding='UTF-8'?>
<jsp:root xmlns:jsp="http://java.sun.com/JSP/Page" version="2.1"
          xmlns:af="http://xmlns.oracle.com/adf/faces/rich">
  <af:panelGroupLayout id="pgl1">
    <af:outputText value="This is a page fragment.
                          You can add your content to the following facet:"
                   id="ot1"/>
    <af:facetRef facetName="TheFacet"/>
  </af:panelGroupLayout>
</jsp:root>

If we need to be able to pass some parameters to a page fragment, we can define the fragment as a component:

<?xml version='1.0' encoding='UTF-8'?>
<jsp:root xmlns:jsp="http://java.sun.com/JSP/Page" version="2.1"
          xmlns:af="http://xmlns.oracle.com/adf/faces/rich">
<af:componentDef var="attrs">
  <af:xmlContent>
    <component xmlns="http://xmlns.oracle.com/adf/faces/rich/component">
      <facet>
        <facet-name>TheFacet</facet-name>
      </facet>
      <attribute>
        <attribute-name>Title</attribute-name>
      </attribute>
    </component>
  </af:xmlContent>
  <af:panelGroupLayout id="pgl1">
    <af:outputText value="This is a component #{attrs.Title}.
                          You can add your content to the following facet:" id="ot1"/>
    <af:facetRef facetName="TheFacet"/>
  </af:panelGroupLayout>
</af:componentDef>
</jsp:root>

In this example we can pass the value of the Title attribute, as shown in this code snippet:

<af:declarativeComponent viewId="ComponentFragment.jsff"
                         id="dc2"
                         Title="Button Container">
  <f:facet name="TheFacet">
    <af:button text="button 2" id="b2"/>
  </f:facet>
</af:declarativeComponent>

And the coolest thing about this technique is that the viewId attribute can accept not only static strings, but EL expressions as well:

<af:declarativeComponent viewId="#{TheBean.fragmentViewID}"
                         id="dc1">
  <f:facet name="TheFacet">
    <af:button text="button 1" id="b1"/>
  </f:facet>
</af:declarativeComponent>

public String getFragmentViewID() {
    return "PageFragment.jsff";
}

Actually that's why this construction is called dynamic, and that's why this feature can be considered a powerful tool for building well structured, flexible and dynamic UIs. That's it!

Reference: Really Dynamic Declarative Components from our JCG partner Eugene Fedorenko at the ADF Practice blog....

Use Cases for Elasticsearch: Geospatial Search

In the previous posts we have seen that Elasticsearch can be used to store documents in JSON format and distribute the data across multiple nodes as shards and replicas. Lucene, the underlying library, provides the implementation of the inverted index that can be used to search the documents. Analyzing is a crucial step for building a good search application. In this post we will look at a different feature that can be used for applications you would not immediately associate Elasticsearch with. We will look at the geo features that can be used to build applications that can filter and sort documents based on the location of the user.

Locations in Applications

Location based features can be useful for a wide range of applications. For merchants the web site can present the closest point of service for the current user. Or there is a search facility for finding points of service according to a location, often integrated with something like Google Maps. For classifieds it can make sense to sort them by distance from the user searching; the same is true for any search for locations like restaurants and the like. Sometimes it also makes sense to only show results that are in a certain area around me – in this case we need to filter by distance. Probably the user is looking for a new apartment and is not interested in results that are too far away from his workplace. Finally, locations can also be of interest when doing analytics. Social media data can tell you where something interesting is happening just by looking at the amount of status messages sent from a certain area.

Most of the time locations are stored as a pair of latitude and longitude, which denotes a point. The combination of 48.779506, 9.170045 for example points to Liederhalle Stuttgart, which happens to be the location for Java Forum Stuttgart. Geohashes are an alternative means to encode latitude and longitude. They can be stored in arbitrary precision, so they can also refer to a larger area instead of a point. When calculating a geohash the map is divided into several buckets or cells. Each bucket is identified by a base 32 encoded value. The complete geohash then consists of a sequence of characters. Each following character marks a bucket within the previous bucket, so you are zooming in on the location. The longer the geohash string, the more precise the location is. For example u0wt88j3jwcp is the geohash for Liederhalle Stuttgart. The prefix u0wt on the other hand is the area of Stuttgart and some of the surrounding cities. The hierarchical nature of geohashes and the possibility to express them as strings makes them a good choice for storing them in the inverted index. You can create geohashes using the original geohash service or, more visually appealing, using the nice GeohashExplorer.

Locations in Elasticsearch

Elasticsearch accepts lat and lon for specifying latitude and longitude. These are two documents: one for a conference in Stuttgart and one in Nuremberg.
{ "title" : "Anwendungsfälle für Elasticsearch", "speaker" : "Florian Hopf", "date" : "2014-07-17T15:35:00.000Z", "tags" : ["Java", "Lucene"], "conference" : { "name" : "Java Forum Stuttgart", "city" : "Stuttgart", "coordinates": { "lon": "9.170045", "lat": "48.779506" } } } { "title" : "Anwendungsfälle für Elasticsearch", "speaker" : "Florian Hopf", "date" : "2014-07-15T16:30:00.000Z", "tags" : ["Java", "Lucene"], "conference" : { "name" : "Developer Week", "city" : "Nürnberg", "coordinates": { "lon": "11.115358", "lat": "49.417175" } } } Alternatively you can use the GeoJSON format, accepting an array of longitude and latitude. If you are like me be prepared to hunt down why queries aren’t working just to notice that you messed up the order in the array. The field needs to be mapped with a geo_point field type. { "properties": { […], "conference": { "type": "object", "properties": { "coordinates": { "type": "geo_point", "geohash": "true", "geohash_prefix": "true" } } } } }' By passing the optional attribute geohash Elasticsearch will automatically store the geohash for you as well. Depending on your usecase you can also store all the parent cells of the geohash using the parameter geohash_prefix. As the values are just strings this is a normal ngram index operation which stores the different substrings for a term, e.g. u, u0, u0w and u0wt for u0wt. With our documents in place we can now use the geo information for sorting, filtering and aggregating results. Sorting by Distance First, let’s sort all our documents by distance from a point. This would allow us to build an application that displays the closest location for the current user. curl -XPOST "http://localhost:9200/conferences/_search " -d' { "sort" : [ { "_geo_distance" : { "conference.coordinates" : { "lon": 8.403697, "lat": 49.006616 }, "order" : "asc", "unit" : "km" } } ] }' We are requesting to sort by _geo_distance and are passing in another location, this time Karlsruhe, where I live. Results should be sorted ascending so the closer results come first. As Stuttgart is not far from Karlsruhe it will be first in the list of results. The score for the document will be empty. Instead there is a field sort that contains the distance of the locations from the one provided. This can be really handy when displaying the results to the user. Filtering by Distance For some usecase we would like to filter our results by distance. Some online real estate agencies for example provide the option to only display results that are in a certain distance from a point. We can do the same by passing in a geo_distance filter. curl -XPOST "http://localhost:9200/conferences/_search" -d' { "filter": { "geo_distance": { "conference.coordinates": { "lon": 8.403697, "lat": 49.006616 }, "distance": "200km", "distance_type": "arc" } } }' We are again passing the location of Karlsruhe. We request that only documents in a distance of 200km should be returned and that the arc distance_type should be used for calculating the distance. This will take into account that we are living on a globe. The resulting list will only contain one document, Stuttgart, as Nuremberg is just over 200km away. If we use the distance 210km both of the documents will be returned. Geo Aggregations Elasticsearch provides several useful geo aggregations that allow you to retrieve more information on the locations of your documents, e.g. for faceting. 
On the other hand, since we have the geohash as well as its prefixes indexed, we can retrieve all of the cells our results fall into using a simple terms aggregation. This way you can let the user drill down on the results by filtering on a cell.

curl -XPOST "http://localhost:9200/conferences/_search" -d'
{
  "aggregations" : {
    "conference-hashes" : {
      "terms" : {
        "field" : "conference.coordinates.geohash"
      }
    }
  }
}'

Depending on the precision we have chosen while indexing, this will return a long list of prefixes for hashes, but the most important part is at the beginning.

[...]
"aggregations": {
  "conference-hashes": {
    "buckets": [
      {
        "key": "u",
        "doc_count": 2
      },
      {
        "key": "u0",
        "doc_count": 2
      },
      {
        "key": "u0w",
        "doc_count": 1
      },
      [...]
  }
}

Stuttgart and Nuremberg both share the parent cells u and u0. Alternatively to the terms aggregation you can also use specialized geo aggregations, e.g. the geo distance aggregation sketched above for forming buckets of distances.

Conclusion

Besides the features we have seen here, Elasticsearch offers a wide range of geo features. You can index shapes and query by arbitrary polygons, either by passing them in or by passing a reference to an indexed polygon. When geohash prefixes are turned on you can also filter by geohash cell. With the new HTML5 location features, location-aware search and content delivery will become more important. Elasticsearch is a good fit for building this kind of application. Two users in the geo space are Foursquare, a very early user of Elasticsearch, and Gild, a recruitment agency that does some magic with locations.

Reference: Use Cases for Elasticsearch: Geospatial Search from our JCG partner Florian Hopf at the Dev Time blog....

2 Examples to Convert Byte[] array to String in Java

Converting a byte array to a String seems easy, but what is difficult is doing it correctly. Many programmers make the mistake of ignoring character encoding whenever bytes are converted into a String or char, or vice versa. As programmers, we all know that computers only understand binary data, i.e. 0 and 1. All the things we see and use, e.g. images, text files, movies, or any other multi-media, are stored in the form of bytes, but what is more important is the process of encoding or decoding bytes to characters. Data conversion is an important topic in any programming interview, and because of the trickiness of character encoding, this question is one of the most popular String interview questions in Java interviews.

While reading a String from an input source, e.g. XML files, an HTTP request, a network port, or a database, you must pay attention to which character encoding (e.g. UTF-8, UTF-16, or ISO 8859-1) it is encoded in. If you do not use the same character encoding while converting bytes to String, you will end up with a corrupt String which may contain totally incorrect values. You might have seen '?' or square brackets after converting byte[] to String; those appear because your current character encoding does not support those values, so it just shows garbage.

I tried to understand why programmers make character encoding mistakes more often than not, and my little research and own experience suggest it may be because of two reasons: first, not dealing enough with internationalization and character encodings, and second, because ASCII characters are supported by almost all popular encoding schemes and have the same values. Since we mostly deal with encodings like UTF-8, Cp1252 and Windows-1252, which display ASCII characters (mostly alphabets and numbers) without fail even if you use a different encoding scheme, the real issue only comes when your text contains special characters, e.g. 'é', which is often used in French names. If your platform's character encoding doesn't recognize that character then you will see either a different character or something garbage, and sadly until you get your hands burned, you are unlikely to be careful with character encodings. In Java, things are a little bit more tricky because many IO classes, e.g. InputStreamReader, by default use the platform's character encoding. What this means is that if you run your program on a different machine, you will likely get different output because of the different character encoding used on that machine. In this article, we will learn how to convert byte[] to String in Java, both by using the JDK API and with the help of Guava and Apache Commons.

How to convert byte[] to String in Java

There are multiple ways to change a byte array to String in Java: you can either use methods from the JDK, or you can use open source complementary APIs like Apache Commons and Google Guava. These APIs provide at least two sets of methods to create a String from a byte array: one which uses the default platform encoding and one which takes a character encoding. You should always use the latter; don't rely on the platform encoding. I know, it could be the same, or you might not have faced any problem so far, but it's better to be safe than sorry. As I pointed out in my last post about printing a byte array as a Hex String, it's also one of the best practices to specify the character encoding whenever you convert bytes to characters, in any programming language. It might also be possible that your byte array contains non-printable ASCII characters.
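As a quick illustration of the problem (this snippet is mine, not from the original article), the two UTF-8 bytes for 'é' decode very differently depending on the charset you pick:

import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {
    public static void main(String[] args) {
        // 'é' encoded as UTF-8 is the two bytes 0xC3 0xA9
        // (assumes this source file itself is compiled as UTF-8)
        byte[] bytes = "é".getBytes(StandardCharsets.UTF_8);

        // Decoding with the same charset restores the original character
        System.out.println(new String(bytes, StandardCharsets.UTF_8));       // é

        // Decoding with a different charset silently produces garbage
        System.out.println(new String(bytes, StandardCharsets.ISO_8859_1));  // Ã©
    }
}

What you actually see printed also depends on your console's encoding – which is rather the point of this article.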
Let's first see the JDK's way of converting byte[] to String:

You can use the constructor of String which takes a byte array and a character encoding:

String str = new String(bytes, "UTF-8");

This is the right way to convert bytes to String, provided you know for sure that the bytes are encoded in the character encoding you are using.

If you are reading the byte array from a text file, e.g. an XML document, HTML file or binary file, you can use the Apache Commons IO library to convert the FileInputStream to a String directly. This method also buffers the input internally, so there is no need to use another BufferedInputStream.

String fromStream = IOUtils.toString(fileInputStream, "UTF-8");

In order to correctly convert those byte arrays into a String, you must first discover the correct character encoding by reading metadata, e.g. Content-Type, <?xml encoding="…"?> etc., depending on the format/protocol of the data you are reading. This is one of the reasons I recommend using XML parsers, e.g. SAX or DOM parsers, to read XML files – they take care of character encoding by themselves. Some programmers also recommend using Charset over String for specifying the character encoding, e.g. instead of "UTF-8" use StandardCharsets.UTF_8, mainly to avoid UnsupportedEncodingException in the worst case. There are six standard Charset implementations guaranteed to be supported by all Java platform implementations. You can use them instead of specifying the encoding scheme as a String. In short, always prefer StandardCharsets.ISO_8859_1 over "ISO_8859_1", as shown below:

String str = IOUtils.toString(fis, StandardCharsets.UTF_8);

The other standard charsets supported by the Java platform are:

StandardCharsets.ISO_8859_1
StandardCharsets.US_ASCII
StandardCharsets.UTF_16
StandardCharsets.UTF_16BE
StandardCharsets.UTF_16LE

If you are reading bytes from an input stream, you can also check my earlier post about 5 ways to convert InputStream to String in Java for details.

Original XML

Here is our sample XML snippet to demonstrate the issues with using the default character encoding. This file contains the letter 'é', which is not correctly displayed in Eclipse because its default character encoding is Cp1252.

<?xml version="1.0" encoding="UTF-8"?>
<banks>
  <bank>
    <name>Industrial &amp; Commercial Bank of China</name>
    <headquarters>Beijing, China</headquarters>
  </bank>
  <bank>
    <name>Crédit Agricole SA</name>
    <headquarters>Montrouge, France</headquarters>
  </bank>
  <bank>
    <name>Société Générale</name>
    <headquarters>Paris, Île-de-France, France</headquarters>
  </bank>
</banks>

And this is what happens when you convert a byte array to String without specifying a character encoding, e.g.:

String str = new String(filedata);

This will use the platform's default character encoding, which is Cp1252 in this case, because we are running this program in the Eclipse IDE. You can see that the letter 'é' is not displayed correctly.

<?xml version="1.0" encoding="UTF-8"?>
<banks>
  <bank>
    <name>Industrial &amp; Commercial Bank of China</name>
    <headquarters>Beijing, China</headquarters>
  </bank>
  <bank>
    <name>CrÃ©dit Agricole SA</name>
    <headquarters>Montrouge, France</headquarters>
  </bank>
  <bank>
    <name>SociÃ©tÃ© GÃ©nÃ©rale</name>
    <headquarters>Paris, ÃŽle-de-France, France</headquarters>
  </bank>
</banks>

To fix this, specify the character encoding while creating the String from the byte array, e.g.:

String str = new String(filedata, "UTF-8");

By the way, let me make it clear that even though I have read the XML file using an InputStream here, it's not a good practice – in fact it's a bad practice.
You should always use a proper XML parser for reading XML documents. If you don't know how, please check this tutorial. Since this example is mostly to show you why character encoding matters, I have chosen an example which was easily available and looks more practical.

Java Program to Convert Byte Array to String in Java

Here is our sample program to show why relying on the default character encoding is a bad idea and why you must use a character encoding while converting a byte array to String in Java. In this program, we are using the Apache Commons IOUtils class to read the file directly into a byte array. It takes care of opening/closing the input stream, so you don't need to worry about leaking file descriptors. Now how you create the String from that array is the key: if you provide the right character encoding, you will get the correct output, otherwise a nearly correct but still wrong output.

import java.io.FileInputStream;
import java.io.IOException;
import org.apache.commons.io.IOUtils;

/**
 * Java Program to convert byte array to String. In this example, we have first
 * read an XML file with character encoding "UTF-8" into a byte array and then created
 * a String from that. When you don't specify a character encoding, Java uses the
 * platform's default encoding, which may not be the same if the file is an XML document
 * coming from another system, an email, or a plain text file fetched from an
 * HTTP server etc. You must first discover the correct character encoding
 * and then use it while converting the byte array to String.
 *
 * @author Javin Paul
 */
public class ByteArrayToString {

    public static void main(String args[]) throws IOException {

        System.out.println("Platform Encoding : " + System.getProperty("file.encoding"));
        FileInputStream fis = new FileInputStream("info.xml");

        // Using Apache Commons IOUtils to read file into byte array
        byte[] filedata = IOUtils.toByteArray(fis);
        String str = new String(filedata, "UTF-8");
        System.out.println(str);
    }
}

Output:
Platform Encoding : Cp1252
<?xml version="1.0" encoding="UTF-8"?>
<banks>
  <bank>
    <name>Industrial &amp; Commercial Bank of China</name>
    <headquarters>Beijing, China</headquarters>
  </bank>
  <bank>
    <name>Crédit Agricole SA</name>
    <headquarters>Montrouge, France</headquarters>
  </bank>
  <bank>
    <name>Société Générale</name>
    <headquarters>Paris, Île-de-France, France</headquarters>
  </bank>
</banks>

Things to remember and Best Practices

Always remember, using a character encoding while converting a byte array to String is not just a best practice but a mandatory thing. You should always use it, irrespective of the programming language. By the way, you can take note of the following things, which will help you to avoid a couple of nasty issues:

Use the character encoding from the source, e.g. Content-Type in HTML files, or <?xml encoding="…"?>.
Use XML parsers to parse XML files instead of finding the character encoding and reading the file via InputStream; some things are best left for demo code only.
Prefer Charset constants, e.g. StandardCharsets.UTF_16, instead of the String "UTF-16".
Never rely on the platform's default encoding scheme.

These rules should also be applied when you convert character data to bytes, e.g. converting a String to a byte array using the String.getBytes() method. The no-argument version uses the platform's default character encoding; instead you should use the overloaded version which takes a character encoding. That's all on how to convert a byte array to String in Java.
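To make that last point concrete, here is a minimal sketch (mine, not the article's) of the charset-aware reverse conversion:

// Encode explicitly instead of relying on the platform default...
byte[] utf8Bytes = "Crédit Agricole".getBytes(StandardCharsets.UTF_8);
// ...so that decoding with the same charset round-trips safely
String roundTripped = new String(utf8Bytes, StandardCharsets.UTF_8);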
As you can see, the Java API, particularly the java.lang.String class, provides methods and constructors that take a byte[] and return a String (or vice versa), but by default they rely on the platform's character encoding, which may not be correct if the byte array was created from XML files, HTTP request data or network protocols. You should always get the right encoding from the source itself. If you'd like to read more about what every programmer should know about String, you can check out this article.

Reference: 2 Examples to Convert Byte[] array to String in Java from our JCG partner Javin Paul at the Javarevisited blog....

Instant Big Data Stream Processing = Instant Storm

Every 6 months at Canonical, the company behind Ubuntu, I work on something technical to test our tools first hand and to show others new ideas. This time around I created an Instant Big Data solution, more concretely "Instant Storm".

Storm is now part of the Apache Foundation, but previously Storm was built by Nathan Marz during his time at Twitter. Storm is a stream processing engine for real-time and distributed computation. You can use Storm to aggregate real-time flows of events, to do machine learning, for analytics, for distributed ETL, etc. Storm is built out of several services and requires Zookeeper. It is a complex solution and non-trivial to deploy, integrate and scale.

The first technical project I did at Canonical was to create a Storm Juju charm. Although I was able to automate the deployment of Storm, there were still problems because users still had to read about how to actually use Storm. Instant Storm is the first effort to resolve this problem. I created a StormDeployer charm that can read a yaml file in which a developer can specify multiple topologies. For each you specify the name of the topology, the jar file, the location in Github, how to package the jar file, etc. Afterwards, by uploading the yaml file to Github or any public web server and giving it the extension .storm, anybody in the world is able to reuse the topologies instantly in two steps:

1. Deploy the Storm bundle that comes with Storm + Zookeeper + StormDeployer via a simple drag and drop in Juju.

2. Get a URL to a storm file and put it into the deploy field of the service settings of the StormDeployer.

Alternatively you can use the Juju command line:

juju set stormdeployer "deploy=http://somedomain/somefile.storm"

There are several examples already available on Github, but here is one that for sure works:

https://raw.githubusercontent.com/mectors/stormdeployer-examples/master/storm-hackaton/storm-hackaton.storm

The StormDeployer will download the project from Github, package the jar with Maven and upload the jar to Storm. You can check progress in the logs (/opt/storm/latest/log/deploy.log).

This is the easiest way to deploy Storm on any public cloud, private cloud or, if Ubuntu's Metal-as-a-Service / MaaS is used, on any bare metal server (X86, ARM64, Power 8). See here for Juju installation instructions.

This is a first version with some limitations. One of the really nice things to add would be to use Juju to make integrations between a topology and other charms dynamic. You could for instance create a spout or bolt that connects to the Kafka or Cassandra charms. Juju can automatically tell the topology the connection information and make updates to the running topologies should anything change. This would make it a lot more robust to run long running Storm topologies.

I am happy to donate my work to the Apache Foundation and guide anybody who wants to take ownership…

Reference: Instant Big Data Stream Processing = Instant Storm from our JCG partner Maarten Ectors at the Telruptive blog....

Awesome SQL Trick: Constraints on Views

CHECK constraints are already pretty great when you want to sanitize your data. But there are some limitations to CHECK constraints, including the fact that they are applied to the table itself, when sometimes you want to specify constraints that only apply in certain situations. This can be done with the SQL standard WITH CHECK OPTION clause, which is implemented by at least Oracle and SQL Server. Here's how to do that:

CREATE TABLE books (
  id NUMBER(10) NOT NULL,
  title VARCHAR2(100 CHAR) NOT NULL,
  price NUMBER(10, 2) NOT NULL,
  CONSTRAINT pk_book PRIMARY KEY (id)
);
/

CREATE VIEW expensive_books
AS
SELECT id, title, price
FROM books
WHERE price > 100
WITH CHECK OPTION;
/

INSERT INTO books VALUES (1, '1984', 35.90);

INSERT INTO books VALUES (
  2,
  'The Answer to Life, the Universe, and Everything',
  999.90
);

As you can see, expensive_books are all those books whose price is more than 100.00. This view will only report the second book:

SELECT * FROM expensive_books;

The above query yields:

ID TITLE                                     PRICE
-- ----------------------------------------- -------
 2 The Answer to Life, the Universe, and ...   999.9

But now that we have that CHECK OPTION, we can also prevent users from inserting "expensive books" that aren't really expensive. For instance, let's run this query:

INSERT INTO expensive_books
VALUES (3, '10 Reasons why jOOQ is Awesome', 9.99);

This query won't work now. We're getting:

ORA-01402: view WITH CHECK OPTION where-clause violation

We also cannot update any of the "expensive books" to be non-expensive:

UPDATE expensive_books
SET price = 9.99;

This query results in the same ORA-01402 error message.

Inline WITH CHECK OPTION

In case you need to locally prevent bogus data from being inserted into a table, you can also use inline WITH CHECK OPTION clauses like so:

INSERT INTO (
  SELECT * FROM expensive_books
  WHERE price > 1000
  WITH CHECK OPTION
) really_expensive_books
VALUES (3, 'Modern Enterprise Software', 999.99);

And the above query again results in an ORA-01402 error.

Using SQL transformation to generate ad-hoc constraints

While CHECK OPTION is very useful for stored views, which can have proper grants for those users that may not access the underlying table directly, the inline CHECK OPTION is mainly useful when you transform dynamic SQL in an intermediate SQL transformation layer in your application. This can be done with jOOQ's SQL transformation capabilities, for instance, where you can watch out for a certain table in your SQL statements, and then centrally prevent bogus DML from being executed. This is a great way to implement multi-tenancy, if your database doesn't natively support row-level security. Stay tuned for a future blog post explaining how to transform your SQL with jOOQ to implement row-level security for any database.

Reference: Awesome SQL Trick: Constraints on Views from our JCG partner Lukas Eder at the JAVA, SQL, AND JOOQ blog....

NoSQL Job Trends – August 2014

Yes, it is already September, but I am only a day late for the NoSQL installment of the August job trends. In this update, I am splitting the graphs into two as I include more products. So, for the NoSQL job trends, we will be looking at Cassandra, Redis, Couchbase, SimpleDB, CouchDB, MongoDB, HBase, Riak, Neo4j and MarkLogic.

I am including Neo4j for two reasons. First, it is a graph DB, which means it might have a different trend than the other solutions. Also, I have been seeing more mention of it in blog posts. The second new inclusion is MarkLogic, again for multiple reasons. First, I use it in my day job, so I am being somewhat selfish. Second, it is an XML-native document database and not open-source. Third, MarkLogic is making a push for semantics (meaning storage of RDF) and native javascript. So this installment is really a baseline for the MarkLogic trend. If I am missing popular options, or if you want to see what the trends are for other NoSQL solutions, please let me know.

First, we look at the long-term trends from Indeed for our first 5, MongoDB, Cassandra, Redis, SimpleDB and HBase:

As you can see in this first group, demand has flattened for the past year to 18 months. MongoDB is still the clear leader. Cassandra stayed stable for a while, allowing it to build a lead over HBase, which declined heavily during 2013 before flattening its trend. Redis followed the same general trend while maintaining its fourth position. SimpleDB has been in slow decline since 2012. It is possible that it will be removed in the next update.

Now, let's look at Indeed for our second set of 5, CouchDB, Couchbase, Neo4j, MarkLogic and Riak:

First, a note on scale. In the previous graph, Redis is at 0.03%, while the top of this graph is at 0.01%. I did not include SimpleDB in this group mainly because of its age and declining trend. Given the scale of the graph, demand seems to jump all over. CouchDB seems to be leading but has dropped steeply from its peak in mid-2011. At just about the same current demand sits Riak, which also had a steep drop in the past several months. Riak does seem to be growing overall since early 2011. The major positive trend here is Couchbase, growing steadily since 2012. MarkLogic was in a bit of decline from 2010 until 2013, but has been growing since early 2013. Neo4j has been mostly flat since early 2012, probably due to its specialized nature.

Now, it's time for the short term trends from SimplyHired for our first set of 5:

MongoDB seems to be increasing its lead in the short term, even with a dip in demand the past few months. Cassandra and HBase followed similar trends for most of the past year, with Cassandra leading slightly. Redis continues a fairly stable trend for the past 18 months, lagging the leaders by a bit. SimpleDB shows little demand in a fairly flat trend.

For the second 5, the trends from SimplyHired are as follows:

The note about scale from the first set of graphs applies here as well. For most of the trends, there is a big bump in early 2014, except for MarkLogic. However, that bump was not sustained. CouchDB leads this pack, with a small gap over Riak. Riak shows the most unstable trend with several peaks and valleys. Couchbase is showing a solid growth trend, ending up at the same spot as MarkLogic. MarkLogic is also showing a good overall short-term trend, which could point to a positive direction in the future as well. Much like the long-term trends for Neo4j, SimplyHired is showing a mostly flat trend since the autumn of 2013.
Lastly, we look at the relative growth from Indeed for the first 5:

This chart really shows the struggle of SimpleDB. While many solutions are growing rapidly, SimpleDB barely registers on the graph and looks to have a slight declining trend. Cassandra is leading the growth, just above 12,000%. HBase follows with nearly 10,500% growth. Redis trails a bit at just under 9000%. MongoDB, which leads in overall demand, is not showing the same type of growth, sitting just under 5000%.

Finally, we review the growth trends from Indeed for the second 5:

Unlike the general long-term trends, the scale on the growth chart is fairly similar. Couchbase is leading with a steadily increasing trend. MarkLogic comes next with a stabilizing trend, growing nicely lately but really unstable between 2009 and 2013. Riak follows with a fairly stable, generally rising trend. CouchDB is definitely declining, slowly falling to under 500% from its peak of 1000% in early 2012. Neo4j trails everyone, showing a flat growth trend for the past two years.

Currently, there are 4 major players and then a bunch of solutions fighting for attention. MongoDB, Cassandra, HBase and Redis all have solid demand and growth. This shows a very promising future for this segment of the industry. On the bad side of the trends we have SimpleDB, which looks like it is dying out. CouchDB seems to be floundering as well, even though its cousin Couchbase is growing nicely. The more interesting question is what happens with the other solutions, MarkLogic, Riak and Neo4j? The trends for those could be very telling for any solution trying to gain real acceptance.

Reference: NoSQL Job Trends – August 2014 from our JCG partner Rob Diana at the Regular Geek blog....

JAXB – A Newcomer’s Perspective, Part 2

In Part 1 of this series, I discussed the basics of loading data from an XML file into a database using JAXB and JPA. (If JSON is called for instead of XML, then the same idea should translate to a tool like Jackson.) The approach is to use shared domain objects – i.e. a single set of POJOs with annotations that describe both the XML mapping and the relational mapping. Letting one .java file describe all of the data's representations makes it easy to write data loaders, unloaders, and translators. In theory it's simple, but then I alluded to the difference between theory and practice. In theory, there is no difference. Now in Part 2, we'll look at a few gotchas you can expect to encounter when asking these two tools to work together over a realistic data model, and techniques you might employ to overcome those hurdles.

What's In a Name?

This first point might be obvious, but I'll mention it anyway: as with any tool that relies on the bean property conventions, JAXB is sensitive to your method names. You could avoid the issue by configuring direct field access, but as we'll see shortly, there may be reasons you'd want to stick with property access. The property name determines the default tag name of the corresponding element (though this can be overridden with annotations – such as @XmlElement in the simplest case). More importantly, your getter and setter names must match. The best advice, of course, is to let your IDE generate the getter and setter so that typos won't be an issue.

Dealing with @EmbeddedId

Suppose you want to load some data representing orders. Each order might have multiple line items, with the line items for each order numbered sequentially from 1, so that the unique ID across all line items would be the combination of the order ID and the line item number. Assuming you use the @EmbeddedId approach to representing the key, your line items might be represented like this:

@Embeddable
public class LineItemKey {
    private Integer orderId;
    private Integer itemNumber;

    /* ... getters and setters ... */
}

@XmlRootElement
@Entity
@Table(name="ORDER_ITEM")
public class OrderLineItem {
    @EmbeddedId
    @AttributeOverrides(/*...*/)
    private LineItemKey lineItemKey;

    @Column(name="PART_NUM")
    private String partNumber;

    private Integer quantity;

    // ... getters and setters ...
}

The marshalling and unmarshalling code will look a lot like that from the Employee example in Part 1. Note that we don't have to explicitly tell JAXBContext about the LineItemKey class because it is referenced by OrderLineItem.

LineItemKey liKey = new LineItemKey();
liKey.setOrderId(37042);
liKey.setItemNumber(1);

OrderLineItem lineItem = new OrderLineItem();
lineItem.setLineItemKey(liKey);
lineItem.setPartNumber("100-02");
lineItem.setQuantity(10);

JAXBContext jaxb = JAXBContext.newInstance(OrderLineItem.class);
Marshaller marshaller = jaxb.createMarshaller();
marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
marshaller.marshal(lineItem, System.out);

However, we may not be thrilled with the resulting XML structure:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<orderLineItem>
    <lineItemKey>
        <itemNumber>1</itemNumber>
        <orderId>37042</orderId>
    </lineItemKey>
    <partNumber>100-02</partNumber>
    <quantity>10</quantity>
</orderLineItem>

What if we don't want the <lineItemKey> element? If we have JAXB using property access, then one option is to change our property definitions (i.e.
our getters and setters), making OrderLineItem look like a flat object to JAXB (and potentially to the rest of our app, which could be a good thing):

@XmlRootElement
@Entity
@Table(name="ORDER_ITEM")
public class OrderLineItem {
    @EmbeddedId
    @AttributeOverrides(/*...*/)
    private LineItemKey lineItemKey;

    // ... additional fields ...

    @XmlTransient
    public LineItemKey getLineItemKey() {
        return lineItemKey;
    }

    public void setLineItemKey(LineItemKey lineItemKey) {
        this.lineItemKey = lineItemKey;
    }

    // "pass-thru" properties to lineItemKey
    public Integer getOrderId() {
        return lineItemKey.getOrderId();
    }

    public void setOrderId(Integer orderId) {
        if (lineItemKey == null) {
            lineItemKey = new LineItemKey();
        }
        lineItemKey.setOrderId(orderId);
    }

    public Integer getItemNumber() {
        return lineItemKey.getItemNumber();
    }

    public void setItemNumber(Integer itemNumber) {
        if (lineItemKey == null) {
            lineItemKey = new LineItemKey();
        }
        lineItemKey.setItemNumber(itemNumber);
    }

    // ... additional getters and setters ...
}

Note the addition of @XmlTransient to the lineItemKey getter; this tells JAXB not to map this particular property. (If JPA is using field access, we might get by with removing the lineItemKey getter and setter entirely. On the other hand, if JPA is using property access, then we'd need to mark our "pass-thru" getters as @Transient to keep the JPA provider from inferring an incorrect mapping to the ORDER_ITEM table.)

With lineItemKey marked @XmlTransient, though, JAXB won't know that it needs to create the embedded LineItemKey instance during unmarshalling. Here we've addressed that by making the "pass-thru" setters ensure that the instance exists. JPA should tolerate this, at least if it's using field access. If you want that approach to be thread safe, you'd have to synchronize the setters. As an alternative, you could create the LineItemKey in a default constructor (if you're confident that your JPA provider won't mind). Another option that's sure to only affect JAXB (without dedicated getters and setters) might be to use an ObjectFactory that injects the LineItemKey into the OrderLineItem before returning it. However, to the best of my knowledge, an ObjectFactory has to cover all of the classes in a package, so if you have many simple domain objects and a few complex ones in the same package (and have no other reason to create an ObjectFactory) then you may want to avoid this approach. You also might want to protect the pass-thru getters from null pointer exceptions by checking whether lineItemKey exists before trying to fetch the return value.

In any event, our XML should now look like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<orderLineItem>
    <itemNumber>1</itemNumber>
    <orderId>37042</orderId>
    <partNumber>100-02</partNumber>
    <quantity>10</quantity>
</orderLineItem>

Related Objects: One to Many

Of course, your line items belong to orders, so you might have an ORDER table (and a corresponding Order class).

@XmlRootElement
@Entity
@Table(name="ORDER")
public class Order {
    @Id
    @Column(name="ORDER_ID")
    private Integer orderId;

    @OneToMany(mappedBy="order")
    private List<OrderLineItem> lineItems;

    // ... getters and setters ...
}

We've set up a one-to-many relationship with OrderLineItem. Note we're expecting OrderLineItem to own this relationship for JPA purposes. For now we'll take the @XmlRootElement annotation off of OrderLineItem. (We don't have to; the annotation makes the class eligible to be a root element but does not preclude also using it as a nested element.
However, if we want to continue writing XML that represents just the OrderLineItem, then we'll have some additional decisions to make, so we'll defer that for the moment.)

To keep the marshaller happy, we make the Order property of OrderLineItem @XmlTransient. This avoids a circular reference that could otherwise be interpreted as an infinitely deep XML tree. (You probably wouldn't intend to embed the full order detail under the <orderLineItem> element anyway.) With <orderLineItem> embedded under an <order> element, there's no longer a reason to put an <orderId> element under <orderLineItem>. We remove the orderId property from OrderLineItem, knowing that code elsewhere in the app can still use lineItem.getOrder().getOrderId(). The new version of OrderLineItem looks like this:

@Entity
@Table(name="ORDER_ITEM")
public class OrderLineItem {
    @EmbeddedId
    @AttributeOverrides(/*...*/)
    private LineItemKey lineItemKey;

    @MapsId("orderId")
    @ManyToOne
    private Order order;

    @Column(name="PART_NUM")
    private String partNumber;

    private Integer quantity;

    @XmlTransient
    public Order getOrder() {
        return order;
    }

    public void setOrder(Order order) {
        this.order = order;
    }

    public Integer getItemNumber() {
        return lineItemKey.getItemNumber();
    }

    public void setItemNumber(Integer itemNumber) {
        if (lineItemKey == null) {
            lineItemKey = new LineItemKey();
        }
        lineItemKey.setItemNumber(itemNumber);
    }

    // ... more getters and setters ...
}

Our JAXBContext needs to be told about the Order class. In this situation it doesn't need to be told explicitly about OrderLineItem. So we can test marshalling like this:

JAXBContext jaxb = JAXBContext.newInstance(Order.class);

List<OrderLineItem> lineItems = new ArrayList<OrderLineItem>();

Order order = new Order();
order.setOrderId(37042);
order.setLineItems(lineItems);

OrderLineItem lineItem = new OrderLineItem();
lineItem.setOrder(order);
lineItem.setItemNumber(1);
lineItem.setPartNumber("100-02");
lineItem.setQuantity(10);
lineItems.add(lineItem);

lineItem = new OrderLineItem();
lineItem.setOrder(order);
lineItem.setItemNumber(2);
lineItem.setPartNumber("100-17");
lineItem.setQuantity(5);
lineItems.add(lineItem);

Marshaller marshaller = jaxb.createMarshaller();
marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
marshaller.marshal(order, System.out);

Note that we set the order property for each line item. JAXB won't care about this when marshalling (because the property is @XmlTransient and no other property depends on the internal state it affects), but we want to keep our object relationships consistent. If we were to pass order to JPA, then failing to set the order property would become a problem – and we'll come back to that point shortly. We should get output like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<order>
    <orderId>37042</orderId>
    <lineItems>
        <itemNumber>1</itemNumber>
        <partNumber>100-02</partNumber>
        <quantity>10</quantity>
    </lineItems>
    <lineItems>
        <itemNumber>2</itemNumber>
        <partNumber>100-17</partNumber>
        <quantity>5</quantity>
    </lineItems>
</order>

The default element name mapping puts a <lineItems> tag around each line item (because that's the property name), which is a little off. We can fix this by putting @XmlElement(name="lineItem") on the getLineItems() method of Order. (And if we then wanted the entire list of line item elements wrapped in a single <lineItems> element, we could do that with an @XmlElementWrapper(name="lineItems") annotation on the same method.)
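As a small illustration (my sketch, not code from the original article), the getter in Order might then look like this, with the wrapper annotation left commented out since the output shown above keeps the line items directly under <order>:

@XmlElement(name = "lineItem")
// @XmlElementWrapper(name = "lineItems")  // optional: nests all items under one <lineItems> element
public List<OrderLineItem> getLineItems() {
    return lineItems;
}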
At this point the marshalling test should look pretty good, but we'll run into trouble if we unmarshal an order and ask JPA to persist the resulting order line item objects. The problem is that the unmarshaller isn't setting the order property of OrderLineItem (which owns the Order-to-OrderLineItem relationship for JPA's purposes). We can solve this by having Order.setLineItems() iterate over the list of line items and call setOrder() on each one. This relies on JAXB building the line item list first and then passing it to setLineItems(); it worked in my tests, but I don't know if it will always work with all JAXB implementations. Another option is to call setOrder() on each OrderLineItem after unmarshalling but before passing the objects to JPA. This is perhaps more foolproof, but it feels like a kludge. (Part of the point of encapsulation is that your setters supposedly can ensure your objects keep an internally consistent state, after all; so why pass that responsibility off to code outside the objects' classes?) Favoring simplicity, I'll skip over some more elaborate ideas I toyed with while trying to sort this problem out. We will look at one more solution when we talk about @XmlID and @XmlIDREF shortly.

The Case for Property Access

I've leaned on modified setters to address the previous two problems. If you're used to the idea that a setter should have one line (this.myField = myArgument), this may seem questionable. (Then again, if you won't let your setters do any work for you, what is it you're buying by encapsulating your fields?)

@XmlTransient
public List<OrderLineItem> getLineItems() {
    return lineItems;
}

public void setLineItems(List<OrderLineItem> lineItems) {
    this.lineItems = lineItems;
}

// @Transient if JPA uses property access
@XmlElement(name="lineItem")
public List<OrderLineItem> getLineItemsForJAXB() {
    return getLineItems();
}

public void setLineItemsForJAXB(List<OrderLineItem> lineItems) {
    setLineItems(lineItems);
    // added logic, such as calls to setOrder()...
}

You can avoid using the "ForJAXB" properties anywhere else in your app if you want, so if you feel you're having to add setter logic "just for JAXB" this approach will keep that added logic from getting in your way. In my opinion, though, the types of setter logic I've described above merely hide the implementation details of the bean properties from outside code. I'd argue that JAXB is encouraging better abstraction in these instances. If you think of JAXB as a way to serialize the internal state of an object, then field access may seem preferable. (I've heard that argument for using field access with JPA, at any rate.) At the end of the day, though, you want the tool to do a job for you. Treating JAXB as an external mechanism for building (or recording) your objects may just be more pragmatic.

Related Objects: One to One, Many to Many

With one-to-many relationships working, it may seem like one-to-one relationships should be a snap. However, while a one-to-many relationship will often lend itself to the hierarchical nature of XML (with the "many" being children of the "one"), the objects in a one-to-one relationship are often just peers, so at best the choice to embed one element within the other in the XML representation would be arbitrary. Many-to-many relationships pose a bigger challenge to the hierarchical model. And if you have a more complex network of relationships (regardless of their cardinalities), there may not be a straightforward way to arrange the objects into a tree.
Before exploring a general solution, it might be good to pause at this point and ask yourself if you need a general solution. Our project needed to load two types of object that conformed to a parent-child relationship, so the techniques I've described previously were sufficient. It may be that you simply don't need to persist your entire object model in XML. But if you do find you need a way to model relationships that don't fit the parent-child mold, you can do it with @XmlID and @XmlIDREF.

As you learn the rules for using @XmlID, you might ask yourself if it wouldn't be easier to just store the raw foreign key elements under the referencing element (analogous to the way an RDBMS typically represents a foreign key). You could, and the marshaller would have no problem producing nice-looking XML. But then during or after unmarshalling you'd be responsible for reassembling the relationship graph on your own. The rules for @XmlID are annoying, but I don't find them so hard to accommodate that avoiding them would justify that kind of effort.

The ID values must be Strings, and they must be unique across all elements in your XML document (not just across all elements of a given type). This is because conceptually an ID reference is untyped; in fact, if you let JAXB build your domain objects from a schema, it would map your @XmlIDREF elements (or attributes) to properties of type Object. (When you annotate your own domain classes, though, you're allowed to use @XmlIDREF with typed fields and properties as long as the referenced type has a field or property annotated with @XmlID. I prefer to do this as it avoids unwanted casts in my code.)

The keys for your relationships may not follow those rules; but that's okay, because you can create a property (named xmlId, say) that will. Suppose each of our orders has a Customer and a "ship to" Address. Also, each Customer has a list of billing Addresses. Both tables in the database (CUSTOMER and ADDRESS) use Integer surrogate keys with sequences starting at 1. In our XML, the Customer and "ship to" Address could be represented as child elements under Order; but maybe we need to keep track of Customers who don't currently have any orders. Likewise, the billing Address list could be represented as a list of child elements under Customer, but this will inevitably lead to duplication of data as customers have orders shipped to their billing addresses. So instead we'll use @XmlID.

We can define Address as follows:

@Entity
@Table(name="ADDRESS")
public class Address {
    @Id
    @Column(name="ADDRESS_ID")
    private Integer addressId;

    // other fields...

    @XmlTransient
    public Integer getAddressId() {
        return addressId;
    }

    public void setAddressId(Integer addressId) {
        this.addressId = addressId;
    }

    // @Transient if JPA uses property access
    @XmlID
    @XmlElement(name="addressId")
    public String getXmlId() {
        return getClass().getName() + getAddressId();
    }

    public void setXmlId(String xmlId) {
        //TODO: validate xmlId is of the form <className><Integer>
        setAddressId(Integer.parseInt(
                xmlId.substring(getClass().getName().length())));
    }

    // ... more getters and setters ...
}

Here the xmlId property provides JAXB's view of the addressId. Prepending the class name provides uniqueness across types whose keys might otherwise clash. If we had a more complex natural key for the table, we'd have to convert each element of the key to a string, possibly with some sort of delimiter, and concatenate it all together. A variation on this idea is to use @XmlAttribute instead of @XmlElement.
I generally prefer to use elements for data values (since they're logically the content of the document), but the xmlId could arguably be seen as describing the <Address> XML element rather than the address itself, so it might make sense to record it as an attribute. For unmarshalling to work, we also have to parse the addressId value back out of the xmlId in the setter. We could avoid this if we persist both the xmlId property and the addressId property; in that case, the xmlId setter could just throw its value away; but I don't like that option because it saves relatively little effort and creates the possibility of encountering an XML document with inconsistent values for xmlId and addressId. (Sometimes you might have to admit the possibility of an inconsistent document – such as if you persist both sides of a relationship, which I'll talk about later.)

Next we'll create our Customer mapping:

@Entity
@Table(name="CUSTOMER")
public class Customer {
    @Id
    @Column(name="CUSTOMER_ID")
    private Integer customerId;

    @ManyToMany
    @JoinTable(name = "CUST_ADDR")
    private List<Address> billingAddresses;

    // other fields…

    @XmlTransient
    public Integer getCustomerId() {
        return customerId;
    }

    public void setCustomerId(Integer customerId) {
        this.customerId = customerId;
    }

    @XmlIDREF
    @XmlElement(name = "billingAddress")
    public List<Address> getBillingAddresses() {
        return billingAddresses;
    }

    public void setBillingAddresses(List<Address> billingAddresses) {
        this.billingAddresses = billingAddresses;
    }

    // @Transient if JPA uses property access
    @XmlID
    @XmlElement(name="customerId")
    public String getXmlId() {
        return getClass().getName() + getCustomerId();
    }

    public void setXmlId(String xmlId) {
        //TODO: validate xmlId is of the form <className><Integer>
        setCustomerId(Integer.parseInt(
            xmlId.substring(getClass().getName().length())));
    }

    // … more getters and setters …
}

The handling of Customer's xmlId is the same as that for Address. We've marked the billingAddresses property with the @XmlIDREF annotation, telling JAXB that each <billingAddress> element should contain an ID value referencing an Address rather than the actual Address element structure. In the same way, we would add customer and shipToAddress properties to Order, annotated with @XmlIDREF.

At this point, every reference to a Customer or an Address is marked as an @XmlIDREF. This means that while we can marshal our data into XML, the result won't actually contain any Customer or Address data. If an @XmlIDREF doesn't correspond to an @XmlID in a document when you unmarshal it, then the corresponding property on the unmarshalled object will be null. So if we really want this to work, we have to create a new @XmlRootElement that can contain all of our data.

@XmlRootElement
public class OrderData {
    private List<Order> orders;
    private List<Address> addresses;
    private List<Customer> customers;

    // getters and setters
}

This class doesn't correspond to any table in our database, so it doesn't have JPA annotations. Our getters can have @XmlElement and @XmlElementWrapper annotations as on previous List-type properties.
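To make that last sentence concrete, here is a rough sketch of what the annotated OrderData getters might look like. The original post doesn't show them, so the wrapper and element names below are assumptions on my part, chosen to match the sample output that follows:

// Hypothetical sketch of OrderData's JAXB-facing getters; names are assumptions.
@XmlElementWrapper(name="addresses")
@XmlElement(name="address")
public List<Address> getAddresses() {
    return addresses;
}

@XmlElementWrapper(name="customers")
@XmlElement(name="customer")
public List<Customer> getCustomers() {
    return customers;
}

@XmlElementWrapper(name="orders")
@XmlElement(name="order")
public List<Order> getOrders() {
    return orders;
}

Marshalling an assembled OrderData instance (assuming a populated orderData variable) would then be the usual JAXB boilerplate, something along the lines of:

JAXBContext context = JAXBContext.newInstance(OrderData.class);
context.createMarshaller().marshal(orderData, System.out);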
If we assemble and marshal an OrderData object, we might get something like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<orderData>
    <addresses>
        <address>
            <addressId>Address1010</addressId>
            <!-- … other elements … -->
        </address>
        <address>
            <addressId>Address1011</addressId>
            <!-- … -->
        </address>
    </addresses>
    <customers>
        <customer>
            <billingAddress>Address1010</billingAddress>
            <billingAddress>Address1011</billingAddress>
            <customerId>Customer100</customerId>
        </customer>
    </customers>
    <orders>
        <order>
            <customer>Customer100</customer>
            <lineItem>
                <itemNumber>1</itemNumber>
                <partNumber>100-02</partNumber>
                <quantity>10</quantity>
            </lineItem>
            <lineItem>
                <itemNumber>2</itemNumber>
                <partNumber>100-17</partNumber>
                <quantity>5</quantity>
            </lineItem>
            <orderId>37042</orderId>
            <shipToAddress>Address1011</shipToAddress>
        </order>
    </orders>
</orderData>

So far we've only mapped one side of each relationship. If our domain objects need to support navigation in both directions, then we have a choice to make: we can mark the property on one side of the relationship as @XmlTransient, which puts us in the same situation we were in with a hierarchically represented one-to-many relationship, in that unmarshalling will not automatically set the @XmlTransient property; or we can make both properties @XmlIDREF, recognizing that someone could write an inconsistent XML document.

Revisiting Related Objects: One to Many

Earlier when we looked at one-to-many relationships, we relied solely on containment – child elements embedded within a parent element. One of the limitations of containment is that it only allows us to map one side of a relationship. This caused us to jump through some hoops during unmarshalling since our domain objects would need the reverse relationship to work well with JPA. We've seen that @XmlID and @XmlIDREF provide a more general representation of relationships. Mixing the two techniques, we can represent both sides of a parent-child relationship (with the caveat, as with any case where we show both sides of a relationship in XML, that you could hand-write an XML document with inconsistent relationships).
We can modify our previous one-to-many example to look like this:

@XmlRootElement
@Entity
@Table(name="ORDER")
public class Order {
    @Id
    @Column(name="ORDER_ID")
    private Integer orderId;

    @OneToMany(mappedBy="order")
    private List<OrderLineItem> lineItems;

    @XmlTransient
    public Integer getOrderId() {
        return orderId;
    }

    public void setOrderId(Integer orderId) {
        this.orderId = orderId;
    }

    @XmlID
    @XmlElement(name="orderId")
    public String getXmlId() {
        return getClass().getName() + getOrderId();
    }

    public void setXmlId(String xmlId) {
        //TODO: validate xmlId is of the form <className><Integer>
        setOrderId(Integer.parseInt(
            xmlId.substring(getClass().getName().length())));
    }

    @XmlElement(name="lineItem")
    public List<OrderLineItem> getLineItems() {
        return lineItems;
    }

    public void setLineItems(List<OrderLineItem> lineItems) {
        this.lineItems = lineItems;
    }
}

@Entity
@Table(name="ORDER_ITEM")
public class OrderLineItem {
    @EmbeddedId
    @AttributeOverrides(/*…*/)
    private LineItemKey lineItemKey;

    @MapsId("orderId")
    @ManyToOne
    private Order order;

    @Column(name="PART_NUM")
    private String partNumber;

    private Integer quantity;

    @XmlIDREF
    public Order getOrder() {
        return order;
    }

    public void setOrder(Order order) {
        this.order = order;
    }

    public Integer getItemNumber() {
        return lineItemKey.getItemNumber();
    }

    public void setItemNumber(Integer itemNumber) {
        if (lineItemKey == null) {
            lineItemKey = new LineItemKey();
        }
        lineItemKey.setItemNumber(itemNumber);
    }

    // … more getters and setters …
}

When we marshal Order, we now write out the orderId as an XML ID. Instead of making the order property of OrderLineItem @XmlTransient, we avoid the infinite recursion by having it write the @XmlIDREF instead of the full Order structure; so both sides of the relationship are preserved in a way we can understand at unmarshalling time. The resulting XML would look like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<order>
    <orderId>Order37042</orderId>
    <lineItem>
        <itemNumber>1</itemNumber>
        <order>Order37042</order>
        <partNumber>100-02</partNumber>
        <quantity>10</quantity>
    </lineItem>
    <lineItem>
        <itemNumber>2</itemNumber>
        <order>Order37042</order>
        <partNumber>100-17</partNumber>
        <quantity>5</quantity>
    </lineItem>
</order>

And both marshalling and unmarshalling work as we would want. The repetition of the containing order ID value is the only complaint we might have with the output. We could reduce the visual impact by making it an @XmlAttribute rather than an @XmlElement; and this is another case where we could possibly make the argument that the value isn't "real content," since we're just putting it in to help JAXB with unmarshalling.

Closing Thoughts

As the title says, I walked through this exercise as a newcomer to JAXB. This is by no means a thorough discussion of what JAXB can do, and from the documentation I've read I'd even say that I've ignored some of its most sophisticated functionality. However, I hope this might serve as a useful primer and might illustrate the power that comes from the bean conventions and from tools and frameworks that unobtrusively interact with POJOs. I'd also reiterate the point that you can make a technique like this as complicated as you care to; so knowing how much complexity your requirements really warrant is key.

Reference: JAXB – A Newcomer's Perspective, Part 2 from our JCG partner Mark Adelsberger at the Keyhole Software blog....

JUnit in a Nutshell: Test Isolation

Working as a consultant, I still quite often meet programmers who have at most a vague understanding of JUnit and its proper usage. This gave me the idea to write a multi-part tutorial to explain the essentials from my point of view. Despite the existence of some good books and articles about testing with the tool, maybe the hands-on approach of this mini-series is appropriate to get one or two additional developers interested in unit testing – which would make the effort worthwhile.

Note that the focus of this chapter is on fundamental unit testing techniques rather than on JUnit features or API. More of the latter will be covered in the following posts. The nomenclature used to describe the techniques is based on the definitions presented in Meszaros' xUnit Test Patterns [MES].

Previously on JUnit in a Nutshell

The tutorial started with a Hello World chapter, introducing the very basics of a test: how it is written, executed and evaluated. It continued with the post Test Structure, explaining the four phases (setup, exercise, verify and teardown) commonly used to structure unit tests. The lessons were accompanied by a consistent example to make the abstract concepts easier to understand. It was demonstrated how a test case grows little by little – starting with the happy path and working up to corner case tests, including expected exceptions. Overall it was emphasized that a test is more than a simple verification machine and can also serve as a kind of low-level specification. Hence it should be developed with the highest possible coding standards one could think of.

Dependencies

It takes two to tango – Proverb

The example used throughout this tutorial is about writing a simple number range counter, which delivers a certain number of consecutive integers, starting from a given value. A test case specifying the unit's behavior might, in excerpts, look somewhat like this:

public class NumberRangeCounterTest {

    private static final int LOWER_BOUND = 1000;
    private static final int RANGE = 1000;
    private static final int ZERO_RANGE = 0;

    private NumberRangeCounter counter = new NumberRangeCounter( LOWER_BOUND, RANGE );

    @Test
    public void subsequentNumber() {
        int first = counter.next();
        int second = counter.next();
        assertEquals( first + 1, second );
    }

    @Test
    public void lowerBound() {
        int actual = counter.next();
        assertEquals( LOWER_BOUND, actual );
    }

    @Test( expected = IllegalStateException.class )
    public void exceedsRange() {
        new NumberRangeCounter( LOWER_BOUND, ZERO_RANGE ).next();
    }

    [...]
}

Note that I go with a quite compact test case here to save space, using implicit fixture setup and exception verification, for example. For an in-detail discussion of test structuring patterns see the previous chapter. Note also that I stick with the JUnit built-in functionality for verification. I will cover the pros and cons of particular matcher libraries (Hamcrest, AssertJ) in a separate post.

While the NumberRangeCounter's initial description was sufficient to get this tutorial started, the attentive reader may have noticed that the approach was admittedly a bit naive. Consider for example that a program's process might get terminated. To be able to reinitialize the counter properly on system restart, it should have preserved at least its latest state. However, persisting the counter's state involves access to resources (database, filesystem or the like) via software components (database driver, file system API, etc.) that are not part of the unit, a.k.a. the system under test (SUT).
This means the unit depends on such components, which Meszaros describes with the term depended-on component (DOC). Unfortunately this brings along testing-related trouble in many respects:

- Depending on components we cannot control might impede the decent verification of a test specification. Just think of a real-world web service that could be unavailable at times. This could be the cause of a test failure, although the SUT itself is working properly.
- DOCs might also slow down test execution. To enable unit tests to act as a safety net, the complete test suite of a system under development has to be executed very often. This is only feasible if each test runs incredibly fast. Again, think of the web service example.
- Last but not least, a DOC's behavior may change unexpectedly, due to the usage of a newer version of a third-party library for example. This shows how depending directly on components we cannot control makes a test fragile.

So what can we do to circumvent these problems?

Isolation – A Unit Tester's SEP Field

An SEP is something we can't see, or don't see, or our brain doesn't let us see, because we think that it's Somebody Else's Problem… – Ford Prefect

As we do not want our unit tests to be dependent on the behavior of a DOC, nor want them to be slow or fragile, we strive to shield our unit as much as possible from all other parts of the software. Flippantly speaking, we make these particular problems the concern of other test types – hence the joking SEP Field quote. In general this principle is known as Isolation of the SUT and expresses the aspiration to test concerns separately and keep tests independent of each other. Practically this implies that a unit should be designed in a way that each DOC can be replaced by a so-called Test Double, which is a lightweight stand-in component for the DOC [MES1].

Related to our example, we might decide not to access a database, file system or the like directly from within the unit itself. Instead we may choose to separate this concern into a shielding interface type, without being interested in what a concrete implementation would look like. While this choice is certainly also reasonable from a low-level design point of view, it does not explain how the test double is created, installed and used throughout a test. But before elaborating on how to use doubles, there is one more topic that needs to be discussed.

Indirect Inputs and Outputs

So far our testing efforts have confronted us with direct inputs and outputs of the SUT only. That is, each instance of NumberRangeCounter is equipped with a lower bound and a range value (direct input). And after each call to next() the SUT returns a value or throws an exception (direct output) used to verify the SUT's expected behavior. But now the situation gets a bit more complicated. Considering that the DOC provides the latest counter value for SUT initialization, the result of next() depends on this value. If a DOC provides the SUT input in this manner, we talk about indirect inputs. Conversely, assuming that each call of next() should persist the counter's current state, we have no chance to verify this via direct outputs of the SUT. But we could check that the counter's state has been delegated to the DOC. This kind of delegation is denoted as indirect output. With this new knowledge we should be prepared to proceed with the NumberRangeCounter example.

Controlling Indirect Inputs with Stubs

From what we have learned, it would probably be a good idea to separate the counter's state preservation into a type of its own.
This type would isolate the SUT from the actual storage implementation, since from the SUT's point of view we are not interested in how the problem of preservation is actually solved. For that reason we introduce the interface CounterStorage. Although there is no real storage implementation so far, we can go ahead using a test double instead. It is trivial to create a test double type at this point, as the interface has no methods yet.

public class CounterStorageDouble implements CounterStorage {
}

To provide the storage for a NumberRangeCounter in a loosely coupled way, we can use dependency injection. Enhancing the implicit fixture setup with a storage test double and injecting it into the SUT may look like this:

private CounterStorage storage;

@Before
public void setUp() {
    storage = new CounterStorageDouble();
    counter = new NumberRangeCounter( storage, LOWER_BOUND, RANGE );
}

After fixing the compile errors and running all tests, the bar should remain green, as we have not changed any behavior yet. But now we want the first call of NumberRangeCounter#next() to respect the storage's state. If the storage provides a value n within the counter's defined range, the first call of next() should also return n, which is expressed by the following test:

private static final int IN_RANGE_NUMBER = LOWER_BOUND + RANGE / 2;

[...]

@Test
public void initialNumberFromStorage() {
    storage.setNumber( IN_RANGE_NUMBER );

    int actual = counter.next();

    assertEquals( IN_RANGE_NUMBER, actual );
}

Our test double must provide a deterministic indirect input, in our case the IN_RANGE_NUMBER. Because of this it is equipped with the value using setNumber(int). But as the storage is not used yet, the test fails. To change this it is about time to declare the CounterStorage's first method:

public interface CounterStorage {
    int getNumber();
}

Which allows us to implement the test double like this:

public class CounterStorageDouble implements CounterStorage {

    private int number;

    public void setNumber( int number ) {
        this.number = number;
    }

    @Override
    public int getNumber() {
        return number;
    }
}

As you can see, the double implements getNumber() by returning a configuration value fed by setNumber(int). A test double that provides indirect inputs in this way is called a stub. Now we would be able to implement the expected behaviour of NumberRangeCounter and pass the test. If you think that get/setNumber make poor names to describe a storage's behaviour, I agree. But it eases the post's evolution. Please feel invited to make well-conceived refactoring proposals…

Indirect Output Verification with Spies

To be able to restore a NumberRangeCounter instance after system restart, we expect that each state change of a counter gets persisted. This could be achieved by dispatching the current state to the storage each time a call to next() occurs. Because of this we add a method setNumber(int) to our DOC type:

public interface CounterStorage {
    int getNumber();
    void setNumber( int number );
}

What an odd coincidence that the new method has the same signature as the one used to configure our stub! After amending that method with @Override, it is easy to reuse our fixture setup for the following test as well:

@Test
public void storageOfStateChange() {
    counter.next();

    assertEquals( LOWER_BOUND + 1, storage.getNumber() );
}

Compared to the initial state, we expect the counter's new state to be increased by one after a call to next(). More importantly, we expect this new state to be passed on to the storage DOC as an indirect output.
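The production code of NumberRangeCounter itself is not shown in this chapter. Purely for orientation, here is a minimal sketch that would satisfy both the stub-based test and the spy-based test above; the field names and the fallback logic are my assumptions, not the author's:

public class NumberRangeCounter {

    private final CounterStorage storage;
    private final int lowerBound;
    private final int range;

    public NumberRangeCounter( CounterStorage storage, int lowerBound, int range ) {
        this.storage = storage;
        this.lowerBound = lowerBound;
        this.range = range;
    }

    public int next() {
        // indirect input: continue from the stored state, or start at the lower bound
        int stored = storage.getNumber();
        int current = stored > lowerBound ? stored : lowerBound;
        if( current >= lowerBound + range ) {
            throw new IllegalStateException( "Number range exceeded" );
        }
        // indirect output: delegate the new state to the storage DOC
        storage.setNumber( current + 1 );
        return current;
    }
}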
Unfortunately we do not witness the actual invocation, so we record the result of the invocation in our double's local variable. The verification phase deduces that the correct indirect output has been passed to the DOC if the recorded value matches the expected one. Recording state and/or behavior for later verification, described above in its simplest manner, is also denoted as spying. A test double using this technique is therefore called a spy.

What About Mocks?

There is another possibility to verify the indirect output of next(): using a mock. The most important characteristic of this type of double is that the indirect output verification is performed inside the delegation method. Furthermore, it allows us to ensure that the expected method has actually been called:

public class CounterStorageMock implements CounterStorage {

    private int expectedNumber;
    private boolean done;

    public CounterStorageMock( int expectedNumber ) {
        this.expectedNumber = expectedNumber;
    }

    @Override
    public void setNumber( int actualNumber ) {
        assertEquals( expectedNumber, actualNumber );
        done = true;
    }

    public void verify() {
        assertTrue( done );
    }

    @Override
    public int getNumber() {
        return 0;
    }
}

A CounterStorageMock instance is configured with the expected value by a constructor parameter. If setNumber(int) is called, it is immediately checked whether the given value matches the expected one. A flag stores the information that the method has been called. This allows the actual invocation to be checked using the verify() method. And this is how the storageOfStateChange test might look using a mock:

@Test
public void storageOfStateChange() {
    CounterStorageMock storage = new CounterStorageMock( LOWER_BOUND + 1 );
    NumberRangeCounter counter = new NumberRangeCounter( storage, LOWER_BOUND, RANGE );

    counter.next();

    storage.verify();
}

As you can see, there is no specification verification left in the test. And it seems strange that the usual test structure has been twisted a bit. This is because the verification condition gets specified prior to the exercise phase, in the middle of the fixture setup. Only the mock invocation check is left in the verify phase. But in return a mock provides a precise stacktrace in case behavior verification fails, which can ease problem analysis. If you take a look at the spy solution again, you will recognize that a failure trace would point only to the verify section of the test. There would be no information about the line of production code that actually caused the test to fail. This is completely different with a mock. The trace would let us identify exactly the position where setNumber(int) was called. Having this information, we could easily set a breakpoint and debug the problem.

Due to the scope of this post, I have confined the test double introduction to stubs, spies and mocks. For a short explanation of the other types you might have a look at Martin Fowler's post TestDouble, but the in-depth explanation of all types and their variations can be found in Meszaros' xUnit Test Patterns book [MES]. A good comparison of mock vs. spy based on test double frameworks (see next section) can be found in Tomek Kaczanowski's book Practical Unit Testing with JUnit and Mockito [KAC]. After reading this section you may have the impression that writing all those test doubles is tedious work. Not very surprisingly, libraries have been written to simplify double handling considerably.

Test Double Frameworks – The Promised Land?
If all you have is a hammer, everything looks like a nail – Proverb

There are a couple of frameworks developed to ease the task of using test doubles. Unfortunately these libraries do not always do a good job with respect to a precise Test Double Terminology. While e.g. JMock and EasyMock focus on mocks, Mockito is, despite its name, spy-centric. Maybe that is why most people talk about mocking, regardless of what kind of double they are actually using. Nevertheless there are indications that Mockito is the preferred test double tool for the time being. I guess this is because it provides a readable, fluent-interface API and compensates a bit for the drawback of spies mentioned above by providing detailed verification failure messages.

Without going into detail, I provide a version of the storageOfStateChange() test that uses Mockito for spy creation and test verification. Note that mock and verify are static methods of the type Mockito. It is common practice to use static imports with Mockito expressions to improve readability:

@Test
public void storageOfStateChange() {
    CounterStorage storage = mock( CounterStorage.class );
    NumberRangeCounter counter = new NumberRangeCounter( storage, LOWER_BOUND, RANGE );

    counter.next();

    verify( storage ).setNumber( LOWER_BOUND + 1 );
}

A lot has been written about whether or not to use such tools. Robert C. Martin, for example, prefers hand-written doubles, and Michael Boldischar even considers mocking frameworks harmful. The latter describes just plain misuse, in my opinion, and for once I disagree with Martin when he says 'Writing those mocks is trivial.' I had been using hand-written doubles myself for years before I discovered Mockito. I was instantly sold on the fluent syntax of stubbing and the intuitive way of verification, and I considered it an improvement to get rid of those crabbed double types. But this surely is in the eye of the beholder.

However, I have experienced that test double tools tempt developers to overdo things. For instance, it is very easy to replace third-party components, which otherwise might be expensive to create, with doubles. But this is considered a bad practice, and Steve Freeman and Nat Pryce explain in detail why you should only mock types that you own [FRE_PRY]. Third-party code calls for integration tests and an abstracting adapter layer. The latter is actually what we have indicated in our example by introducing the CounterStorage. And as we own the adapter, we can replace it safely with a double.

The second trap one easily walks into is writing tests where a test double returns another test double. If you come to this point, you should reconsider the design of the code you are working with. It probably breaks the Law of Demeter, which means that there might be something wrong with the way your objects are coupled together.

Last but not least, if you think about going with a test double framework, you should keep in mind that this is usually a long-term decision affecting a whole team. It is probably not the best idea to mix different frameworks, for the sake of a coherent coding style, and even if you use only one, each (new) member has to learn the tool-specific API. Before you start using test doubles extensively, you might consider reading Martin Fowler's Mocks Aren't Stubs, which compares classical vs. mockist testing, or Robert C. Martin's When to Mock, which introduces some heuristics to find the golden ratio between no doubles and too many doubles. Or as Tomek Kaczanowski puts it: 'Excited that you can mock everything, huh?
Slow down and make sure that you really need to verify interactions. Chances are you don't.' [KAC1]

Conclusion

This chapter of JUnit in a Nutshell discussed the implications of unit dependencies for testing. It illustrated the principle of isolation and showed how it can be put into practice by replacing DOCs with test doubles. In this context the concept of indirect inputs and outputs was presented and its relevance for testing was described. The ongoing example deepened the knowledge with hands-on listings and introduced several test double types and their purposes. Finally, a short explanation of test double frameworks and their pros and cons brought this chapter to an end. It was hopefully well-balanced enough to provide a comprehensible overview of the topic without being trivial. Suggestions for improvements are of course highly appreciated.

The tutorial's next post will cover JUnit features like Runners and Rules and show how to use them by means of the ongoing example.

References

[MES] xUnit Test Patterns, Gerard Meszaros, 2007
[MES1] xUnit Test Patterns, Chapter 5, Principle: Isolate the SUT, Gerard Meszaros, 2007
[KAC] Practical Unit Testing with JUnit and Mockito, Appendix C: Test Spy vs. Mock, Tomek Kaczanowski, 2013
[KAC1] Bad Tests, Good Tests, Chapter 4: Maintainability, Tomek Kaczanowski, 2013
[FRE_PRY] Growing Object-Oriented Software, Guided by Tests, Chapter 8, Steve Freeman, Nat Pryce, 2010

Reference: JUnit in a Nutshell: Test Isolation from our JCG partner Frank Appel at the Code Affine blog....

Programming Language Job Trends Part 2 – August 2014

In part 1 of the programming language job trends, we reviewed Java, C++, C#, Objective C, and Visual Basic. In today's installment, we will review trends for PHP, Python, JavaScript, Ruby, and PERL. Watch for part 3 in the next few days, where we will look at some emerging languages and others gaining steam.

First, let's look at the job demand trends from Indeed.com (chart not reproduced here):

Much like the languages in part 1, there is a general downward trend for about 2 years. JavaScript still leads comfortably, with Python demand staying almost flat during the past two years. PERL has been in a long decline since 2010, but still stays above PHP and Ruby. PHP had stayed flat for a while, but the past year has not been kind, with a more significant downward trend. Ruby trails, but like Python, has been almost flat for close to three years and is closing the gap with PHP and PERL. The stability of the Python and Ruby trends is probably due to their growth in non-startup environments.

Much like part 1 of the job trends, the SimplyHired trends are mostly unusable. The data is definitely close to current, but wild swings in demand show me that I cannot trust the data. I will review SimplyHired again in the next installment.

Lastly, we look at the relative growth trends from Indeed.com (chart not reproduced here):

Given what the job demand graph shows, it seems surprising that the Ruby growth would outpace all of the others so dramatically. However, due to the length of time in these charts, where Ruby did not have much demand in 2006, the growth is a little misleading. When you look at the remaining languages, Python has a clear lead on the others, hovering at 500% for the past three years. PHP and JavaScript come next, but still below 100% growth. PERL lags the group, near -50% growth.

The overall demand trend is similar to the trends in part 1, though these languages show a little more stability. The stability of Ruby and Python is a bright spot in some otherwise dismal trends in Part 1 and this Part 2. For people looking to learn new languages for web development or even some scripting, PERL seems to be losing relevance. I think Python and Ruby have taken over in both of those cases. Due to the popularity of Web CMS systems like WordPress, PHP demand may decline but will stick around for a long time. Please visit the blog in a few days when we look at some relatively newer languages to see if the trends remain the same.

Reference: Programming Language Job Trends Part 2 – August 2014 from our JCG partner Rob Diana at the Regular Geek blog....