Develop, test and deploy standalone apps on CloudBees

CloudBees is a cloud platform that provides a repository, a CI service (Jenkins) and a server for your apps: everything you need to develop, test and deploy. There are many options: the repository can be Git or SVN, and for the server you can choose Jetty, Tomcat, Glassfish, JBoss, Wildfly etc. It is also possible to run standalone applications, which are given a port number so that they can start their own server. That’s the case we’ll cover here.

spray.io is a Scala framework for web apps. It allows you to create standalone web apps (which start their own server, spray-can) or somewhat limited .war ones (spray-servlet), which you can deploy on a JEE server like Glassfish, JBoss etc. We are going to use the standalone variant here.

You can clone the app from github. Let’s take a quick look at it now.

The app

Boot

The Boot file is a Scala App, so it’s like a Java class with a main method: it’s runnable. It creates the Service actor, which handles all the HTTP requests. It also reads the port number from the app.port system property and binds the service to the host and port. app.port is provided by CloudBees; if you want to run the app locally, you need to set it yourself, e.g. with the JVM command-line option -Dapp.port=8080.

Service

Service has the MyService trait, which handles routing for the empty path only. Yes, the app is not very complicated!

Buildfile

The build.gradle file is a bit more interesting. Let’s start from its end.

- mainClassName is set to the Scala App. This is the class that is run when you start the app locally from the command line with gradlew run.
- applicationDefaultJvmArgs is set to -Dapp.port=8080, which is also necessary for running locally from gradle. This sets the port the Service is going to be bound to.
- jar.archiveName sets the name of the generated .jar. Without it, the name depends on the project directory name.

You can run the application by issuing gradlew run (make sure the gradlew file is executable).
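The way a standalone app picks up its port from the app.port system property can be sketched in a few lines. This is a minimal Java illustration, not the article's Scala Boot class: the class name PortConfig and the fallback value are assumptions made for the example, and only the property name app.port comes from the text.

```java
public class PortConfig {
    // Hypothetical fallback for local runs without -Dapp.port;
    // it mirrors the 8080 used elsewhere in this article.
    static final int DEFAULT_PORT = 8080;

    // Returns the port from the app.port system property,
    // or the fallback when the property is missing or not a number.
    static int resolvePort() {
        return Integer.getInteger("app.port", DEFAULT_PORT);
    }

    public static void main(String[] args) {
        System.out.println("Binding to port " + resolvePort());
    }
}
```

Running it with java -Dapp.port=9090 PortConfig should make it report the supplied port. A fallback is convenient locally, although failing fast when the property is missing would be an equally reasonable choice.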
When it’s running, point your browser to http://localhost:8080 and you should see “Say hello to spray-routing on spray-can!” Nothing fancy, sorry.

There is also a “cb” task defined for gradle. If you issue gradlew cb, it builds a zip file with all the dependency .jars and szjug-sprayapp-1.0.jar in its root. This layout is necessary for CloudBees standalone apps.

Deploy to CloudBees

First you need to create an account on CloudBees. Once you have one, download the CloudBees SDK so you can run commands from your command line. On Mac, I prefer brew install, but you are free to choose your way.

When it is installed, run the bees command. When run for the first time, it asks for your login and password, so you don’t need to provide them every time you want to use bees.

Build the .zip we’ll deploy to the cloud: go into the app directory (szjug-sprayapp) and issue the gradlew cb command. This command not only creates the .zip file, it also prints the list of .jars to pass to the bees command as the classpath.

Deploy the application with the following command, run from the szjug-sprayapp directory:

    bees app:deploy -a spray-can -t java -R class=pl.szjug.sprayapp.Boot -R classpath=spray-can-1.3.1.jar:spray-routing-1.3.1.jar:spray-testkit-1.3.1.jar:akka-actor_2.10-2.3.2.jar:spray-io-1.3.1.jar:spray-http-1.3.1.jar:spray-util-1.3.1.jar:scala-library-2.10.3.jar:spray-httpx-1.3.1.jar:shapeless_2.10-1.2.4.jar:akka-testkit_2.10-2.3.0.jar:config-1.2.0.jar:parboiled-scala_2.10-1.1.6.jar:mimepull-1.9.4.jar:parboiled-core-1.1.6.jar:szjug-sprayapp-1.0.jar build/distributions/szjug-sprayapp-1.0.zip

And here is an abbreviated version for readability:

    bees app:deploy -a spray-can -t java -R class=pl.szjug.sprayapp.Boot -R classpath=...:szjug-sprayapp-1.0.jar build/distributions/szjug-sprayapp-1.0.zip

spray-can is the application name and -t java is the application type. The -R options are CloudBees properties, such as the class to run and the classpath to use. The files for the classpath are helpfully printed when gradle runs the cb task, so you just need to copy & paste.
And that’s it! Our application is running on the CloudBees server. It’s accessible at the URL shown in the CloudBees console.

Use CloudBees services

The app is deployed on CloudBees, but is that all? As I mentioned, we could also use a git repository and Jenkins. Let’s do it now.

Repository (Git)

Create a new git repository on your CloudBees account. Choose “Repos” on the left, then “Add Repository”… it’s all pretty straightforward. Name it “szjug-app-repo” and remember it should be Git.

Next, add this repository as a remote to your local git repo. On the repositories page of your CloudBees console there is a very helpful cheat sheet about how to do it.

First add the git remote repository. Let’s name it cb:

    git remote add cb ssh://git@git.cloudbees.com/pawelstawicki/szjug-app-repo.git

Then push your commits there:

    git push cb master

Now you have your code on CloudBees.

CI build server (Jenkins)

It’s time to configure the app build on the CI server. Go to “Builds”. This is where Jenkins lives. Create a new “free-style” job.

Set your git repository on the job, so that Jenkins always checks out a fresh version of the code. You’ll need the repository URL, which you can take from the “Repos” page. Set the URL in the job configuration.

The next thing to set up is the gradle task. Add a build step of type “Invoke gradle script”. Select “Use Gradle Wrapper”; this way you use the gradle version provided with the project. Set “cb” as the gradle task to run.

Well, that’s all you need to have the app built. But we want to deploy it, don’t we? Add the post-build action “Deploy applications”. Enter the Application ID (spray-can in our case; the region should change automatically). This tells Jenkins where to deploy. It also needs to know what to deploy: enter build/distributions/szjug-app-job-*.zip as the “Application file”.

Because you deployed the application earlier from the command line, settings like application type, main class, classpath etc. are already there and you don’t need to provide them again.
It might also be useful to keep the zip file from each build, so we can archive it. Just add the post-build action “Archive the artifacts” and set the same zip file.

OK, that’s all for the build configuration on Jenkins. Now you can hit the “Build now” link and the build should be added to the queue. When it is finished, you can see the logs, status etc. More importantly, the application should be deployed and accessible to the whole world. You can now change something in it, hit “Build now” and, after the build has finished, check that the changes are applied.

Tests

You have probably also noticed that there is a test attached. You can run it with gradlew test. It’s a specs2 test that mixes in the MyService trait, so we have access to myRoute, and Specs2RouteTest, so we have access to the spray.io testing facilities. @RunWith(classOf[JUnitRunner]) is necessary to run the tests in gradle.

Now that we have tests, we’d like to see the test results. That’s another post-build step in Jenkins. Press “Add post-build action” -> “Publish JUnit test result report”. Gradle doesn’t put test results where maven does, so you’ll need to specify the location of the report files.

When that’s done, the next build should show the test results.

Trigger build job

You now have a build job able to build, test and deploy the application. However, this build is only going to run when you run it by hand. Let’s make it run every day, and after every change pushed to the repository.

Summary

So now you have everything necessary to develop an app: a Git repository, a continuous integration build system, and infrastructure to deploy the app to (actually, also continuously). Think of your own app, and… happy devopsing!

Reference: Develop, test and deploy standalone apps on CloudBees from our JCG partner Pawel Stawicki at the Java, the Programming, and Everything blog.

Examining Red Hat JBoss BRMS deployment architectures for rules and events (part I)

(Article guest authored together with John Hurlocker, Senior Middleware Consultant at Red Hat in North America)

In this week’s tips & tricks we will be slowing down and taking a closer look at possible Red Hat JBoss BRMS deployment architectures.

When we talk about deployment architectures we are referring to the options you have to deploy a rules and/or events project in your enterprise. This is the actual runtime architecture that you need to plan for at the start of your design phases, determining for your enterprise and infrastructure what the best way would be to deploy your upcoming application. It will also most likely have an effect on how you design the actual application that you want to build, so being aware of your options should help make your projects a success.

This will be a multi-part series that will introduce the deployment architectures in phases, starting this week with the first two architectures.

The possibilities

A rule administrator or architect works with the application team(s) to design the runtime architecture for rules. Depending on the organization’s needs, the architecture could be any one of the architectures below or a hybrid of them. In this series we will present four different deployment architectures and discuss one design time architecture, providing the pros and cons of each so you can evaluate them for your own needs.

The basic components in these architectures, shown in the accompanying illustrations, are:

- JBoss BRMS server
- Rules developer / Business analyst
- Version control (GIT)
- Deployment servers (JBoss EAP)
- Clients using your application

Illustration 1: Rules in application

Rules deployed in application

The first architecture is the most basic and static in nature of all the options you have to deploy rules and events in your enterprise architecture. A deployable rule package (e.g. JAR) is included in your application’s deployable artifact (e.g. EAR, WAR).
In this architecture the JBoss BRMS server acts as a repository to hold your rules and as a design time tool. Illustration 1 shows how the JBoss BRMS server is and remains completely disconnected from the deployment or runtime environment.

Pros

- Typically better performance than using a rule execution server, since the rule execution occurs within the same JVM as your application

Cons

- Do not have the ability to push rule updates to production applications
  - requires a complete rebuild of the application
  - requires a complete re-testing of the application (Dev – QA – PROD)

Illustration 2: KieScanner deployment

Rules scanned from application

A second architecture, a slight modification of the previous one, adds a scanner to your application that monitors for new rule and event updates, pulling them in as they are deployed into your enterprise architecture.

The JBoss BRMS API contains a KieScanner that monitors the rules repository for new rule package versions. Once a new version is available it will be picked up by the KieScanner and loaded into your application, as shown in illustration 2.

The Cool Store demo project provides an example that demonstrates the usage of the JBoss BRMS KieScanner, with an example implementation showing how to scan your rule repository for the latest freshly built package.

Pros

- No need to restart your application servers
  - in some organizations the deployment process for applications can be very lengthy
  - this allows you to push rule updates to your application(s) in real time

Cons

- Need to create a deployment process for testing the rule updates with the application(s)
  - risk of pushing incorrect logic into application(s) if the above process doesn’t test thoroughly

Next up

Next time we will dig into the two remaining deployment architectures, which provide you with an Execution Server deployment and a hybrid deployment model to leverage several elements in a single architecture.
Finally, we will cover a design time architecture for your teams to use while crafting and maintaining the rules and events in your enterprise.

Reference: Examining Red Hat JBoss BRMS deployment architectures for rules and events (part I) from our JCG partner Eric Schabell at the Eric Schabell’s blog.

JavaFX Tip 7: Use CSS Color Constants / Derive Colors

When working on FlexCalendarFX I got to the point where I had to define a set of colors to visualize the controls for different calendars in different colors. And not just one color per calendar but several: a background and a text color for deselected / selected / hover states. The colors were used in several places, but for the sake of brevity I focus only on the visual calendar entries in the day view of FlexCalendarFX. The two screenshots below show the same entry, first deselected, then selected.

What is important to notice is that these are not completely different colors: they all share the same base color (green), with different saturation. The code below shows the best way I could find to define related colors in JavaFX CSS. I define the base color globally under “.root” and derive all other colors from this constant.

    .root {
        -style1-color: rgba(119, 192, 75, .9);
    }

    .style1-entry {
        -fx-background-color: derive(-style1-color, 50%);
    }

    .style1-entry:selected {
        -fx-background-color: -style1-color;
    }

    .style1-entry-time-label,
    .style1-entry-title-label {
        -fx-text-fill: derive(-style1-color, -50%);
    }

Please notice that the base color uses transparency, as described in my previous blog about transparent colors. The other colors in this CSS fragment are all derived from the base color. They are either brighter (positive percentage value in the derive function) or darker (negative percentage value). By defining colors this way you can achieve a consistent and smooth look for your application, and it will not look like your child’s coloring book.

Reference: JavaFX Tip 7: Use CSS Color Constants / Derive Colors from our JCG partner Dirk Lemmermann at the Pixel Perfect blog.

JavaFX Tip 6: Use Transparent Colors

Picking the right colors for your user interface elements is always a great challenge, but it is even more challenging when you develop reusable framework controls where you as a developer have no control over the look and feel of the application using them. While you might always add elements on top of the default gray background, the developers embedding your controls might have more of a gothic tendency and use a black background. All of a sudden the nice colors you picked clash with the rest of the application.

The best way I found to tackle this problem while working on FlexGanttFX and FlexCalendarFX was to use semi-transparent colors. When you do, the color of your UI elements will always be a mix of their own color and the background color. Your colors will become brighter if the application uses a white background and darker if it uses a black background. The contrast between your element and the background will never be strong, which makes for a smooth appearance.

The following screenshots were taken from FlexCalendarFX (work-in-progress).

Same UI now with a darker background. You might not see it at first, but the green and blue are actually different between these two screenshots. These are very subtle differences, but they make a big difference in the overall impression of your application.

In JavaFX you can define colors in CSS with an alpha channel value smaller than 1 to achieve transparency:

    .my-style {
        -fx-background-color: rgba(255, 255, 255, .7); /* transparent white */
    }

Using opacity also has the nice side-effect that you can still distinguish different elements even when they overlap each other.

Reference: JavaFX Tip 6: Use Transparent Colors from our JCG partner Dirk Lemmermann at the Pixel Perfect blog.
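Why a semi-transparent color turns out brighter on white and darker on black follows from how a color with an alpha value is mixed with an opaque background. Below is a minimal Java sketch of the usual "source over" mixing rule, out = alpha * foreground + (1 - alpha) * background, applied per channel. The class AlphaBlend, its method names and the sample green are assumptions made for this illustration; this is not JavaFX's rendering code.

```java
import java.util.Arrays;

public class AlphaBlend {
    // Mixes one 0-255 channel of a foreground color over an opaque background:
    // out = alpha * fg + (1 - alpha) * bg
    static int blendChannel(int fg, int bg, double alpha) {
        return (int) Math.round(alpha * fg + (1.0 - alpha) * bg);
    }

    // Mixes an RGB triple over an opaque background, channel by channel.
    static int[] blend(int[] fg, int[] bg, double alpha) {
        int[] out = new int[3];
        for (int i = 0; i < 3; i++) {
            out[i] = blendChannel(fg[i], bg[i], alpha);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] green = {119, 192, 75};      // sample element color
        int[] white = {255, 255, 255};
        int[] black = {0, 0, 0};
        double alpha = 0.7;                // same alpha as in the CSS example

        // The same element color ends up lighter on white, darker on black.
        System.out.println("on white: " + Arrays.toString(blend(green, white, alpha)));
        System.out.println("on black: " + Arrays.toString(blend(green, black, alpha)));
    }
}
```

With an alpha of 1 the element color is returned unchanged and with 0 only the background remains, which is why lower alpha values reduce the contrast against whatever background the embedding application uses.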

Improve your Feedbackloop with Continuous Testing

Have you ever thought about what the most valuable thing in software development is for you? I’m not talking about things that are valuable to you personally, but about what matters for the success of the development itself. Well, I have thought about it, and for me it is feedback, in any form. It is so important because it enables steering. Development practices are made for the purpose of better feedback: TDD, Continuous Integration and iterations, to name only a few. Many of the agile methods and XP are basically about better feedback.

It begins with customer interaction. I need as much feedback as possible, as frequently as possible. If I don’t get feedback, I’m likely to get on the wrong track, resulting in a product that is not going to be used because it’s not what the customer needed. The more feedback I get, the better the outcome will be. If I get feedback rarely, I am unable to steer. I’m forced to make assumptions, which are likely just obstacles. The quicker I get feedback, the faster I can deliver value to my customer. Feedback allows me to steer.

Feedback is just as important for programming. I want it early and often. If I write hundreds of lines of code without running them, they will most likely result in a very long and painful debugging session and a lot of changes. I don’t want that, so I take baby steps. They make me go safer, faster and happier.

There are two phases that define my programming feedback loop:

- The phase where I write code. Let’s call it alpha, like so: ‘α’.
- The phase where I evaluate my code and eventually fix errors. Let’s call it beta, like so: ‘β’.

You could also see those phases as modes. It is important to understand that these phases have nothing to do with the alpha/beta stages of a software release cycle. I just invented them to describe my programming feedback loop.

In the following graphics you’ll notice that the lines get shorter and shorter from example to example. This is intentional and should point out how I got faster using new strategies.
When I first started coding, I did not write any tests. I wrote lots of code before I tried it out manually. Obviously it didn’t work when I first ran it. I ended up in rather long α-phases, where I just wrote code, and also long β-phases, where I evaluated (got my feedback) and fixed it. Like this:

I was very slow back then.

I soon started with an interpreted language, which was very cool because I could run the scripts immediately. No need to compile or anything. Just write and run. It shortened my feedback loop, and I became faster overall:

Sooner or later I started TDD. Regardless of the language I was using, interpreted or not, it again shortened my feedback loop and made me go faster. The loop was shortened to a single ‘unit’, which is obviously smaller than anything manually executable. It allowed me to evaluate small behaviours long before the program was even runnable. It is important to understand that the α-phase in the following graphic contains both writing tests and writing the implementation. The β-phase is much shorter, since unit tests run very fast.

I thought this was just great, and that it could not get any better. Wrong I was!! Later, I tried something that made me go like this:

“What the…?” you might ask. No, I did not break space-time. The thing I tried was Continuous Testing. It basically means that I do TDD, but I don’t run my tests by pressing a button and then waiting. I just have them run all the time in the background, automatically…

Every time I change my code, my tests immediately run automatically and show me “OK” or “NOT OK” on a small icon on my screen. Since the tests only take about a second to run, this feedback is instant. And since my IDE saves my files automatically on change, I do not have to press ctrl+s or anything. I just code… and as I code my files get saved… and as the files get saved my tests get run… fluently, immediately. This is HUGE. I am now progressing without disruption.
I completely broke out of the phases, or ‘modes’ if you want to call them that. I love it.

I have used Infinitest for this. It is a Continuous Testing plugin for IntelliJ/Eclipse. If you are doing JavaScript, I can also recommend Grunt for Continuous Testing.

Reference: Improve your Feedbackloop with Continuous Testing from our JCG partner Gregor Riegler at the Be a better Developer blog.

Java 8 Friday: More Functional Relational Transformation

In the past, we’ve been providing you with a new article every Friday about what’s new in Java 8. It has been a very exciting blog series, but we would like to focus again more on our core content, which is Java and SQL. We will still be occasionally blogging about Java 8, but no longer every Friday (as some of you have already noticed).

In this last, short post of the Java 8 Friday series, we’d like to reiterate the fact that we believe that the future belongs to functional relational data transformation (as opposed to ORM). We’ve spent about 20 years now using the object-oriented software development paradigm. Many of us have been very dogmatic about it. In the last 10 years, however, a “new” paradigm has started to get increasing traction in programming communities: functional programming. Functional programming is not that new, however. Lisp was a very early functional programming language. XSLT and SQL are also somewhat functional (and declarative!).

As we’re big fans of SQL’s functional (and declarative!) nature, we’re quite excited about the fact that we now have sophisticated tools in Java to transform tabular data that has been extracted from SQL databases. Streams!

SQL ResultSets are very similar to Streams

As we’ve pointed out before, JDBC ResultSets and Java 8 Streams are quite similar. This is even more true when you’re using jOOQ, which replaces the JDBC ResultSet by an org.jooq.Result, which extends java.util.List, and thus automatically inherits all Streams functionality.
Consider the following query that allows fetching a one-to-many relationship between BOOK and AUTHOR records:

    Map<Record2<String, String>, List<Record2<Integer, String>>> booksByAuthor =

    // This work is performed in the database
    // --------------------------------------
    ctx.select(
            BOOK.ID,
            BOOK.TITLE,
            AUTHOR.FIRST_NAME,
            AUTHOR.LAST_NAME
       )
       .from(BOOK)
       .join(AUTHOR)
       .on(BOOK.AUTHOR_ID.eq(AUTHOR.ID))
       .orderBy(BOOK.ID)
       .fetch()

    // This work is performed in Java memory
    // -------------------------------------
       .stream()

       // Group BOOKs by AUTHOR
       .collect(groupingBy(

            // This is the grouping key
            r -> r.into(AUTHOR.FIRST_NAME, AUTHOR.LAST_NAME),

            // This is the target data structure
            LinkedHashMap::new,

            // This is the value to be produced for each
            // group: A list of BOOK
            mapping(
                r -> r.into(BOOK.ID, BOOK.TITLE),
                toList()
            )
        ));

The fluency of the Java 8 Streams API is very idiomatic to someone who has been used to writing SQL with jOOQ. Obviously, you can also use something other than jOOQ, e.g. Spring’s JdbcTemplate, or Apache Commons DbUtils, or just wrap the JDBC ResultSet in an Iterator…

What’s very nice about this approach, compared to ORM, is the fact that there is no magic happening at all. Every piece of mapping logic is explicit and, thanks to Java generics, fully typesafe. The type of the booksByAuthor output is complex, and a bit hard to read / write in this example, but it is also fully descriptive and useful.

The same functional transformation with POJOs

If you aren’t too happy with using jOOQ’s Record2 tuple types, no problem. You can specify your own data transfer objects like so:

    class Book {
        public int id;
        public String title;

        @Override
        public String toString() { ... }

        @Override
        public int hashCode() { ... }

        @Override
        public boolean equals(Object obj) { ... }
    }

    static class Author {
        public String firstName;
        public String lastName;

        @Override
        public String toString() { ... }

        @Override
        public int hashCode() { ... }

        @Override
        public boolean equals(Object obj) { ... }
    }

With the above DTOs, you can now leverage jOOQ’s built-in POJO mapping to transform the jOOQ records into your own domain classes:

    Map<Author, List<Book>> booksByAuthor =
    ctx.select(
            BOOK.ID,
            BOOK.TITLE,
            AUTHOR.FIRST_NAME,
            AUTHOR.LAST_NAME
       )
       .from(BOOK)
       .join(AUTHOR)
       .on(BOOK.AUTHOR_ID.eq(AUTHOR.ID))
       .orderBy(BOOK.ID)
       .fetch()
       .stream()
       .collect(groupingBy(

            // This is the grouping key
            r -> r.into(Author.class),
            LinkedHashMap::new,

            // This is the grouping value list
            mapping(
                r -> r.into(Book.class),
                toList()
            )
        ));

Explicitness vs. implicitness

At Data Geekery, we believe that a new time has started for Java developers. A time where Annotatiomania™ (finally!) ends and people stop assuming all that implicit behaviour through annotation magic. ORMs depend on a huge amount of specification to explain how each annotation works with each other annotation. It is hard to reverse-engineer (or debug!) this kind of not-so-well-understood annotation-language that JPA has brought to us.

On the flip side, SQL is pretty well understood. Tables are an easy-to-handle data structure, and if you need to transform those tables into something more object-oriented, or more hierarchically structured, you can simply apply functions to those tables and group values yourself! By grouping those values explicitly, you stay in full control of your mapping, just as with jOOQ, you stay in full control of your SQL.

This is why we believe that in the next 5 years, ORMs will lose relevance and people will start embracing explicit, stateless and magicless data transformation techniques again, using Java 8 Streams.

Reference: Java 8 Friday: More Functional Relational Transformation from our JCG partner Lukas Eder at the JAVA, SQL, AND JOOQ blog.
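The grouping step itself does not depend on jOOQ. As a minimal sketch, assuming the joined rows are already in memory as plain objects, the same groupingBy / mapping / toList combination looks like this. The Row class, the method titlesByAuthor and the sample data are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingSketch {
    // Hypothetical flat row standing in for one joined BOOK/AUTHOR record.
    static final class Row {
        final int bookId;
        final String title;
        final String author;

        Row(int bookId, String title, String author) {
            this.bookId = bookId;
            this.title = title;
            this.author = author;
        }
    }

    // Groups book titles by author, keeping authors in first-seen order.
    static Map<String, List<String>> titlesByAuthor(List<Row> rows) {
        return rows.stream().collect(Collectors.groupingBy(
            r -> r.author,                 // the grouping key
            LinkedHashMap::new,            // the target data structure
            Collectors.mapping(r -> r.title, Collectors.toList())));
    }

    public static void main(String[] args) {
        // Invented sample rows, ordered by book id like the query above.
        List<Row> rows = Arrays.asList(
            new Row(1, "Book 1", "Author A"),
            new Row(2, "Book 2", "Author B"),
            new Row(3, "Book 3", "Author A"));
        System.out.println(titlesByAuthor(rows));
    }
}
```

Passing LinkedHashMap::new as the map factory is what keeps the groups in encounter order; the two-argument groupingBy makes no promise about the order of the resulting map.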

Fridays Fun

Reference: Fridays Fun from our JCG partner Bohdan Bandrivskyy at the Java User Group of Lviv blog.

Use Cases for Elasticsearch: Full Text Search

In the last post of this series on use cases for Elasticsearch we looked at the features Elasticsearch provides for storing even large amounts of documents. In this post we will look at another one of its core features: search. I am building on some of the information in the previous post, so if you haven’t read it you should do so now.

As we have seen, we can use Elasticsearch to store JSON documents that can even be distributed across several machines. Indexes are used to group documents, and each document is stored using a certain type. Shards are used to distribute parts of an index across several nodes, and replicas are copies of shards that are used for distributing load as well as for fault tolerance.

Full Text Search

Everybody uses full text search. The amount of information has just become too much to access it using navigation and categories alone. Google is the most prominent example, offering instant keyword search across a huge amount of information.

Looking at what Google does, we can already see some common features of full text search. Users only provide keywords and expect the search engine to provide good results. Relevancy of documents is expected to be good, and users want the results they are looking for on the first page. How relevant a document is can be influenced by different factors, like how often the queried term exists in a document. Besides getting the best results, the user wants to be supported during the search process. Features like suggestions and highlighting on the result excerpt can help with this.

Another area where search is important is e-commerce, with Amazon being one of the dominant players.

The interface looks similar to the Google one. The user can enter keywords that are then searched for. But there are also slight differences. The suggestions Amazon provides are more advanced, also hinting at categories a term might be found in. The result display is different as well, consisting of a more structured view.
The structure of the documents being searched is also used for determining the facets on the left, which can be used to filter the current result based on certain criteria, e.g. all results that cost between 10 and 20 €. Finally, relevance might mean something completely different when it comes to something like an online store. Often the order of the result listing is influenced by the vendor, or the user can sort the results by criteria like price or release date.

Though neither Google nor Amazon is using Elasticsearch, you can use it to build similar solutions.

Searching in Elasticsearch

As with everything else, Elasticsearch can be searched using HTTP. In the simplest case you can append the _search endpoint to the URL and add a parameter:

    curl -XGET "http://localhost:9200/conferences/talk/_search?q=elasticsearch&pretty=true"

Elasticsearch will then respond with the results, ordered by relevancy.

    {
      "took" : 81,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 1,
        "max_score" : 0.067124054,
        "hits" : [ {
          "_index" : "conferences",
          "_type" : "talk",
          "_id" : "iqxb7rDoTj64aiJg55KEvA",
          "_score" : 0.067124054,
          "_source":{
            "title" : "Anwendungsfälle für Elasticsearch",
            "speaker" : "Florian Hopf",
            "date" : "2014-07-17T15:35:00.000Z",
            "tags" : ["Java", "Lucene"],
            "conference" : {
              "name" : "Java Forum Stuttgart",
              "city" : "Stuttgart"
            }
          }
        } ]
      }
    }

Though we have searched on a certain type now, you can also search multiple types or multiple indices.

Adding a parameter is easy, but search requests can become more complex. We might request highlighting or filter the documents according to a criterion. Instead of using parameters for everything, Elasticsearch offers the so-called Query DSL, a search API that is passed in the body of the request and is expressed using JSON.

The following query could be the result of a user trying to search for elasticsearch but mistyping parts of it.
The results are filtered so that only talks for conferences in the city of Stuttgart are returned.

    curl -XPOST "http://localhost:9200/conferences/_search" -d'
    {
        "query": {
            "match": {
                "title" : {
                    "query": "elasticsaerch",
                    "fuzziness": 2
                }
            }
        },
        "filter": {
            "term": {
                "conference.city": "stuttgart"
            }
        }
    }'

This time we are querying all documents of all types in the index conferences. The query object requests one of the common queries, a match query on the title field of the document. The query attribute contains the search term that would be passed in by the user. The fuzziness attribute requests that we should also find documents that contain terms similar to the term requested. This takes care of the misspelled term and also returns results containing elasticsearch. The filter object requests that all results should be filtered according to the city of the conference. Filters should be used whenever possible, as they can be cached and do not calculate the relevancy, which should make them faster.

Normalizing Text

As search is used everywhere, users also have some expectations of how it should work. Instead of issuing exact keyword matches they might use terms that are only similar to the ones that are in the document. For example, a user might be querying for the term Anwendungsfall, which is the singular of the contained term Anwendungsfälle, meaning use cases in German:

    curl -XGET "http://localhost:9200/conferences/talk/_search?q=title:anwendungsfall&pretty=true"

    {
      "took" : 2,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 0,
        "max_score" : null,
        "hits" : [ ]
      }
    }

No results. We could try to solve this using the fuzzy search we have seen above, but there is a better way. We can normalize the text during indexing so that both keywords point to the same term in the document.
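Before moving on to normalization, the fuzziness value used in the query above can be made concrete: it is a bound on the edit distance between the query term and an indexed term. The sketch below computes the plain Levenshtein distance (insertions, deletions and substitutions each count as one edit) in Java. It is only an illustration under that assumption and not how Lucene implements fuzzy matching; as far as I know, Lucene can additionally count a swap of two adjacent characters as a single edit. The class and method names are made up for this example.

```java
public class EditDistance {
    // Classic dynamic-programming Levenshtein distance using two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;                       // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;                       // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                    Math.min(curr[j - 1] + 1,  // insertion
                             prev[j] + 1),     // deletion
                    prev[j - 1] + cost);       // substitution or match
            }
            int[] tmp = prev;                  // reuse the rows
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // The misspelled term from the query and the term it should match.
        System.out.println(levenshtein("elasticsaerch", "elasticsearch"));
    }
}
```

Under plain Levenshtein the swapped "ae" in elasticsaerch costs two substitutions, so the misspelling is within a fuzziness of 2 of elasticsearch.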
Lucene, the library that search and storage in Elasticsearch are implemented with, provides the underlying data structure for search, the inverted index. Terms are mapped to the documents they are contained in. A process called analyzing is used to split the incoming text and add, remove or modify terms.

On the left we can see two documents that are indexed, on the right we can see the inverted index that maps terms to the documents they are contained in. During the analyzing process the content of the documents is split and transformed in an application-specific way so it can be put in the index. Here the text is first split on whitespace or punctuation. Then all the characters are lowercased. In a final step, language-dependent stemming is employed, which tries to find the base form of terms. This is what transforms our Anwendungsfälle to Anwendungsfall.

What kind of logic is executed during analyzing depends on the data of your application. The analyzing process is one of the main factors determining the quality of your search, and you can spend quite some time on it. For more details you might want to look at my post on the absolute basics of indexing data.

In Elasticsearch, how fields are analyzed is determined by the mapping of the type. Last week we saw that we can index documents of different structure in Elasticsearch, but as we can see now Elasticsearch is not exactly schema free. The analyzing process for a certain field is determined once and cannot be changed easily. You can add additional fields, but you normally don’t change how existing fields are stored.

If you don’t supply a mapping, Elasticsearch will do some educated guessing for the documents you are indexing. It will look at any new field it sees during indexing and do what it thinks is best. In the case of our title it uses the StandardAnalyzer because it is a string. Elasticsearch does not know what language our string is in, so it doesn’t do any stemming, which is a good default.
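The inverted index and the analyzing step described above can be sketched in plain Java. This is a toy model and not Lucene's implementation: the class InvertedIndexSketch, its methods and the sample titles are assumptions made for the illustration, and the analyzer only splits and lowercases, without any stemming.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class InvertedIndexSketch {
    // term -> ids of the documents that contain the term
    private final Map<String, SortedSet<Integer>> postings = new TreeMap<>();

    // Toy analyzer: split on anything that is not a letter or digit, then lowercase.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Adds every analyzed term of the text to the postings of the document.
    void index(int docId, String text) {
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    // Returns the documents containing any query term; the query goes
    // through the same analyzer as the indexed text.
    Set<Integer> search(String query) {
        Set<Integer> result = new TreeSet<>();
        for (String term : analyze(query)) {
            result.addAll(postings.getOrDefault(term, Collections.emptySortedSet()));
        }
        return result;
    }

    public static void main(String[] args) {
        InvertedIndexSketch idx = new InvertedIndexSketch();
        idx.index(1, "Use Cases for Elasticsearch");
        idx.index(2, "Elasticsearch and Lucene");
        System.out.println(idx.search("LUCENE"));   // matches despite the different case
        System.out.println(idx.search("case"));     // no match: "cases" was never stemmed
    }
}
```

Because both documents and queries pass through the same analyzer, a change in case no longer matters, while the singular "case" still misses the indexed "cases". That remaining gap is the one stemming is meant to close.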
To tell Elasticsearch to use the GermanAnalyzer instead, we need to add a custom mapping. We first delete the index and create it again:

    curl -XDELETE "http://localhost:9200/conferences/"
    curl -XPUT "http://localhost:9200/conferences/"

We can then use the PUT mapping API to pass in the mapping for our type.

    curl -XPUT "http://localhost:9200/conferences/talk/_mapping" -d'
    {
        "properties": {
            "tags": {
                "type": "string",
                "index": "not_analyzed"
            },
            "title": {
                "type": "string",
                "analyzer": "german"
            }
        }
    }'

We have only supplied a custom mapping for two fields. The rest of the fields will again be guessed by Elasticsearch. When creating a production app you will most likely map all of your fields in advance, but the ones that are not that relevant can also be mapped automatically. Now, if we index the document again and search for the singular, the document will be found.

Advanced Search

Besides the features we have seen here, Elasticsearch provides a lot more. You can automatically gather facets for the results using aggregations, which we will look at in a later post. The suggesters can be used to perform autosuggestion for the user, terms can be highlighted, results can be sorted according to fields, you get pagination with each request, and so on. As Elasticsearch builds on Lucene, all the goodies for building an advanced search application are available.

Conclusion

Search is a core part of Elasticsearch that can be combined with its distributed storage capabilities. You can use the Query DSL to build expressive queries. Analyzing is a core part of search and can be influenced by adding a custom mapping for a type. Lucene and Elasticsearch provide lots of advanced features for adding search to your application.

Of course there are lots of users that are building on Elasticsearch because of its search features and its distributed nature.
GitHub uses it to let users search the repositories, StackOverflow indexes all of its questions and answers in Elasticsearch and SoundCloud offers search in the metadata of the songs. In the next post we will look at another aspect of Elasticsearch: using it to index geodata, which lets you filter and sort results by position and distance.

Reference: Use Cases for Elasticsearch: Full Text Search from our JCG partner Florian Hopf at the Dev Time blog.

Server vs Client Side Rendering (AngularJS vs Server Side MVC)

There’s a lot of discussion related to server vs client side application rendering. While there is no “one choice fits all” solution, I’ll try to argue in favor of client side (specifically AngularJS) from different points of view. The first of them is architecture.

Architecture

A well-done architecture has a clearly defined separation of concerns (SoC). In most cases the minimal high-level configuration is:

    Data storage
    Services
    API
    Presentation

Each of those layers should have only minimal knowledge of the one above it in this list. Services need to know where to store data, the API needs to know what services to call, and the presentation layer can communicate with the rest only through the API. An important thing to note here is that knowledge about the layers below should be non-existent. For example, the API should not know who or what will consume it. It should have no knowledge of the presentation layer.

A lot more could be said about each of those layers, and the situation in the “real world” is much more complicated than this. However, for the rest of the article the important takeaway is that the presentation layer communicates with the server through the API which, in turn, does not know anything about the “world outside”. This separation is becoming more important with the ever-increasing types of machines and devices (laptop, mobile, tablet, desktop). The back-end should only provide business logic and data.

Skills

Taking developers’ skills into account is an important aspect of the architecture. If, for example, developers are used to working in Java, one should not design a system that is based on C# unless there is a clear advantage to making the change. That does not mean that skills should not be extended with new languages (who said Scala and Clojure?). It’s just that the productivity of the team must be taken into account, and knowledge of programming languages is an important element. No matter the existing knowledge, there are some skills required by the type of the application.
For example, if the application will have a Web site as the presentation layer, HTML, CSS and JavaScript knowledge is a must. There are frameworks that can be used to avoid the need for that knowledge (e.g. Vaadin). However, usage of those frameworks only postpones the inevitable: acceptance that HTML, CSS and JS are, one way or another, the languages that the browser understands. The question is only whether you’ll adopt them directly or use something else (e.g. Java or C#) to write them for you. The latter case might give an impression of being able to do things faster but, in many situations, there is a price to pay later when the time comes to do something those “translators” do not support.

The server side is easier to cope with. There are more choices and there are good (if not great) solutions for every skill set. We might argue whether Scala is better than Java, but both provide results that are good enough, and the decision to learn a new language is much harder to make (even though I think that developers should continuously extend their knowledge by trying new languages and frameworks). One can code the back-end in Java, Scala, C#, Clojure, JavaScript/NodeJS, etc. We don’t have that luxury in browsers. Adobe Flash is dead, Silverlight never lifted off. Tools like Vaadin, which were designed primarily to alleviate the pain that JavaScript was causing, are losing ground due to the continuous improvements we’re seeing in HTML and JavaScript. The “browser” world is changing rapidly and the situation is quite different from what it was not so long ago. Welcome to the world of HTML5.

Something similar can be said for development on mobile devices. There is no one language that fits all. We cannot develop iPhone applications in Java. While HTML can be the solution in some cases, in others one needs to go for “native” development. The only constant is that, no matter whether it’s Web, mobile, desktop or Google Glass, they should all communicate with the rest of the system using an API.
The point I’m trying to make is that there must be a balance between adopting the languages needed to do the work and switching to a new language with every new project. Some languages are a must and some are good (but not mandatory) to have. When working with the Web, HTML, CSS and JavaScript are a must.

Server vs client side rendering

Since we established that, in the case of Web sites (who said applications?), HTML with CSS and JavaScript is a must and tools that try to create it for us are “evil”, the question remains who renders the HTML. For most of the history of browsers, we were used to rendering HTML on the server and sending it to the browser. There were strong reasons for that. Front-end technologies and frameworks were young and immature, browsers had serious compatibility issues and, generally speaking, working with JavaScript was painful.

That picture is not valid any more. Google showed us that in many cases the browser is as good as the desktop. jQuery revolutionized the way we work with JavaScript by letting us manipulate the DOM in a relatively easy way. Many other JS frameworks and libraries were released. However, until recently there was no substitute for the good old model-view-controller (MVC) pattern.

Server rendering is a must for all but small sites. Or is it? AngularJS changed the way we perceive MVC (actually it’s model-view-whatever, but let’s not get sidetracked). It can be done in the client without sacrificing productivity. I’d argue that, in many cases, productivity increases with AngularJS. There are other client side MVCs like BackboneJS and EmberJS. However, as far as my experience goes, nothing beats AngularJS.

AngularJS is not without its problems. Let’s go through the pros and cons of client vs server-side page rendering. By client side I’m assuming AngularJS. For this comparison, server-side can be anything (Java, C#, etc).
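To make the two options concrete before weighing them, here is a minimal sketch in Python (any back-end language would do). The data and the names `ITEMS`, `render_on_server` and `api_response` are made up for illustration, and the client side (an AngularJS template consuming the JSON) is not shown. With server-side rendering the server produces the finished HTML; with client-side rendering it only returns data and leaves the markup to the browser.

```python
import json
from html import escape

# Hypothetical data, as a service layer might hand it to the API.
ITEMS = [
    {"name": "Widget", "price": 3},
    {"name": "Gadget", "price": 5},
]


def render_on_server(items):
    """Server-side rendering: the server builds the complete HTML page."""
    rows = "".join(
        f"<li>{escape(item['name'])}: {item['price']}</li>" for item in items
    )
    return f"<html><body><ul>{rows}</ul></body></html>"


def api_response(items):
    """Client-side rendering: the server returns only JSON; the browser renders it."""
    return json.dumps({"items": items})


print(render_on_server(ITEMS))
print(api_response(ITEMS))
```

Note that `api_response` knows nothing about how the data will be displayed, so the same response could serve a browser, a mobile application or a desktop client.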
AngularJS cons

Page rendering is slower since the browser needs to do the extra work of DOM manipulation, watch for changes in bound data, make additional REST requests to the server, etc. The first time the application is opened, it needs to download all JavaScript files. Depending on the complexity of the application, this might or might not be a problem. Modern computers are perfectly capable of taking over the extra work. Mobile devices are more powerful than older computers. In most cases, clients will not notice this increase in the work the browser needs to do.

Compatibility with older browsers is hard to accomplish. One would need to render alternative pages on the server. The weight of this argument depends on whether you care about (very) old browsers. The main culprit is Internet Explorer. Version 8 works (somehow) if additional directives are applied. Earlier versions are not supported. Future versions of AngularJS will drop support for Internet Explorer 8. It’s up to you to decide whether support for IE8 and earlier is important. If it is, alternative pages need to be served, and that will result in additional development time. Depending on the complexity of the application, the same problem might exist in non-AngularJS development.

Search engine optimisation (SEO) is probably the biggest issue. At the moment, the most common technique for mitigating this problem is to pre-render pages on the server. It’s a relatively simple process that requires a small amount of code that will work for any screen. More information can be found in “How do I create an HTML snapshot?” and Prerender.io. In May 2014 the article “Understanding web pages better” appeared, giving us the good news that Google is able to execute JavaScript, thus solving (or being on the way to solving) SEO problems for sites relying heavily on JS.

AngularJS pros

Server performance, if done well (clever usage of JSON, client-side caching, etc), increases. The amount of traffic between the client and the server is reduced.
The server itself does not need to create the page before sending it to the client. It only needs to serve static files and respond to API calls with JSON. Both traffic and server workload are reduced.

AngularJS is designed with testing needs in mind. Together with dependency injection and the ability to mock objects, services and functions, this makes it very easy to write tests (easier than in most other cases I have worked with). Both unit and end-to-end tests can be written and run fast.

As suggested in the architecture section, the front-end is (almost) completely decoupled from the back-end. AngularJS needs to have knowledge of the REST API. The server still needs to deliver static files (HTML, CSS and JavaScript) and to pre-render screens when crawlers are visiting. However, neither job needs any internal knowledge of the rest of the system, and both can be done on the same or a completely different server. A simple NodeJS HTTP server can serve the purpose. This decoupling allows us to develop the back-end and the front-end independently of each other. With client side rendering, the browser is an API consumer in the same way as an Android, iPhone or desktop application would be.

Knowledge of server-side programming languages is not needed. No matter the approach one takes (server or client rendering), HTML/CSS/JavaScript is required. Not mixing the server side into this picture makes the lives of front-end developers much easier.

Google support for Angular is a plus. Having someone like Google behind it makes it more likely that its support and future improvements will continue at full speed.

Once used to the AngularJS way of working, development speed increases. The amount of code can be greatly reduced. Eliminating the need to re-compile the back-end code allows us to see changes to the front-end almost immediately.

Summary

This view of client vs server-side rendering should be taken with caution. There is no “one fits all” solution.
Depending on the needs and solutions employed, many of the pros and cons listed above are not valid or can be applied to server-side rendering as well.

Server side rendering is in many cases chosen in order to avoid the dive into HTML, CSS and JavaScript. It makes developers who are used to working with server-side programming languages (Java, C#, etc) more comfortable, thinking that there’s no need to learn “browser” languages. Also, in many cases it produces (often unintentional) coupling with the back-end code. Both situations should be avoided. I’m not arguing that server-side rendering inevitably leads to those situations, but that it makes them more likely.

It’s a brave new world out there. Client-side programming is quite different from what it was before. There are many reasons to at least try it out. Whatever the decision, it should be taken with enough information, which can be obtained only through practical experience. Try it out and don’t give up at the first obstacle (there will be many). If you choose not to take this route, make it an informed decision.

Client side MVCs like AngularJS are far from perfect. They are relatively young and have a long way to go. Many improvements will come soon and I’m convinced that the future of the Web is going in that direction.

Reference: Server vs Client Side Rendering (AngularJS vs Server Side MVC) from our JCG partner Viktor Farcic at the Technology conversations blog.

Applied Big Data : The Freakonomics of Healthcare

I went with a less provocative title this time because my last blog post (http://brianoneill.blogspot.com/2014/04/big-data-fixes-obamacare.html) evidently incited political flame wars. In this post, I hope to avoid that by detailing exactly how Big Data can help our healthcare system in a nonpartisan way. First, let’s decompose the problem a bit.

Economics

Our healthcare system is still (mostly) based on capitalism: more patients + more visits = more money. Within such a system, it is not in the best interest of healthcare providers to have healthy patients. Admittedly, this is a pessimistic view, and doctors and providers are not always prioritizing financial gain. Minimally however, at a macro-scale there exists a conflict of interest for some segment of the market, because not all healthcare providers profit entirely from preventative care.

Behavior

Right now, with a few exceptions, everyone pays the same for healthcare. Things are changing, but broadly speaking, there are no financial incentives to make healthy choices. We are responsible only for a fraction of the medical expenses we incur. That means everyone covered by my payer (the entity behind the curtain that actually foots the bills) is helping pay for the medical expenses I may rack up as a result of my Friday night pizza and beer binges.

Government

Finally, the government is trying. They are trying really hard. Through transparency, reporting, and compliance, they have the correct intentions and ideas to bend the cost curve of healthcare. But the government is the government, and large enterprises are large enterprises. And honestly, getting visibility into the disparate systems of any single large enterprise is difficult (ask any CIO). Imagine trying to gain visibility into thousands of enterprises, all at once. It’s daunting: schematic disparities, messy data, ETL galore.

Again, this is a pessimistic view and there are remedies in the works.
Things like high deductible plans are making people more aware of their expenses. Payers are trying to transition away from fee-for-service models (http://en.m.wikipedia.org/wiki/Fee-for-service). But what do these remedies need to be effective? You guessed it. Data. Mounds of it.

If you are a payer and want to reward the doctors that are keeping their patients healthy (and out of the doctors’ offices!), how would you find them? If you are a patient and want to know who provides the most effective treatments at the cheapest prices, where would you look? If you are the government and want to know how much pharmaceutical companies are spending on doctors, or which pharmacies are allowing fraudulent prescriptions, what systems would you need to integrate?

Hopefully now, you are motivated. This is a big data problem. What’s worse is that it is a messy data problem. At HMS, it’s taken us more than three years and significant blood, sweat and tears to put together a platform that deals with the big and messy mound o’ data. The technologies had to mature, along with people and processes. And finally, on sunny days, I can see a light at the end of the tunnel for US healthcare. If you are on the same mission, please don’t hesitate to reach out.

Ironically, I’m posting this from a hospital bed as I recover from the bite of a brown recluse spider. I guess there are certain things that big data can’t prevent!

Reference: Applied Big Data : The Freakonomics of Healthcare from our JCG partner Brian ONeill at the Brian ONeill’s Blog blog.
Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.