


Avoid unwanted component scanning of Spring Configuration

I came across an interesting problem on Stack Overflow. Brett Ryan had a problem where his Spring Security configuration was initialized twice. When I looked into his code I spotted the problem. Let me show the code. He has a pretty standard Spring application (not using Spring Boot) that uses the more modern Java servlet configuration based on Spring’s AbstractAnnotationConfigDispatcherServletInitializer:

import org.springframework.web.servlet.support.AbstractAnnotationConfigDispatcherServletInitializer;

public class AppInitializer extends AbstractAnnotationConfigDispatcherServletInitializer {

    @Override
    protected Class<?>[] getRootConfigClasses() {
        return new Class[]{SecurityConfig.class};
    }

    @Override
    protected Class<?>[] getServletConfigClasses() {
        return new Class[]{WebConfig.class};
    }

    @Override
    protected String[] getServletMappings() {
        return new String[]{"/"};
    }
}

As you can see, there are two configuration classes:

- SecurityConfig – holds the Spring Security configuration
- WebConfig – the main Spring IoC container configuration

package net.lkrnac.blog.dontscanconfigurations;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.authentication.builders.AuthenticationManagerBuilder;
import org.springframework.security.config.annotation.web.configuration.WebSecurityConfigurerAdapter;
import org.springframework.security.config.annotation.web.servlet.configuration.EnableWebMvcSecurity;

@Configuration
@EnableWebMvcSecurity
public class SecurityConfig extends WebSecurityConfigurerAdapter {

    @Autowired
    public void configureGlobal(AuthenticationManagerBuilder auth) throws Exception {
        System.out.println("Spring Security init...");
        auth
            .inMemoryAuthentication()
            .withUser("user").password("password").roles("USER");
    }
}

import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.servlet.config.annotation.EnableWebMvc;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurerAdapter;

@Configuration
@EnableWebMvc
@ComponentScan(basePackages = "net.lkrnac.blog.dontscanconfigurations")
public class WebConfig extends WebMvcConfigurerAdapter {
}

Pay attention to the component scanning in WebConfig. It scans the package where all three classes are located. When you run this on a servlet container, the text “Spring Security init...” is written to the console twice. That means the SecurityConfig configuration is loaded twice. It was loaded:

- during initialization of the servlet container, in the method AppInitializer.getRootConfigClasses()
- by the component scan in the class WebConfig

Why? I found this explanation in Spring’s documentation:

Remember that @Configuration classes are meta-annotated with @Component, so they are candidates for component-scanning!

So this is a feature of Spring, and therefore we want to avoid component scanning of Spring @Configuration classes used by the servlet configuration. Brett Ryan independently found this problem and showed his solution in the mentioned Stack Overflow question:

@ComponentScan(basePackages = "com.acme.app",
    excludeFilters = {
        @Filter(type = ASSIGNABLE_TYPE, value = {
            WebConfig.class,
            SecurityConfig.class
        })
})

I don’t like this solution. The annotation is too verbose for me. Also, some developer could create a new @Configuration class and forget to include it in this filter. I would rather specify a special package to be excluded from Spring’s component scanning. I created a sample project on Github so that you can play with it.

Reference: Avoid unwanted component scanning of Spring Configuration from our JCG partner Lubos Krnac at the Lubos Krnac Java blog....
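A package-based exclusion can be sketched like this (a sketch only: the dedicated config sub-package and the FilterType.REGEX pattern are my own illustration, not from the original post). All @Configuration classes live in one sub-package, and that whole package is excluded from scanning:

```java
import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.FilterType;
import org.springframework.web.servlet.config.annotation.EnableWebMvc;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurerAdapter;

// Hypothetical layout: all @Configuration classes are kept in the
// "...dontscanconfigurations.config" sub-package, which is excluded
// from component scanning as a whole.
@Configuration
@EnableWebMvc
@ComponentScan(
        basePackages = "net.lkrnac.blog.dontscanconfigurations",
        excludeFilters = @ComponentScan.Filter(
                type = FilterType.REGEX,
                pattern = "net\\.lkrnac\\.blog\\.dontscanconfigurations\\.config\\..*"))
public class WebConfig extends WebMvcConfigurerAdapter {
}
```

With this layout a newly added @Configuration class dropped into the config package needs no change to the filter.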

Black Box Testing of Spring Boot Microservice is so easy

When I needed to do prototyping, a proof of concept, or to play with some new technology in my free time, starting a new project was always a little annoying barrier with Maven. I have to say that setting up a Maven project is not hard, and you can use Maven Archetypes. But Archetypes are often out of date. Who wants to play with old technologies? So I always ended up wiring in the dependencies I wanted to play with. Not a very productive way to spend time. But then Spring Boot came my way. I fell in love. In the last few months I have created at least 50 small playground projects and prototypes with Spring Boot, and I have also incorporated it at work. It’s just perfect for prototyping, learning, microservices, web, batch, enterprise, message flow or command line applications. You have to be a dinosaur or blind not to evaluate Spring Boot for your next Spring project. And when you finish evaluating it, you will go for it. I promise. I feel a need to highlight how easy black box testing of a Spring Boot microservice is. Black box testing refers to testing without any poking with the application artifact. Such testing can also be called integration testing. You can also perform performance or stress testing the way I am going to demonstrate. A Spring Boot microservice is usually a web application with embedded Tomcat, so it is executed as a JAR from the command line. There is a possibility to convert a Spring Boot project into a WAR artifact that can be hosted on a shared servlet container, but we don’t want that now. It’s better when a microservice has its own little embedded container. I used the existing Spring REST service guide as the testing target. The focus is mostly on the testing project, so it is handy to use this “Hello World” REST application as an example. I expect these two common tools are set up and installed on your machine:

- Maven 3
- Git

So we’ll need to download the source code and install the JAR artifact into our local repository. I am going to use the command line to download and install the microservice.
Let’s go to some directory where we download the source code, and use these commands:

git clone git@github.com:spring-guides/gs-rest-service.git
cd gs-rest-service/complete
mvn clean install

If everything went OK, the Spring Boot microservice JAR artifact is now installed in our local Maven repository. In serious Java development it would rather be installed into a shared repository (e.g. Artifactory, Nexus, ...). When our microservice is installed, we can focus on the testing project. It is also Maven and Spring Boot based. Black box testing will be achieved by downloading the artifact from the Maven repository (it doesn’t matter whether it is local or remote). The maven-dependency-plugin can help us this way:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-dependency-plugin</artifactId>
    <executions>
        <execution>
            <id>copy-dependencies</id>
            <phase>compile</phase>
            <goals>
                <goal>copy-dependencies</goal>
            </goals>
            <configuration>
                <includeArtifactIds>gs-rest-service</includeArtifactIds>
                <stripVersion>true</stripVersion>
            </configuration>
        </execution>
    </executions>
</plugin>

It downloads the microservice artifact into the target/dependency directory by default. As you can see, it’s hooked into the compile phase of the Maven lifecycle, so that the downloaded artifact is available during the test phase. The artifact is stripped of version information; we use the latest version. This makes usage of the JAR artifact easier during testing. Readers skilled with Maven may notice the missing plugin version. A Spring Boot driven project inherits from a parent Maven project called spring-boot-starter-parent, which contains the versions of the main Maven plugins. This is one of Spring Boot’s opinionated aspects. I like it, because it provides a stable dependency matrix. You can change the version if you need to. When we have the artifact in our file system, we can start testing.
We need to be able to execute the JAR file from the command line. I used the standard Java ProcessBuilder this way:

public class ProcessExecutor {
    public Process execute(String jarName) throws IOException {
        Process p = null;
        ProcessBuilder pb = new ProcessBuilder("java", "-jar", jarName);
        pb.directory(new File("target/dependency"));
        File log = new File("log");
        pb.redirectErrorStream(true);
        pb.redirectOutput(Redirect.appendTo(log));
        p = pb.start();
        return p;
    }
}

This class executes the given JAR based on the given file name. The location is hard-coded to the target/dependency directory, where maven-dependency-plugin placed our artifact. Standard and error outputs are redirected to a file. The next class needed for testing is a DTO (Data Transfer Object). It is a simple POJO that will be used for deserialization from JSON. I use the Lombok project to reduce the boilerplate code needed for getters, setters, hashCode and equals:

@Data
@AllArgsConstructor
@NoArgsConstructor
public class Greeting {
    private long id;
    private String content;
}

The test itself looks like this:

public class BlackBoxTest {
    private static final String RESOURCE_URL = "http://localhost:8080/greeting";

    @Test
    public void contextLoads() throws InterruptedException, IOException {
        Process process = null;
        Greeting actualGreeting = null;
        try {
            process = new ProcessExecutor().execute("gs-rest-service.jar");

            RestTemplate restTemplate = new RestTemplate();
            waitForStart(restTemplate);

            actualGreeting = restTemplate.getForObject(RESOURCE_URL, Greeting.class);
        } finally {
            process.destroyForcibly();
        }
        Assert.assertEquals(new Greeting(2L, "Hello, World!"), actualGreeting);
    }

    private void waitForStart(RestTemplate restTemplate) {
        while (true) {
            try {
                Thread.sleep(500);
                restTemplate.getForObject(RESOURCE_URL, String.class);
                return;
            } catch (Throwable throwable) {
                // ignoring errors
            }
        }
    }
}

It executes the Spring Boot microservice process first and waits until it starts. To verify that the microservice is started, it sends an HTTP request to the URL where it is expected.
The service is ready for testing after the first successful response. The microservice should send a simple greeting JSON response for an HTTP GET request. Deserialization from JSON into our Greeting DTO is verified at the end of the test. The source code is shared on Github.

Reference: Black Box Testing of Spring Boot Microservice is so easy from our JCG partner Lubos Krnac at the Lubos Krnac Java blog....
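One caveat: the waitForStart loop above polls forever, so a microservice that never starts hangs the build. A bounded variant is easy to write with plain JDK types (a sketch; StartupWaiter and its parameters are my own naming, not part of the original test project):

```java
import java.util.function.BooleanSupplier;

public class StartupWaiter {

    // Polls the readiness check until it succeeds or the deadline passes.
    // Returns true if the service became ready within timeoutMillis.
    public static boolean waitForStart(BooleanSupplier isReady,
                                       long timeoutMillis,
                                       long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (isReady.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

In the test, the readiness check would be the RestTemplate call wrapped in a try/catch, and a false return would fail the test with a clear timeout message instead of blocking the build indefinitely.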

Converting between CompletableFuture and Observable

CompletableFuture<T> from Java 8 is an advanced abstraction over a promise that a value of type T will be available in the future. Observable<T> is quite similar, but it promises an arbitrary number of items in the future, from 0 to infinity. These two representations of asynchronous results are similar to the point where an Observable with just one item can be used instead of a CompletableFuture and vice versa. On the other hand, CompletableFuture is more specialized and, because it’s now part of the JDK, should become prevalent quite soon. Let’s celebrate the RxJava 1.0 release with a short article showing how to convert between the two without losing their asynchronous and event-driven nature.

From CompletableFuture<T> to Observable<T>

CompletableFuture represents one value in the future, so turning it into an Observable is rather simple. When the Future completes with some value, the Observable will emit that value immediately and close the stream:

class FuturesTest extends Specification {

    public static final String MSG = "Don't panic"

    def 'should convert completed Future to completed Observable'() {
        given:
            CompletableFuture<String> future = CompletableFuture.completedFuture("Abc")
        when:
            Observable<String> observable = Futures.toObservable(future)
        then:
            observable.toBlocking().toIterable().toList() == ["Abc"]
    }

    def 'should convert failed Future into Observable with failure'() {
        given:
            CompletableFuture<String> future = failedFuture(new IllegalStateException(MSG))
        when:
            Observable<String> observable = Futures.toObservable(future)
        then:
            observable
                .onErrorReturn({ th -> th.message } as Func1)
                .toBlocking()
                .toIterable()
                .toList() == [MSG]
    }

    CompletableFuture failedFuture(Exception error) {
        CompletableFuture future = new CompletableFuture()
        future.completeExceptionally(error)
        return future
    }
}

The first test of the not-yet-implemented Futures.toObservable() converts a Future into an Observable and makes sure the value is propagated correctly.
The second test creates a failed Future, replaces the failure with the exception’s message and makes sure the exception was propagated. The implementation is much shorter:

public static <T> Observable<T> toObservable(CompletableFuture<T> future) {
    return Observable.create(subscriber ->
        future.whenComplete((result, error) -> {
            if (error != null) {
                subscriber.onError(error);
            } else {
                subscriber.onNext(result);
                subscriber.onCompleted();
            }
        }));
}

NB: Observable.from(Future) exists, however we want to take full advantage of CompletableFuture’s asynchronous operators.

From Observable<T> to CompletableFuture<List<T>>

There are actually two ways to convert an Observable to a Future – creating a CompletableFuture<List<T>> or a CompletableFuture<T> (if we assume the Observable has just one item). Let’s start with the former case, described by the following test cases:

def 'should convert Observable with many items to Future of list'() {
    given:
        Observable<Integer> observable = Observable.just(1, 2, 3)
    when:
        CompletableFuture<List<Integer>> future = Futures.fromObservable(observable)
    then:
        future.get() == [1, 2, 3]
}

def 'should return failed Future when after few items exception was emitted'() {
    given:
        Observable<Integer> observable = Observable.just(1, 2, 3)
            .concatWith(Observable.error(new IllegalStateException(MSG)))
    when:
        Futures.fromObservable(observable)
    then:
        def e = thrown(Exception)
        e.message == MSG
}

Obviously the Future doesn’t complete until the source Observable signals the end of the stream. Thus Observable.never() would never complete the wrapping Future, rather than completing it with an empty list. The implementation is much shorter and sweeter:

public static <T> CompletableFuture<List<T>> fromObservable(Observable<T> observable) {
    final CompletableFuture<List<T>> future = new CompletableFuture<>();
    observable
        .doOnError(future::completeExceptionally)
        .toList()
        .forEach(future::complete);
    return future;
}

The key is Observable.toList(), which conveniently converts from Observable<T> to Observable<List<T>>.
The latter emits one item of type List<T> when the source Observable<T> finishes.

From Observable<T> to CompletableFuture<T>

A special case of the previous transformation happens when we know that the Observable<T> will emit exactly one item. In that case we can convert it directly to CompletableFuture<T>, rather than CompletableFuture<List<T>> with one item only. Tests first:

def 'should convert Observable with single item to Future'() {
    given:
        Observable<Integer> observable = Observable.just(1)
    when:
        CompletableFuture<Integer> future = Futures.fromSingleObservable(observable)
    then:
        future.get() == 1
}

def 'should create failed Future when Observable fails'() {
    given:
        Observable<String> observable = Observable.<String> error(new IllegalStateException(MSG))
    when:
        Futures.fromSingleObservable(observable)
    then:
        def e = thrown(Exception)
        e.message == MSG
}

def 'should fail when single Observable produces too many items'() {
    given:
        Observable<Integer> observable = Observable.just(1, 2)
    when:
        Futures.fromSingleObservable(observable)
    then:
        def e = thrown(Exception)
        e.message.contains("too many elements")
}

Again the implementation is quite straightforward and almost identical:

public static <T> CompletableFuture<T> fromSingleObservable(Observable<T> observable) {
    final CompletableFuture<T> future = new CompletableFuture<>();
    observable
        .doOnError(future::completeExceptionally)
        .single()
        .forEach(future::complete);
    return future;
}

The helper methods above aren’t fully robust yet, but if you ever need to convert between the JDK 8 and RxJava styles of asynchronous computing, this article should be enough to get you started.

Reference: Converting between CompletableFuture and Observable from our JCG partner Tomasz Nurkiewicz at the Java and neighbourhood blog....
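The exceptional-completion plumbing these helpers rely on can be exercised with the plain JDK, without RxJava on the classpath. A minimal sketch (class and method names are my own) mirroring the failedFuture helper from the Spock tests, with exceptionally() playing the role that Observable.onErrorReturn plays in the tests:

```java
import java.util.concurrent.CompletableFuture;

public class FailedFutureDemo {

    // A future that is already completed exceptionally, as in the
    // failedFuture helper of the Spock specification above.
    static CompletableFuture<String> failedFuture(Exception error) {
        CompletableFuture<String> future = new CompletableFuture<>();
        future.completeExceptionally(error);
        return future;
    }

    // Replaces a failure with the exception's message; successful
    // futures pass through untouched.
    static String recover(CompletableFuture<String> future) {
        return future.exceptionally(Throwable::getMessage).join();
    }

    public static void main(String[] args) {
        System.out.println(recover(failedFuture(new IllegalStateException("Don't panic"))));
    }
}
```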

Deployment Pipeline for Java EE 7 with WildFly, Arquillian, Jenkins, and OpenShift

Tech Tip #54 showed how to Arquillianate (Arquillianize?) an existing Java EE project and run those tests in remote mode, where WildFly is running on a known host and port. Tech Tip #55 showed how to run those tests when WildFly is running in OpenShift. Both of these tips used Maven profiles to separate the appropriate Arquillian dependencies in “pom.xml” and <container> configuration in “arquillian.xml” to define where WildFly is running and how to connect to it. This tip will show how to configure Jenkins in OpenShift and invoke these tests from Jenkins. Let’s see it in action first!

The configuration required to connect from Jenkins on OpenShift to a WildFly instance on OpenShift is similar to that required for connecting from a local machine to WildFly on OpenShift. This configuration is specified in “arquillian.xml”, and we can specify some parameters which can then be defined in Jenkins. On a high level, here is what we’ll do:

- Use the code created in Tech Tip #54 and #55 and add configuration for Arquillian/Jenkins/OpenShift
- Enable Jenkins
- Create a new WildFly Test instance
- Configure Jenkins to run tests on the Test instance
- Push the application to Production only if tests pass on the Test instance

Let’s get started!

Remove the existing boilerplate source code, only the src directory, from the WildFly git repo created in Tech Tip #55:

mywildfly> git rm -rf src/ pom.xml
rm 'pom.xml'
rm 'src/main/java/.gitkeep'
rm 'src/main/resources/.gitkeep'
rm 'src/main/webapp/WEB-INF/web.xml'
rm 'src/main/webapp/images/jbosscorp_logo.png'
rm 'src/main/webapp/index.html'
rm 'src/main/webapp/snoop.jsp'
mywildfly> git commit . -m"removing source and pom"
[master 564b275] removing source and pom
 7 files changed, 647 deletions(-)
 delete mode 100644 pom.xml
 delete mode 100644 src/main/java/.gitkeep
 delete mode 100644 src/main/resources/.gitkeep
 delete mode 100644 src/main/webapp/WEB-INF/web.xml
 delete mode 100644 src/main/webapp/images/jbosscorp_logo.png
 delete mode 100644 src/main/webapp/index.html
 delete mode 100644 src/main/webapp/snoop.jsp

Set a new remote to the javaee7-continuous-delivery repository:

mywildfly> git remote add javaee7 https://github.com/arun-gupta/javaee7-continuous-delivery.git
mywildfly> git remote -v
javaee7 https://github.com/arun-gupta/javaee7-continuous-delivery.git (fetch)
javaee7 https://github.com/arun-gupta/javaee7-continuous-delivery.git (push)
origin ssh://54699516ecb8d41cb8000016@mywildfly-milestogo.rhcloud.com/~/git/mywildfly.git/ (fetch)
origin ssh://54699516ecb8d41cb8000016@mywildfly-milestogo.rhcloud.com/~/git/mywildfly.git/ (push)

Pull the code from the new remote:

mywildfly> git pull javaee7 master
warning: no common commits
remote: Counting objects: 62, done.
remote: Compressing objects: 100% (45/45), done.
remote: Total 62 (delta 14), reused 53 (delta 5)
Unpacking objects: 100% (62/62), done.
From https://github.com/arun-gupta/javaee7-continuous-delivery
 * branch            master     -> FETCH_HEAD
 * [new branch]      master     -> javaee7/master
Merge made by the 'recursive' strategy.
 .gitignore                                           |   6 +++
 README.asciidoc                                      |  15 ++++++
 pom.xml                                              | 197 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 src/main/java/org/javaee7/sample/MyApplication.java  |   9 ++++
 src/main/java/org/javaee7/sample/Person.java         |  31 ++++++++++++
 src/main/java/org/javaee7/sample/PersonDatabase.java |  39 ++++++++++++++
 src/main/java/org/javaee7/sample/PersonResource.java |  29 +++++++++++
 src/main/webapp/index.jsp                            |  13 +++++
 src/test/java/org/javaee7/sample/PersonTest.java     |  77 ++++++++++++++++++++++++++++
 src/test/resources/arquillian.xml                    |  26 ++++++++++
 10 files changed, 442 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 README.asciidoc
 create mode 100644 pom.xml
 create mode 100644 src/main/java/org/javaee7/sample/MyApplication.java
 create mode 100644 src/main/java/org/javaee7/sample/Person.java
 create mode 100644 src/main/java/org/javaee7/sample/PersonDatabase.java
 create mode 100644 src/main/java/org/javaee7/sample/PersonResource.java
 create mode 100644 src/main/webapp/index.jsp
 create mode 100644 src/test/java/org/javaee7/sample/PersonTest.java
 create mode 100644 src/test/resources/arquillian.xml

This will bring all the source code, including our REST endpoints, web pages, tests, and the updated “pom.xml” and “arquillian.xml”. The updated “pom.xml” has two new profiles:

<profile>
    <id>openshift</id>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-war-plugin</artifactId>
                <version>2.3</version>
                <configuration>
                    <failOnMissingWebXml>false</failOnMissingWebXml>
                    <outputDirectory>deployments</outputDirectory>
                    <warName>ROOT</warName>
                </configuration>
            </plugin>
        </plugins>
    </build>
</profile>
<profile>
    <id>jenkins-openshift</id>
    <build>
        <plugins>
            <plugin>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.14.1</version>
                <configuration>
                    <systemPropertyVariables>
                        <arquillian.launch>jenkins-openshift</arquillian.launch>
                    </systemPropertyVariables>
                </configuration>
            </plugin>
        </plugins>
    </build>
    <dependencies>
        <dependency>
            <groupId>org.jboss.arquillian.container</groupId>
            <artifactId>arquillian-openshift</artifactId>
            <version>1.0.0.Final-SNAPSHOT</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</profile>

A few points to observe here:

- The “openshift” profile is used when building the application on OpenShift. This is where the application’s WAR file is created and deployed to WildFly.
- A new profile, “jenkins-openshift”, is added; it will be used by the Jenkins instance (to be enabled shortly) in OpenShift to run the tests.
- The “arquillian-openshift” dependency is the same as the one used in Tech Tip #55 and allows us to run Arquillian tests against a WildFly instance on OpenShift.
This profile refers to the “jenkins-openshift” container configuration that will be defined in “arquillian.xml”. The updated “src/test/resources/arquillian.xml” has the following container:

<container qualifier="jenkins-openshift">
    <configuration>
        <property name="namespace">${env.ARQ_DOMAIN}</property>
        <property name="application">${env.ARQ_APPLICATION}</property>
        <property name="libraDomain">rhcloud.com</property>
        <property name="sshUserName">${env.ARQ_SSH_USER_NAME}</property>
        <property name="login">arungupta@redhat.com</property>
        <property name="deploymentTimeoutInSeconds">300</property>
        <property name="disableStrictHostChecking">true</property>
    </configuration>
</container>

This container configuration is similar to the one that was added in Tech Tip #55. The only difference here is that the domain name, application name, and SSH user name are parametrized. The values of these properties are defined in the configuration of the Jenkins instance and allow the tests to run against a separate test node. Two more things need to be done before the changes can be pushed to the remote repository. The first is to create a WildFly Test instance which can be used to run the tests. This can easily be done as shown:

workspaces> rhc app-create mywildflytest jboss-wildfly-8
Application Options
-------------------
Domain:     milestogo
Cartridges: jboss-wildfly-8
Gear Size:  default
Scaling:    no

Creating application 'mywildflytest' ...
Artifacts deployed: ./ROOT.war
done

WildFly 8 administrator added. Please make note of these credentials:

Username: adminITJt7Yh
Password: yXP2mUd1w4_8

run 'rhc port-forward mywildflytest' to access the web admin area on port 9990.

Waiting for your DNS name to be available ... done

Cloning into 'mywildflytest'...
Warning: Permanently added the RSA host key for IP address '' to the list of known hosts.

Your application 'mywildflytest' is now available.

URL:        http://mywildflytest-milestogo.rhcloud.com/
SSH to:     546e3743ecb8d49ca9000014@mywildflytest-milestogo.rhcloud.com
Git remote: ssh://546e3743ecb8d49ca9000014@mywildflytest-milestogo.rhcloud.com/~/git/mywildflytest.git/
Cloned to:  /Users/arungupta/workspaces/javaee7/mywildflytest

Run 'rhc show-app mywildflytest' for more details about your app.

Note that the domain here is milestogo, the application name is mywildflytest, and the SSH user name is 546e3743ecb8d49ca9000014. These will be passed to Arquillian for running the tests. The second is to enable and configure Jenkins. In your OpenShift Console, pick the “mywildfly” application and click on the “Enable Jenkins” link as shown below. Remember, this is not your Test instance, because all the source code lives on the instance created earlier. Provide the appropriate name, e.g. jenkins-milestogo.rhcloud.com in my case, and click on the “Add Jenkins” button. This will provision a Jenkins instance, if not already there, and also configure the project with a script to build and deploy the application. Note down the name and password credentials, and use the credentials to log in to your Jenkins instance. Select the appropriate build, “mywildfly-build” in this case. Scroll down to the “Build” section and add the following script right after “# Run tests here” in the Execute Shell:

export ARQ_DOMAIN=milestogo
export ARQ_SSH_USER_NAME=546e3743ecb8d49ca9000014
export ARQ_APPLICATION=mywildflytest
mvn test -Pjenkins-openshift

Click on “Save” to save the configuration. This will allow the Arquillian tests to run on the Test instance. If the tests pass, then the app is deployed. If the tests fail, then none of the steps after that are executed, and so the app is not deployed. Let’s push the changes to the remote repo now:

mywildfly> git push
Counting objects: 68, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (49/49), done.
Writing objects: 100% (61/61), 8.85 KiB | 0 bytes/s, done.
Total 61 (delta 14), reused 0 (delta 0)
remote: Executing Jenkins build.
remote:
remote: You can track your build at https://jenkins-milestogo.rhcloud.com/job/mywildfly-build
remote:
remote: Waiting for build to schedule............................................................................................Done
remote: Waiting for job to complete................................................................................................................................................................................................................................................................................................................................................................................................Done
remote: SUCCESS
remote: New build has been deployed.
remote: -------------------------
remote: Git Post-Receive Result: success
remote: Deployment completed with status: success
To ssh://546cef93ecb8d4ff37000003@mywildfly-milestogo.rhcloud.com/~/git/mywildfly.git/
   e8f6c61..e9ad206  master -> master

The number of dots indicates the wait for a particular task and will most likely vary between runs.
And Jenkins console (jenkins-milestogo.rhcloud.com/job/mywildfly-build/1/console) shows the output as: ------------------------------------------------------- T E S T S ------------------------------------------------------- Running org.javaee7.sample.PersonTest Nov 20, 2014 2:54:56 PM org.jboss.arquillian.container.openshift.OpenShiftContainer start INFO: Preparing Arquillian OpenShift container at http://mywildflytest-milestogo.rhcloud.com Nov 20, 2014 2:55:48 PM org.jboss.arquillian.container.openshift.OpenShiftRepository push INFO: Pushed to the remote repository ssh://546e3743ecb8d49ca9000014@mywildflytest-milestogo.rhcloud.com/~/git/mywildflytest.git/ Nov 20, 2014 2:56:37 PM org.jboss.arquillian.container.openshift.OpenShiftRepository push INFO: Pushed to the remote repository ssh://546e3743ecb8d49ca9000014@mywildflytest-milestogo.rhcloud.com/~/git/mywildflytest.git/ Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 103.056 sec Nov 20, 2014 2:56:37 PM org.jboss.arquillian.container.openshift.OpenShiftContainer stop INFO: Shutting down Arquillian OpenShift container at http://mywildflytest-milestogo.rhcloud.com Results :Tests run: 2, Failures: 0, Errors: 0, Skipped: 0[INFO] ------------------------------------------------------------------------ [INFO] BUILD SUCCESS [INFO] ------------------------------------------------------------------------ [INFO] Total time: 3:13.069s [INFO] Finished at: Thu Nov 20 14:57:34 EST 2014 [INFO] Final Memory: 10M/101M [INFO] ------------------------------------------------------------------------ + /usr/libexec/openshift/cartridges/jenkins/bin/git_ssh_wrapper.sh 546e36e5e0b8cd4e2a000007@mywildfly-milestogo.rhcloud.com 'gear stop --conditional' Warning: Permanently added 'mywildfly-milestogo.rhcloud.com,' (RSA) to the list of known hosts. Stopping gear... Stopping wildfly cart Sending SIGTERM to wildfly:418673 ... 
+ rsync --delete-after -azO -e /usr/libexec/openshift/cartridges/jenkins/bin/git_ssh_wrapper.sh /var/lib/openshift/546e46304382ec3f29000012//.m2/ '546e36e5e0b8cd4e2a000007@mywildfly-milestogo.rhcloud.com:~/.m2/' Warning: Permanently added 'mywildfly-milestogo.rhcloud.com,' (RSA) to the list of known hosts. + rsync --delete-after -azO -e /usr/libexec/openshift/cartridges/jenkins/bin/git_ssh_wrapper.sh /var/lib/openshift/546e46304382ec3f29000012/app-root/runtime/repo/deployments/ '546e36e5e0b8cd4e2a000007@mywildfly-milestogo.rhcloud.com:${OPENSHIFT_REPO_DIR}deployments/' Warning: Permanently added 'mywildfly-milestogo.rhcloud.com,' (RSA) to the list of known hosts. + rsync --delete-after -azO -e /usr/libexec/openshift/cartridges/jenkins/bin/git_ssh_wrapper.sh /var/lib/openshift/546e46304382ec3f29000012/app-root/runtime/repo/.openshift/ '546e36e5e0b8cd4e2a000007@mywildfly-milestogo.rhcloud.com:${OPENSHIFT_REPO_DIR}.openshift/' Warning: Permanently added 'mywildfly-milestogo.rhcloud.com,' (RSA) to the list of known hosts. + /usr/libexec/openshift/cartridges/jenkins/bin/git_ssh_wrapper.sh 546e36e5e0b8cd4e2a000007@mywildfly-milestogo.rhcloud.com 'gear remotedeploy' Warning: Permanently added 'mywildfly-milestogo.rhcloud.com,' (RSA) to the list of known hosts. 
Preparing build for deployment Deployment id is dff28e58 Activating deployment Deploying WildFly Starting wildfly cart Found listening port Found listening port /var/lib/openshift/546e36e5e0b8cd4e2a000007/wildfly/standalone/deployments /var/lib/openshift/546e36e5e0b8cd4e2a000007/wildfly /var/lib/openshift/546e36e5e0b8cd4e2a000007/wildfly CLIENT_MESSAGE: Artifacts deployed: ./ROOT.war Archiving artifacts Finished: SUCCESS Log files for Jenkins can be viewed as shown: Nov 20, 2014 2:51:11 PM hudson.plugins.openshift.OpenShiftCloud provision INFO: Provisioning new node for workload = 2 and label = mywildfly-build in domain milestogo Nov 20, 2014 2:51:11 PM hudson.plugins.openshift.OpenShiftCloud getOpenShiftConnection INFO: Initiating Java Client Service - Configured for OpenShift Server https://openshift.redhat.com Nov 20, 2014 2:51:11 PM com.openshift.internal.client.RestService request INFO: Requesting GET with protocol 1.2 on https://openshift.redhat.com/broker/rest/api Nov 20, 2014 2:51:11 PM com.openshift.internal.client.RestService request INFO: Requesting GET with protocol 1.2 on https://openshift.redhat.com/broker/rest/user Nov 20, 2014 2:51:11 PM com.openshift.internal.client.RestService request. . 
.INFO: Checking availability of computer hudson.plugins.openshift.OpenShiftSlave@8ce21115 Nov 20, 2014 2:53:35 PM com.openshift.internal.client.RestService request INFO: Requesting GET with protocol 1.2 on https://openshift.redhat.com/broker/rest/domain/milestogo/application/mywildflybldr/gear_groups Nov 20, 2014 2:53:35 PM hudson.plugins.openshift.OpenShiftComputerLauncher launch INFO: Checking SSH access to application mywildflybldr-milestogo.rhcloud.com Nov 20, 2014 2:53:35 PM hudson.plugins.openshift.OpenShiftComputerLauncher launch INFO: Connecting via SSH '546e46304382ec3f29000012' 'mywildflybldr-milestogo.rhcloud.com' '/var/lib/openshift/546e393e5973ca0492000070/app-root/data/.ssh/jenkins_id_rsa' Nov 20, 2014 2:53:35 PM hudson.slaves.NodeProvisioner update INFO: mywildfly-build provisioningE successfully completed. We have now 2 computer(s) Nov 20, 2014 2:53:35 PM hudson.plugins.openshift.OpenShiftComputerLauncher launch INFO: Connected via SSH. Nov 20, 2014 2:53:35 PM hudson.plugins.openshift.OpenShiftComputerLauncher launch INFO: Exec mkdir -p $OPENSHIFT_DATA_DIR/jenkins && cd $OPENSHIFT_DATA_DIR/jenkins && rm -f slave.jar && wget -q --no-check-certificate https://jenkins-milestogo.rhcloud.com/jnlpJars/slave.jar Nov 20, 2014 2:53:42 PM hudson.plugins.openshift.OpenShiftComputerLauncher launch INFO: Slave connected. Nov 20, 2014 2:58:24 PM hudson.model.Run execute INFO: mywildfly-build #1 main build action completed: SUCCESS This shows the application was successfully deployed at mywildfly-milestogo.rhcloud.com/index.jsp and looks like as shown:  Now change “src/main/webapp/index.jsp” to show a different heading. And change  “src/test/java/org/javaee7/sample/PersonTest.java” to make one of the tests fail. Doing “git commit” and “git push” shows the following results on command line: mywildfly> git commit . -m"breaking the test" [master ff2de09] breaking the test 2 files changed, 2 insertions(+), 2 deletions(-) mywildfly> git push Counting objects: 23, done. 
Delta compression using up to 8 threads.
Compressing objects: 100% (8/8), done.
Writing objects: 100% (12/12), 771 bytes | 0 bytes/s, done.
Total 12 (delta 5), reused 0 (delta 0)
remote: Executing Jenkins build.
remote:
remote: You can track your build at https://jenkins-milestogo.rhcloud.com/job/mywildfly-build
remote:
remote: Waiting for build to schedule.......Done
remote: Waiting for job to complete.....................................................................................................................................................................Done
remote: FAILED
remote: !!!!!!!!
remote: Deployment Halted!
remote: If the build failed before the deploy step, your previous
remote: build is still running. Otherwise, your application may be
remote: partially deployed or inaccessible.
remote: Fix the build and try again.
remote: !!!!!!!!
remote: An error occurred executing 'gear postreceive' (exit code: 1)
remote: Error message: CLIENT_ERROR: Failed to execute: 'control post-receive' for /var/lib/openshift/546e36e5e0b8cd4e2a000007/jenkins-client
remote:
remote: For more details about the problem, try running the command again with the '--trace' option.
To ssh://546e36e5e0b8cd4e2a000007@mywildfly-milestogo.rhcloud.com/~/git/mywildfly.git/
   d618fad..ff2de09  master -> master

The key point to note is that deployment halts after the tests fail. You can verify this by revisiting mywildfly-milestogo.rhcloud.com/index.jsp and checking that the updated "index.jsp" is not visible. In short: if the tests pass, the website is updated; if the tests fail, the website is not updated. So you've built a simple deployment pipeline for Java EE 7 using WildFly, OpenShift, Arquillian, and Jenkins!

Reference: Deployment Pipeline for Java EE 7 with WildFly, Arquillian, Jenkins, and OpenShift from our JCG partner Arun Gupta at the Miles to go 2.0 … blog....

Getting Started with Machine Learning

“Machine learning” is a mystical term. Most developers don’t need it at all in their daily work, and the only details we know about it come from some university course five years ago (already forgotten). I’m not a machine learning expert, but I happened to work in a company that does a bit of it, so I got to learn the basics. I never programmed actual machine learning tasks, but I got a good overview. But what is machine learning? It’s instructing the computer to make sense of large amounts of data (#bigdata hashtag – check). In what ways?

- classifying a new entry into existing classes – is this email spam, is this news article about sport or politics, is this symbol the letter “a”, or “b”, or “c”, is this object in front of the self-driving car a pedestrian or a road sign?
- predicting the value of a new entry (regression problems) – how much does my car cost, what will the stock price be tomorrow?
- grouping entries into classes that are not known in advance (clustering) – what are your market segments, what are the communities within a given social network? (And many more applications.)

How? With many different algorithms and data structures, which fortunately have already been written by computer scientists, so developers can just reuse them (with a fair amount of understanding, of course). But if the algorithms are already written, surely it must be easy to use machine learning? No. Ironically, the hardest part of machine learning is the part where a human tells the machine what is important about the data. This process is called feature selection: finding the features that describe the data in a way the computer can use to identify meaningful patterns. I am no machine learning expert, but the way I see it, this step is what most machine learning engineers (or data scientists) do on a day-to-day basis. They aren’t inventing new algorithms; they are trying to figure out what combinations of features for given data give the best results.
And it’s a process with many “heuristics” that I have no experience with. (That’s an oversimplification, of course, as my colleagues were indeed doing research and proposing improvements to algorithms, but that’s the scientific aspect of things.) I’ll now limit myself to classification problems and leave the rest aside. And when I say “best results”, how is that measured? There are the metrics of “precision” and “recall” (they are most easily used for classification into two groups, but there are ways to apply them to multi-class or multi-label classification). If you have to classify an email as spam or not spam, your precision is the percentage of emails properly marked as spam out of all the emails marked as spam. And the recall is the percentage of emails properly marked as spam out of the total number of actual spam emails. So if you have 200 emails, 100 of them are spam, and your program marks 80 of them as spam correctly and 20 incorrectly, you have an 80% precision (80 / (80 + 20)) and an 80% recall (80 / 100 actual spam emails). Good results are achieved when you score higher on these two metrics, i.e. your spam filter is good if it correctly detects most spam emails and also doesn’t mark non-spam emails as spam. The process of feeding data into the algorithm is simple. You usually have two sets of data – the training set and the evaluation set. You normally start with one set and split it in two (the training set should be the larger one). These sets contain the values for all the features that you have identified for the data in question. You first “train” your classifier (a statistical model) with the training set (if you want to know how training happens, read about the various algorithms), and then run the evaluation set through it to see how many items were correctly classified (the evaluation set has the right answer in it, so you compare that to what the classifier produced as output).
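The arithmetic behind these two metrics is simple enough to capture in a few lines. Here is a minimal sketch (the class and method names are mine, not from any library) using the spam numbers from the example above:

```java
public class SpamMetrics {

    // precision: of everything marked as spam, how much really was spam
    static double precision(int truePositives, int falsePositives) {
        return (double) truePositives / (truePositives + falsePositives);
    }

    // recall: of all actual spam, how much was caught
    static double recall(int truePositives, int falseNegatives) {
        return (double) truePositives / (truePositives + falseNegatives);
    }

    public static void main(String[] args) {
        // 100 actual spam emails: 80 caught, 20 missed; 20 non-spam wrongly flagged
        System.out.println(precision(80, 20)); // 80 / (80 + 20) = 0.8
        System.out.println(recall(80, 20));    // 80 / (80 + 20) = 0.8
    }
}
```

Note how the same 0.8 comes out of two different questions: precision penalizes false alarms, recall penalizes misses.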
Let me illustrate that with my first actual machine learning code (with the big disclaimer that the task is probably not well suited for machine learning, as the data set is very small). I am a member (and currently chair) of the problem committee (and jury) of the International Linguistics Olympiad. We construct linguistics problems, combine them into problem sets and assign them at the event each year. But we are still not good at assessing how hard a problem is for high-school students. Even though many of us were once competitors in such olympiads, we now know “too much” to be able to assess the difficulty. So I decided to apply machine learning to the problem. As mentioned above, I had to start with selecting the right features. After a couple of iterations, I ended up using: the number of examples in a problem, the average length of an example, the number of assignments, the number of linguistic components to discover as part of the solution, and whether the problem data is scrambled or not. The complexity (easy, medium, hard) comes from the actual scores of competitors at the olympiad (average score of 0–8 points = hard, 8–12 = medium, >12 = easy). I am not sure whether these features are related to problem complexity, hence I experimented with adding and removing some. I put the feature data into a Weka arff file, which looks like this (attributes = features):

@RELATION problem-complexity

@ATTRIBUTE examples NUMERIC
@ATTRIBUTE avgExampleSize NUMERIC
@ATTRIBUTE components NUMERIC
@ATTRIBUTE assignments NUMERIC
@ATTRIBUTE scrambled {true,false}
@ATTRIBUTE complexity {easy,medium,hard}

@DATA
34,6,11,8,false,medium
12,21,7,17,false,medium
14,11,11,17,true,hard
13,16,9,14,false,hard
16,35,7,17,false,hard
20,9,7,10,false,hard
24,5,8,6,false,medium
9,14,13,4,false,easy
18,7,17,7,true,hard
18,7,12,10,false,easy
10,16,9,11,false,hard
11,3,17,13,true,easy
...
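The mapping from average competitor scores to a complexity label is a plain threshold rule, which could be sketched like this (the class name is mine, and the handling of a boundary score of exactly 8 or 12 is an assumption, since the text leaves it open):

```java
public class ComplexityLabel {

    // average score 0-8 points = hard, 8-12 = medium, >12 = easy (per the text)
    static String label(double averageScore) {
        if (averageScore > 12) {
            return "easy";
        } else if (averageScore > 8) {
            return "medium";
        }
        return "hard";
    }

    public static void main(String[] args) {
        System.out.println(label(5.5));  // hard
        System.out.println(label(10.0)); // medium
        System.out.println(label(14.2)); // easy
    }
}
```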
The evaluation set looks exactly like that, but smaller (in my case, only 3 items so far, waiting for more entries). Weka was recommended as a good tool (at least for starting out), and it has a lot of algorithms included, which one can simply reuse. Following the getting started guide, I produced the following simple code:

public static void main(String[] args) throws Exception {
    ArffLoader loader = new ArffLoader();
    loader.setFile(new File("problem_complexity_train_3.arff"));
    Instances trainingSet = loader.getDataSet();

    // this is the complexity; here we specify what our classes are,
    // into which we want to classify the data
    int classIdx = 5;

    ArffLoader loader2 = new ArffLoader();
    loader2.setFile(new File("problem_complexity_test_3.arff"));
    Instances testSet = loader2.getDataSet();

    trainingSet.setClassIndex(classIdx);
    testSet.setClassIndex(classIdx);

    // using the LMT classification algorithm. Many more are available
    Classifier classifier = new LMT();
    classifier.buildClassifier(trainingSet);

    Evaluation eval = new Evaluation(trainingSet);
    eval.evaluateModel(classifier, testSet);

    System.out.println(eval.toSummaryString());

    // Get the confusion matrix
    double[][] confusionMatrix = eval.confusionMatrix();
    ....
}

A comment about the choice of the algorithm: having insufficient knowledge, I just tried a few and selected the one that produced the best result. After performing the evaluation, you can get the so-called confusion matrix (via eval.confusionMatrix()), which you can use to see the quality of the result. When you are satisfied with the results, you can proceed to classify new entries whose complexity you don’t know. To do that, you have to provide a data set, and the only difference from the other two is that you put a question mark instead of the class (easy, medium, hard). E.g.:

...
@DATA
34,6,11,8,false,?
12,21,7,17,false,?
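To make the confusion matrix mentioned above concrete: by Weka's convention its rows are the actual classes and its columns the predicted ones, so the diagonal holds the correctly classified items. A small Weka-independent helper (the class name and the 3x3 example matrix are mine, purely for illustration) computes overall accuracy from it:

```java
public class ConfusionStats {

    // fraction of correctly classified instances: diagonal sum over total sum
    static double accuracy(double[][] matrix) {
        double correct = 0, total = 0;
        for (int i = 0; i < matrix.length; i++) {
            for (int j = 0; j < matrix[i].length; j++) {
                total += matrix[i][j];
                if (i == j) {
                    correct += matrix[i][j];
                }
            }
        }
        return correct / total;
    }

    public static void main(String[] args) {
        // hypothetical matrix for {easy, medium, hard}: 5 of 6 items on the diagonal
        double[][] m = {{2, 0, 0}, {0, 1, 1}, {0, 0, 2}};
        System.out.println(accuracy(m));
    }
}
```

The same array shape is what eval.confusionMatrix() returns, so such a helper can be applied to it directly.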
Then you can run the classifier:

ArffLoader loader = new ArffLoader();
loader.setFile(new File("unclassified.arff"));
Instances dataSet = loader.getDataSet();
// the class index must be set here as well, otherwise Weka
// cannot compute a class distribution for the instances
dataSet.setClassIndex(5);

DecimalFormat df = new DecimalFormat("#.##");
for (Enumeration<Instance> en = dataSet.enumerateInstances(); en.hasMoreElements();) {
    double[] results = classifier.distributionForInstance(en.nextElement());
    for (double result : results) {
        System.out.print(df.format(result) + " ");
    }
    System.out.println();
}

This will print, for each of your entries, the probability of it falling into each of the classes. As we are going to use this output only as a hint towards the complexity, and won’t use it as a final decision, it is fine if it yields wrong results sometimes. But in many machine learning problems there isn’t a human evaluation of the result, so getting higher accuracy is the most important task. How does this approach scale, though? Can I reuse the code above in a high-volume production system? On the web you normally do not run machine learning tasks in real time (you run them as scheduled tasks instead), so the answer is probably “yes”. I still feel like a novice in the field, but having done one actual task made me want to share my tiny bit of experience and knowledge. Meanwhile I’m following the Stanford machine learning course on Coursera, which can give you way more details. Can we, as developers, use machine learning in our work projects? If we have large amounts of data – yes. It’s not that hard to get started, and although we will probably make stupid mistakes, it’s an interesting thing to explore and may bring value to the product we are building.

Reference: Getting Started with Machine Learning from our JCG partner Bozhidar Bozhanov at the Bozho’s tech blog blog....
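Since distributionForInstance() returns one probability per class, turning that output into a predicted label is just an argmax over the array. A minimal sketch (the helper is mine; the label order assumes the class values are declared as {easy,medium,hard}, matching the ARFF file above):

```java
public class PredictedLabel {

    // the index of the highest probability decides the predicted class
    static String predict(double[] distribution, String[] classLabels) {
        int best = 0;
        for (int i = 1; i < distribution.length; i++) {
            if (distribution[i] > distribution[best]) {
                best = i;
            }
        }
        return classLabels[best];
    }

    public static void main(String[] args) {
        String[] labels = {"easy", "medium", "hard"};
        System.out.println(predict(new double[]{0.15, 0.65, 0.20}, labels)); // medium
    }
}
```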

Developing a Data Export Utility with PrimeFaces

My day job involves heavy use of data. We use relational databases to store everything, because we rely on enterprise-level data management. Sometimes it is useful to have the ability to extract the data into a simple format, such as a spreadsheet, so that we can manipulate it as needed. This post outlines the steps that I’ve taken to produce an effective and easy-to-use JSF-based data export utility using PrimeFaces 5.0. The export utility produces a spreadsheet, including column headers. The user has the ability to select which database fields to export, and in which order they should be exported. We want to ensure that we have a clean user interface that is intuitive. For that reason, I chose not to display any data on the screen. Rather, the user interface contains a PrimeFaces PickList component that lists the different data fields to choose from, along with a button to produce the export. Let’s begin by setting up the database infrastructure to make this export utility possible. For this post, I’ve enhanced the AcmePools application, which was developed via my article posted on OTN entitled PrimeFaces in the Enterprise. The export utility allows one to export customer data into a spreadsheet. The customer data is included in the sample database which is installed within Apache Derby by NetBeans, or you can use the SQL script for this post. To follow along with the creation of this export utility, please download or create the AcmePools project within your environment. There are two parts to the data export utility: the first is a PrimeFaces PickList component for the user to select which fields to export, and the second is an export button which will extract the selected field contents into a spreadsheet. The end result will resemble a user interface that looks like Figure 1.

Developing the PickList Component

To begin, create the data infrastructure to support the PickList component.
This consists of a single database table to hold column names and labels for the entity data you wish to export, and optionally a database sequence to populate the primary key for that table. In this case, the database table is named COLUMN_MODEL, and we populate the table with the entity field names that correspond to the database column names for the CUSTOMER database table.

-- Add support for data export
create table column_model(
    id int primary key,
    column_name varchar(30),
    column_label varchar(150));

-- Optional sequence for primary key generation
create sequence column_model_s start with 1 increment by 1;

-- Load with field (database column) names
insert into column_model values(1, 'addressline1', 'Address Line 1');
insert into column_model values(2, 'addressline2', 'Address Line 2');
insert into column_model values(3, 'city', 'City');
insert into column_model values(4, 'creditLimit', 'Credit Limit');
insert into column_model values(5, 'customerId', 'Customer Id');
insert into column_model values(6, 'discountCode', 'Discount Code');
insert into column_model values(7, 'email', 'Email');
insert into column_model values(8, 'fax', 'Fax');
insert into column_model values(9, 'name', 'Name');
insert into column_model values(10, 'phone', 'Phone');
insert into column_model values(11, 'state', 'State');
insert into column_model values(12, 'zip', 'Zip');

Next, create an entity class that can be used for accessing the column data from within the component. If you use an IDE such as NetBeans, this can be done very easily via a wizard. If using NetBeans, right-click on the com.acme.acmepools.entity package, and select "New" -> "Entity Classes from Database", and then choose the data source for our sample database. When the list of tables populates, select the COLUMN_MODEL table, as shown in Figure 2. Lastly, choose "Next" and "Finish" to create the entity class.
Once completed, the entity class entitled ColumnModel should look as follows:

package com.acme.acmepools.entity;

import java.io.Serializable;
import java.math.BigDecimal;
import javax.persistence.Basic;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.NamedQueries;
import javax.persistence.NamedQuery;
import javax.persistence.Table;
import javax.validation.constraints.NotNull;
import javax.validation.constraints.Size;
import javax.xml.bind.annotation.XmlRootElement;

/**
 *
 * @author Juneau
 */
@Entity
@Table(name = "COLUMN_MODEL")
@XmlRootElement
@NamedQueries({
    @NamedQuery(name = "ColumnModel.findAll", query = "SELECT c FROM ColumnModel c"),
    @NamedQuery(name = "ColumnModel.findById", query = "SELECT c FROM ColumnModel c WHERE c.id = :id"),
    @NamedQuery(name = "ColumnModel.findByColumnName", query = "SELECT c FROM ColumnModel c WHERE c.columnName = :columnName"),
    @NamedQuery(name = "ColumnModel.findByColumnLabel", query = "SELECT c FROM ColumnModel c WHERE c.columnLabel = :columnLabel")})
public class ColumnModel implements Serializable {

    private static final long serialVersionUID = 1L;
    @Id
    @Basic(optional = false)
    @NotNull
    @Column(name = "ID")
    private BigDecimal id;
    @Size(max = 30)
    @Column(name = "COLUMN_NAME")
    private String columnName;
    @Size(max = 150)
    @Column(name = "COLUMN_LABEL")
    private String columnLabel;

    public ColumnModel() {
    }

    public ColumnModel(BigDecimal id) {
        this.id = id;
    }

    public BigDecimal getId() {
        return id;
    }

    public void setId(BigDecimal id) {
        this.id = id;
    }

    public String getColumnName() {
        return columnName;
    }

    public void setColumnName(String columnName) {
        this.columnName = columnName;
    }

    public String getColumnLabel() {
        return columnLabel;
    }

    public void setColumnLabel(String columnLabel) {
        this.columnLabel = columnLabel;
    }

    @Override
    public int hashCode() {
        int hash = 0;
        hash += (id != null ? id.hashCode() : 0);
        return hash;
    }

    @Override
    public boolean equals(Object object) {
        // TODO: Warning - this method won't work in the case the id fields are not set
        if (!(object instanceof ColumnModel)) {
            return false;
        }
        ColumnModel other = (ColumnModel) object;
        if ((this.id == null && other.id != null) || (this.id != null && !this.id.equals(other.id))) {
            return false;
        }
        return true;
    }

    @Override
    public String toString() {
        return "com.acme.acmepools.entity.ColumnModel[ id=" + id + " ]";
    }
}

Next, create an EJB session bean for the newly generated entity class so that the component can query the column data. You can use your IDE for this as well, if you'd like. If using NetBeans, right-click on the com.acme.acmepools.session package, and select "New" -> "Session Beans for Entity Classes". Once the dialog opens, select the entity class "com.acme.acmepools.entity.ColumnModel" from the left-hand list, and click "Finish" (Figure 3). After the session bean has been created, add a method named findId(), which can be used for returning the column id value based upon a specified column name.
The full sources for the ColumnModelFacade should look as follows:

package com.acme.acmepools.session;

import com.acme.acmepools.entity.ColumnModel;
import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

/**
 *
 * @author Juneau
 */
@Stateless
public class ColumnModelFacade extends AbstractFacade<ColumnModel> {

    @PersistenceContext(unitName = "com.acme_AcmePools_war_AcmePools-1.0-SNAPSHOTPU")
    private EntityManager em;

    @Override
    protected EntityManager getEntityManager() {
        return em;
    }

    public ColumnModelFacade() {
        super(ColumnModel.class);
    }

    public ColumnModel findId(String columnName) {
        return (ColumnModel) em.createQuery("select object(o) from ColumnModel as o "
                + "where o.columnName = :columnName")
                .setParameter("columnName", columnName)
                .getSingleResult();
    }
}

Next, create some helper classes that will be utilized for loading and managing the data within the PickList component. The first class is named ColumnBean, and it is used to store the entity data, which is later passed off to the PickList for use.
The code for ColumnBean is a simple POJO:

package com.acme.acmepools.bean;

import java.math.BigDecimal;

/**
 *
 * @author juneau
 */
public class ColumnBean {

    private BigDecimal id;
    private String columnName;
    private String columnLabel;

    public ColumnBean(BigDecimal id, String columnName, String columnLabel) {
        this.id = id;
        this.columnName = columnName;
        this.columnLabel = columnLabel;
    }

    /**
     * @return the id
     */
    public BigDecimal getId() {
        return id;
    }

    /**
     * @param id the id to set
     */
    public void setId(BigDecimal id) {
        this.id = id;
    }

    /**
     * @return the columnName
     */
    public String getColumnName() {
        return columnName;
    }

    /**
     * @param columnName the columnName to set
     */
    public void setColumnName(String columnName) {
        this.columnName = columnName;
    }

    /**
     * @return the columnLabel
     */
    public String getColumnLabel() {
        return columnLabel;
    }

    /**
     * @param columnLabel the columnLabel to set
     */
    public void setColumnLabel(String columnLabel) {
        this.columnLabel = columnLabel;
    }
}

The PickList component needs to use a PrimeFaces DualListModel for accessing and updating the data. Therefore, we must implement a class that can be used for coercing the entity data into our ColumnBean POJO, and then storing it into the DualListModel so that it can be utilized by the PickList component. In the following class, entitled PickListBean, the constructor accepts a List<ColumnModel> (the entity data) as an argument, performs the coercion, and then stores the result into a DualListModel<ColumnBean> collection for use by the component.
package com.acme.acmepools.bean;

import java.util.ArrayList;
import java.util.List;

import com.acme.acmepools.entity.ColumnModel;
import org.primefaces.model.DualListModel;

/**
 *
 * @author juneau
 */
public class PickListBean {

    private DualListModel<ColumnBean> columns;

    private List<ColumnBean> source = null;
    private List<ColumnBean> target = null;

    public PickListBean(List<ColumnModel> columnModelList) {
        // Columns
        source = new ArrayList<ColumnBean>();
        target = new ArrayList<ColumnBean>();
        for (ColumnModel column : columnModelList) {
            ColumnBean bean = new ColumnBean(column.getId(), column.getColumnName(), column.getColumnLabel());
            source.add(bean);
        }
        columns = new DualListModel<ColumnBean>(source, target);
    }

    public DualListModel<ColumnBean> getColumns() {
        return columns;
    }

    public void setColumns(DualListModel<ColumnBean> columns) {
        this.columns = columns;
    }
}

Lastly, we need to create a controller class to access all of this data. To do so, create a class named ColumnModelController within the com.acme.acmepools.jsf package, and make it a CDI managed bean by annotating it with @Named and @SessionScoped. Make the class implement Serializable. The initial controller class should look as follows (we will update it later to include methods that facilitate the export):

@Named
@SessionScoped
public class ColumnModelController implements Serializable {

    @EJB
    ColumnModelFacade ejbFacade;

    private PickListBean pickListBean;
    private List<ColumnModel> columns;

    public DualListModel<ColumnBean> getColumns() {
        pickListBean = new PickListBean(ejbFacade.findAll());
        return pickListBean.getColumns();
    }

    public void setColumns(DualListModel<ColumnBean> columns) {
        pickListBean.setColumns(columns);
    }
}

As you can see, the getColumns() method queries the ColumnModel entity, which populates the DualListModel<ColumnBean> via the PickListBean constructor. That takes care of the database infrastructure and business logic… now let’s look at the PrimeFaces component that is used for the PickList.
The following excerpt, taken from the WebPages/poolCustomer/CustomerExport.xhtml view, contains the markup for the PickList component:

<p:panel header="Choose Columns for Export">
    <p:pickList var="column" value="#{columnModelController.columns}"
                itemLabel="#{column.columnLabel}" itemValue="#{column.columnName}"
                showSourceControls="true" showTargetControls="true" effect="bounce">
        <f:facet name="sourceCaption">Columns</f:facet>
        <f:facet name="targetCaption">Selected</f:facet>
    </p:pickList>
</p:panel>

As you can see, the PickList uses columnModelController.columns for its data, and displays the columnLabel field as the name of each entity field available for export. The titles for the source and target PickList windows are customizable via facets.

Adding the Export Functionality

Now that we’ve developed a functional pick list, we need to do something with the data that is selected. In this exercise, we will use a PrimeFaces DataExporter component to extract the data and store it into an Excel spreadsheet. In reality, we need to incorporate a DataTable into the view to display the data first, and then we can use the DataExporter component to export the data which resides in the table. To construct the DataTable that will be used for displaying the data, we need to add a few methods to the ColumnModelController class. These methods will allow us to process the DataTable dynamically, so that we can construct columns based upon those that are chosen within the PickList. The DataTable will query all of the Customer data, and then display only those columns of data that are selected within the PickList. (We could modify this query by adding a filter, but that is beyond the scope of this post.)
To load the table with data, we simply call upon the com.acme.acmepools.jsf.CustomerController getItems() method to return all of the data:

public List<Customer> getItems() {
    if (items == null) {
        items = getFacade().findAll();
    }
    return items;
}

Now let’s add the necessary methods to the ColumnModelController so that we can dynamically construct the table. First, add a method that will be invoked when we click the “Export” button. This method is responsible for building the currently selected column list:

public void preProcess(Object document) {
    System.out.println("starting preprocess");
    updateColumns();
}

Next, let’s take a look at the code for updateColumns(), which is invoked by the preProcess() method:

/**
 * Called as preprocessor to export (after clicking Excel icon) to capture
 * the table component and call upon createDynamicColumns()
 */
public void updateColumns() {
    // reset table state
    UIComponent table = FacesContext.getCurrentInstance().getViewRoot().findComponent(":customerExportForm:customerTable");
    table.setValueExpression("sortBy", null);
    // update columns
    createDynamicColumns();
}

The updateColumns() method binds a UIComponent to the table within the JSF view. It then has the capability of providing sorting, if elected. Subsequently, let’s now look at the createDynamicColumns() method that is called upon.

private void createDynamicColumns() {
    String[] columnKeys = this.getIncludedColumnsByName().split(",");
    columns = new ArrayList<>();
    for (String columnKey : columnKeys) {
        String key = columnKey.trim();
        columns.add(new ColumnModel(getColumnLabel(key), key));
    }
}

The createDynamicColumns() method does a few things. First, it captures all of the selected columns from the PickList and stores them into a String[] named columnKeys.
To do this we use the helper method named getIncludedColumnsByName(), and split the results by comma. The source for this method is as follows; it basically grabs the currently selected columns from the PickListBean and appends each of them to a String, which is then returned to the caller.

public String getIncludedColumnsByName() {
    String tempIncludedColString = null;
    System.out.println("Number of included columns:" + pickListBean.getColumns().getTarget().size());
    List localSource = pickListBean.getColumns().getTarget();
    for (int x = 0; x <= localSource.size() - 1; x++) {
        String tempModel = (String) localSource.get(x);
        if (tempIncludedColString == null) {
            tempIncludedColString = tempModel;
        } else {
            tempIncludedColString = tempIncludedColString + "," + tempModel;
        }
    }
    return tempIncludedColString;
}

Next, the createDynamicColumns() method uses a loop to parse through each of the selected columns within the String[], and adds them to the column list, which is going to be used to construct the DataTable with the appropriate columns.
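As a side note, the manual comma-append loop in getIncludedColumnsByName() can be condensed with String.join. This sketch is my own variant, assuming the target list holds plain column-name strings (as the cast in the original code suggests); it behaves the same way, including returning null for an empty selection:

```java
import java.util.List;

public class IncludedColumns {

    static String joinColumns(List<String> selected) {
        // mirrors the original: null when nothing is selected, comma-separated otherwise
        return selected.isEmpty() ? null : String.join(",", selected);
    }

    public static void main(String[] args) {
        System.out.println(joinColumns(List.of("name", "city", "zip"))); // name,city,zip
    }
}
```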
Now let’s take a look at the markup that is used to construct the DataExport utility:

<p:dataTable id="customerTable" widgetVar="customerTable" var="item"
             value="#{customerController.items}" rendered="false">
    <p:columns value="#{columnModelController.dynamicColumns}" var="column"
               columnIndexVar="colIndex">
        <f:facet name="header">
            <h:outputText value="#{column.header}"/>
        </f:facet>
        <h:outputText value="#{item[column.property]}"/>
    </p:columns>
</p:dataTable>

<hr/>

<h:outputText value="Type of file to export: "/>
<h:commandLink>
    <p:graphicImage value="/faces/resources/images/excel.png"/>
    <p:dataExporter id="propertyXlsExport" type="xls" target="customerTable"
                    fileName="customers" preProcessor="#{columnModelController.preProcess}"/>
</h:commandLink>

As you can see, the DataTable is set not to render, because we really do not wish to display it. Instead, we wish to export its contents using the DataExporter component. To construct the DataTable dynamically, the columns call upon the columnModelController.dynamicColumns method to return the dynamic column list. This method looks as follows:

public List<ColumnModel> getDynamicColumns() {
    return columns;
}

Within the DataExporter component, the columnModelController.preProcess method is assigned to the preProcessor attribute to initiate the dynamic column list. The target is set to the customerTable widget, which is the DataTable that we’ve dynamically constructed based upon the selected columns. In order to export this to an xls spreadsheet, you must add the org.apache.poi dependency to the Maven POM for the project, as follows:

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.7</version>
</dependency>

That’s it… now you should have a fully functional data export utility using PrimeFaces components. The complete sources are available on GitHub using the link below.
This code was written in NetBeans IDE 8.0 and deployed to GlassFish 4.0. I utilized PrimeFaces 5.0 for this project.

GitHub Sources: https://github.com/juneau001/AcmePools

Reference: Developing a Data Export Utility with PrimeFaces from our JCG partner Josh Juneau at the Josh’s Dev Blog – Java, Java EE, Jython, Oracle, and More… blog....

Getting Started with PrimeFaces Mobile

Introduction

If you have developed an application that utilizes PrimeFaces, or if you are planning to develop a web application for use on desktop and mobile devices, then consider PrimeFaces Mobile for your mobile implementation. This blog post will cover some basics to help you get started developing a mobile interface for an existing PrimeFaces application. However, the same procedures can be applied to an application that is being written from scratch. This article is a precursor to an article that I am currently writing for OTN, which will cover the PrimeFaces Mobile API in more detail. That article will be published later this year.

Getting in the Mobile Mindset

One of the most important pieces of a mobile project is getting into the mobile mindset. While you may have a set of components that you are comfortable using in standard web applications, these components may not provide the best experience when transferred to the smaller screen. For that reason, you need to think about how your users are going to interact with your application on the small screen, and provide them with the most convenient user interface possible. One thing to consider is the amount of text that you will want your users to be typing. If they are on a small device, it may be cumbersome to type lots of text, so we will want to provide them with easy-to-use components, allowing them to type as little as possible, and even to select from lists instead. We also need to consider real estate (no, not the housing market). Adding a menu to the top or bottom of the screen may not be beneficial to the user if they do not have enough screen left to easily navigate the application. These are just a couple of the issues that arise when developing applications for a mobile device. PrimeFaces Mobile is well suited to assist in these areas, since it is built upon one of the leading mobile HTML5-based UI frameworks.
PrimeFaces Mobile consists of many UI components that can enable users to be highly productive on a mobile device. If you take a look at the PrimeFaces Showcase, you can see many of these mobile components in action. This gives you an idea of how the components look, and how they react for the user. It is recommended to visit the PrimeFaces Mobile showcase on a mobile device such as a smartphone or tablet to gain the best understanding of how they’ll behave.

Creating a Mobile Root

Now that you have a basic understanding of some mobile design concepts, let’s take a look at how easy it is to get started creating mobile views using PrimeFaces Mobile. Before PrimeFaces 5, mobile was a separate download that needed to be included in your project. Now it is easier than ever to get going with PrimeFaces Mobile, as it is packaged as part of PrimeFaces 5. This makes it easy to build enterprise web applications on PrimeFaces for the standard browser, and then build separate views for use on mobile devices, oftentimes utilizing the same back-end business methods for each. I recommend creating a view that is dedicated as a starting point or “root” for mobile device users. I also recommend creating a separate MobileNavigationController class to handle navigation throughout the mobile views, as needed. We can utilize the mobile root view to set the hook for using the MobileNavigationController vs. the standard web application navigation. For the purposes of this article, let’s simply call our mobile root mobileRoot.xhtml.
In this case, mobileRoot.xhtml may look something like the following:

<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:f="http://xmlns.jcp.org/jsf/core"
      xmlns:h="http://xmlns.jcp.org/jsf/html"
      xmlns:p="http://primefaces.org/ui"
      xmlns:ui="http://xmlns.jcp.org/jsf/facelets">

    <f:metadata>
        <f:viewAction action="#{mobileNavigationController.doMobile()}" id="useMobile"/>
    </f:metadata>

    <h:head>
        <h:outputScript library="js" name="addtohomescreen.js"/>
        <h:outputStylesheet library="css" name="addtohomescreen.css"/>
        <script>
            addToHomescreen();
        </script>
    </h:head>
    <h:body></h:body>
</html>

In the view above, a JSF viewAction is used to initiate the MobileNavigationController doMobile() method, which sets the mobile UI into motion. From this point, the navigation can take the user to the primary mobile view of the application, and it can also set any other necessary configurations. The addtohomescreen.js script (http://cubiq.org/add-to-home-screen) can also be used to supply a nice button that recommends mobile device users add the mobile application to their home screen for a richer experience. I will address some additional custom configurations for full-screen web applications in a future post or the upcoming OTN article.

Creating a Simple Mobile View

Once we’ve provided our users with a clear path to access the mobile views, we need to ensure that the PrimeFaces Mobile render kit is being used to display the mobile views. To flag a view for use with PrimeFaces Mobile, supply the “renderKitId” attribute in the <f:view> tag of your view, and apply PRIMEFACES_MOBILE as the value.

<f:view renderKitId="PRIMEFACES_MOBILE">

Another requirement for building a PrimeFaces Mobile view is to add the mobile namespace (xmlns:pm="http://primefaces.org/mobile"), as it will be used for each of the PrimeFaces Mobile specific components.
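The MobileNavigationController itself is not shown in this post. The following is a minimal sketch of what such a bean might look like; the class body, scope, and outcome string are my own assumptions, not code from the Acme Pools application:

```java
// Hypothetical sketch of the navigation bean referenced by mobileRoot.xhtml.
// In a real JSF application this class would be annotated with @Named and
// @SessionScoped; the outcome string below is an assumption.
public class MobileNavigationController {

    private boolean mobileUser;

    // Invoked by the f:viewAction in mobileRoot.xhtml; flags the session
    // as mobile and navigates to the primary mobile view.
    public String doMobile() {
        mobileUser = true;
        return "mobile/index?faces-redirect=true";
    }

    public boolean isMobileUser() {
        return mobileUser;
    }
}
```

The flag set in doMobile() can then be consulted by other beans or views to decide between the mobile and standard navigation paths.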
It is also a good idea to include the JSF passthrough (xmlns:pt="http://xmlns.jcp.org/jsf/passthrough") namespace, as we may wish to make use of some HTML5-specific features. A mobile page consists of a header, content, and a footer. Each mobile page is enclosed within the <pm:page> tag. A mobile view can consist of a single page enclosed in <pm:page>, or multiple pages, each with their own identifiers. In this example, we will create two views that constitute two mobile pages; the second page is accessed via a link on the first page. It is possible to utilize Facelets templates to build an entire mobile application solution, but in this example we will create each view separately. It is also possible to develop using the “single page” application strategy that is currently quite popular… we will cover more on that in the OTN article. The PrimeFaces Mobile example in this post and also the upcoming OTN article builds upon the Acme Pools example that was used in my “PrimeFaces in the Enterprise” article for OTN (http://www.oracle.com/technetwork/articles/java/java-primefaces-2191907.html). In the full web version, the root view includes a listing of Acme Pool customers in a table view (Figure 1). We would like to transform this view (and the others) to work better on a mobile device, and also allow selection of each row, which will take us to more information on the selected customer. For the purposes of this post, we will work with the initial customer view to convert it into a mobile view. The view will contain a list of customers, and if you select a particular row in the view, then more information on the selected customer will be displayed. To display a table using PrimeFaces Mobile, you must make use of the DataList component, which provides a convenient “clickable” link or button for each row of data. The DataList differs from a DataTable in that there are no columns in a DataList; rather, there is one group of related data for each row of data.
The group of data should be wrapped with a clickable link, which will then provide navigation for the user to the second view displaying more details on the selected item. The following code is used to develop the mobile UI for the customer data list.

Listing 1: Mobile View (mobile/index.xhtml)

<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:h="http://xmlns.jcp.org/jsf/html"
      xmlns:p="http://primefaces.org/ui"
      xmlns:f="http://xmlns.jcp.org/jsf/core"
      xmlns:pm="http://primefaces.org/mobile"
      xmlns:pt="http://xmlns.jcp.org/jsf/passthrough">
    <f:view renderKitId="PRIMEFACES_MOBILE">
        <h:head></h:head>
        <h:body>
            <pm:page id="customerListing">
                <pm:header>
                    Acme Pools
                </pm:header>
                <pm:content>
                    <h:form id="indexForm">
                        <p:panel header="Acme Pools Customer Listing">
                            <p:dataList id="datalist"
                                        value="#{customerController.items}"
                                        var="item"
                                        paginator="true"
                                        pt:data-role="listview"
                                        pt:data-filter="true"
                                        rows="10"
                                        rowsPerPageTemplate="10,20,30,40,50">
                                <p:commandLink action="#{customerController.loadCustomer}">
                                    <f:param name="customer" value="#{item.customerId}"/>
                                    <h:panelGroup>
                                        <h:outputText value="#{item.customerId} - #{item.name}"/><br/>
                                        <h:outputText value="#{item.email}"/>
                                    </h:panelGroup>
                                </p:commandLink>
                            </p:dataList>
                        </p:panel>
                    </h:form>
                </pm:content>
                <pm:footer>
                    Author: Josh Juneau
                </pm:footer>
            </pm:page>
        </h:body>
    </f:view>
</html>

As you can see, we flag the view for PrimeFaces Mobile use via the specification in the <f:view> tag. We then create a <pm:page>, and inside of the page we have sections for <pm:header>, <pm:content>, and <pm:footer>. The main content consists of a PrimeFaces Mobile DataList that displays customer data, and the data is wrapped in a p:commandLink component. When the link is clicked, the #{customerController.loadCustomer} method is invoked, passing the ID of the selected customer. Note that the DataList component uses passthrough attributes to specify the data-role and data-filter HTML5 attributes.
These are used to provide the user with a richer experience. The filter makes it easy for the user to begin typing a value into a filter textbox and have the list shortened to contain only the records that contain the typed text. The resulting view looks like Figure 2. The code in Listing 2 contains the implementation for loadCustomer(). The customer ID is passed to the find() method of the EJB, which then returns the selected customer data.

Listing 2: CustomerController loadCustomer()

public String loadCustomer() {
    Map<String, String> requestMap = FacesContext.getCurrentInstance()
            .getExternalContext().getRequestParameterMap();
    String customer = requestMap.get("customer");
    selected = ejbFacade.find(Integer.valueOf(customer));
    return "customerInfo";
}

When a customer is selected in the DataList, the loadCustomer() method is invoked, and it results in navigation to our second mobile view, customerInfo.xhtml (Figure 3). The second mobile view basically displays customer details and provides a link to go back to the DataList of customers. The code for customerInfo looks like that in Listing 3.
Listing 3: customerInfo.xhtml View

<?xml version='1.0' encoding='UTF-8' ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:h="http://xmlns.jcp.org/jsf/html"
      xmlns:p="http://primefaces.org/ui"
      xmlns:f="http://xmlns.jcp.org/jsf/core"
      xmlns:pm="http://primefaces.org/mobile">
    <f:view renderKitId="PRIMEFACES_MOBILE">
        <h:head></h:head>
        <h:body>
            <pm:page id="customerInfo">
                <pm:header>
                    Acme Pools
                </pm:header>
                <pm:content>
                    <h:form>
                        <p:panel header="Acme Pools Customer Information">
                            #{customerController.selected.name}
                            <br/>
                            #{customerController.selected.addressline1}
                            <br/>
                            #{customerController.selected.addressline2}
                            <br/>
                            #{customerController.selected.phone}
                        </p:panel>
                        <p:commandLink action="index?transition=slide" value="Go Back"/>
                    </h:form>
                </pm:content>
                <pm:footer>
                    Author: Josh Juneau
                </pm:footer>
            </pm:page>
        </h:body>
    </f:view>
</html>

As you can see, the customerInfo view contains the same structure as the original mobile view. There are no special mobile components added, but as you can see from Figure 3, the standard PrimeFaces panel is styled to display nicely on the mobile device.

Conclusion

That wraps it up for this brief look into using PrimeFaces Mobile. As you can see, it is easy to develop a mobile interface for your applications. The PrimeFaces Mobile suite also includes custom frameworks for navigation, events, and more, making it easy to provide a nice mobile experience. For instance, the events framework includes some swipe events, as well as taphold. It is even possible to hook into the jQuery Mobile framework to provide even more mobile events to your application. The PrimeFaces Mobile navigation framework consists of transitions, which ultimately provide your application with a smoother feel.
For instance, one can provide a transition of “slide” to a button navigation, which will result in a UI view that slides into focus when the button is clicked. All of this can be tested using the PrimeFaces Showcase. For more information on these and other important features of PrimeFaces Mobile, please watch for my upcoming OTN article.

Resources

PrimeFaces Showcase: http://www.primefaces.org/showcase/mobile/
jQuery Mobile: http://jquerymobile.com/

Reference: Getting Started with PrimeFaces Mobile from our JCG partner Josh Juneau at the Josh’s Dev Blog – Java, Java EE, Jython, Oracle, and More… blog....

Cannot Uninstall JavaFX SceneBuilder 1.0 with JDK 8

I was recently removing some of the software development applications, tools, and files I had used from an old Vista-based laptop because the people who are primarily using that laptop now have no interest in software development. As part of that effort, I tried to remove JavaFX Scene Builder 1.0, which I had installed a couple of years ago on that laptop. I hadn’t used it recently (JavaFX Scene Builder 2.0 is available), but I had not removed that version from the laptop when I stopped using it. My first attempt to remove JavaFX Scene Builder 1.0 was via the Windows Vista menu option Control Panel | Programs | Uninstall a program. The next screen snapshot shows the version of JavaFX Scene Builder 1.0 that I wanted to uninstall along with the version of Java installed on that machine (JDK 8 and Java 8 JRE). No versions of Java (JDK or JRE) before Java 8 were on this machine. The next screen snapshot demonstrates the normal requested confirmation of the removal of JavaFX Scene Builder 1.0. Clicking the “Yes” button on the confirmation dialog just shown led to the removal process beginning. Unfortunately, the removal of JavaFX Scene Builder 1.0 aborted and showed the error message: “No suitable 32-bit Java Runtime Environment (JRE) has been found. You should install Java 6 Update 29 (32-bit) or above OR Java 7 Update 2 (32-bit) or above.” I was a bit surprised that JavaFX Scene Builder could not be uninstalled with a Java 8 JRE installed on the machine. I tried to uninstall it more than once to make sure, but it was resistant to removal with only JRE 8 installed.
I ended up simply removing the JavaFX Scene Builder 1.0 directory with Windows Explorer as shown in the next screen snapshot. Because I could not use the uninstaller to remove JavaFX Scene Builder 1.0, I also needed to manually remove the shortcut as shown in the next screen snapshot. It was not a big deal to remove the directory and shortcut when the installer was unable to remove JavaFX Scene Builder 1.0 from this machine. It also would not have been too difficult to download and install a Java SE 7 JRE to use in uninstalling JavaFX Scene Builder. However, I was a bit surprised that it was written so that an appropriate version of JRE 6 or JRE 7 was required. It explicitly prevents JRE 8 or any future JRE from being used to uninstall it. I saw this same type of situation recently with a different tool in a different environment. In that case, the version of SQL Developer being used would only work with a certain specified range of updates for Java SE 6, and not with any Java SE 6 updates outside of that range or with any versions of JDK 7 or JDK 8.

Conclusion

There is a software development reminder (or lesson to be learned) from this. It is easy as humans to think only about the current timeframe and about the past, but we as software developers should put some thought into what the future holds. The prevailing version of software is not always going to be the prevailing version, and when our software’s documentation or the software itself advertises supporting certain versions “and above” or “and later,” then we should probably not put an explicit check in our code that forces the software to use one of the expected major revisions or that caps the supported versions.

Reference: Cannot Uninstall JavaFX SceneBuilder 1.0 with JDK 8 from our JCG partner Dustin Marx at the Inspired by Actual Events blog....
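To make the lesson concrete, a version check that honors “and above” should compare numerically against a minimum instead of matching a closed list of releases. The following is a rough sketch under my own naming; it is not code from either installer discussed above:

```java
public class VersionCheck {

    // Compares dotted version strings numerically, segment by segment,
    // so that "1.8" satisfies a minimum of "1.6" and "11" satisfies "1.8.0".
    // Missing segments are treated as zero.
    public static boolean meetsMinimum(String actual, String minimum) {
        String[] a = actual.split("\\.");
        String[] m = minimum.split("\\.");
        int len = Math.max(a.length, m.length);
        for (int i = 0; i < len; i++) {
            int av = i < a.length ? Integer.parseInt(a[i]) : 0;
            int mv = i < m.length ? Integer.parseInt(m[i]) : 0;
            if (av != mv) {
                return av > mv;
            }
        }
        return true; // equal versions satisfy the minimum
    }
}
```

With this approach, a Java 8 runtime (and any later one) passes a check for “Java 6 Update 29 or above,” whereas the closed check described above rejects anything outside its anticipated range.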

JMS with JBoss A-MQ on OpenShift. Lessons learned about remote Clients and Encryption.

OpenShift is the “open hybrid cloud application platform by Red Hat”. It comes in different flavors, and the most interesting part for most of the things you want to do is the public cloud application development and hosting platform “OpenShift Online“. You can easily try it out because using OpenShift Online in the cloud is free and it’s easy. All it takes is an email address. The free offering allows for up to three basic small gears and hosting up to three applications from a variety of different languages and frameworks. If you need more, you can upgrade your plan to a paid version. For more details look at the online feature comparison website.

JBoss A-MQ on OpenShift

The Java Message Service is an effective method for cross-system communication, even among non-Java applications. By basing itself on open source technologies and strong standards, Red Hat OpenShift allows developers to easily move their JMS applications to the cloud or write new systems that leverage JMS messages with encrypted internet connectivity. This post will cover the means for using two major applications: WildFly 8 for hosting web applications, and JBoss A-MQ for asynchronous messaging. Both applications can run on gears within the free tier of OpenShift.

Creating an A-MQ Gear

By deploying A-MQ to the OpenShift cloud, your gear will receive several publicly accessible ports. Client systems can then use these remote ports to connect to your A-MQ service. The endpoints require encryption, so no JMS message will ever be sent in plain text across the internet. The first step in creating your A-MQ gear is to clone the existing JBoss Fuse A-MQ cartridge. For those interested in cartridge management, you can view full details on this cartridge. (Note: If you are looking for an upstream cartridge with ActiveMQ, take a look at this blog.)
rhc create-app amq http://is.gd/Q5ihum

Upon creating, the gear provides three important pieces of information:

The administrative password that you will use to log in to JBoss Fuse, for managing A-MQ.
A new public key that clients must have in order to communicate with A-MQ. This information looks like: —–BEGIN CERTIFICATE—– … —–END CERTIFICATE—–
A list of public ports A-MQ is using for remote connections.

Managing the encryption on OpenShift

The difference between clients and your OpenShift gear is that the OpenShift side needs the private key. The keystore file is ~/jboss-amq/jboss-a-mq-6.1.0.redhat-378/etc/keystore.jks. If you change the keys, clients must have the new public key before they will trust it, and you must restart the gear. If you forgot to copy your certificate during gear creation, or you changed the keystore and need to extract it, use the following commands:

keytool -list -keystore ~/jboss-amq/jboss-a-mq-6.1.0.redhat-378/etc/keystore.jks
keytool -exportcert -alias (whatever it says) -keystore ~/jboss-amq/jboss-a-mq-6.1.0.redhat-378/etc/keystore.jks -file openshiftamq.cer

Download the openshiftamq.cer file using an SFTP client and configure clients.

Managing the encryption on clients

Copy the text of your public key into a file called openshiftamq.cer, inclusive of the BEGIN and END lines. Import the public certificate into a trust store that your clients will use:

keytool -importcert -alias openshiftamq -file openshiftamq.cer -keystore openshiftamq.jks

Put the openshiftamq.jks file as a classpath resource of your application or somewhere memorable. You won’t need the .cer file anymore but can still keep it around. Within client code, configure this trust store to be used with A-MQ connections. If you do not do this step, clients will not trust the server.
private ConnectionFactory connection(String url) {
    ActiveMQSslConnectionFactory connectionFactory = new ActiveMQSslConnectionFactory(url);
    try {
        connectionFactory.setTrustStore("openshiftamq.jks"); // or a file path if not in the classpath root
    } catch (Exception ex) {
        Logger.getLogger(getClass().getName()).log(Level.SEVERE, "Unable to load trust store.", ex);
    }
    connectionFactory.setTrustStorePassword("put your password here");
    return connectionFactory;
}

Remote communication from clients

One benefit of using the OpenShift Fuse A-MQ gear is that it exposes several external ports. As a result, your A-MQ service is available without requiring the rhc port-forward command. The URL for your A-MQ clients will look like this:

ssl://gearname-YourDomain.rhcloud.com:PORT

Gearname – the name of your gear within the administrative console.
YourDomain – your standard OpenShift domain.
PORT – the numeric port number provided when you created the cartridge.

Configure clients using the ConnectionFactory code from above.

Additional ActiveMQ Configurations in your OpenShift Gear

Many configuration options from a standard A-MQ instance are available within your OpenShift instance. The configuration file for this is ~/jboss-amq/jboss-a-mq-6.1.0.redhat-378/etc/activemq.xml, with a few caveats. Namely, you can change the protocol of a <transportConnector /> but must not change the IP or port. The ports are controlled by your OpenShift gear and are the only ones actually allowed from external areas.

Prevent accidental Gear idling

OpenShift is designed as a resource sharing system, and idle resources will essentially be put to sleep until accessed. JMS poses a particular problem on OpenShift in that if it is idle, connections will not function and new clients cannot connect.
To prevent this behavior, automate a script that periodically interacts with the JBoss Fuse web console, or always keep at least one client connected to your A-MQ.

Reference: JMS with JBoss A-MQ on OpenShift. Lessons learned about remote Clients and Encryption. from our JCG partner Markus Eisele at the Enterprise Software Development with Java blog....
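As a recap of the client-side pieces above, the broker URL follows the ssl://gearname-YourDomain.rhcloud.com:PORT pattern, so a small helper can assemble it before handing it to the connection() method shown earlier. The helper class and its validation are my own sketch, not part of the gear or cartridge:

```java
public class AmqGearUrl {

    // Assembles the SSL broker URL exposed by an OpenShift A-MQ gear,
    // following the ssl://<gear>-<domain>.rhcloud.com:<port> pattern
    // described in the article.
    public static String brokerUrl(String gearName, String domain, int port) {
        if (port <= 0 || port > 65535) {
            throw new IllegalArgumentException("Invalid port: " + port);
        }
        return "ssl://" + gearName + "-" + domain + ".rhcloud.com:" + port;
    }
}
```

For example, brokerUrl("amq", "mydomain", 31883) yields ssl://amq-mydomain.rhcloud.com:31883 (the gear name, domain, and port here are placeholders), which can then be passed to connection(url).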

Big Data… Is Hadoop the good way to start?

In the past 2 years, I have met many developers and architects that are working on “big data” projects. This sounds amazing, but quite often the truth is not that amazing.

TL;DR You believe that you have a big data project?

Do not start with the installation of a Hadoop cluster — the “how“
Start by talking to business people to understand their problem — the “why“
Understand the data you must process
Look at the volume — very often it is not “that” big
Then implement it, and take a simple approach, for example start with MongoDB + Apache Spark

The infamous “big data project”

A typical discussion would look like:

Me: “Can you tell me more about this project, what do you do with your data?”
Mr. Big Bytes: “Sure, we have a 40 nodes Hadoop cluster…”
Me: “This is cool, but which type of data do you store, and what is the use case, business value?”
Mr. Big Bytes: “We store all the logs of our applications, we have hundreds of gigabytes…” After a long blank: “We have not yet started to analyze these data. For now it is just ‘us, the IT team,’ we store the data, so that soon we will be able to do interesting things with them.”

You can meet the same person a few months later; the cluster is still sitting here, with no activity on it. I even met some consultants telling me they received calls from their customers asking the following: “Hmmm, we have a Hadoop cluster installed, can you help us find what to do with it?”

Wrong! That is wrong!!!!! This means that the IT team has spent a lot of time for nothing, at least for the business; and I am not even sure the team has learned something technically.

Start with the “Why”, not with the “How”!

The solution to this could be obvious: start your “big data project” by answering the “why/what” questions first! The “how”, the implementation, will come later. I am sure that most enterprises will benefit from a so-called “big data project”, but it is really important to understand the problems first.
And these problems are not technical… at least at the beginning. So you must spend time with the business people to understand what could help them. Let’s take some examples. If you are working in a bank or an insurance company, business people will be more than happy to predict when/why customers will leave the company by doing some churn analysis; or it will be nice to be able to see when it makes a lot of sense to sell new contracts and services to existing customers. If you are working in retail/commerce, your business will be happy to see if they can adjust prices to the market, or provide precise recommendations to a user from an analysis of other customers’ behavior. We can find many other examples. But as you can see, we are not talking about technology, just business and possible benefits. In fact nothing new: just as with the applications you are building, you first need to have some requirements/ideas to build a product. Here we just need to have some “data input” to see how we can enrich the data with some business value. Once you have started to ask all these questions you will start to see some input, and possible processing around it:

You are an insurance company: your customers have no contact with your representative, or the customer satisfaction is medium/bad; you start to see some customer names in quotes coming from price comparison websites… hmm, you can guess that they are looking for a new insurance.
Still in insurance: when your customers are close to retirement age, or have teenagers learning how to drive or moving to college, you know that you have an opportunity to sell new contracts, or adapt existing ones to the new needs.
In retail, you may want to look at all customers and what they have ordered, and based on this be able to recommend some products to a customer that “looks” the same.
Another very common use case these days: you want to do some sentiment analysis of social networks to see how your brand is perceived by your community.

As you can see now, we can start to think about the data we have to use and the type of processing we have to do on it.

Let’s now talk about the “How”

Now that you have a better idea about what you want to do, it does not mean that you should dive into a large cluster installation. Before that, you should continue to analyze the data:

What is the structure of the data that I have to analyze?
How big is my dataset?
How much data do I have to ingest in a period of time (minute, hour, day, …)?

All these questions will help you to understand your application better. This is also often where it gets interesting, and we realize that for most of us the “big data” is not that big! I was working the other day with a telco company in Belgium, and I was talking about a possible new project. I simply said:

Belgium is what, 11+ million people?
If you store a 50 KB object for each person, your full dataset will be about 524 GB; yes, not even a terabyte!

Do you need a large Hadoop cluster to store and process this? You can use it, but you do not need to! You can find something smaller, and easier to start with. Any database will do the job, starting with MongoDB. I think it is really interesting to start this project with a MongoDB cluster, not only because it will allow you to scale out as much as you need, but also because you will leverage the flexibility of the document model. This will allow you to store any type of data, and easily adapt the structure to new data or requirements. Storing the data is only one part of the equation. The other part is how you achieve the data processing. Lately I have been playing a lot with Apache Spark. Spark provides a very powerful engine for large scale data processing, and it is a lot simpler than Map Reduce jobs. In addition to this, you can run Spark without Hadoop!
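The Belgium back-of-the-envelope math above can be scripted in a few lines. This is just a sketch of the arithmetic, reading 50 KB as 50 KiB and reporting the result in GiB, which reproduces the roughly 524 figure:

```java
public class DatasetSizing {

    // Estimates the total dataset size in GiB, given a population and a
    // per-person object size in KiB.
    public static double totalGib(long people, long kibPerPerson) {
        double bytes = people * kibPerPerson * 1024.0;
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }
}
```

Calling totalGib(11_000_000L, 50) gives roughly 524.5 GiB, comfortably under a terabyte.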
Because Spark can run without Hadoop, you can connect your Spark cluster to your MongoDB, with the MongoDB Hadoop Connector and other data sources, and directly execute jobs on your main database. What I also like about this approach: when your dataset starts to grow and it becomes harder to process all the data on your operational database, you can easily add Hadoop, keep most of your data processing layer intact, and only change the data source information. In this case you will connect MongoDB and Hadoop to get/push the data into HDFS, once again using the MongoDB Hadoop Connector.

Conclusion

Too many times, projects are driven by technology instead of focusing on the business value. This is particularly true around big data projects. So be sure you start by understanding the business problem, and find the data that could help to solve it. Once you have the business problem and the data, select the right technology: it could be very simple, such as plain files and Python scripts, or more often a database like MongoDB with a data processing layer like Spark. And start to move to Hadoop when it is really mandatory… a very, very, very large dataset.

Reference: Big Data… Is Hadoop the good way to start? from our JCG partner Tugdual Grall at the Tug’s Blog blog....
Java Code Geeks and all content copyright © 2010-2015, Exelixis Media Ltd | Terms of Use | Privacy Policy | Contact
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.