Java EE 7 with Angular JS – CRUD, REST, Validations – Part 2

This is the promised follow-up to Java EE 7 with Angular JS – Part 1. It took longer than I expected (to find the time to prepare the code and blog post), but it’s finally here!

The Application

The original application in Part 1 is only a simple list with pagination and a REST service that feeds the list data. In this post we’re going to add CRUD (Create, Read, Update, Delete) capabilities, bind REST services to perform these operations on the server side and validate the data.

The Setup

The setup is the same as in Part 1, but here is the list for reference:

- Java EE 7
- Angular JS
- ng-grid
- UI Bootstrap
- Wildfly

The Code

Backend – Java EE 7

The backend does not require many changes. Since we want the ability to create, read, update and delete, we need to add the appropriate methods in the REST service to perform these operations:

PersonResource

    package com.cortez.samples.javaee7angular.rest;

    import com.cortez.samples.javaee7angular.data.Person;
    import com.cortez.samples.javaee7angular.pagination.PaginatedListWrapper;

    import javax.ejb.Stateless;
    import javax.persistence.EntityManager;
    import javax.persistence.PersistenceContext;
    import javax.persistence.Query;
    import javax.ws.rs.*;
    import javax.ws.rs.core.Application;
    import javax.ws.rs.core.MediaType;
    import java.util.List;

    @Stateless
    @ApplicationPath("/resources")
    @Path("persons")
    @Consumes(MediaType.APPLICATION_JSON)
    @Produces(MediaType.APPLICATION_JSON)
    public class PersonResource extends Application {
        @PersistenceContext
        private EntityManager entityManager;

        private Integer countPersons() {
            Query query = entityManager.createQuery("SELECT COUNT(p.id) FROM Person p");
            return ((Long) query.getSingleResult()).intValue();
        }

        @SuppressWarnings("unchecked")
        private List<Person> findPersons(int startPosition, int maxResults, String sortFields, String sortDirections) {
            Query query = entityManager.createQuery("SELECT p FROM Person p ORDER BY " + sortFields + " " + sortDirections);
            query.setFirstResult(startPosition);
            query.setMaxResults(maxResults);
            return query.getResultList();
        }

        private PaginatedListWrapper<Person> findPersons(PaginatedListWrapper<Person> wrapper) {
            wrapper.setTotalResults(countPersons());
            int start = (wrapper.getCurrentPage() - 1) * wrapper.getPageSize();
            wrapper.setList(findPersons(start, wrapper.getPageSize(), wrapper.getSortFields(), wrapper.getSortDirections()));
            return wrapper;
        }

        @GET
        public PaginatedListWrapper<Person> listPersons(@DefaultValue("1") @QueryParam("page") Integer page,
                                                        @DefaultValue("id") @QueryParam("sortFields") String sortFields,
                                                        @DefaultValue("asc") @QueryParam("sortDirections") String sortDirections) {
            PaginatedListWrapper<Person> paginatedListWrapper = new PaginatedListWrapper<>();
            paginatedListWrapper.setCurrentPage(page);
            paginatedListWrapper.setSortFields(sortFields);
            paginatedListWrapper.setSortDirections(sortDirections);
            paginatedListWrapper.setPageSize(10);
            return findPersons(paginatedListWrapper);
        }

        @GET
        @Path("{id}")
        public Person getPerson(@PathParam("id") Long id) {
            return entityManager.find(Person.class, id);
        }

        @POST
        public Person savePerson(Person person) {
            if (person.getId() == null) {
                Person personToSave = new Person();
                personToSave.setName(person.getName());
                personToSave.setDescription(person.getDescription());
                personToSave.setImageUrl(person.getImageUrl());
                entityManager.persist(person);
            } else {
                Person personToUpdate = getPerson(person.getId());
                personToUpdate.setName(person.getName());
                personToUpdate.setDescription(person.getDescription());
                personToUpdate.setImageUrl(person.getImageUrl());
                person = entityManager.merge(personToUpdate);
            }

            return person;
        }

        @DELETE
        @Path("{id}")
        public void deletePerson(@PathParam("id") Long id) {
            entityManager.remove(getPerson(id));
        }
    }

The code is exactly like a normal Java POJO, but uses the Java EE annotations to enhance the behaviour.
@ApplicationPath("/resources") and @Path("persons") will expose the REST service at the url yourdomain/resources/persons (yourdomain will be the host where the application is running). @Consumes(MediaType.APPLICATION_JSON) and @Produces(MediaType.APPLICATION_JSON) make the service accept REST requests and format REST responses as JSON. For the REST operations:

Annotation / HTTP Method | Java Method | URL | Behaviour
@GET / GET | listPersons | http://yourdomain/resources/persons | Returns a paginated list of 10 persons.
@GET / GET | getPerson | http://yourdomain/resources/persons/{id} | Returns a Person entity by its id.
@POST / POST | savePerson | http://yourdomain/resources/persons | Creates or updates a Person.
@DELETE / DELETE | deletePerson | http://yourdomain/resources/persons/{id} | Deletes a Person entity by its id.

The url invoked for each operation is very similar. What distinguishes which operation needs to be called is the HTTP method itself, set when the request is submitted. Check HTTP Method definitions.

For getPerson and deletePerson, note that we added the annotation @Path("{id}"), which defines an additional path segment to call the service. Since we need to know which object we want to get or delete, we need to indicate the id somehow. This is done in the service url to be called, so if we want to delete the Person with id 1, we would call http://yourdomain/resources/persons/1 with the HTTP method DELETE.

That’s it for the backend stuff. Only 30 lines of code were added to the old REST service. I have also added a new property to the Person object, to hold a link to an image, with the purpose of displaying an avatar of the person.

UI – Angular JS

For the UI part, I’ve decided to split it into 3 sections: the grid, the form and the feedback messages, each with its own Angular controller.
The grid is mostly the same as in Part 1, but it did require some tweaks for the new stuff:

Grid HTML

    <!-- Specify an Angular controller script that binds Javascript variables to the grid.-->
    <div class="grid" ng-controller="personsListController">
        <div>
            <h3>List Persons</h3>
        </div>

        <!-- Binds the grid component to be displayed. -->
        <div class="gridStyle" ng-grid="gridOptions"></div>

        <!-- Bind the pagination component to be displayed. -->
        <pagination direction-links="true" boundary-links="true"
                    total-items="persons.totalResults" items-per-page="persons.pageSize"
                    ng-model="persons.currentPage" ng-change="refreshGrid()">
        </pagination>
    </div>

Nothing special here. Pretty much the same as Part 1.

Grid Angular Controller

    app.controller('personsListController', function ($scope, $rootScope, personService) {
        // Initialize required information: sorting, the first page to show and the grid options.
        $scope.sortInfo = {fields: ['id'], directions: ['asc']};
        $scope.persons = {currentPage: 1};

        $scope.gridOptions = {
            data: 'persons.list',
            useExternalSorting: true,
            sortInfo: $scope.sortInfo,

            columnDefs: [
                { field: 'id', displayName: 'Id' },
                { field: 'name', displayName: 'Name' },
                { field: 'description', displayName: 'Description' },
                { field: '', width: 30, cellTemplate: '<span class="glyphicon glyphicon-remove remove" ng-click="deleteRow(row)"></span>' }
            ],

            multiSelect: false,
            selectedItems: [],
            // Broadcasts an event when a row is selected, to signal the form that it needs to load the row data.
            afterSelectionChange: function (rowItem) {
                if (rowItem.selected) {
                    $rootScope.$broadcast('personSelected', $scope.gridOptions.selectedItems[0].id);
                }
            }
        };

        // Refresh the grid, calling the appropriate rest method.
        $scope.refreshGrid = function () {
            var listPersonsArgs = {
                page: $scope.persons.currentPage,
                sortFields: $scope.sortInfo.fields[0],
                sortDirections: $scope.sortInfo.directions[0]
            };

            personService.get(listPersonsArgs, function (data) {
                $scope.persons = data;
            })
        };

        // Broadcast an event when an element in the grid is deleted. No real deletion is performed at this point.
        $scope.deleteRow = function (row) {
            $rootScope.$broadcast('deletePerson', row.entity.id);
        };

        // Watch the sortInfo variable. If changes are detected then we need to refresh the grid.
        // This also works for the first page access, since we assign the initial sorting in the initialize section.
        $scope.$watch('sortInfo.fields[0]', function () {
            $scope.refreshGrid();
        }, true);

        // Do something when the grid is sorted.
        // The grid throws the ngGridEventSorted that gets picked up here and assigns the sortInfo to the scope.
        // This will allow to watch the sortInfo in the scope for changes and refresh the grid.
        $scope.$on('ngGridEventSorted', function (event, sortInfo) {
            $scope.sortInfo = sortInfo;
        });

        // Picks the event broadcasted when a person is saved or deleted to refresh the grid elements with the most
        // updated information.
        $scope.$on('refreshGrid', function () {
            $scope.refreshGrid();
        });

        // Picks the event broadcasted when the form is cleared to also clear the grid selection.
        $scope.$on('clear', function () {
            $scope.gridOptions.selectAll(false);
        });
    });

A few more attributes are required to configure the behaviour of the grid. The important bits are data: 'persons.list', which binds the grid data to the list held in the Angular model value $scope.persons, and columnDefs, which allows us to model the grid as we see fit. Since I wanted to add an option to delete each row, I needed to add a new cell which calls the function deleteRow when you click the cross icon. The afterSelectionChange function is required to update the form data with the person selected in the grid. You can check other grid options here.
The rest of the code is self-explanatory and there are also a few comments in there. A special note about $rootScope.$broadcast: this is used to dispatch an event to all the other controllers. This is a way to communicate between controllers, since the grid, form and feedback messages have separate controllers. If everything were in a single controller, this would not be required and a simple function call would be enough. Another possible solution, if we want to keep the multiple controllers, would be to use Angular services. The approach used here seems much cleaner, since it separates the application concerns and does not require you to implement additional Angular services, but it might be a little harder to debug if needed.

Form HTML

    <div class="form" ng-controller="personsFormController">
        <!-- Verify person, if there is no id present, that we are Adding a Person -->
        <div ng-if="person.id == null">
            <h3>Add Person</h3>
        </div>
        <!-- Otherwise it's an Edit -->
        <div ng-if="person.id != null">
            <h3>Edit Person</h3>
        </div>

        <div>
            <!-- Specify the function to be called on submit and disable HTML5 validation, since we're using Angular validation-->
            <form name="personForm" ng-submit="updatePerson()" novalidate>

                <!-- Display an error if the input is invalid and is dirty (only when someone changes the value) -->
                <div class="form-group" ng-class="{'has-error' : personForm.name.$invalid && personForm.name.$dirty}">
                    <label for="name">Name:</label>
                    <!-- Display a check when the field is valid and was modified -->
                    <span ng-class="{'glyphicon glyphicon-ok' : personForm.name.$valid && personForm.name.$dirty}"></span>

                    <input id="name" name="name" type="text" class="form-control" maxlength="50"
                           ng-model="person.name"
                           required ng-minlength="2" ng-maxlength="50"/>

                    <!-- Validation messages to be displayed on required, minlength and maxlength -->
                    <p class="help-block" ng-show="personForm.name.$error.required">Add Name.</p>
                    <p class="help-block" ng-show="personForm.name.$error.minlength">Name must be at least 2 characters long.</p>
                    <p class="help-block" ng-show="personForm.name.$error.maxlength">Name cannot be longer than 50 characters.</p>
                </div>

                <!-- Display an error if the input is invalid and is dirty (only when someone changes the value) -->
                <div class="form-group" ng-class="{'has-error' : personForm.description.$invalid && personForm.description.$dirty}">
                    <label for="description">Description:</label>
                    <!-- Display a check when the field is valid and was modified -->
                    <span ng-class="{'glyphicon glyphicon-ok' : personForm.description.$valid && personForm.description.$dirty}"></span>

                    <input id="description" name="description" type="text" class="form-control" maxlength="100"
                           ng-model="person.description"
                           required ng-minlength="5" ng-maxlength="100"/>

                    <!-- Validation messages to be displayed on required, minlength and maxlength -->
                    <p class="help-block" ng-show="personForm.description.$error.required">Add Description.</p>
                    <p class="help-block" ng-show="personForm.description.$error.minlength">Description must be at least 5 characters long.</p>
                    <p class="help-block" ng-show="personForm.description.$error.maxlength">Description cannot be longer than 100 characters.</p>
                </div>

                <!-- Display an error if the input is invalid and is dirty (only when someone changes the value) -->
                <div class="form-group" ng-class="{'has-error' : personForm.imageUrl.$invalid && personForm.imageUrl.$dirty}">
                    <label for="imageUrl">Image URL:</label>
                    <!-- Display a check when the field is valid and was modified -->
                    <span ng-class="{'glyphicon glyphicon-ok' : personForm.imageUrl.$valid && personForm.imageUrl.$dirty}"></span>

                    <input id="imageUrl" name="imageUrl" type="url" class="form-control" maxlength="500"
                           ng-model="person.imageUrl"
                           required/>

                    <!-- Validation messages to be displayed on required and invalid. Type 'url' makes checks to a proper url format. -->
                    <p class="help-block" ng-show="personForm.imageUrl.$error.required">Add Image URL.</p>
                    <p class="help-block" ng-show="personForm.imageUrl.$invalid && personForm.imageUrl.$dirty">Invalid Image URL.</p>
                </div>

                <div class="avatar" ng-if="person.imageUrl">
                    <img ng-src="{{person.imageUrl}}" width="400" height="250"/>
                </div>

                <!-- Form buttons. The 'Save' button is only enabled when the form is valid. -->
                <div class="buttons">
                    <button type="button" class="btn btn-primary" ng-click="clearForm()">Clear</button>
                    <button type="submit" class="btn btn-primary" ng-disabled="personForm.$invalid">Save</button>
                </div>
            </form>
        </div>
    </div>

Here is how it looks:

A lot of the code is for validation purposes, but let’s look into this in a bit more detail: each input element binds its value to person.something. This binds the data between the HTML and the Javascript controller, so we can write $scope.person.name in our controller to get the value entered in the form input named name. To access the data inside the HTML form we use the form name personForm plus the name of the input field.

HTML5 has its own set of validations for input fields, but we want to use the Angular ones. In that case, we need to disable the browser’s form validation by using novalidate on the form element. Now, to use Angular validations, we can use a few Angular directives in the input elements. For this very basic form, we only use required, ng-minlength and ng-maxlength, but you can use others. Just look into the documentation.

Angular assigns CSS classes based on the input validation state. To give you an idea, these are the possible values:

State | CSS | On
valid | ng-valid | When the field is valid.
invalid | ng-invalid | When the field is invalid.
pristine | ng-pristine | When the field was never touched before.
dirty | ng-dirty | When the field is changed.

These CSS classes are empty. You need to create them and assign them styles in an included CSS sheet for the application.
Instead, we’re going to use styles from Bootstrap, which are very nice. For them to work, a few additional classes need to be applied to the elements. The div element enclosing the input needs the CSS class form-group and the input element needs the CSS class form-control.

To display an invalid input field we add ng-class="{'has-error' : personForm.name.$invalid && personForm.name.$dirty}" to the div containing the input. This code evaluates whether the name in the personForm is invalid and whether it’s dirty. If the condition holds, then the input is displayed as invalid.

Finally, for the form validation messages we need to check the $error object for each of the inputs and types of validations being performed. Just add ng-show="personForm.name.$error.minlength" to an HTML display element with a message to warn the user that the name input field is too short.

Form Angular Controller

    // Create a controller with name personsFormController to bind to the form section.
    app.controller('personsFormController', function ($scope, $rootScope, personService) {
        // Clears the form. Either by clicking the 'Clear' button in the form, or when a successful save is performed.
        $scope.clearForm = function () {
            $scope.person = null;
            // For some reason, I was unable to clear field values with type 'url' if the value is invalid.
            // This is a workaround. Needs proper investigation.
            document.getElementById('imageUrl').value = null;
            // Resets the form validation state.
            $scope.personForm.$setPristine();
            // Broadcast the event to also clear the grid selection.
            $rootScope.$broadcast('clear');
        };

        // Calls the rest method to save a person.
        $scope.updatePerson = function () {
            personService.save($scope.person).$promise.then(
                function () {
                    // Broadcast the event to refresh the grid.
                    $rootScope.$broadcast('refreshGrid');
                    // Broadcast the event to display a save message.
                    $rootScope.$broadcast('personSaved');
                    $scope.clearForm();
                },
                function () {
                    // Broadcast the event for a server error.
                    $rootScope.$broadcast('error');
                });
        };

        // Picks up the event broadcasted when the person is selected from the grid and performs the person load by calling
        // the appropriate rest service.
        $scope.$on('personSelected', function (event, id) {
            $scope.person = personService.get({id: id});
        });

        // Picks up the event broadcasted when the person is deleted from the grid and performs the actual person delete by
        // calling the appropriate rest service.
        $scope.$on('deletePerson', function (event, id) {
            personService.delete({id: id}).$promise.then(
                function () {
                    // Broadcast the event to refresh the grid.
                    $rootScope.$broadcast('refreshGrid');
                    // Broadcast the event to display a delete message.
                    $rootScope.$broadcast('personDeleted');
                    $scope.clearForm();
                },
                function () {
                    // Broadcast the event for a server error.
                    $rootScope.$broadcast('error');
                });
        });
    });

For the form controller, we need the two functions that perform the operations associated with the Clear button and the Save button, which are self-explanatory. A quick note: for some reason, Angular does not clear input fields which are in an invalid state. I did find a few people complaining about the same problem, but I need to investigate this further. Maybe it’s something I’m doing wrong.

REST services are called using save and delete from the $resource object, which already implements the corresponding HTTP methods. Check the documentation. You can get a $resource with the following factory:

REST Service

    // Service that provides persons operations
    app.factory('personService', function ($resource) {
        return $resource('resources/persons/:id');
    });

The rest of the controller code consists of functions that pick up the events created by the grid, to load the person data in the form and to delete the person. This controller also creates a few events. If we add or remove persons, the grid needs to be updated, so an event is generated requesting the grid to be updated.
Feedback Messages HTML

    <!-- Specify an Angular controller script that binds Javascript variables to the feedback messages.-->
    <div class="message" ng-controller="alertMessagesController">
        <alert ng-repeat="alert in alerts" type="{{alert.type}}" close="closeAlert($index)">{{alert.msg}}</alert>
    </div>

This is just the top section of the application, to display success or error messages based on save, delete or server error.

Feedback Messages Angular Controller

    // Create a controller with name alertMessagesController to bind to the feedback messages section.
    app.controller('alertMessagesController', function ($scope) {
        // Picks up the event to display a saved message.
        $scope.$on('personSaved', function () {
            $scope.alerts = [
                { type: 'success', msg: 'Record saved successfully!' }
            ];
        });

        // Picks up the event to display a deleted message.
        $scope.$on('personDeleted', function () {
            $scope.alerts = [
                { type: 'success', msg: 'Record deleted successfully!' }
            ];
        });

        // Picks up the event to display a server error message.
        $scope.$on('error', function () {
            $scope.alerts = [
                { type: 'danger', msg: 'There was a problem in the server!' }
            ];
        });

        $scope.closeAlert = function (index) {
            $scope.alerts.splice(index, 1);
        };
    });

This is the controller that pushes the messages to the view. It listens to the events created by the grid and the form controllers.

The End Result

Uff.. that was a lot of code and new information. Let’s see the final result:

There is also a live version running at http://javaee7-angular.radcortez.cloudbees.net, thanks to Cloudbees. It may take a while to open if the cloud instance is hibernated (because of no usage).

Resources

You can clone a full working copy from my github repository and deploy it to Wildfly. You can find instructions there to deploy it. It should also work on Glassfish.

Java EE – Angular JS Source

Since I may modify the code in the future, you can download the original source of this post from the release 3.0.
Alternatively, clone the repo and check out the tag for release 3.0 with the following command: git checkout 3.0.

Check also:

- Java EE 7 with Angular JS – Part 1
- Javascript Package Management – NPM – Bower – Grunt

Final Thoughts

- The form validation kicks in right after you start typing. Angular 1.3 will have an on blur property to validate only after losing focus, but I’m still using Angular 1.2.x.
- I have to confess that I found the validation code a bit too verbose. I don’t know if there is a way to simplify it, but you shouldn’t need to add each validation message to each input.
- A few things are still lacking here, like parameter sanitisation or server side validation. I’ll cover those in a future blog post.

This was a very long post, actually the longest I’ve written on my blog. If you reached this far, thank you so much for your time reading this post. I hope you enjoyed it! Let me know if you have any comments.

Reference: Java EE 7 with Angular JS – CRUD, REST, Validations – Part 2 from our JCG partner Roberto Cortez at the Roberto Cortez Java Blog.
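On the parameter sanitisation mentioned in the final thoughts: the sortFields and sortDirections query parameters are concatenated directly into the JPQL string in findPersons. One possible way to guard that, sketched below, is to accept only known values and fall back to the defaults otherwise. This is a minimal sketch, not the approach taken in the original project: the SortParams class, its method names and the list of sortable fields are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper; it is not part of the original PersonResource. */
final class SortParams {
    // Assumption: these are the only Person attributes a client may sort on.
    private static final Set<String> ALLOWED_FIELDS =
            new HashSet<>(Arrays.asList("id", "name", "description"));

    private SortParams() { }

    /** Returns the requested field if it is whitelisted, otherwise the default "id". */
    static String safeField(String requested) {
        return requested != null && ALLOWED_FIELDS.contains(requested) ? requested : "id";
    }

    /** Returns "desc" only for an explicit (case-insensitive) "desc", otherwise "asc". */
    static String safeDirection(String requested) {
        boolean descending = requested != null
                && "desc".equals(requested.toLowerCase(Locale.ROOT));
        return descending ? "desc" : "asc";
    }

    /** Builds the ORDER BY fragment from validated values only. */
    static String orderBy(String field, String direction) {
        return "ORDER BY " + safeField(field) + " " + safeDirection(direction);
    }

    public static void main(String[] args) {
        System.out.println(orderBy("name", "DESC"));                  // ORDER BY name desc
        System.out.println(orderBy("id; DROP TABLE person", "asc")); // ORDER BY id asc
    }
}
```

With something like this in place, findPersons could build its query from SortParams.orderBy(sortFields, sortDirections) instead of the raw request values. Server side validation of the Person fields would still be a separate concern.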

Kafka Benchmark on Chronicle Queue

Overview

I was recently asked to compare the performance of Kafka with Chronicle Queue. No two products are exactly alike, and performing a fair comparison is not easy. We can try to run similar tests and see what results we get. This test is based on Apache Kafka Performance Results.

What was the test used?

One area Kafka tests is multi-threaded performance. In tests we have done, it is neither better nor worse to use more threads (up to the number of CPUs you have). We didn’t benchmark this here. All tests use one producer.

Another difference is that we flush to disk periodically by time rather than by count. Being able to say you are never behind by more than X milliseconds is often more useful than saying you are never behind by more than, say, 600 messages, as you don’t know how long those messages could have been waiting there. For our tests, we look at flush periods of between 1 ms and 10 ms. In Kafka’s tests, the flush appears to happen approximately every 3 ms.

The message size used was 200 bytes in each case, and we explored the difference that writing batches of 1, 2, 5 and 10 messages at once made. We also tried 200 messages in a batch and the performance was similar to batches of 10. We only tested writing to SSD disks for persistence.

Note: Chronicle is broker-less.

The results

The results of this test show the message rate in terms of MB/s. This is a reasonable way to describe the performance as the message size can vary, but you will get a similar amount of bandwidth, especially over 1 KB message sizes.

device | flush period (ms) | batch of 1 | batch of 2 | batch of 5 | batch of 10
ssd.ext4 | 1 | 236 MB/s | 300 MB/s | 340 MB/s | 363 MB/s
ssd.ext4 | 3 | 378 MB/s | 483 MB/s | 556 MB/s | 583 MB/s
ssd.ext4 | 10 | 495 MB/s | 595 MB/s | 687 MB/s | 705 MB/s
tmpfs | na | 988 MB/s | 1317 MB/s | 1680 MB/s | 1847 MB/s

We also tested “writing” to a tmpfs file system. This is much faster as no actual writes to a device are performed.

Conclusions

It isn’t possible to draw a direct comparison with Kafka, as it is a broker-based system and must send every message over TCP. Chronicle can replicate over TCP, however it doesn’t have to, and if you want to maximise performance you will use a high speed network, the fastest being the memory bus of your server.

You can run similar tests and get exceptional results. If you need to handle bursts of hundreds of MB/s, Chronicle may be a better solution.

Reference: Kafka Benchmark on Chronicle Queue from our JCG partner Peter Lawrey at the Vanilla Java blog.
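Since the results are reported in MB/s and the messages were 200 bytes each, the bandwidth figures can be converted into approximate message rates. The snippet below is only a worked calculation under the assumption that 1 MB means 1,000,000 bytes (the post does not say whether decimal or binary megabytes were used); the Throughput class and its method name are made up for this illustration.

```java
/** Hypothetical helper for converting the table's MB/s figures into messages per second. */
final class Throughput {
    private Throughput() { }

    /**
     * Converts a bandwidth into a message rate.
     * Assumption: 1 MB = 1,000,000 bytes.
     */
    static long messagesPerSecond(double megabytesPerSecond, int messageSizeBytes) {
        return Math.round(megabytesPerSecond * 1_000_000 / messageSizeBytes);
    }

    public static void main(String[] args) {
        // ssd.ext4, 1 ms flush period, one message per write
        System.out.println(messagesPerSecond(236, 200)); // 1180000
        // ssd.ext4, 10 ms flush period, ten messages per write
        System.out.println(messagesPerSecond(705, 200)); // 3525000
    }
}
```

Under that assumption, the slowest SSD configuration in the table corresponds to roughly 1.2 million messages per second and the fastest to roughly 3.5 million.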

How to connect to MongoDB from a Java EE stateless application

In this post I will present how to connect to MongoDB from a stateless Java EE application, to take advantage of the built-in pool of connections to the database offered by the MongoDB Java Driver. This might be the case if you develop a REST API that executes operations against a MongoDB.

Get the Java MongoDb Driver

To connect from Java to MongoDB, you can use the Java MongoDB Driver. If you are building your application with Maven, you can add the dependency to the pom.xml file:

MongoDB java driver dependency

    <dependency>
        <groupId>org.mongodb</groupId>
        <artifactId>mongo-java-driver</artifactId>
        <version>2.12.3</version>
    </dependency>

The driver provides a MongoDB client (com.mongodb.MongoClient) with internal pooling. The MongoClient class is designed to be thread safe and shared among threads. For most applications, you should have one MongoClient instance for the entire JVM. Because of that, you wouldn’t want to create a new MongoClient instance with each request in your Java EE stateless application.

Implement a @Singleton EJB

A simple solution is to use a @Singleton EJB to hold the MongoClient:

Singleton to hold the MongoClient

    package org.codingpedia.demo.mongoconnection;

    import java.net.UnknownHostException;

    import javax.annotation.PostConstruct;
    import javax.ejb.ConcurrencyManagement;
    import javax.ejb.ConcurrencyManagementType;
    import javax.ejb.Lock;
    import javax.ejb.LockType;
    import javax.ejb.Singleton;

    import com.mongodb.MongoClient;

    @Singleton
    @ConcurrencyManagement(ConcurrencyManagementType.CONTAINER)
    public class MongoClientProvider {

        private MongoClient mongoClient = null;

        @Lock(LockType.READ)
        public MongoClient getMongoClient(){
            return mongoClient;
        }

        @PostConstruct
        public void init() {
            String mongoIpAddress = "x.x.x.x";
            Integer mongoPort = 11000;
            try {
                mongoClient = new MongoClient(mongoIpAddress, mongoPort);
            } catch (UnknownHostException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
    }

Note:

- @Singleton – probably the most important line of code in this class. This annotation specifies that there will be exactly one singleton of this type of bean in the application. This bean can be invoked concurrently by multiple threads. It also comes with a @PostConstruct annotation. This annotation is used on a method that needs to be executed after dependency injection is done, to perform any initialization – in our case, to initialize the MongoClient.
- @ConcurrencyManagement(ConcurrencyManagementType.CONTAINER) declares a singleton session bean’s concurrency management type. By default it is set to Container; I use it here only to highlight its existence. The other option, ConcurrencyManagementType.BEAN, specifies that the bean developer is responsible for managing concurrent access to the bean instance.
- @Lock(LockType.READ) specifies the concurrency lock type for singleton beans with container-managed concurrency. When set to LockType.READ, it makes the method permit full concurrent access to it (assuming no write locks are held). This permits several threads to access the same MongoClient instance and take advantage of the internal pool of connections to the database. This is VERY IMPORTANT, because the other, more conservative option, @Lock(LockType.WRITE), is the DEFAULT and enforces exclusive access to the bean instance. This should make the method slower in a highly concurrent environment…

Use the @Singleton EJB

Now that you have the MongoClient “persisted” in the application, you can inject the MongoClientProvider to access the MongoDB (to get the collection names, for example):

Access MongoClient from other beans example

    package org.codingpedia.demo.mongoconnection;

    import java.util.Set;

    import javax.ejb.EJB;
    import javax.ejb.Stateless;

    import com.mongodb.BasicDBObject;
    import com.mongodb.DB;
    import com.mongodb.DBCollection;
    import com.mongodb.DBObject;
    import com.mongodb.MongoClient;
    import com.mongodb.util.JSON;

    @Stateless
    public class TestMongoClientProvider {

        @EJB
        MongoClientProvider mongoClientProvider;

        public Set<String> getCollectionNames(){

            MongoClient mongoClient = mongoClientProvider.getMongoClient();

            DB db = mongoClient.getDB("myMongoDB");
            Set<String> colls = db.getCollectionNames();

            for (String s : colls) {
                System.out.println(s);
            }

            return colls;
        }
    }

Note: The db object will be a connection to a MongoDB server for the specified database. With it, you can do further operations. I encourage you to read the Getting Started with Java Driver for more on that…

Be aware

One aspect to bear in mind:

“For every request to the DB (find, insert, etc) the Java thread will obtain a connection from the pool, execute the operation, and release the connection. This means the connection (socket) used may be different each time.

Additionally in the case of a replica set with slaveOk option turned on, the read operations will be distributed evenly across all slaves. This means that within the same thread, a write followed by a read may be sent to different servers (master then slave). In turn the read operation may not see the data just written since replication is asynchronous. If you want to ensure complete consistency in a “session” (maybe an http request), you would want the driver to use the same socket, which you can achieve by using a “consistent request”. Call requestStart() before your operations and requestDone() to release the connection back to the pool:

Ensuring complete consistency in a

    DB db...;
    db.requestStart();
    try {
        db.requestEnsureConnection();

        code....
    } finally {
        db.requestDone();
    }

DB and DBCollection are completely thread safe. In fact, they are cached so you get the same instance no matter what.” [3]

Resources

1. Java MongoDB Driver
2. Getting Started with Java Driver
3. Java Driver Concurrency
4. GitHub – mongodb / mongo-java-driver examples

Reference: How to connect to MongoDB from a Java EE stateless application from our JCG partner Adrian Matei at the Codingpedia.org blog.
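To make the difference between the READ and WRITE lock types discussed earlier in this post more tangible, here is a small plain-Java analogy. It uses java.util.concurrent.locks.ReentrantReadWriteLock, which is only an illustration of read/write lock semantics and not a statement about how any EJB container implements @Lock; the LockTypeSketch class and its method are hypothetical names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Analogy only: shows shared (read) versus exclusive (write) access with a plain JDK lock. */
final class LockTypeSketch {
    private LockTypeSketch() { }

    /**
     * Holds the read lock on the calling thread and lets a second thread try both locks.
     * Returns {readLockAcquired, writeLockAcquired} as seen by that second thread.
     */
    static boolean[] probeWhileReadLocked() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        ExecutorService other = Executors.newSingleThreadExecutor();
        lock.readLock().lock();
        try {
            return other.submit(() -> {
                // A second reader is admitted while another thread holds the read lock.
                boolean read = lock.readLock().tryLock();
                if (read) {
                    lock.readLock().unlock();
                }
                // A writer needs exclusive access, so this attempt fails.
                boolean write = lock.writeLock().tryLock();
                if (write) {
                    lock.writeLock().unlock();
                }
                return new boolean[] {read, write};
            }).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            lock.readLock().unlock();
            other.shutdown();
        }
    }

    public static void main(String[] args) {
        boolean[] seen = probeWhileReadLocked();
        System.out.println("second reader admitted: " + seen[0]); // true
        System.out.println("writer admitted: " + seen[1]);        // false
    }
}
```

The same idea is why getMongoClient() is annotated with @Lock(LockType.READ) in the provider above: callers that only read the shared reference do not need to queue behind each other.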

Reducing the frequency of major GC pauses

This post will discuss a technique to reduce the burden garbage collection pauses put on the latency of your application. As I have written couple of years ago, disabling garbage collection is not possible in JVM. But there is a clever trick that can be used to significantly reduce the length and frequency of the long pauses. As you are aware, there are two different GC events taking place within the JVM, called minor and major collections. There is a lot of material available about what takes place during those collections, so I will not focus on describing the mechanics in detail. I will just remind that in Hotspot JVM – during minor collection, eden and survivor spaces are collected, in major collection the tenured space also gets cleaned and (possibly) compacted. If you turn on the GC logging (-XX:+PrintGCDetails for example) then you immediately notice that the major collections are the ones you should focus. The length of a major garbage collection taking place is typically several times larger than the one cleaning young space. During a major GC there are two aspects requiring more time to complete. First and foremost, the survivors from young space are copied to old. Next, besides cleaning the unused references from the old generation, most of the GC algorithms also compact the old space, again requiring precious CPU cycles to be burnt. Having lots of objects in old space also increases the likelihood of having more references from old space to young space. This results in larger card tables, keeping track of the references and increases the length of the minor GC pauses, when these tables are checked to decide whether objects in young space are eligible for GC. So, if we cannot turn off the garbage collection, can we make sure these lengthy major GCs run less often and the reference count from Tenured space to Young stays low? The answer is yes. There are even some crazy configurations which have managed to get rid of the major GC altogether. 
Getting rid of major GC events is truly a complex exercise, but reducing the frequency of those long pauses is something every deployment can achieve. The strategy we are looking at is limiting the number of objects that get tenured. In a typical web application, for example, most of the objects created are useful only for the duration of a single HTTP request. There is and always will be shared state with a longer life span, but the key is that the ratio of short-lived objects to long-lived shared state is very high. The tricky part for any deployment is to understand how much elbow room to give the short-lived objects, so that:

- You can guarantee that the short-lived objects do not get promoted to tenured space
- You are not over-provisioning and increasing the cost of your infrastructure

On a conceptual level, achieving this is easy. You just need to measure the amount of memory allocated for short-lived objects during a request and multiply it by the load at peak time. What you end up with is the amount of memory you would want to fit either into eden or into a single survivor space. This will allow the GC to run truly efficiently without any accidental promotions to tenured. Zooming in from the conceptual level surfaces several complex technical issues, which I will open up in the forthcoming posts.

So what to conclude from here? First and foremost, determining the perfect GC configuration for your application is a complex exercise. This is both bad and good news. Bad in that it requires a lot of experiments on your side. Good in that we like difficult problems and we are currently crafting experiments to investigate the domain further. Some day, not too far in the future, Plumbr will be able to do it for you, saving you from a boring plumbing job and allowing you to focus on the actual problem at hand.

Reference: Reducing the frequency of major GC pauses from our JCG partner Nikita Salnikov Tarnovski at the Plumbr Blog blog.
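The sizing rule described in the post can be sketched as simple arithmetic. This is only a back-of-the-envelope illustration: the class name, the 2 MB per request and the 300 concurrent requests are made-up assumptions, and "peak load" is interpreted here as the number of requests in flight at peak.

```java
// Back-of-the-envelope sizing for the memory that short-lived objects need.
// All inputs are hypothetical; measure your own application before tuning.
final class YoungGenEstimate {

    /** Memory held by the short-lived objects of all requests in flight at peak. */
    static long requiredBytes(long bytesPerRequest, int peakConcurrentRequests) {
        // multiplyExact fails loudly instead of silently overflowing
        return Math.multiplyExact(bytesPerRequest, (long) peakConcurrentRequests);
    }

    /** Rounds a byte count up to whole megabytes. */
    static long toMegabytesRoundedUp(long bytes) {
        long mb = 1024L * 1024L;
        return (bytes + mb - 1) / mb;
    }

    public static void main(String[] args) {
        long perRequest = 2L * 1024 * 1024; // assumed: 2 MB allocated per request
        int peak = 300;                     // assumed: 300 requests in flight at peak
        long bytes = requiredBytes(perRequest, peak);
        // prints "Short-lived working set: 600 MB"
        System.out.println("Short-lived working set: " + toMegabytesRoundedUp(bytes) + " MB");
    }
}
```

The result is only a starting point for sizing the young generation (for example via HotSpot's -Xmn option) and should be checked against the GC logs under real load.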

Dead simple configuration

Whole frameworks have been written with the purpose of handling the configuration of your application. I prefer a simpler way. If by configuration we mean “everything that is likely to vary between deploys“, it follows that we should try and keep configuration simple. In Java, the simplest option is the humble properties file. The downside of a properties file is that you have to restart your application when you want it to pick up changes. Or do you? Here’s a simple method I’ve used on several projects:

    public class MyAppConfig extends AppConfiguration {

        private static MyAppConfig instance = new MyAppConfig();

        public static MyAppConfig instance() {
            return instance;
        }

        private MyAppConfig() {
            // Delegate to the AppConfiguration constructor
            super("myapp.properties");
        }

        public String getServiceUrl() {
            return getRequiredProperty("service.url");
        }

        public boolean getShouldStartSlow() {
            return getFlag("start-slow", false);
        }

        public int getHttpPort(int defaultPort) {
            return getIntProperty("myapp.http.port", defaultPort);
        }
    }

The AppConfiguration class looks like this (the getFlag and getIntProperty helpers used above are omitted from this listing):

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.Properties;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public abstract class AppConfiguration {

        private static Logger log = LoggerFactory.getLogger(AppConfiguration.class);

        private long nextCheckTime = 0;
        private long lastLoadTime = 0;
        private Properties properties = new Properties();
        private final File configFile;

        protected AppConfiguration(String filename) {
            this.configFile = new File(filename);
        }

        public String getProperty(String propertyName, String defaultValue) {
            String result = getProperty(propertyName);
            if (result == null) {
                log.trace("Missing property {} in {}", propertyName, properties.keySet());
                return defaultValue;
            }
            return result;
        }

        public String getRequiredProperty(String propertyName) {
            String result = getProperty(propertyName);
            if (result == null) {
                throw new RuntimeException("Missing property " + propertyName);
            }
            return result;
        }

        // Lookup order: system property, then environment variable, then file
        private String getProperty(String propertyName) {
            if (System.getProperty(propertyName) != null) {
                log.trace("Reading {} from system properties", propertyName);
                return System.getProperty(propertyName);
            }
            if (System.getenv(propertyName.replace('.', '_')) != null) {
                log.trace("Reading {} from environment", propertyName);
                return System.getenv(propertyName.replace('.', '_'));
            }

            ensureConfigurationIsFresh();
            return properties.getProperty(propertyName);
        }

        private synchronized void ensureConfigurationIsFresh() {
            // Look at the file at most once every 10 seconds
            if (System.currentTimeMillis() < nextCheckTime) return;
            nextCheckTime = System.currentTimeMillis() + 10000;
            log.trace("Rechecking {}", configFile);

            if (!configFile.exists()) {
                log.error("Missing configuration file {}", configFile);
            }

            // Reload only if the file has changed since the last load
            if (lastLoadTime >= configFile.lastModified()) return;
            lastLoadTime = configFile.lastModified();
            log.debug("Reloading {}", configFile);

            try (FileInputStream inputStream = new FileInputStream(configFile)) {
                properties.clear();
                properties.load(inputStream);
            } catch (IOException e) {
                throw new RuntimeException("Failed to load " + configFile, e);
            }
        }
    }

This reads the configuration file in an efficient way and updates the settings as needed. System properties and environment variables take precedence over the values in the file. And it even gives a pretty good log of what’s going on.

For the full source code and a magic DataSource which updates automatically, see this gist: https://gist.github.com/jhannes/b8b143e0e5b287d73038

Enjoy!

Reference: Dead simple configuration from our JCG partner Johannes Brodwall at the Thinking Inside a Bigger Box blog.
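The lookup order used by getProperty can be tried out in isolation. The class below is a hypothetical stand-in and not part of the gist: the three sources are injected so the precedence can be exercised without touching real system properties or environment variables, and getInt and getFlag are only a guess at how helpers like getIntProperty and getFlag might behave.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch of the layered lookup: system properties first, then
// environment variables (dots replaced by underscores), then the file.
final class LayeredConfig {
    private final Properties systemProps;
    private final Map<String, String> env;
    private final Properties fileProps;

    LayeredConfig(Properties systemProps, Map<String, String> env, Properties fileProps) {
        this.systemProps = systemProps;
        this.env = env;
        this.fileProps = fileProps;
    }

    /** Returns the first value found, or null if no source defines the name. */
    String get(String name) {
        String fromSystem = systemProps.getProperty(name);
        if (fromSystem != null) return fromSystem;
        String fromEnv = env.get(name.replace('.', '_'));
        if (fromEnv != null) return fromEnv;
        return fileProps.getProperty(name);
    }

    /** Assumed behaviour of an int helper: parse the value or fall back to the default. */
    int getInt(String name, int defaultValue) {
        String value = get(name);
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    /** Assumed behaviour of a flag helper: "true" (any case) enables the flag. */
    boolean getFlag(String name, boolean defaultValue) {
        String value = get(name);
        return value == null ? defaultValue : Boolean.parseBoolean(value.trim());
    }
}
```

With this shape, a value set with -Dservice.url=... on the command line wins over a service_url environment variable, which in turn wins over the service.url entry in the properties file.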

Akka Notes – Actor Logging and Testing

In the first two parts (one, two), we briefly talked about Actors and how messaging works. In this part, let’s look at fixing up Logging and Testing our TeacherActor.

Recap

This is what our Actor from the previous part looked like:

    class TeacherActor extends Actor {

      val quotes = List(
        "Moderation is for cowards",
        "Anything worth doing is worth overdoing",
        "The trouble is you think you have time",
        "You never gonna know if you never even try")

      def receive = {

        case QuoteRequest => {

          import util.Random

          //Get a random Quote from the list and construct a response
          val quoteResponse = QuoteResponse(quotes(Random.nextInt(quotes.size)))

          println(quoteResponse)
        }
      }
    }

Logging Akka with SLF4J

You will notice that the code prints the quoteResponse to standard output, which you would obviously agree is a bad idea. Let’s fix that by enabling logging via the SLF4J facade.

1. Fix the Class to use Logging

Akka provides a nice little trait called ActorLogging to achieve it. Let’s mix that in:

    class TeacherLogActor extends Actor with ActorLogging {

      val quotes = List(
        "Moderation is for cowards",
        "Anything worth doing is worth overdoing",
        "The trouble is you think you have time",
        "You never gonna know if you never even try")

      def receive = {

        case QuoteRequest => {

          import util.Random

          //get a random element (for now)
          val quoteResponse = QuoteResponse(quotes(Random.nextInt(quotes.size)))
          log.info(quoteResponse.toString())
        }
      }

      //We'll cover the purpose of this method in the Testing section
      def quoteList = quotes
    }

A small detour here: internally, when we log a message, the logging methods in ActorLogging (eventually) publish the log message to an EventStream. Yes, I did say publish. So, what actually is an EventStream?

EventStream and Logging

EventStream behaves just like a message broker to which we could publish and receive messages. One subtle distinction from a regular MOM is that the subscribers of the EventStream could only be an Actor.
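To make the publish-subscribe idea concrete, here is a minimal plain-Java sketch. It is not Akka's EventStream API: TinyEventStream and its methods are hypothetical names, subscribers are plain callbacks rather than Actors, and delivery is synchronous. The only thing it loosely mirrors is that subscribers register for a class of events and receive only the events of that class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Plain-Java illustration of publish-subscribe; NOT Akka's EventStream.
final class TinyEventStream {

    // One registration: the event class of interest plus the callback.
    private static final class Subscription {
        final Class<?> channel;
        final Consumer<Object> handler;

        Subscription(Class<?> channel, Consumer<Object> handler) {
            this.channel = channel;
            this.handler = handler;
        }
    }

    private final List<Subscription> subscriptions = new ArrayList<>();

    /** Registers a callback for every published event that is an instance of channel. */
    <T> void subscribe(Class<T> channel, Consumer<? super T> handler) {
        subscriptions.add(new Subscription(channel, event -> handler.accept(channel.cast(event))));
    }

    /** Delivers the event to every subscriber whose channel matches its type. */
    void publish(Object event) {
        for (Subscription s : subscriptions) {
            if (s.channel.isInstance(event)) {
                s.handler.accept(event);
            }
        }
    }

    public static void main(String[] args) {
        TinyEventStream stream = new TinyEventStream();
        // A stand-in for a logger that prints whatever it receives
        stream.subscribe(String.class, msg -> System.out.println("log: " + msg));
        stream.publish("QuoteResponse(Moderation is for cowards)");
        stream.publish(42); // ignored: nobody subscribed to Integer events
    }
}
```

The publisher never knows who is listening, which is why the logging backend can be swapped through configuration without touching the actor code.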
In case of logging messages, all log messages would be published to the EventStream. By default, the Actor that subscribes to these messages is the DefaultLogger which simply prints the message to the standard output.

    class DefaultLogger extends Actor with StdOutLogger {
      override def receive: Receive = {
        ...
        case event: LogEvent ⇒ print(event)
      }
    }

So, that’s the reason we see the log message written to the console when we kick off the StudentSimulatorApp. That said, EventStream isn’t suited only for logging. It is a general purpose publish-subscribe mechanism available inside the ActorWorld inside a VM (more on that later).

Back to SLF4J setup:

2. Configure Akka to use SLF4J

    akka {
      loggers = ["akka.event.slf4j.Slf4jLogger"]
      loglevel = "DEBUG"
      logging-filter = "akka.event.slf4j.Slf4jLoggingFilter"
    }

We store this information in a file called application.conf which should be in your classpath. In our sbt folder structure, we would throw this in the main/resources directory. From the configuration, we could derive that:

- the loggers property indicates the Actor that is going to subscribe to the log events. What Slf4jLogger does is to simply consume the log messages and delegate them to the SLF4J Logger facade.
- the loglevel property simply indicates the minimum level that should be considered for logging.
- the logging-filter compares the currently configured loglevel with the incoming log message level and chucks out any log message below the configured loglevel before publishing to the EventStream.

But why didn’t we have an application.conf for the previous example? Simply because Akka provides some sane defaults so that we needn’t build a configuration file before we start playing with it. We’ll revisit this file often from here on to customize various things. There are a whole bunch of awesome parameters that you could use inside the application.conf for logging alone. They are explained in detail here.

3. Throw in a logback.xml

We’ll be configuring an SLF4J logger backed by logback now.

    <?xml version="1.0" encoding="UTF-8"?>
    <configuration>
      <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
        <encoder>
          <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
        </encoder>

        <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
          <fileNamePattern>logs\akka.%d{yyyy-MM-dd}.%i.log</fileNamePattern>
          <timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
            <maxFileSize>50MB</maxFileSize>
          </timeBasedFileNamingAndTriggeringPolicy>
        </rollingPolicy>
      </appender>

      <root level="DEBUG">
        <appender-ref ref="FILE" />
      </root>
    </configuration>

I threw this inside the main/resources folder too along with application.conf. Please ensure that main/resources is now in your Eclipse or other IDE’s classpath. Also add logback and slf4j-api to your build.sbt. And when we kick off our StudentSimulatorApp and send a message to our new TeacherLogActor, the message is written to the akkaxxxxx.log file that we configured.

Testing Akka

Please note that this is by no means an exhaustive coverage of Testing Akka. We will build our tests on more features of the testing toolkit in the following parts, under their respective topic headers. These testcases are aimed at covering the Actors we wrote earlier. While the StudentSimulatorApp does what we need, you would agree that it should be driven out of testcases. To ease the testing pain, Akka came up with an amazing testing toolkit with which we could do some magical stuff like probing directly into the Actor implementation’s internals. Enough talk, let’s see the testcases. Let’s first try to map the StudentSimulatorApp to a Testcase. Let’s look at the declaration alone now.
    class TeacherPreTest extends TestKit(ActorSystem("UniversityMessageSystem"))
      with WordSpecLike
      with MustMatchers
      with BeforeAndAfterAll {

So, from the definition of the TestCase class we see that:

- The TestKit class accepts an ActorSystem through which we would be creating Actors. Internally, the TestKit decorates the ActorSystem and replaces the default dispatcher too.
- We use WordSpec which is one of the many fun ways to write testcases with ScalaTest.
- The MustMatchers provide convenient methods to make the testcase look like natural language.
- We mix in BeforeAndAfterAll to shut down the ActorSystem after the testcases are complete. The afterAll method that the trait provides is more like our tearDown in JUnit.

1, 2 – Sending message to Actors

The first testcase just sends a message to the PrintActor. It doesn’t assert anything! The second case sends a message to the Log actor which uses the log field of ActorLogging to publish the message to the EventStream. This doesn’t assert anything either!

    //1. Sends message to the Print Actor. Not even a testcase actually
    "A teacher" must {

      "print a quote when a QuoteRequest message is sent" in {

        val teacherRef = TestActorRef[TeacherActor]
        teacherRef ! QuoteRequest
      }
    }

    //2. Sends message to the Log Actor. Again, not a testcase per se
    "A teacher with ActorLogging" must {

      "log a quote when a QuoteRequest message is sent" in {

        val teacherRef = TestActorRef[TeacherLogActor]
        teacherRef ! QuoteRequest
      }

3 – Asserting internal state of Actors

The third case uses the underlyingActor method of the TestActorRef and calls the quoteList method of the TeacherLogActor. The quoteList method returns the list of quotes. We use this list to assert its size. If the reference to quoteList throws you off, refer to the TeacherLogActor code listed above and look for:

    //From TeacherLogActor
    //We'll cover the purpose of this method in the Testing section
    def quoteList = quotes

    //3. Asserts the internal State of the Log Actor.
    "have a quote list of size 4" in {

      val teacherRef = TestActorRef[TeacherLogActor]
      teacherRef.underlyingActor.quoteList must have size (4)
    }

4 – Asserting log messages

As we discussed earlier in the EventStream and Logging section (above), all log messages go to the EventStream, and the Slf4jLogger subscribes to it and uses its appenders to write to the log file, console etc. Wouldn’t it be nice to subscribe to the EventStream directly in our testcase and assert the presence of the log message itself? Looks like we can do that too. This involves two steps:

1. You need to add an extra configuration to your TestKit like so:

    class TeacherTest extends TestKit(ActorSystem("UniversityMessageSystem", ConfigFactory.parseString("""akka.loggers = ["akka.testkit.TestEventListener"]""")))
      with WordSpecLike
      with MustMatchers
      with BeforeAndAfterAll {

2. Now that we have a subscription to the EventStream, we could assert it from our testcase as:

    //4. Verifying log messages from eventStream
    "be verifiable via EventFilter in response to a QuoteRequest that is sent" in {

      val teacherRef = TestActorRef[TeacherLogActor]
      EventFilter.info(pattern = "QuoteResponse*", occurrences = 1) intercept {
        teacherRef ! QuoteRequest
      }
    }

The EventFilter.info block just intercepts 1 log message which starts with QuoteResponse (pattern = "QuoteResponse*"). (You could also achieve it by using start = "QuoteResponse".) If there is no log message as a result of sending a message to the TeacherLogActor, the testcase would fail.

5 – Testing Actors with constructor parameters

Please note that the way we create Actors in the testcase is via TestActorRef[TeacherLogActor] and not via system.actorOf. This is just so that we could get access to the Actor’s internals through the underlyingActor method in the TestActorRef. We wouldn’t be able to achieve this via the ActorRef that we have access to during the regular runtime.
(That doesn’t give us any excuse to use TestActorRef in production. You’ll be hunted down.) If the Actor accepts parameters, then the way we create the TestActorRef would be like:

    val teacherRef = TestActorRef(new TeacherLogParameterActor(quotes))

The entire testcase would then look something like:

    //5. have a quote list of the same size as the input parameter
    " have a quote list of the same size as the input parameter" in {

      val quotes = List(
        "Moderation is for cowards",
        "Anything worth doing is worth overdoing",
        "The trouble is you think you have time",
        "You never gonna know if you never even try")

      val teacherRef = TestActorRef(new TeacherLogParameterActor(quotes))
      //val teacherRef = TestActorRef(Props(new TeacherLogParameterActor(quotes)))

      teacherRef.underlyingActor.quoteList must have size (4)
      EventFilter.info(pattern = "QuoteResponse*", occurrences = 1) intercept {
        teacherRef ! QuoteRequest
      }
    }

Shutting down ActorSystem

And finally, the afterAll lifecycle method:

    override def afterAll() {
      super.afterAll()
      system.shutdown()
    }

Code

As always, the entire project could be downloaded from github here.

Reference: Akka Notes – Actor Logging and Testing from our JCG partner Arun Manivannan at the Rerun.me blog.

Visualizing engineering fields

Civil engineering.
Mechanical engineering.
Electronic engineering.
Software engineering.

Summary. Seriously, software engineering?

Photo credit attribution. Image Clemuel Ricketts House drawing 1 courtesy of wikimedia. Image Bequet-Ribault House Transverse Section with Details courtesy of wikimedia. Image Grand Central Terminal courtesy of wikimedia. Image Jackhammer_blow courtesy of wikimedia. Image OMS Pod schematic courtesy of wikimedia. Image Nikon f-mount courtesy of wikimedia. Image Kidule 2550 courtesy of wikimedia. Image AT89C55WD courtesy of wikimedia. Image WPEVCContactorCharge2B courtesy of wikimedia.

Reference: Visualizing engineering fields from our JCG partner Edmund Kirwan at the A blog about software. blog.

R: A first attempt at linear regression

I’ve been working through the videos that accompany the Introduction to Statistical Learning with Applications in R book and thought it’d be interesting to try out the linear regression algorithm against my meetup data set. I wanted to see how well a linear regression algorithm could predict how many people were likely to RSVP to a particular event. I started with the following code to build a data frame containing some potential predictors:         library(RNeo4j) officeEventsQuery = "MATCH (g:Group {name: \"Neo4j - London User Group\"})-[:HOSTED_EVENT]->(event)<-[:TO]-({response: 'yes'})<-[:RSVPD]-(), (event)-[:HELD_AT]->(venue) WHERE (event.time + event.utc_offset) < timestamp() AND venue.name IN [\"Neo Technology\", \"OpenCredo\"] RETURN event.time + event.utc_offset AS eventTime,event.announced_at AS announcedAt, event.name, COUNT(*) AS rsvps"   events = subset(cypher(graph, officeEventsQuery), !is.na(announcedAt)) events$eventTime <- timestampToDate(events$eventTime) events$day <- format(events$eventTime, "%A") events$monthYear <- format(events$eventTime, "%m-%Y") events$month <- format(events$eventTime, "%m") events$year <- format(events$eventTime, "%Y") events$announcedAt<- timestampToDate(events$announcedAt) events$timeDiff = as.numeric(events$eventTime - events$announcedAt, units = "days") If we preview ‘events’ it contains the following columns: > head(events) eventTime announcedAt event.name rsvps day monthYear month year timeDiff 1 2013-01-29 18:00:00 2012-11-30 11:30:57 Intro to Graphs 24 Tuesday 01-2013 01 2013 60.270174 2 2014-06-24 18:30:00 2014-06-18 19:11:19 Intro to Graphs 43 Tuesday 06-2014 06 2014 5.971308 3 2014-06-18 18:30:00 2014-06-08 07:03:13 Neo4j World Cup Hackathon 24 Wednesday 06-2014 06 2014 10.476933 4 2014-05-20 18:30:00 2014-05-14 18:56:06 Intro to Graphs 53 Tuesday 05-2014 05 2014 5.981875 5 2014-02-11 18:00:00 2014-02-05 19:11:03 Intro to Graphs 35 Tuesday 02-2014 02 2014 5.950660 6 2014-09-04 18:30:00 2014-08-26 06:34:01 Hands 
On Intro to Cypher - Neo4j's Query Language 20 Thursday 09-2014 09 2014 9.497211 We want to predict ‘rsvps’ from the other columns so I started off by creating a linear model which took all the other columns into account: > summary(lm(rsvps ~., data = events))   Call: lm(formula = rsvps ~ ., data = events)   Residuals: Min 1Q Median 3Q Max -8.2582 -1.1538 0.0000 0.4158 10.5803   Coefficients: (14 not defined because of singularities) Estimate Std. Error t value Pr(>|t|) (Intercept) -9.365e+03 3.009e+03 -3.113 0.00897 ** eventTime 3.609e-06 2.951e-06 1.223 0.24479 announcedAt 3.278e-06 2.553e-06 1.284 0.22339 event.nameGraph Modelling - Do's and Don'ts 4.884e+01 1.140e+01 4.286 0.00106 ** event.nameHands on build your first Neo4j app for Java developers 3.735e+01 1.048e+01 3.562 0.00391 ** event.nameHands On Intro to Cypher - Neo4j's Query Language 2.560e+01 9.713e+00 2.635 0.02177 * event.nameIntro to Graphs 2.238e+01 8.726e+00 2.564 0.02480 * event.nameIntroduction to Graph Database Modeling -1.304e+02 4.835e+01 -2.696 0.01946 * event.nameLunch with Neo4j's CEO, Emil Eifrem 3.920e+01 1.113e+01 3.523 0.00420 ** event.nameNeo4j Clojure Hackathon -3.063e+00 1.195e+01 -0.256 0.80203 event.nameNeo4j Python Hackathon with py2neo's Nigel Small 2.128e+01 1.070e+01 1.989 0.06998 . event.nameNeo4j World Cup Hackathon 5.004e+00 9.622e+00 0.520 0.61251 dayTuesday 2.068e+01 5.626e+00 3.676 0.00317 ** dayWednesday 2.300e+01 5.522e+00 4.165 0.00131 ** monthYear01-2014 -2.350e+02 7.377e+01 -3.185 0.00784 ** monthYear02-2013 -2.526e+01 1.376e+01 -1.836 0.09130 . 
monthYear02-2014 -2.325e+02 7.763e+01 -2.995 0.01118 * monthYear03-2013 -4.605e+01 1.683e+01 -2.736 0.01805 * monthYear03-2014 -2.371e+02 8.324e+01 -2.848 0.01468 * monthYear04-2013 -6.570e+01 2.309e+01 -2.845 0.01477 * monthYear04-2014 -2.535e+02 8.746e+01 -2.899 0.01336 * monthYear05-2013 -8.672e+01 2.845e+01 -3.049 0.01011 * monthYear05-2014 -2.802e+02 9.420e+01 -2.975 0.01160 * monthYear06-2013 -1.022e+02 3.283e+01 -3.113 0.00897 ** monthYear06-2014 -2.996e+02 1.003e+02 -2.988 0.01132 * monthYear07-2014 -3.123e+02 1.054e+02 -2.965 0.01182 * monthYear08-2013 -1.326e+02 4.323e+01 -3.067 0.00976 ** monthYear08-2014 -3.060e+02 1.107e+02 -2.763 0.01718 * monthYear09-2013 NA NA NA NA monthYear09-2014 -3.465e+02 1.164e+02 -2.976 0.01158 * monthYear10-2012 2.602e+01 1.959e+01 1.328 0.20886 monthYear10-2013 -1.728e+02 5.678e+01 -3.044 0.01020 * monthYear11-2012 2.717e+01 1.509e+01 1.800 0.09704 . month02 NA NA NA NA month03 NA NA NA NA month04 NA NA NA NA month05 NA NA NA NA month06 NA NA NA NA month07 NA NA NA NA month08 NA NA NA NA month09 NA NA NA NA month10 NA NA NA NA month11 NA NA NA NA year2013 NA NA NA NA year2014 NA NA NA NA timeDiff NA NA NA NA --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1   Residual standard error: 5.287 on 12 degrees of freedom Multiple R-squared: 0.9585, Adjusted R-squared: 0.8512 F-statistic: 8.934 on 31 and 12 DF, p-value: 0.0001399

As I understand it we can look at the R-squared value to understand how much of the variance in the data has been explained by the model. The multiple R-squared is 96%, but once it is adjusted for the number of predictors it drops to 85%.
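The gap between the two R-squared figures in the summary comes from a penalty for the number of predictors: adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1). A minimal sketch of that adjustment follows, written in Java purely for illustration; the class and method names are hypothetical, and n = 44 is inferred from the F-statistic line (31 estimated predictors + 12 residual degrees of freedom + 1 for the intercept).

```java
// Adjusted R-squared from the multiple R-squared, the number of
// observations n and the number of estimated predictors p.
final class AdjustedRSquared {

    static double adjusted(double rSquared, int n, int p) {
        int residualDf = n - p - 1;
        if (residualDf <= 0) {
            throw new IllegalArgumentException("need more observations than estimated coefficients");
        }
        return 1.0 - (1.0 - rSquared) * (n - 1) / (double) residualDf;
    }

    public static void main(String[] args) {
        // Values taken from the summary output: R-squared 0.9585, 31 and 12 DF
        System.out.printf("%.4f%n", adjusted(0.9585, 44, 31));
    }
}
```

This prints a value close to the 0.8512 reported by R; the small difference comes from starting with the rounded R-squared. With only 12 residual degrees of freedom the penalty is large, which is why the adjusted figure is the safer one to quote.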
A lot of the coefficients seem to be based around specific event names which seems a bit too specific to me so I wanted to see what would happen if I derived a feature which indicated whether a session was practical: events$practical = grepl("Hackathon|Hands on|Hands On", events$event.name) We can now run the model again with the new column having excluded ‘event.name’ field: > summary(lm(rsvps ~., data = subset(events, select = -c(event.name))))   Call: lm(formula = rsvps ~ ., data = subset(events, select = -c(event.name)))   Residuals: Min 1Q Median 3Q Max -18.647 -2.311 0.000 2.908 23.218   Coefficients: (13 not defined because of singularities) Estimate Std. Error t value Pr(>|t|) (Intercept) -3.980e+03 4.752e+03 -0.838 0.4127 eventTime 2.907e-06 3.873e-06 0.751 0.4621 announcedAt 3.336e-08 3.559e-06 0.009 0.9926 dayTuesday 7.547e+00 6.080e+00 1.241 0.2296 dayWednesday 2.442e+00 7.046e+00 0.347 0.7327 monthYear01-2014 -9.562e+01 1.187e+02 -0.806 0.4303 monthYear02-2013 -4.230e+00 2.289e+01 -0.185 0.8553 monthYear02-2014 -9.156e+01 1.254e+02 -0.730 0.4742 monthYear03-2013 -1.633e+01 2.808e+01 -0.582 0.5676 monthYear03-2014 -8.094e+01 1.329e+02 -0.609 0.5496 monthYear04-2013 -2.249e+01 3.785e+01 -0.594 0.5595 monthYear04-2014 -9.230e+01 1.401e+02 -0.659 0.5180 monthYear05-2013 -3.237e+01 4.654e+01 -0.696 0.4952 monthYear05-2014 -1.015e+02 1.509e+02 -0.673 0.5092 monthYear06-2013 -3.947e+01 5.355e+01 -0.737 0.4701 monthYear06-2014 -1.081e+02 1.604e+02 -0.674 0.5084 monthYear07-2014 -1.110e+02 1.678e+02 -0.661 0.5163 monthYear08-2013 -5.144e+01 6.988e+01 -0.736 0.4706 monthYear08-2014 -1.023e+02 1.784e+02 -0.573 0.5731 monthYear09-2013 -6.057e+01 7.893e+01 -0.767 0.4523 monthYear09-2014 -1.260e+02 1.874e+02 -0.672 0.5094 monthYear10-2012 9.557e+00 2.873e+01 0.333 0.7430 monthYear10-2013 -6.450e+01 9.169e+01 -0.703 0.4903 monthYear11-2012 1.689e+01 2.316e+01 0.729 0.4748 month02 NA NA NA NA month03 NA NA NA NA month04 NA NA NA NA month05 NA NA NA NA month06 NA NA NA 
NA month07 NA NA NA NA month08 NA NA NA NA month09 NA NA NA NA month10 NA NA NA NA month11 NA NA NA NA year2013 NA NA NA NA year2014 NA NA NA NA timeDiff NA NA NA NA practicalTRUE -9.388e+00 5.289e+00 -1.775 0.0919 . --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1   Residual standard error: 10.21 on 19 degrees of freedom Multiple R-squared: 0.7546, Adjusted R-squared: 0.4446 F-statistic: 2.434 on 24 and 19 DF, p-value: 0.02592 Now we’re only accounting for 44% of the variance and none of our coefficients are significant so this wasn’t such a good change. I also noticed that we’ve got a bit of overlap in the date related features – we’ve got one column for monthYear and then separate ones for month and year. Let’s strip out the combined one: > summary(lm(rsvps ~., data = subset(events, select = -c(event.name, monthYear))))   Call: lm(formula = rsvps ~ ., data = subset(events, select = -c(event.name, monthYear)))   Residuals: Min 1Q Median 3Q Max -16.5745 -4.0507 -0.1042 3.6586 24.4715   Coefficients: (1 not defined because of singularities) Estimate Std. Error t value Pr(>|t|) (Intercept) -1.573e+03 4.315e+03 -0.364 0.7185 eventTime 3.320e-06 3.434e-06 0.967 0.3425 announcedAt -2.149e-06 2.201e-06 -0.976 0.3379 dayTuesday 4.713e+00 5.871e+00 0.803 0.4294 dayWednesday -2.253e-01 6.685e+00 -0.034 0.9734 month02 3.164e+00 1.285e+01 0.246 0.8075 month03 1.127e+01 1.858e+01 0.607 0.5494 month04 4.148e+00 2.581e+01 0.161 0.8736 month05 1.979e+00 3.425e+01 0.058 0.9544 month06 -1.220e-01 4.271e+01 -0.003 0.9977 month07 1.671e+00 4.955e+01 0.034 0.9734 month08 8.849e+00 5.940e+01 0.149 0.8827 month09 -5.496e+00 6.782e+01 -0.081 0.9360 month10 -5.066e+00 7.893e+01 -0.064 0.9493 month11 4.255e+00 8.697e+01 0.049 0.9614 year2013 -1.799e+01 1.032e+02 -0.174 0.8629 year2014 -3.281e+01 2.045e+02 -0.160 0.8738 timeDiff NA NA NA NA practicalTRUE -9.816e+00 5.084e+00 -1.931 0.0645 . --- Signif. 
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1   Residual standard error: 10.19 on 26 degrees of freedom Multiple R-squared: 0.666, Adjusted R-squared: 0.4476 F-statistic: 3.049 on 17 and 26 DF, p-value: 0.005187

Again none of the coefficients are statistically significant which is disappointing. I think the main problem may be that I have very few data points (only 44, going by the degrees of freedom above), making it difficult to come up with a general model. I think my next step is to look for some other features that could impact the number of RSVPs, e.g. other events on that day or the weather. I’m a novice at this but trying to learn more, so if you have any ideas of what I should do next please let me know.

Reference: R: A first attempt at linear regression from our JCG partner Mark Needham at the Mark Needham Blog blog.

How To Get a Job in a Different City

There are subtle nuances to job searches outside of the local area. Unless a candidate is considered superlative, non-local applicants are not always given the same level of attention as locals when employers have healthy candidate pools with local applicants. Why might remoteness impact interview decisions (even in a tight market), and how can the potential for negative bias be minimized? We’ll get to that in a minute. Before we can apply for a job, we need to find it.

Finding jobs

Job sites – The usual suspects are where some people start, and those jobs will have multiple applicants. Googling to find regional job sites may help find companies that fly under the radar.

LinkedIn – The Jobs tab can create a search for new posts, but everybody may use that strategy. Try an Advanced People Search using one or more of the technologies or skills (in keywords box) that might be used by an attractive employer, and enter a zip code and mileage range using the desired location. Note both the current and past employers for the profiles, then research those firms.

Remote networking – Reaching out directly to some of the profiles found during the LinkedIn search will produce leads. Many fellow technologists will respond to messages stating a desire to move to their area. Finding a local recruiter on LinkedIn or via web search may bring several opportunities.

User groups and meetups – Some user group sites have job ads, and sponsoring firms usually have a local presence. Speakers from past meetings often live locally. User group leaders are often contacted by recruiters and hiring companies that are looking for talent, so contacting group leaders directly and asking “Who is hiring?” should be helpful.

or let the jobs find you… – Change the location field on a LinkedIn profile to the desired location and add language indicating an interest in new opportunities, and companies and agencies from that location may start knocking.
Applying for jobs

Now that the jobs are identified, initial contact must be made. This is where things can get complicated. Recruiters and HR professionals are tasked with looking at résumés and any accompanying material in order to make a reasonably quick yes/no decision on an initial interview. Screeners know an interview process is time consuming, and the decision to start that process will usually take valuable time from several employees of the organization. There are several factors that go into this decision, with the candidate’s qualifications being the most important and obvious. Another factor is the recruiter’s assessment regarding the likelihood that a candidate would accept the job if offered, which is based on any obvious or assumed barriers. Details such as candidate compensation requirements in relation to company pay ranges or current job title in relation to vacant job title may play a role in the decision. Is someone making 150K likely to accept our job paying 110K? Is a Chief Architect likely to accept our Intermediate Developer position? And generally speaking, is this person likely to accept a job in another location?

For exceptional candidates these questions are irrelevant, as they will be screened. But if a candidate barely meets the minimum requirements, has a couple additional flags, and happens to be non-local, will the employer even bother screening the candidate? Should they? Without additional context, it may be assumed that a recent graduate in the midwest that applies to a job in New York City is probably shipping résumés to Silicon Valley, Chicago, or Seattle. The HR rep could believe that they are competing with many companies across several markets, each with its own reputation, characteristics, and cost of living. How likely is it that this candidate will not only choose our market, but also choose our company? How can we lessen the impact of these assumptions and potential biases?
Mention location – When location isn’t mentioned by non-local applicants and no other information is given, the screener is likely to get the impression that this candidate is indiscriminately applying to positions. An applicant’s non-local address is the elephant in the room, so it is vital to reference that immediately in a cover letter. If a future address is known, it should be listed on the résumé along with the current address. Keep in mind that the screener may open a résumé before reading any accompanying material. When there is a specific reason for relocating to this location, such as a family situation or a spouse’s job relocation, that information will be additional evidence of intent.

Availability for interviews – Listing available dates for on-site interviews demonstrates at least some level of commitment to the new location. Screeners interpret this as a buying sign.

Availability for start – Candidates that relocate for positions may have to sell their home, sublet an apartment, or have children in the middle of a school year. A mention of start date helps to set expectations early.

Additional considerations

Cost of living and salary – Some ads request salary history and compensation expectations. Be sure to research salaries and market values in the new city, and state that committing to a future salary figure is difficult until all of the data is collected.

Relocation assistance – Companies may be willing to provide some relocation assistance even for candidates who are planning a move. Requesting a relo package in an application adds a potential reason for rejection, but negotiating relo money during the offer process is common. Since it is a one-time cost, companies may be more willing to provide relo if negotiations on salary or benefits become sticky.

Consider the overall market – Before committing to an opportunity in another city, research employment prospects beyond the target company.
How healthy is the job market, and how many other local companies have specific demand for the same skills? A strong local tech market does not always indicate a strong market for certain specialties.Reference: How To Get a Job in a Different City from our JCG partner Dave Fecak at the Job Tips For Geeks blog....

Neo4j: Generic/Vague relationship names

An approach to modelling that I often see while working with Neo4j users is creating very generic relationships (e.g. HAS, CONTAINS, IS) and filtering on a relationship property or on a property/label at the end node.

Intuitively this doesn’t seem to make best use of the graph model as it means that you have to evaluate many relationships and nodes that you’re not interested in. However, I’ve never actually tested the performance differences between the approaches so I thought I’d try it out.

I created 4 different databases which had one node with 60,000 outgoing relationships – 10,000 which we wanted to retrieve and 50,000 that were irrelevant. I modelled the ‘relationship’ in 4 different ways…

Using a specific relationship type
(node)-[:HAS_ADDRESS]->(address)

Using a generic relationship type and then filtering by end node label
(node)-[:HAS]->(address:Address)

Using a generic relationship type and then filtering by relationship property
(node)-[:HAS {type: "address"}]->(address)

Using a generic relationship type and then filtering by end node property
(node)-[:HAS]->(address {type: "address"})

…and then measured how long it took to retrieve the ‘has address’ relationships.

The code is on GitHub if you want to take a look.

Although it’s obviously not as precise as a JMH micro benchmark I think it’s good enough to get a feel for the difference between the approaches.
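As a rough illustration of that kind of measurement, the sketch below times a task repeatedly and summarises the samples with percentiles. It is not the code from the GitHub repository: the class name `PercentileTimer`, the stand-in workload and the linear-interpolation percentile are all assumptions made for the example.

```java
import java.util.Arrays;

class PercentileTimer {

    /** Runs the task the given number of times; returns elapsed times in ms, sorted ascending. */
    static double[] time(Runnable task, int runs) {
        double[] elapsedMs = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            elapsedMs[i] = (System.nanoTime() - start) / 1_000_000.0;
        }
        Arrays.sort(elapsedMs);
        return elapsedMs;
    }

    /**
     * Percentile by linear interpolation between the two closest ranks.
     * Assumption: the original benchmark may compute percentiles differently.
     */
    static double percentile(double[] sortedAscending, double p) {
        if (sortedAscending.length == 0 || p < 0 || p > 100) {
            throw new IllegalArgumentException("need at least one sample and 0 <= p <= 100");
        }
        double rank = p / 100.0 * (sortedAscending.length - 1);
        int lower = (int) Math.floor(rank);
        int upper = (int) Math.ceil(rank);
        return sortedAscending[lower]
                + (rank - lower) * (sortedAscending[upper] - sortedAscending[lower]);
    }

    public static void main(String[] args) {
        // Stand-in workload; a real run would execute the query against the database here.
        Runnable task = () -> {
            long sum = 0;
            for (int i = 0; i < 60_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unreachable");
            }
        };
        double[] samples = time(task, 100);
        for (double p : new double[] {50, 75, 99}) {
            System.out.println(p + "%ile: " + percentile(samples, p));
        }
    }
}
```

Under this interpolation scheme the 99th percentile of 100 samples sits between the two slowest runs, so it is far more sensitive to outliers than the 50th or 75th.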
I ran a query against each database 100 times and then took the 50th, 75th and 99th percentiles (times are in ms):

Using a generic relationship type and then filtering by end node label
50%ile: 6.0
75%ile: 6.0
99%ile: 402.60999999999825

Using a generic relationship type and then filtering by relationship property
50%ile: 21.0
75%ile: 22.0
99%ile: 504.85999999999785

Using a generic relationship type and then filtering by end node label
50%ile: 4.0
75%ile: 4.0
99%ile: 145.65999999999931

Using a specific relationship type
50%ile: 0.0
75%ile: 1.0
99%ile: 25.749999999999872

We can drill further into why there’s a difference in the times for each of the approaches by profiling the equivalent cypher query. We’ll start with the one which uses a specific relationship name:

Using a specific relationship type

neo4j-sh (?)$ profile match (n) where id(n) = 0 match (n)-[:HAS_ADDRESS]->() return count(n);
+----------+
| count(n) |
+----------+
| 10000    |
+----------+
1 row

ColumnFilter
  |
  +EagerAggregation
    |
    +SimplePatternMatcher
      |
      +NodeByIdOrEmpty

+----------------------+-------+--------+-----------------------------+-----------------------+
| Operator             | Rows  | DbHits | Identifiers                 | Other                 |
+----------------------+-------+--------+-----------------------------+-----------------------+
| ColumnFilter         | 1     | 0      |                             | keep columns count(n) |
| EagerAggregation     | 1     | 0      |                             |                       |
| SimplePatternMatcher | 10000 | 10000  | n, UNNAMED53, UNNAMED35     |                       |
| NodeByIdOrEmpty      | 1     | 1      | n, n                        | { AUTOINT0}           |
+----------------------+-------+--------+-----------------------------+-----------------------+

Total database accesses: 10001

Here we can see that there were 10,001 database accesses in order to get a count of our 10,000 HAS_ADDRESS relationships. We get a database access each time we load a node, relationship or property.
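To make that accounting concrete, here is a toy in-memory model, not Neo4j's actual storage or profiler. It charges one hit for loading the start node and one hit for each relationship expanded. The class `DbHitModel`, the grouping of relationships by type, and the `HAS_OTHER` type and `Other` label used for the 50,000 irrelevant relationships are all hypothetical. The model ignores any extra hits a filter step would add, so it only shows how many relationships each model forces the query to touch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DbHitModel {
    // Outgoing relationships of a single node, grouped by relationship type.
    // Each list entry is the label of the end node (hypothetical toy layout).
    private final Map<String, List<String>> endLabelsByType = new HashMap<>();
    private long hits;

    void addRelationships(String type, String endNodeLabel, int howMany) {
        List<String> labels = endLabelsByType.computeIfAbsent(type, k -> new ArrayList<>());
        for (int i = 0; i < howMany; i++) {
            labels.add(endNodeLabel);
        }
    }

    /** Counts relationships of one type; a non-null label keeps only matching end nodes. */
    int count(String type, String requiredEndLabel) {
        hits = 1; // one hit for loading the start node
        int matches = 0;
        for (String label : endLabelsByType.getOrDefault(type, Collections.emptyList())) {
            hits++; // one hit per relationship expanded, wanted or not
            if (requiredEndLabel == null || requiredEndLabel.equals(label)) {
                matches++;
            }
        }
        return matches;
    }

    /** Hits charged by the most recent call to count(). */
    long hits() {
        return hits;
    }

    /** 10,000 wanted relationships with their own type, 50,000 irrelevant ones with another. */
    static DbHitModel specificModel() {
        DbHitModel model = new DbHitModel();
        model.addRelationships("HAS_ADDRESS", "Address", 10_000);
        model.addRelationships("HAS_OTHER", "Other", 50_000);
        return model;
    }

    /** The same 60,000 relationships, all sharing the generic HAS type. */
    static DbHitModel genericModel() {
        DbHitModel model = new DbHitModel();
        model.addRelationships("HAS", "Address", 10_000);
        model.addRelationships("HAS", "Other", 50_000);
        return model;
    }

    public static void main(String[] args) {
        DbHitModel specific = specificModel();
        int specificMatches = specific.count("HAS_ADDRESS", null);
        System.out.println("specific: " + specificMatches + " matches, " + specific.hits() + " hits");

        DbHitModel generic = genericModel();
        int genericMatches = generic.count("HAS", "Address");
        System.out.println("generic:  " + genericMatches + " matches, " + generic.hits() + " hits");
    }
}
```

In this model the specific type costs 10,001 hits, which agrees with the "Total database accesses" figure in the profile above, while the generic type costs 60,001 hits for the same 10,000 matches before any filtering cost is counted.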
By contrast the other approaches have to load in a lot more data only to then filter it out:

Using a generic relationship type and then filtering by end node label

neo4j-sh (?)$ profile match (n) where id(n) = 0 match (n)-[:HAS]->(:Address) return count(n);
+----------+
| count(n) |
+----------+
| 10000    |
+----------+
1 row

ColumnFilter
  |
  +EagerAggregation
    |
    +Filter
      |
      +SimplePatternMatcher
        |
        +NodeByIdOrEmpty

+----------------------+-------+--------+-----------------------------+----------------------------------+
| Operator             | Rows  | DbHits | Identifiers                 | Other                            |
+----------------------+-------+--------+-----------------------------+----------------------------------+
| ColumnFilter         | 1     | 0      |                             | keep columns count(n)            |
| EagerAggregation     | 1     | 0      |                             |                                  |
| Filter               | 10000 | 10000  |                             | hasLabel( UNNAMED45:Address(0))  |
| SimplePatternMatcher | 10000 | 60000  | n, UNNAMED45, UNNAMED35     |                                  |
| NodeByIdOrEmpty      | 1     | 1      | n, n                        | { AUTOINT0}                      |
+----------------------+-------+--------+-----------------------------+----------------------------------+

Total database accesses: 70001

Using a generic relationship type and then filtering by relationship property

neo4j-sh (?)$ profile match (n) where id(n) = 0 match (n)-[:HAS {type: "address"}]->() return count(n);
+----------+
| count(n) |
+----------+
| 10000    |
+----------+
1 row

ColumnFilter
  |
  +EagerAggregation
    |
    +Filter
      |
      +SimplePatternMatcher
        |
        +NodeByIdOrEmpty

+----------------------+-------+--------+-----------------------------+--------------------------------------------------+
| Operator             | Rows  | DbHits | Identifiers                 | Other                                            |
+----------------------+-------+--------+-----------------------------+--------------------------------------------------+
| ColumnFilter         | 1     | 0      |                             | keep columns count(n)                            |
| EagerAggregation     | 1     | 0      |                             |                                                  |
| Filter               | 10000 | 20000  |                             | Property( UNNAMED35,type(0)) == { AUTOSTRING1}   |
| SimplePatternMatcher | 10000 | 120000 | n, UNNAMED63, UNNAMED35     |                                                  |
| NodeByIdOrEmpty      | 1     | 1      | n, n                        | { AUTOINT0}                                      |
+----------------------+-------+--------+-----------------------------+--------------------------------------------------+

Total database accesses: 140001

Using a generic relationship type and then filtering by end node property

neo4j-sh (?)$ profile match (n) where id(n) = 0 match (n)-[:HAS]->({type: "address"}) return count(n);
+----------+
| count(n) |
+----------+
| 10000    |
+----------+
1 row

ColumnFilter
  |
  +EagerAggregation
    |
    +Filter
      |
      +SimplePatternMatcher
        |
        +NodeByIdOrEmpty

+----------------------+-------+--------+-----------------------------+--------------------------------------------------+
| Operator             | Rows  | DbHits | Identifiers                 | Other                                            |
+----------------------+-------+--------+-----------------------------+--------------------------------------------------+
| ColumnFilter         | 1     | 0      |                             | keep columns count(n)                            |
| EagerAggregation     | 1     | 0      |                             |                                                  |
| Filter               | 10000 | 20000  |                             | Property( UNNAMED45,type(0)) == { AUTOSTRING1}   |
| SimplePatternMatcher | 10000 | 120000 | n, UNNAMED45, UNNAMED35     |                                                  |
| NodeByIdOrEmpty      | 1     | 1      | n, n                        | { AUTOINT0}                                      |
+----------------------+-------+--------+-----------------------------+--------------------------------------------------+

Total database accesses: 140001

So in summary…specific relationships #ftw!

Reference: Neo4j: Generic/Vague relationship names from our JCG partner Mark Needham at the Mark Needham Blog blog.
Java Code Geeks and all content copyright © 2010-2015, Exelixis Media Ltd | Terms of Use | Privacy Policy | Contact
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.