The madness of layered architecture

I once visited a team that had fifteen layers in their code. That is: if you wanted to display some data from the database in a web page, that data passed through 15 classes in the application. What did these layers do? Oh, nothing much. They just copied data from one object to the next. Or sometimes the “access object layer” would perform a check that objects were valid. Or perhaps the check would be done in the “boundary object layer”. It varied, depending on which part of the application you looked at.

Puzzled (and somewhat annoyed), I asked the team why they had constructed their application this way. The answer was simple enough: they had been told to by the expensive consultant who had been hired to advise on the organization’s architecture. I asked the team what rationale the consultant had given. They just shrugged. Who knows?

Today, I often visit teams who have three to five layers in their code. When asked why, the response is usually the same: this is the advice they have been given, from a book, a video or a conference talk. And the rationale remains elusive or muddled at best.

Why do we construct layered applications?

There’s an ancient saying in the field of computing: any problem in computer science can be solved by adding a layer of indirection. Famously, this is the guiding principle behind our modern network stack. In web services, SOAP performs method calls on top of HTTP. HTTP sends requests and receives responses on top of TCP. TCP streams data in two directions on top of IP. IP routes packets of bits through a network on top of physical protocols like Ethernet. Ethernet broadcasts packets of bits with a destination address to all computers on a bus. Each layer performs a function that lets the higher layer abstract away complexities such as resending lost packets or routing packets through a globally interconnected network. This analogy is used to argue for layers in enterprise application architecture.
But enterprise applications are not like network protocols. In most enterprise applications, every layer operates at the same level of abstraction. To pick on a popular example: John Papa’s video on Single Page Applications uses the following layers on the server side (and a separate set on the client side): Controllers, UnitOfWork, Repository, Factories and EntityFramework. So, for example, the AttendanceRepository property in CodeCamperUnitOfWork returns an AttendanceRepository to the AttendanceController, which calls the GetBySessionId() method in the AttendanceRepository layer, which finally calls DbSet.Where(ps => ps.SessionId == sessionId) on EntityFramework. And then there’s the RepositoryFactories layer. Whee! And what does it all do? It filters an entity based on a parameter. Wat?! (A hint that this is going off the rails is that the discussion in the video presentation starts at the bottom and builds up to the controllers instead of working from the outside in.)

In a similar Java application, I have seen (and feel free to skip these tedious details) SpeakersController.findByConference call SpeakersService.findByConference, which calls SpeakersManager.findByConference, which calls SpeakersRepository.findByConference, which constructs a horrific JPQL query that nobody can understand. JPA returns an @Entity which is mapped to the database, and the Repository, or perhaps the Manager, Service or Controller, or perhaps two or three of these, will transform it from one Speaker class to another.

Why is this a problem?

The cost of code: A reasonable conjecture would be that the cost of developing and maintaining an application grows with the size of the application. Adding code without value is waste.

Single responsibility principle: In the above example, the SpeakerService will often contain all functionality associated with speakers.
So if adding a speaker requires you to select a conference from a drop-down list, the SpeakerService will often have a findAllConferences method, so that SpeakersController doesn’t also need a dependency on ConferenceService. However, this turns the classes into functionality magnets. The symptom is low cohesion: the methods of one class can be divided into distinct sets that are never used at the same time.

Dumb services: “Service” is a horrible name for a class, because a service is just a more or less coherent collection of functions. More meaningful names exist: a Repository is a service that stores and retrieves objects, a Query is a service that selects objects based on a criterion (actually it’s a command, not a service), a Gateway is a service that communicates with another system, and a ReportGenerator is a service that creates a report. It should be quite normal for a controller to have references to a repository, a report generator and a gateway if that controller fetches data from the database to generate a report for another system.

Multiple points of extension: If you have a controller that calls a service that calls a manager that calls a repository, and you want to add some validation that the object you are saving is consistent, where would you add it? How much would you be willing to bet that the other developers on the team would give the same answer? How much would you be willing to bet that you would give the same answer in a few months?

Catering to the least common denominator: In the conference application we have been playing with, DaysController creates and returns the days available for a conference. The functionality needed for DaysController is dead simple. TalksController, on the other hand, has a lot more functionality. Even though these controllers have vastly different needs, they both get the same (boring) set of classes: a Controller, a UnitOfWork, a Repository.
There is no reason the DaysController couldn’t use EntityFramework directly, other than the desire for consistency. Most applications have a few functional verticals that contain the meat of the application and a lot of small supporting verticals. Treating them the same only creates more work and more maintenance effort.

So how can you fix it?

The first thing you must do is to build your application from the outside in. If your job is to return a set of objects, with .NET EntityFramework you can access the DbSet directly: just inject IDbSet in your controller. With Java JPA, you probably want a Repository with a finder method to hide the JPQL madness. No service, manager, worker, or whatever is needed.

The second thing you must do is to grow your architecture. When you realize that there are more responsibilities in your controller than deciding what to do with a user request, you must extract new classes. You may, for example, need a PdfScheduleGenerator to create a printable schedule for your conference. If you’re using .NET Entity Framework, you may want to create some LINQ extension methods on e.g. IEnumerable (which is extended by IDbSet).

The third and most important thing you must do is to give your classes names that reflect their responsibilities. A service should not just be a place to dump a lot of methods.

Every problem in computer science can be solved by adding a layer of indirection, but most problems in software engineering can be solved by removing a misplaced layer. Let’s build leaner applications!

Reference: The madness of layered architecture from our JCG partner Johannes Brodwall at the Thinking Inside a Bigger Box blog.
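The “outside in” advice for the Java case could be sketched roughly as follows: a controller that depends directly on a repository with a single finder method, with no service or manager in between. Every name here (Speaker, SpeakerRepository, InMemorySpeakerRepository, SpeakersController, LeanControllerDemo) is hypothetical, and the in-memory repository is only an assumed stand-in for a JPA-backed one so that the sketch runs on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical domain object; a real application would map this with JPA.
class Speaker {
    final String name;
    final String conference;

    Speaker(String name, String conference) {
        this.name = name;
        this.conference = conference;
    }
}

// The repository hides the query details behind one finder method.
interface SpeakerRepository {
    List<Speaker> findByConference(String conference);
}

// In-memory stand-in for a JPA-backed repository (an assumption, for runnability).
class InMemorySpeakerRepository implements SpeakerRepository {
    private final List<Speaker> speakers = new ArrayList<>();

    void save(Speaker speaker) {
        speakers.add(speaker);
    }

    @Override
    public List<Speaker> findByConference(String conference) {
        return speakers.stream()
                .filter(s -> s.conference.equals(conference))
                .collect(Collectors.toList());
    }
}

// The controller talks to the repository directly: no service, no manager.
class SpeakersController {
    private final SpeakerRepository repository;

    SpeakersController(SpeakerRepository repository) {
        this.repository = repository;
    }

    List<String> findByConference(String conference) {
        return repository.findByConference(conference).stream()
                .map(s -> s.name)
                .collect(Collectors.toList());
    }
}

class LeanControllerDemo {
    // Builds a controller over some made-up sample data.
    static SpeakersController sampleController() {
        InMemorySpeakerRepository repository = new InMemorySpeakerRepository();
        repository.save(new Speaker("Ada", "JavaZone"));
        repository.save(new Speaker("Linus", "OtherConf"));
        repository.save(new Speaker("Grace", "JavaZone"));
        return new SpeakersController(repository);
    }

    public static void main(String[] args) {
        System.out.println(sampleController().findByConference("JavaZone"));
    }
}
```

If the controller later grows a second responsibility, such as producing a printable schedule, that would be the moment to extract a new, well-named class rather than to add a generic layer up front.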

Test Attributes #1: Validity

In my last post, I created a list of test attributes. If one of them isn’t ok, you need to do some fixin’. This is the first of a series of posts that is going to discuss the different faces of tests. Let’s start with validity. Admittedly, it’s not the first attribute I thought about. What are the chances we’re going to write a wrong test?

How can this happen? We usually write tests based on our understanding of the code and of the requirements we need to implement, and we fill the gaps with assumptions. We can be wrong about any of these. Or all of them. A more interesting question is: how do we know we’ve written an incorrect test? We find out we have the wrong tests in one or more of these ways:

The Review: Someone looks at our test and code, and tells us we’re either testing the wrong thing, or that this isn’t the way to prove our code works.
The Surprise: Our test surprisingly fails where it should have passed. It can happen the other way too.
The Smell: We have a feeling we’re on the wrong path. The test passes, but something feels wrong.
The Facepalm: A penny drops, and we get the “what the hell are you doing?” feeling, followed by the “glad no one noticed” feeling.

Once we know what is wrong, we can easily fix it. But wouldn’t it be easier if we avoided this altogether? The easiest way is to involve someone. Pair programming helps avoid both bugs and writing the wrong test. Obviously a partner helps in the Review, but they can also help you avoid the Surprise, the Smell and the Facepalm. And you don’t want them there when you have a Facepalm moment, right?

One of the foundations of agile development is feedback. We all make mistakes. In agile development we acknowledge that, so we put brakes in place, such as pair programming, to identify problems early.

Next up: Readability.

Reference: Test Attributes #1: Validity from our JCG partner Gil Zilberfeld at the Geek Out of Water blog.

Compounding double error

Overview

In a previous article, I outlined why BigDecimal is not the answer most of the time. While it is possible to construct situations where double produces an error, it is just as easy to construct situations where BigDecimal gets an error. BigDecimal is easier to get right, but it is also easier to get wrong without noticing. The anecdotal evidence is that junior developers don’t have as much trouble getting BigDecimal right as they do getting double with rounding right. However, I am sceptical of this, because with BigDecimal it is also much easier for an error to go unnoticed.

Let’s take this example where double produces an incorrect answer.

    double d = 1.00;
    d /= 49;
    d *= 49 * 2;
    System.out.println("d=" + d);

    BigDecimal bd = BigDecimal.ONE;
    bd = bd.divide(BigDecimal.valueOf(49), 2, BigDecimal.ROUND_HALF_UP);
    bd = bd.multiply(BigDecimal.valueOf(49 * 2));
    System.out.println("bd=" + bd);

prints

    d=1.9999999999999998
    bd=1.96

In this case, double looks wrong; it needs rounding, which would give the correct answer of 2.0. The BigDecimal result looks right, but it isn’t, due to representation error. We could change the division to use more precision, but you will always get a representation error, though you can control how small that error is.

You have to ensure numbers are real and use rounding. Even with BigDecimal, you have to use appropriate rounding. Let’s say you have a loan for $1,000,000 and you apply 0.05% interest per day. The account can only have a whole number of cents, so rounding is needed to make this a real amount of money.
If you don’t do this, how long does it take to make a 1 cent difference?

    double interest = 0.0005;
    BigDecimal interestBD = BigDecimal.valueOf(interest);

    double amount = 1e6;
    BigDecimal amountBD = BigDecimal.valueOf(amount);
    BigDecimal amountBD2 = BigDecimal.valueOf(amount);

    long i = 0;
    do {
        System.out.printf("%,d: BigDecimal: $%s, BigDecimal: $%s%n", i, amountBD, amountBD2);
        i++;
        // amountBD is rounded to whole cents every day
        amountBD = amountBD.add(amountBD.multiply(interestBD)
                .setScale(2, BigDecimal.ROUND_HALF_UP));
        // amountBD2 is never rounded
        amountBD2 = amountBD2.add(amountBD2.multiply(interestBD));
    } while (amountBD2.subtract(amountBD).abs()
            .compareTo(BigDecimal.valueOf(0.01)) < 0);
    System.out.printf("After %,d iterations the error was 1 cent and you owe %s%n", i, amountBD);

prints finally

    8: BigDecimal: $1004007.00, BigDecimal: $1004007.00700437675043756250390625000000000000000
    After 9 iterations the error was 1 cent and you owe 1004509.00

You could round the result, but this hides the fact that you are off by a cent even though you used BigDecimal.

double eventually has a representation error

Even if you use appropriate rounding, double will eventually give you an incorrect result. It just happens much later than in the previous example.
    double interest = 0.0005;
    BigDecimal interestBD = BigDecimal.valueOf(interest);
    double amount = 1e6;
    BigDecimal amountBD = BigDecimal.valueOf(amount);
    long i = 0;
    do {
        System.out.printf("%,d: double: $%.2f, BigDecimal: $%s%n", i, amount, amountBD);
        i++;
        // round2 is not defined in this article; it is assumed to round a double to two decimal places
        amount = round2(amount + amount * interest);
        amountBD = amountBD.add(amountBD.multiply(interestBD)
                .setScale(2, BigDecimal.ROUND_HALF_UP));
    } while (BigDecimal.valueOf(amount).subtract(amountBD).abs()
            .compareTo(BigDecimal.valueOf(0.01)) < 0);
    System.out.printf("After %,d iterations the error was 1 cent and you owe %s%n", i, amountBD);

prints finally

    22,473: double: $75636308370.01, BigDecimal: $75636308370.01
    After 22,474 iterations the error was 1 cent and you owe 75674126524.20

From an IT perspective we have an error of one cent. From a business perspective we have a client who has made no repayments for more than 61 years (22,474 days) and owes the bank $75.6 billion, enough to bring down the bank. If only the IT guy had used BigDecimal!?

Conclusion

My final recommendation is that you should use what you feel comfortable with. Don't forget about rounding, and do use real numbers rather than whatever the mathematics produces: ask, for example, whether you can have fractions of a cent, or trade fractions of a share. Don't forget about the business perspective. You might find that BigDecimal makes more sense for your company, your project or your team. Don't assume BigDecimal is the only way, and don't assume the problems double faces don't also apply to BigDecimal. BigDecimal is not a ticket to best-practice coding, because complacency is a sure way of introducing errors.

Reference: Compounding double error from our JCG partner Peter Lawrey at the Vanilla Java blog.
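For readers who want to rerun the BigDecimal-only experiment from this article, here is a minimal self-contained sketch. It assumes the same loan of $1,000,000 and a daily rate of 0.0005, and it uses RoundingMode.HALF_UP in place of the older BigDecimal.ROUND_HALF_UP constant. The class and method names (RoundingDrift, daysUntilOneCentDrift) are made up for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class RoundingDrift {
    // Counts the days until a balance rounded to cents every day differs
    // by at least one cent from a balance that is never rounded.
    static int daysUntilOneCentDrift() {
        BigDecimal rate = new BigDecimal("0.0005");
        BigDecimal oneCent = new BigDecimal("0.01");
        BigDecimal rounded = new BigDecimal("1000000.00");
        BigDecimal unrounded = new BigDecimal("1000000.00");
        int days = 0;
        do {
            days++;
            // Interest is rounded to whole cents before it is added.
            rounded = rounded.add(rounded.multiply(rate).setScale(2, RoundingMode.HALF_UP));
            // Interest is added with full precision.
            unrounded = unrounded.add(unrounded.multiply(rate));
        } while (unrounded.subtract(rounded).abs().compareTo(oneCent) < 0);
        return days;
    }

    public static void main(String[] args) {
        System.out.println("Days until a 1 cent difference: " + daysUntilOneCentDrift());
    }
}
```

Under these assumptions the loop should stop after the same nine iterations reported above. Changing the rate or the rounding mode is a quick way to see how sensitive the drift is to the rounding policy.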

JUnit: testing exception with Java 8 and Lambda Expressions

In JUnit there are many ways of testing exceptions in test code, including the try-catch idiom, JUnit @Rule, and the catch-exception library. As of Java 8 we have another way of dealing with exceptions: lambda expressions. In this short blog post I will demonstrate with a simple example how one can utilize the power of Java 8 and lambda expressions to test exceptions in JUnit.

Note: The motivation for writing this blog post was the message published on the catch-exception project page:

    “Java 8’s lambda expressions will make catch-exception redundant. Therefore, this project won’t be maintained any longer”

SUT – System Under Test

We will test exceptions thrown by the 2 classes below. The first one:

    class DummyService {
        public void someMethod() {
            throw new RuntimeException("Runtime exception occurred");
        }

        public void someOtherMethod() {
            throw new RuntimeException("Runtime exception occurred",
                    new IllegalStateException("Illegal state"));
        }
    }

And the second:

    class DummyService2 {
        public DummyService2() throws Exception {
            throw new Exception("Constructor exception occurred");
        }

        public DummyService2(boolean dummyParam) throws Exception {
            throw new Exception("Constructor exception occurred");
        }
    }

Desired Syntax

My goal was to achieve syntax close to the one I had with the catch-exception library:

    package com.github.kolorobot.exceptions.java8;

    import org.junit.Test;
    import static com.github.kolorobot.exceptions.java8.ThrowableAssertion.assertThrown;

    public class Java8ExceptionsTest {

        @Test
        public void verifiesTypeAndMessage() {
            assertThrown(new DummyService()::someMethod) // method reference
                    // assertions
                    .isInstanceOf(RuntimeException.class)
                    .hasMessage("Runtime exception occurred")
                    .hasNoCause();
        }

        @Test
        public void verifiesCauseType() {
            // someOtherMethod() takes no arguments in DummyService above
            assertThrown(() -> new DummyService().someOtherMethod()) // lambda expression
                    // assertions
                    .isInstanceOf(RuntimeException.class)
                    .hasMessage("Runtime exception occurred")
                    .hasCauseInstanceOf(IllegalStateException.class);
        }

        @Test
        public void verifiesCheckedExceptionThrownByDefaultConstructor() {
            assertThrown(DummyService2::new) // constructor reference
                    // assertions
                    .isInstanceOf(Exception.class)
                    .hasMessage("Constructor exception occurred");
        }

        @Test
        public void verifiesCheckedExceptionThrownConstructor() {
            assertThrown(() -> new DummyService2(true)) // lambda expression
                    // assertions
                    .isInstanceOf(Exception.class)
                    .hasMessage("Constructor exception occurred");
        }

        @Test(expected = ExceptionNotThrownAssertionError.class) // making test pass
        public void failsWhenNoExceptionIsThrown() {
            // expected exception not thrown
            assertThrown(() -> System.out.println());
        }
    }

Note: The advantage over catch-exception is that we will be able to test constructors that throw exceptions.

Creating the ‘library’

Syntactic sugar

assertThrown is a static factory method creating a new instance of ThrowableAssertion with a reference to the caught exception.

    package com.github.kolorobot.exceptions.java8;

    public class ThrowableAssertion {
        public static ThrowableAssertion assertThrown(ExceptionThrower exceptionThrower) {
            try {
                exceptionThrower.throwException();
            } catch (Throwable caught) {
                return new ThrowableAssertion(caught);
            }
            throw new ExceptionNotThrownAssertionError();
        }

        // other methods omitted for now
    }

The ExceptionThrower is a @FunctionalInterface whose instances can be created with lambda expressions, method references, or constructor references. Because assertThrown accepts an ExceptionThrower, it expects an exception and is ready to handle it.

    @FunctionalInterface
    public interface ExceptionThrower {
        void throwException() throws Throwable;
    }

Assertions

To finish up, we need to create some assertions so we can verify our expectations about the tested exceptions in test code. In fact, ThrowableAssertion is a kind of custom assertion providing us a way to fluently verify the caught exception. In the code below I used Hamcrest matchers to create assertions.
The full source of the ThrowableAssertion class:

    package com.github.kolorobot.exceptions.java8;

    import org.hamcrest.Matchers;
    import org.junit.Assert;

    public class ThrowableAssertion {

        public static ThrowableAssertion assertThrown(ExceptionThrower exceptionThrower) {
            try {
                exceptionThrower.throwException();
            } catch (Throwable caught) {
                return new ThrowableAssertion(caught);
            }
            throw new ExceptionNotThrownAssertionError();
        }

        private final Throwable caught;

        public ThrowableAssertion(Throwable caught) {
            this.caught = caught;
        }

        public ThrowableAssertion isInstanceOf(Class<? extends Throwable> exceptionClass) {
            Assert.assertThat(caught, Matchers.isA((Class<Throwable>) exceptionClass));
            return this;
        }

        public ThrowableAssertion hasMessage(String expectedMessage) {
            Assert.assertThat(caught.getMessage(), Matchers.equalTo(expectedMessage));
            return this;
        }

        public ThrowableAssertion hasNoCause() {
            Assert.assertThat(caught.getCause(), Matchers.nullValue());
            return this;
        }

        public ThrowableAssertion hasCauseInstanceOf(Class<? extends Throwable> exceptionClass) {
            Assert.assertThat(caught.getCause(), Matchers.notNullValue());
            Assert.assertThat(caught.getCause(), Matchers.isA((Class<Throwable>) exceptionClass));
            return this;
        }
    }

AssertJ Implementation

In case you use the AssertJ library, you can easily create an AssertJ version of ThrowableAssertion utilizing org.assertj.core.api.ThrowableAssert, which provides many useful assertions out of the box. The implementation of that class is even simpler than the Hamcrest one presented above.
    package com.github.kolorobot.exceptions.java8;

    import org.assertj.core.api.Assertions;
    import org.assertj.core.api.ThrowableAssert;

    public class AssertJThrowableAssert {
        public static ThrowableAssert assertThrown(ExceptionThrower exceptionThrower) {
            try {
                exceptionThrower.throwException();
            } catch (Throwable throwable) {
                return Assertions.assertThat(throwable);
            }
            throw new ExceptionNotThrownAssertionError();
        }
    }

An example test with AssertJ:

    public class AssertJJava8ExceptionsTest {
        @Test
        public void verifiesTypeAndMessage() {
            assertThrown(new DummyService()::someMethod)
                    .isInstanceOf(RuntimeException.class)
                    .hasMessage("Runtime exception occurred")
                    .hasMessageStartingWith("Runtime")
                    .hasMessageEndingWith("occurred")
                    .hasMessageContaining("exception")
                    .hasNoCause();
        }
    }

Summary

With just a couple of lines of code, we built quite a cool helper for testing exceptions in JUnit without any additional library. And this was just a start. Harness the power of Java 8 and lambda expressions!

Resources

Source code for this article is available on GitHub (have a look at the com.github.kolorobot.exceptions.java8 package).

Some other articles of mine about testing exceptions in JUnit. Please have a look:

Custom assertions
Catch-exception library
Junit @Rule: beyond basics
Different ways of testing exceptions in Junit

Reference: JUnit: testing exception with Java 8 and Lambda Expressions from our JCG partner Rafal Borowiec at the Codeleak.pl blog.
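The listings in this article use ExceptionNotThrownAssertionError without showing it. The sketch below is one plausible shape for it, assuming it simply extends AssertionError; the message text and the catchThrown helper are made up for illustration and are not taken from the article's source. The helper returns the raw Throwable so the example runs without JUnit, Hamcrest or AssertJ.

```java
// Same shape as the ExceptionThrower interface shown in the article.
@FunctionalInterface
interface ExceptionThrower {
    void throwException() throws Throwable;
}

// Assumed definition: the article never lists this class.
class ExceptionNotThrownAssertionError extends AssertionError {
    ExceptionNotThrownAssertionError() {
        super("Expected exception was not thrown.");
    }
}

class AssertThrownSketch {
    // Hypothetical helper mirroring assertThrown, minus the fluent assertions.
    static Throwable catchThrown(ExceptionThrower thrower) {
        try {
            thrower.throwException();
        } catch (Throwable caught) {
            return caught;
        }
        // Reached only when the code under test completed normally.
        throw new ExceptionNotThrownAssertionError();
    }

    public static void main(String[] args) {
        Throwable caught = catchThrown(() -> {
            throw new IllegalStateException("Illegal state");
        });
        System.out.println(caught.getClass().getSimpleName() + ": " + caught.getMessage());

        try {
            catchThrown(() -> { });
        } catch (ExceptionNotThrownAssertionError expected) {
            System.out.println(expected.getMessage());
        }
    }
}
```

Throwing the error outside the try block matters: if it were thrown inside, the catch (Throwable) clause would swallow it and report it as the caught exception.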

6 Reasons Not to Switch to Java 8 Just Yet

Java 8 is awesome. Period. But… after we had the chance to have fun and play around with it, the time has come to stop avoiding the grain of salt. All good things come with a price, and in this post I will share the main pain points of Java 8. Make sure you’re aware of these before upgrading and letting go of 7.

1. Parallel Streams can actually slow you down

Java 8 brings the promise of parallelism as one of the most anticipated new features. The .parallelStream() method implements this on collections and streams. It breaks them into subproblems which then run on separate threads; these can go to different cores, and the results get combined when they’re done. This all happens under the hood using the fork/join framework. Ok, sounds cool, it must speed up operations on large data sets in multi-core environments, right?

No, it can actually make your code run slower if not used right. It was some 15% slower on a benchmark we ran, but it could be even worse. Let’s say we’re already running multiple threads and we’re using .parallelStream() in some of them, adding more and more threads to the pool. This could easily turn into more than our cores could handle, and slow everything down due to increased context switching.

The slower benchmark, grouping a collection into different groups (prime / non-prime):

    Map<Boolean, List<Integer>> groupByPrimary = numbers
            .parallelStream().collect(Collectors.groupingBy(s -> Utility.isPrime(s)));

More slowdowns can occur for other reasons as well. Consider this: let’s say we have multiple tasks to complete and one of them takes much longer than the others for some reason. Breaking it down with .parallelStream() could actually delay the quicker tasks from finishing, and with them the process as a whole. Check out this post by Lukas Krecan for more examples and code samples.

Diagnosis: Parallelism, with all its benefits, also brings in additional types of problems to consider.
When you are already working in a multi-threaded environment, keep this in mind and get familiar with what’s going on behind the scenes.

2. The flip-side of Lambda Expressions

Lambdas. Oh, lambdas. We can do pretty much everything we already could without you, but you add so much grace and get rid of so much boilerplate code that it’s easy to fall in love. Let’s say I wake up in the morning and want to iterate over a list of world cup teams and map their lengths (fun fact: it sums up to 254):

    List lengths = new ArrayList();

    for (String country : Arrays.asList(args)) {
        lengths.add(check(country));
    }

Now let’s get functional with a nice lambda:

    Stream lengths = countries.stream().map(country -> check(country));

Baam! That’s super. Although… while mostly seen as a positive thing, adding new elements like lambdas to Java pushes it further away from its original specification. The bytecode is fully OO, and with lambdas in the game, the distance between the actual code and the runtime grows larger. Read more about the dark side of lambda expressions in this post by Tal Weiss.

The bottom line is that what you’re writing and what you’re debugging are two different things. Stack traces grow larger and larger and make it harder to debug your code.
Something simple like adding an empty string to the list turns this short stack trace:

    at LmbdaMain.check(LmbdaMain.java:19)
    at LmbdaMain.main(LmbdaMain.java:34)

Into this:

    at LmbdaMain.check(LmbdaMain.java:19)
    at LmbdaMain.lambda$0(LmbdaMain.java:37)
    at LmbdaMain$$Lambda$1/821270929.apply(Unknown Source)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:512)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:502)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.LongPipeline.reduce(LongPipeline.java:438)
    at java.util.stream.LongPipeline.sum(LongPipeline.java:396)
    at java.util.stream.ReferencePipeline.count(ReferencePipeline.java:526)
    at LmbdaMain.main(LmbdaMain.java:39)

Another issue that lambdas raise has to do with overloading: a lambda has to be converted to some functional interface type when it is passed to a method, and if it fits more than one overload, the call may be ambiguous. Lukas Eder explains this with code samples right here.

Diagnosis: Just stay aware of this. The traces might be a pain from time to time, but that will not keep us away from them precious lambdas.

3. Default Methods are distracting

Default methods enable a default implementation of a function in the interface itself. This is definitely one of the coolest new features Java 8 brings to the table, but it somewhat interferes with the way we used to do things. So why was this introduced anyway? And what should you not do with it?

The main motivation behind Default Methods was that if at some point we need to add a method to an existing interface, we can do so without rewriting its implementations, keeping it compatible with older versions.
For example, take this piece of code from Oracle’s Java Tutorials where they add the ability to specify a timezone:

    public interface TimeClient {
        // ...
        static public ZoneId getZoneId (String zoneString) {
            try {
                return ZoneId.of(zoneString);
            } catch (DateTimeException e) {
                System.err.println("Invalid time zone: " + zoneString +
                        "; using default time zone instead.");
                return ZoneId.systemDefault();
            }
        }

        default public ZonedDateTime getZonedDateTime(String zoneString) {
            return ZonedDateTime.of(getLocalDateTime(), getZoneId(zoneString));
        }
    }

And that’s it, problem solved. Or is it? Default Methods blur the separation of interface and implementation, especially in the wrong hands. As if type hierarchies don’t tend to tangle up on their own, there’s this new creature now that we need to tame. Read more about it in Oleg Shelajev’s post on RebelLabs.

Diagnosis: When you hold a hammer, everything looks like a nail. Keep in mind to stick to their original use case: evolving an existing interface when a refactoring to introduce a new abstract class doesn’t make sense.

Moving on to some things that are either missing, still with us or not exactly there yet:

4. Wherefore art thou Jigsaw?

Project Jigsaw’s goal is to make Java modular and break the JRE into interoperable components. The motivation behind this comes first from a desire for a better, faster and stronger Java embedded. I’m trying to avoid mentioning the “Internet of Things”, but there, I said it. Reduced JAR sizes, performance improvements and increased security are some more of the promises this ambitious project holds.

So where is it? Jigsaw entered Phase 2 just recently: it has passed the exploratory phase and is now switching gears to a production-quality design and implementation, says Mark Reinhold, Oracle’s Chief Java Architect. The project was first planned to be completed in Java 8 and was deferred to Java 9, where it is expected to be one of the flagship new features.
Diagnosis: If this is the main thing that you’re waiting for, Java 9 is due in 2016. In the meantime, take a closer look and maybe even get involved in the Jigsaw-dev mailing list.

5. Issues that are still around

Checked Exceptions

No one likes boilerplate code; that’s one of the reasons why lambdas got so popular. Think of boilerplate exceptions: regardless of whether or not you logically need to catch a checked exception or have anything to do with it, you still need to catch it. Even if it’s something that would never happen, like this exception that will never fire:

    try {
        httpConn.setRequestMethod("GET");
    } catch (ProtocolException pe) { /* Why don’t you call me anymore? */ }

Primitives

They are still here, and it’s a pain to use them right. They are the one thing that keeps Java from being a pure object-oriented language, and critics argue that removing them would carry no significant performance hit. None of the new JVM languages has them, just saying.

Operator Overloading

James Gosling, the father of Java, once said in an interview: “I left out operator overloading as a fairly personal choice because I had seen too many people abuse it in C++”. Kind of makes sense, but opinions on this are split. Other JVM languages do offer this feature, but on the other hand, it could result in code that looks like this:

    javascriptEntryPoints <<= (sourceDirectory in Compile)(base =>
        ((base / "assets" ** "*.js") --- (base / "assets" ** "_*")).get
    )

An actual line of code from the Scala Play Framework. Ahm, I’m a bit dizzy now.

Diagnosis: Are these real problems anyway? We all have our quirks and these are some of Java’s. A surprise might happen in future versions and this will change, but backwards compatibility, among other things, is keeping them right here with us.

6. Functional Programming – not quite there yet

Functional programming has been possible with Java before, although it is pretty awkward. Java 8 improves on this with lambdas, among other things.
It’s most welcome, but not as huge a shift as was earlier portrayed. It is definitely more elegant than in Java 7, but some bending over backwards is still needed to be truly functional. One of the fiercest reviews on this matter comes from Pierre-yves Saumont, who in a series of posts takes a close look at the differences between functional programming paradigms and the way to implement them in Java.

So Java or Scala? The adoption of more functional modern paradigms in Java is a sign of approval for Scala, which has been playing with lambdas for a while now. Lambdas do make a lot of noise, but there are a lot more features, like traits, lazy evaluation and immutables to name a few, that make quite a difference.

Diagnosis: Don’t be distracted by the lambdas; functional programming is still a hassle in Java 8.

Reference: 6 Reasons Not to Switch to Java 8 Just Yet from our JCG partner Alex Zhitnitsky at the Takipi blog.
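Coming back to point 1 of this article, the prime / non-prime grouping can be turned into a small program that runs the same pipeline sequentially and in parallel. This is only a sketch: the isPrime method below is a simple assumed stand-in for the Utility.isPrime helper, the input range is arbitrary, and the class and method names are hypothetical. It does not measure anything. It only shows that both variants produce the same groups, which is the precondition for benchmarking them against each other on your own workload.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PrimeGrouping {
    // Simple trial division; an assumed replacement for Utility.isPrime.
    static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        for (int i = 2; (long) i * i <= n; i++) {
            if (n % i == 0) {
                return false;
            }
        }
        return true;
    }

    // Groups the numbers into primes (key true) and non-primes (key false).
    static Map<Boolean, List<Integer>> group(List<Integer> numbers, boolean parallel) {
        return (parallel ? numbers.parallelStream() : numbers.stream())
                .collect(Collectors.groupingBy(PrimeGrouping::isPrime));
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.rangeClosed(1, 20).boxed().collect(Collectors.toList());
        Map<Boolean, List<Integer>> sequential = group(numbers, false);
        Map<Boolean, List<Integer>> parallel = group(numbers, true);
        System.out.println("primes: " + sequential.get(true));
        System.out.println("same result in parallel: " + sequential.equals(parallel));
    }
}
```

Whether the parallel variant is faster depends on the size of the input, the cost of the predicate and how busy the common fork/join pool already is, so it is worth timing both on realistic data before switching.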

RxJava + Java8 + Java EE 7 + Arquillian = Bliss

Microservices are an architectural style where each service is implemented as an independent system. Each can have its own persistence system (although that is not mandatory), deployment, language, … Because a system is composed of more than one service, each service will communicate with other services, typically using a lightweight protocol like HTTP and following a RESTful web approach. You can read more about microservices here: http://martinfowler.com/articles/microservices.html

Let’s see a really simple example. Suppose we have a book shop where users can navigate through a catalog, and when they find a book they want more information about, they click on the ISBN, and a new screen opens with detailed information about the book and the comments readers have written about it. This system may be composed of two services:

One service to get book details. They could be retrieved from any legacy system such as an RDBMS.
One service to get all comments written about a book. In this case that information could be stored in a document-based database.

The problem here is that for each user request we need to open two connections, one for each service. Of course we need a way to do those jobs in parallel to improve performance. And here lies one problem: how can we deal with these asynchronous requests? The first idea is to use the Future class. For two services that may be fine, but if you require four or five services the code will become more and more complex. You may also, for example, need to get data from one service and use it in another service, or adapt the result of one service to be the input of another one. So there is a cost in managing threads and synchronization.

It would be awesome to have some way to deal with this problem in a clean and easy way. And this is exactly what RxJava does. RxJava is a Java VM implementation of Reactive Extensions: a library for composing asynchronous and event-based programs by using observable sequences.
With RxJava, instead of pulling data from a structure, data is pushed to you as events: a subscriber listens for those events and reacts accordingly. You can find more information at https://github.com/Netflix/RxJava.

So in this case we are going to implement the example described above using RxJava, Java EE 7, Java 8 and Arquillian for testing. This post assumes you know how to write REST services using the Java EE specification. So let’s start with the two services:

    @Singleton
    @Path("bookinfo")
    public class BookInfoService {

        @GET
        @Path("{isbn}")
        @Produces(MediaType.APPLICATION_JSON)
        @Consumes(MediaType.APPLICATION_JSON)
        public JsonObject findBookByISBN(@PathParam("isbn") String isbn) {
            // Returns a fixed book regardless of the requested ISBN
            return Json.createObjectBuilder()
                    .add("author", "George R.R. Martin")
                    .add("isbn", "1111")
                    .add("title", "A Game Of Thrones")
                    .build();
        }
    }

    @Singleton
    @Path("comments")
    public class CommentsService {

        @GET
        @Path("{isbn}")
        @Produces(MediaType.APPLICATION_JSON)
        public JsonArray bookComments(@PathParam("isbn") String isbn) {
            // Returns two fixed comments regardless of the requested ISBN
            return Json.createArrayBuilder().add("Good Book").add("Awesome").build();
        }
    }

    @ApplicationPath("rest")
    public class ApplicationResource extends Application {
    }

And finally it is time to create a third facade service which receives the request from the client, sends a request to both services in parallel and finally zips both responses before sending the result back to the client. Zip is the process of combining sets of items emitted together via a specified function (not to be confused with compression!).
    @Singleton
    @Path("book")
    public class BookService {

        private static final String BOOKSERVICE = "http://localhost:8080/bookservice";
        private static final String COMMENTSERVICE = "http://localhost:8080/bookcomments";

        // Container-managed executor used to run the remote calls off the request thread
        @Resource(name = "DefaultManagedExecutorService")
        ManagedExecutorService executor;

        Client bookServiceClient;
        WebTarget bookServiceTarget;

        Client commentServiceClient;
        WebTarget commentServiceTarget;

        @PostConstruct
        void initializeRestClients() {
            bookServiceClient = ClientBuilder.newClient();
            bookServiceTarget = bookServiceClient.target(BOOKSERVICE + "/rest/bookinfo");

            commentServiceClient = ClientBuilder.newClient();
            commentServiceTarget = commentServiceClient.target(COMMENTSERVICE + "/rest/comments");
        }

        @GET
        @Path("{isbn}")
        @Produces(MediaType.APPLICATION_JSON)
        public void bookAndComment(@Suspended final AsyncResponse asyncResponse, @PathParam("isbn") String isbn) {
            // RxJava code shown below
        }
    }

Basically we create a new service. In this case the URLs of both services we are going to connect to are hardcoded. This is done for academic purposes; in production-like code you would inject them from a producer class, from a properties file or from whatever system you use for this purpose. Then we create a javax.ws.rs.client.WebTarget for consuming each RESTful Web Service. After that we need to implement the bookAndComment method using the RxJava API. The main class used in RxJava is rx.Observable. This class is an observable, as its name suggests, and it is responsible for firing the events that push objects to subscribers. By default events are synchronous, and it is the responsibility of the developer to make them asynchronous.
So we need one asynchronous observable instance for each service:

    public Observable<JsonObject> getBookInfo(final String isbn) {
        return Observable.create((Observable.OnSubscribe<JsonObject>) subscriber -> {
            Runnable r = () -> {
                try {
                    subscriber.onNext(bookServiceTarget.path(isbn).request().get(JsonObject.class));
                    subscriber.onCompleted();
                } catch (RuntimeException e) {
                    // Pass failures (for example a failed HTTP call) on to the subscriber
                    subscriber.onError(e);
                }
            };
            executor.execute(r);
        });
    }

Basically we create an Observable that will execute the specified function when a Subscriber subscribes to it. The function is created using a lambda expression to avoid creating nested inner classes. In this case we are returning a JsonObject as the result of calling the bookinfo service. The result is passed to the onNext method so subscribers can receive it. Because we want to execute this logic asynchronously, the code is wrapped inside a Runnable block. It is also required to call the onCompleted method when all the logic is done, and if the call fails the exception is passed to onError so that subscribers are notified. Notice that, because we want to make the observable asynchronous, apart from creating a Runnable we are using an Executor to run the logic in a separate thread. One of the great additions in Java EE 7 is a managed way to create threads inside a container. In this case we are using the ManagedExecutorService provided by the container to spawn a task asynchronously in a thread different from the current one.

    public Observable<JsonArray> getComments(final String isbn) {
        return Observable.create((Observable.OnSubscribe<JsonArray>) subscriber -> {
            Runnable r = () -> {
                try {
                    subscriber.onNext(commentServiceTarget.path(isbn).request().get(JsonArray.class));
                    subscriber.onCompleted();
                } catch (RuntimeException e) {
                    // Pass failures on to the subscriber
                    subscriber.onError(e);
                }
            };
            executor.execute(r);
        });
    }

This is similar to the previous method, but instead of getting book info we are getting an array of comments. Then we need to create an observable in charge of zipping both responses when both of them are available. This is done by using the zip method of the Observable class, which receives two Observables and applies a function to combine the results of both.
In this case the function is a lambda expression that creates a new JSON object containing both responses.

    @GET
    @Path("{isbn}")
    @Produces(MediaType.APPLICATION_JSON)
    public void bookAndComment(@Suspended final AsyncResponse asyncResponse, @PathParam("isbn") String isbn) {
        // Calling the previously defined functions
        Observable<JsonObject> bookInfo = getBookInfo(isbn);
        Observable<JsonArray> comments = getComments(isbn);

        Observable.zip(bookInfo, comments,
                (JsonObject book, JsonArray bookcomments) ->
                        Json.createObjectBuilder().add("book", book).add("comments", bookcomments).build())
            .subscribe(new Subscriber<JsonObject>() {
                @Override
                public void onCompleted() {
                }

                @Override
                public void onError(Throwable e) {
                    // Resume the suspended request with the failure
                    asyncResponse.resume(e);
                }

                @Override
                public void onNext(JsonObject jsonObject) {
                    // Resume the suspended request with the combined document
                    asyncResponse.resume(jsonObject);
                }
            });
    }

Let’s take a look at the previous service. We are using one of the new additions in Java EE, JAX-RS 2.0 asynchronous REST endpoints, by means of the @Suspended annotation. Basically, we free server resources and generate the response when it is available using the resume method.

And finally a test. We are using WildFly 8.1 as the Java EE 7 server, together with Arquillian. Because each service may be deployed on a different server, we are going to deploy each service in a different WAR but inside the same server. So in this case we are going to deploy three WAR files, which is really easy to do in Arquillian.
    @RunWith(Arquillian.class)
    public class BookTest {

        @Deployment(testable = false, name = "bookservice")
        public static WebArchive createDeploymentBookInfoService() {
            return ShrinkWrap.create(WebArchive.class, "bookservice.war")
                    .addClasses(BookInfoService.class, ApplicationResource.class);
        }

        @Deployment(testable = false, name = "bookcomments")
        public static WebArchive createDeploymentCommentsService() {
            return ShrinkWrap.create(WebArchive.class, "bookcomments.war")
                    .addClasses(CommentsService.class, ApplicationResource.class);
        }

        @Deployment(testable = false, name = "book")
        public static WebArchive createDeploymentBookService() {
            // The facade is the only deployment that needs RxJava on its classpath
            WebArchive webArchive = ShrinkWrap.create(WebArchive.class, "book.war")
                    .addClasses(BookService.class, ApplicationResource.class)
                    .addAsLibraries(Maven.resolver().loadPomFromFile("pom.xml")
                            .resolve("com.netflix.rxjava:rxjava-core").withTransitivity().as(JavaArchive.class));
            return webArchive;
        }

        @ArquillianResource
        URL base;

        @Test
        @OperateOnDeployment("book")
        public void should_return_book() throws MalformedURLException {
            Client client = ClientBuilder.newClient();
            JsonObject book = client.target(URI.create(new URL(base, "rest/").toExternalForm()))
                    .path("book/1111").request().get(JsonObject.class);

            //assertions
        }
    }

In this case the client requests all the information about a book. On the server side, the zip method waits until the book and the comments have been retrieved in parallel, then combines both responses into a single object that is sent back to the client.

This is a very simple example of RxJava. In fact, in this case we have only seen how to use the zip method, but RxJava provides many more methods that are just as useful, like take(), map(), merge(), … (https://github.com/Netflix/RxJava/wiki/Alphabetical-List-of-Observable-Operators)

Moreover, in this example we have only seen how to connect to two services and retrieve information in parallel, and you may wonder why not use the Future class.
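For readers who want to experiment with the "wait for both, then combine" behaviour without an RxJava dependency, the sketch below imitates it with the JDK's CompletableFuture.thenCombine. This is a standard-library analogue and not RxJava's zip: it combines exactly one value from each side, whereas zip pairs up items from whole sequences. The class name, helper names and sample strings are assumptions made for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.BiFunction;

class ZipSketch {

    // Completes with combiner(a, b) once both inputs have completed.
    static <A, B, R> CompletableFuture<R> zip(CompletableFuture<A> first,
                                               CompletableFuture<B> second,
                                               BiFunction<A, B, R> combiner) {
        return first.thenCombine(second, combiner);
    }

    // Hypothetical stand-ins for the two service calls, run asynchronously.
    static String combined(String isbn) {
        CompletableFuture<String> book = CompletableFuture.supplyAsync(() -> "book:" + isbn);
        CompletableFuture<String> comments = CompletableFuture.supplyAsync(() -> "comments:2");
        // join() blocks here only to keep the demo short; a JAX-RS endpoint could
        // instead resume its AsyncResponse from a callback such as thenAccept.
        return zip(book, comments, (b, c) -> b + " + " + c).join();
    }

    public static void main(String[] args) {
        System.out.println(combined("1111"));
    }
}
```

The combiner plays the same role as the lambda passed to zip in the service above: it only runs once both results are available.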
It is totally fine to use Future and callbacks in this example, but in real life your logic probably won’t be as easy as zipping two services. Maybe you will have more services, or maybe you will need to get information from one service and then open a new connection for each result. As you can see, you may start with two Future instances but finish with a bunch of Future.get() calls, timeouts, … It is in these situations that RxJava really simplifies the development of the application. Furthermore, we have seen how to use some of the new additions in Java EE 7, like how to develop an asynchronous RESTful service with JAX-RS.

In this post we have learnt how to deal with the interconnection between services and how to make them scalable and less resource-hungry. But we have not talked about what happens when one of these services fails. What happens to the callers? Do we have a way to manage it? Is there a way to avoid spending resources when one of the services is not available? We will touch on this in the next post, which talks about fault tolerance.

We keep learning,
Alex.

Good morning, good morning! Good morning in the early hours! Let’s shake off the laziness and jump quickly out of bed. (Bon Dia!, Dàmaris Gelabert)

Reference: RxJava + Java8 + Java EE 7 + Arquillian = Bliss from our JCG partner Alex Soto at the One Jar To Rule Them All blog.

Configuring Chef part 1

Below are the first steps in getting started with Chef. The three main components of Chef are:

Workstation: This is the developer’s machine, which is used to author cookbooks and recipes and upload them to the chef-server using the command line utility called knife.
Chef-Server: This is the main server to which all the cookbooks, roles and policies are uploaded.
Node: This is the instance which is provisioned by applying the cookbooks uploaded to the chef-server.

So, let’s get started:

Set up the workstation

Install Chef on your workstation. To do that, follow the instructions here: http://www.getchef.com/chef/install/

Use hosted Chef as chef-server

Register on Chef’s site at http://www.getchef.com. You can use hosted Chef, which gives you the option to manage up to 5 nodes for free. Create your user and an organisation.

In order to authenticate your workstation with the chef-server we need these 3 things:
-[validator].pem
-knife.rb
-[username].pem
So, you need to download these 3 items to your workstation. (You can try the reset keys option or download the starter kit.)

Set up chef-repo in the workstation

Open your workstation and go to the folder which you want to be your base folder for writing cookbooks. Download the chef-repo from the opscode git repo or use the starter kit provided on the Chef site. Put these 3 files in the .chef folder inside the chef-repo folder on your workstation (create .chef if it is not already present).

Now your workstation is set up and authenticated with the chef-server, and your chef-repo is configured. So let’s begin configuring a node to which the cookbooks will be applied.

Setting up the node

The node could be an EC2 instance, an instance from any other cloud provider, or a VM.
The first step is to bootstrap it.

Bootstrap any instance

    knife bootstrap [ip-address] --sudo -x [user-name] -P [password] -N "[node name]"

Or for an AWS instance:

    knife bootstrap [AWS external IP] --sudo -x ec2-user -i [AWS key] -N "awsnode"

These are the things that happen during bootstrapping:
1.) Installs chef-client and Ohai on the node
2.) Establishes authentication for ssh keys
3.) Sends the 3 keys to chef-client

Once the node is bootstrapped, it’s time to author some cookbooks to apply to the node.

Download a cookbook

We will download an already existing cookbook for the Apache web server, using the following knife command (remember that all knife commands should be executed from the base chef-repo directory).

    knife cookbook site download apache

This will download the tar.gz archive into your chef-repo. We need to unpack it and move it to the cookbooks folder (use tar -xvf [file], then the mv command; after unpacking, remove the archive).

    mv apache ../chef-repo/cookbooks

Inside the apache folder we can find the “recipes” folder, and inside that there is a file called “default.rb”. This “default.rb” Ruby file contains the default recipe required to configure the Apache server. Let’s have a look at an excerpt from it.

    ....
    package "httpd" do
      action :install
    end
    ....

So this cookbook defines “install” as the default action when this recipe is applied, which will install the Apache web server on the node. We will cover more details about this in the next blog; for now let’s just upload this cookbook.

Upload a cookbook to the chef-server

    knife cookbook upload apache

Now the cookbook is uploaded to the chef-server. Once the chef-server has the cookbook, we can apply it to any of the nodes which are configured with the chef-server.
First let’s find out which nodes we have.

To see all my nodes

    knife node list

Apply the run-list to the node

In order to apply the cookbook to a given node, we need to add it to the run-list of the node:

    knife node run_list add node-name "recipe[apache]"

Now we have successfully uploaded a cookbook and added it to the run-list of a node with the alias “node-name”. The next time chef-client runs on the node, it will fetch the details of its run-list from the chef-server, download any required cookbooks from the chef-server and run them. For now, let’s ssh into the node and run chef-client manually to see the results.

Run chef-client on the node

    sudo chef-client

If the chef-client run is successful, we can hit the IP address of the instance to see the default Apache page up and running. If you are using AWS, don’t forget to open port 80.

This was just a basic introduction to Chef. In the next blog we will see the killer feature of Chef, which is search, and go into the details of the node object, roles and environments.

Reference: Configuring Chef part 1 from our JCG partner Anirudh Bhatnagar at the anirudh bhatnagar blog.

Why You Need a Strategic Data Service

It’s no longer even a question that data is a strategic advantage. Every business is a data business now, and it’s no longer sufficient to store and archive data; you need to be able to act on it: protect, nurture, develop, buy and sell it. Billion-dollar businesses are built around it.

But many businesses are running into the reality that their legacy platforms are not built to treat data as such a valuable asset. We continually see companies that are boxed out of opportunities because of software design decisions made years ago without the foresight to anticipate this trend.

If you refer back to classic software design principles and best practices you’ll see blueprints for building data layer abstractions and compartmentalizing data functionality from the rest of the system. Yet to this day I see developers questioning why these abstractions are needed, wondering what the payoff is. But the day of reckoning is either here or approaching fast for most companies, and if you don’t have a properly constructed data architecture it won’t be capable of supporting the business as it responds to this transition.

Based on what I’ve seen, here are my observations on why data services are necessary for just about every business today.

Multiple Data Stores are Key

One of the main reasons why any software abstraction exists is to allow you to easily swap one component out for another. You may outgrow a database or realize new business requirements that are outside the capabilities of your current solution, and have to switch. Writing your software to an interface whose underlying implementation can be swapped out allows you to do this. This is called decoupling and it’s just good software design.

But we’re entering a world of data store specialization now. Different data stores have unique reasons for being, and they’re good at different things. Some exist for the sole purpose of storing very specific types of data and doing specific things with it.
Eventually you’ll probably want or need to use those unique capabilities as a competitive advantage or even a key value proposition. We’re seeing a diversification of data sources in the marketplace, particularly in the open source world. Data stores have specialties now. And you will probably want to use more than one of them at some point, if not now.

A perfect example use case for this is a message in a social network. This piece of data has a number of potential uses, and not all of them are easily achievable using a single data store. But that’s ok, because you’re decoupled (right?). Now you can record the message in your social graph database so that you can cluster users by interest and predict relationships. You can search for the message later, after you’ve written it to your distributed search data store, which is perfect for that. And you can do analytics, trending, and dashboards on top of your relational database, which holds the output of your machine learning models.

Aside from just features, from a technical perspective you’ll often have to trade off between the consistency, availability, and partition tolerance of the CAP theorem. So far, no one data store has been able to have its cake and eat it too, but with a Data Service, you CAN.

Service with a Smile…

A properly built data abstraction layer will probably end up being a stateful service (as opposed to stateless services, which don’t really do anything on their own). These services stand alone in your architecture: components that are capable of talking with other components and having their own behavior. This comes in VERY handy when dealing with data. For example, some data stores will require you to verify write persistence after the fact if you care about availability. If your service stands on its own it can do this work at the appropriate time, transparently to whatever or whoever is using it.
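To make the decoupling argument concrete, here is a minimal Java sketch of a data service that hides several stores behind one interface. Every name in it (MessageStore, InMemoryStore, DataService, record) is hypothetical and invented for illustration; a real service would wrap a graph database, a search index and so on rather than in-memory lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DataServiceSketch {

    // The abstraction callers code against; implementations can be swapped freely.
    interface MessageStore {
        String name();
        void write(String message);
    }

    // Stand-in for a real store such as a graph database or a search index.
    static final class InMemoryStore implements MessageStore {
        private final String name;
        private final List<String> messages = new ArrayList<>();

        InMemoryStore(String name) {
            this.name = name;
        }

        @Override
        public String name() {
            return name;
        }

        @Override
        public void write(String message) {
            messages.add(message);
        }

        List<String> messages() {
            return messages;
        }
    }

    // Callers see only record(); which stores exist behind it is the service's concern.
    static final class DataService {
        private final List<MessageStore> stores;

        DataService(List<? extends MessageStore> stores) {
            this.stores = new ArrayList<>(stores);
        }

        // Writes the message to every configured store and returns the names of the stores written.
        List<String> record(String message) {
            List<String> written = new ArrayList<>();
            for (MessageStore store : stores) {
                store.write(message);
                written.add(store.name());
            }
            return written;
        }
    }

    public static void main(String[] args) {
        InMemoryStore graph = new InMemoryStore("graph");
        InMemoryStore search = new InMemoryStore("search");
        DataService service = new DataService(Arrays.asList(graph, search));
        System.out.println(service.record("hello"));
    }
}
```

Adding a third store, or replacing one, changes only the list handed to the service; code that calls record() is untouched.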
You might also want to mine the data as it comes in by having the Data Service pipe the data to machine learning models to categorize it or do sentiment analysis. Maybe you want to look up customer demographic data in Census data based on their location and predict income level using that information.

For implementing this type of data-related behavior, I’m a huge fan of using an actor system in a Data Service. Your Data Service can host an actor system (or whatever executes your workflow logic) to handle your entire data workflow: ensuring availability, mining, transmitting, whatever you need to do with it. You will eventually want to take data you receive and enrich it (if you don’t already today): geolocate it, classify it, compute on it, raise alerts, and so on. For example, you may want to take transactional data as it comes into the system and roll it up at different intervals so that you can run machine learning models on it to predict future trends. This is the perfect place to do it.

The brains of your data service don’t have to be an actor model; there are plenty of other options out there for carrying out data work. Hadoop is a classic example, but newcomers like Spark and Storm will accomplish many of the same things. Most of these frameworks have hooks available to extend them, which is super important if they’re going to serve you well into the future. (Again, though, the fact that it’s compartmentalized into a Data Service will let you use even more than one of these if you need to.)

The key is that the data processing and workflow should be controlled and orchestrated by the Data Service itself. The users of the Data Service shouldn’t need to worry about what happens to the data; they should just have the ability to get the data in and read it back out in some form.

If you like it then you shoulda put an API on it

Want to be a pure-play data company?
These are the companies who only provide a public API and don’t have to support a complex user interface. Having a properly-built Data Service allows you to do this very easily. Many application frameworks will let you turn an interface into a standards-compliant REST API with almost zero work. Just stand up the service in a Web server and let the framework look at it and turn it into an API. Even if your company isn’t selling the API outright your customers will surely love it, if not demand it.

It’s always disappointing to see companies building an API as a separate project when they could have had it almost for nothing. It’s an indication of a code base that wasn’t properly built in the first place: technical debt that has to be addressed before the business can move forward.

The End?

Certainly these are not the only reasons to locate your Data Service centrally in your architectural blueprint. (But seriously, you need more reasons?) I’d love to hear your comments and thoughts on this in the Hacker News thread.

Reference: Why You Need a Strategic Data Service from our JCG partner Jason Kolb at the Jason Kolb blog.

Seriously. The Devil Made me do It!

Just as eternal as the cosmic struggle between good and evil is the challenge between our two natures. Religion aside, we have two natures, the part of us that:

thinks things through and makes good or ethical decisions, a.k.a. our angelic nature
reacts immediately and makes quick but often wrong decisions, a.k.a. our devil nature

Guess the powers that be left a bug in our brains, so that they emphasize fast decisions over good / ethical decisions. Quite often we make sub-optimal or ethically ambiguous decisions under pressure. You decide…

Situation: Your manager comes to you and says that something urgent needs to be fixed right away. Turns out the steaming pile of @#$%$ that you inherited from Bob is malfunctioning again. Of course Bob created the mess and then conveniently left the company; in fact, the code is so bad that the work-arounds have work-arounds.

Bite the bullet and start re-factoring the program when things go wrong. It will take more time up front, but over time the program will become stable.
Find another fast workaround and defer the problem to the future.
Find a good reason why the junior member of the team should inherit this problem.

Situation: You’ve got a challenging section of code to write and not much time to write it.

Get away from the computer and think things through. Get input from your peers; maybe they have seen this problem before. Then plan the pathways out and write the code once, cleanly. Taking time to plan seems counter-intuitive, but it will save time. (See Not Planning is for Losers)
Naw, just sit at the keyboard and bang it out already. How difficult can it be?

Situation: The project is late and you know that your piece is behind schedule. However, you also know that several other pieces are late as well.

Admit that you are late and that the project can’t finish by the deadline. Give the project manager and senior managers a chance to make a course correction.
Say that you are on schedule but you are not sure that other people (be vague here) will have their pieces ready on time, and that it could cause you to become late. This situation is also known as Schedule Chicken…

Situation: You have been asked to estimate how long a critical project will take. You have only been given a short time to come up with the estimate.

Tell the project manager that getting a proper estimate takes longer than a few hours. Without proper estimates the project is likely to be severely underestimated, and this will come back to bite you and the project manager in the @$$. (See Who needs Formal Measurement?)
Tell the project manager exactly the date that senior management wants the project to be finished by. You know this is what they want to hear, so why deal with the problem now? This will become the project manager’s problem when the project is late.

The statistics show that we don’t listen to our better (angelic?) natures very often. So when push comes to shove and you have to make a sub-optimal or less than ethical decision, just remember: The devil made you do it!

Run into other common situations? Email me.

Reference: Seriously. The Devil Made me do It! from our JCG partner Dalip Mahal at the Accelerated Development blog.

Custom Cassandra Data Types

In the blog post Connecting to Cassandra from Java, I mentioned that one advantage for Java developers of Cassandra being implemented in Java is the ability to create custom Cassandra data types. In this post, I outline how to do this in greater detail.

Cassandra has numerous built-in data types, but there are situations in which one may want to add a custom type. Cassandra custom data types are implemented in Java by extending the org.apache.cassandra.db.marshal.AbstractType class. The class that extends this must ultimately implement three methods with the following signatures:

    public ByteBuffer fromString(final String) throws MarshalException
    public TypeSerializer getSerializer()
    public int compare(Object, Object)

This post’s example implementation of AbstractType is shown in the next code listing.

UnitedStatesState.java (extends AbstractType)

    package dustin.examples.cassandra.cqltypes;

    import org.apache.cassandra.db.marshal.AbstractType;
    import org.apache.cassandra.serializers.MarshalException;
    import org.apache.cassandra.serializers.TypeSerializer;

    import java.nio.ByteBuffer;

    /**
     * Representation of a state in the United States that
     * can be persisted to Cassandra database.
     */
    public class UnitedStatesState extends AbstractType
    {
       public static final UnitedStatesState instance = new UnitedStatesState();

       @Override
       public ByteBuffer fromString(final String stateName) throws MarshalException
       {
          return getStateAbbreviationAsByteBuffer(stateName);
       }

       @Override
       public TypeSerializer getSerializer()
       {
          return UnitedStatesStateSerializer.instance;
       }

       @Override
       public int compare(Object o1, Object o2)
       {
          // Nulls sort after non-null values; everything else compares by string form
          if (o1 == null && o2 == null)
          {
             return 0;
          }
          else if (o1 == null)
          {
             return 1;
          }
          else if (o2 == null)
          {
             return -1;
          }
          else
          {
             return o1.toString().compareTo(o2.toString());
          }
       }

       /**
        * Provide standard two-letter abbreviation for United States
        * state whose state name is provided.
        *
        * @param stateName Name of state whose abbreviation is desired.
        * @return State's abbreviation as a ByteBuffer; will return "UK"
        *    if provided state name is unexpected value.
        */
       private ByteBuffer getStateAbbreviationAsByteBuffer(final String stateName)
       {
          final String upperCaseStateName = stateName != null ? stateName.toUpperCase().replace(" ", "_") : "UNKNOWN";
          String abbreviation;
          try
          {
             // Two characters are treated as an abbreviation, anything else as a full name
             abbreviation = upperCaseStateName.length() == 2
                          ? State.fromAbbreviation(upperCaseStateName).getStateAbbreviation()
                          : State.valueOf(upperCaseStateName).getStateAbbreviation();
          }
          catch (Exception exception)
          {
             abbreviation = State.UNKNOWN.getStateAbbreviation();
          }
          return ByteBuffer.wrap(abbreviation.getBytes());
       }
    }

The above class listing references the State enum, which is shown next.

State.java

    package dustin.examples.cassandra.cqltypes;

    /**
     * Representation of state in the United States.
     */
    public enum State
    {
       ALABAMA("Alabama", "AL"), ALASKA("Alaska", "AK"), ARIZONA("Arizona", "AZ"),
       ARKANSAS("Arkansas", "AR"), CALIFORNIA("California", "CA"), COLORADO("Colorado", "CO"),
       CONNECTICUT("Connecticut", "CT"), DELAWARE("Delaware", "DE"),
       DISTRICT_OF_COLUMBIA("District of Columbia", "DC"), FLORIDA("Florida", "FL"),
       GEORGIA("Georgia", "GA"), HAWAII("Hawaii", "HI"), IDAHO("Idaho", "ID"),
       ILLINOIS("Illinois", "IL"), INDIANA("Indiana", "IN"), IOWA("Iowa", "IA"),
       KANSAS("Kansas", "KS"), KENTUCKY("Kentucky", "KY"), LOUISIANA("Louisiana", "LA"),
       MAINE("Maine", "ME"), MARYLAND("Maryland", "MD"), MASSACHUSETTS("Massachusetts", "MA"),
       MICHIGAN("Michigan", "MI"), MINNESOTA("Minnesota", "MN"), MISSISSIPPI("Mississippi", "MS"),
       MISSOURI("Missouri", "MO"), MONTANA("Montana", "MT"), NEBRASKA("Nebraska", "NE"),
       NEVADA("Nevada", "NV"), NEW_HAMPSHIRE("New Hampshire", "NH"), NEW_JERSEY("New Jersey", "NJ"),
       NEW_MEXICO("New Mexico", "NM"), NORTH_CAROLINA("North Carolina", "NC"),
       NORTH_DAKOTA("North Dakota", "ND"), NEW_YORK("New York", "NY"), OHIO("Ohio", "OH"),
       OKLAHOMA("Oklahoma", "OK"), OREGON("Oregon", "OR"), PENNSYLVANIA("Pennsylvania", "PA"),
       RHODE_ISLAND("Rhode Island", "RI"), SOUTH_CAROLINA("South Carolina", "SC"),
       SOUTH_DAKOTA("South Dakota", "SD"), TENNESSEE("Tennessee", "TN"), TEXAS("Texas", "TX"),
       UTAH("Utah", "UT"), VERMONT("Vermont", "VT"), VIRGINIA("Virginia", "VA"),
       WASHINGTON("Washington", "WA"), WEST_VIRGINIA("West Virginia", "WV"),
       WISCONSIN("Wisconsin", "WI"), WYOMING("Wyoming", "WY"), UNKNOWN("Unknown", "UK");

       private String stateName;

       private String stateAbbreviation;

       State(final String newStateName, final String newStateAbbreviation)
       {
          this.stateName = newStateName;
          this.stateAbbreviation = newStateAbbreviation;
       }

       public String getStateName()
       {
          return this.stateName;
       }

       public String getStateAbbreviation()
       {
          return this.stateAbbreviation;
       }

       /** Returns the state with the given two-letter abbreviation, or UNKNOWN if none matches. */
       public static State fromAbbreviation(final String candidateAbbreviation)
       {
          State match = UNKNOWN;
          if (candidateAbbreviation != null && candidateAbbreviation.length() == 2)
          {
             final String upperAbbreviation = candidateAbbreviation.toUpperCase();
             for (final State state : State.values())
             {
                if (state.stateAbbreviation.equals(upperAbbreviation))
                {
                   match = state;
                }
             }
          }
          return match;
       }
    }

We can also provide an implementation of the TypeSerializer interface returned by the getSerializer() method shown above. The class implementing TypeSerializer is typically most easily written by extending one of the numerous existing implementations of TypeSerializer that Cassandra provides in the org.apache.cassandra.serializers package. In my example, my custom serializer extends AbstractTextSerializer, and the only method I need to add has the signature public void validate(final ByteBuffer bytes) throws MarshalException. Both of my custom classes need to provide a reference to an instance of themselves via static access.
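Before turning to the serializer code, the name-to-abbreviation mapping that fromString performs above can be tried outside Cassandra. The following is a minimal sketch under stated assumptions: the enum is a trimmed-down, hypothetical three-entry version of the post's State enum, and the class and method names (StateNormalizer, normalize) are invented for illustration and are not part of the original code.

```java
import java.util.Locale;

class StateNormalizer {

    // Hypothetical three-entry version of the State enum, for illustration only.
    enum State {
        NEW_YORK("NY"), TEXAS("TX"), UNKNOWN("UK");

        final String abbreviation;

        State(String abbreviation) {
            this.abbreviation = abbreviation;
        }
    }

    // Maps a full state name or a two-letter abbreviation to the abbreviation that would be stored.
    static String normalize(String input) {
        if (input == null) {
            return State.UNKNOWN.abbreviation;
        }
        String key = input.trim().toUpperCase(Locale.ROOT).replace(' ', '_');
        for (State state : State.values()) {
            if (state.name().equals(key) || state.abbreviation.equals(key)) {
                return state.abbreviation;
            }
        }
        // Unrecognized input is not rejected; it falls back to the UNKNOWN marker.
        return State.UNKNOWN.abbreviation;
    }

    public static void main(String[] args) {
        System.out.println(normalize("New York")); // prints NY
        System.out.println(normalize("ny"));       // prints NY
        System.out.println(normalize("Atlantis")); // prints UK
    }
}
```

As in the post's implementation, invalid input is stored as "UK" rather than rejected; throwing an exception at that point instead would turn the type into a strict constraint.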
Here is the class that implements TypeSerializer via extension of AbstractTextSerializer:

UnitedStatesStateSerializer.java (implements TypeSerializer)

    package dustin.examples.cassandra.cqltypes;

    import org.apache.cassandra.serializers.AbstractTextSerializer;
    import org.apache.cassandra.serializers.MarshalException;

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;

    /**
     * Serializer for UnitedStatesState.
     */
    public class UnitedStatesStateSerializer extends AbstractTextSerializer
    {
       public static final UnitedStatesStateSerializer instance = new UnitedStatesStateSerializer();

       private UnitedStatesStateSerializer()
       {
          super(StandardCharsets.UTF_8);
       }

       /**
        * Validates provided ByteBuffer contents to ensure they can
        * be modeled in the UnitedStatesState Cassandra/CQL data type.
        * This allows for a full state name to be specified or for its
        * two-letter abbreviation to be specified and either is considered
        * valid.
        *
        * @param bytes ByteBuffer whose contents are to be validated.
        * @throws MarshalException Thrown if provided data is invalid.
        */
       @Override
       public void validate(final ByteBuffer bytes) throws MarshalException
       {
          try
          {
             // Decode a duplicate so that only the readable bytes are used and the
             // caller's buffer position is left untouched (array() would expose the
             // whole backing array)
             final String stringFormat =
                StandardCharsets.UTF_8.decode(bytes.duplicate()).toString().toUpperCase();
             // Full names use underscores in the enum, e.g. NEW_YORK
             final State state = stringFormat.length() == 2
                               ? State.fromAbbreviation(stringFormat)
                               : State.valueOf(stringFormat.replace(" ", "_"));
          }
          catch (Exception exception)
          {
             throw new MarshalException("Invalid model cannot be marshaled as UnitedStatesState.");
          }
       }
    }

With the classes for creating a custom CQL data type written, they need to be compiled into .class files and archived in a JAR file.
This process (compiling with javac -cp "C:\Program Files\DataStax Community\apache-cassandra\lib\*" -sourcepath src -d classes src\dustin\examples\cassandra\cqltypes\*.java and archiving the generated .class files into a JAR named CustomCqlTypes.jar with jar cvf CustomCqlTypes.jar *) is shown in the following screen snapshot.

The JAR with the class definitions of the custom CQL type classes needs to be placed in the Cassandra installation’s lib directory as demonstrated in the next screen snapshot.

With the JAR containing the custom CQL data type class implementations in the Cassandra installation’s lib directory, Cassandra should be restarted so that it will be able to “see” these custom data type definitions.

The next code listing shows a Cassandra Query Language (CQL) statement for creating a table using the new custom type dustin.examples.cassandra.cqltypes.UnitedStatesState.

createAddress.cql

    CREATE TABLE us_address
    (
       id uuid,
       street1 text,
       street2 text,
       city text,
       state 'dustin.examples.cassandra.cqltypes.UnitedStatesState',
       zipcode text,
       PRIMARY KEY(id)
    );

The next screen snapshot demonstrates the results of running the createAddress.cql code above by describing the created table in cqlsh.

The above screen snapshot demonstrates that the custom type dustin.examples.cassandra.cqltypes.UnitedStatesState is the type for the state column of the us_address table.

A new row can be added to the US_ADDRESS table with a normal INSERT.
For example, the following screen snapshot demonstrates inserting an address with this command:

    INSERT INTO us_address (id, street1, street2, city, state, zipcode) VALUES (blobAsUuid(timeuuidAsBlob(now())), '350 Fifth Avenue', '', 'New York', 'New York', '10118');

Note that while the INSERT statement inserted “New York” for the state, it is stored as “NY”.

If I run an INSERT statement in cqlsh using an abbreviation to start with, it still works, as shown in the output below:

    INSERT INTO us_address (id, street1, street2, city, state, zipcode) VALUES (blobAsUuid(timeuuidAsBlob(now())), '350 Fifth Avenue', '', 'New York', 'NY', '10118');

In my example, an invalid state does not prevent an INSERT from occurring, but instead persists the state as “UK” (for unknown) [see the implementation of this in UnitedStatesState.getStateAbbreviationAsByteBuffer(String)].

One of the first advantages of implementing a custom CQL datatype in Java that comes to mind is the ability to get behavior similar to that provided by check constraints in relational databases. For example, in this post, my sample ensured that any value entered in the state column for a new row was one of the fifty states of the United States, the District of Columbia, or “UK” for unknown. No other values can be stored in that column.

Another advantage of the custom data type is the ability to massage the data into a preferred form. In this example, I changed every state name to an uppercase two-letter abbreviation. In other cases, I might want to always store in uppercase, always store in lowercase, or map finite sets of strings to numeric values. The custom CQL datatype allows for customized validation and representation of values in the Cassandra database.

Conclusion

This post has been an introductory look at implementing custom CQL datatypes in Cassandra.
As I play with this concept more and try different things out, I hope to write another blog post on some more subtle observations that I make. As this post shows, it is fairly easy to write and use a custom CQL datatype, especially for Java developers.

Reference: Custom Cassandra Data Types from our JCG partner Dustin Marx at the Inspired by Actual Events blog.
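As a rough illustration of the validation and normalization behavior described in this post, the following is a minimal stand-alone sketch. It is not the post's UnitedStatesState implementation: the class name StateNormalizerSketch, its method names, and the three-entry lookup table are hypothetical and exist only for illustration. It deliberately avoids Cassandra classes so that it runs with only the JDK. It accepts either a full state name or a two-letter abbreviation, stores the uppercase abbreviation, and falls back to “UK” for anything it does not recognize.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/**
 * Hypothetical sketch (not the post's actual class): normalizes a state
 * name or abbreviation to an uppercase two-letter abbreviation.
 */
final class StateNormalizerSketch
{
   /** Abbreviation stored when the input is not recognized, as in the post. */
   static final String UNKNOWN = "UK";

   // Assumption: only a few entries are listed here for illustration;
   // a complete table would cover all fifty states and DC.
   private static final Map<String, String> NAME_TO_ABBREVIATION = new HashMap<>();
   static
   {
      NAME_TO_ABBREVIATION.put("NEW YORK", "NY");
      NAME_TO_ABBREVIATION.put("COLORADO", "CO");
      NAME_TO_ABBREVIATION.put("DISTRICT OF COLUMBIA", "DC");
   }

   /** Returns the two-letter abbreviation for the input, or "UK" if unknown. */
   static String normalize(final String input)
   {
      if (input == null)
      {
         return UNKNOWN;
      }
      final String upper = input.trim().toUpperCase(Locale.ROOT);
      if (upper.length() == 2)
      {
         // Two characters: treat as an abbreviation and check that it is known.
         return NAME_TO_ABBREVIATION.containsValue(upper) ? upper : UNKNOWN;
      }
      // Otherwise treat as a full state name.
      return NAME_TO_ABBREVIATION.getOrDefault(upper, UNKNOWN);
   }

   /** Encodes the normalized abbreviation as UTF-8 bytes for storage. */
   static ByteBuffer toByteBuffer(final String input)
   {
      return ByteBuffer.wrap(normalize(input).getBytes(StandardCharsets.UTF_8));
   }

   /** Decodes stored bytes back to a String without moving the buffer's position. */
   static String fromByteBuffer(final ByteBuffer bytes)
   {
      return StandardCharsets.UTF_8.decode(bytes.duplicate()).toString();
   }
}
```

With this sketch, StateNormalizerSketch.normalize("New York") and StateNormalizerSketch.normalize("ny") both yield the same stored value, which mirrors the way both INSERT statements in the post ended up persisting “NY”.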
