Towards a Theory of Test-Driven Development

This post examines how well we really understand the practice of Test-Driven Development (TDD).

Red, Green, Refactor

By now we all know that Test-Driven Development (TDD) follows a simple cycle consisting of these steps:

- Start by writing a test. Since there is no code, it will fail (Red)
- Write just enough code to make the test pass (Green)
- Clean up the code (Refactor)

The beauty of this division is that we can focus on one thing at a time.

Specify, Transform, Refactor

Although simple, TDD isn’t easy. To execute the TDD cycle well, we need a deeper understanding that we can only get from experience. For instance, after doing TDD for a while we may look at the steps as:

- Specify new required functionality
- Improve the functionality while keeping the design constant
- Improve the design while keeping the functionality constant

When we look at the TDD cycle in this light, we see that the Green and Refactor phases are each other's opposite.

Refactorings and Transformations

In the Refactor phase, we use Martin Fowler’s refactorings to clean up the code. Refactorings are standard alterations of the code that change its internal structure without changing its external behavior. Now, if the Green and Refactor phases are each other's opposite, then you might think that there are “opposite refactorings” as well. You would be right. Robert Martin’s transformations are standard alterations of the code that change its external behavior without changing its internal structure.

Automated Transformations?

Most of us use powerful IDEs to write our code. These IDEs support refactorings, which means that they can perform the code alteration for you in a manner that is guaranteed to be safe. So do we need something similar for transformations? I think not. Some transformations are so simple in terms of the changes to code that it wouldn’t actually save any effort to automate them; I don’t see a lot of room for improving the change from if to while, for instance. Other transformations simply have an unspecified effect. For example, how would you automate the statement->statements transformation? The crux is that refactorings keep the external behavior the same, and the tools depend on that property to implement the refactorings safely. Transformations don’t share that property.

Standardized Work

In the Specify/Transform/Refactor view of TDD, we write our programs by alternating between adding tests, applying transformations, and applying refactorings. In other words, if we look at the evolution of our non-test code through a series of diffs, then each diff shows either a transformation or a refactoring. It seems we are getting closer to the Lean principle of Standardized Work. What’s still missing, however, is a deeper insight into the Red/Specify phase.

How to Write Tests

The essential part of the Red/Specify phase is obviously to write a test. But how do we do that? For starters, how do we select the next test to implement? There is almost always more than one test to write for a given requirement, and the order in which you introduce tests makes a difference for the implementation. Yet there is very little advice on how to pick the next test, and it is sorely needed. Kent Beck has a kata for experimenting with test order, which helps in gaining understanding, but that’s a far cry from the well-developed theory we have for refactorings. So what do you think? If we understood this phase better, could we come up with the test-writing equivalent of transformations and refactorings?
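To make the contrast concrete, here is a small illustrative Java example of the if-to-while transformation mentioned above (the class and scenario are invented for illustration, not taken from the original post):

import java.util.List;
import java.util.Queue;

class TransformationExample {

    // Green phase: just enough code to pass a test that expects a single
    // pending item to be handled.
    static void drainOne(Queue<String> queue, List<String> out) {
        if (!queue.isEmpty()) {
            out.add(queue.remove());
        }
    }

    // After the if -> while transformation: the external behavior changes
    // (every pending item is now handled), while the internal structure of
    // the code stays essentially the same.
    static void drainAll(Queue<String> queue, List<String> out) {
        while (!queue.isEmpty()) {
            out.add(queue.remove());
        }
    }
}

A refactoring would be the mirror image: for example, extracting the loop body into a helper method changes the structure but leaves the observable behavior untouched.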
Reference: Towards a Theory of Test-Driven Development from our JCG partner Remon Sinnema at the Secure Software Development blog. ...

Processing huge files with Java

I recently had to process a set of files containing historical tick-by-tick FX market data and quickly realized that none of them could be read into memory using a traditional InputStream, because every file was over 4 gigabytes in size. Emacs couldn’t even open them. In this particular case I could have written a simple bash script that divides the files into smaller pieces and read them as usual. But I didn’t want that, since binary formats would invalidate this approach. So the way to handle this problem properly is to process regions of data incrementally using memory mapped files. What’s nice about memory mapped files is that they do not consume virtual memory or paging space, since they are backed by the file data on disk. Okay, let’s have a look at these files and extract some data. They seem to contain ASCII text rows with comma delimited fields.

Format: [currency-pair],[timestamp],[bid-price],[ask-price]
Example: EUR/USD,20120102 00:01:30.420,1.29451,1.2949

Fair enough, I could write a program for that format. But reading and parsing files are orthogonal concepts, so let’s take a step back and think about a generic design that can be reused in case I am confronted with a similar problem in the future. The problem boils down to incrementally decoding a set of entries encoded in an infinitely long byte array without exhausting memory. The fact that the example format is encoded in comma/line delimited text is irrelevant for the general solution, so it is clear that a decoder interface is needed in order to handle different formats. Also, not every entry can be parsed and kept in memory until the whole file is processed, so we need a way to incrementally hand off chunks of entries that can be written elsewhere, to disk or network, before they are garbage collected. An iterator is a good abstraction to handle this requirement because it acts like a cursor, which is exactly the point. Every iteration forwards the file pointer and lets us do something with the data. So first the Decoder interface. The idea is to incrementally decode objects from a MappedByteBuffer, or return null if no objects remain in the buffer. public interface Decoder<T> { public T decode(ByteBuffer buffer); } Then comes the FileReader, which implements Iterable. Each iteration will process the next 4096 bytes of data and decode them into a list of objects using the Decoder. Notice that FileReader accepts a list of files, which is nice since it enables traversal through the data without worrying about aggregation across files. By the way, 4096 byte chunks are probably a bit small for bigger files. public class FileReader<T> implements Iterable<List<T>> { private static final long CHUNK_SIZE = 4096; private final Decoder<T> decoder; private Iterator<File> files; private FileReader(Decoder<T> decoder, File... files) { this(decoder, Arrays.asList(files)); } private FileReader(Decoder<T> decoder, List<File> files) { this.files = files.iterator(); this.decoder = decoder; } public static <T> FileReader<T> create(Decoder<T> decoder, List<File> files) { return new FileReader<T>(decoder, files); } public static <T> FileReader<T> create(Decoder<T> decoder, File... 
files) { return new FileReader<T>(decoder, files); } @Override public Iterator<List<T>> iterator() { return new Iterator<List<T>>() { private List<T> entries; private long chunkPos = 0; private MappedByteBuffer buffer; private FileChannel channel; @Override public boolean hasNext() { if (buffer == null || !buffer.hasRemaining()) { buffer = nextBuffer(chunkPos); if (buffer == null) { return false; } } T result = null; while ((result = decoder.decode(buffer)) != null) { if (entries == null) { entries = new ArrayList<T>(); } entries.add(result); } // set next MappedByteBuffer chunk chunkPos += buffer.position(); buffer = null; if (entries != null) { return true; } else { Closeables.closeQuietly(channel); return false; } } private MappedByteBuffer nextBuffer(long position) { try { if (channel == null || channel.size() == position) { if (channel != null) { Closeables.closeQuietly(channel); channel = null; } if (files.hasNext()) { File file = files.next(); channel = new RandomAccessFile(file, "r").getChannel(); chunkPos = 0; position = 0; } else { return null; } } long chunkSize = CHUNK_SIZE; if (channel.size() - position < chunkSize) { chunkSize = channel.size() - position; } return channel.map(FileChannel.MapMode.READ_ONLY, chunkPos, chunkSize); } catch (IOException e) { Closeables.closeQuietly(channel); throw new RuntimeException(e); } } @Override public List<T> next() { List<T> res = entries; entries = null; return res; } @Override public void remove() { throw new UnsupportedOperationException(); } }; } } Next task is to write a Decoder and I decided to implement a generic TextRowDecoder for any comma delimited text file format, accepting number of fields per row and a field delimiter and returning an array of byte arrays. TextRowDecoder can then be reused by format specific decoders that maybe handle different character sets. 
public class TextRowDecoder implements Decoder<byte[][]> { private static final byte LF = 10; private final int numFields; private final byte delimiter; public TextRowDecoder(int numFields, byte delimiter) { this.numFields = numFields; this.delimiter = delimiter; } @Override public byte[][] decode(ByteBuffer buffer) { int lineStartPos = buffer.position(); int limit = buffer.limit(); while (buffer.hasRemaining()) { byte b = buffer.get(); if (b == LF) { // reached line feed so parse line int lineEndPos = buffer.position(); // set positions for one row duplication if (buffer.limit() < lineEndPos + 1) { buffer.position(lineStartPos).limit(lineEndPos); } else { buffer.position(lineStartPos).limit(lineEndPos + 1); } byte[][] entry = parseRow(buffer.duplicate()); if (entry != null) { // reset main buffer buffer.position(lineEndPos); buffer.limit(limit); // set start after LF lineStartPos = lineEndPos; } return entry; } } buffer.position(lineStartPos); return null; } public byte[][] parseRow(ByteBuffer buffer) { int fieldStartPos = buffer.position(); int fieldEndPos = 0; int fieldNumber = 0; byte[][] fields = new byte[numFields][]; while (buffer.hasRemaining()) { byte b = buffer.get(); if (b == delimiter || b == LF) { fieldEndPos = buffer.position(); // save limit int limit = buffer.limit(); // set positions for one row duplication buffer.position(fieldStartPos).limit(fieldEndPos); fields[fieldNumber] = parseField(buffer.duplicate(), fieldNumber, fieldEndPos - fieldStartPos - 1); fieldNumber++; // reset main buffer buffer.position(fieldEndPos); buffer.limit(limit); // set start after LF fieldStartPos = fieldEndPos; } if (fieldNumber == numFields) { return fields; } } return null; } private byte[] parseField(ByteBuffer buffer, int pos, int length) { byte[] field = new byte[length]; for (int i = 0; i < field.length; i++) { field[i] = buffer.get(); } return field; } } And this is how files are processed. Each list contain elements decoded from a single buffer and each element is an array of byte arrays as specified by the TextRowDecoder. TextRowDecoder decoder = new TextRowDecoder(4, comma); FileReader<byte[][]> reader = FileReader.create(decoder, file.listFiles()); for (List<byte[][]> chunk : reader) { // do something with each chunk } We could stop here but there was one more requirement. Every row contain a timestamp and each batch must be grouped according to periods of time instead of buffers, day-by-day or hour-by-hour. I still want to iterate through each batch so the immediate reaction was to create a Iterable wrapper for FileReader that would implement this behaviour. One additional detail is that each element must to provide its timestamp to PeriodEntries by implementing the Timestamped interface (not shown here). 
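The PeriodEntries wrapper below depends on the Timestamped interface that the post explicitly leaves out. As a rough sketch, and judging only from how normalizeInterval uses it further down, it might look no more elaborate than this (the millisecond-epoch assumption is mine):

public interface Timestamped {
    // Timestamp of the entry, assumed to be milliseconds since the epoch,
    // since normalizeInterval combines it with TimeZone offsets and
    // millisecond-based interval lengths.
    long getTime();
}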
public class PeriodEntries<T extends Timestamped> implements Iterable<List<T>> { private final Iterator<List<T extends Timestamped>> entriesIt; private final long interval; private PeriodEntries(Iterable<List<T>> entriesIt, long interval) { this.entriesIt = entriesIt.iterator(); this.interval = interval; }public static <T extends Timestamped> PeriodEntries<T> create(Iterable<List<T>> entriesIt, long interval) { return new PeriodEntries<T>(entriesIt, interval); } @Override public Iterator<List<T extends Timestamped>> iterator() { return new Iterator<List<T>>() { private Queue<List<T>> queue = new LinkedList<List<T>>(); private long previous; private Iterator<T> entryIt; @Override public boolean hasNext() { if (!advanceEntries()) { return false; } T entry = entryIt.next(); long time = normalizeInterval(entry); if (previous == 0) { previous = time; } if (queue.peek() == null) { List<T> group = new ArrayList<T>(); queue.add(group); } while (previous == time) { queue.peek().add(entry); if (!advanceEntries()) { break; } entry = entryIt.next(); time = normalizeInterval(entry); } previous = time; List<T> result = queue.peek(); if (result == null || result.isEmpty()) { return false; } return true; } private boolean advanceEntries() { // if there are no rows left if (entryIt == null || !entryIt.hasNext()) { // try get more rows if possible if (entriesIt.hasNext()) { entryIt = entriesIt.next().iterator(); return true; } else { // no more rows return false; } } return true; } private long normalizeInterval(Timestamped entry) { long time = entry.getTime(); int utcOffset = TimeZone.getDefault().getOffset(time); long utcTime = time + utcOffset; long elapsed = utcTime % interval; return time - elapsed; } @Override public List<T> next() { return queue.poll(); } @Override public void remove() { throw new UnsupportedOperationException(); } }; } } The final processing code did not change much by introducing this functionality, only one clean and tight for-loop that does not have to care about grouping elements across files, buffers and periods. PeriodEntries is also flexible enough to mange any length on the interval. TrueFxDecoder decoder = new TrueFxDecoder(); FileReader<TrueFxData> reader = FileReader.create(decoder, file.listFiles()); long periodLength = TimeUnit.DAYS.toMillis(1); PeriodEntries<TrueFxData> periods = PeriodEntries.create(reader, periodLength); for (List<TrueFxData> entries : periods) { // data for each day for (TrueFxData entry : entries) { // process each entry } } As you may realize, it would not have been possible to solve this problem with collections; choosing iterators was a crucial design decision to be able to parse terabytes of data without consuming too much heap space.   Reference: Processing huge files with Java from our JCG partner Kristoffer Sjogren at the deephacks blog. ...

Clojure: Reading and writing a reasonably sized file

In a post a couple of days ago I described some code I’d written in R to find all the features with zero variance in the Kaggle Digit Recognizer data set, and yesterday I started working on some code to remove those features. Jen and I had previously written some code to parse the training data in Clojure, so I thought I’d try and adapt that to write out a new file without the unwanted pixels. In the first version we’d encapsulated the reading of the file and parsing of it into a more useful data structure like so:

(defn get-pixels [pix] (map #( Integer/parseInt %) pix))(defn create-tuple [[ head & rem]] {:pixels (get-pixels rem) :label head})(defn tuples [rows] (map create-tuple rows))(defn parse-row [row] (map #(clojure.string/split % #",") row))(defn read-raw [path n] (with-open [reader (clojure.java.io/reader path)] (vec (take n (rest (line-seq reader))))))(def read-train-set-raw (partial read-raw "data/train.csv"))(def parsed-rows (tuples (parse-row (read-train-set-raw 42000))))

So the def parsed-rows gives an in-memory representation of the rows, where we’ve separated the label and pixels into different key entries in a map. We wanted to remove any pixels which had a variance of 0 across the data set, which in this case means that they always have a value of 0:

(def dead-to-us-pixels [0 1 2 3 4 5 6 7 8 9 10 11 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 52 53 54 55 56 57 82 83 84 85 111 112 139 140 141 168 196 392 420 421 448 476 532 560 644 645 671 672 673 699 700 701 727 728 729 730 731 754 755 756 757 758 759 760 780 781 782 783])(defn in? "true if seq contains elm" [seq elm] (some #(= elm %) seq))(defn dead-to-us? [pixel-with-index] (in? dead-to-us-pixels (first pixel-with-index)))(defn remove-unwanted-pixels [row] (let [new-pixels (->> row :pixels (map-indexed vector) (remove dead-to-us?) (map second))] {:pixels new-pixels :label (:label row)}))(defn -main [] (with-open [wrt (clojure.java.io/writer "/tmp/attempt-1.txt")] (doseq [line parsed-rows] (let [line-without-pixels (to-file-format (remove-unwanted-pixels line))] (.write wrt (str line-without-pixels "\n"))))))

We then ran the main method using ‘lein run’, which wrote out the new file. A screenshot of the heap space usage while this function was running looks like this:

While I was writing this version of the function I made a mistake somewhere and ended up passing the wrong data structure to one of the functions, which resulted in all the intermediate steps that the data structure goes through getting stored in memory and caused an OutOfMemory exception. A heap dump showed the following:

When I reduced the size of the erroneous collection by using a ‘take 10’ I got an exception indicating that the function couldn’t process the data structure, which allowed me to sort it out. I initially thought that the problem was to do with loading the file into memory at all, but since the above seems to work I don’t think it is. While I was working along that theory, Jen suggested it might make more sense to do the reading and writing of the files within a ‘with-open’, which tallies with a suggestion I came across in a StackOverflow post.
I ended up with the following code:

(defn split-on-comma [line] (string/split line #","))(defn clean-train-file [] (with-open [rdr (clojure.java.io/reader "data/train.csv") wrt (clojure.java.io/writer "/tmp/attempt-2.csv")] (doseq [line (drop 1 (line-seq rdr))] (let [line-with-removed-pixels ((comp to-file-format remove-unwanted-pixels create-tuple split-on-comma) line)] (.write wrt (str line-with-removed-pixels "\n"))))))

Which got called in the main method like this:

(defn -main [] (clean-train-file))

This version had the following heap usage: its peaks are slightly lower than the first version’s, and it seems like it buffers a bunch of lines, writes them out to the file (and therefore out of memory) and repeats.

Reference: Clojure: Reading and writing a reasonably sized file from our JCG partner Mark Needham at the Mark Needham Blog. ...

MapReduce Algorithms – Secondary Sorting

We continue with our series on implementing MapReduce algorithms found in the Data-Intensive Text Processing with MapReduce book. Other posts in this series:

- Working Through Data-Intensive Text Processing with MapReduce
- Working Through Data-Intensive Text Processing with MapReduce – Local Aggregation Part II
- Calculating A Co-Occurrence Matrix with Hadoop
- MapReduce Algorithms – Order Inversion

This post covers the pattern of secondary sorting, found in chapter 3 of Data-Intensive Text Processing with MapReduce. While Hadoop automatically sorts data emitted by mappers before it is sent to the reducers, what can you do if you also want to sort by value? You use secondary sorting, of course. With a slight manipulation to the format of the key object, secondary sorting gives us the ability to take the value into account during the sort phase. There are two possible approaches here. The first approach involves having the reducer buffer all of the values for a given key and do an in-reducer sort on the values. Since the reducer will be receiving all values for a given key, this approach could possibly cause the reducer to run out of memory. The second approach involves creating a composite key by adding a part of, or the entire, value to the natural key to achieve your sorting objectives. The trade-off between these two approaches is that an explicit sort on values in the reducer would most likely be faster (at the risk of running out of memory), while the “value to key” conversion approach offloads the sorting to the MapReduce framework, which lies at the heart of what Hadoop/MapReduce is designed to do. For the purposes of this post, we will consider the “value to key” approach. We will need to write a custom partitioner to ensure that all the data with the same key (the natural key, not including the composite key with the value) is sent to the same reducer, and a custom Comparator so the data is grouped by the natural key once it arrives at the reducer.

Value to Key Conversion

Creating a composite key is straightforward. What we need to do is analyze what part(s) of the value we want to account for during the sort and add the appropriate part(s) to the natural key. Then we need to work on the compareTo method, either in the key class or in a comparator class, to make sure the composite key is accounted for. We will be re-visiting the weather data set and include the temperature as part of the natural key (the natural key being the year and month concatenated together). The result will be a listing of the coldest day for a given month and year. This example was inspired by the secondary sorting example found in the Hadoop, The Definitive Guide book. While there are probably better ways to achieve this objective, it will be good enough to demonstrate how secondary sorting works.

Mapper Code

Our mapper code already concatenates the year and month together, but we will also include the temperature as part of the key. Since we have included the value in the key itself, the mapper will emit a NullWritable, where in other cases we would emit the temperature.
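The mapper below writes a TemperaturePair key, a class that is not shown in full in this excerpt. As a rough sketch only, here is what such a composite key might look like; the Writable field types, setters and serialization order are assumptions inferred from the snippets that follow, and the compareTo shown later in the post is repeated here so the class is complete:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;

public class TemperaturePair implements WritableComparable<TemperaturePair> {

    // Natural key: year and month concatenated together.
    private Text yearMonth = new Text();
    // Part of the value promoted into the key for secondary sorting.
    private IntWritable temperature = new IntWritable();

    public Text getYearMonth() { return yearMonth; }
    public IntWritable getTemperature() { return temperature; }
    public void setYearMonth(String ym) { yearMonth.set(ym); }
    public void setTemperature(int temp) { temperature.set(temp); }

    @Override
    public void write(DataOutput out) throws IOException {
        yearMonth.write(out);
        temperature.write(out);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        yearMonth.readFields(in);
        temperature.readFields(in);
    }

    @Override
    public int compareTo(TemperaturePair other) {
        int compareValue = this.yearMonth.compareTo(other.getYearMonth());
        if (compareValue == 0) {
            compareValue = temperature.compareTo(other.getTemperature());
        }
        return compareValue;
    }
}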
public class SecondarySortingTemperatureMapper extends Mapper<LongWritable, Text, TemperaturePair, NullWritable> { private TemperaturePair temperaturePair = new TemperaturePair(); private NullWritable nullValue = NullWritable.get(); private static final int MISSING = 9999; @Override protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException { String line = value.toString(); String yearMonth = line.substring(15, 21); int tempStartPosition = 87; if (line.charAt(tempStartPosition) == '+') { tempStartPosition += 1; } int temp = Integer.parseInt(line.substring(tempStartPosition, 92)); if (temp != MISSING) { temperaturePair.setYearMonth(yearMonth); temperaturePair.setTemperature(temp); context.write(temperaturePair, nullValue); } } }

Now that we have added the temperature to the key, we have set the stage for enabling secondary sorting. What’s left to do is write code that takes the temperature into account when necessary. Here we have two choices: write a Comparator, or adjust the compareTo method on the TemperaturePair class (TemperaturePair implements WritableComparable). In most cases I would recommend writing a separate Comparator, but the TemperaturePair class was written specifically to demonstrate secondary sorting, so we will modify the TemperaturePair class compareTo method.

@Override public int compareTo(TemperaturePair temperaturePair) { int compareValue = this.yearMonth.compareTo(temperaturePair.getYearMonth()); if (compareValue == 0) { compareValue = temperature.compareTo(temperaturePair.getTemperature()); } return compareValue; }

If we wanted to sort in descending order, we could simply multiply the result of the temperature comparison by -1. Now that we have completed the part necessary for sorting, we need to write a custom partitioner.

Partitioner Code

To ensure that only the natural key is considered when determining which reducer to send the data to, we need to write a custom partitioner. The code is straightforward and only considers the yearMonth value of the TemperaturePair class when calculating the reducer the data will be sent to.

public class TemperaturePartitioner extends Partitioner<TemperaturePair, NullWritable> { @Override public int getPartition(TemperaturePair temperaturePair, NullWritable nullWritable, int numPartitions) { return temperaturePair.getYearMonth().hashCode() % numPartitions; } }

While the custom partitioner guarantees that all of the data for the same year and month arrives at the same reducer, we still need to account for the fact that the reducer will group records by key.

Grouping Comparator

Once the data reaches a reducer, all data is grouped by key. Since we have a composite key, we need to make sure records are grouped solely by the natural key. This is accomplished by writing a custom grouping comparator. Below we have a Comparator that only considers the yearMonth field of the TemperaturePair class for the purposes of grouping the records together.
public class YearMonthGroupingComparator extends WritableComparator { public YearMonthGroupingComparator() { super(TemperaturePair.class, true); } @Override public int compare(WritableComparable tp1, WritableComparable tp2) { TemperaturePair temperaturePair = (TemperaturePair) tp1; TemperaturePair temperaturePair2 = (TemperaturePair) tp2; return temperaturePair.getYearMonth().compareTo(temperaturePair2.getYearMonth()); } }

Results

Here are the results of running our secondary sort job:

new-host-2:sbin bbejeck$ hdfs dfs -cat secondary-sort/part-r-00000
190101 -206
190102 -333
190103 -272
190104 -61
190105 -33
190106 44
190107 72
190108 44
190109 17
190110 -33
190111 -217
190112 -300

Conclusion

While sorting data by value may not be a common need, it’s a nice tool to have in your back pocket when needed. Also, we have been able to take a deeper look at the inner workings of Hadoop by working with custom partitioners and grouping comparators. A sketch of how these pieces might be wired into a Job configuration follows at the end of this post. Thank you for your time.

Resources

- Data-Intensive Text Processing with MapReduce by Jimmy Lin and Chris Dyer
- Hadoop: The Definitive Guide by Tom White
- Source Code and Tests from blog
- Hadoop API
- MRUnit for unit testing Apache Hadoop map reduce jobs

Reference: MapReduce Algorithms – Secondary Sorting from our JCG partner Bill Bejeck at the Random Thoughts On Coding blog. ...
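As promised above, here is a hedged sketch of a driver showing where the custom partitioner and grouping comparator plug into the job configuration. The driver class, input/output handling and Job.getInstance boilerplate are assumptions (and vary between Hadoop versions); the reducer from the original post is not part of this excerpt, so it is only referenced in a comment:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SecondarySortDriver {

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "secondary-sort");
        job.setJarByClass(SecondarySortDriver.class);

        job.setMapperClass(SecondarySortingTemperatureMapper.class);
        // job.setReducerClass(...) would register the reducer (not shown in this excerpt).

        // The two classes that make the "value to key" secondary sort work:
        job.setPartitionerClass(TemperaturePartitioner.class);
        job.setGroupingComparatorClass(YearMonthGroupingComparator.class);

        job.setMapOutputKeyClass(TemperaturePair.class);
        job.setMapOutputValueClass(NullWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}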

Hibernate Search 4.2 final released: spatial query supported

JBoss has announced the release of Hibernate Search 4.2 final. You may download it from Sourceforge or use the Maven artifacts. The new release includes some interesting features:

- Hibernate Search now supports spatial queries. With the Spatial extensions you can combine fulltext queries with restrictions based on distance from a point in space, filter results based on distance from coordinates, or sort results on such a distance criterion.
- Compatibility with Apache Lucene 3.6, the suggested version being 3.6.2
- Integration with Apache Tika for indexing files like MP3s or Word documents (as well as indexing the metadata of those files)
- Simplified JBoss AS 7 integration via a set of modules
- Performance improvements for the Near Real Time support
- Compatibility with the latest Hibernate ORM 4.1.9.Final

In order to migrate to the latest version, follow the Hibernate Search Migration Guide. ...
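For readers who want a feel for the new spatial support, here is a rough sketch of what an indexed entity and a distance-restricted query might look like. This is pieced together from memory of the Hibernate Search spatial documentation, so treat the annotation placement and the DSL method names as assumptions to be verified against the 4.2 reference guide:

// Entity indexed with a latitude/longitude pair (class-level @Spatial).
@Entity
@Indexed
@Spatial
public class Hotel {

    @Id @GeneratedValue
    private Long id;

    @Field
    private String name;

    @Latitude
    private Double latitude;

    @Longitude
    private Double longitude;
}

// Building a query that keeps only results within 50 km of a point.
// The spatial() entry point and the onDefaultCoordinates()/onCoordinates(...)
// variants are the part most worth double-checking against the docs.
QueryBuilder builder = fullTextSession.getSearchFactory()
        .buildQueryBuilder().forEntity(Hotel.class).get();

org.apache.lucene.search.Query spatialQuery = builder.spatial()
        .onDefaultCoordinates()
        .within(50, Unit.KM)
        .ofLatitude(48.858)
        .andLongitude(2.294)
        .createQuery();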

Garbage Collection Analysis of PCGen

Introduction

I decided to combine two software loves of mine and perform some analysis on PCGen, a popular Java based open source character generator for role-playing games. I used Censum, our (jClarity’s) new Garbage Collection log analysis tool, to perform the analysis. This write-up assumes you have a passing familiarity with Garbage Collection (GC) on the JVM. If you’re not familiar with GC then I recommend you join our Friends of jClarity programme. We’re building up a knowledge base around GC to share freely with the whole Java community, and we’d love for you to come and validate this!

The projects

The two projects I’m using are PCGen (the project I’m doing the analysis on) and Censum (the GC analysis tool).

PCGen

PCGen is a popular character generator and maintenance program for d20 role playing systems such as Star Wars and Dungeons & Dragons. It’s a long running project (> 10 years) and consists of a large (~750,000 LOC) Java Swing desktop tool that has a ton of proprietary format data files. Disclaimer: I’m the Chairperson for PCGen.

PCGen is a data intensive tool. In order to drive its rules engine and to meet the requirement of a responsive UI (with lots of detail), much of this data is loaded up front and held in memory. Users have reported the following issues in the past:

- Pauses often occur when dealing with multiple characters and/or high level characters.
- When creating a character of a high level, or if more than 2 characters were created, PCGen simply died.
- More tech savvy users reported that they saw an OutOfMemoryError in the pcgen log file.

Some work has been done to mitigate this poor performance in the latest version of PCGen (6.0), so I decided to use Censum to determine if those changes had improved matters.

Censum

Censum is jClarity’s new Garbage Collection analysis tool. Its focus is to use powerful analytics to crunch through the raw log data and give busy developers (like contributors to PCGen!) plain English answers quickly. Disclaimer: I’m the CTO of jClarity.

Censum is a new product that is free for Open Source projects, and of course if you wish to purchase a copy you can get a free eval license (click on Try Censum) today!

TLDR – The Conclusion

We have good news, some information and bad news.

The Good News

The positive news is that the default heap settings that PCGen starts with (-Xms256m -Xmx512m) are now adequate in terms of being sized well enough to keep PCGen running. Even after creating a 5th complex character, there was no OutOfMemoryError. Censum shows that once a full GC is run (typically after each new character has been created), a large percentage of the heap is recovered, and each character takes up about 25-50MB of heap space. We can very roughly extrapolate that, with a starting (data loaded) point of ~125MB, PCGen can comfortably hold about 10-15 characters open at any one time without crashing. This is perhaps not enough for a GM to have his Goblin horde up and running, but certainly enough for most regular parties!

The Bad News

The slightly more negative news is Censum reporting that PCGen has relatively high pause times, likely triggered by too much premature promotion. Too much premature promotion pushes memory into the old gen space more quickly than we’d like. This can have the knock-on effect of causing more old gen collections as well as full GCs, naturally leading to more pause times. See the full analysis section for extra details on high pause times, premature promotion and what PCGen can do about it.

Where to from here?
PCGen could follow Censum’s ‘stop-gap’ recommendation to alter the size of the young generational space. By using the -XX:NewSize flag and setting that to ~256M, the high pause times problem is alleviated. However, the longer term solution is for PCGen to continue reducing the impact of their data structures (in particular the representation of a player character). This is in fact an ongoing project for PCGen today!

The technical setup

PCGen is typically run from a shell script with the default heap settings of -Xms256m and -Xmx512m. The script was altered to provide the minimum set of flags required to produce a GC log that can be analysed. The flags that were added to the java command were: -verbose:gc -Xloggc:pcgen_gc.log -XX:+PrintGCDetails -XX:+PrintTenuringDistribution

-verbose:gc and -Xloggc:pcgen_gc.log produce a basic log that outputs to a file called pcgen_gc.log. -XX:+PrintGCDetails provides the absolute minimum set of GC allocation information that Censum needs to perform an analysis. Finally, -XX:+PrintTenuringDistribution gives useful information on when objects are moving from a young generational space (eden, survivor 1 and survivor 2) to an old generation space (tenured). All of these options have little to no impact on a running JVM. You should always have these switched on in Production!

PCGen was run with Oracle’s Java 7u10 on an MBP running Mac OS 10.7.5, with 8GB of RAM, a 256MB SSD drive and a hyperthreaded Dual Core 2.8GHz i7 processor.

The PCGen activities

PCGen begins by loading up basic game mode and system files plus the basic UI to load data sources. The next step is for the user to select which data sources to load (roleplaying game rule sets). The popular SRD 3.5 with Monsters set was loaded (Dungeons and Dragons 3.5e). A character (Karianna) was created level by level into a 20th level Wizard with fully loaded spells, equipment, and a Cat familiar (effectively a 2nd character in PCGen). Several more complex characters were added after that, including a Great Wyrm Blue Dragon (loads of data!).

Analysis

I’ll cover the initial data loading use case and then general ongoing usage (character creation).

Data Loading

I was curious about the memory impact of the initial data load. Although I get fast loading times with my SSD drive, having PCGen load its data without memory problems is certainly a project goal! Here is what Censum showed in terms of heap allocation after GCs. As you can see, the initial load of data caused a number of young generational collections and one old (tenured) GC event at the end of the data load. The heap usage climbed to a max of about 325MB, but after garbage was collected, the heap usage fell back to about 100MB. Not too bad for loading about 15 thick rule books worth of data! However, data loading for PCGen is a little like the start-up period for a web/application server such as Tomcat. In terms of your GC analysis it’s generally best to exclude it as a one-off start-up cost, as opposed to analysing your running application.

Creating characters

Creating Karianna and advancing her to 20th level involves filling in details across 13 main tabs, ~20 sub tabs and a good deal of data! Another 4 characters were created of similar complexity, some friendly (a Cat familiar) and some not (a Great Wyrm Blue Dragon).
A few screenshots of the process follow: Skills, Equipment and the Embedded Character Sheet.

Censum’s Analysis

On opening the log file, Censum took me immediately to its Analytics Summary screen, which lets me know at a glance how PCGen’s garbage collection is going.

The good news

Immediately I know that:

- I have the right flags to collect enough GC data for analytics to work properly
- My GC throughput is good (PCGen spends its time running, not garbage collecting)
- There are no nasty System.gc() calls (generally not good practice).

The informational news

Memory Utilisation (which a memory leak is a subset of) and Memory Pool Sizes are informational, as the log has not yet gathered 24 hours of data (our recommended minimum to see a full working day’s cycle for an application).

The bad news

PCGen appears to have high pause times as well as a premature promotion problem. Let’s dive into those a bit further!

High Pause Times

High pause times are caused by GC collectors having to pause application threads in order to clean up object references in a safe manner. The more object references the collectors have to scan over and clean up, the longer the pause! The longest pauses are usually caused by full GC collections, where the entire Java heap (i.e. both the young and old gen spaces are getting really full) is cleaned up. As a user I noticed pauses a couple of times, not enough to really disturb me, but I am aware that I have extremely good hardware and that these pause times may be significantly worse for other users. As Censum points out, the 0.15% of time spent paused is not a major concern; it’s the 0.666 second pause time that’s concerning. However, I remembered that the highest pause time could come from the initial data load in PCGen. To correlate this, Censum provides a graph of pause times. The data load was the worst offender, but certainly for each character created there was a good ½ second pause around each character creation point due to a full GC. Again, ½ a second wasn’t too annoying for me in the context of using PCGen, but as Censum shows, full GCs take time, so PCGen should look to reduce the number of full GCs. In this case we know that we are probably getting more full GCs than we’d like due to the other warning that Censum gives us – too much premature promotion.

Premature Promotion

Premature promotion basically means that objects that should be getting collected in the young gen space are being promoted into the old gen space before their age is up. This ‘age’ is known as the tenuring threshold, and is based on a combination of software research done in the 1980s and the JVM’s runtime heuristics. Premature promotion can occur when:

- The rate of new objects being created overwhelms the young gen space
- The size of the objects being created is too large to fit into the young gen space, e.g. large contiguous blocks of memory.

This has a knock-on effect of putting pressure on the old gen space. It fills up more quickly, and therefore more old gen collections and eventually full GCs occur, leading to more frequent pause times. When I go to take a look and see how soon objects should be promoted and how early they are getting promoted, I get an answer immediately. The Tenuring Summary screen shows us that the tenuring threshold is set to 15 (objects can survive ~15 collections in young gen before being promoted naturally to old gen). Note also that 100% of objects are being prematurely promoted! ...
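Tying the flag discussion from ‘The technical setup’ together with the stop-gap recommendation, a PCGen launch command along these lines would produce an analysable GC log and bump the young generation size. The jar name, and the fact that PCGen is normally launched via its own shell script, make this an illustrative sketch rather than a drop-in replacement:

java -Xms256m -Xmx512m -XX:NewSize=256m \
     -verbose:gc -Xloggc:pcgen_gc.log \
     -XX:+PrintGCDetails -XX:+PrintTenuringDistribution \
     -jar pcgen.jar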

How many threads do I need?

It depends on your application. But for those who wish to have some insight into how to squeeze the most out of all those expensive cores you have purchased for your production site – bear with me and I will shed some light on the mysteries surrounding multi-threaded Java applications. The content is “optimized” towards the most typical Java EE application, which has a web frontend allowing end users to initiate a lot of small transactions within the application, with a significant part of each transaction spent waiting for some external resource, such as a query to return from the database or from any other integrated data source. But most of the content is also relevant for other applications, such as computation-heavy modeling applications or data-chugging batch processes.

But let’s start with the basics. In the type of application we are describing you tend to have a lot of users interacting with your application. Be it tens of simultaneously active users or tens of thousands – all those users expect the application to respond to them in a timely manner. And this is where you feel grateful for the operating system designers. Those guys had figured this kind of need out way before anybody had even dreamt about the HTTP protocol. The solution used is beneficial in situations where you create more threads in your software than the underlying hardware can simultaneously execute. At the hardware level you also have threads, such as the cores on your CPU, or a virtualized environment like Intel with its Hyperthreading. In any case – our application at hand can easily have spawned way more software threads than the underlying hardware can support directly. What your OS then does is similar to simple round-robin scheduling, during which each software thread gets its turn, called a time slice, to run on the actual hardware. Time slicing allows all threads to progress. Otherwise it is easy to imagine a situation where one of the users has initiated a truly expensive task and all other threads serving other users are starved.

So we have this amazing time slicing going on. Wouldn’t it then be feasible to set the number of threads to some LARGE_NUMBER and be done with it? Apparently not. There is overhead included, in fact several types of overhead. So in order to make an educated decision while tuning your threads, let’s introduce the problems caused by having a LARGE_NUMBER of threads one by one.

Register state saving/restoring. Processor registers contain a lot of state, which gets saved to caches each time the scheduler moves to the next task, and then restored when the time comes. Luckily the time slices allocated by schedulers are relatively large, so the save/restore overhead to and from the registers will most often not be the meanest of our enemies in multithreaded environments.

Locks. When the time slice is consumed by a lock-holding thread, then all other threads waiting for this particular lock must wait until the lock holder gets another slice and another chance to free the lock. So – if you have a lot of synchronization going on, check out your threads’ behavior under heavy load. There is a chance that your synchronization code is causing a lot more context switching to take place because of the lock-holding threads. Analyzing thread dumps would be a good place to start investigating this peril.

Thrashing virtual memory. All operating systems take advantage of virtual memory backed by external storage,
swapping least recently used (LRU) data from memory out to a disk drive when the need arises. Which is good. But if you are now running your applications with limited memory and lots of threads fighting to fit their stack and private data into memory, then you might run into problems. In each time-slicing round you might have threads swapping data in and out from the external storage, which will significantly decrease your application’s performance. This is especially nasty for Java applications: whenever you start swapping your heap, each Full GC run is going to take forever. Some gurus go as far as recommending turning off swapping at the OS level. In Linux distros you can achieve this via swapoff -a. But the good news is that this problem has been significantly reduced in recent years, both with widespread 64-bit OS deployments allowing larger RAM, and with SSDs replacing traditional spinning disks all around the world. But be aware of the enemy, and when in doubt – check the page in/out ratios for your processes.

Last but not least – thread cache state. In all modern processors you have caches built next to your cores, enabling operations to be completed up to 100x faster than on data residing in RAM. Which is definitely cool. But what is uncool is when your threads start fighting for this extremely limited space. Then again the LRU algorithm in charge of the cache starts cleaning it out to make room for new data – which could be the data the last thread put into the cache during its time slice. So your threads can end up cleaning each other’s data out of the caches, again creating a thrashing problem. If you are running on Intel architecture, a tool which might help you out in this case is Intel’s VTune Performance Analyzer.

So maybe throwing a LARGE_NUMBER of threads into your application configuration would not be the wisest thing to do. But what hints could be given when configuring the number of threads? First, certain applications can be configured to run with the number of threads equal to the number of underlying hardware threads. This might not be the case for the typical web application out there, but there are definitely good cases supporting this strategy. Note that when your threads are waiting on an external resource such as a relational database, those threads are removed from the round-robin schedule. So in a typical Java EE application it is not uncommon to have a lot more threads than underlying hardware and still run without lock contention or other problems.

Next, it would be wise to segment your threads into different groups used for different purposes. Typical cases would involve separating computing threads from I/O threads. Computing threads tend to be busy most of the time, so it is important to keep their count below the underlying hardware capacity. I/O threads, such as those performing operations requiring database round-trips, are on the other hand waiting most of the time, and thus do not contribute to the fight for resources too often. So it is safe to have the number of I/O threads (way) higher than the number of hardware threads supporting your application.

Then you should minimize thread creation and destruction. As those tend to be expensive operations, look into pooling solutions. You could be using Java EE infrastructure with thread pools already built in, or you can take a look at java.util.concurrent.ThreadPoolExecutor and the like for a solution.
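To make the compute vs. I/O split concrete, here is a minimal sketch using the standard java.util.concurrent executors. The pool sizes, and in particular the multiplier for I/O threads, are illustrative assumptions – measure under load before settling on numbers:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadPools {

    // CPU-bound work: keep the pool at (or near) the number of hardware threads.
    private static final int CORES = Runtime.getRuntime().availableProcessors();
    static final ExecutorService COMPUTE_POOL = Executors.newFixedThreadPool(CORES);

    // I/O-bound work (database calls, remote services): these threads spend most
    // of their time waiting, so the pool can safely be much larger than CORES.
    // The factor of 10 is purely illustrative.
    static final ExecutorService IO_POOL = Executors.newFixedThreadPool(CORES * 10);
}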
But you should also not be too shy when on occasion you need to increase or decrease the number of threads – just avoid creating and removing them on events as predictable as the next HTTP request or JDBC connection.

And as the last piece of advice, we are handing out the most important one. Measure. Tweak the sizes of your thread pools and run your application under load. Measure both throughput and latency. Then optimize to achieve your goals. And then measure again. Rinse and repeat until you are satisfied with the result. Don’t make any assumptions about how CPUs will perform. The amount of magic going on in CPUs these days is enormous. Note also that virtualization and JIT runtime optimization will add additional layers of complexity. But those will be subjects for other talks, for which you will be notified in time if you subscribe to our Twitter feed.

While writing the article, the following resources were used as a source of inspiration:

- Arch Robinson’s post about how many threads will hurt the performance
- Various Stackoverflow questions and comments:
  - http://stackoverflow.com/questions/130506/how-many-threads-should-i-use-in-my-java-program
  - http://stackoverflow.com/questions/763579/how-many-threads-can-a-java-vm-support
  - http://stackoverflow.com/questions/481970/how-many-threads-is-too-many

And yes, this article is the first hint about our research in other problem domains besides memory leaks. We cannot yet predict if and when we are going to ship a solution for all the locking and cache contention problems out there, but there is definitely hope.

Reference: How many threads do I need? from our JCG partner Nikita Salnikov Tarnovski at the Plumbr Blog blog. ...

Dev vs QA, should there really be a distinction?

We had our scrum of scrums meeting last Wednesday, where all scrum masters meet up with our line manager to discuss issues, bottlenecks and success stories of the previous sprint. One issue highlighted was the fact that sometimes the QA (Quality Assurance) personnel are clogged up with testing issues that are in the ‘Ready for test’ stage. So one idea that came about was that, given that some developer resources are freed up during this period, it would be better to put some of their time into testing those issues.

My first reaction was ‘HELL NO!!’. My primary concern was that a developer will never have the mindset of a QA person when it comes to testing, since he/she would always have a stringent approach on how to view the application. Also there is a notion among a few that doing QA is so ‘feminine’ (again, this is what I have observed, and it would differ from the context I am in). So I took a step back, took a deep breath, turned off my dev-mode and looked at the suggestion in a more holistic manner. I’m a person who loves to acquire new knowledge, be it from reading books and journals or simply through others’ experience (young or old). So for me this turned out to be a way I could grab some new knowledge, especially about software testing and quality.

Now I know writing unit tests counts for something, yet that, as its name dictates, is unit-level testing. When it comes to quality assurance, that is a whole new beast in its own right. Now you have to think from the perspective of how users interact with the system, and not in terms of the way you wrote the code for the application to behave. By learning the art of QA testing, I believe I would be able to write better quality code, since I would have the user’s mindset and approach when testing my own code before I push it into version control.

Also, in scrum the notion is that we as a team commit to deliver a set of issues we plan for a particular sprint. Hence whether I do development or testing doesn’t really matter, since either way I am contributing towards helping my team achieve our commitment. It should not be thought of as someone else’s responsibility. It’s the team’s responsibility to deliver the planned work on time and with quality. But such a change of mindset does not happen overnight. It’s not merely about going and telling each developer that he/she has got to start testing issues from the next sprint. I believe in leading by example. It’s a win-win situation, since I gain the extra skill of quality assurance testing, whilst the team will (hopefully) emancipate themselves from the notion that testing is not part of a developer’s responsibility. Change needs time to sink in, and you can’t coerce anyone into embracing change, but you can influence them through your actions.

What if some people never get on board with this idea? Well, I would say it’s their loss and my gain, since in this dynamic business context we live in, I strongly believe that if you try to stay with the pack, you will always be where you are right now. Break away from restrictive thinking; like Lion-O of Thundercats tells his awesome sword to give him sight beyond sight, we should strive to look through the limitations, think in ways we usually would not, and get our lazy bottoms out of the conservative thinking mode.

Reference: Dev vs QA, should there really be a distinction? from our JCG partner Dinuka Arseculeratne at the My Journey Through IT blog. ...

QOTD: Java Thread vs. Java Heap Space

The following question is quite common and is related to OutOfMemoryError: unable to create new native thread problems during the JVM thread creation process, and to the JVM thread capacity. This is also a typical interview question I ask new technical candidates (senior role). I recommend that you attempt to provide your own response before looking at the answer.

Question: Why can’t you increase the JVM thread capacity (total # of threads) by expanding the Java heap space capacity via -Xmx?

Answer: The Java thread creation process requires native memory to be available to the JVM process. Expanding the Java heap space via the -Xmx argument will actually reduce your Java thread capacity, since this memory will be “stolen” from the native memory space.

- For a 32-bit JVM, the Java heap space is in a race with the native heap, including the thread capacity
- For a 64-bit JVM, the thread capacity will mainly depend on your OS physical & virtual memory availability, along with your current OS process related tuning parameters

In order to better understand this limitation, I now propose to you the following video tutorial. You can also download the sample Java program from the link below: https://docs.google.com/file/d/0B6UjfNcYT7yGazg5aWxCWGtvbm8/edit

Reference: QOTD: Java Thread vs. Java Heap Space from our JCG partner Pierre-Hugues Charbonneau at the Java EE Support Patterns & Java Tutorial blog. ...
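As a companion to the answer above, here is a small illustrative program (not the sample linked in the post) that keeps creating parked threads until the JVM refuses to create more. Each thread's stack comes out of native memory, outside the heap sized by -Xmx, which is why a bigger heap does not buy you more threads on a 32-bit JVM:

public class ThreadCapacityDemo {

    public static void main(String[] args) {
        int created = 0;
        try {
            while (true) {
                Thread t = new Thread(new Runnable() {
                    @Override
                    public void run() {
                        try {
                            Thread.sleep(Long.MAX_VALUE); // park the thread forever
                        } catch (InterruptedException ignored) {
                        }
                    }
                });
                t.setDaemon(true);
                t.start();
                created++;
            }
        } catch (OutOfMemoryError e) {
            // Typically "unable to create new native thread"
            System.out.println("Created " + created + " threads before: " + e.getMessage());
        }
    }
}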

@Cacheable overhead in Spring

Spring 3.1 introduced a great caching abstraction layer. Finally we can abandon all the home-grown aspects, decorators and caching-related code polluting our business logic. Since then we can simply annotate heavyweight methods and let Spring and the AOP machinery do the work:

@Cacheable("books") public Book findBook(ISBN isbn) {...}

"books" is a cache name, the isbn parameter becomes the cache key and the returned Book object will be placed under that key. The meaning of the cache name is dependent on the underlying cache manager (EhCache, concurrent map, etc.) – Spring makes it easy to plug in different caching providers. But this post won’t be about the caching feature in Spring…

Some time ago my teammate was optimizing quite low-level code and discovered an opportunity for caching. He quickly applied @Cacheable just to discover that the code performed worse than it used to. He got rid of the annotation and implemented caching himself manually, using good old java.util.ConcurrentHashMap. The performance was much better. He blamed @Cacheable and Spring AOP overhead and complexity. I couldn’t believe that a caching layer could perform so poorly until I had to debug the Spring caching aspects a few times myself (some nasty bug in my code; you know, cache invalidation is one of the two hardest things in CS). Well, the caching abstraction code is much more complex than one would expect (after all it’s just get and put!), but does that necessarily mean it must be that slow? In science we don’t believe and trust, we measure and benchmark. So I wrote a benchmark to precisely measure the overhead of the @Cacheable layer.

The caching abstraction layer in Spring is implemented on top of Spring AOP, which can in turn be implemented on top of Java proxies, CGLIB generated subclasses or AspectJ instrumentation. Thus I’ll test the following configurations:

- no caching at all – to measure how fast the code is with no intermediate layer
- manual cache handling using ConcurrentHashMap in business code
- @Cacheable with CGLIB implementing AOP
- @Cacheable with java.lang.reflect.Proxy implementing AOP
- @Cacheable with AspectJ compile time weaving (as a similar benchmark shows, CTW is slightly faster than LTW)
- Home-grown AspectJ caching aspect – something between manual caching in business code and the Spring abstraction

Let me reiterate: we are not measuring the performance gain of caching and we are not comparing various cache providers. That’s why our test method is as fast as it can be and I will be using the simplest ConcurrentMapCacheManager from Spring. So here is the method in question:

public interface Calculator { int identity(int x); } public class PlainCalculator implements Calculator { @Cacheable("identity") @Override public int identity(int x) { return x; } }

I know, I know, there is no point in caching such a method. But I want to measure the overhead of the caching layer (during a cache hit, to be specific).
Each caching configuration will have its own ApplicationContext as you can’t mix different proxying modes in one context: public abstract class BaseConfig { @Bean public Calculator calculator() { return new PlainCalculator(); } } @Configuration class NoCachingConfig extends BaseConfig {} @Configuration class ManualCachingConfig extends BaseConfig { @Bean @Override public Calculator calculator() { return new CachingCalculatorDecorator(super.calculator()); } } @Configuration abstract class CacheManagerConfig extends BaseConfig { @Bean public CacheManager cacheManager() { return new ConcurrentMapCacheManager(); } } @Configuration @EnableCaching(proxyTargetClass = true) class CacheableCglibConfig extends CacheManagerConfig {} @Configuration @EnableCaching(proxyTargetClass = false) class CacheableJdkProxyConfig extends CacheManagerConfig {} @Configuration @EnableCaching(mode = AdviceMode.ASPECTJ) class CacheableAspectJWeaving extends CacheManagerConfig { @Bean @Override public Calculator calculator() { return new SpringInstrumentedCalculator(); } } @Configuration @EnableCaching(mode = AdviceMode.ASPECTJ) class AspectJCustomAspect extends CacheManagerConfig { @Bean @Override public Calculator calculator() { return new ManuallyInstrumentedCalculator(); } } Each @Configuration class represents one application context. CachingCalculatorDecorator is a decorator around real calculator that does the caching (welcome to the 1990s): public class CachingCalculatorDecorator implements Calculator { private final Map<Integer, Integer> cache = new java.util.concurrent.ConcurrentHashMap<Integer, Integer>(); private final Calculator target; public CachingCalculatorDecorator(Calculator target) { this.target = target; } @Override public int identity(int x) { final Integer existing = cache.get(x); if (existing != null) { return existing; } final int newValue = target.identity(x); cache.put(x, newValue); return newValue; } } SpringInstrumentedCalculator and ManuallyInstrumentedCalculator are exactly the same as PlainCalculator but they are instrumented by AspectJ compile-time weaver with Spring and custom aspect accordingly. My custom caching aspect looks like this: public aspect ManualCachingAspect { private final Map<Integer, Integer> cache = new ConcurrentHashMap<Integer, Integer>(); pointcut cacheMethodExecution(int x): execution(int com.blogspot.nurkiewicz.cacheable.calculator.ManuallyInstrumentedCalculator.identity(int)) && args(x); Object around(int x): cacheMethodExecution(x) { final Integer existing = cache.get(x); if (existing != null) { return existing; } final Object newValue = proceed(x); cache.put(x, (Integer)newValue); return newValue; } } After all this preparation we can finally write the benchmark itself. At the beginning I start all the application contexts and fetch Calculator instances. Each instance is different. For example noCaching is a PlainCalculator instance with no wrappers, cacheableCglib is a CGLIB generated subclass while aspectJCustom is an instance of ManuallyInstrumentedCalculator with my custom aspect woven. 
private final Calculator noCaching = fromSpringContext(NoCachingConfig.class); private final Calculator manualCaching = fromSpringContext(ManualCachingConfig.class); private final Calculator cacheableCglib = fromSpringContext(CacheableCglibConfig.class); private final Calculator cacheableJdkProxy = fromSpringContext(CacheableJdkProxyConfig.class); private final Calculator cacheableAspectJ = fromSpringContext(CacheableAspectJWeaving.class); private final Calculator aspectJCustom = fromSpringContext(AspectJCustomAspect.class); private static <T extends BaseConfig> Calculator fromSpringContext(Class<T> config) { return new AnnotationConfigApplicationContext(config).getBean(Calculator.class); }

I’m going to exercise each Calculator instance with the following test. The additional accumulator is necessary, otherwise the JVM might optimize away the whole loop (!):

private int benchmarkWith(Calculator calculator, int reps) { int accum = 0; for (int i = 0; i < reps; ++i) { accum += calculator.identity(i % 16); } return accum; }

Here is the full caliper test without the parts already discussed:

public class CacheableBenchmark extends SimpleBenchmark { //... public int timeNoCaching(int reps) { return benchmarkWith(noCaching, reps); } public int timeManualCaching(int reps) { return benchmarkWith(manualCaching, reps); } public int timeCacheableWithCglib(int reps) { return benchmarkWith(cacheableCglib, reps); } public int timeCacheableWithJdkProxy(int reps) { return benchmarkWith(cacheableJdkProxy, reps); } public int timeCacheableWithAspectJWeaving(int reps) { return benchmarkWith(cacheableAspectJ, reps); } public int timeAspectJCustom(int reps) { return benchmarkWith(aspectJCustom, reps); } }

I hope you are still following our experiment. We are now going to execute Calculator.identity() millions of times and see which caching configuration performs best. Since we only call identity() with 16 different arguments, we hardly ever touch the method itself, as we always get a cache hit. Curious to see the results?

benchmark                        ns  linear runtime
NoCaching                      1.77  =
ManualCaching                 23.84  =
CacheableWithCglib          1576.42  ==============================
CacheableWithJdkProxy       1551.03  =============================
CacheableWithAspectJWeaving 1514.83  ============================
AspectJCustom                 22.98  =

Interpretation

Let’s go step by step. First of all, calling a method in Java is pretty darn fast! 1.77 nanoseconds – we are talking here about 3 CPU cycles on my Intel(R) Core(TM)2 Duo CPU T7300 @ 2.00GHz! If this doesn’t convince you that Java is fast, I don’t know what will. But back to our test. The hand-made caching decorator is also pretty fast. Of course it’s slower by an order of magnitude compared to a pure function call, but still blazingly fast compared to all the @Cacheable benchmarks. There we see a drop by 3 orders of magnitude, from 1.8 ns to 1.5 μs. I’m especially disappointed by @Cacheable backed by AspectJ. After all, the caching aspect is precompiled directly into my Java .class file; I would expect it to be much faster compared to dynamic proxies and CGLIB. But that doesn’t seem to be the case. All three Spring AOP techniques are similar. The greatest surprise is my custom AspectJ aspect. It’s even faster than CachingCalculatorDecorator! Maybe it’s due to the polymorphic call in the decorator? I strongly encourage you to clone this benchmark on GitHub and run it (mvn clean test, takes around 2 minutes) to compare your results.

Conclusions

You might be wondering why the Spring abstraction layer is so slow?
Well, first of all, check out the core implementation in CacheAspectSupport – it’s actually quite complex. Secondly, is it really that slow? Do the math – you typically use Spring in business applications where database, network and external APIs are the bottleneck. What latencies do you typically see? Milliseconds? Tens or hundreds of milliseconds? Now add an overhead of 2 μs (worst case scenario). For caching database queries or REST calls this is completely negligible. It doesn’t matter which technique you choose. But if you are caching very low-level methods close to the metal, like CPU-intensive, in-memory computations, Spring abstraction layer might be an overkill. The bottom line: measure! PS: both benchmark and contents of this article in Markdown format are freely available.   Reference: @Cacheable overhead in Spring from our JCG partner Tomasz Nurkiewicz at the Java and neighbourhood blog. ...