Event streaming with MongoDB

MongoDB is a capable “NoSQL” database with a very wide range of applications. In one project that we are developing at SoftwareMill, we used it as replicated event storage, from which we stream the events to other components.

Introduction

The basic idea is pretty simple (see also Martin Fowler’s article on Event Sourcing). Our system generates a series of events. These events are persisted in the event storage. Other components in the system follow the stream of events and do “something” with them; for example, the events can be aggregated and written into a reporting database (this, on the other hand, resembles CQRS). Such an approach has many advantages:

- reading and writing of the events is decoupled (asynchronous)
- any following component may die and then “catch up”, given that it wasn’t dead for too long
- there may be multiple followers, and they may read the data from slave replicas for better scalability
- bursts of event activity have a reduced impact on event sinks; at worst, the reports will get generated more slowly

The key component here is of course a fast and reliable event storage. The three key features of MongoDB that we used to implement one are:

- capped collections and tailable cursors
- fast collection appends
- replica sets

Collection

As the base, we are using a capped collection, which by definition is size-constrained. If writing a new event would cause the collection to exceed the size limit, the oldest events are overwritten. This gives us something similar to a circular buffer for events. (Plus we are also quite safe from out-of-disk-space errors.)

Until version 2.2, capped collections didn’t have an _id field by default (and hence no index). However, as we wanted the events to be written reliably across the replica set, both the _id field and an index on it are mandatory.

Writing events

Writing events is a simple Mongo insert operation; inserts can also be done in batches.
Depending on how tolerant we are of event loss, we may use various Mongo write concerns (e.g. waiting for a write confirmation from a single node or from multiple nodes).

All of the events are immutable. Apart from nicer, thread-safe Java code, this is a necessity for event streaming; if the events were mutable, how would the event sink know what was updated? Also, this has good Mongo performance implications. As the data is never changed, the documents that are written to disk never shrink or expand, so there is no need to move blocks on disk. In fact, in a capped collection, Mongo doesn’t allow a document to grow once it has been written.

Reading events

Reading the event stream is a little bit more complex. First of all, there may be multiple readers, each with a different level of advancement in the stream. Secondly, if there are no events in the stream, we would like the reader to wait until some events are available, and avoid active polling. Finally, we would like to process the events in batches, to improve performance.

Tailable cursors solve these problems. To create such a cursor we have to provide a starting point: the id of an event from which we’ll start reading. If an id is not provided, the cursor will return events from the oldest one available. Thus each reader must store the id of the last event that it has read and processed. More importantly, tailable cursors can optionally block for some amount of time if no new data is available, solving the active polling problem.

(By the way, the oplog collection that Mongo uses to replicate data across a replica set is also a capped collection. Slave Mongo instances tail this collection, streaming the “events”, which are database operations, and applying them locally in order.)

Reading events in Java

When using the Mongo Java Driver, there are a few “catches”. First of all you need to initialise the cursor.
To do that, we need to provide (1) the last event id, if present; (2) the order in which we want to read the events (here: natural, that is the insertion order); and (3) two crucial cursor options: that we want the cursor to be tailable, and that we want it to block if there’s no new data:

    // All driver types used here live in the com.mongodb package.
    DBObject query = lastReceivedEventId.isPresent()
        ? BasicDBObjectBuilder.start("_id",
              BasicDBObjectBuilder.start("$gte", lastReceivedEventId.get()).get()).get()
        : null;

    DBObject sortBy = BasicDBObjectBuilder.start("$natural", 1).get();

    DBCollection collection = ... // must be a capped collection

    DBCursor cursor = collection
        .find(query)
        .sort(sortBy)
        .addOption(Bytes.QUERYOPTION_TAILABLE)
        .addOption(Bytes.QUERYOPTION_AWAITDATA);

You may wonder why we used >= last_id instead of >. That is needed here because of the way Mongo ObjectIds are generated. With a simple > last_id we may miss some events that have been generated in the same second as the last_id event, but after it. This also means that our Java code must take care of this fact and discard the first event received, which is the last_id event we have already processed.

The cursor’s class implements the basic Java Iterator interface, so it’s fairly easy to use. So now we can take care of batching. When iterating over a cursor, the driver receives the data from the Mongo server in batches; so we may simply call hasNext() and next(), as with any other iterator, to receive subsequent elements, and only some calls will actually cause network communication with the server.

In the Mongo Java driver the call that is actually potentially blocking is hasNext(). If we want to process the events in batches, we need to (a) read the elements as long as they are available, and (b) have some way of knowing, before getting blocked, that there are no more events and that we can process the events already batched. And as hasNext() can block, we can’t do this directly. That’s why we introduced an intermediate queue (LinkedBlockingQueue).
In a separate thread, events read from the cursor are put on the queue as they come. If there are no events, the thread will block on cursor.hasNext(). The blocking queue has an optional size limit, so if it’s full, putting an element will block as well until space is available.

In the event-consumer thread, we first try to read a single element from the queue in a blocking fashion (using .poll with a timeout, so here we wait until an event is available). Then we try to drain the rest of the queue into a temporary collection (using .drainTo; this builds the batch and may add 0 elements, but we always have the first one).

An important thing to mention is that if the collection is empty, Mongo won’t block, so we have to fall back to active polling. We also have to take care of the fact that the cursor may die during this wait; to check this we should verify that cursor.getCursorId() != 0, where 0 is the id of a “dead cursor”. In such a case we simply need to re-instantiate the cursor.

Summing up

To sum up, we got a very fast event sourcing/streaming solution. It is “self regulating”, in the sense that if there’s a peak of event activity, the events will be read by the event sinks with a delay, in large batches. If the event activity is low, they will be processed quickly in small batches.

We’re also using the same Mongo instance for other purposes; having a single DB system to cluster and maintain both for regular data and events is certainly great from an ops point of view.

Reference: Event streaming with MongoDB from our JCG partner Adam Warski at the Blog of Adam Warski.
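
The queue-based batching described in the MongoDB article above can be sketched with plain java.util.concurrent classes. This is only an illustration under stated assumptions, not the author's code: the class name EventBatcher, its method names and the batch-size cap are hypothetical, and the thread that reads the tailable cursor is left out (it would call publish for every event it receives).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: sits between a cursor-reading thread and an event consumer.
final class EventBatcher<E> {
    private final BlockingQueue<E> queue;
    private final int maxBatchSize;

    EventBatcher(int queueCapacity, int maxBatchSize) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be at least 1");
        }
        this.queue = new LinkedBlockingQueue<E>(queueCapacity);
        this.maxBatchSize = maxBatchSize;
    }

    // Called by the cursor-reading thread; blocks while the queue is full.
    void publish(E event) throws InterruptedException {
        queue.put(event);
    }

    // Waits up to the timeout for a first event, then drains whatever else
    // is already queued without blocking. Returns an empty list on timeout.
    List<E> nextBatch(long timeout, TimeUnit unit) throws InterruptedException {
        List<E> batch = new ArrayList<E>();
        E first = queue.poll(timeout, unit);
        if (first == null) {
            return batch;
        }
        batch.add(first);
        queue.drainTo(batch, maxBatchSize - 1);
        return batch;
    }
}
```

With this shape, a burst of events yields large batches and a trickle yields batches of one, which matches the “self regulating” behaviour the article describes.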

Not All Optimization Is Premature

The other day the reddit community discarded my advice for switching from text-based to binary serialization formats. It was labeled “premature optimization”. I’ll zoom out of the particular case, and discuss why not all optimization is premature.

Everyone has heard of Donald Knuth’s phrase “[..] premature optimization is the root of all evil”. And as with every well-known phrase, this one is usually misinterpreted, to such an extent that people think optimizing something which is not a bottleneck is bad. That being the case, many systems are unnecessarily heavy and consume a lot of resources…because there is no bottleneck.

What did Knuth mean? That it is wrong to optimize if that is done at the cost of other important variables: readability, maintainability, time. Optimizing an algorithm can make it harder to read. Optimizing a big system can make it harder to maintain. Optimizing anything can take time that should probably be spent implementing functionality or fixing bugs. In practice, this means that you should not add sneaky if-clauses and memory workarounds in your code, that you shouldn’t introduce new tools or layers in your system for the sake of some potential gains in processing speed, and you shouldn’t spend a week on gaining 5% in performance.

However, most interpretations say “you shouldn’t optimize for performance until it hits you”. And that’s where my opinion differs. If you wait for something to “hit” you, then you are potentially in trouble. You must make your system optimal before it goes into production, otherwise it may be too late (meaning a lot of downtime, lost customers, huge bills for hardware/hosting). Furthermore, “bottlenecks” are not that obvious with big systems. If you have 20 servers, will you notice that one piece of code takes up 70% more resources than it should? What if there are 10 such pieces? There is no obvious bottleneck, but optimizing the code may save you 2-3 servers.
That’s why writing optimal code is not optional and is certainly not “premature optimization”. But let me give a few examples:

- You notice that in some algorithms that are supposed to be invoked thousands of times, a linked list is used where random access is required. Is it premature optimization to change it to an array/array list? No – it takes very little time, and does not make the code harder to read. Yet, it may increase the speed of the application a lot (how much is ‘a lot’ doesn’t even matter in that case).
- You realize that a piece of code (including db access) is executed many times, but the data doesn’t change. This rarely accounts for a big percentage of the time needed to process a request. Is it premature optimization to cache the results (provided you have a caching framework that can handle cache invalidation, cache lifetime, etc.)? No – caching the things would save tons of database requests, without making your code harder to read (with declarative caching it will be just an annotation).
- You measure that if you switch from a text to a binary format for transmitting messages within internal components you can do it 50%+ faster with half the memory. The system does not have huge memory needs, and the CPU is steady below 50%. Is replacing the text format with a binary one a premature optimization? No, because it costs 1 day, you don’t lose code readability (the change is only one line of configuration) and you don’t lose the means to debug your messages (you can dump them before/after serialization, or you can switch to the text-based format in development mode). (Yeah, that’s my case from the previous blogpost.)

So, with these kinds of things, you saved a lot of processing power and memory even though you didn’t have any problems with that. But you didn’t have the problems either because you had enough hardware to mask them or you didn’t have enough traffic/utilization to actually see them. And performance tests/profiling didn’t show a clear bottleneck.
Then you optimize “in advance”, but not prematurely.

An important note here is that I mean mainly web applications. For desktop applications the deficiencies do not multiply. If you have a piece of desktop code that makes the system consume 20% more memory (ask Adobe), then whatever – people have a lot of memory nowadays. But if your web application consumes 20% more memory for each user on the system, and you have 1 million users, then the absolute value is huge (although it’s still “just” 20%).

The question is – is there a fine line between premature and proper optimization? Anything that makes the code “ugly” and does not solve a big problem is premature. Anything that takes two weeks to improve performance 5% is premature. Anything that is explained with “but what if some day trillions of aliens use our software” is premature. But anything that improves performance without affecting readability is a must. And anything that improves performance by just a better configuration is a must. And anything that makes the system consume 30% less resources and takes a day to implement is a must. To summarize – if neither readability nor maintainability is damaged and the time taken is negligible – go for it.

If every optimization is labeled as “premature”, a system may fail without any visible performance bottleneck. So assess each optimization rather than automatically concluding it’s premature.

Reference: Not All Optimization Is Premature from our JCG partner Bozhidar Bozhanov at Bozho’s tech blog.
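
The first example in the article above (a linked list used where random access is required) can be illustrated with a short Java sketch. The class and method names here are hypothetical; the only library facts relied on are that ArrayList implements the java.util.RandomAccess marker interface and LinkedList does not.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

// Hypothetical helper illustrating the cheap "linked list to array list" fix.
final class IndexedReads {

    // Returns the list itself if it already supports fast get(i),
    // otherwise a one-off ArrayList copy of it.
    static <T> List<T> forIndexedReads(List<T> source) {
        if (source instanceof RandomAccess) {
            return source;
        }
        return new ArrayList<T>(source);
    }

    // Index-based loop: each get(i) is constant time on an ArrayList,
    // but has to walk the nodes on a LinkedList.
    static long sumByIndex(List<Integer> values) {
        long sum = 0;
        for (int i = 0; i < values.size(); i++) {
            sum += values.get(i);
        }
        return sum;
    }
}
```

The calling code stays just as readable, which is the point the article makes about this kind of change.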

How many Java developers are there in the world?

Oracle says it’s 9,000,000. Wikipedia claims it’s 10,000,000. And the guys from NumberOf.net seem to be the most precise – they know that there are exactly 9,007,346 Java developers out there.

Nice numbers. I have used those articles as reference points while speaking about the potential market size for our memory leak detection tool. But something in these numbers has bothered me for years – there is no trustworthy and public analysis behind them. They are just conjured up from thin air.

So I finally thought I would do something about it and try to figure it out for good. It turned out to be a challenging task. After all – with more than seven billion people on our planet I couldn’t call everyone and ask them. Well, maybe I could, but if every call took on average 20 seconds I would need at least 4,439 years to complete the survey, and that is without sleeping, eating or resting. So I had to use other ways for estimation.

After playing around with different sources of information, I decided to dig into four of them for a closer look:

- Labour statistics provided by different governments
- Language popularity sites such as Tiobe and Langpop
- Employment portals, using Indeed.com and Monster.com
- Download numbers on popular Java tools and libraries – namely Eclipse and Tomcat

Using that information I wanted to estimate the number using three different calculations – based on language popularity indexes, labour statistics and download figures. So, here we go.

How many programmers could there be in total?

World population is currently above seven billion. Out of those seven billion we can leave out sub-Saharan Africa (900M) and rural Asia (about 50% of its 2.2B population) as negligible. This leaves us with approximately 5 billion people living in regions where the overall economic and cultural background can be considered suitable for software industries to spawn. Now, out of those 5,000,000,000 how many could actually be developing software?
A good answer at StackExchange gives us some pointers as to where we can find information on the percentage of software developers in different countries. Using the US, Japan, Canada, the EU27 and the UK as a baseline we can estimate that 0.86% of the population is employed as a software developer or programmer:

    Country   Population     Developers   %
    Canada     33,476,688      387,000    1.16%
    EU27      502,486,499    5,900,000    1.17%
    Japan     127,799,000    1,016,929    0.80%
    UK         63,162,000      333,000    0.53%
    US        313,931,000    1,336,300    0.43%
    Weighted average:                     0.86%

0.86% out of five billion is 43,000,000. Let’s remember this number, as it will be used as a baseline in the following calculations.

Popularity contests

In the popularity contest we will use two channels for the source of data – the TIOBE index and the Langpop one. Other sources such as Dataist figures were hard to interpret, so we’ll stick just to those two.

For background – the TIOBE ratings are calculated by counting hits of the most popular search engines. The search query that is used is +"<language> programming", e.g. +"Java programming" in our case.

Langpop uses more sources for input besides search engine queries – in equal weights it traces open job positions, book titles, search engine results, the number of open source projects and other data to calculate its popularity score.

Simplifying TIOBE and Langpop results, we can conclude that according to TIOBE 17% and according to Langpop ~15% of the programmers in the world are using Java. Averaging those numbers we can say that around 16% out of the 43,000,000 developers in the world use Java. This translates to 6,880,000 Java developers out there.

Job portals

Job portals, especially when considering both available positions and uploaded resumes, are definitely a good source of information. The larger ones also provide nice reports on the labour market, which we will dig into next.
Note that we used Indeed.com and Monster.com – if you can point us towards more and/or better sources of information, we would be glad to correct our calculations. But using this analysis from Monster.com and the aggregated statistics from Indeed.com we can say that ~18% of Monster.com applicants can program in Java and ~16% of open engineering / programming positions scanned by Indeed.com are looking for Java talent. Averaging those numbers we arrive at 17%, which out of 43,000,000 programmers in total would translate to 7,310,000 Java guys and girls in the world.

Software downloads

Every Java developer uses something to build the application. Well, we expect them to use at least a JVM and a compiler. If you happen to know anyone who can get away without those two, please let us know. We would hire him immediately.

But most of us tend to use more than just a compiler and a virtual machine. We use IDEs, application servers, build tools, etc. So we figured that we would look into the publicly available download numbers of these tools and try to estimate the number of developers from them.

When calculating the total number of developers from the estimated number of users, we take into account the market share of the corresponding software. To estimate the market share we use Zeroturnaround’s statistics gathered in the spring of 2012.

Eclipse downloads. Eclipse Juno was released on June 27 and was downloaded 1,200,000 times during the first 20 days. Looking into the historical data published by eclipse.org we can predict that Juno will be downloaded approximately 8,000,000 times in total. The last four major Eclipse releases have all followed a yearly release calendar, and all the releases took place in June:

- Juno – 8,000,000 downloads (in a year, expecting the trend to continue; currently 1,200,000 downloads in the first 20 days)
- Indigo – 6,000,000 downloads
- Helios – 4,100,000 downloads
- Galileo – 2,200,000 downloads

Averaging the Juno estimate and the Indigo result, we can say that Eclipse is downloaded approximately 7,000,000 times a year. Using Zeroturnaround’s statistics, we expect 68% of Java developers to use Eclipse as a (primary) IDE. If we now make a bold claim that each Java developer on Eclipse will download the IDE exactly once a year, expect the number of downloads per year to be 7,000,000 and consider that 32% of Java developers do not use Eclipse at all, we come to the conclusion that there should be 10,300,000 Java developers in total (7,000,000 / 0.68).

Apache Tomcat downloads. Vadim Gritsenko has put together some nice statistics on top of Apache logs. From there we can see that during the last year Tomcat has been downloaded approximately 550,000 times/month. This gives us a yearly total of 6,600,000 Tomcat downloads.

Applying now statistics from the same report used for calculating Eclipse’s market share, we can estimate that 59% of Java developers are using Tomcat as one of their development platforms. If we now again make a bold claim that each Java developer on Tomcat will download every major release exactly once and consider that 41% of Java developers do not use Tomcat, we reach the conclusion that there should be 11,186,000 Java developers out there (6,600,000 / 0.59).

Averaging the numbers from Eclipse and Tomcat downloads, we end up with 10,743,000 Java developers.

Conclusions

We used three different sources for estimation – popularity contests, job market analysis and download numbers of popular Java development infrastructure products. The numbers varied quite a bit – from 6,880,000 to 10,743,000. Aggressively averaging the three numbers we can conclude that there are 8,311,000 Java developers out there. Not quite as many as Oracle or Wikipedia think, but still enough to build a business that provides development tools for the Java community.

Lies. Damn lies. And statistics.
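
The arithmetic behind these estimates is easy to re-check. The sketch below only restates the figures quoted in the article; the class and method names are hypothetical, and integer arithmetic is used so the results are exact. Note that the download-based values come out unrounded here (10,294,117 and 11,186,440), while the article rounds them to 10,300,000 and 11,186,000 before averaging.

```java
// Hypothetical helper that re-derives the article's numbers.
final class JavaDeveloperEstimate {

    // Share of a population, with the share given in hundredths of a percent
    // (86 means 0.86%, 1600 means 16%).
    static long share(long population, long hundredthsOfPercent) {
        return population * hundredthsOfPercent / 10_000;
    }

    // Developers implied by yearly downloads when only marketSharePercent
    // of all developers use the tool (one download per user per year).
    static long fromDownloads(long downloadsPerYear, long marketSharePercent) {
        return downloadsPerYear * 100 / marketSharePercent;
    }

    // Years needed to phone everyone, non-stop, at secondsPerCall each.
    static long yearsOfCalls(long people, long secondsPerCall) {
        long secondsPerYear = 365L * 24 * 60 * 60;
        return people * secondsPerCall / secondsPerYear;
    }

    public static void main(String[] args) {
        long programmers = share(5_000_000_000L, 86);      // 43,000,000
        long byPopularity = share(programmers, 1_600);     // 6,880,000
        long byJobPortals = share(programmers, 1_700);     // 7,310,000
        long byEclipse = fromDownloads(7_000_000L, 68);    // 10,294,117
        long byTomcat = fromDownloads(6_600_000L, 59);     // 11,186,440

        // The article's rounded download figures, averaged as in the text.
        long byDownloads = (10_300_000L + 11_186_000L) / 2;            // 10,743,000
        long overall = (byPopularity + byJobPortals + byDownloads) / 3; // 8,311,000

        System.out.println(programmers + " " + byEclipse + " " + byTomcat + " " + overall);
        System.out.println(yearsOfCalls(7_000_000_000L, 20) + " years of phone calls");
    }
}
```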
Reference: How many Java developers are there in the world? from our JCG partner Nikita Salnikov Tarnovski at the Plumbr Blog.

Interview Prep For Geeks

Failing an interview due to a lack of qualifications is forgivable, but it is tragic when highly qualified candidates do not get an offer due to being unprepared. With the amount of information freely available today, the time and effort required to prepare for an interview is extremely low and a relatively small investment to make in your career. Typically a candidate will have at least two or three days advance notice to do some research and prepare for any interview. Here is a checklist of things for technologists to investigate to be sure you are ready for what will come your way.

Company intel – Learn as much as you can about the company, and try to have at least one minute of material memorized to answer the “What do you know about us?” question. Be prepared to present on the company history, the products or services the company provides, details on the business model, and the industry itself (competitors, health of the market, etc.). For technologists, the ability to give an eloquent response to the “Describe what the company does” question is a huge asset that should not be overlooked and could be a significant factor in your success. Gathering substantial information on a young company’s funding status or finances might be difficult, but there will generally be at least some info in press releases from venture partners.

Tech environment – Get specific details about the technical environment by doing some basic web research, reviewing any available job descriptions or LinkedIn employee profiles, and talking to your recruiter or any appropriate company contacts you may have. What frameworks, languages, databases, operating systems, and hardware are they using? Even if the details aren’t all entirely relevant to your interview, it will show that you are taking this process seriously.
Look up any buzzwords or acronyms you don’t recognize so you can at least discover if you may have experience with a related item (“I haven’t worked with ______, but I’m familiar with ________ which appears to be a similar tool/language”).

Tech moves – Knowing the company’s current tech details is valuable, but knowing about some of the company’s technical history will show great initiative while also providing potential insight into how the company views technology and makes tech decisions. Has the company made significant changes to their stack, and if so, why? Are they heavily invested in open source? Do they seem closely linked to a specific vendor? Does the company have an engineering blog or a company GitHub account for you to explore that might contain this information?

Interviewer intel – Insight into the technical background and past employers of the individual(s) you will meet is a great advantage, as you may have some similar history. Personal GitHub or Twitter accounts? Technical blog posts? A LinkedIn or web search of the interviewer(s) might turn up some helpful details to use during the interview, as long as you use the info wisely. Showing that you did some research displays initiative, as long as you respect personal space.

Confirm the basics – Where are you going and who should you ask for when you get there? Who are you meeting with and what is his/her/their role in the company? What is the preferred dress code? (NOTE: Some companies actually ask that candidates dress more casually, so be sure to ask.)

Prepare questions and anecdotes – Most interviews will provide you with at least a brief opportunity to ask questions. Although you ideally want to have these memorized, it is generally a good idea to have some questions listed so you don’t forget them under possible duress. There are also some fairly standard questions in the “tell me about a time when…” family which are commonly answered with anecdotes.
Give some thought to past challenges, failures, and successes, and especially what lessons you learned from each project.

Documents – Some companies may ask you to fill out an application and other relevant documents before the interview. Find out if this is the case and if so get those completed before interview day. Make sure to print out at least three copies of your resume and one copy of your list of questions. Think about who you will list as references if asked on the application, and have their info (name, email) available.

Keep in mind that making a solid impression in an interview is something that can make a huge impact down the road, whether or not you get the job. Interviewers remember candidates who impressed, and they absolutely will remember those who crashed and burned as well. Do your homework and take interviews seriously, not just for the sake of getting this job but for opportunities later in your career.

Reference: Interview Prep For Geeks from our JCG partner Dave Fecak at the Job Tips For Geeks blog.

Why Scrum Won

In the 1990s and early 2000s a number of different lightweight ‘agile’ development methods sprang up. Today a few shops use Extreme Programming, including most notably ThoughtWorks and Industrial Logic. But if you ask around, especially in enterprise shops, almost everybody who is “doing Agile” today is following Scrum or something based on Scrum.

What happened? Why did Scrum win out over XP, FDD, DSDM, Crystal, Adaptive Software Development, Lean, and all of the other approaches that have come and gone? Why are most organizations following Scrum or planning to adopt Scrum and not the Agile Unified Process or Crystal Clear (or Crystal Yellow, or Orange, Red, Violet, Magenta or Blue, Diamond or Sapphire for that matter)? Is Scrum that much better than every other idea that came out of the Agile development movement?

Simplicity wins out

Scrum’s principal strength is that it is simpler to understand and follow than most other methods – almost naively simple. There isn’t much to it: short incremental sprints, daily standup meetings, a couple of other regular and short planning and review meetings around the start and end of each sprint, some work to prioritize (or order) the backlog and keep it up-to-date, simple progress reporting, and a flat, simple team structure. You can explain Scrum in detail in a few pages and understand it in less than an hour.

This means that Scrum is easy for managers to get their heads around and easy to implement, at a team level at least (how to successfully scale to enterprise-level Scrum in large integrated programs with distributed teams using Scrum of Scrums or Communities of Practice or however you are supposed to do it, is still fuzzy as hell).

Scrum is easy for developers to understand too and easy for them to follow. Unlike XP or some of the other more well-defined Agile methods, Scrum is not prescriptive and doesn’t demand a lot of technical discipline. It lets the team decide what they should do and how they should do it.
They can get up to speed and start “doing Agile” quickly and cheaply.

But simplicity isn’t the whole answer

But there’s more to Scrum’s success than simplicity. The real trick that put Scrum out front is certification. There’s no such thing as a Certified Extreme Programmer but there are thousands of certified ScrumMasters and certified product owners and advanced certified developers and even more advanced certified professionals and the certified trainers and coaches and officially registered training providers that certified them.

And now the PMI has got involved with its PMI-ACP Certified Agile Practitioner designation which basically ensures that people understand Scrum, with a bit of XP, Lean and Kanban thrown in to be comprehensive.

Whether Scrum certification is good or bad or useful at all is beside the point. Certification helped Scrum succeed for several reasons. First, certification led to early codification and standardization of what Scrum is all about. Consultants still have their own ideas and continue to fight between themselves over the only right way to do Scrum and the future direction of Scrum and what should be in Scrum and what shouldn’t, but the people who are implementing Scrum don’t need to worry about the differences or get caught up in politics and religious wars.

Certification is a win win win…

Managers like standardization and certification – especially busy, risk-averse managers in large mainstream organizations. If they are going to “do Agile”, they want to make sure that they do it right. By paying for standardized certified training and coaching on a standardized method, they can be reassured that they should get the same results as everyone else. Because of standardization and certification, getting started with Scrum is low risk: it’s easy to find registered certified trainers and coaches offering good quality professional training programs and implementation assistance.
Scrum has become a product – everyone knows what it looks like and what to expect.

Certification also makes it easier for managers to hire new people (demand a certification upfront and you know that new hires will understand the fundamentals of Scrum and be able to fit in right away) and it’s easier to move people between teams and projects that are all following the same standardized approach.

Developers like this too, because certification (even the modest CSM) helps to make them more employable, and it doesn’t take a lot of time, money or work to get certified.

But most importantly, certification has created a small army of consultants and trainers who are constantly busy training and coaching a bigger army of certified Scrum practitioners. There is serious competition between these providers, pushing each other to do something to get noticed in the crowd, saturating the Internet with books and articles and videos and webinars and blogs on Scrum and Scrumness, effectively drowning out everything else about Agile development.

And the standardization of Scrum has also helped create an industry of companies selling software tools to help manage Scrum projects, another thing that managers in large organizations like, because these tools help them to get some control over what teams are doing and give them even more confidence that Scrum is real. The tool vendors are happy to sponsor studies and presentations and conferences about Agile (er, Scrum), adding to the noise and momentum behind Scrum.

Scrum certification is a win win win: for managers, developers, authors, consultants and vendors.

It looks like David Anderson may be trying to do a similar thing with Kanban certification. It’s hard to see Kanban taking over the world of software development – while it’s great for managing support and maintenance teams, and helps to control work flow at a micro-level, Kanban doesn’t fit for larger project work. But then neither does Scrum.
And who would have picked Scrum as the winner 10 years ago?

Reference: Why Scrum Won from our JCG partner Jim Bird at the Building Real Software blog.

Type-safe Empty Collections in Java

I have blogged before on the utility of the Java Collections class and have specifically blogged on Using Collections Methods emptyList(), emptyMap(), and emptySet(). In this post, I look at the sometimes subtle but significant differences between using the relevant fields of the Collections class for accessing an empty collection versus using the relevant methods of the Collections class for accessing an empty collection.

The following code demonstrates accessing Collections’s fields directly to specify empty collections.

Using Collections’s Fields for Empty Collections

    /**
     * Instantiate my collections with empty versions using Collections fields.
     * This will result in javac compiler warnings stating 'warning: [unchecked]
     * unchecked conversion'.
     */
    public void instantiateWithEmptyCollectionsFieldsAssigment()
    {
       this.stringsList = Collections.EMPTY_LIST;
       this.stringsSet = Collections.EMPTY_SET;
       this.stringsMap = Collections.EMPTY_MAP;
    }

The code above compiles with javac, but leads to the warning message (generated by NetBeans and Ant in this case):

    -do-compile:
        [javac] Compiling 1 source file to C:\java\examples\typesafeEmptyCollections\build\classes
        [javac] Note: C:\java\examples\typesafeEmptyCollections\src\dustin\examples\Main.java uses unchecked or unsafe operations.
        [javac] Note: Recompile with -Xlint:unchecked for details.
Specifying -Xlint:unchecked as an argument to javac (in this case via the javac.compilerargs=-Xlint:unchecked in the NetBeans project.properties file) helps get more specific warning messages for the earlier listed code: [javac] Compiling 1 source file to C:\java\examples\typesafeEmptyCollections\build\classes [javac] C:\java\examples\typesafeEmptyCollections\src\dustin\examples\Main.java:27: warning: [unchecked] unchecked conversion [javac] this.stringsList = Collections.EMPTY_LIST; [javac] ^ [javac] required: List<String> [javac] found: List [javac] C:\java\examples\typesafeEmptyCollections\src\dustin\examples\Main.java:28: warning: [unchecked] unchecked conversion [javac] this.stringsSet = Collections.EMPTY_SET; [javac] ^ [javac] required: Set<String> [javac] found: Set [javac] C:\java\examples\typesafeEmptyCollections\src\dustin\examples\Main.java:29: warning: [unchecked] unchecked conversion [javac] this.stringsMap = Collections.EMPTY_MAP; [javac] ^ [javac] required: Map<String,String> [javac] found: Map NetBeans will also show these warnings if the appropriate hint box is checked in its options. The next three images demonstrate ensuring that the appropriate hint is set to see these warnings in NetBeans and provides an example of how NetBeans presents the code shown above with warnings.Fortunately, it is easy to take advantage of the utility of the Collections class and access empty collections in a typesafe manner that won’t lead to these javac warnings and corresponding NetBeans hints. That approach is to use Collections‘s methods rather than its fields. This is demonstrated in the next simple code listing. Using Collections’s Methods for Empty Collections /** * Instantiate my collections with empty versions using Collections methods. * This will avoid the javac compiler warnings alluding to 'unchecked conversion'. 
*/ public void instantiateWithEmptyCollectionsMethodsTypeInferred() { this.stringsList = Collections.emptyList(); this.stringsSet = Collections.emptySet(); this.stringsMap = Collections.emptyMap(); } The above code will compile without warning and no NetBeans hints will be shown either. The Javadoc documentation for each field of the Collections class does not address why these warnings occur for the fields, but the documentation for each of the like-named methods does discuss this. Specifically, the documentation for Collections.emptyList(), Collections.emptySet(), and Collections.emptyMap() each state, ‘(Unlike this method, the field does not provide type safety.)’Use of the Collections methods for empty collections shown in the last code listing provided type safety without the need to explicitly specify the types stored within that collection because type was inferred by use of the Collections methods in assignments to known and already declared instance attributes with explicitly specified element types. When type cannot be inferred, compiler errors will result when using the Collections methods without an explicitly specified type. 
This is shown in the next screen snapshot of attempting to do this in NetBeans.The specific compiler error message is: [javac] C:\java\examples\typesafeEmptyCollections\src\dustin\examples\Main.java:62: error: method populateList in class Main cannot be applied to given types; [javac] populateList(Collections.emptyList()); [javac] ^ [javac] required: List<String> [javac] found: List<Object> [javac] reason: actual argument List<Object> cannot be converted to List<String> by method invocation conversion [javac] C:\java\examples\typesafeEmptyCollections\src\dustin\examples\Main.java:63: error: method populateSet in class Main cannot be applied to given types; [javac] populateSet(Collections.emptySet()); [javac] ^ [javac] required: Set<String> [javac] found: Set<Object> [javac] reason: actual argument Set<Object> cannot be converted to Set<String> by method invocation conversion [javac] C:\java\examples\typesafeEmptyCollections\src\dustin\examples\Main.java:64: error: method populateMap in class Main cannot be applied to given types; [javac] populateMap(Collections.emptyMap()); [javac] ^ [javac] required: Map<String,String> [javac] found: Map<Object,Object> [javac] reason: actual argument Map<Object,Object> cannot be converted to Map<String,String> by method invocation conversion [javac] 3 errors These compiler errors are avoided and type safety is achieved by explicitly specifying the types of the collections’ elements in the code. This is shown in the next code listing. Explicitly Specifying Element Types with Collections’s Empty Methods /** * Pass empty collections to another method for processing and specify those * empty methods using Collections methods. This will result in javac compiler * ERRORS unless the type is explicitly specified. 
*/ public void instantiateWithEmptyCollectionsMethodsTypeSpecified() { populateList(Collections.<String>emptyList()); populateSet(Collections.<String>emptySet()); populateMap(Collections.<String, String>emptyMap()); } The Collections class’s methods for obtaining empty collections are preferable to use of Collections‘s similarly named fields for that same purpose because of the type safety the methods provide. This allows greater leveraging of Java’s static type system, a key theme of books such as Effective Java. A nice side effect is the removal of cluttering warnings and marked NetBeans hints, but the more important result is better, safer code.   Reference: Type-safe Empty Collections in Java from our JCG partner Dustin Marx at the Inspired by Actual Events blog. ...
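As a small, self-contained sketch of the explicit type argument technique from the post above: the class and helper names here (EmptyCollectionsDemo, namesOrEmpty, countNames, isImmutable) are hypothetical and are not taken from the original Main.java. Note also that compilers from Java 8 onward can usually infer the type argument when the empty collection is passed directly as a method argument, so the explicit form matters most on older compilers or wherever no target type is available.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class; not part of the original example project.
final class EmptyCollectionsDemo {

    // Stand-in for a method such as populateList(List<String>).
    static int countNames(List<String> names) {
        return names.size();
    }

    // Returns a type-safe empty list instead of null, using an explicit type argument.
    static List<String> namesOrEmpty(List<String> names) {
        return names == null ? Collections.<String>emptyList() : names;
    }

    // The collections returned by the emptyXxx() methods are immutable:
    // attempts to add elements throw UnsupportedOperationException.
    static boolean isImmutable(List<String> names) {
        try {
            names.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Type inferred from the declared type of the variable: no warning.
        List<String> list = Collections.emptyList();
        Set<String> set = Collections.emptySet();
        Map<String, String> map = Collections.emptyMap();

        // Explicit type arguments, as in the post's last listing.
        int count = countNames(Collections.<String>emptyList());

        System.out.println(list.isEmpty() && set.isEmpty() && map.isEmpty());
        System.out.println(count);
        System.out.println(isImmutable(namesOrEmpty(null)));
    }
}
```

Returning an empty collection rather than null, as namesOrEmpty does, is one common place where these methods show up in practice; callers can iterate over the result without a null check.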

Wasting time by saving memory

You might say, "at my company, the hardware is 10x more expensive", but it is also likely that your time costs the company more by about the same factor. In any case, this article attempts to demonstrate that there is a tipping point where it no longer makes sense to spend time saving memory, or even thinking about it.

                         time spent   cheap memory   expensive memory   cheap disk   expensive disk
    a screen refresh     20 ms        27 KB          150 bytes          1 MB         24 KB
    one trivial change   ~1 sec       1.4 MB         7.6 KB             60 MB        1.2 MB
    one command          ~5 sec       7 MB           50 KB              400 MB       6 MB
    a line of code       ~1 min       84 MB          460 KB             3,600 MB     72 MB
    a small change       ~20 min      1600 MB        9 MB               72,000 MB    1.4 GB
    a significant change ~1 day       40 GB          0.2 GB             1,700 GB     35 GB
    a major change       ~2 weeks     390 GB         2 GB               17,000 GB    340 GB

Your mileage may vary, but just today someone asked how to save a few bytes by passing short instead of int as method arguments (Java doesn't save any memory if you do). Even if it did save as much as it might, the time taken to ask the question, let alone implement and test it, could have been worth 10,000,000 times the cost of the memory it could have saved. In short: don't fall into the trap of a mind-boggling imbalance of scale.

Reference: Wasting time by saving memory from our JCG partner Peter Lawrey at the Vanilla Java blog.
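The kind of comparison behind such a table can be sketched as a simple break-even calculation. The rates used below (60 per hour of developer time, 10 per gigabyte of memory) are purely hypothetical assumptions chosen for round numbers; they are not the figures the article's table was built from, and the class and method names are illustrative only.

```java
// Hypothetical sketch: how many bytes of memory cost the same as a given amount of time.
final class MemoryVsTime {

    /**
     * @param seconds         time spent on the optimisation
     * @param costPerHour     assumed cost of one hour of developer time
     * @param costPerGigabyte assumed cost of one gigabyte (10^9 bytes) of memory
     * @return number of bytes whose price equals the cost of the time spent
     */
    static double breakEvenBytes(double seconds, double costPerHour, double costPerGigabyte) {
        double timeCost = seconds / 3600.0 * costPerHour;
        return timeCost / costPerGigabyte * 1e9;
    }

    public static void main(String[] args) {
        // Assumed rates, for illustration only.
        double costPerHour = 60.0;
        double costPerGigabyte = 10.0;

        double[] durationsInSeconds = {1, 60, 20 * 60};
        for (double seconds : durationsInSeconds) {
            double megabytes = breakEvenBytes(seconds, costPerHour, costPerGigabyte) / 1e6;
            System.out.printf("%.0f s of work costs about as much as %.1f MB%n", seconds, megabytes);
        }
    }
}
```

Under these assumed rates, a change is only worth making for memory reasons if it saves more than the break-even amount; plugging in your own rates shifts the numbers but rarely the conclusion.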

Coding for the Changes You’ll Have to Make Next Month

One of the most difficult parts of software development is adapting to change. It's a guarantee that the concepts, ideas, and possibly the point of the program that you are writing will change several times before it's actually done. If you have ever heard buzzwords like Agile, Scrum, Extreme Programming, or anything similar, then you have put some time into writing software that adapts to change. Any programmer serious about the craft has at least heard about it, but few have actually mastered it (and neither have I). Designing a program that easily adapts to change is a very difficult task. However, there are a few good ideas I have found that help create this kind of software at a team level:

Spend more time on design! Seriously, spend a few days or weeks on it. Time put into simply designing the software and not touching the code itself is time well spent. The better your design, the less pain will come your way down the road when the product is getting ready to ship. Think about what type of changes could come in the future. What would need to change in the system to let these new features exist? The smaller the effect on the entire system, the better. Step back from the requirements of the system for a moment and ask yourself why the system is designed in a particular way. What is going to happen when you run out of time (which you will)? Where do you think shortcuts will be made?

After designing a system, spend a few days (yes, days) thinking about possible additions to the system. Don't just rush into coding up the system after you have one working solution. For each of these additions, think about how the system would have to change to accommodate it. It really doesn't matter if these additions are realistic, feasible, or something that no one would really want in the program. If it's difficult to add to the current system, then maybe the system should be redesigned to accommodate it.

The other trick is really just refactoring, but taking it to the extreme. Any and every chance to refactor some code to make it more readable, usable, and simpler should be taken. Don't let confusing, complicated code stick around in the code base just because you have more important things to do, or you didn't code it yourself. If the code doesn't make sense or is difficult to understand, get with the programmer who wrote it, and make it simpler. At the very least, understand what caused the code to become so complex and avoid that path next time. You'll save yourself a ton of time in the long run of the project, and by making the code simpler, it will be much easier to change later on.

'Anytime you find yourself looking at a class's implementation to figure out how to use the class, you're not programming to the interface, you're programming through the interface to the implementation. If you're programming through the interface, encapsulation is broken, and once encapsulation starts to break down, abstraction won't be too far behind.' [Code Complete: A Practical Handbook of Software Construction, Second Edition]

Basically, this quote says that if you have to look at how a class/method/function works in order to figure out how to use it, you're doing it wrong. Proper design hides this information well so that once it's written, it can be taken for granted. Catching yourself reading an implementation this way is an excellent time to refactor and think about how the code really should be structured.

What ideas and concepts do you use to create adaptable software?

Reference: Coding for the Changes You'll Have to Make Next Month from our JCG partner Isaac Taylor at the Programming Mobile blog.
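To make the Code Complete quote above a little more concrete, here is a minimal sketch of programming to an interface. All names (EventLog, InMemoryEventLog, Checkout) are hypothetical and invented for illustration; nothing here comes from the original post.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// The contract callers rely on; it says what happens, not how.
interface EventLog {
    void record(String event);

    int size();
}

// One possible implementation. Callers never need to read this class to use EventLog.
final class InMemoryEventLog implements EventLog {
    private final List<String> entries = new ArrayList<String>();

    @Override
    public void record(String event) {
        entries.add(event);
    }

    @Override
    public int size() {
        return entries.size();
    }

    // Read-only view, so the internal list cannot be modified from outside.
    List<String> entries() {
        return Collections.unmodifiableList(entries);
    }
}

// Depends only on the interface, so the log implementation can change freely.
final class Checkout {
    private final EventLog log;

    Checkout(EventLog log) {
        this.log = log;
    }

    void pay(int cents) {
        log.record("paid " + cents);
    }
}

final class InterfaceDemo {
    public static void main(String[] args) {
        InMemoryEventLog log = new InMemoryEventLog();
        new Checkout(log).pay(250);
        System.out.println(log.entries());
    }
}
```

Because Checkout only sees EventLog, replacing the in-memory list with a file or database implementation later should not require touching Checkout at all, which is the kind of cheap change the post argues for.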

How cool is integration testing with Spring and Hibernate

I am guilty of not writing integration tests (at least for database-related transactions) up until now. So, in order to eradicate the guilt, I read up over the weekend on how one can achieve this with minimal effort, and came up with a small example showing how to do it with ease using Spring and Hibernate. With integration testing, you can test your DAO (data access object) layer without ever having to deploy the application. For me this is a huge plus, since now I can even test my criteria queries, named queries and the like without having to run the application.

There is a property in Hibernate that allows you to specify an SQL script to run when the session factory is initialized. With this, I can now populate tables with the data required by my DAO layer. The property is as follows:

    <prop key="hibernate.hbm2ddl.import_files">import.sql</prop>

According to the Hibernate documentation, you can specify several SQL scripts, separated by commas. One gotcha here is that you cannot create tables using the script, because the schema needs to be created first in order for the script to run. Even if you issue a create table statement within the script, it is ignored when the script is executed, as far as I could see.
Let me first show you the DAO class I am going to test:

    package com.unittest.session.example1.dao;

    import org.springframework.transaction.annotation.Propagation;
    import org.springframework.transaction.annotation.Transactional;

    import com.unittest.session.example1.domain.Employee;

    @Transactional(propagation = Propagation.REQUIRED)
    public interface EmployeeDAO {

        public Long createEmployee(Employee emp);

        public Employee getEmployeeById(Long id);
    }

    package com.unittest.session.example1.dao.hibernate;

    import org.springframework.orm.hibernate3.support.HibernateDaoSupport;

    import com.unittest.session.example1.dao.EmployeeDAO;
    import com.unittest.session.example1.domain.Employee;

    public class EmployeeHibernateDAOImpl extends HibernateDaoSupport implements EmployeeDAO {

        @Override
        public Long createEmployee(Employee emp) {
            getHibernateTemplate().persist(emp);
            return emp.getEmpId();
        }

        @Override
        public Employee getEmployeeById(Long id) {
            return getHibernateTemplate().get(Employee.class, id);
        }
    }

Nothing major, just a simple DAO with two methods, one to persist and one to retrieve. To test the retrieval method I need to populate the Employee table with some data. This is where the import SQL script explained before comes into play. The import.sql file is as follows:

    insert into Employee (empId,emp_name) values (1,'Emp test');

This is just a basic script that inserts one record into the Employee table. Note again that the Employee table should be created through the Hibernate auto-create DDL option in order for the SQL script to run. More info can be found here. Also, the import.sql script in my case is on the classpath. This is required for it to be picked up and executed when the session factory is created.

Next up, let us see how easy it is to run integration tests with Spring.

    package com.unittest.session.example1.dao.hibernate;

    import static org.junit.Assert.*;

    import org.junit.Test;
    import org.junit.runner.RunWith;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.test.context.ContextConfiguration;
    import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
    import org.springframework.test.context.transaction.TransactionConfiguration;

    import com.unittest.session.example1.dao.EmployeeDAO;
    import com.unittest.session.example1.domain.Employee;

    @RunWith(SpringJUnit4ClassRunner.class)
    @ContextConfiguration(locations = "classpath:spring-context.xml")
    @TransactionConfiguration(defaultRollback = true, transactionManager = "transactionManager")
    public class EmployeeHibernateDAOImplTest {

        @Autowired
        private EmployeeDAO employeeDAO;

        @Test
        public void testGetEmployeeById() {
            Employee emp = employeeDAO.getEmployeeById(1L);

            assertNotNull(emp);
        }

        @Test
        public void testCreateEmployee() {
            Employee emp = new Employee();
            emp.setName("Emp123");
            Long key = employeeDAO.createEmployee(emp);

            assertEquals(2L, key.longValue());
        }
    }

A few things to note here. The test must be told to run within a Spring context; we use the SpringJUnit4ClassRunner for this. Also, the transaction attribute is set to defaultRollback=true. Note that with MySQL, for this to work, your tables must use the InnoDB engine, as the MyISAM engine does not support transactions.
And finally, I present the Spring configuration which wires everything up:

    <?xml version="1.0" encoding="UTF-8"?>
    <beans xmlns="http://www.springframework.org/schema/beans"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:aop="http://www.springframework.org/schema/aop"
        xmlns:tx="http://www.springframework.org/schema/tx"
        xmlns:context="http://www.springframework.org/schema/context"
        xsi:schemaLocation="
            http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
            http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx-2.5.xsd
            http://www.springframework.org/schema/aop http://www.springframework.org/schema/aop/spring-aop-2.5.xsd
            http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-2.5.xsd">

        <context:component-scan base-package="com.unittest.session.example1" />
        <context:annotation-config />

        <tx:annotation-driven />

        <bean id="sessionFactory"
            class="org.springframework.orm.hibernate3.annotation.AnnotationSessionFactoryBean">
            <property name="packagesToScan">
                <list>
                    <value>com.unittest.session.example1.**.*</value>
                </list>
            </property>
            <property name="hibernateProperties">
                <props>
                    <prop key="hibernate.dialect">org.hibernate.dialect.MySQLDialect</prop>
                    <prop key="hibernate.connection.driver_class">com.mysql.jdbc.Driver</prop>
                    <prop key="hibernate.connection.url">jdbc:mysql://localhost:3306/hbmex1</prop>
                    <prop key="hibernate.connection.username">root</prop>
                    <prop key="hibernate.connection.password">password</prop>
                    <prop key="hibernate.show_sql">true</prop>
                    <prop key="hibernate.hbm2ddl.auto">create</prop>
                    <prop key="hibernate.hbm2ddl.import_files">import.sql</prop>
                </props>
            </property>
        </bean>

        <bean id="empDAO"
            class="com.unittest.session.example1.dao.hibernate.EmployeeHibernateDAOImpl">
            <property name="sessionFactory" ref="sessionFactory" />
        </bean>

        <bean id="transactionManager"
            class="org.springframework.orm.hibernate3.HibernateTransactionManager">
            <property name="sessionFactory" ref="sessionFactory" />
        </bean>

    </beans>

That is about it. Personally, I would much rather use a more lightweight in-memory database such as hsqldb to run my integration tests. Here is the Eclipse project for anyone who would like to run the program and try it out.

Reference: How cool is integration testing with Spring+Hibernate from our JCG partner Dinuka Arseculeratne at the My Journey Through IT blog.

Working Efficiently with JUnit in Eclipse

Recently I was dragged into a discussion [1] with some test infected [2] fellows about how we use JUnit within the Eclipse IDE. Surprisingly, the conversation brought up some 'tips and tricks' not everybody was aware of. This gave me the idea to write this post summing up our talk. Who knows, maybe there is something new for somebody out there too…

Launch Shortcuts

If you are doing Test Driven Development you have to run your tests quite often. Obviously it gets somewhat tedious using, for example, the context menu of the editor to select Run As -> JUnit Test to launch a test case under development. Fortunately the shortcut Alt+Shift+X,T does the same and Alt+Shift+D,T executes the test in debug mode. But there is more to this than meets the eye.

Consider the following situation: a unit under test does not work as expected anymore. You have recognized this because a certain test of your test suite fails. Having a look at the code might not be conclusive, so you decide to start a debugging session. To do so you set a breakpoint at the current cursor position (Ctrl+Shift+B). In such a case you are probably not interested in re-running the suite or even all the tests of the given test class. You only want to launch the single failing test [3].

Now it is important to know that the 'Run as' shortcuts described above are sensitive to the editor's cursor position. Moving the cursor to a test method name allows you to use those shortcuts to launch a JUnit process that runs this test method only [4] [5].

Carrying the example a little further, it is very likely that you will find a suspicious spot in your unit under test during the debugging session. Considering a solution, you might change some code of that unit. After that you want to see if the test method still fails. Luckily there is another shortcut in Eclipse that allows you to re-run the most recently executed launch configuration. Use F11 to re-run your debug session and Ctrl+F11 to re-run the test method normally.

However, there is a preference setting that has to be set to make this work reliably. After opening the Launching preference page (Window -> Preferences | Run/Debug -> Launching) there is a section called Launch Operation. Ensure that the Always launch the previously launched application radio button is selected.

Method Templates

Every time you are about to create a new test method you might consider using Eclipse editor templates to improve your coding efficiency. Once you have positioned the cursor at the location where the new test method should be located, type test and hit the Ctrl+Space shortcut to pop up the content assist.

As shown in the first part of the picture above, the content assist offers a test method template that will create a complete method stub on selection. Unfortunately this would be a JUnit 3 style method stub. But hitting Ctrl+Space again will reveal a second template that is written in JUnit 4 style. This is shown in the second part of the picture above.

Even so, hitting the shortcut twice still seems to be too cumbersome for many developers. And when writing test cases you often have to create setup and/or teardown methods annotated with @Before/@After as well. Thankfully it is possible to provide your own editor templates in Eclipse. Holger Staudacher has written a good post called Simple JUnit4 templates for Eclipse where he explains how to do this and even provides a set of templates in a gist.

Favorites

JUnit tests rely heavily on the usage of the various assertXXX methods provided by the class junit.framework.Assert. Those methods are all declared as static and can be referred to as Assert.assertTrue(condition), for example. But as far as I know, most people use static imports to shorten the statement to assertTrue(condition) for the sake of readability. By default, however, the IDE's content assist will not suggest the static methods of the Assert class.

One way to get around this is to write the class name and let the content assist propose the available methods. The latter might be accelerated by using camel case matching. After that, the use of Ctrl+Shift+M as described in Rüdiger's post about static import shortens the statement and generates the import.

However, I think the most efficient way is to configure the junit.framework.Assert class as a content assist favorite, which allows the static members to be proposed even if the import is still missing. The configuration takes place in (Window -> Preferences | Java -> Editor -> Content Assist -> Favorites) and looks like this:

JUnit View Configuration

While working test driven, running your tests regularly becomes practically organic [6]. However, running a larger test suite takes some time. In the meantime the JUnit View pops up and continuously updates the list of test results. This can get enervating, as it is distracting at best and obstructs your work at worst.

With test driven development you expect your tests to succeed at a rate of 100%. Because of this, many developers want to be informed about failing tests only, the exception to the rule. The JUnit view supports this with a configuration setting called Activate on Error/Failure Only, available via the viewpart's menu:

Every now and then your test suite will fail and there may be more than one problem at once. By default the JUnit view lists all test results. But as a developer you're generally more interested in the failing ones and may perceive the bulk of green tests as clutter. Here, focusing on your work means focusing on the failing tests. There is a configuration setting called Show Failures Only available to change this behaviour. As people tend to change this setting more frequently, a toggle button in the viewpart's toolbar is provided.

Fast View

If you are working with Eclipse 3.x there is a nice feature called Fast View that allows you to unclutter your UI a bit. In general I prefer this for views that I use regularly but not continuously and/or for views that I consider more lucid if provided with more space. Examples of this might be the Coverage, History or Call Hierarchy views. A viewpart tab provides a context menu that makes it possible to use a view as fast view:

This removes the view from its stack and shows a toggle button in the fast view toolbar at the bottom left corner of your workbench. With this button you can activate/deactivate a particular view as an overlay [7]:

A specific feature of the JUnit fast view button is that it provides status info about the latest test run or progress info about a currently executing one. So this little button is all the UI you need for a good deal of the time you spend with JUnit:

Unfortunately fast views are no longer available in Eclipse 4.x. But there is a workaround that approximates the behavior to a certain degree. You can move the views you want to use as 'fast views' into a designated view stack and minimize this stack. The toolbar representing the minimized view stack now serves as the former fast view bar. This works so-so, as the activation/deactivation sometimes hangs and you have to fiddle a bit to hide the view and get back to the editor, for example.

In essence I think the sections above cover the main points we were talking about in the discussion I mentioned at the beginning of this post. Maybe you also have some info about useful JUnit shortcuts, usage patterns or the like to share. Feel welcome to add a comment.

[1] The discussion happened during one of those spontaneous couple-of-beers-after-work-sessions we like to have once in a while…
[2] It is said that the term 'test infected' was originally coined by Erich Gamma. Together with Kent Beck he also published an article called JUnit Test Infected: Programmers Love Writing Tests that describes how 'your attitude toward development is likely to change', once you drive your programming work consistently with tests.
[3] In particular, if a breakpoint is not located in the test method as in the example but in the unit under test, it can get annoying running all test methods of a test case. This is because the program execution might halt at the breakpoint triggered by one of the test methods that do not have a problem.
[4] Unfortunately the framework is not able to distinguish test methods from non-test methods. Using the shortcut on non-test methods will lead to JUnit runs that show an Unrooted Tests error as result.
[5] Some of the attendees considered it a minor downside that the framework automatically creates and persists launch configurations. Because of this behaviour, running single test methods via shortcuts can generate a lot of clutter in your launch configuration list over time.
[6] There are even tools available that run your tests continuously.
[7] In practice I often use the Ctrl+F7 shortcut to switch to and/or between view parts.

Reference: Working Efficiently with JUnit in Eclipse from our JCG partner Frank Appel at the Code Affine blog.
Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy | Contact
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.