Would you dare to change HashMap implementation?

There are bold engineers working for Oracle nowadays. I came to this conclusion while trying to nail down a Heisenbug yesterday. Not too surprisingly, the bug seemed to disappear whenever I tried to pin it down. Several hours later the “Heisen” part of the bug was gone: the problem was traced down to minor differences between JDK 7 updates.

But back to the bravery claim. To make the case easy to understand, I extracted it into a really simple test snippet for you to try out:

class OOM {
	public static void main(String[] args) {
		java.util.Map<Object, Object> m = new java.util.HashMap<>(10_000_000);
	}
}

Now when I launch the class on my 64-bit Mac OS X with JDK 7u40 or later:

my:tmp user$ /path-to/jdk1.7.0_40/bin/java -Xmx96m OOM
my:tmp user$

You see the command prompt returning and the JVM successfully completing its job. Now, launch the same class with JDK 7u25 or earlier:

my:tmp user$ /path-to/jdk1.7.0_25/bin/java -Xmx96m OOM
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.HashMap.<init>(HashMap.java:283)
    at java.util.HashMap.<init>(HashMap.java:297)
    at OOM.main(OOM.java:3)

And you see a different result. Initializing a HashMap with an initial capacity of 10 million fails to allocate enough memory in our 96 MB heap, and the JVM exits with an OutOfMemoryError.
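To see roughly why the eager allocation cannot succeed, here is a back-of-the-envelope sketch. It assumes that the older constructor rounds the requested capacity up to the next power of two before allocating the bucket array, and that each slot costs 4 bytes (compressed references) or 8 bytes (uncompressed). The class and method names are made up for illustration, and the array header and the way the heap is split into generations are ignored.

```java
public class TableSizeEstimate {

    // Round the requested capacity up to the next power of two
    // (assumption: this mirrors what the eager constructor does).
    static int roundUpToPowerOfTwo(int initialCapacity) {
        int capacity = 1;
        while (capacity < initialCapacity) {
            capacity <<= 1;
        }
        return capacity;
    }

    // Approximate size of the reference array alone, header not included.
    static long arrayBytes(int slots, int bytesPerReference) {
        return (long) slots * bytesPerReference;
    }

    public static void main(String[] args) {
        int slots = roundUpToPowerOfTwo(10_000_000);
        long mib = 1024L * 1024L;
        System.out.println("slots       = " + slots);                           // 16777216
        System.out.println("4-byte refs = " + arrayBytes(slots, 4) / mib + " MiB"); // 64 MiB
        System.out.println("8-byte refs = " + arrayBytes(slots, 8) / mib + " MiB"); // 128 MiB
    }
}
```

Under these assumptions the empty map alone asks for one contiguous array of 64 MiB or more, which is a tall order for a heap capped at 96 MB and is consistent with the error above.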

The source of HashMap is clearly the #1 suspect in this case. And indeed, when you compare the source code of JDK 7u25 to the next release (named u40, kudos for naming!), you see a significant difference. The HashMap(initialCapacity, loadFactor) constructor no longer allocates the underlying array with the requested initialCapacity right away. Instead, the underlying array is allocated lazily, only when put() is first called on the map.
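The idea of deferring the allocation can be illustrated with a toy class. This is only a minimal sketch of the pattern, not the JDK source: LazyTableSketch, its fields and its methods are hypothetical names, and collisions simply overwrite the slot to keep the example short.

```java
public class LazyTableSketch {

    // Shared placeholder used until the first put() (hypothetical name).
    private static final Object[] EMPTY = new Object[0];

    private final int requestedCapacity;
    private Object[] table = EMPTY;

    public LazyTableSketch(int initialCapacity) {
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
        }
        // Only remember the request; nothing big is allocated here.
        this.requestedCapacity = initialCapacity;
    }

    public void put(Object key) {
        if (table == EMPTY) {
            // First write: now pay for the array, rounded up to a power of two.
            table = new Object[roundUpToPowerOfTwo(requestedCapacity)];
        }
        int hash = (key == null) ? 0 : key.hashCode();
        // Power-of-two length lets us mask instead of using modulo.
        table[hash & (table.length - 1)] = key;
    }

    public int allocatedSlots() {
        return table.length;
    }

    private static int roundUpToPowerOfTwo(int n) {
        int capacity = 1;
        while (capacity < n) {
            capacity <<= 1;
        }
        return capacity;
    }

    public static void main(String[] args) {
        LazyTableSketch t = new LazyTableSketch(1000);
        System.out.println(t.allocatedSlots()); // 0, nothing allocated yet
        t.put("first");
        System.out.println(t.allocatedSlots()); // 1024, allocated on first put
    }
}
```

With this pattern a constructor call like the one in the OOM class costs almost nothing, and the memory bill only arrives if the map is actually used.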

A seemingly very reasonable change: the JVM is lazy by nature in several respects, so why not postpone the allocation of large data structures until the need for them becomes imminent? In that sense, a good call.

For one particular application, which was performing tricks via reflection and directly accessing the internal structures of the Map implementations, maybe not so good. But again, one should not bypass the API and start being clever, so maybe that developer is now a bit more convinced that not every newly discovered technique is applicable everywhere.

Would you have made the change yourself if you were the API developer? I am not convinced I would have had the guts, knowing that there have to be around a bazillion apps out there depending on all kinds of weird aspects of the implementation. But I do vote for reasonable changes within the JDK, and I definitely count this one among the good ones.
 

Reference: Would you dare to change HashMap implementation? from our JCG partner Nikita Salnikov Tarnovski at the Plumbr blog.

One Response to "Would you dare to change HashMap implementation?"

  1. Udo says:

    Reflection has its place, but I do not consider it proper for getting around API limitations. If you do not have control over the source of the class, use of Reflection is essentially a crap-shoot.

    IMNSHO, any improvement to the core classes (performance and/or resource usage) is a welcome change, and I’d have only myself to blame if unorthodox access to class internals gets me into trouble.
