Reducing memory consumption by 20x

This is going to be another story sharing our recent experience with memory-related problems. The case is extracted from a recent customer support case, where we faced a badly behaving application repeadedly dying with OutOfMemoryError messages in production. After running the application with Plumbr attached we were sure we were not facing a memory leak this time. But something was still terribly wrong.

The symptoms were discovered by one of our experimental features monitoring the overhead on certain data structures. It gave us a signal pinpointing towards one particular location in the source code. In order to protect the privacy of the customer we have recreated the case using a synthetic sample, at the same time keeping it technically equivalent to the original problem. Feel free to download the source code.dimagrire 2

We found ourselves staring at a set of objects loaded from an external source. The communication with the external system was implemented via XML interface. Which is not bad per se.  But the fact that the integration implementation details were scattered across the system – the documents received were converted to XMLBean instances and then used across the system – was not maybe the wisest thing.

Essentially we were dealing with a lazily-loaded caching solution. The objects cached were Persons:
 

// Imports and methods removed to improve readability
public class Person {
private String id;
private Date dateOfBirth;
private String forename;
private String surname;
}

Not too memory-consuming one might guess. But things start to look a bit more sour when we open up some more details. Namely the implementation of this data was anything like the simple class declaration above. Instead, the implementation used a model-generated data structure. Model used was similar to the following simplified XSD snippet:

<xs:schema targetNamespace="http://plumbr.eu"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified">
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="id" type="xs:string"/>
<xs:element name="dateOfBirth" type="xs:dateTime"/>
<xs:element name="forename" type="xs:string"/>
<xs:element name="surname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>

Using XMLBeans, the developer had generated the model used behind the scenes. Now lets add the fact that the cache was supposed to hold up to 1.3M instances of Persons and we have created a strong foundation to failure.

Running a bundled testcase gave us an indication that 1.3M instances of the XMLBean-based solution would consume approximately 1.5GB of heap. We thought we could do better.

First solution is obvious. Integration details should not cross system boundaries. So we changed the caching solution to the simple java.util.HashMap<Long, Person> solution. ID as the key and Person object as the value. Immediately we saw the memory consumption reduced to 214MB. But we were not satisfied yet.

As the key in the Map was essentially a number, we had all the reasons to use Trove Collections to further reduce the overhead. Quick change in the implementation and we had replaced our HashMap withTLongObjectHashMap<Person>. Heap consumption dropped to 143MB.

We definitely could have stopped there, but the engineering curiosity did not allow us to do so. We could not help to notice that the data used contained a redundant piece of information. Date Of Birth was actually encoded in the ID, so instead of duplicating it in additional field, we could easily calculate the birthday from the given ID.

So we changed the layout of the Person object and now it contained just the following fields:

// Imports and methods removed to improve readability
public class Person {
private long id;
private String forename;
private String surname;
}

Re-running the tests confirmed our expectations. Heap consumption was down to 93MB. But we were still not satisfied.

The application was running on 64-bit machine with an old JDK6 release. Which did not compress the ordinary object pointers by default. Switching to the -XX:+UseCompressedOops gave us an additional win – now we were down to 73MB consumed.

comparing-heap-consumption

We could go further and start interning strings or building a b-tree based on the keys, but this would already start impacting the readability of the code, so we decided to stop here. 21.5x heap reduction should already be good enough result.

Lessons learned?

  • Do not let integration details cross system boundaries
  • Redundant data will be costly. Remove the redundancy whenever you can.
  • Primitives are your friends. Know thy tools and learn Trove if you already haven’t
  • Be aware of the optimization techniques provided by your JVM

If you are curious about the experiment conducted, feel free to download the code used from here. The utility used for measurements is described and available in this blogpost.
 

Reference: Reducing memory consumption by 20x from our JCG partner Nikita Salnikov Tarnovski at the Plumbr Blog blog.

Do you want to know how to develop your skillset to become a Java Rockstar?

Subscribe to our newsletter to start Rocking right now!

To get you started we give you two of our best selling eBooks for FREE!

JPA Mini Book

Learn how to leverage the power of JPA in order to create robust and flexible Java applications. With this Mini Book, you will get introduced to JPA and smoothly transition to more advanced concepts.

JVM Troubleshooting Guide

The Java virtual machine is really the foundation of any Java EE platform. Learn how to master it with this advanced guide!

Given email address is already subscribed, thank you!
Oops. Something went wrong. Please try again later.
Please provide a valid email address.
Thank you, your sign-up request was successful! Please check your e-mail inbox.
Please complete the CAPTCHA.
Please fill in the required fields.

5 Responses to "Reducing memory consumption by 20x"

  1. Chris Kelly says:

    Thanks for sharing. That’s one hell of a bit of optimisation. Proves the mantra ‘Make it work then make it fast’ :)

    • Richard says:

      That’s exactly what isn’t proved. Firstly the article is specifically about reducing menory consumption.
      Secondly it mentions nothing about how fast the code before or after any of the changes.

  2. jramoyo says:

    “So we changed the caching solution to the simple java.util.HashMap solution” — it’s not clear what the previous solution was.
    I’m also not sure what XMLBeans have to do with caching?

Leave a Reply


seven + 6 =



Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy | Contact
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
Do you want to know how to develop your skillset and become a ...
Java Rockstar?

Subscribe to our newsletter to start Rocking right now!

To get you started we give you two of our best selling eBooks for FREE!

Get ready to Rock!
You can download the complementary eBooks using the links below:
Close