GC impact on throughput and latency

One class of problems that nearly every Java application has to wrestle with is garbage collection. When the garbage collector works, it is a wonderful invention. When it does not, or when the way GC does its housekeeping becomes unpredictable, you have a friend who has turned into a foe.

This post is about garbage collection pause times. Or, more precisely, why you should care about the pauses.

Some posts ago, I explained throughput and latency through the story of Apple CEO Tim Cook planning for iPad demand and building factories. I will stick to the same illustrative story:

  • We have a factory line producing one iPad per second, every second. So the throughput of the line is 86,400 iPads/day.
  • It takes four hours to complete an iPad from the start where the casing is molded to the finish when the acceptance tests on the iPad have been concluded. So the latency of the line is four hours.

The system above and the calculations are based on the assumption that the factory line is operational 24 hours a day, every day. But all factory lines need maintenance, which is the equivalent of garbage collection running inside the JVM.

As an example, let's take small maintenance tasks, which can be handled without much interruption. Examples could involve adding oil to the machinery or picking up excess trash from the floor next to the molding equipment. Those operations are similar to minor GCs within the JVM: it is maintenance you have to deal with, but the implementation is so clever that the performance of the system is barely affected.

But in the very same factory, Mr. Cook is going to face long-lasting maintenance tasks as well. Those tasks involve stopping the whole production line and are equivalent to Full GC runs, where the JVM needs to stop the application threads in order to do some important housekeeping.

Now, let's assume that after months of uninterrupted service, our hypothetical factory line gets jammed and the tech team takes four hours to resolve the issue. During this period the line is stopped. How do we measure the effect? As always, the impact can be measured in two different ways:

  • Impact on throughput. The four-hour stop means we have 14,400 seconds during which no iPads are completed. Throughput-wise, we have reduced the system's capacity on this particular day from 86,400 to 72,000 iPads, which is a loss of approximately 16.7% in throughput.
  • Impact on latency. An iPad which was still on the line when the interruption occurred took not four but eight hours to complete. This represents a 100% increase in worst-case latency.
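
The arithmetic behind the two bullets above can be checked with a few lines of Java. This is only a sketch of the calculation; the class and method names are made up for illustration. The exact throughput loss is 14,400 / 86,400, which rounds to 16.7%.

```java
// Worked arithmetic for the factory example. All names here are illustrative.
final class FactoryPauseImpact {

    /** Percentage of a period's output lost when the line stands still for stoppedSeconds. */
    static double throughputLossPercent(long periodSeconds, long stoppedSeconds) {
        return 100.0 * stoppedSeconds / periodSeconds;
    }

    /** Percentage increase in worst-case latency for an item caught on the line by the stop. */
    static double worstCaseLatencyIncreasePercent(long normalLatencySeconds, long stoppedSeconds) {
        return 100.0 * stoppedSeconds / normalLatencySeconds;
    }

    public static void main(String[] args) {
        long day = 24 * 60 * 60;      // 86,400 seconds, one iPad per second
        long stop = 4 * 60 * 60;      // 14,400 seconds during which the line is jammed
        long latency = 4 * 60 * 60;   // four hours from molding to acceptance tests

        System.out.println("iPads produced that day: " + (day - stop));
        System.out.printf("Throughput loss: %.1f%%%n", throughputLossPercent(day, stop));
        System.out.printf("Worst-case latency increase: %.0f%%%n",
                worstCaseLatencyIncreasePercent(latency, stop));
    }
}
```

The same stop shows up as a loss of roughly one sixth in throughput but as a doubling of worst-case latency, which is why the two metrics have to be judged separately.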

If you recall, Mr. Cook did not care about latency. What was important for him was the overall throughput over a longer period, so he would optimize his processes in a way that minimized the impact on throughput.

Similar decisions need to be made in software development as well. If you have a Java EE application responsible for order processing, then a GC pause spanning four seconds would definitely reduce the throughput of your system. But for most of us it will not be a major issue. On the other hand, the users who were trying to accomplish things during the four-second stop-the-world-i-have-cleaning-to-do pause would get the impression that our systems are sluggish. And operating a service which users perceive as sluggish is a darn good way to go out of business.
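
To put rough numbers on this, here is a minimal simulation sketch of a single four-second stop-the-world pause. The arrival rate (one request every 10 ms) and the normal processing time (50 ms) are assumptions picked purely for illustration, and the model ignores the queueing that would follow the pause in a real system.

```java
// Toy model of one stop-the-world pause. All numbers and names are hypothetical.
final class GcPauseSimulation {

    /** Latency in ms of one request, given a single pause covering [pauseStartMs, pauseEndMs). */
    static long latencyMs(long arrivalMs, long serviceMs, long pauseStartMs, long pauseEndMs) {
        boolean overlapsPause = arrivalMs < pauseEndMs && arrivalMs + serviceMs > pauseStartMs;
        if (!overlapsPause) {
            return serviceMs;
        }
        // The request is frozen from the later of (arrival, pause start) until the pause ends.
        long frozenMs = pauseEndMs - Math.max(arrivalMs, pauseStartMs);
        return serviceMs + frozenMs;
    }

    public static void main(String[] args) {
        long serviceMs = 50;          // assumed normal processing time per request
        long intervalMs = 10;         // assumed arrival rate: one request every 10 ms
        long pauseStartMs = 30_000;   // the pause begins 30 s into a one-minute window
        long pauseEndMs = 34_000;     // and lasts four seconds
        long windowMs = 60_000;

        long worst = 0;
        long delayed = 0;
        long total = 0;
        for (long arrival = 0; arrival < windowMs; arrival += intervalMs) {
            long latency = latencyMs(arrival, serviceMs, pauseStartMs, pauseEndMs);
            worst = Math.max(worst, latency);
            if (latency > serviceMs) {
                delayed++;
            }
            total++;
        }

        // Share of a whole day during which the application did no work because of the pause.
        double dayLossPercent = 100.0 * (pauseEndMs - pauseStartMs) / (24L * 60 * 60 * 1000);

        System.out.println("Requests delayed: " + delayed + " of " + total);
        System.out.println("Worst-case latency: " + worst + " ms (normal: " + serviceMs + " ms)");
        System.out.println("Share of a day lost to the pause: " + dayLossPercent + " %");
    }
}
```

Under these assumptions the pause costs well under a hundredth of a percent of a day's capacity, yet the unluckiest requests take about four seconds instead of 50 ms.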

The moral of the story? Pick your goals wisely and make sure you do not confuse throughput with latency. Then make sure you understand how GC can affect either of those by monitoring your GC logs, looking for unexpected Full GCs and tuning your application and/or GC to minimize their impact.
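
Besides reading GC logs, a quick way to get a first look at GC activity from inside the application is the standard `java.lang.management` API. The sketch below is one possible starting point, not a full monitoring solution; the class name `GcStats` is made up for illustration. Keep in mind that `getCollectionTime()` reports approximate accumulated collection time, which for concurrent collectors is not the same thing as pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Prints per-collector totals gathered since JVM start. GcStats is a hypothetical helper.
final class GcStats {

    /** One line per collector: name, number of collections, accumulated collection time. */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Either value may be -1 if the collector does not report it.
            lines.add(String.format("%s: count=%d, time=%d ms",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Allocate some short-lived garbage so the collectors have something to do.
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
        }
        snapshot().forEach(System.out::println);
    }
}
```

The collector names depend on the JVM and the GC algorithm in use. For the pause-by-pause detail needed to spot unexpected Full GCs, GC logging is still the better tool: `-verbose:gc` is a reasonable first step, with `-XX:+PrintGCDetails` on JDK 8 and earlier or `-Xlog:gc*` on JDK 9 and later for more detail.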
 

Reference: GC impact on throughput and latency from our JCG partner Nikita Salnikov Tarnovski at the Plumbr blog.

One Response to "GC impact on throughput and latency"

  1. Zoltan Juhasz says:

    This post can cause misunderstandings about the GC.
    You say the GC means a performance loss in throughput, but compared to what?
    – Not having garbage collection requires a practically infinite amount of memory, which is not an option I think.
    – Switching back to some GC-less language like C++ requires manual garbage handling (memory allocation and deallocation), which is much less efficient than the Java GC.
    (- garbage-free applications have serious drawbacks for everyday use)

    So overall, the Java GC means a performance increase in throughput and average latency. Of course GC tuning is still required to make it even more performant.

    BUT, as you said (so the point of the post is valid and important), it causes huge worst-case latency. So if latency matters (latency requirements under 0.1 second, as in high-frequency trading), it MUST be handled in some way.

Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy | Contact