What is the fastest Garbage Collector in Java 8?

OpenJDK 8 has several Garbage Collector algorithms, such as Parallel GC, CMS and G1. Which one is the fastest? What will happen if the default GC changes from Parallel GC in Java 8 to G1 in Java 9 (as currently proposed)? Let’s benchmark it.

Benchmark methodology

  • Run the same code 6 times, each time with a different VM argument (-XX:+UseSerialGC, -XX:+UseParallelGC, -XX:+UseConcMarkSweepGC, -XX:ParallelCMSThreads=2, -XX:ParallelCMSThreads=4, -XX:+UseG1GC).
  • Each run takes about 55 minutes.
  • Other VM arguments: -Xmx2048M -server
    OpenJDK version: 1.8.0_51 (currently the latest version)
    Software: Linux version 4.0.4-301.fc22.x86_64
    Hardware: Intel® Core™ i7-4790 CPU @ 3.60GHz
  • Each run solves 13 planning problems with OptaPlanner. Each planning problem runs for 5 minutes. It starts with a 30-second JVM warm-up, which is discarded.
  • Solving a planning problem involves no IO (except a few milliseconds during startup to load the input). A single CPU is completely saturated. The solver constantly creates many short-lived objects, and the GC collects them afterwards.
  • The benchmarks measure the number of scores that can be calculated per millisecond. Higher is better. Calculating a score for a proposed planning solution is non-trivial: it involves many calculations, including checking for conflicts between every entity and every other entity.
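To make the metric concrete, here is a minimal sketch of how "score calculations per millisecond" could be measured with a discarded warm-up. It is not OptaPlanner's benchmarker: the class ScoreThroughputSketch, the stand-in calculateScore method and the short durations are all assumptions chosen for illustration.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch, not OptaPlanner code: measures calculations per millisecond. */
final class ScoreThroughputSketch {

    /** Keeps results reachable so the JIT cannot drop the work entirely. */
    static volatile long sink;

    /** Stand-in score: allocates short-lived objects and checks every entity against every other. */
    static long calculateScore(int entityCount) {
        int[][] entities = new int[entityCount][];
        for (int i = 0; i < entityCount; i++) {
            entities[i] = new int[] {i % 7, i % 11}; // garbage after this call returns
        }
        long conflicts = 0;
        for (int i = 0; i < entityCount; i++) {
            for (int j = i + 1; j < entityCount; j++) {
                if (entities[i][0] == entities[j][0] && entities[i][1] == entities[j][1]) {
                    conflicts++;
                }
            }
        }
        return -conflicts; // fewer conflicts means a higher score
    }

    /** Runs the warm-up (result discarded), then returns calculations per millisecond. */
    static double measure(Runnable work, long warmUpMillis, long measureMillis) {
        runFor(work, warmUpMillis);
        long start = System.nanoTime();
        long count = runFor(work, measureMillis);
        long elapsedNanos = System.nanoTime() - start;
        return count / (elapsedNanos / 1_000_000.0);
    }

    /** Repeats the work until the time limit passes and returns the number of repetitions. */
    private static long runFor(Runnable work, long millis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(millis);
        long count = 0;
        while (System.nanoTime() < deadline) {
            work.run();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Durations are far shorter than the 30 seconds / 5 minutes used in the real benchmark.
        double perMs = measure(() -> { sink = calculateScore(200); }, 500, 2000);
        System.out.println(perMs + " score calculations per millisecond");
    }
}
```

Running the same class under different GC flags gives a rough feel for the effect, but a toy loop like this is no substitute for the real datasets.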

To reproduce these benchmarks locally, build OptaPlanner from source and run the main class GeneralOptaPlannerBenchmarkApp.
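If you want to script the six runs, the following sketch assembles one command line per GC configuration from the VM arguments listed above. Several things are assumptions: the classpath placeholder is hypothetical, the main class is given by its simple name only (its package prefix must be added), and I assume the two ParallelCMSThreads runs also enable CMS.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: builds (but does not launch) one java command per GC configuration. */
final class GcRunCommands {

    /** The six configurations from the methodology, in benchmark order. */
    static Map<String, List<String>> gcConfigurations() {
        Map<String, List<String>> configs = new LinkedHashMap<>();
        configs.put("Serial", Arrays.asList("-XX:+UseSerialGC"));
        configs.put("Parallel", Arrays.asList("-XX:+UseParallelGC"));
        configs.put("CMS", Arrays.asList("-XX:+UseConcMarkSweepGC"));
        // Assumption: the thread-count variants are combined with the CMS flag.
        configs.put("CMS, 2 threads",
                Arrays.asList("-XX:+UseConcMarkSweepGC", "-XX:ParallelCMSThreads=2"));
        configs.put("CMS, 4 threads",
                Arrays.asList("-XX:+UseConcMarkSweepGC", "-XX:ParallelCMSThreads=4"));
        configs.put("G1", Arrays.asList("-XX:+UseG1GC"));
        return configs;
    }

    /** Combines the shared VM arguments (-server -Xmx2048M) with the GC-specific ones. */
    static List<String> commandFor(List<String> gcArgs, String classpath, String mainClass) {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-server");
        command.add("-Xmx2048M");
        command.addAll(gcArgs);
        command.add("-cp");
        command.add(classpath);
        command.add(mainClass);
        return command;
    }

    public static void main(String[] args) {
        String classpath = "<your-optaplanner-examples-classpath>"; // hypothetical placeholder
        String mainClass = "GeneralOptaPlannerBenchmarkApp";        // package prefix omitted
        for (Map.Entry<String, List<String>> entry : gcConfigurations().entrySet()) {
            List<String> command = commandFor(entry.getValue(), classpath, mainClass);
            System.out.println(entry.getKey() + ": " + String.join(" ", command));
        }
    }
}
```

Each printed line can be pasted into a shell, or the list can be passed to a ProcessBuilder to launch the runs one after another.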

Benchmark results

Executive summary

For your convenience, I’ve compared each Garbage Collector type to the default in Java 8 (Parallel GC).

[Figure: Garbage Collector types in Java 8, each compared to Parallel GC]

The results are clear: the default (Parallel GC) is the fastest.

Raw benchmark numbers

[Table 1: raw benchmark numbers]

Relative benchmark numbers

[Table 2: relative benchmark numbers]
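As a sketch of how a relative number can be derived from two raw numbers: take the difference to the Parallel GC baseline and divide by that baseline. The class and method names are hypothetical and the figures in main are made up for illustration, not taken from the tables. Averaging the per-dataset percentages is also an assumption about how the overall average is formed.

```java
/** Hypothetical sketch: relative throughput of one collector against a baseline. */
final class RelativeThroughput {

    /** Percentage difference against the baseline; a negative value means slower. */
    static double relativePercent(double baseline, double candidate) {
        return 100.0 * (candidate - baseline) / baseline;
    }

    /** Assumption: the overall figure is the mean of the per-dataset percentages. */
    static double averageRelativePercent(double[] baseline, double[] candidate) {
        if (baseline.length != candidate.length || baseline.length == 0) {
            throw new IllegalArgumentException("Need the same non-zero number of datasets");
        }
        double sum = 0.0;
        for (int i = 0; i < baseline.length; i++) {
            sum += relativePercent(baseline[i], candidate[i]);
        }
        return sum / baseline.length;
    }

    public static void main(String[] args) {
        // Made-up score calculations per millisecond for two datasets.
        double[] parallel = {1000.0, 500.0};
        double[] other = {800.0, 450.0};
        System.out.println(relativePercent(parallel[0], other[0]));   // first dataset
        System.out.println(averageRelativePercent(parallel, other));  // mean over both
    }
}
```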

Should Java 9 default to G1?

There is a proposal to make G1 the default Garbage Collector in OpenJDK 9 for servers. My first reaction is to reject this proposal:

  • G1 is 17.60% slower on average.
  • G1 is consistently slower on every use case for every dataset.
  • On the biggest dataset (Machine Reassignment B10), which dwarfs any of the other datasets in size, G1 is 34.07% slower.
  • If the default GC differs between developer machines and servers, then developer benchmarks become less trustworthy.
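To check whether a developer machine and a server really run the same collector, a JVM can report its active collectors through the standard java.lang.management API. A minimal sketch follows; the class name ActiveGcReport is hypothetical. The reported names depend on the JVM; on HotSpot they are typically something like "PS Scavenge" and "PS MarkSweep" for Parallel GC, or "G1 Young Generation" and "G1 Old Generation" for G1.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: reports which garbage collectors the running JVM uses. */
final class ActiveGcReport {

    /** Names of the collectors as reported by the JVM's management beans. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Collectors: " + collectorNames());
        // Maximum heap size, which reflects -Xmx when it is set.
        long maxHeapMiB = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap: " + maxHeapMiB + " MiB");
    }
}
```

Logging this once at startup on both environments makes a GC mismatch visible before anyone compares benchmark numbers.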

On the other hand, there are a few nuances to note:

  • G1 focuses on limiting GC pauses, instead of throughput. For these use cases (with heavy calculations) GC pause length mostly doesn’t matter.
  • This was an (almost) single-threaded benchmark. Further benchmarking with multiple solvers in parallel or multi-threaded solving might influence results.
  • G1 is recommended for a heap size of at least 6 GB. This benchmark used a heap size of only 2 GB and even that size is only needed for the biggest dataset (Machine Reassignment B10).

Heavy calculation is just one of the many things that OpenJDK is used for: it’s just one stakeholder in this community-wide debate. If other stakeholders (such as web services) prove otherwise, maybe it’s worth changing the default GC. But show me the benchmarks on real projects first!

Conclusion

In Java 8, the default Garbage Collector (Parallel GC) is generally the best choice for OptaPlanner use cases.

Reference: What is the fastest Garbage Collector in Java 8? from our JCG partner Geoffrey De Smet at the OptaPlanner blog.

Geoffrey De Smet

Geoffrey De Smet (Red Hat) is the lead and founder of OptaPlanner. Before joining Red Hat in 2010, he was formerly employed as a Java consultant, an A.I. researcher and an enterprise application project lead. He has contributed to many open source projects (such as drools, jbpm, pressgang, spring-richclient, several maven plugins, weld, arquillian, ...). Since he started OptaPlanner in 2006, he’s been passionately addicted to planning optimization.

1 Comment
Meliora
8 years ago

G1GC is about *latency* not throughput. So this benchmark is pretty pointless. Basically it compares the throughput of the throughput collector with the throughput of the various “low latency” collectors.
