Core Java

A Simple Map Iterator Performance Test

There are many dimensions of Java Map performance to measure, but a critical one is simple single-threaded scanning. Here is some simple test code for Iterators and the Java 8 Map.forEach() along with some graphical results.

1. Performance Testing is Difficult

Performance testing is a notoriously difficult endeavor, and precise repeatable testing requires a framework like the Java Microbenchmarking Harness to filter out many noise sources and provide statistics such as the confidence interval. However, the results from the simple test code here are relatively repeatable and correlate well with JMH tests the author has done on the JDK Maps and AirConcurrentMap (at boilerbay.com and github boilerbay/airconcurrentmap).  There are also more extensive JMH tests there for streams and so on.

JMH is not complex, but it requires switching gears from an IDE to the maven model, learning the set of control annotations, and dealing with yet another directory tree full of cruft and possibly a new IDE project. Also, we want automatically to span a spectrum of Map sizes and a set of Maps in one run: this requires ‘parameterization’ annotations in JMH. The output data format of this test can easily be adjusted for the needs of the desired graphing tool.

1.1 Issues Causing Noise

Many issues must be addressed to get a good test:

  1. The inner loop must not have significant overhead. This is solved by having it simply watch changes in a shared static volatile int variable incremented by a ‘heartbeat’ thread;
  2. Garbage collection may jump in. This is helped by distributing GC overhead round-robin over the tests for different Maps. We also avoid creating any unreachable Objects. Use -Xloggc:<file> in the ORACLE JVM to watch GC;
  3. Code may be optimized at any time by the JVM during the run. We use warmup runs of each test to allow this to happen. It is hard to know when this will happen exactly, but it is a matter of seconds. This can be minimized with longer warmups. The typical JVM flag ‘-XX:+PrintCompilation’ shows the progress of optimization. Also see the ORACLE article ;
  4. The JVM will grow erratically while the Maps grow, so the test right after a growth must warmup;
  5. Different Maps use different amounts of memory, so even forking a JVM for each test would depend on JVM growth characteristics. We keep all Maps in one JVM;
  6. Different Maps must have exactly the same contents. We use synchronized random number generators seeded by Map size;
  7. Large Maps need more time to iterate than smaller of course. The heartbeat allows more runs of the smaller Maps so that each test takes a constant time;
  8. Maps must not share contents so earlier tests do not bring data into the CPU caches that is available for later tests;
  9. The ‘Just In Time’ compiler must not optimize away the loop. We print the sum of the contents so the loop has a tangible result. If the loop counter were to be used instead, the JIT can recognize the pattern and change the loop to increment by a number greater than one, thus skipping many iterations! Yes this happens! (In AirConcurrentMap at least).

2. Results

These precautions help us come to some tentative but repeatable conclusions. The results show:

  • forEach() is much faster than Iterators;
  • ConcurrentSkipListMap iterates fastest below about 30K Entries, and AirConcurrentMap is fastest above;
  • ConcurrentSkipListMap forEach() is fastest below about 100 Entries, and AirConcurrentMap is fastest above.

3. Code Details

The test code uses a Test class wrapper to contain a particular Map along with the test progress state such as time, map size, loops, and a running total. This organization hopefully makes the main() more readable.

The Test class implements BiConsumer so that it can be used in the forEach(this) invocation. It simply overrides accept(Integer, Integer). We also test forEach((k, v) -> total += v) (the author has tried it and the results are approximately the same). In either case, a local variable cannot be used as the total accumulator, because local variables must be ‘effectively final’ to an inner class or lambda, so we need an instance variable in a containing scope like the one in class Test anyway. A reduction using a stream will work too, but we are interested in Iterators here. (See https://github.com/boilerbay/airconcurrentmap for relevant stream tests in /jmh).

The heartbeat is a static int incremented once per second by a separate Thread. The heartbeat must be volatile so that its changes propagate between Threads – otherwise the program never terminates.

The USE_ITERATION and USE_LAMBDA flags are static final, so the javac actually evaluates the code it affects ahead-of-time, removing dead code. This is defined to be the necessary behavior, thus there is no overhead there. You have to recompile to do the different kinds of test of course, but just don’t change them to be non-static or non-final!

The inner loop is not slowed by the dynamic method invocation of test.testIterator() because methods of length 35 bytes or less are always inlined. Also, there is no polymorphism here – the Test class has no extensions, so there is no vtable for dispatching, and there are no parameters to push on the stack.

The tests are relatively repeatable and show the overall pattern in the diagrams.

4. Code

public class JavaCodeGeeksIteratorArticle {
    static volatile int heartBeat = 0;
    // otherwise use forEach()
    static final boolean USE_ITERATION = false;
    static final boolean USE_LAMBDA = true;

    public static void main(String... args) {
        System.out.println((USE_ITERATION ? "Iterator" :
            USE_LAMBDA ? " forEach(lambda)" : "ForEach()") + " performance test");
        Test tests[] = {
                new Test(new HashMap<Integer, Integer>()),
                new Test(new TreeMap<Integer, Integer>()),
                new Test(new ConcurrentHashMap<Integer, Integer>()),
                new Test(new ConcurrentSkipListMap<Integer, Integer>()),
                new Test(new AirConcurrentMap<Integer, Integer>())
        };
        int sizes[] = new int[] {
                1, 3, 10, 30, 100, 300, 1000, 3000, 10_000, 30_000, 100_000, 300_000,
                1000_000, 3_000_000, 10_000_000
        };

        // Just increment heartBeat every so often. It is volatile.
        // Reading it is very fast.
        new Thread(new Runnable() {
            public void run() {
                while (true) {
                    heartBeat++;
                    try {
                        Thread.sleep(100);
                    } catch (InterruptedException e) {
                    }
                }
            }
        }).start();

        for (int i = 0; i < sizes.length; i++) {
            for (Test test : tests)
                test.fillTo(sizes[i]);
            // warmup
            for (Test test : tests) {
                int nextHeartBeat = heartBeat + 20;
                while (heartBeat < nextHeartBeat)
                    if (USE_ITERATION)
                        test.testIterator();
                    else if (USE_LAMBDA)
                        test.testForEachLambda();
                    else
                        test.testForEach();
            }
            for (Test test : tests) {
                test.time = 0;
                test.loops = 0;
                long t0 = System.nanoTime();
                int nextHeartBeat = heartBeat + 30;
                while (heartBeat < nextHeartBeat) {
                    if (USE_ITERATION)
                        test.testIterator();
                    else if (USE_LAMBDA)
                        test.testForEachLambda();
                    else
                        test.testForEach();
                }
                long t1 = System.nanoTime();
                test.time += (t1 - t0);
            }
            for (Test test : tests)
                test.printResult();
            // System.out.println("---------------");
        }

    }
}

class Test implements BiConsumer<Integer, Integer> {
    // The total provides a tangible result to prevent optimizing-out
    long total = 0;
    long time = 0;
    int size = 0;
    long loops = 1;
    Map<Integer, Integer> map;

    Test(Map<Integer, Integer> map) {
        this.map = map;
    }

    void fillTo(int newSize) {
        Random random = new Random(size);
        while (size < newSize) {
            Integer n = new Integer(random.nextInt()); 
            map.put(n, n); 
            size++;
        }
    }

    void testIterator() { 
        for (Integer v : map.values()) {
            total += v.intValue(); 
        } loops++; 
    }

    // This has the same effect and is 'terser' 
    void testForEachLambda() {
        map.forEach((k, v) -> total += v);
        loops++;
    }
    
    void testForEach() {
        map.forEach(this);
        loops++;
    }

    // Implement BiConsumer for forEach()
    @Override
    public void accept(Integer k, Integer v) {
        total += k.intValue();
    }

    void printResult() {
        double seconds = time / 1e9;
        System.out.printf("%22s size=% 9d entries/s(K)=% 11.3f total=%d\n",
                map.getClass().getSimpleName(), size, size * loops / seconds / 1e3, total);

    }
}

5. Result Data

The data for the charts is below. This can be imported for example into Excel and sorted on the Map class and size to make graphs manually. You need to cut-and-paste the data, then use text-to-columns. Line charts can be used, manually selecting series data one-by-one for each Map class. One can also use ggplot in the ‘R’ statistics language by changing the data output format. Ignore the total: this only provides a tangible output to avoid optimizing out the loop.

JavaCodeGeeksIteratorArticle

Iterator performance test
               HashMap size=       1 entries/s=  31290.754K total=-289049400605539520
               TreeMap size=       1 entries/s=  50210.333K total=-331631386373881504
     ConcurrentHashMap size=       1 entries/s=  15881.356K total=-91464608232057952
 ConcurrentSkipListMap size=       1 entries/s=  42187.535K total=-254234286353018080
      AirConcurrentMap size=       1 entries/s=  25577.125K total=-149805405784208032
---------------
               HashMap size=       3 entries/s=  62664.626K total=-484691989675689270
               TreeMap size=       3 entries/s=  66908.091K total=-550745245141063704
     ConcurrentHashMap size=       3 entries/s=  38018.996K total=-211326922860746827
 ConcurrentSkipListMap size=       3 entries/s=  71265.063K total=-488278692474832005
      AirConcurrentMap size=       3 entries/s=  48540.146K total=-302163579336545082
---------------
               HashMap size=      10 entries/s=  86701.181K total=-832795481348512598
               TreeMap size=      10 entries/s=  87832.407K total=-916137658370092344
     ConcurrentHashMap size=      10 entries/s=  73069.458K total=-502840045890573499
 ConcurrentSkipListMap size=      10 entries/s=  96150.046K total=-880874881700377401
      AirConcurrentMap size=      10 entries/s=  72001.056K total=-591224549191451578
---------------
               HashMap size=      30 entries/s=  89419.363K total=-832238604657166224
               TreeMap size=      30 entries/s=  92397.645K total=-915559545928720920
     ConcurrentHashMap size=      30 entries/s=  71702.258K total=-502393457157098057
 ConcurrentSkipListMap size=      30 entries/s= 103387.524K total=-880213560719307683
      AirConcurrentMap size=      30 entries/s=  80807.271K total=-590716865563759348
---------------
               HashMap size=     100 entries/s=  90540.307K total=-845663104709828079
               TreeMap size=     100 entries/s=  96479.776K total=-930300717715488858
     ConcurrentHashMap size=     100 entries/s=  69055.433K total=-512752383626539310
 ConcurrentSkipListMap size=     100 entries/s= 111208.365K total=-897628029635465726
      AirConcurrentMap size=     100 entries/s=  89071.481K total=-604453907970584431
---------------
               HashMap size=     300 entries/s=  94846.852K total=-860347586557330269
               TreeMap size=     300 entries/s=  94506.995K total=-944821983122037883
     ConcurrentHashMap size=     300 entries/s=  65857.587K total=-522786710141245214
 ConcurrentSkipListMap size=     300 entries/s= 112398.344K total=-915694233970137252
      AirConcurrentMap size=     300 entries/s=  92168.052K total=-618862268508225735
---------------
               HashMap size=    1000 entries/s=  74493.997K total=-852961390806337305
               TreeMap size=    1000 entries/s=  80026.348K total=-937067262358689631
     ConcurrentHashMap size=    1000 entries/s=  38450.309K total=-519070765727723030
 ConcurrentSkipListMap size=    1000 entries/s= 112085.413K total=-904572392174355552
      AirConcurrentMap size=    1000 entries/s=  89022.852K total=-609589031935380951
---------------
               HashMap size=    3000 entries/s=  57470.417K total=-847037038798366713
               TreeMap size=    3000 entries/s=  65963.172K total=-930168324830420495
     ConcurrentHashMap size=    3000 entries/s=  41073.089K total=-514814334734654678
 ConcurrentSkipListMap size=    3000 entries/s= 109217.866K total=-892841347702876464
      AirConcurrentMap size=    3000 entries/s=  89175.845K total=-600361285087522951
---------------
               HashMap size=   10000 entries/s=  46254.558K total=-846309210319384299
               TreeMap size=   10000 entries/s=  49044.408K total=-929403633977808893
     ConcurrentHashMap size=   10000 entries/s=  36385.473K total=-514246034592481772
 ConcurrentSkipListMap size=   10000 entries/s=  99442.425K total=-891342788950136070
      AirConcurrentMap size=   10000 entries/s=  85447.544K total=-599022098209904335
---------------
               HashMap size=   30000 entries/s=  43723.556K total=-848942181585517584
               TreeMap size=   30000 entries/s=  45253.915K total=-932143497084889069
     ConcurrentHashMap size=   30000 entries/s=  32665.051K total=-516220137420939070
 ConcurrentSkipListMap size=   30000 entries/s=  83393.494K total=-896231928805619954
      AirConcurrentMap size=   30000 entries/s=  80619.262K total=-603845100973163554
---------------
               HashMap size=  100000 entries/s=  40028.088K total=-849706795555554639
               TreeMap size=  100000 entries/s=  41755.506K total=-932944794183923404
     ConcurrentHashMap size=  100000 entries/s=  26064.027K total=-516724530444651670
 ConcurrentSkipListMap size=  100000 entries/s=  46619.667K total=-897138307784594414
      AirConcurrentMap size=  100000 entries/s=  75034.058K total=-605290263409285564
---------------
               HashMap size=  300000 entries/s=  28271.140K total=-850157323063101369
               TreeMap size=  300000 entries/s=  23442.635K total=-933312552546033574
     ConcurrentHashMap size=  300000 entries/s=  22886.588K total=-517086645455936620
 ConcurrentSkipListMap size=  300000 entries/s=  26852.530K total=-897567202447311134
      AirConcurrentMap size=  300000 entries/s=  43406.800K total=-605991920028554584
---------------
               HashMap size= 1000000 entries/s=  20762.874K total=-850266118577777400
               TreeMap size= 1000000 entries/s=  21465.730K total=-933426629396373490
     ConcurrentHashMap size= 1000000 entries/s=  17617.501K total=-517179596963620996
 ConcurrentSkipListMap size= 1000000 entries/s=  17753.452K total=-897660153954995510
      AirConcurrentMap size= 1000000 entries/s=  24726.115K total=-606121840885886155
---------------
               HashMap size= 3000000 entries/s=  20859.307K total=-850290350569160265
               TreeMap size= 3000000 entries/s=  17078.422K total=-933446707332090721
     ConcurrentHashMap size= 3000000 entries/s=  19987.888K total=-517202444269781983
 ConcurrentSkipListMap size= 3000000 entries/s=  23990.479K total=-897687847659433070
      AirConcurrentMap size= 3000000 entries/s=  30472.006K total=-606157150359044044
---------------
               HashMap size= 10000000 entries/s=  18594.336K total=-850335966429011695
               TreeMap size= 10000000 entries/s=  14332.011K total=-933483200019971865
     ConcurrentHashMap size= 10000000 entries/s=  17038.665K total=-517248060129633413
 ConcurrentSkipListMap size= 10000000 entries/s=  18600.417K total=-897733463519284500
      AirConcurrentMap size= 10000000 entries/s=  39037.289K total=-606248382078746904
---------------


ForEach() performance test
               HashMap size=       1 entries/s=  60469.332K total=-429010055848020608
               TreeMap size=       1 entries/s= 162720.446K total=-1192873323002853184
     ConcurrentHashMap size=       1 entries/s=  39683.288K total=-238381128098095008
 ConcurrentSkipListMap size=       1 entries/s= 125216.579K total=-742139402594604448
      AirConcurrentMap size=       1 entries/s=  40199.780K total=-223453462147226592
---------------
               HashMap size=       3 entries/s= 154792.076K total=-907351702836558858
               TreeMap size=       3 entries/s= 209809.380K total=-1866688472587176884
     ConcurrentHashMap size=       3 entries/s= 100788.166K total=-558021636792810008
 ConcurrentSkipListMap size=       3 entries/s= 202179.445K total=-1380005440196857173
      AirConcurrentMap size=       3 entries/s=  99654.211K total=-536118642462089017
---------------
               HashMap size=      10 entries/s= 297297.392K total=-2080956150412338142
               TreeMap size=      10 entries/s= 228262.234K total=-2782965758065995472
     ConcurrentHashMap size=      10 entries/s= 189641.651K total=-1313404093180619596
 ConcurrentSkipListMap size=      10 entries/s= 304427.266K total=-2602702896415550485
      AirConcurrentMap size=      10 entries/s= 179131.091K total=-1257952993772621945
---------------
               HashMap size=      30 entries/s= 305634.315K total=-2079066757218703314
               TreeMap size=      30 entries/s= 224584.862K total=-2781566150975000263
     ConcurrentHashMap size=      30 entries/s= 174917.912K total=-1312321443993669200
 ConcurrentSkipListMap size=      30 entries/s= 354581.676K total=-2600502266614288915
      AirConcurrentMap size=      30 entries/s= 254150.967K total=-1256373143380530839
---------------
               HashMap size=     100 entries/s= 295763.376K total=-2123996537948400465
               TreeMap size=     100 entries/s= 225375.420K total=-2816005249627104526
     ConcurrentHashMap size=     100 entries/s= 168596.386K total=-1337959745143386554
 ConcurrentSkipListMap size=     100 entries/s= 366342.358K total=-2656351051389612808
      AirConcurrentMap size=     100 entries/s= 336305.468K total=-1307783105877232718
---------------
               HashMap size=     300 entries/s= 335940.642K total=-2176036047793281607
               TreeMap size=     300 entries/s= 213798.860K total=-2849343024468137449
     ConcurrentHashMap size=     300 entries/s= 196158.746K total=-1368245793652746886
 ConcurrentSkipListMap size=     300 entries/s= 348762.198K total=-2711376732587358984
      AirConcurrentMap size=     300 entries/s= 435378.640K total=-1375390632080685627
---------------
               HashMap size=    1000 entries/s= 312285.030K total=-2145888100702813327
               TreeMap size=    1000 entries/s= 157007.240K total=-2834013084542617045
     ConcurrentHashMap size=    1000 entries/s= 178919.552K total=-1350929801563960438
 ConcurrentSkipListMap size=    1000 entries/s= 253518.140K total=-2687136797657993712
      AirConcurrentMap size=    1000 entries/s= 449868.858K total=-1331959489090096491
---------------
               HashMap size=    3000 entries/s= 269579.050K total=-2118053337323592431
               TreeMap size=    3000 entries/s= 129271.886K total=-2820412724828116661
     ConcurrentHashMap size=    3000 entries/s=  95870.060K total=-1340856888537750166
 ConcurrentSkipListMap size=    3000 entries/s= 203849.473K total=-2666107139705649888
      AirConcurrentMap size=    3000 entries/s= 443682.852K total=-1286068695711899563
---------------
               HashMap size=   10000 entries/s= 100969.734K total=-2116523986910706773
               TreeMap size=   10000 entries/s=  81240.085K total=-2819237854014952091
     ConcurrentHashMap size=   10000 entries/s=  74229.656K total=-1339726640597926192
 ConcurrentSkipListMap size=   10000 entries/s= 147733.052K total=-2664068913299591178
      AirConcurrentMap size=   10000 entries/s= 399269.901K total=-1279844336850624703
---------------
               HashMap size=   30000 entries/s=  77830.586K total=-2121068856388826590
               TreeMap size=   30000 entries/s=  73801.711K total=-2823183557187296024
     ConcurrentHashMap size=   30000 entries/s=  63503.394K total=-1343522194696030345
 ConcurrentSkipListMap size=   30000 entries/s= 148907.153K total=-2671403338078408622
      AirConcurrentMap size=   30000 entries/s= 450044.679K total=-1305454406448994036
---------------
               HashMap size=  100000 entries/s=  57919.591K total=-2122150626578319295
               TreeMap size=  100000 entries/s=  39468.243K total=-2823932886520250879
     ConcurrentHashMap size=  100000 entries/s=  30273.873K total=-1344103393021081000
 ConcurrentSkipListMap size=  100000 entries/s=  48766.187K total=-2672332261897079327
      AirConcurrentMap size=  100000 entries/s= 307460.503K total=-1311189966514089586
---------------
               HashMap size=  300000 entries/s=  42127.664K total=-2122825947560403955
               TreeMap size=  300000 entries/s=  26508.090K total=-2824353316156729769
     ConcurrentHashMap size=  300000 entries/s=  27206.845K total=-1344518179306734670
 ConcurrentSkipListMap size=  300000 entries/s=  19172.093K total=-2672628537815403377
      AirConcurrentMap size=  300000 entries/s= 152485.548K total=-1313414387297697136
---------------
               HashMap size= 1000000 entries/s=  40607.148K total=-2123037200986959355
               TreeMap size= 1000000 entries/s=  25159.911K total=-2824487462082592448
     ConcurrentHashMap size= 1000000 entries/s=  30257.523K total=-1344677675643783997
 ConcurrentSkipListMap size= 1000000 entries/s=  37742.151K total=-2672825003502099899
      AirConcurrentMap size= 1000000 entries/s= 148954.277K total=-1314169618297632691
---------------
               HashMap size= 3000000 entries/s=  43941.038K total=-2123087741997557902
               TreeMap size= 3000000 entries/s=  18891.646K total=-2824508924703531557
     ConcurrentHashMap size= 3000000 entries/s=  32007.894K total=-1344715062144774703
 ConcurrentSkipListMap size= 3000000 entries/s=  21722.543K total=-2672849927836093703
      AirConcurrentMap size= 3000000 entries/s= 126084.363K total=-1314312240875486125
---------------
               HashMap size= 10000000 entries/s=  35724.715K total=-2123169850545290476
               TreeMap size= 10000000 entries/s=  12693.940K total=-2824540855805427558
     ConcurrentHashMap size= 10000000 entries/s=  27254.306K total=-1344783485934551848
 ConcurrentSkipListMap size= 10000000 entries/s=  13572.712K total=-2672886420523974847
      AirConcurrentMap size= 10000000 entries/s=  92726.047K total=-1314522073830802703
---------------

6. Summary

It is possible to get repeatable, meaningful custom Map performance tests using simple code with certain precautions. This code shows some of those issues. The code is easily adaptable to other kinds of Map tests and data output formats.

Roger Deran

Roger Deran started his programming career with the first PCs, writing 8080 assembly language for a custom OS at PolyMorphic Systems. He has written real-time software, GUI applications, I/O drivers, embedded systems and more. He likes concurrency, B-Trees, and high-performance software such as his own InfinityDB and AirConcurrentMap at boilerbay.com. He has two patents.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

2 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Anurag Singh
6 years ago

Hello sir,

this is really very nice post about map iterator and you explained very well in detail.
keep posting for us.
thank you so much

Roger Deran
6 years ago

Thanks very much! Your comment floats my boat.

I do an awful lot of Map performance testing, nowadays using the Java Microbenchmarking Harness, which is excellent. I’m publishing various articles on Map performance right now covering serial and parallel streams. I think the graphical approach over Map sizes is vital, and basically nobody else uses it, so you can see why their results are all different. It’s way more work, though. Everyone has an opinion and makes wild guesses.

I’ll keep going!

Back to top button