Enterprise Java

The Effect of Co-ordinated Omission – Measure loopback latency using a simple NIO client/server

In this post I demonstrate a number of ideas and techniques: 

  1. How to write a simple non-blocking NIO client/server
  2. The effects of co-ordinated omission
  3. How to measure latencies in percentiles (as opposed to simple average)
  4. How to time latency loopback on your machine

I was recently developing a low latency benchmark for client server application.  Initially I was simulating the benchmark on a single machine using TCP over loopback. The first measure I wanted to quantify was how much of the recorded latency I needed to allow for simple loopback latency. I would then be able to understand more clearly the latency added by my actual application.

To do this I created a program (code at the end of the post) that transfers a single byte from the client to the server and back again. This is done repeatedly and the results are processed.

The program is written using non-blocking java NIO to optimise the loopback latencies as much as possible.

More importantly than recording the average time, the percentile latencies are recorded. (See a previous post here for a discussion of how to measure latency). Critically, the code factors for co-ordinated omission.  (To learn more about this see here from Gil Tene). In a nutshell, you don’t time a job from when it started but from it should have started.

These are results on my 2 year old MBP.

Starting latency test rate: 80000
Average time 2513852
Loop back echo latency was 2514247.3/3887258.6 4,196,487/4,226,913 4,229,987/4230294 4,230,294 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 70000
Average time 2327041
Loop back echo latency was 2339701.6/3666542.5 3,957,860/3,986,626 3,989,404/3989763 3,989,763 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 50000
Average time 1883303
Loop back echo latency was 1881621.0/2960104.0 3,203,771/3,229,260 3,231,809/3232046 3,232,046 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 30000
Average time 1021576
Loop back echo latency was 1029566.5/1599881.0 1,726,326/1,739,626 1,741,098/1741233 1,741,233 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 20000
Average time 304
Loop back echo latency was 65.6/831.2 3,632/4,559 4,690/4698 4,698 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 10000
Average time 50
Loop back echo latency was 47.8/57.9 89/120 152/182 182 us for 50/90 99/99.9 99.99/99.999 worst %tile

Compare these results to when I did not correct for co-ordinated omission:

 Starting latency test rate: 80000
Average time 45
Loop back echo latency was 44.1/48.8 71/105 124/374 374 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 70000
Average time 45
Loop back echo latency was 44.1/48.9 76/106 145/358 358 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 50000
Average time 45
Loop back echo latency was 43.9/48.8 74/105 123/162 162 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 30000
Average time 45
Loop back echo latency was 44.0/48.8 73/104 129/147 147 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 20000
Average time 45
Loop back echo latency was 44.7/49.6 78/107 135/311 311 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 10000
Average time 46
Loop back echo latency was 45.1/50.8 81/112 144/184 184 us for 50/90 99/99.9 99.99/99.999 worst %tile

As you can see the effects of throughput are completely ignored! It would appear that even at a rate of 80,000 messages a second your 99.99 percentile is 374us when in fact it is way way larger than that.

In fact you can only hit your target latencies when your throughput is nearer 10,000 a second.  As you would understand intuitively there is a trade off between throughput and latency.

The code for this test is below:

package util;
 
 
import java.io.EOFException;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Arrays;
 
/**
 * Created by daniel on 02/07/2015.
 * Simple program to test loopback speeds and latencies.
 */
public class LoopBackPingPong {
    public final static int PORT = 8007;
 
    public void runServer(int port) throws IOException {
        ServerSocketChannel ssc = ServerSocketChannel.open();
        ssc.bind(new InetSocketAddress(port));
        System.out.println("listening on " + ssc);
 
        final SocketChannel socket = ssc.accept();
        socket.socket().setTcpNoDelay(true);
        socket.configureBlocking(false);
        new Thread(() -> {
            long totalTime = 0;
            int count = 0;
            try {
                System.out.println("Connected " + socket);
 
                ByteBuffer bb = ByteBuffer.allocateDirect(1);
                int length;
                while ((length = socket.read(bb)) >= 0) {
                    if (length > 0) {
                        long time = System.nanoTime();
                        bb.flip();
                        bb.position(0);
                        count++;
 
                        if (socket.write(bb) < 0)
                            throw new EOFException();
 
                        bb.clear();
                        totalTime += System.nanoTime() - time;
                    }
                }
            } catch (IOException ignored) {
            } finally {
                System.out.println("Total server time " + (totalTime / count) / 1000);
                System.out.println("... disconnected " + socket);
                try {
                    socket.close();
                } catch (IOException ignored) {
                }
            }
        }).start();
 
    }
 
    public void testLatency(int targetThroughput, SocketChannel socket) throws IOException {
        System.out.println("Starting latency test rate: " + targetThroughput);
        int tests = Math.min(18 * targetThroughput, 100_000);
        long[] times = new long[tests];
        int count = 0;
        long now = System.nanoTime();
        long rate = (long) (1e9 / targetThroughput);
 
        ByteBuffer bb = ByteBuffer.allocateDirect(4);
        bb.putInt(0, 0x12345678);
 
        for (int i = -20000; i < tests; i++) {
            //now += rate;
            //while (System.nanoTime() < now)
            //    ;
            now = System.nanoTime();
 
            bb.position(0);
            while (bb.remaining() > 0)
                if (socket.write(bb) < 0)
                    throw new EOFException();
 
            bb.position(0);
            while (bb.remaining() > 0)
                if (socket.read(bb) < 0)
                    throw new EOFException();
 
            if (bb.getInt(0) != 0x12345678)
                throw new AssertionError("read error");
 
            if (i >= 0)
                times[count++] = System.nanoTime() - now;
        }
        System.out.println("Average time " + (Arrays.stream(times).sum() / times.length) / 1000);
        Arrays.sort(times);
        System.out.printf("Loop back echo latency was %.1f/%.1f %,d/%,d %,d/%d %,d us for 50/90 99/99.9 99.99/99.999 worst %%tile%n",
                times[times.length / 2] / 1e3,
                times[times.length * 9 / 10] / 1e3,
                times[times.length - times.length / 100] / 1000,
                times[times.length - times.length / 1000] / 1000,
                times[times.length - times.length / 10000] / 1000,
                times[times.length - times.length / 100000] / 1000,
                times[times.length - 1] / 1000
        );
    }
 
 
    public static void main(String... args) throws Exception {
        int port = args.length < 1 ? PORT : Integer.parseInt(args[0]);
        LoopBackPingPong lbpp = new LoopBackPingPong();
 
        new Thread(() -> {
            try {
                lbpp.runServer(port);
            } catch (IOException e) {
                Jvm.rethrow(e);
            }
        }).start();
        //give the server a chance to start
        Thread.sleep(1000);
 
        SocketChannel socket = SocketChannel.open(new InetSocketAddress("localhost", port));
        socket.socket().setTcpNoDelay(true);
        socket.configureBlocking(false);
 
        for (int i : new int[]{80_000, 70_000, 50_000, 30_000, 20_000, 10_000})
            lbpp.testLatency(i, socket);
        System.exit(0);
    }
 
}

Daniel Shaya

Daniel has been programming in Java since it was in beta. Working predominantly in the finance industry he has created real time trading and margin risk applications. He is currently a director at OpenHFT where we are building next generation Java low latency products.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Inline Feedbacks
View all comments
Back to top button