
Java multi-threading: volatile variables, happens-before relationship, and memory consistency

What is a volatile variable?

volatile is a keyword in Java. You cannot use this as a variable or method name. Period.

Seriously, jokes aside, what is a volatile variable? When should we use it?

Ha ha, sorry, couldn’t help.

We typically use the volatile keyword when we share a variable with more than one thread in a multi-threaded environment and want to avoid memory inconsistency errors caused by these variables being cached in the CPU cache.

Consider the following example of producer/consumer, where we are producing/consuming items one at a time:

public class ProducerConsumer {
  private String value = "";
  private boolean hasValue = false;

  public void produce(String value) {
    while (hasValue) {
      try {
        Thread.sleep(500);
      } catch (InterruptedException e) {
        e.printStackTrace();
      }
    }

    System.out.println("Producing " + value + " as the next consumable");
    this.value = value;
    hasValue = true;
  }

  public String consume() {
    while (!hasValue) {
      try {
        Thread.sleep(500);
      } catch (InterruptedException e) {
        e.printStackTrace();
      }
    }

    String value = this.value;
    hasValue = false;
    System.out.println("Consumed " + value);
    return value;
  }
}

In the above class, the produce method generates a new value by storing its argument in value and setting the hasValue flag to true. Before doing so, its while loop checks whether hasValue is already true, which signifies a previously produced value that has not yet been consumed, and if so it puts the current thread to sleep. This sleeping loop only stops once hasValue has been changed back to false, which happens only after the pending value has been consumed by the consume method. The consume method, in turn, puts the current thread to sleep while no new value is available. Once the produce method generates a new value, the consume method terminates its sleeping loop, consumes the value, and clears the flag.

Now imagine that two threads are using an object of this class – one is trying to produce values (the writer thread), and another one is consuming them (the reader thread). The following test illustrates this approach:

public class ProducerConsumerTest {

  @Test
  public void testProduceConsume() throws InterruptedException {
    ProducerConsumer producerConsumer = new ProducerConsumer();
    List<String> values = Arrays.asList("1", "2", "3", "4", "5", "6", "7", "8",
        "9", "10", "11", "12", "13");
    Thread writerThread = new Thread(() -> values.stream()
        .forEach(producerConsumer::produce));
    Thread readerThread = new Thread(() -> {
      for (int i = 0; i < values.size(); i++) {
        producerConsumer.consume();
      }
    });

    writerThread.start();
    readerThread.start();

    writerThread.join();
    readerThread.join();
  }
}

This example will produce the expected output most of the time, but it also has a strong chance of running into a deadlock!

How?

Let’s talk about computer architecture a bit.

We know that a computer consists of CPUs and memory units (among many other parts). Even though the main memory is where all of our program instructions and variables/data reside, during program execution a CPU can store copies of variables in its internal memory (known as the CPU cache) for a performance gain. Since modern computers have more than one CPU, there is also more than one CPU cache.

In a multi-threaded environment, it’s possible for more than one thread to execute at the same time, each on a different CPU (although this is totally dependent on the underlying OS), and each of them may copy variables from main memory into its corresponding CPU cache. When a thread then accesses these variables, it accesses the cached copies, not the actual ones in main memory.

Now let’s assume that the two threads in our test are running on two different CPUs, and the hasValue flag has been cached on either one of them (or both). Now consider the following execution sequence:

  1. writerThread produces a value and changes hasValue to true. However, this update is only reflected in its cache, not in the main memory.
  2. readerThread is trying to consume a value, but its cached copy of the hasValue flag is set to false. So even though a value has been produced by the writerThread, it cannot consume it, as the thread cannot break out of the sleeping loop (hasValue is still false).
  3. Since the readerThread is not consuming the newly generated value, the writerThread cannot proceed either, as the flag is not being cleared, and hence it will be stuck in its sleeping loop.
  4. And we have a deadlock on our hands!

This situation will only change if the hasValue flag is synchronized across all caches, which totally depends on the underlying OS.

What’s the solution then? And how does volatile fit into this example?

If we just mark the hasValue flag as volatile, we can be sure that this type of deadlock will not occur:

private volatile boolean hasValue = false;

Marking a variable as volatile forces each thread to read its value directly from the main memory. Also, each write to a volatile variable is flushed to the main memory immediately. If the threads decide to cache the variable, it is synced with the main memory on each read/write.

After this change, consider the previous execution steps which led to deadlock:

  1. The writer thread produces a value and changes hasValue to true. This time the update is reflected directly in the main memory (even if the variable is cached).
  2. The reader thread is trying to consume a value, and is checking the value of hasValue. This time every read forces the value to be fetched directly from the main memory, so it will pick up the change made by the writer thread.
  3. The reader thread consumes the generated value and clears the flag. This new value goes to the main memory (if it’s cached, then the cached copy will also be updated).
  4. The writer thread will pick up this change, as every read now accesses the main memory. It will continue to produce new values.

And voila! We are all happy ^_^ !
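
By the way, the same visibility guarantee can be demonstrated in isolation. Below is a minimal, self-contained sketch (the VisibilityDemo class is my own illustration, not part of the producer/consumer example): if running were not volatile, the worker thread could keep reading a stale cached value of the flag and spin forever; marking it volatile guarantees that the main thread’s write becomes visible to the worker’s next read.

public class VisibilityDemo {
  // Without volatile, the worker thread may never observe the update made by
  // the main thread; with volatile, every read sees the latest written value.
  private static volatile boolean running = true;

  public static void main(String[] args) throws InterruptedException {
    Thread worker = new Thread(() -> {
      while (running) {
        // busy-wait until the flag is cleared
      }
      System.out.println("Worker noticed the flag change and stopped");
    });
    worker.start();

    Thread.sleep(1000);
    running = false; // volatile write: guaranteed to become visible to the worker
    worker.join();
  }
}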

I see. Is this all volatile does, forcing threads to read/write variables directly from main memory?

Actually it has some further implications. Accessing a volatile variable establishes a happens-before relationship between program statements.

What is a happens-before relationship?

A happens-before relationship between two program statements is sort of a guarantee which ensures that any memory writes by one statement are visible to the other statement.

How does it relate to volatile?

When we write to a volatile variable, it creates a happens-before relationship with each subsequent read of that same variable. So any memory writes that were done up to and including that volatile write will be visible to any statements that follow a subsequent read of that volatile variable.

Err… Ok… I sort of got it, but maybe an example would be good.

Ok, sorry about the vague definition. Consider the following example:

// Definition: Some variables
private int first = 1;
private int second = 2;
private int third = 3;
private volatile boolean hasValue = false;

// First Snippet: A sequence of write operations being executed by Thread 1
first = 5;
second = 6;
third = 7;
hasValue = true;

// Second Snippet: A sequence of read operations being executed by Thread 2
System.out.println("Flag is set to : " + hasValue);
System.out.println("First: " + first);  // will print 5
System.out.println("Second: " + second); // will print 6
System.out.println("Third: " + third);  // will print 7

Let’s assume that the above two snippets are being executed by two different threads: thread 1 and thread 2. When the first thread changes hasValue, it will not only flush this change to main memory, but it will also cause the previous three writes (and any other previous writes) to be flushed to main memory as well! As a result, when the second thread accesses these three variables after reading hasValue, it will see all the writes made by thread 1, even if they were all cached before (and these cached copies will be updated as well)!

This is exactly why we did not have to mark the value variable in our first example as volatile as well. Since we wrote to it before writing hasValue, and read from it after reading hasValue, it was automatically synced with main memory.

This has another interesting consequence. The JVM is famous for its program optimizations. Sometimes it reorders program statements to boost performance without changing the output of the program. As an example, it can change the following sequence of statements:

first = 5;
second = 6;
third = 7;

into this:

second = 6;
third = 7;
first = 5;

However, when the statements involve accessing a volatile variable, it will never move a statement that occurs before a volatile write to a position after it. This means it will never transform this:

first = 5;  // write before volatile write
second = 6;  // write before volatile write
third = 7;   // write before volatile write
hasValue = true;

into this:

first = 5;
second = 6;
hasValue = true;
third = 7;  // Order changed to appear after volatile write! This will never happen!

even though from the perspective of program correctness both of them seem to be equivalent. Note that the JVM is still allowed to reorder the first three writes among themselves, as long as they all appear before the volatile write.

Similarly, the JVM will also never move a statement that appears after a volatile read to a position before it. Which means the following:

System.out.println("Flag is set to : " + hasValue);  // volatile read
System.out.println("First: " + first);  // Read after volatile read
System.out.println("Second: " + second); // Read after volatile read
System.out.println("Third: " + third);  // Read after volatile read

will never be transformed by the JVM into this:

System.out.println("First: " + first);  // Read before volatile read! Will never happen!
System.out.println("Fiag is set to : " + hasValue); // volatile read
System.out.println("Second: " + second); 
System.out.println("Third: " + third);

However, the JVM can certainly reorder the last three reads among themselves, as long as they keep appearing after the volatile read.

I sense a performance penalty has to be paid for volatile variables.

You got that right: volatile variables force main memory access, and accessing main memory is always much slower than accessing CPU caches. Volatile also prevents certain program optimizations by the JVM, further reducing performance.

Can we always use volatile variables to maintain data consistency across threads?

Unfortunately not. When more than one thread reads and writes the same variable, marking it as volatile is not enough to maintain consistency. Consider the following UnsafeCounter class:

public class UnsafeCounter {
  private volatile int counter;

  public void inc() {
    counter++;
  }

  public void dec() {
    counter--;
  }

  public int get() {
    return counter;
  }
}

and the following test:

public class UnsafeCounterTest {

  @Test
  public void testUnsafeCounter() throws InterruptedException {
    UnsafeCounter unsafeCounter = new UnsafeCounter();
    Thread first = new Thread(() -> {
      for (int i = 0; i < 5; i++) { 
        unsafeCounter.inc();
      }
    });
    Thread second = new Thread(() -> {
      for (int i = 0; i < 5; i++) {
        unsafeCounter.dec();
      }
    });

    first.start();
    second.start();
    first.join();
    second.join();

    System.out.println("Current counter value: " + unsafeCounter.get());
  }
}

The code is pretty self-explanatory. We are incrementing the counter in one thread and decrementing it in another the same number of times. After running this test we expect the counter to hold 0, but this is not guaranteed. Most of the time it will be 0, but sometimes it will be -1, -2, 1, 2, i.e., any integer value in the range [-5, 5].

Why does this happen? It happens because neither the increment nor the decrement operation on the counter is atomic: they do not happen all at once. Both of them consist of multiple steps, and these sequences of steps can overlap with each other. So you can think of an increment operation as follows:

  1. Read the value of the counter.
  2. Add one to it.
  3. Write back the new value of the counter.

and a decrement operation as follows:

  1. Read the value of the counter.
  2. Subtract one from it.
  3. Write back the new value of the counter.
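
To make the non-atomicity concrete, here is a rough expansion of the inc() method from UnsafeCounter. The temp local variable is purely illustrative (the JVM does not literally generate this Java code), but the three steps match the lists above:

public void inc() {
  // counter++ is effectively three separate steps; another thread can
  // interleave its own read-modify-write between any two of them.
  int temp = counter; // 1. read the current value of the counter
  temp = temp + 1;    // 2. add one to it
  counter = temp;     // 3. write the new value back
}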

Now, let’s consider the following execution steps:

  1. First thread has read the value of the counter from memory. Initially it’s set to zero. It then adds one to it.
  2. Second thread has also read the value of the counter from memory, and saw that it’s set to zero. It then subtracts one from it.
  3. First thread now writes back the new value of counter to memory, changing it to 1.
  4. Second thread now writes back the new value of counter to memory, which is -1.
  5. First thread’s update is lost.

How do we prevent this?

By using synchronization:

public class SynchronizedCounter {
  private int counter;

  public synchronized void inc() {
    counter++;
  }

  public synchronized void dec() {
    counter--;
  }

  public synchronized int get() {
    return counter;
  }
}

Or by using an AtomicInteger:

public class AtomicCounter {
  private AtomicInteger atomicInteger = new AtomicInteger();

  public void inc() {
    atomicInteger.incrementAndGet();
  }

  public void dec() {
    atomicInteger.decrementAndGet();
  }

  public int get() {
    return atomicInteger.intValue();
  }
}

My personal choice is the one using AtomicInteger, as the synchronized version hampers performance greatly by allowing only one thread at a time to access any of the inc/dec/get methods, whereas AtomicInteger relies on lock-free compare-and-swap operations instead.

I notice that the synchronized version does not mark the counter as volatile. Does this mean……..?

Yup. Using the synchronized keyword also establishes a happens-before relationship between statements. Releasing the lock at the end of a synchronized method/block happens-before every subsequent acquisition of the same lock, so the writes made by one thread before and inside the block are visible to the next thread that synchronizes on the same lock. For a full list of what establishes a happens-before relationship, see the happens-before order section of the Java Language Specification (§17.4.5).
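
To sketch what that buys us (the SharedState class below is my own illustration, not from the article): plain fields written inside one synchronized method are guaranteed to be visible to another thread that later enters a synchronized method on the same object.

public class SharedState {
  private int data;       // deliberately not volatile
  private boolean ready;  // deliberately not volatile

  public synchronized void publish(int value) {
    data = value;
    ready = true;
  } // releasing the lock happens-before any later acquisition of the same lock

  public synchronized Integer tryRead() {
    // A thread entering here after publish() has completed is guaranteed
    // to see both writes made above.
    return ready ? data : null;
  }
}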

That’s all I have to say about volatile for the time being.

MD Sayem Ahmed

Sayem is an experienced software developer who loves to work with anything related to the internet. He has worked in various domains using a large number of programming languages. Although he especially likes to work with Java and JavaScript, he enjoys working with other languages too.
Comments

Vladislav Rastrusnyy
8 years ago

> during program execution CPUs can store copies of variables in their internal memory (which is known as CPU cache) for performance gain

Are you sure volatile in JRE really works with CPU cache like you described? I think, volatile is just an instruction about the potential multithreaded mutability of the variable for JRE.

Sayem Ahmed
8 years ago

> volatile is just an instruction about the potential multithreaded mutability of the variable for JRE.

And the way Java/JRE deals with this “potential multithreaded mutability” is by ensuring that all writes/reads of that variable go straight to the main memory. The example that I showed above describes just one scenario. Even if both threads run on the same CPU, there is a chance that they might cache those values thread-locally if they are non-volatile, leading to deadlocks.

See this example that is mentioned in the Java Language Specification: http://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.3.

Vladislav Rastrusnyy
8 years ago
Reply to  Sayem Ahmed

Yep. I’m just pointing out that this should have nothing to do with the CPU cache. It is about thread-local caching of variables.

Sayem Ahmed
8 years ago

Thank you for pointing it out! I wish I could edit my post to include your response!

Vladislav Rastrusnyy
8 years ago
Reply to  Sayem Ahmed

You are welcome! ;)

Mayur kohli
6 years ago

Great and insightful article on Java multi-threading. You have explained volatile variables well, and as a beginner one should know about all of this. I found it very helpful for getting up the learning curve on Java multi-threading. Thanks for sharing.
