Core Java

The Java Memory Model Explained: Happens-Before and Concurrency Semantics

There is a quiet assumption embedded in almost every line of Java ever written: that when one thread writes a value to a variable, another thread will eventually see it. That when two operations appear in a certain order on the page, the processor will execute them in that order. That a flag you set to true in one thread will, at some point, read back as true in another.

None of these things are guaranteed. Not by Java. Not by the JVM. Not by the hardware beneath them. And yet most concurrent Java code works most of the time — not because these guarantees exist, but because the conditions under which they fail tend not to occur during development, in unit tests, or under low concurrency. They emerge instead under the specific combination of load, hardware architecture, and JIT optimisation that only exists in production. By then, the bug is intermittent, non-reproducible, and deeply confusing.

The Java Memory Model is the specification that defines, with formal precision, exactly what Java does and does not guarantee about the visibility and ordering of memory operations across threads. Understanding it is not an academic exercise reserved for compiler authors. It is the conceptual foundation on which every correct concurrent Java program is built — and every incorrect one broke down. This article explains that foundation without code, from first principles, in the order that makes the reasoning clear.

Why the Problem Exists: Hardware the JVM Must Accommodate

The Java Memory Model does not exist because Java’s designers wanted complexity. It exists because the hardware on which Java runs is already complex, and ignoring that complexity would make Java programs either incorrect or unacceptably slow. To understand the JMM, you therefore need to understand the physical reality it is responding to.

The memory hierarchy and why caches matter

Modern CPUs do not read from and write directly to RAM. They operate instead through a hierarchy of increasingly fast, increasingly small memory layers. At the top sits the processor register — the fastest memory that exists, holding the value the CPU is actively operating on. Below it are the L1 cache (a few kilobytes, on-chip, nanoseconds), the L2 cache (hundreds of kilobytes, microseconds), and sometimes an L3 cache shared across cores. Below all of those is main memory, orders of magnitude slower.

The consequence of this hierarchy is decisive. When a thread running on Core A writes a value to a variable, that write lands initially in Core A’s private L1 cache. It propagates to main memory only later — potentially much later, depending on the cache coherence protocol in use. Meanwhile, a thread running on Core B has its own private L1 cache. If Core B reads that variable before Core A’s write propagates, it will see the old value. Both threads are running correctly from the perspective of their own cache — they simply disagree about what memory contains.

This is not a bug in the hardware. It is an intentional design: eliminating private caches would force every memory access to go to shared RAM, slowing modern CPUs by orders of magnitude. The cache hierarchy is what makes CPUs fast. The price is that multi-core memory is not sequentially consistent from a programmer’s perspective unless explicit synchronisation is used.

Instruction reordering: the second source of surprise

Beyond caches, both compilers and CPUs reorder instructions. A compiler performing dead-code elimination or loop optimisation may move a write to a variable later in the instruction stream than where it appears in the source. A CPU with an out-of-order execution engine may issue instructions in a different sequence than they arrived, as long as the observable result within a single thread is unchanged.

This last qualification — “within a single thread” — is the crux of the problem. Reordering that is invisible to a single-threaded program can be entirely visible to a concurrent one. If a compiler moves the initialisation of an object’s fields after the assignment of the object reference, a second thread reading that reference might see a non-null pointer to an incompletely constructed object. The program is still correct from the perspective of the thread doing the writing. It is silently broken from the perspective of every other thread.

“The processor sees multiple operations which are related will have a happens-before relation, but two unrelated statements can be executed out-of-order. The guarantee of output not being disturbed is only for that thread running in isolation.”

This principle — that the JVM provides as-if-serial semantics within a single thread, but makes no ordering promises across threads without explicit synchronisation — is the central premise the entire Java Memory Model is built upon.

Sources of observable reordering in the JVM execution pipeline

Conceptual contribution of each layer to instruction reordering visible across threads

The Original Memory Model Was Fundamentally Broken

Java’s first memory model — in versions 1 through 4 — was an earnest attempt to provide cross-platform concurrency guarantees. It was, by the time Java 5 arrived, considered broken by the language community. Understanding why it failed is important because the failure was not incidental. It illuminated exactly which guarantees a correct memory model must provide.

The original model suffered from two opposing and equally damaging defects. On one side, it over-specified behaviour for volatile. The old volatile guaranteed visibility but also implied a degree of total ordering across all volatile and non-volatile operations that was impossibly expensive to implement on most CPU architectures. Correct implementations on platforms like SPARC and IA-64 required memory fences far more aggressive than the situation demanded, making volatile on those platforms slower than it needed to be.

On the other side, the old model wildly under-specified behaviour for final fields. As the JSR-133 FAQ documents directly, nothing in the original model treated final fields differently from any other field. This meant that, in theory, a second thread reading a final field of an object might see the field’s default zero value rather than the value assigned in the constructor — a violation of the most basic intuition about what final should mean. Synchronisation was the only safe path, defeating the purpose of declaring fields final in the first place.

Beyond volatile and final, the original model was simply too imprecise to reason about. It did not define what constituted a data race with sufficient clarity for either programmers or JVM implementors to rely on. Different JVM vendors implemented different interpretations. The result was a model that was simultaneously too strict for implementors and too vague for programmers — the worst of both worlds.

JSR-133, led by Jeremy Manson, Brian Goetz, and William Pugh, replaced the original memory model beginning with Java 5 in 2004. It introduced a formally grounded specification based on the happens-before relation, tightened the semantics of volatile to be useful without being unnecessarily expensive, and gave final fields the safety guarantee programmers had always assumed they had. The JSR-133 model, with minor refinements, is the Java Memory Model that governs all Java code today.

The Happens-Before Relation: The Core Mechanism of the JMM

The central concept introduced by JSR-133 is the happens-before relation. It is worth being precise about what it means and, equally importantly, what it does not mean.

Crucially, happens-before is a partial ordering, not a total one. Not every pair of actions in a concurrent program is ordered relative to each other. Two actions that are unordered may execute in either sequence, and the JMM makes no guarantee about which value a thread observes in that case. The art of writing correct concurrent Java is, in a significant sense, the art of ensuring that every pair of accesses to shared mutable state has a happens-before relationship in the appropriate direction.

The five axioms: the complete set of happens-before rules

The JMM defines exactly which actions create happens-before edges. There are five primary rules, and together they form a complete axiom system. Every happens-before relationship in any Java program can be derived from these five, through the transitivity property described below.

Axiom 1 — Program Order Rule

Within a single thread, every action happens-before every subsequent action in program order. This is what gives sequential code its predictability: instructions in a method execute in the order they are written, at least from the perspective of that thread.

Axiom 2 — Monitor Lock Rule

An unlock of a monitor (the release of a synchronized block or method) happens-before every subsequent lock of that same monitor. This is the guarantee behind synchronized: any thread that acquires a lock is guaranteed to see all writes made by the previous holder of that lock before it released it.

Axiom 3 — Volatile Variable Rule

A write to a volatile variable happens-before every subsequent read of that same variable. This is a stronger guarantee than mere visibility: it establishes a total ordering on all reads and writes of a given volatile variable across all threads.

Axiom 4 — Thread Start Rule

A call to Thread.start() happens-before any action in the newly started thread. Everything the spawning thread did before calling start() is guaranteed to be visible to the spawned thread when it begins execution.

Axiom 5 — Thread Join Rule

All actions in a thread happen-before any thread that successfully returns from Thread.join() on that thread. A thread that calls join() and returns is guaranteed to see every write made by the joined thread during its lifetime.

To these five, a sixth and technically distinct property makes the system powerful: transitivity. If A happens-before B, and B happens-before C, then A happens-before C. This rule is what makes the JMM practical — you can chain guarantees together to reason about complex scenarios. Without transitivity, each happens-before relationship would be an island. With it, a single well-placed volatile write or monitor unlock can anchor an entire chain of visibility guarantees that reaches back through many prior operations.

The five happens-before axioms — and how they compose

Relative strength of guarantee (visibility + ordering enforcement) provided by each rule and whether transitivity extends them across thread boundaries

The Three Pillars the JMM Governs: Atomicity, Visibility, and Ordering

Alongside happens-before, the JMM organises its guarantees around three distinct concepts. Understanding each separately prevents the common mistake of conflating them — a mistake that leads engineers to apply the wrong synchronisation primitive to the wrong problem.

Atomicity: the indivisibility of operations

An operation is atomic when it completes as an indivisible unit — no other thread can observe the operation in a partially completed state. The JMM provides atomicity guarantees for reads and writes to most primitive types: reads and writes to intbooleancharbyteshortfloat, and object references are guaranteed atomic on 32-bit and 64-bit JVMs (though for object references on 32-bit JVMs, references are only 32 bits, not 64, so no word-tearing occurs).

The notable exceptions are long and double. On 32-bit platforms, the JMM explicitly permits non-atomic reads and writes to these 64-bit types — a single write can be observed by another thread as two separate 32-bit writes, with an intermediate state visible between them. This is called word tearing. On 64-bit JVMs this is not an issue in practice, but the specification still allows it, which is why volatile is the technically correct way to declare a shared long or double field when atomicity matters.

Critically, atomicity at the level of individual reads and writes does not imply atomicity at the level of compound operations. A read-modify-write sequence — reading a value, incrementing it, writing it back — involves three separate actions. Even if each is individually atomic, the sequence as a whole is not: another thread can interleave between any two of those three steps. This is the reason i++ on a shared counter is not thread-safe without external synchronisation, even though it looks like a single statement.

Visibility: when writes travel across the memory hierarchy

Visibility is the question of whether a write performed by one thread will be observed by a read in another thread. Without any happens-before relationship between a write and a subsequent read, the JMM makes no guarantee. The write might be sitting in Core A’s private cache indefinitely. The JIT compiler might have cached the value in a register on Core B. The read might return a value that was current milliseconds ago.

This is where the practical implications become stark. Consider a service that uses a simple boolean flag to signal a background thread to stop processing. If that flag is written by one thread and read by another with no happens-before relationship between them, the JMM permits the reading thread to never observe the write — and the JIT compiler is within its rights to optimise the read into a register load that never re-checks main memory, effectively transforming the loop into an infinite one. This is not a theoretical worst case. It happens on real JVM implementations, on certain hardware architectures, under certain JIT compilation paths.

Without synchronisation, the JMM does not guarantee that a write will eventually become visible. It guarantees nothing at all. An unsynchronised read may see the initial value of a variable forever, depending on the JIT optimisation path taken. “Eventually consistent” is not the correct mental model — “undefined” is.

Ordering: the illusion of sequential execution

Ordering is the question of whether actions appear to other threads to have executed in the sequence they appear in source code. The JMM’s answer is nuanced: the JVM is permitted to reorder any actions between two threads as long as those actions are unordered by the happens-before relation, and as long as the reordering does not violate as-if-serial semantics within any individual thread.

This means that in the absence of synchronisation, there is no guarantee that the order in which a thread performs writes is the order in which those writes are observed by other threads. A thread that writes field A, then field B, then publishes a reference to the containing object, might be observed by another thread as having published the reference before completing the write to field A. The object appears to exist before it is finished being constructed. This is the unsafe publication problem, and it is the direct motivation for the special semantics of final fields introduced in JSR-133.

Final Fields and the Safe Publication Guarantee

One of JSR-133’s most practically important contributions is the formalisation of the final field guarantee. The rule, as it stands since Java 5, is elegant: if an object’s constructor completes normally, and the object reference is not allowed to escape the constructor before it completes, then any thread that subsequently reads that reference is guaranteed to see the fully initialised values of all final fields — without any additional synchronisation.

This guarantee is implemented at the JVM level through a memory barrier. After the constructor of any object with final fields completes, the JVM inserts a write barrier that ensures all writes to final fields are flushed before the reference to the object can be observed by any other thread. In effect, the JMM treats the constructor’s completion as establishing a happens-before edge: the writes to final fields happen-before any thread reads those fields through the published reference.

This is precisely what makes immutable objects inherently thread-safe. An object with all fields declared final, constructed normally, and published through any safe publication mechanism — a volatile field, a synchronized block, a final field of another safely constructed object, or a concurrency library mechanism — can be read by any thread without any further synchronisation. The JMM guarantees full visibility of its state.

The final field guarantee depends critically on the object reference not escaping the constructor. If the constructor registers this with a listener, publishes this to a static field, or starts a thread that captures this — all before the constructor completes — then another thread may observe the object in a partially initialised state, even with all fields final. This is called the constructor escape problem, and it is the one way to defeat the final field guarantee.

What the JMM Implies for Correctly Synchronized Programs

The JMM’s design goal, stated in the JSR-133 specification itself, is to ensure that correctly synchronised programs have sequentially consistent semantics — meaning they behave as if all operations execute in some total order, consistent with the program order within each thread. That is a strong and reassuring guarantee. Its precondition, however, is exacting: the program must have no data races.

A correctly synchronised program is one where every access to a shared mutable variable is either protected by a lock held by the accessing thread, marked volatile, or accessed through a class in java.util.concurrent that provides equivalent guarantees. A program that satisfies this constraint can be reasoned about as if it were single-threaded: all operations appear to execute atomically and in order, because the happens-before chain ensures that every relevant write has propagated before every relevant read.

The JMM’s treatment of incorrectly synchronised programs — those with data races — is deliberately minimal. The specification does not promise that they will crash, produce garbage, or behave in any particular way. It simply declares their behaviour undefined. This is the same design decision made by C++11’s memory model and by POSIX threads. It is the right decision because any attempt to define the behaviour of racy programs would either constrain JVM implementors to the point of preventing optimisations, or define semantics so complex that neither programmers nor tools could reason about them.

Synchronisation mechanismGuarantees atomicity?Guarantees visibility?Guarantees ordering?Transitivity?
synchronized blockCompound actionsYesYesYes
volatile fieldSingle read/write onlyYesYesYes (via chain)
final fieldPost-constructionYes (safely published)Write-once orderingScoped to construction
No synchronisationSingle primitives onlyNot guaranteedNot guaranteedNone
java.util.concurrent classesCompound, per-classYesYesYes

Why Developer Intuition Fails: Three Conceptual Traps

The JMM is difficult not because the rules are complicated — the five axioms are reasonably concise — but because the mental model most developers carry about how memory works is simply wrong. The following three conceptual traps account for the majority of concurrency bugs in production Java code.

Trap 1: Confusing “eventually” with “guaranteed”

The most common mental model is that unsynchronised writes will “eventually” propagate to other threads — perhaps with a small delay, but reliably. This model is wrong in a way that matters. The JMM does not guarantee eventual visibility. A JIT-compiled loop that reads a non-volatile flag may be optimised to load the flag’s value once into a register and never re-read it from memory. That optimisation is permanent for the lifetime of that compiled method. The write never arrives. There is no delay — there is no arrival at all.

Trap 2: Believing that program order implies execution order across threads

Developers naturally read concurrent code as if it executes in the order it appears on the page. In single-threaded code, that is exactly what happens (modulo reorderings that are invisible to the single thread). In concurrent code, it is not. A thread that performs writes A, B, and C may be observed by another thread as having performed them in the order C, A, B — or as having performed only C, with A and B not yet visible. Any order that does not violate happens-before is legal, and the absence of synchronisation means no order is constrained.

Trap 3: Treating volatile as sufficient for all concurrency needs

Understanding that volatile provides visibility leads many developers to believe it is sufficient for any shared variable. For simple flag-and-publish patterns, it often is. For compound operations — read a value, compute a new value based on it, write the result — it is not. volatile provides no atomicity for the read-modify-write sequence as a whole. Between the read and the write, another thread may have performed its own read-modify-write on the same variable, and both threads’ writes will be applied independently. The result is a lost update. This is the exact scenario that AtomicInteger and its siblings in java.util.concurrent.atomic were designed to address through hardware-level compare-and-swap operations.

These three traps share a common theme: they arise from reasoning about concurrent programs as if they were sequential ones. The JMM exists precisely because that reasoning is inadequate. The correct approach is to reason exclusively in terms of happens-before chains and to verify, for every pair of accesses to shared mutable state, that an appropriate happens-before relationship exists. No other form of reasoning about concurrent Java is reliable.

The JMM Today: How It Relates to Virtual Threads, VarHandle, and Modern Java

The JSR-133 memory model, introduced in Java 5, remains the governing specification for all Java concurrency today. Modern additions to the language and platform have not replaced it — they have extended it, filling gaps that the original specification left open.

VarHandle, introduced in Java 9 through JEP 193, extended the JMM’s synchronisation mechanisms to offer fine-grained access modes. Where volatile provides a single, fixed level of ordering (sequential consistency for that variable), VarHandle exposes a spectrum: plain access (no ordering guarantee), opaque (ordering only within the current thread), acquire and release (one-directional memory barrier), and volatile (the full sequential-consistency guarantee of a traditional volatile field). This gives library and framework authors access to the same granularity of memory barriers that native code has always had, without requiring JNI.

Virtual threads, introduced in Java 21, operate under exactly the same memory model as platform threads. A virtual thread is still a java.lang.Thread. It still participates in the happens-before graph via the same five axioms. Thread.start() and Thread.join() establish happens-before edges for virtual threads precisely as they do for platform threads. The structured concurrency APIs in newer JDK versions — StructuredTaskScope — build their safety guarantees on top of these same JMM rules, with the fork and join operations creating the happens-before relationships that guarantee child task writes are visible to the parent after joining.

Scoped Values, finalised in Java 25 through JEP 506, are not a change to the JMM but a response to one of its practical consequences. Because virtual threads are never reused, ThreadLocal variables become per-request rather than per-pool-thread, with the memory implications discussed elsewhere. Scoped Values sidestep this by binding data to a lexical scope rather than a thread — and because they are immutable and scope-bounded, they require no cross-thread synchronisation at all. Their safety is guaranteed by structure, not by happens-before reasoning.

What We Learned

We began by anchoring the Java Memory Model in physical reality: the multi-level cache hierarchy and instruction reordering engines of modern CPUs mean that multi-core memory is not sequentially consistent by default, and any memory model that ignored this would either be incorrect or impractically slow. We then traced the history of Java’s original memory model and its two core failures — over-specifying volatile in a way that was expensive to implement, and under-specifying final in a way that was dangerous to use — leading to the JSR-133 revision in Java 5 that defines the model we work with today. The heart of the article was the five axioms of the happens-before relation: program order, monitor lock, volatile variable, thread start, and thread join — plus transitivity, which chains them into a complete reasoning system.

We examined how these axioms organise around the three pillars the JMM governs: atomicity (the indivisibility of operations, with special attention to word tearing in 64-bit types), visibility (the guarantee that writes propagate across the cache hierarchy, which is never automatic without a happens-before edge), and ordering (the constraint that operations appear sequentially, which the JVM is free to violate in the absence of synchronisation). We then covered the final field and safe publication guarantee introduced by JSR-133, the three conceptual traps that cause most production concurrency bugs, and how the model extends through VarHandle, virtual threads, and Scoped Values into the modern Java landscape.

Eleftheria Drosopoulou

Eleftheria is an Experienced Business Analyst with a robust background in the computer software industry. Proficient in Computer Software Training, Digital Marketing, HTML Scripting, and Microsoft Office, they bring a wealth of technical skills to the table. Additionally, she has a love for writing articles on various tech subjects, showcasing a talent for translating complex concepts into accessible content.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Oldest
Newest Most Voted
Back to top button