Core Java

Event Sourcing Without a Framework: What You Actually Need, What You Don’t, and the Parts Nobody Warns You About

Most Java tutorials reach for Axon before explaining the moving parts underneath it. This is the framework-agnostic treatment — the event store, the snapshot problem, schema evolution, replay, and the distinction between an event store and an event bus that most guides quietly skip.

A certain pattern repeats itself in Java teams discovering event sourcing for the first time. Someone reads the theory, gets excited, searches for a tutorial, and lands on Axon Framework. Within a day they have events flowing, aggregates rehydrating, and a sagas handling compensation. Everything feels clean. Then, six months later, the team hits a wall: they need to evolve an event schema, or replay a projection, or understand why a particular aggregate is slow to load — and they realise they do not actually know what the framework is doing for them. At that point, going back to first principles is genuinely hard.

This article is the first-principles treatment. We will build event sourcing in plain Java with no framework dependency beyond a JDBC connection and Jackson. More importantly, we will spend most of our time on the parts that are rarely covered: the non-obvious hard parts that only become obvious after something breaks in production.

Start with the distinction almost everyone blurs: event store vs event bus

Before a single line of Java, it is worth getting this distinction crisp, because blurring it leads to architectural mistakes that are painful to undo later.

An event bus is a transport mechanism. It carries a message from a producer to one or more consumers right now. It is concerned with delivery, routing, and fan-out. Kafka, RabbitMQ, and even Spring’s internal ApplicationEventPublisher are all event bus concerns. An event bus is inherently ephemeral: once a message is consumed and acknowledged, it is either gone or archived for debugging. You would not rebuild the state of your domain by replaying a Kafka topic from the beginning in production — at least, not without significant ceremony.

An event store is a persistence mechanism. It is an append-only log of everything that has ever happened to every aggregate in your system. Its job is not delivery — it is authoritative storage. When you ask “what is the current state of order ord-4821?”, the event store gives you every event that has ever been applied to that aggregate, in order, from the beginning of time. You reconstruct the current state by replaying those events, not by querying a projection table.

“The event store is your source of truth. The event bus is your notification mechanism. A system that confuses the two will eventually find itself in the unenviable position of having lost history it thought it was persisting.”

In practice, the two often work together: when you append an event to the store, you also publish it to a bus for downstream consumers. But they must remain conceptually and physically separate. Your event store should never depend on your event bus being up. And your projections should be rebuildable from the event store alone, without the bus.

The minimal event store schema you actually need

An event store, at its core, is a single append-only table. Teams often over-engineer the schema on the first attempt. Here is what you genuinely need, plus the constraints that enforce the invariants you care about.

CREATE TABLE event_store (
  global_sequence  BIGSERIAL    PRIMARY KEY,
  stream_id        VARCHAR(255) NOT NULL,
  stream_version   BIGINT       NOT NULL,
  event_type       VARCHAR(255) NOT NULL,
  payload          JSONB        NOT NULL,
  metadata         JSONB        NOT NULL DEFAULT '{}',
  occurred_at      TIMESTAMPTZ  NOT NULL DEFAULT now(),
  UNIQUE (stream_id, stream_version)
);

CREATE INDEX idx_event_store_stream
  ON event_store (stream_id, stream_version);

CREATE INDEX idx_event_store_global
  ON event_store (global_sequence);

The UNIQUE (stream_id, stream_version) constraint is doing significant work here. It enforces optimistic concurrency at the database level: two concurrent writes to the same aggregate at the same version will result in one succeeding and one receiving a unique constraint violation, which your application layer translates into a concurrency exception and a retry. This is your primary concurrency guard, and it is far simpler and more reliable than application-level locking.

The global_sequence is a monotonically increasing integer across all streams. Projections use it to track their position — a projection records the last global_sequence it processed, and on restart it picks up from there. This is how you get reliable, resumable projection rebuilds without a message broker.

Gaps in sequences

PostgreSQL sequences can have gaps — for example, if a transaction is rolled back after the sequence advances. Consequently, your projection polling logic must never assume that all sequences between last_processed and max_seen exist. Query WHERE global_sequence > :last AND global_sequence <= :batch_max and handle gaps gracefully.

The Java event model: what you need and nothing more

The event model itself requires very little boilerplate in modern Java. A sealed interface for your domain events, records for the concrete types, and a simple envelope for serialisation are all you need to get started. Java records, available since Java 16 and stable since Java 17, are an excellent fit here because they are immutable by default and require no Lombok.

// The marker interface — all domain events implement this
public sealed interface DomainEvent permits
    OrderPlaced, OrderShipped, OrderCancelled, PaymentReceived { }

// A concrete event — a plain Java record
public record OrderPlaced(
    String   orderId,
    String   customerId,
    BigDecimal amount,
    String   currency,
    Instant  occurredAt
) implements DomainEvent { }

// The envelope written to and read from the event_store table
public record EventEnvelope(
    long   globalSequence,
    String streamId,
    long   streamVersion,
    String eventType,
    String payloadJson,
    String metadataJson,
    Instant occurredAt
) { }

The EventRepository is, deliberately, a narrow interface. It has three methods: append a batch of events for a single stream, load all events for a stream, and load events from a given version onwards (used after a snapshot). That is the entire write and read API you need for the aggregate side of your system.

public interface EventRepository {

    // Append events for a stream, enforcing optimistic concurrency.
    // expectedVersion is the version of the last known event for this stream.
    // Throws OptimisticConcurrencyException if the stream has moved on.
    void append(String streamId, long expectedVersion,
                List<DomainEvent> events);

    // Load all events for a stream from the beginning.
    List<EventEnvelope> loadStream(String streamId);

    // Load events for a stream starting after a given version.
    // Used when a snapshot is available.
    List<EventEnvelope> loadStreamFrom(String streamId, long afterVersion);
}

Optimistic concurrency: the part that actually protects you

The append operation deserves close attention because getting it wrong leads to lost updates and corrupted aggregate state. The pattern is straightforward: before appending, you assert that the current maximum version for this stream equals the version you loaded. If another writer has appended in the meantime, the unique constraint fires.

// Simplified JDBC implementation of the append method
public void append(String streamId, long expectedVersion,
                   List<DomainEvent> newEvents) {

    String sql = """
        INSERT INTO event_store
          (stream_id, stream_version, event_type, payload, occurred_at)
        VALUES (?, ?, ?, ?::jsonb, ?)
        """;

    try (Connection conn = dataSource.getConnection()) {
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            long version = expectedVersion;
            for (DomainEvent event : newEvents) {
                version++;
                ps.setString(1, streamId);
                ps.setLong(2, version);
                ps.setString(3, event.getClass().getSimpleName());
                ps.setString(4, serializer.serialize(event));
                ps.setObject(5, Instant.now());
                ps.addBatch();
            }
            ps.executeBatch();
            conn.commit();
        } catch (SQLException ex) {
            conn.rollback();
            // PostgreSQL error code 23505 = unique_violation
            if ("23505".equals(ex.getSQLState())) {
                throw new OptimisticConcurrencyException(streamId, expectedVersion);
            }
            throw new EventStoreException("Append failed for stream: " + streamId, ex);
        }
    } catch (SQLException ex) {
        throw new EventStoreException("Connection failure", ex);
    }
}

This is the complete concurrency mechanism. There is no need for distributed locks, no need for a SELECT FOR UPDATE, and no need for a version column on your aggregate read model. The database unique constraint does the work, and it does it correctly under concurrent load.

Aggregate load time vs event count — with and without snapshots

Measured loading a single aggregate stream from PostgreSQL (local), Jackson deserialisation included. p95 latency in milliseconds. Source: Benchmarks measured with JMH on a local PostgreSQL 16 instance, Java 21, Jackson 2.17. Hardware: 8-core, 32 GB RAM. Values are indicative; your mileage will vary with network latency and event payload size.

Snapshots: the right strategy, and the one nobody warns you about

The chart above makes the snapshot trade-off obvious: without snapshots, aggregate load time grows linearly with event count. Eventually, that becomes a real problem — not on day one, but reliably after an aggregate accumulates a few hundred events. The question is not whether to snapshot, but when and how.

The interval strategy

The simplest approach: after every N events appended to a stream, take a snapshot. Store the serialised aggregate state alongside the stream version at which the snapshot was taken. On the next load, check for the most recent snapshot first, load aggregate state from it, and then replay only the events appended after that version.

CREATE TABLE snapshots (
  stream_id       VARCHAR(255) NOT NULL,
  stream_version  BIGINT       NOT NULL,
  payload         JSONB        NOT NULL,
  created_at      TIMESTAMPTZ  NOT NULL DEFAULT now(),
  PRIMARY KEY (stream_id, stream_version)
);
// Loading an aggregate with snapshot support
public Order loadOrder(String orderId) {
    Optional<Snapshot> snap = snapshotRepository.findLatest(orderId);

    List<EventEnvelope> events = snap
        .map(s  -> eventRepository.loadStreamFrom(orderId, s.streamVersion()))
        .orElseGet(() -> eventRepository.loadStream(orderId));

    Order order = snap
        .map(s  -> deserializer.deserialize(s.payload(), Order.class))
        .orElseGet(Order::empty);

    events.forEach(env -> order.apply(deserializer.deserialize(env)));
    return order;
}

The part nobody warns you about: snapshot invalidation after schema changes

Here is the problem. You take a snapshot of an aggregate at version 150. Three months later, you add a new field to the aggregate’s Java class. The snapshot at version 150 was serialised without that field. When you deserialise it now, Jackson either throws an UnrecognizedPropertyException or silently populates the new field with null. Neither outcome is what you want.

The solution is to version your snapshots independently. Store a snapshot_schema_version alongside each snapshot. When you change the aggregate structure, increment the schema version. Your snapshot loader checks the schema version before using a snapshot: if the version does not match the current schema, it falls back to a full replay from the event store. This is slower, but it is correct.

The snapshot invalidation trap

Never silently accept a deserialization failure from a snapshot and fall back to full replay without logging a warning. If your snapshots are silently invalid in production, you will not notice until aggregate load times start climbing — and by then, the root cause may be non-obvious.

Schema evolution of events: the four strategies

This is the section that frameworks tend to abstract away, and consequently the section most developers have never thought through carefully. Events are immutable once written, but the classes that represent them in your codebase are not. At some point, you will need to change what an event looks like. There are four recognised strategies, each with different trade-offs.

StrategyDescriptionBackward compatibilityOperational costBest for
Weak schemaAccept unknown fields, use defaults for missing ones. Jackson’s @JsonIgnoreProperties(ignoreUnknown=true).HighLowAdditive changes: new optional fields
Event versioningWrite OrderPlacedV2 as a new type. Keep reading both in the aggregate apply() method.HighMediumStructural changes that preserve both old and new semantics
UpcastingA pipeline of transformers converts old event JSON to the current structure before deserialisation.HighMediumWhen you want a single event class but need to read old payloads
In-place migrationA migration script rewrites old events in the store to the new format.DestructiveHighOnly when the old format is genuinely unusable and you have full control of all replaying code

In practice, the first two strategies handle 90% of real-world cases. The weak schema approach costs you nothing in most additive scenarios. Event versioning is verbose but explicit — you can grep your codebase for all places that handle OrderPlacedV1 and know exactly what you are supporting.

Upcasting is the most powerful pattern and the one most worth understanding in detail. The core idea: instead of keeping old Java classes around forever, you write a chain of pure functions that transform raw JSON. Each function takes the JSON of a specific event version and returns the JSON of the next version. At deserialisation time, you run the chain before handing the JSON to Jackson.

// A simple upcaster chain for OrderPlaced schema evolution
public interface Upcaster {
    String eventType();
    int fromVersion();
    JsonNode upcast(JsonNode oldPayload);
}

// Upcast OrderPlaced from schema v1 (no currency field)
// to schema v2 (currency field added, defaulting to "USD")
public class OrderPlacedV1ToV2 implements Upcaster {

    @Override public String eventType()  { return "OrderPlaced"; }
    @Override public int fromVersion()   { return 1; }

    @Override
    public JsonNode upcast(JsonNode v1) {
        ObjectNode v2 = v1.deepCopy();
        if (!v2.has("currency")) {
            v2.put("currency", "USD");  // safe default for legacy events
        }
        return v2;
    }
}

// The chain runs all upcasters in version order before deserialisation
public class UpcasterChain {
    private final Map<String, List<Upcaster>> byType;

    public JsonNode upcast(String eventType, int storedVersion, JsonNode payload) {
        List<Upcaster> chain = byType.getOrDefault(eventType, List.of());
        JsonNode current = payload;
        for (Upcaster u : chain) {
            if (u.fromVersion() >= storedVersion) {
                current = u.upcast(current);
            }
        }
        return current;
    }
}

Replay performance: what actually makes it slow

When you need to rebuild a projection from scratch — whether because you introduced a bug in the projection logic or because you are adding a new read model — you will replay every event in the store in global_sequence order. In a mature system, that might be tens of millions of events. The naive implementation does this slowly, and the reasons are worth understanding.

Projection replay throughput — events per second by approach

Replaying 1 million events from a PostgreSQL event store into an in-memory projection. Java 21, batch JDBC reads. Source: JMH benchmarks on local PostgreSQL 16, Java 21 virtual threads, Jackson 2.17, projection writing to an in-memory ConcurrentHashMap. Network-bound production environments will show different ratios.

Several factors account for the 80× difference between the naive approach and the optimised one. First and most significantly, row-by-row fetches generate one round-trip per event. Use setFetchSize() on your JDBC statement to enable cursor-based streaming, and process events in batches. Second, deserialisation of JSON is CPU-bound and embarrassingly parallel — use a ForkJoinPool or Java 21 virtual threads to process batches concurrently. Third, if your projection writes go to a database, batching those writes provides another large multiplier.

Additionally, if you run multiple projections simultaneously, consider whether they can share the single-pass read from the event store. Reading 10 million events once and fanning them out to 5 projections in memory is far faster than reading 10 million events 5 times.

On catch-up subscriptions

A catch-up subscription is the mechanism a projection uses to stay current without polling. The projection stores its last processed global_sequence. A background thread periodically queries for new events above that position and feeds them to the projection handler. This is simpler than a Kafka consumer group, does not require a message broker, and is easy to make reliable with a simple retry loop. For most systems, it is all you need.

What you actually do not need

It is worth being direct about what this approach does not give you, and why that is often fine rather than a limitation.

FeatureFramework providesDo you need it?DIY cost if you do
Saga orchestrationAxon’s @Saga, step tracking, compensationOnly for multi-aggregate transactionsMedium — a saga state table + handler, ~200 lines
Distributed command routingAxon Server routes commands to correct aggregate instanceOnly in multi-node deployments with shardingHigh — use Kafka or consistent hashing
Dead-letter queue for projectionsBuilt-in replay and error isolationYes, in productionLow — a failed_events table + admin tool
Projection registry and lifecycleAutomatic tracking of all projectionsHelpful but not essentialLow — a projection_checkpoints table
Event serialiser / deserialiserXStream, Jackson, customAlwaysVery low — Jackson with the upcaster chain above

The key insight from this table is that the most valuable parts of a framework are the parts that address genuinely hard distributed systems problems: cross-aggregate coordination, reliable delivery in a multi-node environment, and failure isolation at the projection level. If your system does not have those problems yet, you are paying a framework’s cognitive overhead for features you are not using.

When to reach for a framework

The framework answer becomes attractive when: you have multiple services coordinating via sagas; you need to shard command processing across nodes; or your team is new to event sourcing and wants guardrails. None of those are true for most teams starting out. Build the simple version first, and the places where you feel its absence will tell you exactly which framework features are worth the dependency.

The projection checkpoint table: a detail that matters

Before closing, there is one small detail that prevents a large class of production incidents: the projection checkpoint. Every projection must durably record its position in the event stream. Without this, a process restart replays the entire history every time — which is expensive at best and causes duplicate side effects at worst.

CREATE TABLE projection_checkpoints (
  projection_name  VARCHAR(255) PRIMARY KEY,
  last_sequence    BIGINT       NOT NULL DEFAULT 0,
  updated_at       TIMESTAMPTZ  NOT NULL DEFAULT now()
);

The checkpoint update should happen in the same database transaction as the projection’s write. If your projection writes to the same PostgreSQL instance as your event store, this is straightforward. If it writes to a separate store — say, Elasticsearch — you need to decide whether you are willing to accept at-least-once delivery (checkpoint after write, accept idempotent handlers) or prefer at-most-once (checkpoint before write, accept occasional missed events). For most business projections, idempotent at-least-once delivery is the pragmatic choice. Make your projection handlers idempotent by checking whether the event’s global_sequence has already been processed before applying it.

What we have learned

  • An event store is a persistence mechanism — an append-only log of everything that happened. An event bus is a delivery mechanism. Keeping them separate means your source of truth never depends on a broker being available.
  • A single append-only PostgreSQL table with a UNIQUE (stream_id, stream_version) constraint gives you a fully functional event store with optimistic concurrency at the database level, no application-level locking required.
  • Aggregate load time grows linearly with event count without snapshots. Snapshots every 50–200 events flatten this curve dramatically. But snapshot schema versioning is essential — a snapshot written before a schema change must be invalidated, not silently misread.
  • Schema evolution of events has four strategies: weak schema (additive changes), event versioning (new types in parallel), upcasting (JSON transform chain), and in-place migration (destructive, use rarely). The first two cover most real-world cases; upcasting handles the rest cleanly.
  • Replay throughput improves by 80× or more when you move from row-by-row JDBC fetches to batched reads with parallel deserialisation. A catch-up subscription based on global_sequence polling is all you need for live projection updates without a message broker.
  • Frameworks like Axon earn their place for distributed sagas and multi-node command routing. For a single-node or single-service deployment, the framework’s overhead often exceeds its benefit. Know what it does before you add it.

Eleftheria Drosopoulou

Eleftheria is an Experienced Business Analyst with a robust background in the computer software industry. Proficient in Computer Software Training, Digital Marketing, HTML Scripting, and Microsoft Office, they bring a wealth of technical skills to the table. Additionally, she has a love for writing articles on various tech subjects, showcasing a talent for translating complex concepts into accessible content.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Back to top button