
Why Your Integration Tests Are Slower Than They Should Be: The Spring Context Cache You Don’t Know You’re Misusing

Every JCG article recommends @SpringBootTest for integration testing. Almost none explain the context cache model underneath it — how a single misplaced @DirtiesContext, a scattered @MockBean, or a slightly different property override silently forces Spring to rebuild the entire ApplicationContext from scratch. The result is a test suite that should run in 90 seconds and instead takes 12 minutes. This article explains exactly why that happens and, more importantly, how to fix it.

The problem nobody names until CI hurts

Slow test suites follow a predictable pattern. The pipeline starts fast, developers add tests over several months, and somewhere around the 200-test mark the CI job that used to finish in two minutes now takes fifteen. Nobody introduced a big change. Nobody wrote a slow test. The time accumulated through dozens of individually invisible decisions, each of which caused Spring to spin up one extra ApplicationContext.

Spring’s TestContext Framework has a context cache that, when it works, is genuinely powerful. The first test that needs a context builds it — which on a real microservice with Testcontainers and Flyway migrations can easily take 15 seconds. Every subsequent test that matches the same configuration reuses that context at near-zero cost. The entire premise of @SpringBootTest performance is that this cache is hit repeatedly. The problem is that the cache key is determined by a precise combination of annotations and configuration, and a surprising number of common testing patterns break it in ways that are completely silent at runtime.

How the context cache key actually works

Spring stores loaded ApplicationContexts in a static cache (the ContextCache, by default an LRU cache capped at 32 entries) keyed by a MergedContextConfiguration object. That object is not a hash in itself; its equals() and hashCode() cover everything Spring considers relevant to the identity of a context. If two test classes produce equal MergedContextConfiguration objects, they share a context. If they differ by even one field, Spring creates an entirely separate context for the second class, cold-starts it, and holds both in the cache simultaneously.

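According to the official Spring Framework documentation, the cache key is built from the following parameters gathered from the test class and all its superclasses (exact names vary slightly between Spring versions):

locations (from @ContextConfiguration)
classes (from @ContextConfiguration)
contextInitializerClasses (from @ContextConfiguration)
contextCustomizers (from ContextCustomizerFactory; this includes @DynamicPropertySource methods and Spring Boot testing features such as @MockBean and @SpyBean)
contextLoader (from @ContextConfiguration)
parent (from @ContextHierarchy)
activeProfiles (from @ActiveProfiles)
propertySourceLocations (from @TestPropertySource)
propertySourceProperties (from @TestPropertySource)
resourceBasePath (from @WebAppConfiguration)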

The practical implication is worth stating clearly: two @SpringBootTest classes are only guaranteed to share a context if every one of those parameters is identical. That is a surprisingly high bar to clear accidentally, which is why most teams end up with more contexts than they realise.
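To make the keying rule concrete, here is a minimal sketch in plain Java. It is a toy model, not Spring's implementation: CacheKey and FakeContext are hypothetical stand-ins for MergedContextConfiguration and ApplicationContext, and only four of the key fields are modelled.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ContextCacheSketch {

    // Hypothetical stand-in for MergedContextConfiguration: a record gets
    // field-by-field equals() and hashCode() for free.
    record CacheKey(List<String> configClasses,
                    List<String> activeProfiles,
                    Set<String> inlineProperties,
                    Set<String> customizers) {}

    // Hypothetical stand-in for an expensive ApplicationContext.
    record FakeContext(int id) {}

    private final Map<CacheKey, FakeContext> cache = new LinkedHashMap<>();
    private int hitCount;
    private int missCount;

    // Returns the cached context for an equal key, or "cold-starts" a new one.
    FakeContext load(CacheKey key) {
        FakeContext cached = cache.get(key);
        if (cached != null) {
            hitCount++;
            return cached;
        }
        missCount++;
        FakeContext created = new FakeContext(missCount);
        cache.put(key, created);
        return created;
    }

    int size() { return cache.size(); }
    int hitCount() { return hitCount; }
    int missCount() { return missCount; }

    public static void main(String[] args) {
        ContextCacheSketch cache = new ContextCacheSketch();

        CacheKey base = new CacheKey(List.of("App"), List.of("test"), Set.of(), Set.of());
        CacheKey same = new CacheKey(List.of("App"), List.of("test"), Set.of(), Set.of());
        CacheKey oneExtraProperty = new CacheKey(
            List.of("App"), List.of("test"), Set.of("feature.payments.v2=false"), Set.of());

        cache.load(base);             // miss: the first context is built
        cache.load(same);             // hit: equal key, same context
        cache.load(oneExtraProperty); // miss: a single field differs

        // prints: size=2 hits=1 misses=2
        System.out.println("size=" + cache.size()
            + " hits=" + cache.hitCount()
            + " misses=" + cache.missCount());
    }
}
```

The point of the sketch is that nothing warns you about the second miss: the lookup simply fails to find an equal key and builds another context.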

Cache hit vs. miss — example 20-class suite with scattered configuration

Illustrative breakdown of how 20 integration test classes map to ApplicationContext instances under three common scenarios: no discipline, partial consolidation, and a shared base class strategy. Each cold-start costs ~15s; reuse costs ~0s.

Cache-killer #1: @DirtiesContext on a base class

@DirtiesContext does exactly what its name suggests: it marks the context as dirty, removes it from the cache, and forces a fresh startup for the next test that needs it. Used surgically on a single test class that genuinely modifies shared state, this is the correct tool. The problem, as Baeldung’s 2025 integration test guide describes, is what happens when it lands on a parent class.

If @DirtiesContext is placed on an abstract base class that every integration test inherits from, the result is catastrophic: every test class evicts the context when it finishes, and the next class that needs the same configuration must restart it from scratch. In a suite of 50 test classes that all share one configuration, you would expect 1 cold start followed by 49 cache hits, roughly 15 seconds of context startup in total. With @DirtiesContext on the base class, you get 50 cold starts, roughly 750 seconds (12.5 minutes). That is the entire difference between a 2-minute CI job and a 15-minute one.
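The arithmetic can be checked with a small simulation. This is a sketch under simplifying assumptions: all 50 test classes inherit one configuration, a cold start costs a flat 15 seconds, and eviction happens after each class (the default class mode of @DirtiesContext). The TestClass record and the "shared" key are hypothetical names used only for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.IntStream;

public class DirtiesContextCost {

    // Hypothetical description of one test class: the configuration it needs
    // and whether it evicts that context when it finishes.
    record TestClass(String configKey, boolean dirtiesContext) {}

    // Runs the classes in order and counts how often a context must be built.
    static int countColdStarts(List<TestClass> suite) {
        Set<String> cached = new HashSet<>();
        int coldStarts = 0;
        for (TestClass testClass : suite) {
            if (cached.add(testClass.configKey())) {
                coldStarts++; // key was not cached, so the context is built
            }
            if (testClass.dirtiesContext()) {
                cached.remove(testClass.configKey()); // evicted after the class
            }
        }
        return coldStarts;
    }

    // Builds a suite in which every class shares the same configuration.
    static List<TestClass> suiteOf(int classes, boolean dirtiesContext) {
        return IntStream.range(0, classes)
            .mapToObj(i -> new TestClass("shared", dirtiesContext))
            .toList();
    }

    public static void main(String[] args) {
        int secondsPerColdStart = 15;
        int clean = countColdStarts(suiteOf(50, false));
        int dirtied = countColdStarts(suiteOf(50, true));

        // prints: without @DirtiesContext: 1 cold start(s), 15s
        System.out.println("without @DirtiesContext: " + clean
            + " cold start(s), " + clean * secondsPerColdStart + "s");
        // prints: with @DirtiesContext on base: 50 cold start(s), 750s
        System.out.println("with @DirtiesContext on base: " + dirtied
            + " cold start(s), " + dirtied * secondsPerColdStart + "s");
    }
}
```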

The copy-paste antipattern: @DirtiesContext frequently spreads through codebases via copy-paste. A developer hits a test isolation problem, adds it as a quick fix, commits. Three months later it has been copied into 8 different test base classes by engineers who saw it and assumed it was intentional best practice. It is now effectively disabling context caching for the entire suite.

What to do instead

The correct alternative is to reset state in the data layer, not by rebuilding the context. For database state, @Transactional on the test class combined with a rollback strategy costs essentially nothing compared to a context restart. For external state (a Redis key, a Kafka offset), an explicit @BeforeEach cleanup that deletes only the affected records is similarly cheap and achieves the same isolation guarantee.

Reserve @DirtiesContext for the genuinely rare scenario where the test modifies a bean definition or a static singleton in a way that cannot be undone — for example, a test that replaces a scheduled task configuration at the BeanDefinitionRegistry level. For everything else, data-layer cleanup is faster and does not punish every other test in the suite.

Cache-killer #2: scattered @MockBean declarations

@MockBean (and its Spring Boot 3.4+ replacement @MockitoBean) is where most teams inadvertently generate a combinatorial explosion of distinct contexts without realising it. The mechanism is straightforward: because @MockBean replaces a real bean with a Mockito mock, each unique set of mocked beans contributes to the cache key. Two test classes that mock the same beans produce the same key and share a context. Two test classes that mock different beans, or one that mocks an additional bean, produce different keys and each require their own context.

As explained by w3tutorials: “if the original context has a UserRepository bean, @MockBean(UserRepository.class) replaces it with a mock. This changes the bean definitions in the context. Since the context key includes bean definitions, modifying them with @MockBean creates a new context key.” What seems minor for one test becomes a massive overhead across a suite.

The organic growth problem: In a suite that has grown organically, each test class that needed to mock one service added @MockBean directly to that class. Over 30 test classes, you might end up with 12 distinct combinations of mocked beans (and therefore 12 distinct contexts) even though every test is loading the same application. This is the most common cause of “why does CI take 20 minutes” that we have seen in teams adopting Spring Boot testing at scale.
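A rough way to estimate the damage is to count the distinct mock sets in a suite. The sketch below does that in plain Java, assuming each distinct set of mocked types maps to one context. That is a simplification: the real customiser compares more than the bean type (mock settings and qualifiers, for example), so treat the result as a lower bound. RefundFlowTest is a hypothetical fourth class added for illustration.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class MockSetContexts {

    // Each distinct set of mocked bean types stands for one distinct context.
    static int distinctContexts(Map<String, Set<String>> mocksByTestClass) {
        return new HashSet<>(mocksByTestClass.values()).size();
    }

    public static void main(String[] args) {
        Map<String, Set<String>> scattered = new LinkedHashMap<>();
        scattered.put("OrderServiceTest", Set.of("EmailService"));
        scattered.put("InvoiceServiceTest", Set.of("PaymentService"));
        scattered.put("CheckoutFlowTest", Set.of("EmailService", "PaymentService"));
        // Hypothetical class: same mocks as CheckoutFlowTest, declared in another order.
        scattered.put("RefundFlowTest", Set.of("PaymentService", "EmailService"));

        // One shared base class means every test class sees the same mock set.
        Set<String> shared = Set.of("EmailService", "PaymentService", "NotificationService");
        Map<String, Set<String>> consolidated = new LinkedHashMap<>();
        for (String testClass : scattered.keySet()) {
            consolidated.put(testClass, shared);
        }

        System.out.println("scattered: " + distinctContexts(scattered) + " contexts");       // 3
        System.out.println("consolidated: " + distinctContexts(consolidated) + " contexts"); // 1
    }
}
```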

The problem — @MockBean in individual test classes creates separate contexts

// Context A — EmailService mocked, PaymentService real
@SpringBootTest
class OrderServiceTest {
    @MockBean EmailService emailService;
    // ...
}

// Context B — PaymentService mocked, EmailService real
// Different @MockBean set → Spring builds an entirely new context
@SpringBootTest
class InvoiceServiceTest {
    @MockBean PaymentService paymentService;
    // ...
}

// Context C — both mocked → yet another new context
@SpringBootTest
class CheckoutFlowTest {
    @MockBean EmailService emailService;
    @MockBean PaymentService paymentService;
    // ...
}

The fix — consolidate all @MockBean declarations into a single base class

// One base class declares ALL mocks the suite ever needs.
// Every subclass inherits the same MergedContextConfiguration → one shared context.
@SpringBootTest
@Transactional
abstract class IntegrationTestBase {
    @MockBean EmailService   emailService;
    @MockBean PaymentService paymentService;
    @MockBean NotificationService notificationService;
    // Declare every mock the whole suite needs — even if a given test does
    // not need all of them. The cost of an unused mock is negligible;
    // the cost of a new context is ~15 seconds.
}

// Subclasses inherit the base config → cache hit every time
class OrderServiceTest extends IntegrationTestBase {
    // Use only emailService mock — paymentService mock is present but ignored
}

class CheckoutFlowTest extends IntegrationTestBase {
    // Uses all three mocks — same context as OrderServiceTest
}

The trade-off here is explicit and acceptable: the shared context carries a handful of mocks that some tests do not need. Because each mock is a lightweight Mockito proxy and not a real service instance, this adds no meaningful overhead to context startup. What it buys is a single context for the entire suite, rather than one per mock-configuration permutation.

Cache-killer #3: per-test property overrides

Property overrides are the subtlest of the three main cache-breakers, because the intent is often performance optimisation. A developer wants to disable a feature flag in one test class, so they add @TestPropertySource(properties = "feature.payments.v2=false"). Another adds @SpringBootTest(properties = "server.port=0") directly. A third uses @ActiveProfiles("integration") while the rest use @ActiveProfiles("test").

Each of these is a different cache key. According to rieckpil.de’s best-practices guide: “Each unique profile combination creates a different cache key, preventing context reuse. Instead of toggling functionality with profiles, prefer configuration properties that can be overridden without changing the cache key.” The advice is precisely backwards from what most developers’ instincts suggest: fewer @ActiveProfiles declarations means better cache performance, not worse.

Annotation / pattern | Breaks cache? | Safer alternative
--- | --- | ---
@ActiveProfiles("test") on all tests | No: consistent | Keep one shared profile across the entire suite
@ActiveProfiles("integration") on some, "test" on others | Yes: different key | Consolidate to a single profile; use properties for feature toggles
@TestPropertySource(properties = "x=y") unique per class | Yes: different key | Move shared overrides to application-test.properties
@SpringBootTest(webEnvironment = RANDOM_PORT) everywhere | No: consistent | Fine as-is; just keep it uniform
@SpringBootTest(webEnvironment = MOCK) in some, RANDOM_PORT in others | Yes: different key | Pick one web environment for the whole suite; use @WebMvcTest for slice tests
@DynamicPropertySource with a static Testcontainers container | No: one static method | Declare the container as a static field in the base class (see below)
@DynamicPropertySource redeclared per test class | Yes: different customiser | Move to a shared base class or DynamicPropertyRegistrar bean

Cache-killer #4: Testcontainers wired per test class

Testcontainers is excellent. The default pattern for using it, however — declaring a @Container field and a @DynamicPropertySource method in each test class — silently defeats the context cache. The reason is that @DynamicPropertySource methods are treated as context customisers. Each distinct method reference produces a different customiser entry in the cache key. Two test classes that declare their own @DynamicPropertySource methods (even with identical bodies) will produce different keys and each get their own context — as well as their own container, doubling Docker overhead.

The well-tested solution, endorsed in Baeldung’s optimisation guide and used extensively in practice, is to declare both the container and the @DynamicPropertySource method as static members of the shared base class. A static container is started once for the entire JVM process. The static @DynamicPropertySource method produces the same customiser object regardless of which subclass triggers it, making the cache key stable across all inheriting tests.
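Why an inherited static method keeps the key stable while a redeclared one does not can be shown with plain reflection. The sketch assumes that what matters for the cache key is whether the collected methods compare equal; the classes and the method name "configure" are hypothetical stand-ins for test classes and their @DynamicPropertySource methods, and no Spring is involved.

```java
import java.lang.reflect.Method;
import java.util.HashSet;
import java.util.Set;

public class CustomizerIdentitySketch {

    // Hypothetical test classes. "configure" plays the role of a
    // @DynamicPropertySource method.
    static class SharedBase { static void configure() {} }
    static class TestA extends SharedBase {}
    static class TestB extends SharedBase {}

    // Same method name and body, but declared separately in each class.
    static class TestC { static void configure() {} }
    static class TestD { static void configure() {} }

    // Collects every method named "configure" in the class hierarchy.
    static Set<Method> configureMethods(Class<?> testClass) {
        Set<Method> found = new HashSet<>();
        for (Class<?> c = testClass; c != null; c = c.getSuperclass()) {
            for (Method m : c.getDeclaredMethods()) {
                if (m.getName().equals("configure")) {
                    found.add(m);
                }
            }
        }
        return found;
    }

    // Method.equals() compares declaring class, name, and signature, so
    // identical bodies in different classes are still different methods.
    static boolean sameKeyPart(Class<?> first, Class<?> second) {
        return configureMethods(first).equals(configureMethods(second));
    }

    public static void main(String[] args) {
        System.out.println(sameKeyPart(TestA.class, TestB.class)); // true
        System.out.println(sameKeyPart(TestC.class, TestD.class)); // false
    }
}
```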

Java — Testcontainers with static shared container in base class (Spring Boot 3.1+)

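import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.testcontainers.service.connection.ServiceConnection;
import org.springframework.transaction.annotation.Transactional;
import org.testcontainers.containers.PostgreSQLContainer;

// Works with Spring Boot 3.1+: @ServiceConnection auto-configures the datasource.
// The container is started once in a static initializer and shared by all subclasses.
// @Testcontainers/@Container are deliberately not used here: that extension stops
// static containers after each test class, while the cached context still points at them.
@SpringBootTest
@Transactional
abstract class IntegrationTestBase {

    @ServiceConnection  // Spring Boot 3.1+: no @DynamicPropertySource needed
    static final PostgreSQLContainer<?> postgres =
        new PostgreSQLContainer<>("postgres:16-alpine")
            .withReuse(true);  // only honoured if testcontainers.reuse.enable=true is set

    static {
        postgres.start();  // once per JVM
    }
}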

Java — fallback for Spring Boot < 3.1 using static @DynamicPropertySource

@SpringBootTest
@Transactional
abstract class IntegrationTestBase {

    static final PostgreSQLContainer<?> postgres =
        new PostgreSQLContainer<>("postgres:16-alpine");

    // Static block starts the container once for the entire JVM session
    static {
        postgres.start();
    }

    @DynamicPropertySource
    static void configureProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.url", postgres::getJdbcUrl);
        registry.add("spring.datasource.username", postgres::getUsername);
        registry.add("spring.datasource.password", postgres::getPassword);
    }
    // All subclasses inherit this single static method → same customiser → same cache key
}

CI test suite run time — before and after cache consolidation

Approximate runtime breakdown for a 50-test suite with 15s context startup time. “Before” shows a typical organically-grown suite. “After” shows the effect of a shared base class with consolidated mocks and a static container. Context count drops from ~14 to 2.

How to diagnose how many contexts you are actually creating

Before you can fix the problem, you need to measure it. Fortunately, Spring makes this relatively easy. The first and most important step is to enable DEBUG logging on org.springframework.test.context.cache. Spring then logs a cache statistics line each time a test requests a context; the last such line in a run gives the totals: the current cache size, how many lookups were hits, and how many were misses (each miss is a context that had to be built).

application-test.properties — enable context cache logging

# Add to src/test/resources/application-test.properties
# (in logback-test.xml the equivalent is <logger name="org.springframework.test.context.cache" level="DEBUG"/>)
logging.level.org.springframework.test.context.cache=DEBUG

With this enabled, after a full test run you will see output similar to the following in your build log. The numbers tell you precisely how much cache reuse you are getting:

Example log output — 14 contexts in a 20-class suite (before fix)

# BAD — 14 creates, 6 hits means only 30% cache utilisation
Spring test ApplicationContext cache statistics:
[DefaultContextCache@... size = 14, maxSize = 32, parentContextCount = 0,
 hitCount = 6, missCount = 14]

Example log output — 2 contexts after consolidation (after fix)

# GOOD — 2 creates, 18 hits means 90% cache utilisation
Spring test ApplicationContext cache statistics:
[DefaultContextCache@... size = 2, maxSize = 32, parentContextCount = 0,
 hitCount = 18, missCount = 2]
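If you want to track the hit rate in CI rather than read it by eye, the statistics line is easy to parse. The following is a minimal sketch that assumes the log format shown above; CacheStatsParser and its regular expression are illustrative, not part of Spring, and the format may differ between Spring versions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CacheStatsParser {

    // Matches "size = N ... hitCount = N, missCount = N", also across a line break.
    private static final Pattern STATS = Pattern.compile(
        "\\bsize = (\\d+).*?hitCount = (\\d+), missCount = (\\d+)", Pattern.DOTALL);

    record Stats(int size, int hits, int misses) {
        // Share of context lookups served from the cache, as a percentage.
        double hitRatePercent() {
            int lookups = hits + misses;
            return lookups == 0 ? 0.0 : 100.0 * hits / lookups;
        }
    }

    // Returns the last statistics entry in the log, which holds the totals.
    static Stats parse(String logText) {
        Matcher matcher = STATS.matcher(logText);
        Stats last = null;
        while (matcher.find()) {
            last = new Stats(
                Integer.parseInt(matcher.group(1)),
                Integer.parseInt(matcher.group(2)),
                Integer.parseInt(matcher.group(3)));
        }
        if (last == null) {
            throw new IllegalArgumentException("no cache statistics found");
        }
        return last;
    }

    public static void main(String[] args) {
        String before = "[DefaultContextCache@... size = 14, maxSize = 32, parentContextCount = 0,\n"
            + " hitCount = 6, missCount = 14]";
        String after = "[DefaultContextCache@... size = 2, maxSize = 32, parentContextCount = 0,\n"
            + " hitCount = 18, missCount = 2]";

        System.out.println(String.format("before: %.0f%%", parse(before).hitRatePercent())); // before: 30%
        System.out.println(String.format("after: %.0f%%", parse(after).hitRatePercent()));   // after: 90%
    }
}
```

A CI step could fail the build when the rate drops below a threshold the team agrees on, which catches a stray @MockBean before it costs minutes.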

A second diagnostic tool is the spring-startup-analyzer library, which can identify the specific beans driving startup time within each context — useful when you want to understand not just how many contexts you are creating, but why each one takes the time it does.

The complementary tool: test slices for non-integration tests

Not every test that uses Spring actually needs the full application context. This is worth stating because it is often overlooked when teams are optimising: the fastest context is the one you never started. Spring Boot’s slice annotations (@WebMvcTest, @DataJpaTest, @JsonTest, @DataRedisTest) load only the beans relevant to a specific layer. A @WebMvcTest context starts in roughly 2–3 seconds compared to 10–15 seconds for a full @SpringBootTest because it loads only the web layer, controllers, and their direct dependencies.

The important constraint is that slice contexts and full contexts do not share the same cache entry — they are different context types entirely. So the strategy is: use slices for focused layer tests, use a single shared @SpringBootTest base class for true end-to-end integration tests, and keep the two populations completely separate. Mixing them — a @DataJpaTest class that inherits from your @SpringBootTest base — produces a new context that satisfies neither efficiently.

The three-tier strategy that works at scale:
Tier 1: Pure JUnit 5 unit tests. No Spring context at all; instantiate classes directly.
Tier 2: @WebMvcTest / @DataJpaTest slice tests. Fast, focused, one shared context per slice type.
Tier 3: A single @SpringBootTest base class shared by all full-stack integration tests. One context, populated with all mocks the suite needs, using a static Testcontainers container.
This structure keeps CI time predictable as the suite grows, because each new test is a hit against an already-warm context rather than a cold start.

Quick reference: what to check in your own suite

What to check | Signal it is broken | Fix
--- | --- | ---
@DirtiesContext placement | Present on a base class or more than 2–3 test classes | Replace with @Transactional rollback or @BeforeEach data cleanup
@MockBean distribution | Declared in more than one test class with different combinations | Centralise all @MockBean declarations in one abstract base class
Active profiles | Mix of "test", "integration", "it" across classes | Standardise on one profile for the entire integration test tier
@TestPropertySource usage | Unique inline properties per test class | Move shared overrides to application-test.properties
Testcontainers wiring | @DynamicPropertySource declared in each test class | Move container + source method to a static in base class; use @ServiceConnection on Spring Boot 3.1+
Web environment | Mix of MOCK and RANDOM_PORT across tests | Pick one per tier; use slice tests for layer-specific tests
Cache hit rate | hitCount much lower than missCount in log output | Enable DEBUG logging on o.s.t.context.cache, then address each miss type above

What we learned

Spring’s context cache is the entire reason @SpringBootTest is viable for large suites — without it, every test class would cold-start an ApplicationContext and CI would be unusable. The cache works by storing contexts keyed to a precise fingerprint of the test’s configuration, called the MergedContextConfiguration. Any difference in configuration classes, active profiles, property sources, web environment, or context customisers (which includes every @MockBean declaration and every @DynamicPropertySource method) produces a cache miss and forces a full context rebuild.

The four most common ways teams accidentally destroy cache efficiency are: placing @DirtiesContext on a shared base class, scattering @MockBean declarations across individual test classes with different combinations, using different profiles or per-class property overrides, and wiring Testcontainers via @DynamicPropertySource methods redeclared in each test class. All four have the same fix: consolidate into a single abstract base class that every integration test inherits, declare all mocks there, use a static container with @ServiceConnection, and enable cache logging to verify the hit rate. Done right, a suite that creates 14 contexts drops to 2, and a 12-minute CI pipeline becomes a 90-second one.

Eleftheria Drosopoulou
