Writing Tests for Data Access Code – Green Build Is Not Good Enough

Petri Kainulainen, July 17th, 2014 (Last Updated: July 17th, 2014)

The first thing that we have to do before we can start writing integration tests for our data access code is to decide how we will configure our test cases.

We have two options: the right one and the wrong one.

Unfortunately, many developers make the wrong choice.

How can we avoid making the same mistake?

We can make the right decisions by following these three rules:

Rule 1: We Must Test Our Application

This rule seems obvious. Sadly, many developers use a different configuration in their integration tests because it makes their tests pass.

This is a mistake!

We should ask ourselves this question:

Do we want to test that our data access code works when we use the configuration that is used in the production environment, or do we just want our tests to pass?

I think that the answer is obvious. If we use a different configuration in our integration tests, we are not testing how our data access code behaves in the production environment. We are testing how it behaves when we run our integration tests.

In other words, we cannot verify that our data access code works as expected when we deploy our application to the production environment.

Does this sound like a worthy goal?

If we want to test that our data access code works when we use the production configuration, we should follow these simple rules:

We should configure our tests by using the same configuration class or configuration file that configures the persistence layer of our application.
Our tests should use the same transactional behavior as our application.

These rules have two major benefits:

Because our integration tests use exactly the same configuration as our application and share the same transactional behavior, our tests help us to verify that our data access code works as expected when we deploy our application to the production environment.
We don’t have to maintain different configurations. In other words, if we make a change to our production configuration, we can test that the change doesn’t break anything without making any changes to the configuration of our integration tests.
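The two rules above can be sketched in plain Java. This is only an illustration: the class names (PersistenceSettings, TodoApplication, RepositoryTestSetup) and the property keys are hypothetical, and in a real project the shared piece would be whatever configuration class or file the application already uses.

```java
import java.util.Properties;

// Hypothetical single source of persistence settings.
// Both the application and the integration tests read from here.
final class PersistenceSettings {
    private PersistenceSettings() {}

    static Properties load() {
        Properties settings = new Properties();
        // Illustrative keys and values only; they are not taken from any real library.
        settings.setProperty("db.pool.size", "10");
        settings.setProperty("db.schema.validation", "true");
        return settings;
    }
}

// The application wires its persistence layer from the shared settings.
final class TodoApplication {
    static Properties persistenceSettings() {
        return PersistenceSettings.load();
    }
}

// The test setup declares no settings of its own. It reuses the same source,
// so a change to the production settings is picked up by the tests automatically.
final class RepositoryTestSetup {
    static Properties persistenceSettings() {
        return PersistenceSettings.load();
    }
}
```

Because the test setup has nothing to override, there is no second configuration that could drift away from the one used in production.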

Rule 2: We Can Break Rule One

There are no universal truths in software development. Every principle is valid only under certain conditions. If the conditions change, we have to re-evaluate the principle. This applies to the first rule as well.

It is a good starting point, but sometimes we have to break it.

If we want to introduce a test-specific change to our configuration, we have to follow these steps:

Figure out the reason for the change.
List the benefits and drawbacks of the change.
If the benefits outweigh the drawbacks, we are allowed to change the configuration of our tests.
Document the reason why this change was made. This is crucial because it gives us the possibility to revert the change if we find out that making it was a bad idea.

For example, suppose that we want to run our integration tests against an in-memory database when these tests are run in a development environment (i.e. a developer’s personal computer) because this shortens the feedback loop. The only drawback of this change is that we cannot be 100% sure that our code works in the production environment, because the production environment uses a real database.

Nevertheless, the benefits of this change outweigh its drawbacks because we can (and we should) still run our integration tests against a real database. A good way to do this is to configure our CI server to run these tests.
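A minimal sketch of such a documented, test-specific change could look like the following. Everything named here is an assumption made for illustration: the class TestDatabaseSelector, the two JDBC URLs, and the convention that the CI server sets an environment variable called CI to "true".

```java
import java.util.Map;

// Hypothetical helper that picks the database used by the integration tests.
final class TestDatabaseSelector {
    // Assumed URLs; replace them with the ones your project really uses.
    static final String IN_MEMORY_URL = "jdbc:h2:mem:testdb";
    static final String REAL_URL = "jdbc:postgresql://localhost:5432/app_test";

    private TestDatabaseSelector() {}

    // Reason for this test-specific change (documented so that it can be reverted):
    // developers get an in-memory database for a short feedback loop, while the
    // CI server still runs the same tests against a real database.
    static String jdbcUrl(Map<String, String> environment) {
        // Assumption: the CI server exports CI=true.
        boolean runningOnCi = "true".equalsIgnoreCase(environment.get("CI"));
        return runningOnCi ? REAL_URL : IN_MEMORY_URL;
    }
}
```

A test setup could call TestDatabaseSelector.jdbcUrl(System.getenv()) and leave every other setting untouched, so the database URL remains the only difference between the test configuration and the production configuration.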

This is of course a very simple (and maybe a bit naive) example and often the situations we face are much more complicated. That is why we should follow this guideline:

If in doubt, leave test config out.

Rule 3: We Must Not Write Transactional Integration Tests

One of the most dangerous mistakes that we can make is to modify the transactional behavior of our application in our integration tests.

If we make our tests transactional, we ignore the transaction boundary of our application and ensure that the tested code is executed inside a transaction. This is extremely harmful because it only helps us to hide the possible errors instead of revealing them.

If you want to know how transactional tests can ruin the reliability of your test suite, you should read a blog post titled "Spring pitfalls: transactional tests considered harmful" by Tomasz Nurkiewicz. It provides many useful examples of the errors that stay hidden if you write transactional integration tests.
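The effect described above can be illustrated with a small plain-Java simulation. It is not taken from the referenced blog post and it does not use a real persistence library: Tx, TodoRepository, TodoService and TransactionalTestDemo are hypothetical stand-ins, and the repository simply refuses to write when no transaction is open.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a transaction manager: it only tracks whether a transaction is open.
final class Tx {
    private boolean active;

    boolean isActive() {
        return active;
    }

    // Runs the action inside a transaction; joins the current one if it is already open.
    void inTransaction(Runnable action) {
        boolean outermost = !active;
        active = true;
        try {
            action.run();
        } finally {
            if (outermost) {
                active = false;
            }
        }
    }
}

// Hypothetical repository that rejects writes made outside a transaction.
final class TodoRepository {
    private final Tx tx;
    private final List<String> rows = new ArrayList<>();

    TodoRepository(Tx tx) {
        this.tx = tx;
    }

    void save(String title) {
        if (!tx.isActive()) {
            throw new IllegalStateException("no transaction in progress");
        }
        rows.add(title);
    }
}

// Service with a bug: it forgets to open a transaction around the write.
final class TodoService {
    private final TodoRepository repository;

    TodoService(TodoRepository repository) {
        this.repository = repository;
    }

    void add(String title) {
        repository.save(title);
    }
}

final class TransactionalTestDemo {
    // Mimics a transactional test: the test itself opens the transaction.
    static boolean passesWhenTestIsTransactional() {
        Tx tx = new Tx();
        TodoService service = new TodoService(new TodoRepository(tx));
        try {
            tx.inTransaction(() -> service.add("write tests"));
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    // Mimics a non-transactional test: only the application's own boundaries apply.
    static boolean passesWhenTestIsNotTransactional() {
        Tx tx = new Tx();
        TodoService service = new TodoService(new TodoRepository(tx));
        try {
            service.add("write tests");
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

In this sketch the transactional variant reports success even though the service is broken, while the non-transactional variant fails in the same way the deployed application would.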

Once again we have to ask ourselves this question:

Do we want to test that our data access code works when we use the configuration that is used in the production environment, or do we just want our tests to pass?

And once again, the answer is obvious.

Summary

This blog post has taught us three things:

Our goal is not to verify that our data access code is working correctly when we run our tests. Our goal is to ensure that it is working correctly when our application is deployed to the production environment.
Every test-specific change creates a difference between our test configuration and our production configuration. If this difference is too big, our tests are useless.
Transactional integration tests are harmful because they ignore the transactional behavior of our application and hide errors instead of revealing them.

That is a pretty nice summary. We did indeed learn those things, but we learned something much more important as well. The most important thing we learned from this blog post is this question:

Do we want to test that our data access code works when we use the configuration that is used in the production environment, or do we just want our tests to pass?

If we keep asking this question, the rest should be obvious to us.

Reference:

Writing Tests for Data Access Code – Green Build Is Not Good Enough from our JCG partner Petri Kainulainen at the Petri Kainulainen blog.