Seriously How Long!?

Ashley Frieze | May 13th, 2020 | Last Updated: May 11th, 2020


I mentioned the delay caused by a slow build when talking about the costs of tweaking code that is slow to build.

Let’s define what a slow build is.

It’s slow if it takes over 4 minutes.

If it’s a test that takes over 30 seconds, it’s also slow.

Builds are Slower Than This Dude

I’m fully aware that many builds are significantly slower than my prescribed thresholds. The point is that we should do our damnedest not to tolerate these slow builds, or at least slow stages of builds. I’ve worked on projects where each build took 30 minutes.

While multiple branches could be tested in parallel, merging branches to master was essentially a queue. At the end of our Sprints we needed air traffic control to get the merges to happen, because there are not that many 30-minute blocks in a single working day!

Why is Your Current Build so Slow?

Erm… mumble, mumble, something to do with data access layers and Oracle… mumble, client confidentiality.

The interesting challenge here is that some services are so bound to other real-world resources that testing them meaningfully requires a real environment. Docker lets us build those environments, so we use Docker; then our tests slow down… and it’s both right AND wrong at the same time.

It all depends on how much you trust the client layer to your real-life service. If you trusted it completely, you could stub it completely, only testing it when you come to end-to-end testing. If, however, the client layer is an intrinsic risk in your software, or if you’re constrained by environments for high-level tests, then having a heavy service-specific integration test in your build can be a good thing.

I think it’s a good thing, but it’s also a slow thing. It’s hard to have the best of both worlds here.

Today I Broke The Cycle

I spent about 90 minutes today constructing an in-memory database. I could have used H2 with my existing ORM layer, and that might have worked, though I suspected it would introduce some weird Oracle schema quirks that would take more time.

Instead, I constructed an in-memory implementation of the relatively simple operations I’d exposed in the Data Access Layer. I had a little refactoring to do to make it possible to produce a test-double/fake rather than use Mockito for mocking.

Test doubles/fakes are a great tool. If built correctly, they can be the perfect implementation of the interface you have created for the real resource, but are not constrained too much by real-world limitations.
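To make the idea concrete, here is a minimal sketch of what a fake for a narrow data access interface might look like. The `PriceDao` interface, its methods, and the `InMemoryPriceDao` class are hypothetical names invented for illustration; the real Data Access Layer in this project is not shown here and will differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical narrow interface over the data access layer (an assumption,
// not the project's real interface).
interface PriceDao {
    void save(String productCode, long priceInPence);
    Optional<Long> findPrice(String productCode);
    List<String> productCodes();
}

// In-memory fake: a small working implementation backed by a map,
// rather than a mock with scripted responses.
final class InMemoryPriceDao implements PriceDao {
    // LinkedHashMap keeps insertion order so results are predictable in tests.
    private final Map<String, Long> rows = new LinkedHashMap<>();

    @Override
    public void save(String productCode, long priceInPence) {
        rows.put(productCode, priceInPence); // insert or overwrite
    }

    @Override
    public Optional<Long> findPrice(String productCode) {
        return Optional.ofNullable(rows.get(productCode));
    }

    @Override
    public List<String> productCodes() {
        // Return a copy so callers cannot mutate the fake's internal state.
        return new ArrayList<>(rows.keySet());
    }
}
```

Because the fake actually stores and returns data, code under test can write and then read back through the same interface, which is awkward to express with per-call stubbing.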

In this same project, I’ve created (and not for the first time) an in-memory version of S3. This is again because I have an internal BLOB storage interface which is a simplification of how the app uses S3.
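As a rough illustration, an internal BLOB storage interface and its in-memory counterpart could be as small as the sketch below. `BlobStore` and `InMemoryBlobStore` are hypothetical names, and the three operations are an assumed simplification; this is not the AWS SDK API, and the project's real interface may look different.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical simplification of how an app might use S3 (an assumption).
interface BlobStore {
    void put(String key, byte[] content);
    Optional<byte[]> get(String key);
    boolean delete(String key);
}

// In-memory stand-in: keys map straight to byte arrays held on the heap.
final class InMemoryBlobStore implements BlobStore {
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

    @Override
    public void put(String key, byte[] content) {
        // Defensive copy so later changes to the caller's array do not leak in.
        blobs.put(key, Arrays.copyOf(content, content.length));
    }

    @Override
    public Optional<byte[]> get(String key) {
        byte[] stored = blobs.get(key);
        return stored == null
                ? Optional.empty()
                : Optional.of(Arrays.copyOf(stored, stored.length));
    }

    @Override
    public boolean delete(String key) {
        // True only if something was actually removed.
        return blobs.remove(key) != null;
    }
}
```

A production implementation of the same interface would delegate to the real S3 client; tests can swap in the in-memory one without touching the calling code.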

It took 90 minutes to make my in-memory services. It took a few more minutes to import some reference data into the data source from the real database. It then took a few minutes to write the test I wanted to write.

The aim of the test was to do a 100% comparison between the existing system we’re rewriting and the new one. Does it produce exactly the same output?
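One way such a comparison could be structured is sketched below: feed the same inputs to both systems and collect every disagreement. `OutputComparison` and `mismatches` are hypothetical names, and modelling each system as a plain `Function` is an assumption made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical helper for comparing a legacy system against its rewrite.
final class OutputComparison {
    private OutputComparison() {}

    // Returns one description per input where the two systems disagree.
    // An empty list means the outputs matched for every input.
    static <I, O> List<String> mismatches(List<I> inputs,
                                          Function<I, O> legacy,
                                          Function<I, O> rewritten) {
        List<String> differences = new ArrayList<>();
        for (I input : inputs) {
            O expected = legacy.apply(input);
            O actual = rewritten.apply(input);
            if (!Objects.equals(expected, actual)) {
                differences.add("input=" + input
                        + " legacy=" + expected
                        + " new=" + actual);
            }
        }
        return differences;
    }
}
```

Collecting all mismatches, rather than failing on the first, gives a fuller picture on each run, which matters when the goal is to iterate quickly.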

I needed to be able to debug the running solution and re-run my test over and over. I have tests that run against the real database, and I’ve even pioneered some tricks with JUnit 5 test suites to make it possible for my tests to run against local long-lived docker images, or spontaneously created containers inside the CI build.

However, running the test over and over in a few seconds was the aim, and I felt using the real service would get in the way.

Did it Pay Off?

Short answer: yes.

There was a bug. It took multiple runs to debug it. In fact, there was also a discovery when I pulled the test data together that I had some other bits of spec to flesh out and check. I found a missing feature too.

I iterated over the code and tests about 40 times in the hour or so it took to get everything working. This simply would not have been possible without a test that could run in under 4 seconds.

I’d resisted creating an in-memory spoof of a database because I genuinely needed to see the data access layer working with real-life problems when I created it. Those problems would have been impossible to debug in the running application in the cloud, and would have been masked by the in-memory mocking.

However, every technique has its place in the project. In this case, I had a perfect problem for a test double. The challenge was to iterate quickly, and mocks and fakes let you do that.

Normally people over-mock and then have to add some realistic tests to cover the gaps. I just happened to do it the other way around this time.

Published on Java Code Geeks with permission by Ashley Frieze, partner at our JCG program. See the original article here: Seriously How Long!?
