Effective Unit Testing – Not All Code is Created Equal

Nadav Azaria | January 27th, 2012 | Last Updated: October 21st, 2012


Unit testing is one of the most widely adopted methodologies for producing high-quality code. Its contribution to more stable, independent and documented code is well established. Unit test code is considered and handled as an integral part of your repository, and as such requires development and maintenance. However, developers often find that the resources invested in unit tests were not as fruitful as expected. This leads us to wonder, as with any investment, where and how resources should be invested in unit tests.

The metrics currently used to assess the quality of unit testing rely on the notion of code coverage. Code coverage describes the degree to which the source code of a program has been exercised by tests. In an ideal world, every method we write would have a series of tests covering its code and validating its correctness. In practice, usually due to time limitations, we either skip some tests or write poor-quality ones. Given that reality, and keeping in mind the resources invested in developing and maintaining unit tests, one must ask: given the available time, which code deserves testing the most?

And from the existing tests, which tests are actually worth keeping and maintaining? We will try to answer those questions today.

We believe that not all code is created equal. Some code sections are harder to test than others, and some are more important than others. We suggest a few guidelines to help determine which code sections to invest in first, both when writing unit tests and when maintaining them:

1. Usage of code – when code is used frequently, it is important to unit test it.
2. Code dependencies – similar to (1), the more heavily other code depends on the examined code, the more important it is to unit test it. On the other hand, when the examined code depends heavily on other code, it is harder to test and the chances of catching a fault are smaller.
3. I/O dependency – code that depends on I/O (DB, networking, etc.) is harder to test, as it requires creating mock objects that simulate the behavior of the I/O components. These mock objects require development and maintenance, and are vulnerable to bugs of their own. Moreover, writing mock objects that simulate the exact behavior of a given I/O component, including its faults, is not trivial at all.
4. Multithreaded code – the behavior of multithreaded code is unpredictable, and as such it is harder to test.
5. Cyclomatic complexity – this metric indicates the complexity of your source code. The higher the complexity, the more important it is to test the code.
6. Code accessibility – this measure relates to the number of people who are acquainted with the source code in question. The greater the accessibility, the less testing is needed, since problems will be identified and handled more rapidly.

Regarding the latter question presented above, we suggest a new approach for managing unit tests. This preliminary idea definitely needs some polish, and we present only a rough outline.

After taking all of the above into account, the real burden is maintaining the tests. We suggest thinking of a single unit test as a stock. We can keep track of each unit test, treating it as a dynamic object with an initial value that can change over time. Based on the points above, we can give each test a preliminary value indicating its importance. Note that most of the attributes above can be determined automatically. The change in value over time reflects our profit from the test. Each time a test fails and catches a real bug, its value increases; each time you invest in fixing the test itself without it catching any real bug in your business logic, its value decreases. When you need to change the code of a test as a result of a change in your business logic, its value stays the same.
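
The stock analogy can be sketched in a few lines of code. This is only an illustration under stated assumptions: the class and event names are hypothetical, the step size of 1.0 per event is arbitrary, and the preliminary value is passed in by hand rather than derived from the guidelines.

```python
from dataclasses import dataclass, field
from enum import Enum


class Event(Enum):
    CAUGHT_REAL_BUG = "caught_real_bug"    # test failed and exposed a real defect
    TEST_ONLY_FIX = "test_only_fix"        # test was repaired, business logic was fine
    LOGIC_CHANGE = "logic_change"          # test updated because business logic changed


# Hypothetical step sizes; the post leaves the actual amounts open.
VALUE_DELTA = {
    Event.CAUGHT_REAL_BUG: +1.0,
    Event.TEST_ONLY_FIX: -1.0,
    Event.LOGIC_CHANGE: 0.0,
}


@dataclass
class UnitTestStock:
    """Tracks the changing value of a single unit test."""
    name: str
    value: float                          # preliminary value, chosen up front
    history: list = field(default_factory=list)

    def record(self, event: Event) -> float:
        """Apply one event to the test's value and return the new value."""
        self.value += VALUE_DELTA[event]
        self.history.append(event)
        return self.value


stock = UnitTestStock("test_total_applies_discount", value=5.0)
stock.record(Event.CAUGHT_REAL_BUG)   # value rises to 6.0
stock.record(Event.TEST_ONLY_FIX)     # value drops to 5.0
stock.record(Event.TEST_ONLY_FIX)     # value drops to 4.0
stock.record(Event.LOGIC_CHANGE)      # value stays at 4.0
print(stock.name, stock.value)
```

A test whose value keeps falling costs more to maintain than it returns, which makes it a candidate for rewriting or removal; where exactly to draw that line is left open here, as it is in the post.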

The above model is not complete, as we only wanted to give a general idea of effective unit testing. Open questions remain: how is the value for each of our suggested points computed? How is the preliminary value for each test then determined? And how much should a value increase or decrease over time? These questions can be answered, for example, by using machine learning techniques, but that is outside the scope of this post.

Reference: Effective Unit Testing – Not All Code is Created Equal from our JCG partners Nadav Azaria & Roi Gamliel at the DeveloperLife blog.