Measuring Code Complexity

Luis AtencioSeptember 22nd, 2011Last Updated: October 21st, 2012

1 150 3 minutes read

Lately, development managers have put a lot of interest in measuring code quality. Therefore, things like code reviews and analysis tools have become very popular at identifying “Technical Debt” early.

Several tools exist for Java: Sonar, JavaNCSS, Eclipse plugins; as well as other languages: Visual Studio Code Analysis, PHPDepend, among others.

What is Technical Debt?

From the Sonar site, technical debt is the cost (in man days) measured by the following formula:

TD = (cost to fix duplications) + (cost to fix violations) +
(cost to comment public API) + (cost to fix uncovered complexity) +
(cost to bring complexity below threshold)

Many organizations are actually resorting to this metric as a long term investment. Although very hard to quantify, reducing technical debt can reduce the total number of open bugs, improve code quality, lower developer ramp up time (thereby fighting Brooks law), and more importantly decrease the number of man-(hours|days) it takes to resolve an issue — A good investment.

In this article, I want to focus on one factor of Technical Debt called “Code Complexity”.

Code complexity in itself is very “complex” to measure. It is influenced by numerous factors such as: Average Hierarchy Height, Average Number of Derived Classes, Afferent Coupling, Efferent Coupling, number of “if” condition statements, and many others. Let’s talk about some of the most important metrics briefly in order to understand what these tools capture and how it’s measured.

McCabe Cyclomatic Complexity (MCC)

You’ve probably come across this one before. The McCabe Cyclomatic Metric was introduced by Thomas McCabe in 1976 (link to this paper at the bottom). It measures the number of independent paths (term taken from graph theory) through a particular method (let’s talk Java parlance, although the sample applies to whole programs or subroutines). For example, for a simple method that has no conditionals, the MCC is 1. Programs that have many conditionals are harder to follow, harder to test, and as a result feature a higher MCC.

The MCC formula is:

M = E – N + X

where M is the McCabe Cyclomatic Complexity (MCC) metric, E is the number of edges, N is the number of nodes or decision points (conditional statements), and X is the number of exits (return statements) in the graph of the method.

Quick Example:

In this example, MCC = 3

A simpler method of computing the MCC is demonstrated in the equation below. If D is the number of decision points in the program, then

M = D + 1 (Each decision point normally has two possible paths)

As mentioned earlier, MCC also is useful in determining the testability of a method. Often, the higher the value, the more difficult and risky the method is to test and maintain.

Some standard values of Cyclomatic Complexity are shown below:

M = D + 1	Assesment
1-10	not much risk
11-20	moderate risk
21-50	high risk
51+	untestable, very high risk

One final word on MCC that also applies to most of the other metrics: each element in the formulae is assumed to have the same weight. In MCC’s case, both branches are assumed to be equally complex. However, in most cases this is not the case. Think of the if statement with code for only one branch–yet each branch is treated as having the same weight. Also, measures of expressions are all the same, even for those that contain many factors and terms. Be aware of this and be prepared, if your tool gives you the ability, to add weight to different branches. This metric is called an Extended McCabe Complexity Metric.

Afferent Coupling (Ca)
Afferent coupling is the number of other packages that depend upon classes within this package. This value is good indicator of how changes to classes in this package would influence other parts of the software.

Efferent Coupling (Ce)
Efferent coupling is the number of other packages that classes from this package depend upon. This value indicates how sensitive this package is for changes to other packages.

Code with high Ca and high Ce is very hard to test and maintain, therefore, has very high complexity and is largely unstable.

Instability (I)
Instability is the ratio between efferent coupling (Ce) and the total package coupling (Ce + Ca) which is based on the following formula (Ce / (Ce + Ca)) and produces results in the range [0,1]. As I -> 0, this indicates a maximally stable package that is completely independent. On the other hand, as I -> 1 this indicates a totally instable package that has no incoming dependencies but depends upon other packages.

Resources

Reference: Measuring Code Complexity from our JCG partner Luis Atencio at Reflective Thought.

Related Articles :

Measuring Code Complexity

Thank you!

Luis Atencio

Thank you!