How to avoid messy code

Few programmers explicitly intend to write poorly structured source code.

They don’t sit down, whip out their Bad Code Design Patterns book, and wreak meticulous spaghettipocalypse. Rather, poorly structured code is what happens when programmers don’t know what they’re doing.

Figure 1: Two Java package structures: one well-designed, the other, not so much.

So: why is this difficult?

Source code has many properties, of different kinds.

One property, for instance, is the number of public methods in a program. Programmers easily control this property: making a private method public increases the number by 1. And that’s it. In a sense, this is a “linear” property, in that small changes produce small effects.
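As a rough illustration of how such a linear property can be counted, here is a minimal sketch using Java reflection. The `Sample` class and the `countPublicMethods` helper are hypothetical names invented for this example, and the sketch only counts methods declared directly on one class rather than across a whole program.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class PublicMethodCount {

    /** Hypothetical class under measurement: two public methods, one private. */
    static class Sample {
        public void open() {}
        public void close() {}
        private void reset() {}
    }

    /** Counts the public methods declared directly on the given class. */
    static int countPublicMethods(Class<?> type) {
        int count = 0;
        for (Method method : type.getDeclaredMethods()) {
            // Skip compiler-generated methods so only hand-written ones are counted.
            if (Modifier.isPublic(method.getModifiers()) && !method.isSynthetic()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countPublicMethods(Sample.class)); // prints 2
    }
}
```

Making `reset()` public would raise the count from 2 to 3 and change nothing else, which is the sense in which the property is linear.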

Structure also represents a source code property, but calling a method b() from method a() does not affect only those two methods. New transitive dependencies form from every method that depends on a() to b() and to every method that b() depends upon.

Furthermore, Java has three structural levels (method, class and package), and a connection between methods need not affect the method level alone. If the owning classes were not already connected, then new transitive dependencies pop into being at class level, too. And similarly at package level. Structure thus represents a “non-linear” property, in that small changes may trigger large consequences.
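The method-level effect can be sketched with a tiny call graph. The graph below is hypothetical (methods x() and y() call a(), and b() calls p() and q()), as are the class and helper names. The sketch counts every direct or transitive dependency before and after adding the single call from a() to b().

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class TransitiveDependencyDemo {

    /** Records that method 'from' calls method 'to'. */
    static void addCall(Map<String, Set<String>> calls, String from, String to) {
        calls.computeIfAbsent(from, k -> new HashSet<>()).add(to);
        calls.computeIfAbsent(to, k -> new HashSet<>());
    }

    /** Counts ordered pairs (m, n) where m depends on n, directly or transitively. */
    static int countDependencies(Map<String, Set<String>> calls) {
        int total = 0;
        for (String start : calls.keySet()) {
            Set<String> reached = new HashSet<>();
            Deque<String> pending = new ArrayDeque<>(calls.get(start));
            while (!pending.isEmpty()) {
                String next = pending.pop();
                if (reached.add(next)) {
                    pending.addAll(calls.get(next));
                }
            }
            total += reached.size();
        }
        return total;
    }

    /** Hypothetical starting graph: x -> a, y -> a, b -> p, b -> q. */
    static Map<String, Set<String>> sampleGraph() {
        Map<String, Set<String>> calls = new HashMap<>();
        addCall(calls, "x", "a");
        addCall(calls, "y", "a");
        addCall(calls, "b", "p");
        addCall(calls, "b", "q");
        return calls;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> calls = sampleGraph();
        int before = countDependencies(calls);
        addCall(calls, "a", "b"); // the one small change
        int after = countDependencies(calls);
        System.out.println(before + " -> " + after); // prints 4 -> 13
    }
}
```

One added call takes the number of dependencies from 4 to 13, because x(), y() and a() each now reach b(), p() and q().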

This non-linearity makes writing well-structured programs hard.

It would be helpful if we could forget about this grand, over-arching structure and focus instead on small, linear properties that somehow magically lead to well-structured code.

Alas, no such linear properties exist.

But there are hidden clues.

Because source code properties are objective, we can measure them. We can certainly count the number of public methods in a program, and hosts of other linear properties besides. We can also measure the “messiness” of a program via the structural disorder, a percentage which rises as source code structure decays. If we measure over a large number of programs, we can then calculate the mathematical correlation between structural disorder and all those other properties.

A negligible correlation would imply no connection between a particular linear property and overall program structure. A large correlation, however, suggests that careful management of that linear property may contribute towards overall well-structured awesomeness.

For example, if structural disorder correlated 100% with the number of public methods, then we might suggest minimizing the numbers of public methods in order to minimize structural disorder, thereby using a simple, linear property to control a difficult non-linear one.
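To make the idea concrete, here is a minimal sketch of a correlation calculation. The article does not say which correlation coefficient was used, so Pearson's coefficient is an assumption here, and the sample figures are invented purely to show a perfect (100%) correlation.

```java
class CorrelationSketch {

    /**
     * Pearson correlation coefficient of two equally long samples.
     * Returns NaN if either sample has zero variance.
     */
    static double pearson(double[] xs, double[] ys) {
        if (xs.length != ys.length || xs.length < 2) {
            throw new IllegalArgumentException("need two samples of equal length >= 2");
        }
        int n = xs.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += xs[i];
            meanY += ys[i];
        }
        meanX /= n;
        meanY /= n;

        double covariance = 0, varianceX = 0, varianceY = 0;
        for (int i = 0; i < n; i++) {
            double dx = xs[i] - meanX;
            double dy = ys[i] - meanY;
            covariance += dx * dy;
            varianceX += dx * dx;
            varianceY += dy * dy;
        }
        return covariance / Math.sqrt(varianceX * varianceY);
    }

    public static void main(String[] args) {
        // Invented data: one entry per program, not taken from the article.
        double[] publicMethods = {10, 20, 30, 40};
        double[] structuralDisorder = {5, 10, 15, 20};
        System.out.println(pearson(publicMethods, structuralDisorder));
    }
}
```

With these invented figures the disorder rises in exact proportion to the number of public methods, so the coefficient comes out at 1. Real measurements would fall somewhere between -1 and 1.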

Let’s give it a whirl.

Let’s blitz 4 million lines of code from 38 Java systems[1] in a code analyzer and get correlatin’ over dozens of its structural properties. Table 1 shows the strongest structural disorder correlations discovered[2] (full matrices: method, class and package).

| Property | Method | Class | Package |
|---|---|---|---|
| Average circular dependencies | 0.62 | 0.21 | 0.58 |
| Average complectation | 0.75 | 0.39 | 0.02 |
| Average depth | 0.66 | 0.71 | 0.89 |
| Average impact set | 0.59 | 0.35 | 0.42 |
| Average impacted set | 0.56 | 0.35 | 0.42 |
| Average middle-man | 0.7 | 0.23 | -0.1 |
| Average transitive dependencies | 0.56 | 0.24 | 0.35 |
| Average transitive dependency length | 0.64 | 0.24 | 0.42 |

Table 1: Structural disorder correlations with other properties.

Only one property correlates strongly with structural disorder over all three levels: depth.

The depth of a method (class, or package) is just its position in a transitive dependency. In figure 2, on the left, the depth of method a() is 0, the depth of b() is 1, c() is 2, etc. These depths sum to 21. On the right, however, a() still has a depth of 0, but all other methods – being directly called from a() – each have a depth of 1, making the total depth of the right structure just 6.

Figure 2: A deep transitive dependency on the left, and shallow dependencies on the right.
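The two totals can be reproduced with a short sketch. It assumes seven methods, a() to g(), which matches the sums of 21 and 6 given above, and it assumes an acyclic call tree in which each method is reached by exactly one path. The class and helper names are hypothetical.

```java
import java.util.List;
import java.util.Map;

class DepthSketch {

    /** Sums the depth of 'method' and of everything it calls, directly or transitively. */
    static int totalDepth(Map<String, List<String>> calls, String method, int depth) {
        int total = depth;
        for (String callee : calls.getOrDefault(method, List.of())) {
            total += totalDepth(calls, callee, depth + 1);
        }
        return total;
    }

    /** Deep structure: a -> b -> c -> d -> e -> f -> g. */
    static Map<String, List<String>> chain() {
        return Map.of(
                "a", List.of("b"),
                "b", List.of("c"),
                "c", List.of("d"),
                "d", List.of("e"),
                "e", List.of("f"),
                "f", List.of("g"));
    }

    /** Shallow structure: a calls b, c, d, e, f and g directly. */
    static Map<String, List<String>> sunburst() {
        return Map.of("a", List.of("b", "c", "d", "e", "f", "g"));
    }

    public static void main(String[] args) {
        System.out.println(totalDepth(chain(), "a", 0));    // prints 21
        System.out.println(totalDepth(sunburst(), "a", 0)); // prints 6
    }
}
```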

It is this depth total which correlates with structural disorder. The deeper your code, the more disordered it’ll probably be. If you want to manage your program’s structural disorder, avoid deep dependencies[3].

One way to do this is to use a coordinator (sunburst) method which calls other methods to do the heavy lifting, with the coordinator reduced to a sequencing role. Then repeat this pattern at class and package level, where possible.
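A minimal sketch of such a coordinator might look like the following. The report-building task and all names are hypothetical, chosen only to show the shape of the pattern.

```java
class ReportCoordinatorSketch {

    /**
     * Coordinator: it only sequences the work. Each helper is called
     * directly from here, so each helper sits at depth 1.
     */
    static String buildReport(String rawInput) {
        String cleaned = clean(rawInput);
        int wordCount = countWords(cleaned);
        return format(cleaned, wordCount);
    }

    /** Trims the input and collapses runs of whitespace into single spaces. */
    static String clean(String raw) {
        return raw.trim().replaceAll("\\s+", " ");
    }

    /** Counts space-separated words in already cleaned text. */
    static int countWords(String text) {
        return text.isEmpty() ? 0 : text.split(" ").length;
    }

    /** Renders the final report line. */
    static String format(String text, int wordCount) {
        return text + " (" + wordCount + " words)";
    }

    public static void main(String[] args) {
        System.out.println(buildReport("  hello   messy  world "));
        // prints: hello messy world (3 words)
    }
}
```

The deeper alternative would have `clean()` call `countWords()`, which in turn calls `format()`, giving depths of 1, 2 and 3 instead of 1, 1 and 1. The helpers here know nothing about each other, so the sequencing lives in one place.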

Summary

A previous post introduced four evidence-based principles for code structure, where the justifying correlations were weak but non-negligible.

This post adds a fifth principle, “Manage depth,” but with a much stronger correlation, making the list of evidence-based structural principles now:

  1. Manage Size.
  2. Manage method Impact set.
  3. Manage absolute Potential coupling.
  4. Manage the number of Transitive dependencies.
  5. Manage Depth.

Reference: How to avoid messy code from our JCG partner Edmund Kirwan at the “A blog about software.” blog.

Edmund Kirwan

Edmund is a programmer with a telecoms company in Stockholm where he is currently working on a large-scale network simulator. In his spare time he thinks far too much about program structure.
1 Comment
Vadym Baranenko
7 years ago

Interesting article!
Maybe I didn’t get all the computations right, but if you are measuring disorder by the number of public methods, it kind of makes sense that there is a correlation with depth, simply because in a layered architecture the upper layers call the lower layers.
But you made me think about using package visibility more to define the API of a module because, in real life, if we talk about Java, it’s something rarely seen.