Jim Bird

About Jim Bird

Jim is an experienced CTO, software development manager and project manager, who has worked on high-performance, high-reliability mission-critical systems for many years, as well as building software development tools. His current interests include scaling Lean and Agile software development methodologies, software security and software assurance.

Rule of 30 – When is a method, class or subsystem too big?

A question that constantly comes up from people who care about writing good code is: what’s the right size for a method or function, or a class, or a package or any other chunk of code?
 
At some point any piece of code can be too big to understand properly – but how big is too big? It starts at the method or function level.
 
In Code Complete, Steve McConnell says that the theoretical best maximum limit for a method or function is the number of lines that can fit on one screen (i.e., that a developer can see at one time). He then goes on to reference studies from the 1980s and 1990s which found that the sweet spot for functions is somewhere between 65 lines and 200 lines: routines this size are cheaper to develop and have fewer errors per line of code.

However, at some point beyond 200 lines you cross into a danger zone where code quality and understandability will fall apart: code that can’t be tested and can’t be changed safely. Eventually you end up with what Michael Feathers calls “runaway methods”: routines that are several hundreds or thousands of lines long and that are constantly being changed and that continuously get bigger and scarier.

Patrick Dubroy looks deeper into this analysis of method length, and points to a more modern study from 2002 showing that code with shorter routines has fewer defects overall, which matches most people’s intuition and experience.

Smaller must be better

Bob Martin takes the idea that “if small is good, then smaller must be better” to an extreme in Clean Code:

“The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that. Functions should not be 100 lines long. Functions should hardly ever be 20 lines long.”

Martin admits that “This is not an assertion that I can justify. I can’t produce any references to research that shows that very small functions are better.” So like many other rules or best practices in the software development community, this is a qualitative judgement made by someone based on their personal experience writing code – more of an aesthetic argument – or even an ethical one – than an empirical one. Style over substance.

The same “small is better” guidance applies to classes, packages and subsystems – all of the building blocks of a system. Code Complete cites a 1996 study which found that classes with more routines had more defects. Like functions, according to Clean Code, classes should also be “smaller than small”. Some people recommend 200 lines as a good limit for a class (not a method), or as few as 50-60 lines (in Ben Nadel’s Object Calisthenics exercise), and that a class should consist of “less than 10” or “not more than 20” methods. The famous C3 project – where Extreme Programming was born – had 12 methods per class on average. And there should be no more than 10 classes per package.

PMD, a static analysis tool that helps to highlight problems in code structure and style, defines some default values for code size limits: 100 lines per method, 1000 lines per class, and 10 methods in a class. Checkstyle, a similar tool, suggests different limits: 50 lines in a method, 1500 lines in a class.

Rule of 30

Looking for guidelines like this led me to the “Rule of 30” in Refactoring in Large Software Projects by Martin Lippert and Stephen Roock:
 
“If an element consists of more than 30 subelements, it is highly probable that there is a serious problem”:

  • Methods should not have more than an average of 30 code lines (not counting line spaces and comments).
  • A class should contain an average of less than 30 methods, resulting in up to 900 lines of code.
  • A package shouldn’t contain more than 30 classes, thus comprising up to 27,000 code lines.
  • Subsystems with more than 30 packages should be avoided. Such a subsystem would count up to 900 classes with up to 810,000 lines of code.
  • A system with 30 subsystems would thus possess 27,000 classes and 24.3 million code lines.
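
The arithmetic behind these ceilings is just repeated multiplication by 30. A minimal sketch in Python, assuming every level is filled exactly to the limit (the function and constant names are illustrative and do not come from the book):

```python
LIMIT = 30  # the "30 subelements" threshold from the rule
LEVELS = ["method", "class", "package", "subsystem", "system"]


def rule_of_30_ceilings(limit=LIMIT):
    """Upper bound on code lines per element if every level hits the limit."""
    ceilings = {}
    lines = 1
    for level in LEVELS:
        lines *= limit  # each level holds `limit` elements of the level below
        ceilings[level] = lines
    return ceilings


for level, lines in rule_of_30_ceilings().items():
    print(f"{level:<10} up to {lines:>12,} code lines")
```

Changing `limit` shows how quickly the ceilings move: the system-level figure grows with the fifth power of the per-level limit.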

What does this look like? Take a biggish system of 1 million NCLOC (non-comment lines of code). If every element stays within the limit of 30, this should break down into at least:

  • 30,000+ methods
  • 1,000+ classes
  • 30+ packages
  • Hopefully more than 1 subsystem
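
These minimums come from dividing by 30 at each level and rounding up. A rough sketch in Python, assuming the code is packed as tightly as the rule allows (the function name is hypothetical):

```python
def minimum_elements(total_lines, limit=30):
    """Fewest elements needed at each level if none may exceed `limit` children."""
    counts = {}
    remaining = total_lines
    for level in ("methods", "classes", "packages", "subsystems"):
        remaining = (remaining + limit - 1) // limit  # integer ceiling division
        counts[level] = remaining
    return counts


for level, count in minimum_elements(1_000_000).items():
    print(f"at least {count:,} {level}")
```

Real systems are never packed this evenly, so the actual counts needed to stay inside the rule would be higher than these lower bounds.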

How many systems in the real world look like this, or close to this – especially big systems that have been around for a few years?

Are these rules useful? How should you use them?

Using code size as the basis for rules like this is simple: easy to see and understand. Too simple, many people would argue: a better indicator of when code is too big is cyclomatic complexity or some other measure of code quality. But some recent studies show that code size actually is a strong predictor of complexity and quality – that

“complexity metrics are highly correlated with lines of code, and therefore the more complex metrics provide no further information that could not be measured simply with lines of code”.

In ‘Beyond Lines of Code: Do We Need More Complexity Metrics?’ in Making Software, the authors go so far as to say that lines of code should always be considered as the ‘first and only metric’ for defect prediction, development and maintenance models.

Recognizing that simple sizing rules are arbitrary, should you use them, and if so how?

I like the idea of rough and easy-to-understand rules of thumb that you can keep in the back of your mind when writing code or looking at code and deciding whether it should be refactored. The real value of a guideline like the Rule of 30 is when you’re reviewing code and identifying risks and costs.
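
As an illustration of using the rule as a review aid rather than a gate, here is a small sketch that lists functions whose code-line count (ignoring blank lines and comments) exceeds 30. It is an assumption-laden example, not a replacement for PMD or Checkstyle: it targets Python source, needs Python 3.8 or later, counts docstrings as code, and the names `code_lines` and `oversized_functions` are made up for this sketch.

```python
import ast
import io
import tokenize

RULE_OF_30 = 30  # threshold from the Rule of 30; a review hint, not a hard limit

# Token types that do not make a line count as code.
_NON_CODE = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def code_lines(source):
    """Return the set of line numbers that hold at least one code token."""
    lines = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type not in _NON_CODE:
            # A multi-line token (e.g. a triple-quoted string) covers every line it spans.
            lines.update(range(tok.start[0], tok.end[0] + 1))
    return lines


def oversized_functions(source, limit=RULE_OF_30):
    """Return sorted (name, code_line_count) pairs for functions longer than `limit`."""
    counted = code_lines(source)
    report = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            span = range(node.lineno, node.end_lineno + 1)
            size = sum(1 for n in span if n in counted)
            if size > limit:
                report.append((node.name, size))
    return sorted(report)
```

Calling `oversized_functions(open(path).read())` on a file under review returns a list to read through by hand; nothing is rejected automatically, which keeps the number as a prompt for a closer look rather than a rule to be satisfied.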

But enforcing these rules in a heavy-handed way on every piece of code as it is being written is foolish. You don’t want to stop when you’re about to write the 31st line in a method – it would slow work down to a crawl. And forcing everyone to break code up to fit arbitrary size limits will make the code worse, not better – the structure will be dominated by short-term decisions.

As Jeff Langr points out in his chapter discussing Kent Beck’s four rules of Simple Design in Clean Code:

“Our goal is to keep our overall system small while we are also keeping our functions and classes small. Remember however that this rule is the lowest priority of the four rules of Simple Design. So, although it’s important to keep class and function count low, it’s more important to have tests, eliminate duplication, and express yourself.”

 
Sometimes it will take more than 30 lines (or 20 or 5 or whatever the cut-off is) to get a coherent piece of work done. It’s more important to be careful in coming up with the right abstractions and algorithms and to write clean, clear code – if a cut-off guideline on size helps to do that, use it. If it doesn’t, then don’t bother.
 

Reference: Rule of 30 – When is a method, class or subsystem too big? from our JCG partner Jim Bird at the Building Real Software blog.


5 Responses to "Rule of 30 – When is a method, class or subsystem too big?"

  1. pschwarz says:

    After saying “Functions should hardly ever be 20 lines long”, Bob Martin goes on to say, of a program written by Kent Beck, “every function was just 2, or 3, or 4 lines long, each was transparently obvious, each told a story, and each led you to the next in a compelling order”. That’s how short your functions should be.

    • Does that scale to 1 million lines of code?

    • Anonymous Coward says:

      I’m just reading Bob Martin’s book. I think he doesn’t consider an aspect which is IMO crucial to how the human brain works. We don’t work with code of a given size, we work with ensembles of symbols. The more symbols in a given context, the harder that context is to comprehend. Contexts are hierarchical, and roughly identical to programming artefacts – projects, packages, classes and methods.

      So if you want to keep your system comprehensible at all levels, i.e. keep all the contexts that make up your system comprehensible, there are two things you should do:
      * make sure you can reason in any given context without needing to dive into the internals of that context’s vocabulary or the vocabulary of that context’s collaborators (i.e. you don’t need to look at classes in other packages, be they siblings or sub-packages, to understand a package’s structure, nor do you need to look at the implementation details of a given method or at the interface of other classes to understand how the class as a whole works)
      * keep the size of each context/scope roughly uniform – i.e. if you have at most 10 classes in most packages, classes with 50 methods of 2-3 lines break the uniformity, whereas classes with 10 methods of 10-15 lines do so less.

      Bob Martin’s example of Kent Beck’s code with very small methods refers to a quite small piece of code – code to draw a graphical effect on screen. He doesn’t say anything about class or package structure or the overall size of the code, just about the method length. IMO this is too little to judge whether the max. 5 lines per method contributed to code readability or not.

      My thesis is that there’s no absolute truth about how large or how small a given scope for reasoning should be. Obviously, smaller scopes, i.e. scopes with fewer symbols in them, are easier to comprehend. But splitting things up too much at one level makes the next level harder to comprehend, or requires you to create a deeper scope nesting hierarchy, both of which increase complexity and reduce comprehensibility. As such, Bob Martin’s general advice that you should split till you drop only makes sense in systems of infinite size, IMO – there, you have scope nesting so deep that any attempt to limit this depth to make the system as a whole more comprehensible is hopeless.

      Such systems are enterprise monoliths with many millions of lines of code (remember that an older version of Windows was rumoured to have beyond 1E+8 LOCs). This may have been a reasonable architecture a decade ago, but nobody does this anymore today. Nowadays, the orthodox way to organize systems is more in line with Netflix’ microservices concept. In such systems, limiting the scope nesting depth is obviously realistic, and as such it makes sense to consider the number of statements in a method, the number of methods in a class, the number of classes in a package, the number of packages in a project _and_ the scope nesting depth in relation to each other.

      Just my opinion.

  2. dapeng liu says:

    “a class avg 30 methods, avg 900 lines” is absurd; any class that grows beyond 300 lines is absurd, IMO

  3. James S says:

    I have to say, this is absolutely futile speculation. How can you quantify how large a function should be when a lot of it is 1 – domain specific, 2 – language specific (and more so API specific), 3 – purpose specific? If you are coding some realtime system in C, or doing systems programming in C++, especially where there are complicated algorithms in which separation would be arbitrary, imposing a metric like that makes no sense at all. Let’s look at ownership, something a Java coder rarely does (usually just hoping the GC figures it out and removes it like it should). You would separate classes by delegation. Even breaking it further apart you may have to eventually keep a chain of ownership. Unfortunately that could be larger than expected.
    … How many lines you can see on a screen? Is that a serious example? Are we talking text mode? 1080p? What?
    For: _After saying “Functions should hardly ever be 20 lines long”, Bob Martin goes on to say, of a program written by Kent Beck, “every function was just 2, or 3, or 4 lines long, each was transparently obvious, each told a story, and each led you to the next in a compelling order”. That’s how short your functions should be._

    Arbitrary, we aren’t writing a haiku.

    I can pick any random example, first thing that will pop into my head that you’d not be able to.. How about this?
    Can you write a radix sort based on mod 7 of a list of signed integers, where you need to keep track of the number of each that falls into the same bucket as well as returning the value for the mean bucket of that list, in 2 – 3 lines? How would you break that up without actually re-iterating over the same list? This whole concept seems pretty myopic to me; it sounds like it’s coming from an EA with no concept of anything outside the Enterprise world where you’re just running simple business rules.

Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.