Coding Guidelines – follow the rules
Getting the team to follow coding guidelines is important in maintenance to help ensure the consistency and integrity of the code base over time – and to help ensure software security (PPT). Of course teams may have to compromise on coding standards and style conventions, depending on what they have inherited in the code base; and teams that maintain multiple systems will have to follow different guidelines for each system.
In XP, teams are supposed to share a Metaphor: a simple high-level expression of the system architecture (the system is a production line, or a bill of materials) and common names and patterns that can be used to describe the system. It’s a fuzzy concept at best (PDF), a weak substitute for more detailed architecture or design, and it’s not of much practical value in maintenance. Maintenance teams have to work with the architecture and patterns that are already in place in the system.
What is important is making sure that the team has a common understanding of these patterns and the basic architecture so that the integrity isn’t lost – if it hasn’t been lost already. Getting the team together and reviewing the architecture, or reverse-engineering it, making sure that they all agree on it and documenting it in a simple way is important especially when taking over maintenance of a new system and when you are planning major changes.
Agile development teams start with simple designs and try to keep them simple. Maintenance teams have to work with whatever design and architecture that they inherit, which can be overwhelmingly complex, especially in bigger and older systems. But the driving principle should still be to design changes and new features as simple as the existing system lets you – and to simplify the system’s design further whenever you can.
Especially when making small changes, simple, just-enough design is good – it means less documentation and less time and less cost. But maintenance teams need to be more risk adverse than development teams – even small mistakes can break compatibility or cause a run-time failure or open a security hole. This means that maintainers can’t be as iterative and free to take chances, and they need to spend more time upfront doing analysis, understanding the existing design and working through dependencies, as well as reviewing and testing their changes for regressions afterwards.
Refactoring takes on a lot of importance in maintenance. Every time a developer makes a change or fix they should consider how much refactoring work they should do and can do to make the code and design clearer and simpler, and to pay off technical debt. What and how much to refactor depends on what kind of work they are doing (making a well-thought-out isolated change, or doing shotgun surgery, or pushing out an emergency hot fix) and the time and risks involved, how well they understand the code, how good their tools are (development IDEs for Java and .NET at least have good built-in tools that make many refactorings simple and safe) and what kind of safety net they have in place to catch mistakes – automated tests, code reviews, static analysis.
Some maintenance teams don’t refactor because they are too afraid of making mistakes. It’s a vicious circle – over time the code will get harder and harder to understand and change, and they will have more reasons to be more afraid. Others claim that a maintenance team is not working correctly if they don’t spend at least 50% of their time refactoring (PDF).
The real answer is somewhere in between – enough refactoring to make changes and fixes safe. There are cases where extensive refactoring, restructuring or rewriting code is the right thing to do. Some code is too dangerous to change or too full of bugs to leave the way it is – studies show that in most systems, especially big systems, 80% of the bugs can cluster in 20% of the code. Restructuring or rewriting this code can pay off quickly, reducing problems in production, and significantly reducing the time needed to make changes and test them as you go forward.
Testing is even more important and necessary in maintenance than it is in development. And it’s a major part of maintenance costs. Most maintenance teams rely on developers to test their own changes and fixes by hand to make sure that the change worked and that they didn’t break anything as a side effect. Of course this makes testing expensive and inefficient and it limits how much work the team can do. In order to move fast, to make incremental changes and refactoring safe, the team needs a better safety net, by automating unit and functional tests and acceptance tests.
It can take a long time to put in test scaffolding and tools and write a good set of automated tests. But even a simple test framework and a small set of core fat tests can pay back quickly in maintenance, because a lot changes (and bugs) tend to be concentrated in the same parts of the code – the same features, framework code and APIs get changed over and over again, and will need to be tested over and over again. You can start small, get these tests running quickly and reliably and get the team to rely on them, fill in the gaps with manual tests and reviews, and then fill out the tests over time. Once you have a basic test framework in place, developers can take advantage of TFD/TDD especially for bug fixes – the fix has to be tested anyways, so why not write the test first and make sure that you fixed what you were supposed to?
To get Continuous Testing to work, you need a Continuous Integration environment. Understanding, automating and streamlining the build and getting the CI server up and running and wiring in tests and static analysis checks and reporting can take a lot of work in an enterprise system, especially if you have to deal with multiple languages and platforms and dependencies between systems. But doing this work is also the foundation for simplifying release and deployment – frequent short releases means that release and deployment has to be made as simple as possible.
Onsite Customer / Product Owner
Working closely with the customer to make sure that the team is delivering what the customer needs when the customer needs it is as important in maintenance as it is in developing a new system. Getting a talented and committed Customer engaged is hard enough on a high-profile development project – but it’s even harder in maintenance. You may end up with too many customers with conflicting agendas competing for the team’s attention, or nobody who has the time or ability to answer questions and make decisions. Maintenance teams often have to make compromises and help fill in this role on their own.
But it doesn’t all fit….
Kilner’s main point of concern isn’t really with Agile methods in maintenance. It’s with incremental design and development in general – that some work doesn’t fit nicely into short time boxes. Short iterations might work ok for bug fixes and small enhancements (they do), but sometimes you need to make bigger changes that have lots of dependencies. He argues that while Agile teams building new systems can stub out incomplete work and keep going in steps, maintenance teams have to get everything working all at once – it’s all or nothing.
It’s not easy to see how big changes can be broken down into small steps that can be fit into short time boxes. I agree that this is harder in maintenance because you have to be more careful in understanding and untangling dependencies before you make changes, and you have to be more careful not to break things. The code and design will sometimes fight the kinds of changes that you need to make, because you need to do something that was never anticipated in the original design, or whatever design there was has been lost over time and any kind of change is hard to make.
It’s not easy – but teams solve these problems all the time. You can use tools to figure out how much of a dependency mess you have in the code and what kind of changes you need to make to get out of this mess. If you are going to spend “weeks, months, or even years” to make changes to a system, then it makes sense to take time upfront to understand and break down build dependencies and isolate run-time dependencies, and put in test scaffolding and tests to protect the team from making mistakes as they go along. All of this can be done in time boxed steps. Just because you are following time boxes and simple, incremental design doesn’t mean that you start making changes without thinking them through.
Read Working With Legacy Code – Michael Feathers walks through how to deal with these problems in detail, in both object oriented and procedural languages. What to do if it takes forever to make a change. How to break dependencies. How to find interception points and pinch points. How to find structure in the design and the code. What tests to write and how to get automated tests to work.
Changing data in a production system, especially data shared with other systems, isn’t easy either. You need to plan out API changes and data structure changes as carefully as possible, but you can still make data and database changes in small, structured steps.
To make code changes in steps you can use Branching by Abstraction where it makes sense (like making back-end changes) and you can protect customers from changes through Feature Flags and Dark Launching like Facebook and Twitter and Flickr do to continuously roll out changes – although you need to be careful, because if taken too far these practices can make code more fragile and harder to work with.
Agile development teams follow incremental design and development to help them discover an optimal solution through trial-and-error. Maintenance teams work this way for a different reason – to manage technical risks by breaking big changes down and making small bets instead of big ones.
Working this way means that you have to put in scaffolding (and remember to take it out afterwards) and plan out intermediate steps and review and test everything as you make each change. Sometimes it might feel like you are running in place, that it is taking longer and costing more. But getting there in small steps is much safer, and gives you a lot more control.
Teams working on large legacy code bases and old technology platforms will have a harder time taking on these ideas and succeeding with them. But that doesn’t mean that they won’t work. Yes, you can be Agile in maintenance.