The History of Failed Initiatives

I have worked with many different clients, from small greenfield projects all the way to big ones in sectors like automotive, lottery, banking, and insurance. With few exceptions, the teams on those projects can be divided into those that started anew and believe they are using the latest and greatest ways to develop software, and those in charge of bigger projects that started a long time ago. The latter group tends to put so much energy into staying afloat that the latest and greatest is very low on their list of priorities.

Greenfield projects, by definition, are young and have not had the time to accumulate legacy code. It's easy to "brag" when you're working on one of them. You still use Java? You don't do continuous deployment? Your architecture is not based on microservices? Those are only a few of the questions they might be asking other teams.

The truth of those projects is that, given enough time, they too will become outdated. They too will be converted into something people work on but do not brag about. That's the nature of business. You create something with the intention to capitalize on it and then move on to another opportunity. There is no business reason to rewrite your software every time you discover a new way to build it. There are, of course, exceptions, but such cases are so few in number that they cannot be considered the rule.

Think about the companies we admire. Those are likely to be companies like Netflix, Google, Amazon, Docker, and so on. Docker1, for example, realized that they would be better off with a new interface. With the knowledge they have today, they would code Docker differently if they were starting over. Maybe they would design their interface differently. Maybe they would code it in a different language. There are many things they would have done differently, if only they had known then what they know now. But they can't start over and, even if they could, they would not want to. Their product is popular and there are many people who depend on it. They need to maintain backward compatibility. They need to maintain what they have today and continue capitalizing on it, while striving to continuously come up with improvements and stay ahead of their competition. Netflix is another good example. What they do is impressive, but one can still notice that they are falling behind. Their architecture and the tools they use are not the latest and greatest any more (which does not mean that they are far behind).

Some other companies might have chosen Java as their programming language of choice and would now be better off with Go. Others might have very low test coverage because they started when automated tests were not considered an integral part of the development life cycle. What I'm trying to say is that legacy is an (almost) inevitable part of any software. If there is anything in it that you would have done differently today, you have legacy code. You have code that is not what you'd want it to be, and you need to live with it until it is changed to match your current vision. But even if you do change all your code to be in line with what we consider the best way to solve certain problems today, tomorrow will bring something new and your code will become legacy again. That's the nature of the fast-developing and ever-changing industry we work in. Things change fast. What we consider a success today easily becomes a failure tomorrow.

A long time ago, I worked as a developer. We were getting requirements from analysts and deadlines from managers. What we produced would be sent to testers and then back to us a couple of times to fix problems. The separation was not only based on high-level types of tasks (analysts, developers, managers, testers, and so on) but also on lower levels, like front-end and back-end developers. One team would work on product A while another worked on product B. The systems were as complex as our division into departments. But life was great. We all knew what was expected of us and did our best to fulfill those expectations. We did not worry about the work that should be done by other teams, in the same way that a worker in a factory does not care about the whole process but only the part he's in charge of. That model was a success back then. Today? Today I would not accept being a developer in such a structure. It would be a sign of a project bound to fail or, even worse, led by someone who does not know what success means today.

Developers not being involved in discussions about requirements, blindly following some document, trying to reach arbitrary deadlines, and hoping that someone else will take care of the quality of their work is not the way we want to build software today. The result of such an organization was easiest to observe in the integration phases we had back then. All those different tasks done by isolated teams over a long period of time would come together. An integration engineer (IE) would come one morning, set up everything from the OS all the way up to the applications we had built, and bring the system up. That first day of the integration phase would be the day all of us came to the office in a bad mood because, even though we hoped for the best, we knew that the outcome would be bad. That was the day reality beat optimism. For some unknown (not to say mysterious) reason, we all hoped that this time things would be different. We hoped that this time the IE would bring up the whole system and everything would be fine. It never was.

The real question was not whether all those pieces produced by different teams would fit together into one working system. We knew they wouldn't. What we didn't know was whether we had days, weeks, or months left to fix all the problems that were bound to be discovered once everything was assembled. Still, looking back at those times today, the major problem was that we thought it was normal. We thought that was how you were supposed to do things.

With time we realized that automated testing is a must. We realized that integration problems cannot be fixed by waiting until the last moment and that automation is the key. So we started creating tests. Ten tests, a hundred tests, a thousand tests. The number of tests kept increasing and, with them, the problems those tests were creating. They were continuously failing for many different reasons, and only a few of those failures could be traced to bugs in the application code. With time, we started to ignore the tests because they were not reliable. Fixing problems caused by "flaky" tests became something to dread and, over time, we just stopped paying attention to them. Why would we spend hours fixing a failing test when the problem is likely to be anything but a bug in the application?

The data setup is not correct. Tests depend on each other. A third-party system is down. The list of reasons for failed tests was huge and, in most cases, unrelated to the quality of the code we wrote. We tried to fix one problem (the integration phase) by introducing another (unreliable tests). After all, chances are that tests written after the code will be so influenced by it that they won't find bugs and unfulfilled requirements but will merely reflect the current state of the software we're building. Today I think the tests we wrote at that time were a failure.

Then came Extreme Programming (XP). Some of us were introduced to its practices even before XP came into existence, some learned about it when it became popular, while others heard about it only yesterday. When I was introduced to it, it was like music to my ears. For example, using Test-driven development sounded like a logical thing to do. Write tests first. Use them as a form of requirements. Force yourself to think before writing the code. Repeat short red-green-refactor cycles. And it's not only TDD. All XP practices made perfect sense. It was like listening to a philharmonic orchestra playing Beethoven's 5th. Perfect tune. Nothing to add and nothing to remove.
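
As an illustration of that rhythm, here is a minimal sketch of one red-green-refactor cycle using JUnit in Java. The class, the discount rule, and the numbers are made up for the example; any xUnit framework would serve the same purpose.

    // Red: the test is written first and fails until the production class below exists and satisfies it.
    import static org.junit.jupiter.api.Assertions.assertEquals;
    import org.junit.jupiter.api.Test;

    class PriceCalculatorTest {
        @Test
        void appliesTenPercentDiscountAboveFiftyDollars() {
            PriceCalculator calculator = new PriceCalculator();
            assertEquals(90.0, calculator.totalFor(100.0), 0.001); // discount applied
            assertEquals(40.0, calculator.totalFor(40.0), 0.001);  // below the threshold, no discount
        }
    }

    // Green: the simplest code that makes the test pass.
    class PriceCalculator {
        double totalFor(double amount) {
            return amount > 50.0 ? amount * 0.9 : amount;
        }
    }

    // Refactor: with the test green, rename, extract, and clean up safely, then repeat the cycle.

The point is the order: the failing test comes first and drives the design, which is exactly what made TDD feel so natural.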

So we started applying XP practices with more enthusiasm than ever before. What was the result back then? A big success. How do I see those attempts today? They were failures. We were not ready to play Beethoven's 5th. Even when we became proficient in XP practices, we did not realize that there was much more to be done than simply becoming proficient and applying them.

We realized that changes in practices and tools require changes in software architecture. You can't efficiently write tests for all types of applications. Tight coupling between classes and applications makes writing tests a horrible and very time-consuming experience. The main culprit? Monolithic architecture.

Designing monolithic applications is great. We design them by dividing things horizontally into a few layers, throw in a few frameworks, decide on a few standards and, voila, we have a beautifully designed application that will be a pleasure to develop. It's the prettiest princess in the whole kingdom. After a month of development, she's still beautiful. A few more months pass, a few more layers have to be added, but nothing we should worry about. She might not be the greatest beauty any more, but her looks are still something to admire. She's pretty (if not beautiful), agile, slim, and easy to reason with. Fast forward a few years. She is an old lady. Her looks are… Let's just say we look at her only when we must. She's old, slow, and grumpy. No one can speak with her without a lot of patience.

Remember the layers? We have a lot of them now, and getting to the point means we need to go slowly. The worst problem is that we never know what will hurt her feelings. Anything we say, even something as benign as "hello world", can trigger some old memory that will make her collapse. Something that started great (I must say I was proud of most of my girls) turned into a "bad marriage". I was in love, and now I speak to her only when I must. And it's not for lack of trying. I tried makeup, spa treatments and what not, but the result was always a temporary improvement that could not stop her from aging. I would trade her for a younger girl, but the investment we made together during all this time is too big for us to separate peacefully.

How did I feel about those applications back then? I thought they were great. Each was a masterpiece in its own right. How do I feel about them today? They are failures. We should never have built such huge applications, in which with each passing month our ability to improve, change, and renew them diminishes more and more. Another failure when evaluated with the knowledge we have today. And the industry proved this point by introducing service-oriented architecture (SOA).

SOA looked like a great idea, and it still is. The problem with SOA is that big enterprises understood the marketing opportunity behind the concept and came up with an infinite number of products one can use to develop services. The best example is enterprise service bus (ESB) products. They became so popular that almost every bigger company has one. However, those products themselves (together with the code we put on top of them) are not much different from the monolithic applications we tried to get away from. Instead of making a change, we put another huge monster on top of the applications we thought should be replaced or renewed. Today we speak about microservices in an attempt to get away from the bad practices that grew up around SOA. They are an attempt to get back to the original idea. So we went back to the drawing board and started applying the microservices pattern. The results were beautiful.

We started building small, independent services with well-defined APIs. We could choose the programming language, frameworks, and libraries that best suited the task at hand. That in itself was a great improvement, since monolithic applications meant we had to stick with the stack and the logic of the existing application. With microservices we could develop faster, test faster, and deploy faster. We could scale them and easily recover from failures that, by the way, affected only part of the system. We were spending less time fighting ghosts of the past and more time focusing on the task at hand. Time to market decreased, quality increased, and the overall happiness of teams and customers showed great improvement. Finally, a success story that does not end with me saying it was a failure? Not really.

We had to pay the piper. Development improved at the expense of operations. Every hour development gained had to be moved to operations. It became a nightmare. We already had enough problems running operations with monolithic applications. With microservices, that problem multiplied tenfold. Instead of making sure that a few applications ran smoothly, we had to think about tens, hundreds, or even thousands of services deployed and run independently. You have probably noticed the pattern. It started great, and it turned into a failure. We needed tools that would help us manage the ever-increasing number of services.

Configuration management (CM) tools promised to solve server setup and deployments. Did they succeed? Partly. They made tasks previously done manually or through shell scripts much easier. With CM tools like CFEngine, Puppet, and Chef we could set up a new base server in no time and make sure it was always in the state we wanted it to be. The problem became evident with the deployment and configuration of our services. Monoliths proved to be easy. Even without CM tools, how hard can it be to configure a few applications on a few servers? Even if that number jumps to tens, it's still relatively low. With CM tools, a relatively easy task becomes even easier. The major difference was in speed and reliability. CM tools allowed us to accomplish in seconds what used to take minutes, hours, or even days. Even more importantly, they provided reliability. Unlike manual deployments and setup, with CM tools the results were always the same. They delivered what they promised. The cost was in the maintenance of CM code and scripts.

Managing, for example, Puppet or Chef code for a large organization is problematic at best. Besides the sheer size of the CM code, the real problem lies in its separation from the application code. Since all configurations are located outside the repository where your code resides (unless you have only one application and can therefore merge the two), it is enough to add a new property or setting to your application for problems to start to arise. In order for that change to be used during deployment, it needs to be communicated to those in charge of CM. You might think that developers can be in charge of both development and CM, but the tools (at least those used until recently) were complex enough to make the learning curve too steep for many developers already saturated with coding tasks.

And it's not only the complexity but also the promotion of the idea that deployment logic should be separated from the application code. That fostered an equal separation in tasks: some would develop while others took care of operations. Increased separation between interconnected tasks (development and deployment are connected, aren't they?) slows down the process and increases time to market. CM tools brought great improvements but, seen through today's prism, they sent us down the wrong track. While we benefited from automated configuration and deployment, we needed a better way to package our products. We needed something that would be deployed as-is, in order to avoid the infinite number of configuration combinations that quickly turn into a nightmare even with CM tools. We needed immutable servers.

Immutable servers originated as VMs. We would build a new VM with everything needed for an application to run. Once built, we could move those VMs through the deployment pipeline (manual or automated) all the way until they reached production. While before we were deploying a few artifacts (JARs, DLLs, static files, and so on) and (often a huge number of) configuration files onto already running servers, with immutable deployments we would use the same VM in all environments. We would bring down the VM running the existing release (unless we were doing a blue-green deployment) and put the newly built one in its place. That way we could guarantee that what was tested was exactly the same as what was put into production and, at the same time, reduce configuration to a minimum.
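
To make the blue-green part of that procedure concrete, here is a rough sketch in Java of the traffic switch between two identical environments. The router class, the URLs, and the health check are hypothetical and only simulate the flip; in a real setup a load balancer or proxy plays this role.

    import java.util.Map;

    // Two identical environments ("blue" and "green"); only the active one receives traffic.
    class BlueGreenRouter {
        private final Map<String, String> environments; // color -> base URL
        private String active;

        BlueGreenRouter(Map<String, String> environments, String initiallyActive) {
            this.environments = environments;
            this.active = initiallyActive;
        }

        String activeUrl() {
            return environments.get(active);
        }

        // The new release is deployed to the idle environment, verified, and only then does traffic flip in one step.
        void release(String idleColor, HealthCheck check) {
            if (!check.isHealthy(environments.get(idleColor))) {
                throw new IllegalStateException("New release failed its health check; traffic left untouched");
            }
            active = idleColor; // the old environment stays up, so rollback is just another flip
        }

        interface HealthCheck {
            boolean isHealthy(String url);
        }

        public static void main(String[] args) {
            BlueGreenRouter router = new BlueGreenRouter(
                    Map.of("blue", "http://10.0.0.1", "green", "http://10.0.0.2"), "blue");
            router.release("green", url -> true); // pretend the new build passed its checks
            System.out.println("Traffic now goes to " + router.activeUrl());
        }
    }

Because the previous release keeps running in the idle environment, a bad deployment never reaches users and rolling back is instantaneous.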

The problem was the overhead VMs introduced (mainly the OS). That overhead was not so noticeable with monolithic applications, since they were few in number. With the microservices architecture, few becomes many, and many OSs are very resource-demanding. Suddenly we needed more hardware than before. The dream of immutable deployments lived on, but the implementation using VMs died. Another great idea that turned into a failure. When that failure was joined with other complications, microservices became the architectural flavor practiced only by a very small number of teams. The benefits were too few compared to the problems they created.

At this moment you might wonder why I am such a big proponent of microservices if the benefits they bring to the table are overshadowed by so many problems. Well… That was the past and we live in the present. Today we are building on top of the knowledge from the past and can count on tools and ideas we didn't have before. We design our systems around small, independent services. We prefer small and specialized tools over big do-it-all products. We use immutable servers packaged as containers and deployed using the blue-green procedure. We use service discovery instead of static configurations maintained with CM tools.

Domain-driven design taught us how to organize our services in a more meaningful way. Continuous delivery and deployment taught us that it is not enough to install Jenkins, throw in a few builds and tests, and call it continuous integration. We are focused on continuous growth and zero downtime, and we are trying to build scalable and fault-tolerant systems that would have sounded like science fiction not many years ago.

How do I feel about the projects I worked on? I was proud of most of them, but today I consider them failures. Today I would do them differently, and today I think of them as legacy. They are old (everything done more than a year ago is old, isn't it?) and they do not fulfill what I consider the minimum requirements for a successful project. They might have been successes in the past, but today they are failures. The real problem is not that they are legacy. It's something else.

The problem is that an individual human brain cannot keep up with all the improvements the world is throwing at us on a daily basis. We cannot keep up, and we are bound to fall and be left behind. And that is the moment that separates great professionals from the rest of us. It's not a question of whether you'll fail or not. We all do, even though some of us do not even realize it. The real question is how you get up.

Each improvement raises the bar that will be reached by someone else and turns that improvement into a failure. We can either learn from those failures and improve the way we work, or ignore them and stay locked in our own immutable world until it's too late to get back on track.

  1. I am not associated with Docker and my speculations might easily be wrong. If that's the case, please consider this an example and replace it with (almost) any other company that has existed for a while.
Reference: The History of Failed Initiatives from our JCG partner Viktor Farcic at the Technology Conversations blog.

Viktor Farcic

Viktor Farcic is a Software Developer currently focused on transitions from Waterfall to Agile processes with special focus on Behavior-Driven Development (BDD), Test-Driven Development (TDD) and Continuous Integration (CI).