Software Development

Development Horror Story – Release Nightmare

Everyone has good stories about releases that went wrong, right? I’m no exception and I have a few good ones under my development career. These are usually very stressful at the time, but now me and my teammates can’t talk about these stories without laughing.
 
 
 
 
 
 
 
 
never-happened-qa

History

I think this happened around 2009. Me and my team had to maintain a medium to large legacy web application with around 500 k lines of code. This application was developed by another company, so we didn’t have the code. Since we were in charge now and needed the code to maintain it, they handed us the code in a zip file (first pointer that something was wrong)!

Their release process was peculiar to say the least. I’m pretty sure there are worst release procedures out there. This one consisted in copying the changed files (*.class, *.jsp, *.html, etc) to an exploded war folder on a Tomcat server. We also had three environments (QA, PRE, PROD) with different application versions and no idea which files were deployed on each. They also had a ticket management application with attached compiled files, ready to be deployed and no idea of the original sources. What could possibly go wrong here?

The Problem

Our team was able to make changes required by the customer and push them to PROD servers. We have done it a few times successfully, even with all the handicaps. Everything was looking good until we got another request for additional changes. These changes were only a few improvements in the log messages of a batch process. The batch purpose was to copy files sent to the application with financial data input to insert into a database. I guess that I don’t have to state the obvious: this data was critical to calculate financial movements with direct impact on the amounts paid by the application users.

After our team made the changes and perform the release, all hell went loose. Files were not being copied to the correct locations. Several data duplicated in the database and the file system. Financial transactions with incorrect amounts. You name it. A complete nightmare. But why? The only change was a few improvements in the log messages.

The Cause

The problem was not exactly related with the changed code. Look at the following files:

BatchConfiguration

public class BatchConfiguration {
    public static final String OS = "Windows";
}

And:

public class BatchProcess {
    public void copyFile() {
        if (BatchConfiguration.OS.equals("Windows")) {
            System.out.println("Windows");
        } else if (BatchConfiguration.OS.equals("Unix")) {
            System.out.println("Unix");
        }
    }

    public static void main(String[] args) {
        new BatchProcess().copyFile();
    }
}

This is not the real code, but for the problem purposes it was laid out like this. Don’t ask me about the why it was like this. We got it in the zip file, remember?

So we have here a variable which sets the expected Operating System and then the logic to copy the file is dependant on this. The server was running on a Unix box so the variable value was Unix. Unfortunately, all the developers were working on Windows boxes. I said unfortunately, because if the developer that implemented the changes was using Unix, everything would be fine.

Anyway, the developer changed the variable to Windows so he could proceed with some tests. Everything was fine, so he performs the release. He copied the resulting BatchProcess.class into the server. He didn’t bother about the BatchConfiguration, since the one on the server was configured to Unix right?

Maybe you already spotted the problem. If you haven’t, try the following:

  • Copy and build the code.
  • Execute it. Check the output, you should get Windows.
  • Copy the resulting BatchProcess.class to an empty directory.
  • Execute this one again. Use command line java BatchProcess

What happened? You got the output Windows, right?. Wait! We didn’t have the BatchConfiguration.class file in the executing directory. How is that possible? Shouldn’t we need this file there? Shouldn’t we get an error?

When you build the code, the java compiler will inline the BatchConfiguration.OS variable. This means that the compiler will replace the variable expression in the if statement with the actual variable value. It’s like having if ("Windows".equals("Windows"))

Try executing javap -c BatchProcess. This will show you a bytecode representation of the class file:

BatchProcess.class

public void copyFile();
    Code:
       0: ldc           #3                  // String Windows
       2: ldc           #3                  // String Windows
       4: invokevirtual #4                  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
       7: ifeq          21
      10: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
      13: ldc           #3                  // String Windows
      15: invokevirtual #6                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      18: goto          39
      21: ldc           #3                  // String Windows
      23: ldc           #7                  // String Unix
      25: invokevirtual #4                  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
      28: ifeq          39
      31: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
      34: ldc           #7                  // String Unix
      36: invokevirtual #6                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      39: return

You can confirm that all the variables are replaced with their constant values.

Now, returning to our problem. The .class file that was copied to the PROD servers had the Windows value set in. This messed everything in the execution runtime that handled the input files with the financial data. This was the cause of the problems I’ve described earlier.

Aftermath

Fixing the original problem was easy. Fixing the problems caused by the release was painful. It involved many people, many hours, pizza, loads of SQL queries, shell scripts and so on. Even our CEO came to help us. We called this the mUtils problem, since it was the original java class name with the code.

Yes, we migrated the code to something manageable. It’s now on a VCS with a tag for every release and version.

Roberto Cortez

My name is Roberto Cortez and I was born in Venezuela, but I have spent most of my life in Coimbra – Portugal, where I currently live. I am a professional Java Developer working in the software development industry, with more than 8 years of experience in business areas like Finance, Insurance and Government. I work with many Java based technologies like JavaEE, Spring, Hibernate, GWT, JBoss AS and Maven just to name a few, always relying on my favorite IDE: IntelliJ IDEA.Most recently, I became a Freelancer / Independent Contractor. My new position is making me travel around the world (an old dream) to customers, but also to attend Java conferences. The direct contact with the Java community made me want to become an active member in the community itself. For that reason, I have created the Coimbra Java User Group, started to contribute to Open Source on Github and launched my own blog (www.radcortez.com), so I can share some of the knowledge that I gained over the years.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Inline Feedbacks
View all comments
Back to top button