Blue-Green Deployment

Traditionally, we deploy a new release by replacing the current one. The old release is stopped, and the new one is brought up in its place. The problem with this approach is the downtime between the moment the old release is stopped and the moment the new one is fully operational. No matter how quickly you perform this process, there will be some downtime. It might be only a millisecond, or it can last for minutes or, in extreme situations, even hours. Monolithic applications introduce additional problems such as the need to wait a considerable amount of time until the application is initialized. People have tried to solve this issue in various ways, and most of them used some variation of the blue-green deployment process. The idea behind it is simple. At any time, one of the releases should be running, which means that, during the deployment process, we must deploy a new release in parallel with the old one. The new and the old releases are called blue and green.

At any given moment, at least one service release is up and running

We run one color as a current release, bring up the other color as a new release and, once it is fully operational, switch all the traffic from the current to the new release. This switch is often made with a router or a proxy service.
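The switch itself can be pictured as changing a single pointer. The following is a minimal sketch, not how any particular proxy implements it; the `Router` class, its method names and the addresses are all hypothetical and exist only for illustration.

```python
import threading


class Router:
    """Hypothetical stand-in for a proxy: maps a color to a backend address."""

    def __init__(self, backends, active):
        self._backends = dict(backends)  # e.g. {"blue": "host:port", ...}
        self._active = active
        self._lock = threading.Lock()

    def target(self):
        """Return the address that new requests should be sent to."""
        with self._lock:
            return self._backends[self._active]

    def switch(self, color):
        """Point all new traffic at `color`; return the previously active color."""
        if color not in self._backends:
            raise ValueError(f"unknown release color: {color}")
        with self._lock:
            previous, self._active = self._active, color
        return previous


# Assumed addresses; both releases are already running side by side.
router = Router({"blue": "127.0.0.1:8001", "green": "127.0.0.1:8002"}, active="blue")
before = router.target()            # blue is serving
previous = router.switch("green")   # green is fully operational, flip the traffic
after = router.target()             # green is serving
```

Because `switch` returns the previous color, the caller always knows which release to go back to if the new one misbehaves.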

With the blue-green process, we are not only removing the deployment downtime but also reducing the risk the deployment might introduce. No matter how well we tested our software before it reached the production node(s), there is always a chance that something will go wrong. When that happens, we still have the current version to rely on. There is no real reason to switch the traffic to the new release until it is tested enough that any reasonable possibility of a failure due to some specifics of the production node is ruled out. That usually means that integration testing is performed after the deployment and before the “switch” is made. Even if those verifications miss a problem (a false negative) and a failure appears after the traffic is redirected, we can quickly switch back to the old release and restore the system to the previous state. We can roll back much faster than if we needed to restore the application from some backup or do another deployment.
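The "test first, switch later, roll back if needed" logic can be sketched in a few lines. This is only an illustration under assumptions: the `state` dictionary, the `deploy` and `rollback` helpers and the two stand-in checks are hypothetical, and real checks would run integration tests against the new release on the production node.

```python
def deploy(state, new_color, checks):
    """Switch traffic to new_color only if every check passes.

    state  -- dict with keys "active" and "previous" (hypothetical bookkeeping)
    checks -- callables that take a color and return True when it looks healthy
    """
    failed = [check.__name__ for check in checks if not check(new_color)]
    if failed:
        return False, failed  # traffic stays on the current release
    state["previous"], state["active"] = state["active"], new_color
    return True, []


def rollback(state):
    """Send traffic back to the release that was active before the last switch."""
    if state["previous"] is None:
        raise RuntimeError("no previous release to roll back to")
    state["active"], state["previous"] = state["previous"], state["active"]


# Stand-in checks used only to make the sketch runnable.
def responds_to_ping(color):
    return True


def can_reach_database(color):
    return color != "green-broken"


state = {"active": "blue", "previous": None}
checks = [responds_to_ping, can_reach_database]

rejected = deploy(state, "green-broken", checks)  # fails a check, no switch
active_after_rejection = state["active"]
accepted = deploy(state, "green", checks)         # passes, traffic moves to green
active_after_deploy = state["active"]
rollback(state)                                   # a failure showed up later
active_after_rollback = state["active"]
```

Note that a rollback here is just another switch. Nothing is restored from a backup and nothing is redeployed, which is why it can be so fast.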

If we combine the blue-green process with immutable deployments (through VMs in the past and through Docker containers today), the result is a very powerful, secure and reliable deployment procedure that can be performed much more often. If the architecture is based on microservices in conjunction with Docker containers, we don’t need two nodes to perform the procedure and can run both releases side by side.

The most significant challenge with this approach is the database. In many cases, we need to upgrade the database schema in a way that supports both releases and only then proceed with the deployment. The problems that might arise from this database upgrade are often related to the time that passes between releases. When releases are done often, changes to the database schema tend to be small, making it easier to maintain compatibility across two releases. If weeks or months pass between releases, database changes can be so big that backward compatibility might be impossible or not worth the effort. If we are aiming towards continuous delivery or deployment, the period between two releases should be short or, if it isn’t, involve a relatively small amount of changes to the code base.
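One common way to keep a schema usable by both releases is to make changes additive. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `users` table, the `email` column and the two insert helpers are made-up examples, not taken from any real application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def blue_insert(name):
    """Old release: writes only the columns it knows about."""
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


def green_insert(name, email):
    """New release: also stores an email address."""
    conn.execute("INSERT INTO users (name, email) VALUES (?, ?)", (name, email))


blue_insert("ann")

# Additive migration applied before green is deployed. The new column is
# nullable, so blue's INSERT, which never mentions it, keeps working.
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")

blue_insert("bob")                    # old release still works
green_insert("cy", "cy@example.com")  # new release uses the new column

rows = conn.execute("SELECT name, email FROM users ORDER BY id").fetchall()
```

A change such as renaming or dropping a column would break the old release. That kind of cleanup is usually postponed until the old release is no longer needed for rollback.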

The Blue-Green Deployment Process

The blue-green deployment procedure, when applied to microservices packed as containers, is as follows.

The current release (for example, blue) is running on the server. All traffic to that release is routed through a proxy service. Microservices are immutable and deployed as containers.

Immutable microservice deployed as a container

When a new release (for example, green) is ready to be deployed, we run it in parallel with the current release. This way we can test the new release without affecting the users, since all the traffic continues to be sent to the current release.

New release of the immutable microservice deployed alongside the old release

Once we think that the new release is working as expected, we change the proxy service configuration so that the traffic is redirected to that release. Most proxy services will let the existing requests finish their execution using the old proxy configuration so that there is no interruption.

Proxy is re-configured to point to the new release

When all the requests sent to the old release have received responses, the previous version of the service can be removed or, even better, just stopped. If the latter option is used, rollback in case of a failure of the new release will be almost instantaneous since all we have to do is bring the old release back up.

The old release is removed
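The last two steps, switching while old requests finish and then stopping the old release, can be sketched as follows. This is a single-threaded illustration under assumptions: the `DrainingProxy` class and its methods are hypothetical, and a real proxy does this bookkeeping internally.

```python
class DrainingProxy:
    """Hypothetical proxy that tracks in-flight requests per release."""

    def __init__(self, active):
        self.active = active
        self.in_flight = {"blue": 0, "green": 0}
        self.running = {"blue": True, "green": True}

    def begin_request(self):
        """A request is pinned to whichever release is active when it starts."""
        color = self.active
        self.in_flight[color] += 1
        return color

    def end_request(self, color):
        """Record that a request pinned to `color` received its response."""
        self.in_flight[color] -= 1

    def switch(self, color):
        """Only requests that start after this call go to the new release."""
        self.active = color

    def stop_if_drained(self, color):
        """Stop (not remove) an inactive release once it has no open requests."""
        if color != self.active and self.in_flight[color] == 0:
            self.running[color] = False
        return not self.running[color]


proxy = DrainingProxy(active="blue")
slow = proxy.begin_request()                   # started on blue before the switch
proxy.switch("green")
fresh = proxy.begin_request()                  # started after the switch
stopped_early = proxy.stop_if_drained("blue")  # blue still has one open request
proxy.end_request(slow)
stopped_late = proxy.stop_if_drained("blue")   # blue is drained and gets stopped
```

The old release is only marked as stopped, never deleted, so bringing it back for a rollback stays cheap.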

Reference: Blue-Green Deployment from our JCG partner Viktor Farcic at the Technology conversations blog.

Viktor Farcic

Viktor Farcic is a Software Developer currently focused on transitions from Waterfall to Agile processes with special focus on Behavior-Driven Development (BDD), Test-Driven Development (TDD) and Continuous Integration (CI).

2 Comments
Kumar
8 years ago

Most agile development methodology timeframes are around 3 weeks of development and, say, around 2-3 weeks of testing. In all, a software release is performed only after 5-6 weeks. There could be multiple changes in databases as different projects go live in 1 Agile release cycle that are worked upon as a global team. This approach of deploying a single instance while the other instance is still supporting production may not be feasible. Even if a stored procedure is changed while the production environment is running, it can cause errors in transaction processing. I do not think this is feasible in a trading application…

Viktor Farcic
8 years ago
Reply to  Kumar

What you described is a waterfall process that you chose to rename to agile.
