
Service Discovery Inside A Docker Swarm Cluster

The text that follows contains excerpts from the Service Discovery Inside A Swarm Cluster chapter of The DevOps 2.1 Toolkit: Docker Swarm book.

Service Discovery In The Swarm Cluster

The old (standalone) Swarm required a service registry so that all its managers could have the same view of the cluster state. When instantiating the old Swarm nodes, we had to specify the address of a service registry. However, if you take a look at the setup instructions of the new Swarm (Swarm Mode introduced in Docker 1.12), you’ll notice that nothing is required beyond Docker Engine. You will not find any mention of an external service registry or a key-value store.
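As an illustration, a minimal Swarm Mode cluster can be created with nothing but the docker binary. The commands below are only a sketch; the IP address and the join token are placeholders you would replace with your own values.

```bash
# On the first node: initialize Swarm Mode. No external registry
# or key-value store is involved; the engine handles it internally.
docker swarm init --advertise-addr 192.168.99.100

# On any other node: join the cluster using the token printed by the
# init command (WORKER_TOKEN is a placeholder).
docker swarm join --token WORKER_TOKEN 192.168.99.100:2377
```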

Does that mean that Swarm does not need service discovery? Quite the contrary. The need for service discovery is as strong as ever, and Docker decided to incorporate it inside Docker Engine. It is bundled inside just as Swarm is. The internal process is, essentially, still very similar to the one used by the standalone Swarm, only with fewer moving parts. Docker Engine now acts as a Swarm manager, Swarm worker, and service registry.

The decision to bundle everything inside the engine provoked a mixed response. Some think that such a decision creates too much coupling and increases Docker Engine’s level of instability. Others think that such a bundle makes the engine more robust and opens the door to some new possibilities. While both sides have valid arguments, I am more inclined towards the opinion of the latter group. Docker Swarm Mode is a huge step forward, and it is questionable whether the same result could be accomplished without bundling a service registry inside the engine.

It is important to note that the service registry bundled inside the engine is for internal use only. We can neither access it nor use it for our own purposes.

Now that you know how Docker Swarm works, especially its networking, the question that might be on your mind is whether we need service discovery at all (beyond Swarm’s internal usage). In The DevOps 2.0 Toolkit, I argued that service discovery is a must and urged everyone to set up Consul or etcd as service registries, Registrator as a mechanism to register changes inside the cluster, and Consul Template or confd as a templating solution. Do we still need those tools?

Do We Need Service Discovery?

It is hard to provide a general recommendation on whether service discovery tools are needed when working inside a Swarm cluster. If we look at the need to find services as the main use case for those tools, the answer is usually no. We don’t need external service discovery for that. As long as all services that should communicate with each other are inside the same network, all we need is the name of the destination service. For example, all that the go-demo service needs to know to find the related database is that its DNS name is go-demo-db. The Docker Swarm Networking And Reverse Proxy chapter proved that proper networking usage is enough for most use cases.
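As a quick sketch of what that looks like in practice, the commands below create an overlay network and attach both services to it. The image names and the DB environment variable follow the go-demo examples used in the book and should be treated as assumptions.

```bash
# Create an overlay network shared by the services that need to communicate.
docker network create --driver overlay go-demo

# The database service. Its name (go-demo-db) becomes its DNS entry
# inside the go-demo network.
docker service create --name go-demo-db \
    --network go-demo \
    mongo

# The go-demo service finds the database simply by using the name go-demo-db.
docker service create --name go-demo \
    --network go-demo \
    -e DB=go-demo-db \
    vfarcic/go-demo
```

There is no need to know IP addresses, ports, or the number of replicas; Swarm’s internal DNS and load balancing take care of that.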

However, finding services and load balancing requests among them is not the only reason for service discovery. We might have other uses for service registries or key-value stores. We might need to store some information in a way that is distributed and fault tolerant.

An example of the need for a key-value store can be seen inside the Docker Flow: Proxy project. It is based on HAProxy, which is a stateful service. It loads the information from a configuration file into memory. Having stateful services inside a dynamic cluster represents a challenge that needs to be solved. Otherwise, we might lose state when a service is scaled, rescheduled after a failure, and so on.

Problems When Scaling Stateful Instances

Scaling services inside a Swarm cluster is easy, isn’t it? Just execute docker service scale SERVICE_NAME=NUMBER_OF_INSTANCES and, all of a sudden, the service is running multiple copies.

The previous statement is only partly true. The more precise wording would be that scaling stateless services inside a Swarm cluster is easy.
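As a quick illustration, scaling a stateless service such as go-demo might look as follows; the service name and the number of replicas are only examples.

```bash
# Scale the stateless go-demo service to five replicas.
docker service scale go-demo=5

# Confirm that the replicas are running and see which nodes they landed on.
docker service ps go-demo
```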

The reason that scaling stateless services is easy lies in the fact that there is no state to think about. An instance is the same no matter how long it runs. There is no difference between a new instance and one that has been running for a week. Since the state does not change over time, we can create new copies at any given moment, and they will all be exactly the same.

However, the world is not stateless. State is an unavoidable part of our industry. As soon as the first piece of information is created, it needs to be stored somewhere. The place we store data must be stateful. It has a state that changes over time. If we want to scale such a stateful service, there are, at least, two things we need to consider.

  1. How do we propagate a change of state in one instance to the rest of the instances?
  2. How do we create a copy (a new instance) of a stateful service and make sure that the state is copied as well?

We usually combine stateless and stateful services into one logical entity. A back-end service could be stateless and rely on a database service as external data storage. That way, there is a clear separation of concerns, and each of those services has its own lifecycle.

Before we proceed, I must state that there is no silver bullet that makes stateful services scalable and fault-tolerant. Throughout the book, I will go through a couple of examples that might, or might not, apply to your use case. An obvious, and very typical, example of a stateful service is a database. While there are some common patterns, almost every database provides a different mechanism for data replication. That, in itself, is enough to prevent us from having a definitive answer that would apply to all. We’ll explore the scalability of MongoDB later on in the book. We’ll also see an example with Jenkins that uses a file system for its state.

The first case we’ll tackle will be of a different type. We’ll discuss the scalability of a service that stores its state in a configuration file. To make things more complicated, the configuration is dynamic. It changes over time throughout the lifetime of the service. We’ll explore ways to make HAProxy scalable.

If we used the official HAProxy image, one of the challenges we would face is how to update the state of all the instances; that is, how to change the configuration and reload each copy of the proxy.

We can, for example, mount an NFS volume on each node in the cluster and make sure that the same host volume is mounted inside all HAProxy containers. At first glance, it might seem that this would solve the problem with the state since all instances would share the same configuration file. Any change to the config on the host would be available inside all the instances we would have. However, that, in itself, would not change the state of the service.
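A sketch of that approach might look as follows. It assumes that an NFS export is already mounted at /mnt/nfs/haproxy on every node; the path is a placeholder.

```bash
# Mount the shared host directory (backed by NFS) into every HAProxy replica.
# /usr/local/etc/haproxy is where the official haproxy image expects its config.
docker service create --name proxy \
    -p 80:80 -p 443:443 \
    --mount type=bind,source=/mnt/nfs/haproxy,target=/usr/local/etc/haproxy \
    haproxy
```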

HAProxy loads the configuration file during initialization, and it is oblivious to any changes we might make to the configuration afterward. For the change of the state of the file to be reflected in the state of the service, we need to reload it. The problem is that instances can run on any of the nodes inside the cluster. On top of that, if we adopt dynamic scaling (more on that later on), we might not even know how many instances are running. So, we’d need to discover how many instances we have, find out on which nodes they are running, get the ID of each of the containers, and, only then, send a signal to reload the proxy. While all this can be scripted, it is far from an optimal solution. Moreover, mounting an NFS volume introduces a single point of failure. If the server that hosts the volume fails, data is lost. Sure, we can create backups, but they would only provide a way to restore lost data partially. That is, we can restore a backup, but the data generated between the moment the last backup was created and the node failure would be lost.
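To get a feeling for how involved even the scripted version would be, a rough sketch might look like the one below. It assumes that the service is named proxy, that we have SSH access to every node, and that the output columns of docker service ps match the Docker version in use.

```bash
# Rough sketch only: find the nodes that run a replica of the proxy service.
# The awk column index depends on the output format of the Docker version.
NODES=$(docker service ps proxy | awk 'NR>1 {print $4}' | sort -u)

for NODE in $NODES; do
    # On each node, find the IDs of the proxy containers and send HAProxy
    # the HUP signal so that it reloads its configuration.
    ssh "$NODE" '
        for ID in $(docker ps -q --filter "name=proxy"); do
            docker kill -s HUP "$ID"
        done
    '
done
```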

An alternative would be to embed the configuration into HAProxy images. We could create a new Dockerfile that would be based on haproxy and add the COPY instruction that would add the configuration. That would mean that every time we want to reconfigure the proxy, we’d need to change the config, build a new set of images (a new release), and update the proxy service currently running inside the cluster. As you can imagine, this is also not practical. It’s too big of a process for a simple proxy reconfiguration.
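For completeness, a hypothetical version of that workflow is sketched below; the image tag and registry name are made up.

```bash
# Hypothetical workflow: bake the configuration into a custom image.
cat > Dockerfile <<'EOF'
FROM haproxy:1.6
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg
EOF

# Every configuration change means a new image build (a new release)...
docker build -t our-registry/proxy:1.1 .
docker push our-registry/proxy:1.1

# ...followed by an update of the service running inside the cluster.
docker service update --image our-registry/proxy:1.1 proxy
```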

Docker Flow: Proxy uses a different, less conventional, approach to the problem. It stores a replica of its state in Consul. It also uses an undocumented Swarm networking feature (at least at the time of this writing).

The DevOps 2.1 Toolkit: Docker Swarm

If you liked this article, you might be interested in The DevOps 2.1 Toolkit: Docker Swarm book. Unlike the previous title in the series (The DevOps 2.0 Toolkit: Automating the Continuous Deployment Pipeline with Containerized Microservices), which provided a general overview of some of the latest DevOps practices and tools, this book is dedicated entirely to Docker Swarm and the processes and tools we might need to build, test, deploy, and monitor services running inside a cluster.

The book is still under “development”. You can get a copy from LeanPub. It is also available as The DevOps Toolkit Series bundle. If you download it now, before it is fully finished, you will get frequent updates with new chapters and corrections. More importantly, you will be able to influence the direction of the book by sending me your feedback.

I chose the lean approach to book publishing because I believe that early feedback is the best way to produce a great product. Please help me make this book a reference for anyone wanting to adopt Docker Swarm for cluster orchestration and scheduling.
 

Viktor Farcic

Viktor Farcic is a Software Developer currently focused on transitions from Waterfall to Agile processes with special focus on Behavior-Driven Development (BDD), Test-Driven Development (TDD) and Continuous Integration (CI).