This article focuses on an advanced service mesh topology: the cross-cluster mesh model. The terms multi-cluster and cross-cluster are often used side by side. Multi-cluster denotes a single central mesh control plane shared by all the clusters in the mesh topology, whereas cross-cluster assumes a distinct mesh control plane per cluster. The cross-cluster topology is a decentralised approach to forming a single logical mesh in which each cluster runs its own control plane and data plane components.
Some of the use cases for the multi-cluster or cross-cluster mesh topology are:
- High availability across regions: You could set up clusters in two different regions in active-active mode, i.e. the services in both clusters serve traffic. Traffic can also be load balanced across the clusters, and the mesh can fail over between regions if the cluster in one region becomes unavailable.
- Multi-cloud deployment: You could set up clusters with different cloud providers. The operating procedure will be specific to each cloud provider.
- Service deployment strategy: You can adopt canary or blue/green deployments with a multi-cluster setup. Depending on the requirement or test case, you can allocate a certain percentage of traffic to each cluster.
This article explains how to implement the cross-cluster Anthos Service Mesh (ASM) model in a single network. ASM is built on Istio, an open source service mesh framework. The use case adopts a single VPC network to showcase service-to-service communication across clusters in different regional subnets, along with service load balancing. Although the illustration uses two clusters, the same setup works with more than two.
This walkthrough assumes the following:
- You have already set up a Google Cloud project with a single VPC and two regional subnets. Any two available regions will do; this article uses Mumbai as one region and Singapore as the other.
- You have an Anthos GKE cluster set up in each region, with ASM version 1.8.3 installed. To learn how to install ASM, see the ASM installation documentation.
- The ASM certificate authority (CA) used is Mesh CA (only available for GKE clusters). You could also use Citadel CA as an alternative.
Set the cluster context
As a first step, identify the context of each cluster. The kubectl config get-contexts command lists the available cluster contexts.
For GKE clusters, the context name follows the pattern gke_project-id_cluster-location_cluster-name. Assign the context names to the $ctx1 and $ctx2 environment variables, representing cluster one and cluster two respectively.
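As a minimal sketch of this step, the snippet below lists the known contexts and then assembles the two context names. The project ID and the cluster locations are hypothetical placeholders (asia-south1 is the Mumbai region and asia-southeast1 is the Singapore region), and GKE prefixes its context names with gke_. Substitute your own values, or simply copy the names that kubectl prints.

```shell
# List the context names known to kubectl (skipped if kubectl is not installed).
if command -v kubectl >/dev/null 2>&1; then
  kubectl config get-contexts -o name || true
fi

# Hypothetical placeholder values; replace them with your own.
PROJECT_ID="my-project"
CLUSTER_1="cluster-1"; LOCATION_1="asia-south1"      # Mumbai
CLUSTER_2="cluster-2"; LOCATION_2="asia-southeast1"  # Singapore

# GKE context names follow gke_<project-id>_<location>_<cluster-name>.
export ctx1="gke_${PROJECT_ID}_${LOCATION_1}_${CLUSTER_1}"
export ctx2="gke_${PROJECT_ID}_${LOCATION_2}_${CLUSTER_2}"

echo "ctx1=${ctx1}"
echo "ctx2=${ctx2}"
```

Exporting the variables makes them available to every later command run from the same shell session.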
Set up endpoint discovery between clusters
In this step you will enable each cluster to discover the service endpoints of its counterpart, so that cluster one discovers the service endpoints of cluster two and vice versa.
You enable this by creating, for each cluster, a secret that grants access to that cluster's Kubernetes API server. Each secret carries the credentials needed to reach that API server, and both clusters trust a common root CA, in this case Mesh CA. You then apply each secret to the other cluster. Once the secrets are exchanged, each cluster can see the service endpoints of the other.
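The exchange can be sketched as follows, assuming the istioctl binary shipped with ASM 1.8 is on the PATH and that $ctx1 and $ctx2 are already set. In the 1.8 release line the subcommand lives under istioctl x (experimental). The helper name exchange_remote_secret is made up for this illustration, and cluster-1 and cluster-2 are the cluster names used in this article.

```shell
# Create a remote secret from one cluster and apply it to the other.
# Arguments: source context, destination context, source cluster name.
exchange_remote_secret() {
  istioctl x create-remote-secret --context="$1" --name="$3" \
    | kubectl apply -f - --context="$2"
}

# Run the exchange in both directions when the tools and contexts are available.
if command -v istioctl >/dev/null 2>&1 && command -v kubectl >/dev/null 2>&1 \
   && [ -n "${ctx1:-}" ] && [ -n "${ctx2:-}" ]; then
  exchange_remote_secret "${ctx1}" "${ctx2}" cluster-1
  exchange_remote_secret "${ctx2}" "${ctx1}" cluster-2
fi
```

The first call lets cluster two watch the API server of cluster one, and the second call does the reverse.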
Set up the application
The application for this use case is a simple Node.js application that prints the service name. Below is the sample code:
You will create four distinct deployments (containers) of this application: nodeapp1 (version 1) and nodeapp3 in the first cluster, and nodeapp1 (version 2) and nodeapp2 in the second cluster.
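One way to lay out these four deployments is sketched below. Everything not named in the article is an assumption: the image repository and tags are hypothetical, the app and version labels follow a common Istio convention rather than anything the article specifies, and the namespace is assumed to be already enabled for sidecar injection. Container port 9000 matches the port used by the services in this article.

```shell
# Print a Deployment manifest.
# Arguments: deployment name, app label, version label, container image.
deployment_manifest() {
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: $1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: $2
      version: $3
  template:
    metadata:
      labels:
        app: $2
        version: $3
    spec:
      containers:
      - name: $2
        image: $4
        ports:
        - containerPort: 9000
EOF
}

# Hypothetical image repository; replace it with your own.
IMAGE_REPO="gcr.io/my-project"

# Apply the manifests only when kubectl and both contexts are available.
if command -v kubectl >/dev/null 2>&1 && [ -n "${ctx1:-}" ] && [ -n "${ctx2:-}" ]; then
  deployment_manifest nodeapp1-v1 nodeapp1 v1 "${IMAGE_REPO}/nodeapp1:v1" | kubectl apply --context="${ctx1}" -f -
  deployment_manifest nodeapp3    nodeapp3 v1 "${IMAGE_REPO}/nodeapp3:v1" | kubectl apply --context="${ctx1}" -f -
  deployment_manifest nodeapp1-v2 nodeapp1 v2 "${IMAGE_REPO}/nodeapp1:v2" | kubectl apply --context="${ctx2}" -f -
  deployment_manifest nodeapp2    nodeapp2 v1 "${IMAGE_REPO}/nodeapp2:v1" | kubectl apply --context="${ctx2}" -f -
fi
```

Both nodeapp1 deployments share the label app: nodeapp1, so a single nodeapp1 service selecting on that label can front pods of either version.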
Our mesh topology will look like the following:
We will use the nodeapp1 service to demonstrate load balancing, where a request to the common nodeapp1 service can print either ‘version 1’ or ‘version 2’. We will also demonstrate communication between two different services, i.e. nodeapp3 and nodeapp2. All the services can communicate with each other through direct endpoint discovery. No gateway routing is involved because all the services are part of the same VPC.
Our Kubernetes resource deployment setup will look like the following:
| Cluster Name | Kubernetes Service | Service Type | Ports | Kubernetes Deployment |
|---|---|---|---|---|
| cluster-1 | nodeapp1 | ClusterIP | 80:9000 | nodeapp1-v1 |
| cluster-1 | nodeapp3 | ClusterIP | 80:9000 | nodeapp3 |
| cluster-1 | nodeapp2 | ClusterIP | 80:9000 | none (service only) |
| cluster-2 | nodeapp1 | ClusterIP | 80:9000 | nodeapp1-v2 |
| cluster-2 | nodeapp2 | ClusterIP | 80:9000 | nodeapp2 |
| cluster-2 | nodeapp3 | ClusterIP | 80:9000 | none (service only) |
The mesh uses Kubernetes DNS to resolve a service name to its endpoint. For the DNS lookup to succeed, the target service must be defined in both clusters, even if no pods of that service run in the client (calling) cluster. If no endpoint is found in the calling cluster, the mesh routes the request to the other cluster.
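The requirement above can be sketched by applying the same Service definition to both clusters. This is an illustration under assumptions: the app selector label and the port name http are not taken from the article, and 80:9000 is read here as service port 80 forwarding to container port 9000.

```shell
# Print a ClusterIP Service manifest for the given service name.
# Service port 80 forwards to container port 9000 (the 80:9000 mapping).
service_manifest() {
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: $1
spec:
  type: ClusterIP
  selector:
    app: $1
  ports:
  - name: http
    port: 80
    targetPort: 9000
EOF
}

# Define every service in every cluster, including clusters that run no pods for it.
if command -v kubectl >/dev/null 2>&1 && [ -n "${ctx1:-}" ] && [ -n "${ctx2:-}" ]; then
  for svc in nodeapp1 nodeapp2 nodeapp3; do
    for ctx in "${ctx1}" "${ctx2}"; do
      service_manifest "${svc}" | kubectl apply --context="${ctx}" -f -
    done
  done
fi
```

In cluster one the nodeapp2 service then has no local pods behind it, but its name still resolves, which is what allows the mesh to send the request to the endpoints discovered in cluster two.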
Testing service to service communication
To test cross-cluster communication, you can call the nodeapp1 service from the nodeapp3 pod.
Invoke this multiple times and you will see load balancing in action: the output comes from both versions of the nodeapp1 service.
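A repeated call can be sketched like this. It assumes that the nodeapp3 deployment has a container named nodeapp3 and that curl is available inside that container; neither detail is confirmed by the article. The helper name call_service is made up for this illustration.

```shell
# Call a service from inside a deployment's pod.
# Arguments: kubectl context, calling deployment (also used as the container name), target service.
call_service() {
  kubectl exec --context="$1" "deploy/$2" -c "$2" -- curl -s "http://$3"
}

# Send ten requests from nodeapp3 (cluster one) to nodeapp1 and count each distinct response.
if command -v kubectl >/dev/null 2>&1 && [ -n "${ctx1:-}" ]; then
  for i in 1 2 3 4 5 6 7 8 9 10; do
    call_service "${ctx1}" nodeapp3 nodeapp1
    echo
  done | grep -v '^$' | sort | uniq -c
fi
```

If cross-cluster load balancing is working, the tally should contain responses from both versions of nodeapp1, although the exact split varies from run to run.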
You can also test communication between nodeapp3 and nodeapp2 by invoking the nodeapp2 service from the nodeapp3 pod.
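A sketch of that call, this time using the fully qualified service name, is shown below. The default namespace, the container name nodeapp3, and the presence of curl in the container are assumptions, and service_url is a helper invented for this example.

```shell
# Build the in-cluster URL for a service.
# Arguments: service name, namespace (falls back to "default", which is an assumption).
service_url() {
  echo "http://$1.${2:-default}.svc.cluster.local"
}

# From the nodeapp3 pod in cluster one, call nodeapp2, whose pods run only in cluster two.
if command -v kubectl >/dev/null 2>&1 && [ -n "${ctx1:-}" ]; then
  kubectl exec --context="${ctx1}" deploy/nodeapp3 -c nodeapp3 -- \
    curl -s "$(service_url nodeapp2)"
fi
```

A response from nodeapp2 here indicates that the request left cluster one and reached a pod in cluster two, since cluster one runs no nodeapp2 pods.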
As you can see, it is easy to set up a cross-cluster mesh and enable communication across clusters in a single network. In the next blog, I will explain how to implement a cross-cluster mesh across two different VPC networks.