Zero downtime updates are a critical feature in today's world of continuous delivery and deployment, but they can be challenging to achieve when deploying applications on traditional infrastructure. In dynamic cloud computing environments, achieving zero downtime becomes even more complicated as scale increases.
The Kubernetes platform enables automated deployments that provide scalable architecture support while maintaining high availability across all components within your environment. This post will discuss how to use Kubernetes to perform zero-downtime updates by leveraging its rolling update functionality.
What are Zero Downtime Updates?
Zero Downtime means that your customers and users won’t experience any downtime during an application update. Zero downtime updates can be achieved by keeping your application running and servicing new requests during an update.
If your competitors do not have any downtime on their platforms, you cannot afford it either. If your application directly impacts your clients' revenue, you are probably aware of the consequences downtime can have for your business. As the quality of user experience continually rises, clients' and users' expectations for application availability also increase.
Before Kubernetes: Blue-Green deployment
Prior to Kubernetes, the most popular method for zero-downtime updates was Blue-Green deployment. This model calls for two identical environments, one referred to as Blue and the other as Green. At any given time, one of the two runs the production application, and the other hosts the new version of the application.
Both environments sit behind a load balancer that routes incoming requests to the production environment. During an update, you first deploy the updated application to the non-production environment.
Once the new version is deployed and tested successfully, you switch all traffic over to it. From then on, all requests are forwarded to the newly deployed version, which becomes production.
The blue-green model is simple and works well for new incoming requests, but what happens to users who are already using the application when the switch takes place?
Kubernetes: Rolling updates
Traditional methods of achieving zero downtime require many manual steps, which makes them error-prone and hard to scale. With Kubernetes, you can achieve the same result easily through the rolling update deployment strategy.
Kubernetes rolling updates upgrade Pods incrementally rather than replacing them all at once. With suitable settings, there are always instances of the application available to serve traffic until the new release has been rolled out to every Pod in the Deployment.
For zero-downtime updates, Kubernetes first starts Pods with the new version of your container but does not send any requests to them. Once a new Pod is ready, Kubernetes starts sending traffic to it without dropping requests from users already using the application. Zero downtime for clients and customers!
In addition, you can specify precisely how Kubernetes replaces replicas during the application update. For example, if you had five Pod replicas, should Kubernetes immediately create five new Pods and wait for them to start, replace them one by one, or terminate all old Pods except one? You can also use readiness and liveness probes in the Deployment to account for the time the application needs to boot up.
Example of a Rolling update strategy
Let's create a Deployment for an app. The following YAML code shows the Kubernetes deployment for an application with the default RollingUpdate strategy:
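The original manifest is not reproduced here, so what follows is only a minimal sketch consistent with the description below. The name `app`, the `app: app` label, and the replica count are illustrative assumptions; the image and port come from the text. Because no `strategy` field is set, the Deployment falls back to the default RollingUpdate strategy.

```yaml
# Sketch of a Deployment that relies on the default RollingUpdate strategy.
# The name, labels, and replica count are assumptions for illustration.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app                # hypothetical name
spec:
  replicas: 4              # assumed replica count
  selector:
    matchLabels:
      app: app             # must match the Pod template labels below
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app        # hypothetical container name
          image: rootedmind/deployment-strategies:v1
          ports:
            - containerPort: 80
```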
The above definition should look familiar:
- It's an object of the kind `Deployment`.
- In the Pod template, a single container with the image `rootedmind/deployment-strategies:v1` is defined.
- The container exposes port 80.
Adding a Liveness probe
A liveness probe checks whether the process in the container is healthy. If it isn't, Kubernetes restarts the container. You could use a liveness probe to automatically restart the container when the web server goes into a deadlock.
Let's amend the Deployment definition to include it:
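As a hedged sketch of what the amended manifest might look like: the `/` path, the 6-second initial delay, and the 5-second period come from the breakdown below, while the timeout and failure threshold are assumptions (they happen to match the Kubernetes defaults). The name and labels are the same hypothetical ones as before.

```yaml
# Sketch: the Deployment with a liveness probe added to the container.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app                          # hypothetical name
spec:
  replicas: 4                        # assumed replica count
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: rootedmind/deployment-strategies:v1
          ports:
            - containerPort: 80
          livenessProbe:
            httpGet:                 # HTTP GET against the container
              path: /
              port: 80
            initialDelaySeconds: 6   # wait 6 seconds before the first check
            timeoutSeconds: 1        # assumed value
            periodSeconds: 5         # check every 5 seconds
            successThreshold: 1      # must be 1 for liveness probes
            failureThreshold: 3      # assumed value
```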
Let's break down the details:
- `httpGet` is the type of probe: Kubernetes sends an HTTP GET request to the given endpoint. In this case it is an HTTP probe, but you could use a TCP probe (`tcpSocket`) or execute a command in the container (`exec`).
- `initialDelaySeconds` is the initial delay. The probe starts checking the container only after 6 seconds.
- `timeoutSeconds` specifies how long to wait for a response before the attempt counts as failed.
- `periodSeconds` is how often the probe runs. In this case, the `/` endpoint is checked every 5 seconds.
- `successThreshold` is the number of consecutive successful attempts required, after a failure, for the probe to be considered successful again. For liveness probes it must be 1.
- `failureThreshold` is the number of consecutive failed attempts before the probe is considered failed. For a liveness probe, the container is then restarted.
Adding a readiness probe
The readiness probe determines whether the application in the container is available to serve incoming requests. When the probe succeeds, Kubernetes adds the Pod to the Service's endpoints so that it starts receiving traffic. So you could include a readiness probe in your Deployment like this:
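A possible version of the manifest with both probes, offered as a sketch rather than the original listing. The 5-second period and the failure threshold of 3 for the readiness probe match the values discussed further down; the probe path, the longer 10-second initial delay, and the timeouts are assumptions, as are the name and labels.

```yaml
# Sketch: the Deployment with both a liveness and a readiness probe.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app                          # hypothetical name
spec:
  replicas: 4                        # assumed replica count
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: rootedmind/deployment-strategies:v1
          ports:
            - containerPort: 80
          livenessProbe:             # restarts the container on failure
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 6
            periodSeconds: 5
          readinessProbe:            # gates traffic from the Service
            httpGet:
              path: /                # assumed endpoint
              port: 80
            initialDelaySeconds: 10  # assumed: a bit longer than liveness
            timeoutSeconds: 1        # assumed value
            periodSeconds: 5         # check every 5 seconds
            successThreshold: 1
            failureThreshold: 3      # three failures mark the Pod unready
```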
As you might have noticed, most of the configuration for the readiness probe mirrors that of the liveness probe.
There's a notable difference, though.
- The livenessProbe endpoint checks the process. The process could be healthy but not yet ready to accept connections.
- The readinessProbe endpoint checks whether the container can take traffic. You might want to wait a bit longer after the process is healthy before you route traffic to it.
Is the readiness probe always executed after the liveness probe? Kubernetes doesn't care.
A readiness check can run before a liveness check or vice versa. As far as Kubernetes is concerned, the two probes are independent. However, as an engineer, you might want the liveness probe to pass before you start checking readiness.
The readiness probe checks the container every 5 seconds (`periodSeconds`). It has to fail three times in a row (`failureThreshold`) before Kubernetes removes the Pod from the Service.
So it might take up to 15 seconds to detect a Pod with a malfunctioning web server.
Of course, you can fine-tune those values, but even if you select the minimum for both settings (1 second and one attempt), there is always a chance that the application crashes between two checks.
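For illustration, the most aggressive settings mentioned above would look roughly like this. This is only a fragment that belongs under a container entry in the Pod template, and the path and port are assumptions.

```yaml
# Fragment of a container spec: readiness probe with the minimum
# period and failure threshold. Path and port are assumed.
readinessProbe:
  httpGet:
    path: /
    port: 80
  periodSeconds: 1      # minimum allowed value: check every second
  failureThreshold: 1   # minimum allowed value: one failure marks the Pod unready
```

Even with these values, a Pod can keep receiving traffic for a short moment after the application stops responding, so the probe narrows the window rather than closing it.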
From this point onwards:
- Kubernetes will detach the Pod from the Service if the readinessProbe endpoint doesn't return a successful response. With the settings above, this could take up to 15 seconds.
- Kubernetes will check the livenessProbe endpoint every 5 seconds. If the endpoint repeatedly fails to return a successful response, the container is restarted.
By default, a rolling update replaces roughly 25% of the Pods at a time, because both `maxSurge` and `maxUnavailable` default to 25%. If your Deployment has four replicas, Kubernetes updates about one Pod at a time. If you have 20 replicas, Kubernetes replaces up to five Pods at a time.
Also, you can specify how many Pods you want to be replaced at a time with the following parameters:
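The relevant fields are `maxSurge` and `maxUnavailable` under `strategy.rollingUpdate`; both accept an absolute number or a percentage. The values below are one possible choice, not a recommendation from the original text, and the name, labels, and replica count are again assumptions.

```yaml
# Sketch: explicit rolling update parameters on the Deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app                # hypothetical name
spec:
  replicas: 4              # assumed replica count
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra Pod above the desired count
      maxUnavailable: 0    # never go below the desired number of ready Pods
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: rootedmind/deployment-strategies:v1
          ports:
            - containerPort: 80
```

With `maxUnavailable: 0` and `maxSurge: 1`, Kubernetes creates one new Pod, waits for it to become ready, and only then removes an old one. This trades rollout speed for capacity during the update.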
Let's recap what we've done so far:
- Created a Deployment for an application.
- Set up a liveness probe to check if the containers need restarting.
- Configured a readiness probe to check when the container is ready to accept incoming requests.
Hooray! You've successfully mitigated the risk of downtime. Better yet, you didn't have to code any logic to execute a rolling update.
What’s next after achieving zero downtime updates with Kubernetes?
To avoid getting stuck in a blue-green cycle of preparing and switching an entire environment every time new code goes live, make rolling deployments your default. Rolling deployments allow gradual, continual updates by distributing changes across all Pods without interruption.
Ultimately, this means that your customers can continue using your software as it seamlessly transitions from one version to another, with no downtime. More importantly, if you want to make informed business decisions, you need a thorough understanding of what is happening inside the Kubernetes engine room, not just a glance at the nice display screens by the deck entrance.
Going further, Kubernetes offers a wide range of features, including native rolling deployments, the building blocks for canary releases, service discovery and load balancing (kube-proxy, cluster DNS), storage orchestration with persistent volumes, and much more. To explore more, check out the official Kubernetes documentation.