Autoscaling in Kubernetes with Vertical Pod Autoscaler (VPA)

Overview

Consider a Kubernetes-based e-commerce application whose workload varies throughout the day. During peak periods, such as flash sales, demand rises and performance suffers. To address this, you can use the Vertical Pod Autoscaler (VPA) to adjust each pod's CPU and memory requests dynamically. This keeps the application responsive at peak, improves the user experience, and avoids paying for resources that sit idle when traffic is low.

This blog explains the role of the Vertical Pod Autoscaler (VPA) in Kubernetes: dynamically optimizing resource allocations for individual pods. It covers key features, benefits, and a practical use case, improving the performance of an e-commerce application under variable workloads.

Introduction

Kubernetes stands as a leader in container orchestration, offering robust features to manage and scale containerized applications.

Among its autoscaling capabilities, the Vertical Pod Autoscaler (VPA) is the focus of this blog, which explains what VPA is, why it matters, and how to implement it in a Kubernetes cluster.

Prerequisites

Kubernetes Cluster
Kubernetes Metrics Server installed (see "Deploying the Kubernetes Metrics Server on a Cluster Using Kubectl" on oracle.com)

Vertical Pod Autoscaler (VPA)

The Vertical Pod Autoscaler is a Kubernetes component that dynamically adjusts the resource requests of individual pod containers based on their observed usage. Unlike its counterpart, the Horizontal Pod Autoscaler (HPA), which scales the number of pod replicas, VPA tunes the CPU and memory allocations of individual pods to improve resource utilization.
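
As a rough illustration of what a VPA object looks like, the sketch below prints a minimal manifest for a given Deployment. The function name vpa_manifest and the "-vpa" naming pattern are hypothetical, and the autoscaling.k8s.io/v1 API version is an assumption that depends on the VPA release installed in your cluster.

```shell
# Sketch only: print a minimal VerticalPodAutoscaler manifest for a Deployment.
# vpa_manifest is a hypothetical helper, not part of kubectl or VPA.
vpa_manifest() {
  deployment="$1"
  cat <<EOF
apiVersion: autoscaling.k8s.io/v1   # assumption: v1 API is served by your VPA install
kind: VerticalPodAutoscaler
metadata:
  name: ${deployment}-vpa
spec:
  targetRef:                        # the workload whose pods VPA manages
    apiVersion: apps/v1
    kind: Deployment
    name: ${deployment}
  updatePolicy:
    updateMode: "Auto"              # let VPA apply its recommendations
EOF
}
```

For a hypothetical Deployment named my-app, the output could be piped to the cluster with: vpa_manifest my-app | kubectl apply -f -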

Key Features and Benefits

Efficient Resource Utilization
Improved Application Performance
Cost Optimization
Adaptability to Workload Changes

Steps to Implement Vertical Pod Autoscaler

Step 1: Install VPA in Your Cluster

Download the source code for the Vertical Pod Autoscaler from GitHub. For instance, use the following command:

git clone -b vpa-release-0.8 https://github.com/kubernetes/autoscaler.git

Navigate to the directory for the Vertical Pod Autoscaler:

cd autoscaler/vertical-pod-autoscaler

If you have previously deployed the Vertical Pod Autoscaler, remove it with the command:

./hack/vpa-down.sh

Deploy the Vertical Pod Autoscaler using the command:

./hack/vpa-up.sh

Confirm the successful creation of Vertical Pod Autoscaler pods by running the command:

kubectl get pods -n kube-system
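
The check above can be scripted. The sketch below reads the output of that command from standard input and reports any of the three VPA components (admission controller, recommender, updater) that is missing or not in the Running state. The helper name check_vpa_pods is hypothetical, and the sketch assumes the default column layout of kubectl get pods, where STATUS is the third column.

```shell
# Sketch only: verify that the three VPA component pods are Running.
# Reads `kubectl get pods -n kube-system` output on stdin.
# Exits non-zero and prints the offending component if any check fails.
check_vpa_pods() {
  awk '
    /^vpa-admission-controller-/ { seen["vpa-admission-controller"] = $3 }
    /^vpa-recommender-/          { seen["vpa-recommender"] = $3 }
    /^vpa-updater-/              { seen["vpa-updater"] = $3 }
    END {
      n = split("vpa-admission-controller vpa-recommender vpa-updater", want, " ")
      bad = 0
      for (i = 1; i <= n; i++) {
        # Test membership first so the lookup does not create the entry.
        status = (want[i] in seen) ? seen[want[i]] : "missing"
        if (status != "Running") {
          printf "%s: %s\n", want[i], status
          bad = 1
        }
      }
      exit bad
    }'
}
```

Usage: kubectl get pods -n kube-system | check_vpa_pods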

Step 2: Launch the Sample Application

To create a Deployment together with a corresponding Vertical Pod Autoscaler, deploy the sample hamster application using the following command:

kubectl apply -f examples/hamster.yaml

This creates a Deployment with two pods and a Vertical Pod Autoscaler that targets the Deployment.

Confirm the successful creation of the hamster pods by executing the command:

kubectl get pods -l app=hamster

Inspect the CPU and memory reservations using the kubectl describe pod command and one of the hamster pod names obtained in the preceding step. For instance:

kubectl describe pod hamster-7cbfd64f57-rq6wv

In the "Requests" section of the output from the above command, note the pod's current CPU and memory allocations.
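
If you prefer not to scan the full describe output, the requests can also be pulled out directly with kubectl's JSONPath output. The following is a minimal sketch; the helper name pod_requests is hypothetical, and it prints one line per container with the container name, CPU request, and memory request separated by tabs.

```shell
# Sketch only: print name, CPU request, and memory request for each
# container in a pod. pod_requests is a hypothetical helper name.
pod_requests() {
  kubectl get pod "$1" -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources.requests.cpu}{"\t"}{.resources.requests.memory}{"\n"}{end}'
}
```

Usage, with the pod name from the earlier example: pod_requests hamster-7cbfd64f57-rq6wv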

The Recommender component of the Vertical Pod Autoscaler analyzes pod behavior to judge whether the CPU and memory reservations are adequate. The values you observe in your cluster may differ from those described here.

The current reservations are insufficient due to a deliberate under-resourcing of the sample hamster application. Each pod runs a single container with the following specifications:

CPU request is set at 100 milliCores, but the container attempts to use more than 500 milliCores.
Memory reservation is notably lower than the required amount for proper execution.

Step 3: Monitor the Scaling Operation

Once the Recommender determines that the CPU and memory reservations of the original hamster pods are inadequate, the Updater component of the Vertical Pod Autoscaler triggers a relaunch of the pods with the recommended values. Note that the Vertical Pod Autoscaler does not modify the Deployment template; it updates the actual requests of the pods.

To watch the pods of the sample hamster application and wait for the Updater to start a new hamster pod with a different name, use the following command:

kubectl get --watch pods -l app=hamster

Once a new hamster pod has started, inspect its CPU and memory reservations using the kubectl describe pod command along with the pod’s name, as shown in this example:

kubectl describe pod hamster-7cbfd64f57-wmg4

In the "Requests" section of the output, note the CPU and memory reservations for the new pod.

In this example, the CPU reservation has increased to 587 milliCores and the memory reservation to 262,144 Kilobytes. The original pod was under-resourced, and the Vertical Pod Autoscaler has raised its reservations to more suitable values. The values you see in your own cluster may differ.
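
To compare such values, it helps to normalize them. The sketch below converts Kubernetes-style quantities to bytes and millicores; the helper names are hypothetical and only plain-integer quantities with the listed suffixes are handled. It assumes the memory value is displayed as 262144k, where the "k" suffix means 1000 bytes, so 262144k is 262,144,000 bytes (250 MiB), whereas 262144Ki would be 268,435,456 bytes (256 MiB).

```shell
# Sketch only: convert a memory quantity (e.g. 262144k, 256Mi) to bytes.
# Not a full quantity parser; handles integer values with these suffixes.
mem_to_bytes() {
  case "$1" in
    *Ki) echo $(( ${1%Ki} * 1024 )) ;;
    *Mi) echo $(( ${1%Mi} * 1024 * 1024 )) ;;
    *Gi) echo $(( ${1%Gi} * 1024 * 1024 * 1024 )) ;;
    *k)  echo $(( ${1%k} * 1000 )) ;;
    *M)  echo $(( ${1%M} * 1000 * 1000 )) ;;
    *G)  echo $(( ${1%G} * 1000 * 1000 * 1000 )) ;;
    *)   echo "$1" ;;   # no suffix: already in bytes
  esac
}

# Sketch only: convert a CPU quantity (e.g. 587m or 2) to millicores.
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  echo $(( $1 * 1000 )) ;;   # whole cores to millicores
  esac
}
```

For example, mem_to_bytes 262144k prints 262144000, and cpu_to_millicores 587m prints 587.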

Step 4: Inspection

Inspect the recommendations provided by the Vertical Pod Autoscaler, specifically those made by the Recommender, using the following command:

kubectl describe vpa/hamster-vpa
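
For scripting, the same recommendation can be read from the VPA object's status rather than from the describe output. This is a minimal sketch: the helper name vpa_targets is hypothetical, and it assumes the Recommender has already populated status.recommendation for the object.

```shell
# Sketch only: print each container's recommended target CPU and memory
# from a VPA object's status. vpa_targets is a hypothetical helper name.
vpa_targets() {
  kubectl get vpa "$1" -o jsonpath='{range .status.recommendation.containerRecommendations[*]}{.containerName}{"\t"}{.target.cpu}{"\t"}{.target.memory}{"\n"}{end}'
}
```

Usage: vpa_targets hamster-vpa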

Step 5: Cleanup

To uninstall the sample application, execute the following command:

kubectl delete -f examples/hamster.yaml

From the Vertical Pod Autoscaler directory (vertical-pod-autoscaler), remove the Vertical Pod Autoscaler deployment with the command:

./hack/vpa-down.sh

Conclusion

Vertical Pod Autoscaler is a valuable tool for resource optimization in Kubernetes. By dynamically adjusting the resource requests of individual pods based on their actual usage, VPA addresses efficient resource utilization, cost optimization, and adaptability to changing workloads. For organizations adopting autoscaling in Kubernetes, it is a practical way to improve the efficiency and performance of containerized workloads.

Drop a query if you have any questions regarding Vertical Pod Autoscaler and we will get back to you quickly.

FAQs

1. How can I scale my application pods dynamically based on resource utilization in Kubernetes?

ANS: – Employ the Horizontal Pod Autoscaler (HPA) in Kubernetes to adjust pod counts automatically in a deployment or replica set. HPA observes CPU or memory usage, scaling pods horizontally based on defined utilization thresholds. Configure resource metrics in the HPA to enable Kubernetes to scale your application dynamically.
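
As an illustration of the answer above, the sketch below prints a minimal HPA manifest that scales a Deployment on average CPU utilization. The helper name hpa_manifest, the "-hpa" naming pattern, and the replica bounds are hypothetical, and the autoscaling/v2 API version is an assumption about your cluster version.

```shell
# Sketch only: print a minimal HorizontalPodAutoscaler manifest.
# Arguments: deployment name, target average CPU utilization (percent).
hpa_manifest() {
  deployment="$1"
  cpu_percent="$2"
  cat <<EOF
apiVersion: autoscaling/v2          # assumption: v2 API is available in your cluster
kind: HorizontalPodAutoscaler
metadata:
  name: ${deployment}-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ${deployment}
  minReplicas: 2                    # illustrative bounds
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: ${cpu_percent}
EOF
}
```

For a hypothetical Deployment named my-app and a 70% CPU target: hpa_manifest my-app 70 | kubectl apply -f -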

2. What is the difference between resource requests and resource limits in Kubernetes?

ANS: – Kubernetes employs resource requests to guarantee CPU and memory allocations for a container, while resource limits set maximum usage thresholds. These values impact scheduling decisions and prevent containers from consuming excessive resources, ensuring predictable performance and resource utilization in the cluster.

3. How does the Vertical Pod Autoscaler (VPA) differ from the Horizontal Pod Autoscaler (HPA) in Kubernetes?

ANS: – While the Horizontal Pod Autoscaler scales the number of pod replicas based on observed metrics, the Vertical Pod Autoscaler focuses on adjusting the resource requests and limits of individual pods. VPA analyzes pod resource usage and recommends adjustments to optimize resource allocation. Unlike HPA, which scales horizontally by adding or removing pod replicas, VPA scales vertically by fine-tuning the resource allocations of existing pods.

WRITTEN BY Bhanu Prakash K

K Bhanu Prakash works as a Subject Matter Expert at CloudThat. He is proficient in managing and configuring AWS infrastructure, as well as Kubernetes and DevOps tools such as Terraform, Ansible, Jenkins, and Git. He is keen on learning new technologies and publishing blogs for the tech community.