Cloud Computing, DevOps, Google Cloud (GCP), Kubernetes


10 Easy Steps to Migrate Workload to a New GKE Node Pool

Overview

Hey folks, welcome to another blog post in our Kubernetes series. In this blog post, we’ll cover how to migrate an existing workload from the default node pool to a new node pool/group without causing any downtime. For those who have just started with Kubernetes, I would recommend following along with our elementary Kubernetes blog series first.

Introduction

Kubernetes is a container orchestration system popularly known for providing an automated and organized way to deploy, scale, and monitor containerized workloads. Kubernetes has gained so much popularity because it is open source and built around ephemeral, disposable infrastructure. One can easily containerize, deploy, and monitor production workloads in Kubernetes clusters. Kubernetes cluster configurations are broadly bifurcated into two categories: self-managed clusters, such as kubeadm or kops clusters, where the control plane is managed by the admin; and managed clusters offered by top-notch cloud service providers, such as Amazon Elastic Kubernetes Service, Azure Kubernetes Service, Google Kubernetes Engine, and many more, where a managed control plane leaves us with far less operational overhead.

Here in this blog, we are going to see how to perform a seamless, zero-downtime workload migration to a new node pool using GKE (Google Kubernetes Engine). On a Kubernetes cluster, there are times when we need to change the underlying worker node/node pool configuration while keeping the workload running at all times. In such scenarios, this type of migration is useful when you need to move your workloads to a different machine type or deploy a certain type of workload onto a specific kind of node pool.
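
As a quick illustration of that second case, a Pod can pin itself to a particular pool through the node pool label that GKE attaches to every node. Here is a minimal sketch (the Pod name and image are hypothetical; new-pool is the pool we create later in this guide):

  # Sketch: pinning a workload to a specific GKE node pool via nodeSelector
  apiVersion: v1
  kind: Pod
  metadata:
    name: demo-pod                              # hypothetical name
  spec:
    nodeSelector:
      cloud.google.com/gke-nodepool: new-pool   # GKE-managed node label
    containers:
    - name: app
      image: nginx                              # illustrative image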


Prerequisites

  1. Google Account

Since we are utilizing Google Kubernetes Engine for this demo, it would be best to follow along on a GKE cluster, although it is completely okay to do this on any Kubernetes cluster of your choice.

  2. IAM Role / Admin User

If you are the owner/admin of your project, you will have full access to the Kubernetes environment. If not, it is advisable to use a service account with the Kubernetes Engine Admin or Kubernetes Engine Cluster Admin role, along with Compute Engine Admin or Editor access.
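
If you need to grant these roles to a service account, a sketch with gcloud looks like this (PROJECT_ID and SA_EMAIL are placeholders; roles/container.admin is Kubernetes Engine Admin and roles/compute.admin is Compute Engine Admin):

  # Placeholders: replace PROJECT_ID and SA_EMAIL with real values
  gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:SA_EMAIL" \
    --role="roles/container.admin"
  gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:SA_EMAIL" \
    --role="roles/compute.admin"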

  3. Create or Select a New Project

You can make use of any existing project or create a new project as per your preference. For this demo, I created a new project for clarity and resource isolation.
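
If you prefer creating the project from the CLI, this is a sketch (the project ID below is a made-up example and must be globally unique):

  # Create a new project; the ID is an illustrative placeholder
  gcloud projects create my-gke-demo-project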

Step-by-Step Guide

Step 1: Enable Kubernetes Engine API

For every project you create, you need to enable the APIs for the services you want to work with.


Step 2: Activate Cloud Shell

To access the Google Cloud environment from the command line, click on the Cloud Shell icon. Once Cloud Shell opens, run the following command to authenticate your access to the environment.

  • Run gcloud auth login


Step 3: Set project ID, zone, and region for your cluster

  • Run gcloud config set project project_id
  • Run gcloud config set compute/region compute_region
  • Run gcloud config set compute/zone compute_zone
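
For example, with concrete sample values (the project ID, region, and zone below are illustrative choices, not requirements):

  # Sample values only; substitute your own project ID, region, and zone
  gcloud config set project my-gke-demo-project
  gcloud config set compute/region us-central1
  gcloud config set compute/zone us-central1-a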

To verify that your desired project and region/zone are set:

  • Run gcloud config list


Step 4: Provision a GKE Cluster

We’ll start by creating a demo cluster to run application workloads. The following command creates three nodes with the default machine type (e2-medium). Nodes can be configured according to your workload requirements by passing additional flags. Getting your GKE cluster ready will take approximately 7-10 minutes or more.

  • Run gcloud container clusters create cluster_name --num-nodes=3
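
If you want to pin down more of the node configuration up front, additional flags can be supplied. Here is a sketch with sample values (the cluster name, zone, and disk size are illustrative assumptions):

  # Sample flags; all values here are placeholders to adapt to your workload
  gcloud container clusters create demo-cluster \
    --num-nodes=3 \
    --machine-type=e2-medium \
    --disk-size=50 \
    --zone=us-central1-a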

Step 5: Get the demo Application Code
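
The original demo ships a sample manifest named node-pools-deployment.yaml, which Step 6 deploys. If you do not have it handy, a minimal sketch like the following will work; the app name, image, and replica count are illustrative assumptions, not the exact demo code. Save it as node-pools-deployment.yaml so the Step 6 command applies unchanged.

  # node-pools-deployment.yaml: minimal sketch of a replicated demo app
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: web-server
  spec:
    replicas: 5                  # several replicas so the migration is visible
    selector:
      matchLabels:
        app: web-server
    template:
      metadata:
        labels:
          app: web-server
      spec:
        containers:
        - name: web-server
          image: nginx           # illustrative image
          ports:
          - containerPort: 80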

Step 6: Running a replicated application deployment

To deploy the YAML manifest:

  • Run kubectl apply -f node-pools-deployment.yaml

You can retrieve the list of running Pods

  • Run kubectl get pods


To retrieve the Pods along with the nodes they are running on:

  • Run kubectl get pods -o wide
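
If the wide output is noisy, kubectl’s standard custom-columns output can print just the Pod-to-node mapping (nothing GKE-specific is assumed here):

  # Show each Pod together with the node it is scheduled on
  kubectl get pods -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName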


Step 7: Creating a new node pool with a different machine type

We need to create a new node pool to introduce instances with different configurations, such as a different machine type or different authentication scopes.

Note: By default, whenever a new GKE cluster is created, the node pool attached to it is named default-pool.

To verify the existing node pool

  • Run gcloud container node-pools list --cluster cluster_name


The following command creates a new node pool named new-pool with three high-memory instances of the e2-highmem-2 machine type. You can select any machine type and other configuration options as per your requirements.

  • Run gcloud container node-pools create new-pool --cluster=cluster_name --machine-type=e2-highmem-2 --num-nodes=3


Now your cluster has two node pools. To list the node pools attached to your cluster:

  • Run gcloud container node-pools list --cluster cluster_name


To list all the nodes attached to your cluster

  • Run kubectl get nodes
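
Because GKE labels every node with the pool it belongs to, you can also list the nodes of a single pool; the same label drives the migration in Step 8:

  # Nodes in the original pool
  kubectl get nodes -l cloud.google.com/gke-nodepool=default-pool
  # Nodes in the pool we just created
  kubectl get nodes -l cloud.google.com/gke-nodepool=new-pool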


Step 8: Migrating the workload

We have added a new node pool to our cluster, but existing workloads will still run on the default node pool. The workload must be explicitly migrated to the new node pool.

The following command will display the node on which your pods are running.

  • Run kubectl get pods -o wide


Our current workload is running on the default node pool. To migrate this running workload to the new node pool, we must follow these steps:

  1. Cordon the existing node pool: This operation marks the nodes in the existing node pool (default-pool) as unschedulable. Kubernetes stops scheduling new Pods to these nodes once you mark them as unschedulable.
  2. Drain the existing node pool: This operation gracefully evicts the workloads running on the nodes of the existing node pool (default-pool); an optional safeguard for this step is sketched below.
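
Before draining, it can help to create a PodDisruptionBudget so that evictions never take down too many replicas at once; kubectl drain honors these budgets, which is what protects the "no downtime" promise for replicated apps. A minimal sketch (the name and the app label match the hypothetical manifest from Step 5):

  # pdb.yaml: keep at least 2 demo replicas available while nodes drain
  apiVersion: policy/v1
  kind: PodDisruptionBudget
  metadata:
    name: web-server-pdb         # hypothetical name
  spec:
    minAvailable: 2
    selector:
      matchLabels:
        app: web-server          # assumes the label from the Step 5 sketch

Apply it with kubectl apply -f pdb.yaml before running the drain loop further below.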

First, get a list of nodes in the default-pool:

  • Run kubectl get nodes -l cloud.google.com/gke-nodepool=default-pool


Now cordon the nodes. This will mark each node as unschedulable for future workloads.

  • Run for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=default-pool -o=name); do kubectl cordon "$node";
    done


Now we must drain each node by evicting Pods with an allotted graceful termination period of 10 seconds; the graceful termination period can vary depending on the running workload. To drain the nodes:

  • Run for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=default-pool -o=name); do
    kubectl drain --force --ignore-daemonsets --delete-emptydir-data --grace-period=10 "$node";
    done


Once this command completes, you can see that the default-pool nodes have SchedulingDisabled status in the node list. To verify:

  • Run kubectl get nodes


Once the default-pool nodes are marked SchedulingDisabled, the workload Pods automatically shift to running on the new-pool nodes:

  • Run kubectl get pods -o wide


Now the workload has moved to the new pool. As we have seen, we can gracefully disable scheduling on a group of nodes and easily migrate the workload onto the nodes of the new node pool without causing any downtime.

Step 9: Deleting the Old Node pool

  • Run gcloud container node-pools delete default-pool --cluster cluster_name


Once this operation completes, you should have a single node pool left in your container cluster: new-pool. To list the existing node pools:

  • Run gcloud container node-pools list --cluster cluster_name


Step 10: Clean Up

It is always advisable to clean up resources once they are no longer in use. This not only avoids incurring unnecessary charges but also protects your resources from malicious and unauthorized usage.

Delete the container cluster: This step deletes resources that make up the container cluster, such as the compute instances, disks, and network resources.

  • Run gcloud container clusters delete cluster_name


Conclusion

For one reason or another, migrating workloads to a dedicated node pool or a specific set of nodes is standard practice in production environments. The one factor that must be preserved is uptime: no one wants to compromise application availability even for a second. Gracefully migrating workloads is therefore one of the most effective practices companies can adopt today. Kubernetes is indeed complex, but it has made running containerized workloads a lot easier, and, as we have seen, we can easily move our workload to a new node pool based on our needs.


About CloudThat

CloudThat is an official AWS (Amazon Web Services) Advanced Consulting and Training Partner and a Microsoft Gold Partner, helping people develop knowledge of the cloud and helping their businesses aim for higher goals using best-in-industry cloud computing practices and expertise. We are on a mission to build a robust cloud computing ecosystem by disseminating knowledge on technological intricacies within the cloud space. Our blogs, webinars, case studies, and white papers enable all the stakeholders in the cloud computing sphere.

Drop a query if you have any questions regarding GKE, and I will get back to you quickly.

To get started, go through our Consultancy page and Managed Services Package, CloudThat's offerings.

FAQs

1. What is Workload Migration?

ANS: – Workload migration is the process of moving programs and services from one point of access to another; in Kubernetes terms, it usually means migrating services from one node group to another.

2. Why do we need to migrate the workload?

ANS: – There is no single fixed reason for workload migration. Common motivations include achieving higher scalability and efficiency, locating workloads in new or specific regions, reducing the cost of fixed-capacity infrastructure, and upgrading or updating the underlying infrastructure.

3. What is Zero-Down Time Deployment?

ANS: – Zero-downtime deployment is a process that lets you complete updates, upgrades, version changes, and more from start to finish without impacting your users’ experience or your revenue (i.e., the application keeps serving traffic). It is the ideal approach if you are migrating your ecosystem to newly implemented infrastructure and need to bring your entire database over (that would be a zero-downtime migration).

WRITTEN BY Shivani Gandhi

Shivani Gandhi is a Research Associate (Kubernetes) at CloudThat Technologies. She holds a master's degree in Computer Application. She is passionate about cloud computing and has a strong urge to learn new cloud-native technologies. She has experience in GCP and AWS and enjoys providing clients with efficient cloud-based solutions. She is adaptive, a good team player, and enjoys reading.
