Real-time Data Streaming: Amazon RDS MySQL CDC with Apache Kafka on Amazon EC2, Debezium, and AWS Lambda – Part 3

Introduction

Change Data Capture (CDC) is a transformative data integration and replication technique. It captures and propagates data modifications within a database, enabling real-time synchronization between multiple systems or applications. CDC marks a shift from traditional batch-based methods, offering organizations a more efficient and precise way to handle data changes.

At its core, CDC captures individual data alterations, such as inserts, updates, and deletes, as they occur in the source database. Rather than periodically extracting and processing entire datasets, CDC selectively captures the relevant changes and transfers them to the target systems as they happen. This approach minimizes the data transfer volume, reduces latency, and optimizes resource utilization, making it ideal for scenarios where timeliness and accuracy are crucial.
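
To make the contrast with batch extraction concrete, here is a minimal sketch of a target being kept in sync by applying only the row-level changes. The event shape (`op`, `id`, `row`) and the sample data are hypothetical simplifications for illustration, not the format Debezium emits.

```python
from typing import Any, Dict, List

# Hypothetical, simplified change events: op is "insert", "update", or "delete".
Change = Dict[str, Any]


def apply_changes(target: Dict[int, Dict[str, Any]], changes: List[Change]) -> Dict[int, Dict[str, Any]]:
    """Apply a stream of row-level changes to a target keyed by primary key."""
    for change in changes:
        key = change["id"]
        if change["op"] == "delete":
            target.pop(key, None)  # remove the row if it is present
        elif change["op"] == "insert":
            target[key] = dict(change["row"])
        elif change["op"] == "update":
            target.setdefault(key, {}).update(change["row"])
        else:
            raise ValueError(f"unknown op: {change['op']}")
    return target


# The target starts with one row and only ever receives the changes,
# never a full copy of the source table.
replica = {1: {"name": "Asha", "city": "Pune"}}
stream = [
    {"op": "insert", "id": 2, "row": {"name": "Ravi", "city": "Delhi"}},
    {"op": "update", "id": 1, "row": {"city": "Mumbai"}},
    {"op": "delete", "id": 2},
]
apply_changes(replica, stream)
print(replica)  # {1: {'name': 'Asha', 'city': 'Mumbai'}}
```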

Several tools on the market provide Change Data Capture (CDC) functionality; in this example, we will use Debezium.

This solution is explained in three parts. In Part 1, we created the VPC and launched a private Amazon EC2 instance that can be reached over SSH without a bastion host by using an Amazon EC2 Instance Connect (EIC) Endpoint. In Part 2, we launched Amazon RDS, installed Apache Kafka, and configured Debezium on the private Amazon EC2 instance. In this third part, we perform CRUD operations on the MySQL database to check that Debezium is working.

Steps to perform CRUD Operations on MySQL Database

Step 1: Let’s perform a CRUD operation on the MySQL database to check if Debezium is working

Keep multiple tabs open: one for the MySQL CRUD operations, one for the running Debezium connector, and one for Apache Kafka topic consumption.

We already have the Debezium Connector running.

MySQL tab

Step 2: Now let’s open another tab for a consumer that reads the data from the Apache Kafka topics

Step 3: Now insert a record

Step 4: Go to the consumer tab and check

As you can see, whenever an operation is performed on the database, Debezium records it in the Apache Kafka topic.
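
The messages seen in the consumer tab follow Debezium's change event envelope, which carries `before`, `after`, `source`, and `op` fields. As a rough sketch, assuming the JSON converter is in use, one event could be summarized like this. The database, table, and row values are hypothetical.

```python
import json
from typing import Any, Dict, Optional

# Debezium operation codes: create, update, delete, and snapshot read.
OP_NAMES = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot read"}


def summarize_change_event(raw_value: Optional[str]) -> Optional[Dict[str, Any]]:
    """Return a compact summary of one change event, or None for a tombstone."""
    if not raw_value:
        return None
    message = json.loads(raw_value)
    if message is None:  # a tombstone may be printed as the literal "null"
        return None
    # With schemas enabled the envelope sits under "payload"; otherwise it is top-level.
    payload = message.get("payload", message)
    source = payload.get("source") or {}
    op = payload.get("op")
    return {
        "operation": OP_NAMES.get(op, op),
        "table": f"{source.get('db')}.{source.get('table')}",
        "before": payload.get("before"),
        "after": payload.get("after"),
    }


# Hypothetical event for an INSERT into demo.customers.
sample = json.dumps({
    "payload": {
        "before": None,
        "after": {"id": 1, "name": "Asha"},
        "source": {"db": "demo", "table": "customers"},
        "op": "c",
        "ts_ms": 1700000000000,
    }
})
summary = summarize_change_event(sample)
print(summary["operation"], summary["table"], summary["after"])
```

For an insert, `before` is null and `after` holds the new row. For a delete it is the other way around, and an update carries both.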

Step 5: Delete the NAT Gateway and release the Elastic IP, because they will otherwise continue to incur charges.

Step 6: Create an AWS Lambda function that uses the Apache Kafka topic as an event source, consumes the CDC events, and writes them to another table in the database.

Step 7: First, go to the MySQL tab and create the destination table

Step 8: Go to Amazon EC2 and edit the Security Group

Steps to Create Endpoints

Step 1: Create STS Endpoint

Step 2: Create AWS Lambda and Amazon EC2 Endpoints as well.
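
The same three endpoints could also be created programmatically. The sketch below only builds the request parameters for boto3's `create_vpc_endpoint` call; the region, VPC, subnet, and security group IDs are placeholders, and it assumes interface endpoints with private DNS enabled, which may differ from the console settings used here.

```python
from typing import Any, Dict, List, Sequence


def build_interface_endpoint_requests(
    region: str,
    vpc_id: str,
    subnet_ids: Sequence[str],
    security_group_ids: Sequence[str],
    services: Sequence[str] = ("sts", "lambda", "ec2"),
) -> List[Dict[str, Any]]:
    """Build one create_vpc_endpoint request per AWS service."""
    return [
        {
            "VpcEndpointType": "Interface",
            "VpcId": vpc_id,
            "ServiceName": f"com.amazonaws.{region}.{service}",
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
            "PrivateDnsEnabled": True,  # assumption: resolve the public service names privately
        }
        for service in services
    ]


# Placeholder identifiers; substitute the values from your own VPC.
requests = build_interface_endpoint_requests(
    region="ap-south-1",
    vpc_id="vpc-0123456789abcdef0",
    subnet_ids=["subnet-0123456789abcdef0"],
    security_group_ids=["sg-0123456789abcdef0"],
)
for request in requests:
    print(request["ServiceName"])

# With boto3 installed and credentials configured, each request could be sent as:
#   import boto3
#   ec2 = boto3.client("ec2", region_name="ap-south-1")
#   ec2.create_vpc_endpoint(**request)
```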

Steps to Create the AWS Lambda Function

AWS Lambda is a serverless compute service provided by Amazon Web Services (AWS). It lets users run code without provisioning or managing servers: developers upload their code, and the service automatically scales and manages the infrastructure needed to run it. AWS Lambda supports a wide range of programming languages and can be triggered by various events, such as changes in data, API requests, or scheduled intervals. It offers a pay-per-use pricing model in which users are billed only for the compute time their code consumes, so developers can focus on writing code rather than managing servers.

We will use self-managed Apache Kafka as the event source for AWS Lambda. The trigger invokes the function with the CDC event data, and the function writes that data to another table, in the same database or a different one, creating attributes dynamically.

Step 1: Now go to the AWS Lambda service and create a function.

Step 2: Attach the required AWS IAM role to the AWS Lambda function

Step 3: Go to AWS Lambda Configuration -> Amazon VPC

Step 4: Now go to the Trigger section and add the necessary details
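
For reference, the trigger fields in the console correspond to the parameters of Lambda's `create_event_source_mapping` API. The following sketch only assembles those parameters; the function name, broker address, topic, and network IDs are hypothetical, and the batch settings are example values rather than the ones used in this walkthrough.

```python
from typing import Any, Dict, Sequence


def build_kafka_trigger_config(
    function_name: str,
    bootstrap_servers: Sequence[str],
    topic: str,
    subnet_ids: Sequence[str],
    security_group_id: str,
    batch_size: int = 100,
    batching_window_seconds: int = 1,
    starting_position: str = "LATEST",
) -> Dict[str, Any]:
    """Assemble create_event_source_mapping parameters for a self-managed Kafka trigger."""
    # Network access is expressed as one entry per subnet plus one for the security group.
    access = [{"Type": "VPC_SUBNET", "URI": f"subnet:{subnet}"} for subnet in subnet_ids]
    access.append({"Type": "VPC_SECURITY_GROUP", "URI": f"security_group:{security_group_id}"})
    return {
        "FunctionName": function_name,
        "Topics": [topic],  # a single topic per trigger
        "SelfManagedEventSource": {
            "Endpoints": {"KAFKA_BOOTSTRAP_SERVERS": list(bootstrap_servers)}
        },
        "SourceAccessConfigurations": access,
        "BatchSize": batch_size,
        "MaximumBatchingWindowInSeconds": batching_window_seconds,
        "StartingPosition": starting_position,
    }


# Hypothetical values; the topic follows Debezium's <prefix>.<database>.<table> naming.
config = build_kafka_trigger_config(
    function_name="cdc-to-destination",
    bootstrap_servers=["10.0.1.10:9092"],
    topic="cdc.demo.customers",
    subnet_ids=["subnet-0123456789abcdef0"],
    security_group_id="sg-0123456789abcdef0",
)
print(config["SourceAccessConfigurations"])

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("lambda").create_event_source_mapping(**config)
```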

Step 5: Add the MySQL Connector to the AWS Lambda function with the help of the video below

AWS Lambda with AWS RDS Tutorial: Connecting to MySQL on AWS Lambda using mysql-connector-python – YouTube

Step 6: Add the below code to AWS Lambda
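
As a minimal sketch of what such a handler might look like, the code below decodes the batch that the Kafka trigger delivers and logs each change, which then appears in Amazon CloudWatch Logs. It assumes Debezium's JSON converter and handles only the decoding; writing to MySQL with mysql-connector-python is left out. The helper name `decode_kafka_records` is an illustrative choice.

```python
import base64
import json
from typing import Any, Dict, List


def decode_kafka_records(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten a Lambda Kafka event into a list of decoded change events."""
    changes = []
    # Lambda groups records under "records", keyed by "<topic>-<partition>".
    for records in event.get("records", {}).values():
        for record in records:
            if not record.get("value"):
                continue  # tombstone record with no value
            # Record values arrive base64-encoded.
            message = json.loads(base64.b64decode(record["value"]))
            if message is None:
                continue
            # With schemas enabled the envelope sits under "payload".
            payload = message.get("payload", message)
            changes.append({
                "topic": record["topic"],
                "offset": record["offset"],
                "op": payload.get("op"),
                "after": payload.get("after"),
            })
    return changes


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    changes = decode_kafka_records(event)
    for change in changes:
        print(json.dumps(change))  # print output goes to CloudWatch Logs
    return {"processed": len(changes)}
```

The `after` dictionary of each change is what would be inserted into the destination table.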

Amazon CloudWatch Logs

Step 7: Final result: when we insert a record into the source table, the CDC data is stored in the destination table. Attributes are created dynamically if they do not already exist in the table.
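
One way the dynamic attribute creation could work is to compare the incoming row's keys with the destination table's columns and add any that are missing before inserting. This is a hedged sketch rather than the exact code used here: the table name `destination_table`, the `TEXT` column type, and the helper names are assumptions, and the statements are only built, not executed.

```python
import re
from typing import Any, Dict, List, Set, Tuple

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _quote(name: str) -> str:
    """Backtick-quote an identifier, rejecting anything unexpected."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return f"`{name}`"


def build_statements(
    table: str, existing_columns: Set[str], row: Dict[str, Any]
) -> Tuple[List[str], str, List[Any]]:
    """Return the ALTER statements for missing columns, the INSERT, and its parameters."""
    alters = [
        # Assumption: new attributes are stored as TEXT for simplicity.
        f"ALTER TABLE {_quote(table)} ADD COLUMN {_quote(column)} TEXT"
        for column in row
        if column not in existing_columns
    ]
    columns = ", ".join(_quote(column) for column in row)
    placeholders = ", ".join(["%s"] * len(row))  # placeholder style used by mysql-connector-python
    insert = f"INSERT INTO {_quote(table)} ({columns}) VALUES ({placeholders})"
    return alters, insert, list(row.values())


# The existing columns could be read with a SHOW COLUMNS query; here they are given directly.
alters, insert, params = build_statements(
    "destination_table", {"id", "name"}, {"id": 1, "name": "Asha", "city": "Pune"}
)
print(alters)
print(insert, params)
```

Passing the values as parameters rather than formatting them into the SQL string keeps row data from being interpreted as SQL.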

Conclusion

The entire pipeline is now complete: Amazon RDS MySQL, the Debezium connector, Apache Kafka topics, and AWS Lambda work together to process the CDC data into another Amazon RDS MySQL table, using endpoints for communication. As seen in the examples above, the Debezium connector publishes CDC data almost immediately. AWS Lambda is triggered whenever the topic receives data from MySQL, and the trigger offers options such as batch size and batch window.

Click here to check out Part 1 and Part 2.

Drop a query if you have any questions regarding AWS Lambda and we will get back to you quickly.

FAQs

1. Do endpoints incur a cost?

ANS: – Yes. Please refer to the AWS service pricing pages for details.

2. Can we have multiple topics in the same trigger?

ANS: – No, a trigger supports only one topic. If you need more, add another trigger.

WRITTEN BY Suresh Kumar Reddy

Yerraballi Suresh Kumar Reddy is working as a Research Associate - Data and AI/ML at CloudThat. He is a self-motivated and hard-working Cloud Data Science aspirant who is adept at using analytical tools for analyzing and extracting meaningful insights from data.
