AWS Kinesis Data Streams for DynamoDB

1. Introduction

Streaming has become a routine part of how information is consumed today. Building live, real-time systems, however, remains a specialized skill that involves cross-platform integration, subscriptions, instant notifications, and more. At the core of any real-time system is the continuous streaming of data from one application to another. Several tools provide this ability, including RabbitMQ, Apache Kafka, and Amazon Kinesis, and each has its own advantages and disadvantages. This post focuses on Amazon Kinesis.

2. Amazon Kinesis Data Streams

Amazon Kinesis Data Streams for DynamoDB captures item-level modifications in a DynamoDB table and writes them to a Kinesis data stream. Our applications can read the stream and see the changes in near real time. A Kinesis data stream can continuously capture and store terabytes of data per hour, and its longer retention helps with auditing and security transparency. Kinesis Data Streams can also be used with Kinesis Data Firehose, a delivery stream service, and with Amazon QuickSight, where we can create real-time dashboards, generate alerts, and so on.
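
To make "reading the changes" concrete, here is a minimal Python sketch of how a consumer might decode one change record from the stream. The field names (eventName, tableName, dynamodb.Keys, dynamodb.NewImage) follow the DynamoDB change record format as I understand it, and the summarize_change helper and the sample payload are hypothetical, so compare them with a real record before relying on them.

```python
import json


def summarize_change(raw_record: bytes) -> dict:
    """Decode one Kinesis record payload written by DynamoDB and keep the basics."""
    change = json.loads(raw_record.decode("utf-8"))
    item = change.get("dynamodb", {})
    return {
        "event": change.get("eventName"),   # expected: INSERT, MODIFY or REMOVE
        "table": change.get("tableName"),
        "keys": item.get("Keys", {}),
        "new_image": item.get("NewImage"),  # None when the record has no new image
    }


# Hypothetical payload shaped like a change record (not captured from a real stream).
sample = json.dumps({
    "eventName": "INSERT",
    "tableName": "Orders",
    "dynamodb": {
        "Keys": {"order_id": {"S": "1001"}},
        "NewImage": {"order_id": {"S": "1001"}, "status": {"S": "NEW"}},
    },
}).encode("utf-8")

summary = summarize_change(sample)
print(summary["event"], summary["keys"])
```

When reading with boto3, the Data field of each record returned by get_records is already a bytes object, so it can be passed to a function like this one directly.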

3. Amazon Kinesis Data Firehose

Kinesis Data Firehose is a fully managed ETL service for reliably loading streaming data into data stores, data lakes, and analytics services. It can capture, transform, and deliver streaming data to S3 and other destinations such as Redshift, OpenSearch, and Datadog. Firehose scales automatically to match the throughput of the data, and it can batch, compress, transform, and encrypt the data before delivery, which reduces the storage used and improves security.
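
As an illustration of the "transform" part, below is a minimal sketch of a Lambda function that Firehose could invoke on each batch. It relies on the Firehose transformation contract (base64-encoded records in, one result per recordId out). The choice to keep only the event name and keys is my own assumption for the example, as are the eventName and dynamodb.Keys field names it reads.

```python
import base64
import json


def lambda_handler(event, context):
    """Firehose transformation: reduce each change record to one compact JSON line."""
    output = []
    for record in event["records"]:
        try:
            change = json.loads(base64.b64decode(record["data"]))
            slim = {
                "event": change.get("eventName"),
                "keys": change.get("dynamodb", {}).get("Keys", {}),
            }
            line = (json.dumps(slim) + "\n").encode("utf-8")
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(line).decode("ascii"),
            })
        except (ValueError, AttributeError):
            # Payload was not valid base64/JSON or not a JSON object:
            # return it unchanged and flag it as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Every incoming recordId must appear exactly once in the response. Records marked ProcessingFailed are treated by Firehose as unsuccessfully processed rather than delivered with the normal output.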

4. High-Level Architecture Diagram

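[Figure: architecture diagram. DynamoDB table -> Kinesis data stream -> Kinesis Data Firehose delivery stream -> S3 bucket]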

5. Step-by-Step Data Lake implementation guide for DynamoDB tables using Kinesis Streams

In this guide, I use Kinesis Data Streams and Kinesis Data Firehose to deliver item-level changes from a DynamoDB table to S3, which serves as the data lake.

Step-1:

  • Create a Kinesis data stream and choose its capacity mode: either on-demand or provisioned
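
The same step can be scripted. The sketch below assumes boto3 and builds the arguments for the Kinesis CreateStream call in either capacity mode. The stream name and the helper names are hypothetical, and the client is passed in so that nothing runs against AWS until you call it yourself.

```python
def stream_request(name: str, on_demand: bool = True, shard_count: int = 1) -> dict:
    """Build keyword arguments for the Kinesis CreateStream API."""
    if on_demand:
        # On-demand mode: no shard count to provision.
        return {"StreamName": name, "StreamModeDetails": {"StreamMode": "ON_DEMAND"}}
    return {
        "StreamName": name,
        "ShardCount": shard_count,
        "StreamModeDetails": {"StreamMode": "PROVISIONED"},
    }


def create_stream(kinesis_client, name: str, **options) -> None:
    """Create the stream and wait until it exists.

    kinesis_client is expected to be boto3.client("kinesis").
    """
    kinesis_client.create_stream(**stream_request(name, **options))
    kinesis_client.get_waiter("stream_exists").wait(StreamName=name)


# Example (requires AWS credentials; "ddb-changes" is a made-up name):
#   import boto3
#   create_stream(boto3.client("kinesis"), "ddb-changes")
```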

Step-2:

  • Create a delivery stream, which sends the streamed data to the S3 bucket
  • Choose Kinesis Data Streams as the source and an S3 bucket as the destination
  • Under Source settings, select the data stream created in the earlier step
  • The data can be transformed in two ways before delivery: a Lambda function can transform the records (for example, when the stream data is not JSON), and record format conversion can turn JSON records into Apache Parquet or Apache ORC, using a table schema that we define in AWS Glue, for more efficient querying. Alternatively, the raw data can be sent directly to S3
  • Under Destination settings, select the S3 bucket where the streamed data will be stored. Set a custom S3 bucket prefix for the data and an error output prefix under which any failed records are written
  • Dynamic partitioning, enabled in the Destination settings, partitions the streaming data into multiple S3 folders according to our requirements. It can be enabled only when creating a delivery stream, not on an existing one.
  • We can set the S3 buffer limits with a buffer size and a buffer interval. Compression and encryption (for data records, plus server-side encryption) can also be enabled to reduce storage size and provide additional security.
  • After selecting all the required settings, create the delivery stream. Its status changes to Active once creation completes
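
For readers who prefer code to the console, the settings above map roughly onto the Firehose CreateDeliveryStream call. This is a sketch assuming boto3. The helper names, prefixes, and ARNs are placeholders of my own. For brevity it reuses one IAM role for both reading the stream and writing to S3, and it leaves out Lambda transformation, format conversion, and dynamic partitioning.

```python
def delivery_stream_request(name, stream_arn, bucket_arn, role_arn,
                            prefix="dynamodb/", error_prefix="errors/",
                            buffer_mb=5, buffer_seconds=300):
    """Build keyword arguments for the Firehose CreateDeliveryStream API."""
    return {
        "DeliveryStreamName": name,
        "DeliveryStreamType": "KinesisStreamAsSource",
        # Source: the Kinesis data stream created in Step-1.
        "KinesisStreamSourceConfiguration": {
            "KinesisStreamARN": stream_arn,
            "RoleARN": role_arn,
        },
        # Destination: S3 bucket, prefixes, buffering and compression.
        "ExtendedS3DestinationConfiguration": {
            "RoleARN": role_arn,
            "BucketARN": bucket_arn,
            "Prefix": prefix,
            "ErrorOutputPrefix": error_prefix,
            "BufferingHints": {
                "SizeInMBs": buffer_mb,
                "IntervalInSeconds": buffer_seconds,
            },
            "CompressionFormat": "GZIP",
        },
    }


def create_delivery_stream(firehose_client, **settings):
    """firehose_client is expected to be boto3.client("firehose")."""
    return firehose_client.create_delivery_stream(**delivery_stream_request(**settings))
```

Firehose flushes a batch to S3 when either buffer limit is reached first, so a smaller interval gives fresher data at the cost of more, smaller objects.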

Step-3:

  • Now, go to the DynamoDB console and enable Kinesis Data Streams for the required tables
  • From this point on, item modifications made to the DynamoDB table are captured and stored in S3
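
The console action in this step corresponds to the DynamoDB EnableKinesisStreamingDestination API. A minimal sketch, assuming boto3, with a hypothetical helper name and placeholder table and stream values:

```python
def enable_streaming(dynamodb_client, table_name: str, stream_arn: str) -> str:
    """Point a DynamoDB table at a Kinesis data stream and return the reported status.

    dynamodb_client is expected to be boto3.client("dynamodb").
    """
    response = dynamodb_client.enable_kinesis_streaming_destination(
        TableName=table_name,
        StreamArn=stream_arn,
    )
    return response["DestinationStatus"]


# Example (requires AWS credentials; the table name and ARN are made up):
#   import boto3
#   enable_streaming(
#       boto3.client("dynamodb"),
#       "Orders",
#       "arn:aws:kinesis:us-east-1:123456789012:stream/ddb-changes",
#   )
```

The call returns before the destination is fully active, so a script may want to poll describe_kinesis_streaming_destination until the status is ACTIVE before writing test items.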

6. Conclusion

Together, Kinesis Data Streams and Kinesis Data Firehose offer an efficient way to build a centralized data lake for advanced analytics, or to send the data to Redshift for optimized querying. The data in S3 can also be queried with Athena and visualized in QuickSight dashboards.

Because Kinesis is a managed service, AWS handles most of the administration, so developers can focus on their code rather than on managing the system. I hope this step-by-step guide has been useful to you.
