Overview
As industrial automation advances at a rapid pace, IoT (Internet of Things) technology is proving useful for monitoring machines and devices, keeping track of them, and taking action in particular situations, which saves time and accelerates business growth. These IoT devices generate a large amount of unstructured data, which must be stored in a central location for real-time monitoring or analytics, typically as files in a format such as CSV, Apache Parquet, or Apache ORC that is more efficient to query.
This blog looks at how to use Amazon Kinesis Data Firehose to store IoT data as files in Amazon S3.
Steps to Create S3 Bucket
Step 1: To create an S3 bucket, log into your AWS account and type S3 into the console’s search bar. The S3 service will appear below; click on it.
Step 2: Go to the left blade, select the Buckets option, and then click Create bucket. In the Bucket name field, enter s3-bucket-for-iot-data, and then select an AWS Region.
Step 3: Click the Create bucket option to create the bucket. You can see that the bucket named s3-bucket-for-iot-data has been created successfully.
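If you prefer to script this step, here is a minimal boto3 sketch that creates the same bucket. The Region used below is an assumption; adjust it to your account, and remember that bucket names are globally unique, so the exact name may already be taken.

```python
import boto3

# Create the S3 bucket that will receive the IoT data files.
# The Region below is an assumption; change it to the Region you selected in the console.
region = "ap-south-1"
s3 = boto3.client("s3", region_name=region)

s3.create_bucket(
    Bucket="s3-bucket-for-iot-data",
    # Omit CreateBucketConfiguration entirely if you are creating the bucket in us-east-1.
    CreateBucketConfiguration={"LocationConstraint": region},
)
```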
Steps to Create an IoT Data Delivery Stream in Kinesis Data Firehose
Step 1: To create a Kinesis Data Firehose delivery stream, enter Kinesis in the AWS console search bar and then select the Kinesis service.
Step 2: In the left blade of Amazon Kinesis, select Delivery streams and then click the Create delivery stream option. Under Choose source and destination, select Direct PUT as the Source, select Amazon S3 as the Destination, and finally enter IoT_data_delivery_stream as the Delivery stream name.
In the Destination settings, click the Browse option below the S3 bucket field.
Step 3: Choose the previously created bucket, s3-bucket-for-iot-data, and then click Choose. Enter IoT-data in the S3 bucket prefix – optional field; Kinesis Data Firehose will create folders under this prefix in S3 using the "YYYY/MM/dd/HH" format.
Step 4: Expand the Buffer hints, compression and encryption option, enter 1 MiB as the Buffer size, and enter 60 seconds as the Buffer interval.
Step 5: Finally, choose Enabled for Amazon CloudWatch error logging, and under Permissions choose the Create or update IAM role option.
Click the Create delivery stream option; IoT_data_delivery_stream will be created in a matter of moments.
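The same delivery stream can also be created programmatically. The sketch below is a minimal boto3 equivalent of Steps 2–5; the IAM role ARN is a placeholder and assumes a role that already allows Firehose to write to the bucket (the console's Create or update IAM role option creates such a role for you).

```python
import boto3

firehose = boto3.client("firehose")

# Placeholder account ID and role name - replace with the Firehose delivery role in your account.
firehose_role_arn = "arn:aws:iam::123456789012:role/firehose-s3-delivery-role"

firehose.create_delivery_stream(
    DeliveryStreamName="IoT_data_delivery_stream",
    DeliveryStreamType="DirectPut",  # matches the Direct PUT source chosen above
    ExtendedS3DestinationConfiguration={
        "RoleARN": firehose_role_arn,
        "BucketARN": "arn:aws:s3:::s3-bucket-for-iot-data",
        "Prefix": "IoT-data",  # Firehose appends the YYYY/MM/dd/HH folder structure to this prefix
        "BufferingHints": {"SizeInMBs": 1, "IntervalInSeconds": 60},
        "CompressionFormat": "UNCOMPRESSED",
        "CloudWatchLoggingOptions": {
            "Enabled": True,
            "LogGroupName": "/aws/kinesisfirehose/IoT_data_delivery_stream",
            "LogStreamName": "DestinationDelivery",
        },
    },
)
```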
Steps to Create IoT Rule to Send Data to the Kinesis Data Firehose
Step 1: Type IoT Core into the search box, then select IoT Core from the services menu.
Step 2: Click Act in the left blade of the IoT Core console, then Rules. To create the rule that will transfer IoT data to the S3 bucket, go to the Rules console and click Create.
Enter IoT_data_rule as the name and describe the rule in the Description section.
Step 3: In the Rule query statement, select 2016-03-23 as the SQL version and type SELECT * FROM 'iotdevice/+/data', where * selects the entire message payload published to the topic iotdevice/+/data, and + is a single-level wildcard that matches any device ID.
To add an action to a rule, select Add action.
Step 4: To add an action, click Send a message to an Amazon Kinesis Firehose stream in the Select an action blade and then click Configure action at the bottom of the page.
Then select IoT_data_delivery_stream from the Stream name dropdown.
Step 5: To create a role that allows AWS IoT Core to send data to Kinesis Data Firehose, click the Create Role option.
Enter the role name as iot-kinesis-delivery-role and click Create Role.
Finally, click the Create rule option to create the rule.
In a matter of moments, the IoT rule for sending data to Kinesis Data Firehose will be created.
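For reference, the same rule can be created with boto3 as sketched below, assuming the iot-kinesis-delivery-role created in Step 5 (the account ID in the ARN is a placeholder).

```python
import boto3

iot = boto3.client("iot")

# Placeholder account ID - use the ARN of the iot-kinesis-delivery-role created above.
iot_role_arn = "arn:aws:iam::123456789012:role/iot-kinesis-delivery-role"

iot.create_topic_rule(
    ruleName="IoT_data_rule",
    topicRulePayload={
        "awsIotSqlVersion": "2016-03-23",
        "sql": "SELECT * FROM 'iotdevice/+/data'",
        "description": "Forward IoT device data to Kinesis Data Firehose",
        "actions": [
            {
                "firehose": {
                    "roleArn": iot_role_arn,
                    "deliveryStreamName": "IoT_data_delivery_stream",
                    "separator": "\n",  # newline-separate records in the delivered file
                }
            }
        ],
        "ruleDisabled": False,
    },
)
```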
Testing
- Go to the AWS IoT Core dashboard, then in the left blade select Test and then MQTT test client to test the IoT rule.
- On the MQTT test client page, select the Publish to a topic option, enter iotdevice/22/data as the Topic name, and enter {"Temperature": 20.5, "Humidity": 50, "Co2": 450} as the Message payload.
- Click the Publish option, then publish again with different data, such as {"Temperature": 21.5, "Humidity": 60, "Co2": 350}. Both messages have now been published (a small publishing script is sketched after this list).
- To verify the published data, wait 60 seconds, then open the S3 console and choose the bucket created in the first step, s3-bucket-for-iot-data.
- You will find a folder named IoT-data2022, that is, the IoT-data prefix followed by the year; open that folder to see the published data.
- Inside, you will find a file whose name starts with IoT_data_delivery_stream followed by a date.
- If you download and open the file, you will find all the data ingested within the 60-second buffer interval.
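Instead of the MQTT test client, the test messages can also be published from a small script. The sketch below uses the boto3 iot-data client with the same topic and payloads as above; it assumes your credentials are allowed to call iot:Publish in the account's AWS IoT endpoint.

```python
import json
import boto3

# Publish the same test payloads to a topic matched by 'iotdevice/+/data'.
iot_data = boto3.client("iot-data")

for payload in (
    {"Temperature": 20.5, "Humidity": 50, "Co2": 450},
    {"Temperature": 21.5, "Humidity": 60, "Co2": 350},
):
    iot_data.publish(
        topic="iotdevice/22/data",
        qos=1,
        payload=json.dumps(payload),
    )
```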
Conclusion
Thus, the above integration flow shows how to ingest IoT data into Amazon S3 with the help of Amazon Kinesis Data Firehose. The advantage of using Kinesis Data Firehose is that it batches large volumes of IoT data into files and can apply transformations on the fly if necessary. This approach helps collect data from a vast number of IoT devices and store it in S3.
About CloudThat
CloudThat is an award-winning company and the first in India to offer cloud training and consulting services worldwide. As a Microsoft Solutions Partner, AWS Advanced Tier Training Partner, and Google Cloud Platform Partner, CloudThat has empowered over 850,000 professionals through 600+ cloud certifications, winning global recognition for its training excellence, including 20 MCT Trainers in Microsoft's Global Top 100 and an impressive 12 awards in the last 8 years. CloudThat specializes in Cloud Migration, Data Platforms, DevOps, IoT, and cutting-edge technologies like Gen AI & AI/ML. It has delivered over 500 consulting projects for 250+ organizations in 30+ countries and continues to empower professionals and enterprises to thrive in the digital-first world.
FAQs
1. What is the use of Kinesis Data Firehose?
ANS: – Amazon Kinesis Data Firehose is an extract, transform, and load (ETL) service that collects, processes, and distributes streaming data to data lakes, data storage, and analytics services with high reliability.
2. What type of data compression format is supported by AWS Kinesis Data Firehose?
ANS: – GZIP, ZIP, and SNAPPY compression formats.

WRITTEN BY Vasanth Kumar R
Vasanth Kumar R works as a Sr. Research Associate at CloudThat. He is highly focused and passionate about learning new cutting-edge technologies, including Cloud Computing, AI/ML, and IoT/IIoT. He has experience with AWS and Azure Cloud Services, Embedded Software, and IoT/IIoT Development, and has also worked with various sensors, actuators, and electrical panels for Greenhouse Automation.