Data Lake Solution for Live Streaming Text & Audio Data with AWS Deep-Learning/ML Services

AWS Solution Overview

The architecture diagram below illustrates the high-level infrastructure components of the customer’s production environment.

[Figure: architecture diagram]

Data Engineering Pipeline for Text and Audio Data

This solution built a data engineering pipeline for the source data, which arrives as JSON text and audio.

The pipeline performs five steps on the source data:

  • Source text data from the customer application is first stored in a central data lake such as Amazon S3.
  • An ETL operation cleans the raw data.
  • A querying mechanism extracts important data from the cleaned data.
  • The audio data is transcribed into SRT format and translated into the desired language.
  • The transcribed and translated output is stored back in a central data store for further feature development.
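
The cleaning and subtitle-translation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the customer's actual implementation: the function names `clean_records` and `translate_srt`, the record fields, and the cleaning rules are all hypothetical, and the translator is passed in as a plain callable so the sketch runs without any AWS access.

```python
import json
from typing import Callable, Iterable, Iterator


def clean_records(lines: Iterable[str], required: Iterable[str]) -> Iterator[dict]:
    """Hypothetical ETL-style cleaning of newline-delimited JSON records."""
    required = list(required)
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # drop lines that are not valid JSON
        if not isinstance(record, dict):
            continue  # drop JSON values that are not objects
        # Trim surrounding whitespace from string values.
        record = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
        # Drop records where a required field is missing or empty.
        if any(record.get(key) in (None, "") for key in required):
            continue
        yield record


def translate_srt(srt_text: str, translate: Callable[[str], str]) -> str:
    """Translate only the caption text of an SRT document.

    Cue numbers and timestamps are kept as they are. Multi-line captions
    are joined into one line before translation (an assumption of this
    sketch, so the translator sees whole sentences).
    """
    out = []
    for block in srt_text.strip().split("\n\n"):
        lines = block.splitlines()
        if len(lines) < 3:
            out.append(block)  # not a complete cue; pass through unchanged
            continue
        index, timing, caption = lines[0], lines[1], " ".join(lines[2:])
        out.append("\n".join([index, timing, translate(caption)]))
    return "\n\n".join(out) + "\n"
```

In a deployment like the one described, the `translate` callable could wrap the Amazon Translate `translate_text` call from boto3 (passing `Text`, `SourceLanguageCode`, and `TargetLanguageCode`, and reading `TranslatedText` from the response), while the cleaning logic would more likely live in an AWS Glue ETL job than in a standalone function.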

AWS Services Leveraged

  • Amazon API Gateway
  • Amazon DynamoDB
  • Amazon S3
  • AWS Lambda
  • Amazon Transcribe
  • Amazon Translate
  • Amazon CloudFront
  • AWS Glue crawlers, Data Catalog, and ETL jobs
  • Amazon Athena
  • Amazon Kinesis Data Streams
  • Amazon Kinesis Data Firehose

Solution Outcome

  • The data-driven architecture supported new feature releases for the customer’s application, which increased customer registrations and improved the customer experience.
  • Integrating Amazon Transcribe and Amazon Translate helped attract customers from different language backgrounds.

WRITTEN BY Bhavesh Goswami

Bhavesh Goswami is the Founder & CEO of CloudThat Technologies and a leading expert in the Cloud Computing space. He was on the initial development team of Amazon Simple Storage Service (S3) at Amazon Web Services (AWS) in Seattle and has worked in the Cloud Computing and Big Data fields for over 12 years. He is a public speaker and has been the Keynote Speaker at the ‘International Conference on Computer Communication and Informatics’. He has also authored numerous research papers and patents in various fields.
