Building a Complete Voice AI Agent Using Amazon Nova Sonic

Introduction

AI-driven voice solutions are changing how contact centres operate by enabling natural, real-time conversations between customers and virtual agents. They reduce wait times and operational costs while maintaining the personalized, human-like interaction that customers value. With the introduction of Amazon Nova Sonic on Amazon Bedrock, developers can now build conversational AI agents that communicate by voice end to end, without separate speech recognition and text-to-speech systems.


Solution overview


The following layers make up the solution:

Frontend layer:
An Amazon CloudFront distribution serves as the content delivery network for the web application.
Amazon Simple Storage Service (Amazon S3) hosts the static assets.
The user interface handles user interaction and audio streaming.
Communication layer:
A Network Load Balancer manages WebSocket connections. Applications that stream audio in real time rely on WebSockets for two-way, interactive communication between the user’s browser and the server.
Amazon Cognito provides user authentication and JSON Web Token (JWT) validation. Because it offers user management, authentication, and authorization for web and mobile apps, you do not need to build and maintain your own identity system.
Processing layer:
Amazon Elastic Container Service (Amazon ECS) runs the containerized backend service.
AWS Fargate provides serverless compute for the containers, and Amazon ECS provides orchestration.
A Python backend manages interactions with Amazon Nova Sonic and handles the audio streams.
Intelligence layer:
Speech processing is handled using Amazon Bedrock’s Amazon Nova Sonic model.
Customer data is stored in Amazon DynamoDB.
By integrating foundation models (FMs) with your data sources, Amazon Bedrock Knowledge Bases enables AI applications to access precise, current data unique to your enterprise.
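
As a rough illustration of the JWT validation mentioned in the communication layer, the sketch below decodes a token's payload and checks three standard claims that Amazon Cognito tokens carry (`exp`, `iss`, `token_use`). It is a simplified sketch, not the sample's actual code: the function names are hypothetical, and it deliberately skips signature verification, which a real backend must perform against the user pool's published signing keys before trusting any claim.

```python
import base64
import json
import time
from typing import Optional


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("a JWT has three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url without padding; restore the padding first.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def claims_look_valid(
    claims: dict, region: str, user_pool_id: str, now: Optional[float] = None
) -> bool:
    """Check expiry, issuer, and token_use. Signature verification is NOT done here."""
    now = time.time() if now is None else now
    # Cognito user pool tokens use this issuer format.
    expected_issuer = f"https://cognito-idp.{region}.amazonaws.com/{user_pool_id}"
    return (
        claims.get("exp", 0) > now
        and claims.get("iss") == expected_issuer
        and claims.get("token_use") in ("access", "id")
    )
```

Rejecting a token at this stage is cheap, but accepting one still requires the signature check that this sketch omits.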

The following sequence diagram shows the flow that occurs when a user starts a conversation. Authentication is required, but the user logs in only once. Steps 3 and 4 run each time the user begins a new session. The conversational loop in Steps 6–12 repeats throughout the spoken interaction. Steps a through c occur only when the Amazon Nova Sonic agent decides to use a tool; when no tool is used, the flow proceeds directly from Step 9 to Step 10.
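
The branching in that loop (tool steps only when the model asks for a tool, otherwise straight on to the audio response) can be sketched as a small dispatcher. This is a toy model under stated assumptions: the event dictionaries, the `run_turns` function, and the tool registry are all hypothetical and do not reflect the Amazon Nova Sonic wire format or the sample's backend code.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical tool registry: tool name -> function taking and returning a dict.
ToolFn = Callable[[dict], dict]


def run_turns(events: Iterable[dict], tools: Dict[str, ToolFn]) -> List[dict]:
    """Walk model events in order and collect what the backend would send onward.

    Event shapes here are illustrative only.
    """
    outbound: List[dict] = []
    for event in events:
        if event["type"] == "tool_use":
            # Steps a-c: run the requested tool and hand the result back to the model.
            tool = tools.get(event["name"])
            if tool is None:
                result = {"error": f"unknown tool: {event['name']}"}
            else:
                result = tool(event.get("input", {}))
            outbound.append({"to": "model", "type": "tool_result", "content": result})
        elif event["type"] == "audio":
            # No tool involved: forward the model's audio to the browser.
            outbound.append({"to": "client", "type": "audio", "content": event["chunk"]})
    return outbound
```

Returning an error payload for an unknown tool, rather than raising, keeps the session alive so the model can recover in the next turn; that is a design choice of this sketch, not something the source specifies.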

[Figure: sequence diagram of login, session start, and the conversation loop]

Deploy the solution:

The Git repository contains the solution and comprehensive deployment instructions. The solution uses the AWS CDK to automate infrastructure deployment. To get started, run the following commands in a terminal where the AWS Command Line Interface (AWS CLI) is configured:

git clone https://github.com/aws-samples/sample-sonic-cdk-agent.git 
cd nova-s2s-call-center 

# Configure environment variables
cp template.env .env

# Edit .env with your settings

# Deploy the solution 
./deploy.sh


The deployment creates two AWS CloudFormation stacks:

Network stack for networking components and virtual private clouds (VPCs)
Application resource stack

To reach the login page, open the Amazon CloudFront distribution URL listed in the outputs of the second stack.
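
If you prefer to read that URL programmatically, a sketch along these lines may help. It scans the response of the CloudFormation `describe_stacks` call for an output value on a `cloudfront.net` domain. The helper names are hypothetical, the stack name is whatever your deployment produced (the source does not state it), and the match will fail if the distribution is exposed only under a custom domain.

```python
from typing import Optional


def find_cloudfront_url(describe_stacks_response: dict) -> Optional[str]:
    """Return the first stack output value that points at a cloudfront.net domain."""
    for stack in describe_stacks_response.get("Stacks", []):
        for output in stack.get("Outputs", []):
            value = output.get("OutputValue", "")
            if "cloudfront.net" in value:
                # Outputs may hold a bare domain name; add a scheme if it is missing.
                return value if value.startswith("http") else f"https://{value}"
    return None


def fetch_cloudfront_url(stack_name: str, region: str) -> Optional[str]:
    """Query CloudFormation for the stack's outputs (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    client = boto3.client("cloudformation", region_name=region)
    return find_cloudfront_url(client.describe_stacks(StackName=stack_name))
```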


The AWS CLI command below may be used to create an Amazon Cognito admin user:

aws cognito-idp admin-create-user \
  --user-pool-id YOUR_USER_POOL_ID \
  --username USERNAME \
  --user-attributes Name=email,Value=USER_EMAIL \
  --temporary-password TEMPORARY_PASSWORD \
  --region YOUR_AWS_REGION


The previous command uses the following arguments:

YOUR_USER_POOL_ID: The ID of your Amazon Cognito user pool
USERNAME: The desired user name for the user
USER_EMAIL: The email address of the user
TEMPORARY_PASSWORD: A temporary password for the user
YOUR_AWS_REGION: Your AWS Region (for example, us-east-1)
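
For scripted setups, the same operation is available from Python through the boto3 `cognito-idp` client's `admin_create_user` call. The following is a minimal sketch that mirrors the CLI arguments above; the two function names are hypothetical, and running `create_admin_user` requires boto3 plus credentials allowed to administer the user pool.

```python
from typing import Any, Dict


def build_admin_create_user_args(
    user_pool_id: str, username: str, email: str, temporary_password: str
) -> Dict[str, Any]:
    """Map the CLI placeholders onto the keyword arguments boto3 expects."""
    return {
        "UserPoolId": user_pool_id,
        "Username": username,
        "UserAttributes": [{"Name": "email", "Value": email}],
        "TemporaryPassword": temporary_password,
    }


def create_admin_user(region: str, **user_fields: str) -> Dict[str, Any]:
    """Create the user in Amazon Cognito (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the argument builder runs without boto3

    client = boto3.client("cognito-idp", region_name=region)
    return client.admin_create_user(**build_admin_create_user_args(**user_fields))
```

A call would look like `create_admin_user("us-east-1", user_pool_id=..., username=..., email=..., temporary_password=...)`, with your own values in place of the ellipses.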

You will be prompted to create a new password after logging in with your temporary one via the Amazon CloudFront distribution link.
To initiate communication with your assistant, select Start Session. Try out several tools and prompts for your use case.


Customizing the application

The ability to customize the AI agent’s skills to your unique use case is a crucial aspect of this technology. This extensibility is demonstrated by the sample implementation using knowledge integration and custom tools:

Customer information lookup: retrieves customer profile data from Amazon DynamoDB, using phone numbers as keys.
Knowledge base search: looks up company facts, plan specifics, and prices in an Amazon Bedrock knowledge base.

These features demonstrate the use of external data sources and domain-specific expertise to improve Amazon Nova Sonic’s capabilities.
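
To make the first of those tools concrete, here is a minimal sketch of a phone-number lookup. It assumes a table object that behaves like a boto3 DynamoDB `Table` resource (whose `get_item` returns a dictionary with an `Item` key only when a match exists). The function name, the `phone_number` key attribute, and the in-memory stand-in table are assumptions made for illustration, not details taken from the sample repository.

```python
from typing import Any, Dict, List, Optional


def lookup_customer(table: Any, phone_number: str) -> Optional[Dict[str, Any]]:
    """Fetch one customer profile keyed by phone number, or None if absent.

    The key attribute name "phone_number" is an assumption of this sketch.
    """
    # Keep digits and a leading "+" so spoken or formatted numbers match stored keys.
    normalized = "".join(ch for ch in phone_number if ch.isdigit() or ch == "+")
    response = table.get_item(Key={"phone_number": normalized})
    return response.get("Item")


class InMemoryTable:
    """Stand-in for a DynamoDB table so the sketch runs without AWS access."""

    def __init__(self, items: List[Dict[str, Any]]) -> None:
        self._items = {item["phone_number"]: item for item in items}

    def get_item(self, Key: Dict[str, str]) -> Dict[str, Any]:
        item = self._items.get(Key["phone_number"])
        # Like DynamoDB, omit the "Item" key entirely when nothing matches.
        return {"Item": item} if item is not None else {}
```

With a real deployment, `table` would instead come from something like `boto3.resource("dynamodb").Table(...)`, using the table name your stack created.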

Conclusion

Voice AI agents are reshaping customer service by enabling businesses to deliver personalized, always-available, and highly scalable support while optimizing costs. Amazon Nova Sonic and AWS cloud-native services make these solutions considerably more accessible to build. By combining built-in security, a flexible architecture, and real-time data integration, organizations can move rapidly from an idea to a working proof of concept. This approach simplifies the development of voice-driven applications and sets the stage for innovation across industries.

Drop a query if you have any questions regarding Amazon Nova Sonic and we will get back to you quickly.



FAQs

1. What is Amazon Nova Sonic?

ANS: – It’s a speech-to-speech model in Amazon Bedrock that enables real-time, natural voice conversations without separate ASR and TTS components.

2. Can I connect it to my business data?

ANS: – Yes. You can integrate with Amazon DynamoDB, Amazon Bedrock Knowledge Bases, and custom tools for real-time data access.

3. How do I deploy this solution?

ANS: – Use the AWS CDK from the sample GitHub repo to quickly deploy networking, authentication, and backend components.

WRITTEN BY Aayushi Khandelwal

Aayushi is a data and AIoT professional at CloudThat, specializing in generative AI technologies. She is passionate about building intelligent, data-driven solutions powered by advanced AI models. With a strong foundation in machine learning, natural language processing, and cloud services, Aayushi focuses on developing scalable systems that deliver meaningful insights and automation. Her expertise includes working with tools like Amazon Bedrock, AWS Lambda, and various open-source AI frameworks.