Overview
In today’s AI-driven world, efficiently handling large volumes of data is key to unlocking valuable insights. One critical aspect of data processing, especially in Retrieval-Augmented Generation (RAG) systems, is how data is chunked. Traditional methods, such as fixed-size or no chunking, may not always optimize retrieval performance. This blog introduces two advanced data chunking techniques, semantic chunking and hierarchical chunking, as well as the option to apply custom chunking logic using AWS Lambda. These approaches, now available in Amazon Bedrock Knowledge Bases, aim to preserve contextual integrity and enhance retrieval efficiency.
Introduction
Data chunking isn’t simply about breaking data into smaller pieces; it’s about transforming it into a format that facilitates the tasks of language models and retrieval systems. The real question is not “How do I chunk my data?” but “How can I best organize my data so that it’s efficient for retrieval and task completion?” With this in mind, Amazon Bedrock Knowledge Bases introduces advanced chunking strategies like semantic chunking and hierarchical chunking. These techniques offer more refined approaches to partitioning data, enhancing the ability of Foundation Models (FMs) to retrieve relevant and coherent information from a large corpus.
Advanced Data Chunking in Amazon Bedrock
Amazon Bedrock Knowledge Bases introduces two new chunking methods beyond traditional techniques: semantic chunking and hierarchical chunking. These new methods focus on preserving the context and relationships between different parts of the data, thereby improving the quality of results generated by RAG models.
Semantic Chunking
Semantic chunking focuses on breaking data into segments based on meaning and context. Instead of simply chopping data into equal parts, this method analyzes the relationships between sentences or paragraphs, creating chunks that preserve the integrity of the information. This is especially useful in cases where maintaining the semantic meaning is critical, such as in legal or technical documents.
For example, consider a technical manual that describes complex machinery operations. Semantic chunking ensures that instructions and descriptions related to specific functions stay together, making it easier for the model to retrieve and provide coherent responses.
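To make the idea concrete, here is a toy sketch of the breakpoint logic behind semantic chunking: adjacent sentences are grouped while their similarity stays above a threshold. Production systems use embedding models for similarity; this sketch substitutes simple word-overlap (Jaccard) similarity purely for illustration, and the sentences and threshold are made up.

```python
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two sentences (0.0 to 1.0)."""
    wa = set(w.strip(".,") for w in a.lower().split())
    wb = set(w.strip(".,") for w in b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def semantic_chunks(sentences: list[str], threshold: float = 0.2) -> list[list[str]]:
    """Start a new chunk whenever similarity to the previous sentence
    drops below the breakpoint threshold."""
    chunks = [[sentences[0]]]
    for prev, curr in zip(sentences, sentences[1:]):
        if jaccard(prev, curr) >= threshold:
            chunks[-1].append(curr)
        else:
            chunks.append([curr])
    return chunks

sentences = [
    "Tighten the valve before starting the pump.",
    "The pump valve must be fully tightened.",
    "Warranty claims require a proof of purchase.",
]
# The two valve-related sentences group together; the unrelated
# warranty sentence starts a new chunk.
print(semantic_chunks(sentences))
```

Swapping the Jaccard function for cosine similarity over sentence embeddings gives the behavior the managed service provides.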
To use semantic chunking in Amazon Bedrock:
- In the Knowledge Base creation process, choose the Advanced (customization) option under chunking and parsing configurations.
- Select Semantic chunking from the drop-down menu.
- Configure parameters such as:
- Max buffer size for grouping surrounding sentences: This defines how many neighboring sentences to include when evaluating semantic similarity. A buffer size of 1 includes the current sentence, the one before, and the one after.
- Max token size for a chunk: The maximum number of tokens a chunk can contain, ranging from 20 to 8,192 tokens.
- Breakpoint threshold: This defines the similarity threshold for combining chunks, with a recommended value of 95%.
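The same parameters can be supplied programmatically when creating a data source. Below is a sketch of the `chunkingConfiguration` payload using the field names exposed by the `bedrock-agent` `CreateDataSource` API as I understand them; the values mirror the console options above and are illustrative only.

```python
# Semantic chunking configuration for a Knowledge Base data source.
# This dict would be passed under vectorIngestionConfiguration, e.g.:
#   boto3.client("bedrock-agent").create_data_source(
#       knowledgeBaseId="...", name="...",
#       dataSourceConfiguration={...},
#       vectorIngestionConfiguration={"chunkingConfiguration": semantic_config},
#   )
semantic_config = {
    "chunkingStrategy": "SEMANTIC",
    "semanticChunkingConfiguration": {
        "maxTokens": 300,                     # max token size for a chunk (20-8,192)
        "bufferSize": 1,                      # surrounding sentences to group
        "breakpointPercentileThreshold": 95,  # recommended similarity threshold
    },
}
print(semantic_config)
```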
Hierarchical Chunking
Hierarchical chunking organizes data into a tree-like structure, breaking it into larger parent chunks and smaller child chunks. This structure enables efficient and granular information retrieval, making it easier for the model to retrieve relevant data based on its inherent relationships.
For instance, in an academic paper, hierarchical chunking can break the document into sections like introduction, methodology, and conclusion (parent chunks), while each section is further divided into sub-sections or paragraphs (child chunks). During retrieval, the model searches within child chunks but returns the parent chunk, ensuring both granularity and comprehensive context.
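The child-search, parent-return pattern can be sketched in a few lines. The document, queries, and overlap-counting "search" below are made up for illustration; a real system would match query embeddings against child-chunk embeddings.

```python
# Toy hierarchical retrieval: search over small child chunks,
# but return the enclosing parent chunk for full context.
document = {
    "Introduction": ["RAG combines retrieval with generation.",
                     "Chunking strategy affects retrieval quality."],
    "Methodology": ["We embed child chunks of about 300 tokens.",
                    "Parent chunks preserve section-level context."],
}

# Flatten into (child_chunk, parent_section) pairs for searching.
children = [(child, parent) for parent, kids in document.items() for child in kids]

def retrieve_parent(query: str) -> str:
    """Match the query against child chunks; return the whole parent chunk."""
    _, best_parent = max(
        children,
        key=lambda cp: len(set(query.lower().split()) & set(cp[0].lower().split())),
    )
    return " ".join(document[best_parent])  # parent chunk, not just the matched child

print(retrieve_parent("how are child chunks embedded"))
```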
To implement hierarchical chunking:
- Select Hierarchical chunking under the Advanced (customization) options during Knowledge Base creation.
- Configure the following:
- Max parent token size: The maximum number of tokens a parent chunk can contain (up to 8,192 tokens).
- Max child token size: The maximum number of tokens a child chunk can contain, typically around 300 tokens.
- Overlap tokens between chunks: The overlap between consecutive child chunks, typically around 20% of the child token size.
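As with semantic chunking, an equivalent configuration can be supplied through the API. This sketch uses the `bedrock-agent` field names as I understand them; the token sizes are illustrative, with the overlap set to roughly 20% of the child size.

```python
# Hierarchical chunking configuration for a Knowledge Base data source,
# passed under vectorIngestionConfiguration when creating the data source.
hierarchical_config = {
    "chunkingStrategy": "HIERARCHICAL",
    "hierarchicalChunkingConfiguration": {
        "levelConfigurations": [
            {"maxTokens": 1500},  # parent chunk size (up to 8,192)
            {"maxTokens": 300},   # child chunk size
        ],
        "overlapTokens": 60,      # ~20% of the child token size
    },
}
print(hierarchical_config)
```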
By maintaining the hierarchical relationship between parent and child chunks, this method ensures that context is preserved across different levels of granularity. This is ideal for complex, nested datasets like legal contracts or research papers.
Custom Processing with AWS Lambda
If the built-in chunking options aren’t sufficient for your use case, you can implement custom chunking logic using AWS Lambda functions. With AWS Lambda, you can go beyond chunking and apply custom logic for metadata processing or advanced data parsing.
To set up custom processing:
- Write an AWS Lambda function that defines your custom chunking logic or integrate a method from frameworks like LangChain or LlamaIndex.
- Create an AWS Lambda layer for the specific framework.
- Choose the appropriate Lambda function in the chunking and parsing configuration in the Knowledge Base creation workflow.
This level of customization allows you to tailor the chunking process to your specific requirements, adding another layer of flexibility to your RAG model.
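A minimal sketch of what such a function might look like is below. The `split_paragraphs` helper is the custom chunking logic (a made-up paragraph-merging strategy); the handler wiring, which reads input documents from an intermediate S3 bucket and writes chunk batches back per the contract described in the Amazon Bedrock documentation, is only outlined here.

```python
def split_paragraphs(text: str, max_words: int = 200) -> list[str]:
    """Chunk on blank lines, merging consecutive short paragraphs
    until a word budget is reached."""
    chunks, current, count = [], [], 0
    for para in (p.strip() for p in text.split("\n\n") if p.strip()):
        words = len(para.split())
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks

def lambda_handler(event, context):
    # In a real function: read each input file referenced in the event
    # from the intermediate S3 bucket, chunk its contents with
    # split_paragraphs, write the chunks back to S3, and return the
    # output file references that Amazon Bedrock expects.
    ...
```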
Conclusion
Effective data chunking is critical for improving retrieval efficiency and accuracy in AI systems. Semantic chunking keeps meaning-related content together, while hierarchical chunking preserves context across parent and child levels of a document.
For even more control, custom chunking logic can be implemented using AWS Lambda, providing the flexibility to adapt the chunking process to unique requirements.
Drop a query if you have any questions regarding Amazon Bedrock and we will get back to you quickly.
About CloudThat
CloudThat is an award-winning company and the first in India to offer cloud training and consulting services worldwide. As a Microsoft Solutions Partner, AWS Advanced Tier Training Partner, and Google Cloud Platform Partner, CloudThat has empowered over 850,000 professionals through 600+ cloud certifications, winning global recognition for its training excellence, including 20 MCT Trainers in Microsoft’s Global Top 100 and 12 awards in the last 8 years. CloudThat specializes in Cloud Migration, Data Platforms, DevOps, IoT, and cutting-edge technologies like Gen AI & AI/ML. It has delivered over 500 consulting projects for 250+ organizations in 30+ countries and continues to empower professionals and enterprises to thrive in the digital-first world.
FAQs
1. What is the difference between semantic chunking and hierarchical chunking?
ANS: – Semantic chunking focuses on dividing the data based on meaning and context, ensuring that related information stays together. On the other hand, hierarchical chunking organizes the data into a tree-like structure, with parent and child chunks, to maintain contextual relationships across different levels of the document.
2. When should I use custom chunking with AWS Lambda?
ANS: – Custom chunking with AWS Lambda should be used when the built-in chunking options don’t meet the specific needs of your application. For instance, if you have unique chunking requirements based on your data format or need to apply additional metadata processing, a custom AWS Lambda function can provide the flexibility to address these challenges.

WRITTEN BY Suresh Kumar Reddy
Suresh is a highly skilled and results-driven Generative AI Engineer with over three years of experience and a proven track record in architecting, developing, and deploying end-to-end LLM-powered applications. His expertise covers the full project lifecycle, from foundational research and model fine-tuning to building scalable, production-grade RAG pipelines and enterprise-level GenAI platforms. Adept at leveraging state-of-the-art models, frameworks, and cloud technologies, Suresh specializes in creating innovative solutions to address complex business challenges.