Overview
Modern Retrieval-Augmented Generation (RAG) applications rely on vector embeddings to retrieve relevant information from large datasets. Amazon S3 Vectors is a cost‐optimized object storage service that natively supports storing and querying high-dimensional vectors. Combining S3 Vectors with Amazon Bedrock Knowledge Bases gives you a fully managed RAG workflow that dramatically cuts costs while preserving sub-second retrieval performance. For example, S3 Vectors can reduce vector storage and query costs by up to 90% compared to SSD-based vector databases.
Prerequisites
- AWS Account: Ensure Amazon Bedrock is enabled, and you have AWS IAM access to Amazon Bedrock and Amazon S3 Vectors.
- AWS IAM Role/Permissions: You will need an AWS IAM role (or user) with permissions for S3 Vectors (s3vectors:CreateVectorBucket, s3vectors:CreateIndex, s3vectors:PutVectors, s3vectors:QueryVectors, etc.) and Amazon Bedrock Knowledge Bases. If you encrypt your vector bucket with a customer-managed AWS KMS key, also include AWS KMS permissions. Example:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3vectors:*",
        "s3:GetObject",
        "s3:PutObject",
        "bedrock:*",
        "kms:*"
      ],
      "Resource": "*"
    }
  ]
}
- Embedding Model Access: Bedrock supports Amazon Titan and other embedding models. For text data, Amazon Titan Text Embeddings v2 is a good choice (it outputs 1,024-dimension vectors by default). Depending on your use case, you can also use Cohere's embedding models or image embeddings.
- AWS CLI / SDK: Install AWS CLI v2 and configure it, or use Boto3 (Python). Use the latest version that supports Amazon S3 Vectors and Amazon Bedrock commands.
Step-by-Step Guide
Step 1: Prepare and Ingest Source Data into Amazon S3
First, upload your documents to a standard Amazon S3 bucket. This is the data source for your knowledge base. For example:
aws s3 cp ./my-manuals/ s3://my-doc-bucket/manuals/ --recursive
Once uploaded, Amazon Bedrock can ingest these files, chunk them, and generate embeddings.
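Bedrock handles chunking for you during ingestion, but the underlying idea can be sketched with a simple fixed-size splitter. This is purely illustrative; Bedrock's own chunking strategy is configurable and more sophisticated, and the sizes below are arbitrary:

```python
def chunk_text(text, chunk_size=300, overlap=50):
    """Split text into fixed-size chunks with overlap, so content that
    straddles a chunk boundary appears in both neighboring chunks."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

# A 650-character document yields three overlapping chunks
sample = "x" * 650
print(len(chunk_text(sample)))  # 3
```

Each chunk would then be embedded and stored as its own vector, which is why retrieval can point back to the specific passage that answered a query.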
Step 2: Create an Amazon S3 Vector Bucket and Index
# Create a new Amazon S3 vector bucket
aws s3vectors create-vector-bucket \
    --vector-bucket-name my-vector-bucket \
    --region us-east-1
# Create a vector index
aws s3vectors create-index \
    --vector-bucket-name my-vector-bucket \
    --index-name my-vector-index \
    --data-type float32 \
    --dimension 1024 \
    --distance-metric cosine \
    --region us-east-1
Python example:
import boto3

s3vectors = boto3.client('s3vectors', region_name='us-east-1')

# Create the vector bucket
s3vectors.create_vector_bucket(vectorBucketName='my-vector-bucket')

# Create the index; the dimension must match your embedding model
# (1024 for Titan Text Embeddings v2 by default)
s3vectors.create_index(
    vectorBucketName='my-vector-bucket',
    indexName='my-vector-index',
    dataType='float32',
    dimension=1024,
    distanceMetric='cosine'
)
Step 3: Generate and Store Vector Embeddings
import boto3, json

s3 = boto3.client('s3', region_name='us-east-1')
bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')
s3vectors = boto3.client('s3vectors', region_name='us-east-1')

# Load the document from the source bucket
bucket_name = 'my-doc-bucket'
object_key = 'manuals/chapter1.txt'
text = s3.get_object(Bucket=bucket_name, Key=object_key)['Body'].read().decode('utf-8')

# Generate the embedding with Titan Text Embeddings v2
# (the request field is "inputText"; the response field is "embedding")
response = bedrock.invoke_model(
    modelId='amazon.titan-embed-text-v2:0',
    contentType='application/json',
    body=json.dumps({"inputText": text})
)
embedding = json.loads(response['body'].read())['embedding']

# Store the vector in the S3 Vector index, with metadata pointing
# back to the source object
vectors = [{
    'key': 'manuals/chapter1',
    'data': {'float32': embedding},
    'metadata': {'source': f's3://{bucket_name}/{object_key}'}
}]
s3vectors.put_vectors(
    vectorBucketName='my-vector-bucket',
    indexName='my-vector-index',
    vectors=vectors
)
Step 4: Create and Configure the Amazon Bedrock Knowledge Base
aws bedrock-agent create-knowledge-base \
    --name "MyS3VectorKB" \
    --description "Knowledge base using S3 vector store" \
    --role-arn arn:aws:iam::123456789012:role/MyBedrockServiceRole \
    --knowledge-base-configuration '{
        "type": "VECTOR",
        "vectorKnowledgeBaseConfiguration": {
            "embeddingModelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
        }
    }' \
    --storage-configuration '{
        "type": "S3_VECTORS",
        "s3VectorsConfiguration": {
            "vectorBucketArn": "arn:aws:s3vectors:us-east-1:123456789012:bucket/my-vector-bucket",
            "indexName": "my-vector-index"
        }
    }'
Python example:
import boto3

bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')

response = bedrock_agent.create_knowledge_base(
    name='MyS3VectorKB',
    description='KB with S3 Vectors',
    roleArn='arn:aws:iam::123456789012:role/MyBedrockServiceRole',
    knowledgeBaseConfiguration={
        'type': 'VECTOR',
        'vectorKnowledgeBaseConfiguration': {
            'embeddingModelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0'
        }
    },
    storageConfiguration={
        'type': 'S3_VECTORS',
        's3VectorsConfiguration': {
            'vectorBucketArn': 'arn:aws:s3vectors:us-east-1:123456789012:bucket/my-vector-bucket',
            'indexName': 'my-vector-index'
        }
    }
)
Step 5: Query the Knowledge Base (RAG Workflow)
import boto3

# retrieve_and_generate lives on the bedrock-agent-runtime client
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

response = bedrock_agent_runtime.retrieve_and_generate(
    input={'text': 'What safety protocols does the manual describe?'},
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'your-kb-id',
            # Use a text-generation model here, not the embedding model
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    }
)

print("Answer:", response['output']['text'])
Use Cases
- Chatbots: Build customer or employee support bots that ground responses in your company data.
- Document Q&A: Query manuals, policies, or knowledge repositories with natural language.
- Research Assistants: Summarize and extract key points from large document collections.
Conclusion
Developers can create scalable and cost-efficient RAG solutions by combining Amazon S3 Vectors with Amazon Bedrock Knowledge Bases. The workflow is straightforward: ingest data into Amazon S3, build embeddings with Bedrock, store vectors in Amazon S3 Vector Buckets, and enable semantic search with Bedrock Knowledge Bases. This setup powers accurate, explainable, cost-effective AI-driven applications like chatbots and document search engines.
Drop a query if you have any questions regarding Amazon Bedrock and we will get back to you quickly.
About CloudThat
CloudThat is an award-winning company and the first in India to offer cloud training and consulting services worldwide. As a Microsoft Solutions Partner, AWS Advanced Tier Training Partner, and Google Cloud Platform Partner, CloudThat has empowered over 850,000 professionals through 600+ cloud certifications, winning global recognition for its training excellence, including 20 MCT Trainers in Microsoft's Global Top 100 and 12 awards in the last 8 years. CloudThat specializes in Cloud Migration, Data Platforms, DevOps, IoT, and cutting-edge technologies like Gen AI and AI/ML. It has delivered over 500 consulting projects for 250+ organizations in 30+ countries and continues to empower professionals and enterprises to thrive in the digital-first world.
FAQs
1. What embedding dimension should I choose when creating the vector index?
ANS: – The dimension must match the output size of your embedding model.
- For Titan Text Embedding v2, use 1024.
- For other models (like Cohere embeddings), check the model’s documentation.
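Titan Text Embeddings v2 also accepts a `dimensions` parameter (256, 512, or 1024) in its request body, letting you trade retrieval accuracy for storage and cost. A sketch of building that request body (the helper name is ours; whatever dimension you choose must match the index created in Step 2):

```python
import json

def titan_v2_body(text, dimensions=1024, normalize=True):
    """Build the invoke_model request body for amazon.titan-embed-text-v2:0.
    'dimensions' must match the dimension of the S3 Vector index."""
    if dimensions not in (256, 512, 1024):
        raise ValueError("Titan v2 supports 256, 512, or 1024 dimensions")
    return json.dumps({
        "inputText": text,
        "dimensions": dimensions,
        "normalize": normalize,
    })

body = titan_v2_body("hello world", dimensions=512)
print(json.loads(body)["dimensions"])  # 512
```

Smaller dimensions shrink every stored vector proportionally, which matters at the scale where S3 Vectors' cost advantage is the point.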
2. Can I store non-text data in Amazon S3 Vectors?
ANS: – Yes. Embeddings can come from text, images, or other modalities. For example, you could store image embeddings (using Titan Image Embedding) and build a semantic image search system.
3. What distance metric should I use: cosine, dot, or Euclidean?
ANS: –
- Cosine is the most common choice for semantic similarity in NLP tasks.
- Dot product ranks results identically to cosine when vectors are normalized to unit length; otherwise it also weighs vector magnitude.
- Euclidean is suitable when the absolute distance between vectors matters.
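The relationship between the three metrics can be seen with a small pure-Python example; note how cosine and dot product coincide when both vectors have unit length:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [1.0, 0.0]
b = [0.7071, 0.7071]  # ~unit vector at 45 degrees

print(round(cosine_similarity(a, b), 3))  # 0.707
print(round(dot(a, b), 3))                # 0.707 (equal: both are unit-length)
print(round(euclidean(a, b), 3))          # 0.765
```

Because embedding models like Titan v2 can normalize their outputs, cosine is usually the safe default, which is why Step 2 above creates the index with `distance-metric cosine`.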

WRITTEN BY Shantanu Singh
Shantanu Singh is a Research Associate at CloudThat with expertise in Data Analytics and Generative AI applications. Driven by a passion for technology, he has chosen data science as his career path and is committed to continuous learning. Shantanu enjoys exploring emerging technologies to enhance both his technical knowledge and interpersonal skills. His dedication to work, eagerness to embrace new advancements, and love for innovation make him a valuable asset to any team.