Exporting Data from Amazon RDS PostgreSQL to Amazon S3 Using aws_s3.query_export_to_s3

Introduction

As cloud architectures evolve, efficiently moving data from transactional databases to scalable storage like Amazon S3 becomes increasingly critical. Amazon RDS for PostgreSQL offers a built-in extension called aws_s3, which enables direct exports of query results to Amazon S3 through a secure, server-side operation.

One of the standout features provided by this extension is the aws_s3.query_export_to_s3 function, which greatly simplifies and accelerates data transfers without the need for intermediate storage or complex scripts.

In this blog post, we will walk you through the setup, permissions, usage, and best practices for using aws_s3.query_export_to_s3 to export data from Amazon RDS PostgreSQL to Amazon S3.


Why use aws_s3.query_export_to_s3?

This function provides several key advantages:

  • Direct export to Amazon S3: Eliminates the need to first stage data locally.
  • Performance optimization: Leverages the Amazon RDS engine’s capabilities for efficient data transfer.
  • Simplified workflows: Reduces the need for external scripts, manual exports, or tools like pg_dump.

Prerequisites

Before proceeding, ensure you meet the following requirements:

  1. Running Amazon RDS for PostgreSQL on a version that supports Amazon S3 export (PostgreSQL 10.14, 11.9, 12.4, or any later major version).
  2. The aws_s3 extension must be enabled on your database.
  3. An AWS IAM role is assigned to your Amazon RDS instance with permissions to write to the target Amazon S3 bucket.
  4. An Amazon S3 bucket where the exported data will be stored.
  5. (Recommended) A VPC endpoint for Amazon S3 to enable secure, fast, private transfers.


Step-by-Step Guide

Step 1: Create an AWS IAM Role

You will first need an AWS IAM role that Amazon RDS can assume to access your Amazon S3 bucket.

  1. Go to the AWS IAM Console and create a new role.
  2. Select AWS service -> Amazon RDS as the trusted entity.
  3. Attach a policy like the one below:
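A minimal policy granting write access to a single bucket might look like the following; the bucket name my-export-bucket is a placeholder for your own:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "S3Export",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:AbortMultipartUpload"
      ],
      "Resource": "arn:aws:s3:::my-export-bucket/*"
    }
  ]
}
```

Scoping the Resource to a specific bucket (or prefix) keeps the role from writing anywhere else in your account.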

  4. After creating the role, attach it to your Amazon RDS instance using the AWS CLI:
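The instance identifier, account ID, role name, and Region below are placeholders; substitute your own values:

```shell
# Associate the IAM role with the RDS instance for the S3 export feature.
aws rds add-role-to-db-instance \
    --db-instance-identifier my-postgres-instance \
    --feature-name s3Export \
    --role-arn arn:aws:iam::123456789012:role/rds-s3-export-role \
    --region us-east-1
```

The --feature-name s3Export flag tells Amazon RDS which capability this role authorizes.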

Step 2: Enable the aws_s3 Extension

Connect to your Amazon RDS PostgreSQL instance and run:
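A single statement installs the extension; the CASCADE clause also installs the aws_commons helper extension it depends on:

```sql
CREATE EXTENSION IF NOT EXISTS aws_s3 CASCADE;
```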

This ensures the necessary functionality is available.

Step 3: Export Query Results to Amazon S3

Use the aws_s3.query_export_to_s3 function to export your data directly to Amazon S3:
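A generic invocation might look like the following sketch; the table, bucket, object key, and Region are placeholders:

```sql
SELECT * FROM aws_s3.query_export_to_s3(
    'SELECT * FROM my_table',            -- query whose results are exported
    aws_commons.create_s3_uri(
        'my-export-bucket',              -- target bucket (placeholder)
        'exports/my_table.csv',          -- object key within the bucket
        'us-east-1'                      -- bucket's AWS Region
    ),
    options := 'format csv, header'      -- COPY-style formatting options
);
```

The function returns the number of rows uploaded, the number of files created, and the total bytes transferred.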

Parameters:

  • query: The SQL query whose result you want to export.
  • create_s3_uri: A helper function (aws_commons.create_s3_uri) that builds the target Amazon S3 URI from the bucket name, object key, and AWS Region.
  • options: COPY-style formatting options such as format csv, header, and delimiter.

Example Use Case

Suppose you want to export the last seven days of order data from an e-commerce platform:
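Assuming a hypothetical orders table with a created_at timestamp column, the export could be written as:

```sql
SELECT * FROM aws_s3.query_export_to_s3(
    -- Filter to the last seven days of orders (note the doubled quotes
    -- needed inside the query string literal).
    'SELECT order_id, customer_id, total, created_at
       FROM orders
      WHERE created_at >= now() - interval ''7 days''',
    aws_commons.create_s3_uri(
        'my-export-bucket',                  -- placeholder bucket name
        'exports/orders_last_7_days.csv',    -- placeholder object key
        'us-east-1'
    ),
    options := 'format csv, header'
);
```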

Once complete, the CSV file will appear in the specified Amazon S3 bucket, ready for use with Amazon Athena, Amazon Redshift Spectrum, or other analytics tools.

Security Tips

  • Restrict bucket access with precise bucket policies.
  • Enable encryption using SSE-S3 or SSE-KMS to protect your data at rest.
  • Monitor access by enabling Amazon S3 server access logs and auditing activities through AWS CloudTrail.

Best Practices for Implementation

  • Partitioned Exports: Use query filters (e.g., WHERE timestamp >= now() - interval '1 day') to export incremental data and avoid full-table exports.
  • File Format Optimization: Export to compressed or columnar formats like Parquet using external transformation jobs post-export.
  • Amazon S3 Bucket Lifecycle Rules: Set up retention policies to automatically archive or delete older export files.
  • Monitoring and Alerts: Log export failures and metrics via PostgreSQL logs or integrate with Amazon CloudWatch and AWS CloudTrail for visibility.

Use Cases in Production

  • Daily/Hourly Data Archival: Automate exports of transactional data to S3 for audit trails, compliance, or long-term storage.
  • Data Lake Integration: Move OLTP data to Amazon S3 for downstream analytics using Amazon Athena, Amazon Redshift Spectrum, or Amazon EMR.
  • Backup for Analytics: Export query-based snapshots of critical tables (e.g., user logs, financial transactions) to Amazon S3 for backup or analysis.
  • Hybrid Workflows: Combine on-demand exports with AWS Glue or Amazon SageMaker pipelines for machine learning and reporting.

Conclusion

The aws_s3.query_export_to_s3 function provides a highly efficient and straightforward method for exporting data from Amazon RDS PostgreSQL to Amazon S3. By eliminating the need for intermediate storage or external tools, it enables faster, safer, and more cost-effective data transfers.

For automation, you can schedule exports using the pg_cron extension or build workflows with AWS Lambda, AWS Step Functions, or Amazon EventBridge, creating a seamless bridge between your OLTP databases and cloud data lakes.
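As a sketch of the pg_cron approach, the following schedules a nightly export at 02:00 UTC; the job name, table, and bucket are placeholders, and pg_cron must already be enabled on the instance (via the shared_preload_libraries parameter):

```sql
-- Schedule a daily incremental export using pg_cron.
SELECT cron.schedule(
    'nightly-orders-export',   -- job name (placeholder)
    '0 2 * * *',               -- standard cron syntax: daily at 02:00 UTC
    $$SELECT * FROM aws_s3.query_export_to_s3(
        'SELECT * FROM orders WHERE created_at >= now() - interval ''1 day''',
        aws_commons.create_s3_uri('my-export-bucket', 'exports/orders_daily.csv', 'us-east-1'),
        options := 'format csv, header'
    )$$
);
```

Note that this example overwrites the same object key each night; in practice you would typically embed a date in the key or post-process the files.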

Drop a query if you have any questions regarding Amazon RDS PostgreSQL or Amazon S3 and we will get back to you quickly.


About CloudThat

CloudThat is a leading provider of Cloud Training and Consulting services with a global presence in India, the USA, Asia, Europe, and Africa. Specializing in AWS, Microsoft Azure, GCP, VMware, Databricks, and more, the company serves mid-market and enterprise clients, offering comprehensive expertise in Cloud Migration, Data Platforms, DevOps, IoT, AI/ML, and more.

CloudThat is the first Indian Company to win the prestigious Microsoft Partner 2024 Award and is recognized as a top-tier partner with AWS and Microsoft, including the prestigious ‘Think Big’ partner award from AWS and the Microsoft Superstars FY 2023 award in Asia & India. Having trained 650k+ professionals in 500+ cloud certifications and completed 300+ consulting projects globally, CloudThat is an official AWS Advanced Consulting Partner, Microsoft Gold Partner, AWS Training Partner, AWS Migration Partner, AWS Data and Analytics Partner, AWS DevOps Competency Partner, AWS GenAI Competency Partner, Amazon QuickSight Service Delivery Partner, Amazon EKS Service Delivery Partner, AWS Microsoft Workload Partner, Amazon EC2 Service Delivery Partner, Amazon ECS Service Delivery Partner, AWS Glue Service Delivery Partner, Amazon Redshift Service Delivery Partner, AWS Control Tower Service Delivery Partner, AWS WAF Service Delivery Partner, Amazon CloudFront Service Delivery Partner, Amazon OpenSearch Service Delivery Partner, AWS DMS Service Delivery Partner, AWS Systems Manager Service Delivery Partner, Amazon RDS Service Delivery Partner, AWS CloudFormation Service Delivery Partner, and many more.

FAQs

1. Can I export very large datasets with aws_s3.query_export_to_s3?

ANS: – Yes, but for extremely large datasets, exporting data in smaller chunks (such as by date range) is advisable to avoid timeout errors or memory issues.

2. Can I export to formats other than CSV?

ANS: – Currently, the function only supports the CSV format. You can adjust delimiters, headers, and other CSV-specific options via the options parameter.

WRITTEN BY Sunil H G

Sunil H G is a highly skilled and motivated Research Associate at CloudThat. He is an expert in working with popular data analysis and visualization libraries such as Pandas, Numpy, Matplotlib, and Seaborn. He has a strong background in data science and can effectively communicate complex data insights to both technical and non-technical audiences. Sunil's dedication to continuous learning, problem-solving skills, and passion for data-driven solutions make him a valuable asset to any team.
