Introduction
As cloud architectures evolve, efficiently moving data from transactional databases to scalable storage like Amazon S3 becomes increasingly critical. Amazon RDS for PostgreSQL offers a built-in extension called aws_s3, which enables direct exports of query results to Amazon S3 through a secure, server-side operation.
One of the standout features of this extension is the aws_s3.query_export_to_s3 function, which greatly simplifies and accelerates data transfers by removing the need for intermediate storage or complex scripts.
In this blog post, we will walk you through the setup, permissions, usage, and best practices for using aws_s3.query_export_to_s3 to export data from Amazon RDS PostgreSQL to Amazon S3.
Why use aws_s3.query_export_to_s3?
This function provides several key advantages:
- Direct export to Amazon S3: Eliminates the need to first stage data locally before uploading.
- Performance optimization: Leverages the Amazon RDS engine’s capabilities for efficient data transfer.
- Simplified workflows: Reduces the need for external scripts, manual exports, or tools like pg_dump.
Prerequisites
Before proceeding, ensure you meet the following requirements:
- Running Amazon RDS for PostgreSQL on a version that supports export to Amazon S3 (PostgreSQL 10.14, 11.9, 12.4, and later releases); a quick version check is shown after this list.
- The aws_s3 extension must be enabled on your database.
- An AWS IAM role attached to your Amazon RDS instance with permission to write to the target Amazon S3 bucket.
- An Amazon S3 bucket where the exported data will be stored.
- (Recommended) A VPC endpoint for Amazon S3 to enable secure, fast, private transfers.
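A quick way to verify the first two prerequisites from a psql session is to check the engine version and confirm that the extension is available for installation. This is a minimal check using standard PostgreSQL catalogs:

```sql
-- Check the running PostgreSQL engine version
SHOW server_version;

-- Confirm aws_s3 (and its aws_commons dependency) can be installed
SELECT name, default_version
FROM pg_available_extensions
WHERE name IN ('aws_s3', 'aws_commons');
```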
Step-by-Step Guide
Step 1: Create an AWS IAM Role
You will first need an AWS IAM role that Amazon RDS can assume to access your Amazon S3 bucket.
1. Go to the AWS IAM Console and create a new role.
2. Select AWS service -> Amazon RDS as the trusted entity.
3. Attach a policy like the one below:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}
```
4. After creating the role, attach it to your Amazon RDS instance using the AWS CLI:
```bash
aws rds add-role-to-db-instance \
  --db-instance-identifier your-db-instance-id \
  --role-arn arn:aws:iam::<your-account-id>:role/<your-role-name> \
  --feature-name s3Export
```
Step 2: Enable the aws_s3 Extension
Connect to your Amazon RDS PostgreSQL instance and run:
```sql
CREATE EXTENSION IF NOT EXISTS aws_s3 CASCADE;
```
This ensures the necessary functionality is available.
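To confirm that the extension and its aws_commons dependency (installed via CASCADE) were created successfully, you can query the pg_extension catalog:

```sql
-- List the installed export-related extensions and their versions
SELECT extname, extversion
FROM pg_extension
WHERE extname IN ('aws_s3', 'aws_commons');
```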
Step 3: Export Query Results to Amazon S3
Use the aws_s3.query_export_to_s3 function to export your data directly to Amazon S3:
```sql
SELECT aws_s3.query_export_to_s3(
    'SELECT * FROM your_table WHERE created_at >= current_date - interval ''7 days''',
    aws_commons.create_s3_uri('your-bucket-name', 'exports/data_export.csv', 'us-west-2'),
    options := 'format csv, HEADER'
);
```
Parameters:
- query: The SQL query whose result you want to export.
- create_s3_uri: Builds the target Amazon S3 URI from the bucket name, object key, and region (an alternative form that passes these values directly is sketched after this list).
- options: Additional options like format csv, HEADER, and DELIMITER for formatting.
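Besides the aws_commons.create_s3_uri form shown above, the extension also provides an overload of aws_s3.query_export_to_s3 that accepts the bucket, object key, and region as separate text arguments. Exact signatures can vary by extension version, so treat the following as a sketch and verify with \df aws_s3.* on your instance:

```sql
-- Same export, passing the S3 target as individual text arguments
-- (bucket, key, and region values are placeholders)
SELECT aws_s3.query_export_to_s3(
    'SELECT * FROM your_table WHERE created_at >= current_date - interval ''7 days''',
    'your-bucket-name',
    'exports/data_export.csv',
    'us-west-2',
    'format csv, HEADER'
);
```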
Example Use Case
Suppose you want to export the last seven days of order data from an e-commerce platform:
```sql
SELECT aws_s3.query_export_to_s3(
    'SELECT order_id, customer_id, total_amount, order_date
       FROM orders
      WHERE order_date >= now() - interval ''7 days''',
    aws_commons.create_s3_uri('ecommerce-data-bucket', 'weekly_exports/orders_weekly.csv', 'us-east-1'),
    options := 'format csv, HEADER'
);
```
Once complete, the CSV file will appear in the specified Amazon S3 bucket, ready for use with Amazon Athena, Amazon Redshift Spectrum, or other analytics tools.
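The function also returns a summary row with rows_uploaded, files_uploaded, and bytes_uploaded columns, which is useful for logging or validation. For example, the export above can be written as:

```sql
-- Inspect how many rows, files, and bytes the export produced
SELECT rows_uploaded, files_uploaded, bytes_uploaded
FROM aws_s3.query_export_to_s3(
    'SELECT order_id, customer_id, total_amount, order_date
       FROM orders
      WHERE order_date >= now() - interval ''7 days''',
    aws_commons.create_s3_uri('ecommerce-data-bucket', 'weekly_exports/orders_weekly.csv', 'us-east-1'),
    options := 'format csv, HEADER'
);
```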
Security Tips
- Restrict bucket access with precise bucket policies.
- Enable encryption using SSE-S3 or SSE-KMS to protect your data at rest.
- Monitor access by enabling Amazon S3 server access logs and auditing activities through AWS CloudTrail.
Best Practices for Implementation
- Partitioned Exports: Use query filters (e.g., WHERE timestamp >= now() - interval '1 day') to export incremental data and avoid full-table exports; a sketch of such an incremental export is shown after this list.
- File Format Optimization: Export to compressed or columnar formats like Parquet using external transformation jobs post-export.
- Amazon S3 Bucket Lifecycle Rules: Set up retention policies to automatically archive or delete older export files.
- Monitoring and Alerts: Log export failures and metrics via PostgreSQL logs or integrate with Amazon CloudWatch and AWS CloudTrail for visibility.
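To illustrate the partitioned-export practice above, here is a minimal sketch of a daily incremental export. The orders table, bucket name, and key prefix are carried over from the earlier example and are placeholders for your own schema and bucket:

```sql
-- Hypothetical daily incremental export; table, bucket, and prefix are placeholders
DO $$
DECLARE
    s3_key  text := 'daily_exports/orders_' || to_char(current_date, 'YYYY_MM_DD') || '.csv';
    summary record;
BEGIN
    SELECT * INTO summary
    FROM aws_s3.query_export_to_s3(
        'SELECT * FROM orders WHERE order_date >= current_date - interval ''1 day''',
        aws_commons.create_s3_uri('ecommerce-data-bucket', s3_key, 'us-east-1'),
        options := 'format csv, HEADER'
    );
    RAISE NOTICE 'Exported % rows (% bytes) to %',
        summary.rows_uploaded, summary.bytes_uploaded, s3_key;
END
$$;
```

A block like this can be wrapped in a function and invoked from whatever scheduler you already use for database jobs.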
Use Cases in Production
- Daily/Hourly Data Archival: Automate exports of transactional data to S3 for audit trails, compliance, or long-term storage.
- Data Lake Integration: Move OLTP data to Amazon S3 for downstream analytics using Amazon Athena, Amazon Redshift Spectrum, or Amazon EMR.
- Backup for Analytics: Export query-based snapshots of critical tables (e.g., user logs, financial transactions) to Amazon S3 for backup or analysis.
- Hybrid Workflows: Combine on-demand exports with AWS Glue or Amazon SageMaker pipelines for machine learning and reporting.
Conclusion
The aws_s3.query_export_to_s3 function provides a highly efficient and straightforward method for exporting data from Amazon RDS PostgreSQL to Amazon S3. By eliminating the need for intermediate storage or external tools, it enables faster, safer, and more cost-effective data transfers.
Drop a query if you have any questions regarding Amazon RDS PostgreSQL or Amazon S3 and we will get back to you quickly.
About CloudThat
CloudThat is a leading provider of Cloud Training and Consulting services with a global presence in India, the USA, Asia, Europe, and Africa. Specializing in AWS, Microsoft Azure, GCP, VMware, Databricks, and more, the company serves mid-market and enterprise clients, offering comprehensive expertise in Cloud Migration, Data Platforms, DevOps, IoT, AI/ML, and more.
CloudThat is the first Indian Company to win the prestigious Microsoft Partner 2024 Award and is recognized as a top-tier partner with AWS and Microsoft, including the prestigious ‘Think Big’ partner award from AWS and the Microsoft Superstars FY 2023 award in Asia & India. Having trained 650k+ professionals in 500+ cloud certifications and completed 300+ consulting projects globally, CloudThat is an official AWS Advanced Consulting Partner, Microsoft Gold Partner, AWS Training Partner, AWS Migration Partner, AWS Data and Analytics Partner, AWS DevOps Competency Partner, AWS GenAI Competency Partner, Amazon QuickSight Service Delivery Partner, Amazon EKS Service Delivery Partner, AWS Microsoft Workload Partners, Amazon EC2 Service Delivery Partner, Amazon ECS Service Delivery Partner, AWS Glue Service Delivery Partner, Amazon Redshift Service Delivery Partner, AWS Control Tower Service Delivery Partner, AWS WAF Service Delivery Partner, Amazon CloudFront Service Delivery Partner, Amazon OpenSearch Service Delivery Partner, AWS DMS Service Delivery Partner, AWS Systems Manager Service Delivery Partner, Amazon RDS Service Delivery Partner, AWS CloudFormation Service Delivery Partner, AWS Config, Amazon EMR and many more.
FAQs
1. Can I export very large datasets with aws_s3.query_export_to_s3?
ANS: – Yes, but for extremely large datasets, exporting data in smaller chunks (such as by date range) is advisable to avoid timeout errors or memory issues.
2. Can I export to formats other than CSV?
ANS: – Currently, the function only supports the CSV format. You can adjust delimiters, headers, and other CSV-specific options via the options parameter.
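For example, the delimiter can be changed through the options string, which follows PostgreSQL COPY option syntax (note the doubled single quotes needed inside the SQL literal). A minimal sketch reusing the earlier placeholder bucket:

```sql
-- Pipe-delimited export using COPY-style options
SELECT aws_s3.query_export_to_s3(
    'SELECT * FROM your_table',
    aws_commons.create_s3_uri('your-bucket-name', 'exports/data_export.txt', 'us-west-2'),
    options := 'format csv, HEADER true, DELIMITER ''|'''
);
```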

WRITTEN BY Sunil H G
Sunil H G is a highly skilled and motivated Research Associate at CloudThat. He is an expert in working with popular data analysis and visualization libraries such as Pandas, Numpy, Matplotlib, and Seaborn. He has a strong background in data science and can effectively communicate complex data insights to both technical and non-technical audiences. Sunil's dedication to continuous learning, problem-solving skills, and passion for data-driven solutions make him a valuable asset to any team.