Introduction
Transferring data between Amazon S3 buckets across AWS accounts using DataSync offers a streamlined and efficient way to securely move large volumes of data. AWS DataSync automates the process, ensuring data is transferred quickly and reliably, with features like incremental transfers and data validation. This method is ideal for migrating, synchronizing, or backing up data between S3 buckets in different accounts, all while maintaining security and integrity through AWS IAM roles and policies.
What is AWS DataSync, and how does it work?
AWS DataSync is an online data transfer and discovery service that simplifies data migration and helps you quickly, easily, and securely transfer your file or object data to, from, and between AWS storage services.
On-premises storage transfers
DataSync works with the following on-premises storage systems:
- Network File System (NFS)
- Server Message Block (SMB)
- Hadoop Distributed File Systems (HDFS)
- Object storage
AWS storage transfers
AWS DataSync works with the following AWS storage services:
- Amazon S3
- Amazon EFS
- Amazon FSx for Windows File Server
- Amazon FSx for Lustre
- Amazon FSx for OpenZFS
- Amazon FSx for NetApp ONTAP
Use cases
Here are some main use cases for AWS DataSync:
- Migrate Data: Rapidly transfer active datasets to AWS storage with automatic encryption and data validation.
- Archive Cold Data: Move infrequently accessed data to long-term storage like Amazon S3 Glacier to free up on-premises capacity.
- Replicate Data: Copy data to various Amazon S3 storage classes or Amazon EFS and FSx for different storage needs.
- In-Cloud Processing: Transfer data to or from AWS for faster processing in industries such as machine learning, media, finance, and oil and gas.
Benefits
Using DataSync offers the following benefits:
- Simplify Migration Planning: Automated data collection and recommendations with AWS DataSync Discovery reduce time, effort, and costs, aiding budget planning and validating assumptions as you approach your migration.
- Automate Data Movement: AWS DataSync streamlines data transfers between storage systems and services, automating data-transfer processes and required infrastructure for high-performance and secure transfers.
- Transfer Data Securely: Provides end-to-end security, including encryption and integrity validation, while accessing AWS storage via AWS IAM roles and Amazon VPC endpoints to enhance data security.
- Move Data Faster: Accelerates transfers using a purpose-built protocol and multi-threaded architecture, speeding up migrations, analytics workflows, and data protection.
- Reduce Operational Costs: Cost-effective data movement with flat per-gigabyte pricing, eliminating the need for custom scripts or expensive transfer tools.
Demo
Steps to migrate Amazon S3 bucket data across AWS accounts:
Pre-requisites
- For your source AWS account, there are two sets of permissions to consider with this kind of cross-account transfer:
  - User permissions that allow a user to create AWS DataSync locations and tasks.
  - AWS DataSync service permissions that allow AWS DataSync to transfer data to the destination account bucket.
- In your destination account, disable the destination bucket’s access control lists (ACLs).
User permissions:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "SourceUserRolePermissions",
      "Effect": "Allow",
      "Action": [
        "datasync:CreateLocationS3",
        "datasync:CreateTask",
        "datasync:DescribeLocation*",
        "datasync:DescribeTaskExecution",
        "datasync:ListLocations",
        "datasync:ListTaskExecutions",
        "datasync:DescribeTask",
        "datasync:CancelTaskExecution",
        "datasync:ListTasks",
        "datasync:StartTaskExecution",
        "iam:CreateRole",
        "iam:CreatePolicy",
        "iam:AttachRolePolicy",
        "iam:ListRoles",
        "s3:GetBucketLocation",
        "s3:ListAllMyBuckets"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "iam:PassRole"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "iam:PassedToService": [
            "datasync.amazonaws.com"
          ]
        }
      }
    }
  ]
}
```
Step 1: In your source account, create an AWS DataSync AWS IAM role for destination bucket access
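The role created in this step needs two documents: a trust policy that lets the DataSync service assume the role, and a permissions policy that grants access to the destination bucket. The sketch below builds both as plain Python dictionaries and prints them as JSON that you could paste into the IAM console. The bucket name `destination-bucket` is a placeholder, the function names are hypothetical, and the action list is an assumption modeled on a typical S3 read/write role, so trim or extend it to match your own requirements.

```python
import json

# Placeholder: replace with the bucket name in your destination account.
DESTINATION_BUCKET = "destination-bucket"


def datasync_trust_policy() -> dict:
    """Trust policy allowing the DataSync service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "datasync.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def destination_access_policy(bucket: str) -> dict:
    """Permissions policy granting the role access to the destination bucket.

    The action list is an assumption, not taken from the original article.
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:ListBucket",
                    "s3:ListBucketMultipartUploads",
                ],
                "Resource": bucket_arn,
            },
            {
                # Object-level actions apply to every key in the bucket.
                "Effect": "Allow",
                "Action": [
                    "s3:AbortMultipartUpload",
                    "s3:DeleteObject",
                    "s3:GetObject",
                    "s3:ListMultipartUploadParts",
                    "s3:PutObject",
                    "s3:GetObjectTagging",
                    "s3:PutObjectTagging",
                ],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(datasync_trust_policy(), indent=2))
    print(json.dumps(destination_access_policy(DESTINATION_BUCKET), indent=2))
```

Keeping bucket-level and object-level actions in separate statements avoids a common mistake where object actions are attached to the bucket ARN and silently have no effect.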
Step 2: In your destination account, update your Amazon S3 bucket policy
In your destination account, modify the destination Amazon S3 bucket policy to include the “DataSync IAM role” that you created in your source account.
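As a rough illustration of what the updated bucket policy might look like, the sketch below generates a policy that names the source account's DataSync role as principal. All argument values are placeholders, the function name is hypothetical, and the action list is an assumption. The optional `source_user_role` statement reflects the common case where the user who creates the location from the source account also needs `s3:ListBucket` on the destination bucket; leave it out if that does not apply to your setup.

```python
import json
from typing import Optional


def destination_bucket_policy(
    bucket: str,
    source_account_id: str,
    source_datasync_role: str,
    source_user_role: Optional[str] = None,
) -> dict:
    """Build a bucket policy for the destination bucket (illustrative only)."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    statements = [
        {
            "Sid": "DataSyncCreateS3LocationAndTaskAccess",
            "Effect": "Allow",
            "Principal": {
                "AWS": f"arn:aws:iam::{source_account_id}:role/{source_datasync_role}"
            },
            # Assumed action list; adjust to your own requirements.
            "Action": [
                "s3:GetBucketLocation",
                "s3:ListBucket",
                "s3:ListBucketMultipartUploads",
                "s3:AbortMultipartUpload",
                "s3:DeleteObject",
                "s3:GetObject",
                "s3:ListMultipartUploadParts",
                "s3:PutObject",
                "s3:GetObjectTagging",
                "s3:PutObjectTagging",
            ],
            "Resource": [bucket_arn, f"{bucket_arn}/*"],
        }
    ]
    if source_user_role is not None:
        # Optional: lets the user role that creates the location list the bucket.
        statements.append(
            {
                "Sid": "DataSyncCreateS3Location",
                "Effect": "Allow",
                "Principal": {
                    "AWS": f"arn:aws:iam::{source_account_id}:role/{source_user_role}"
                },
                "Action": "s3:ListBucket",
                "Resource": bucket_arn,
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    policy = destination_bucket_policy(
        bucket="destination-bucket",
        source_account_id="source-account-id",
        source_datasync_role="source-datasync-role",
    )
    print(json.dumps(policy, indent=2))
```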
Step 3: Create your AWS DataSync locations:
Create the AWS DataSync locations for your source and destination S3 buckets.
- Create your AWS DataSync source location
- Create your AWS DataSync destination location
While still in your source account, launch CloudShell from the console:
Run the following command
- Replace destination-bucket with the name of the Amazon S3 bucket in your destination account.
- If your destination bucket is in a different Region than your source bucket, replace destination-bucket-region with the Region where the destination bucket resides
- Replace source-account-id with the source AWS account ID.
- Replace source-datasync-role with the AWS DataSync IAM role you created in your source account.
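The command itself is not reproduced in this article. As a sketch, assuming it is the standard `aws datasync create-location-s3` call, the Python helper below assembles the command line from the four placeholders listed above and prints it so you can paste it into CloudShell. The helper name is hypothetical, and the `--region` flag is only added when you pass a Region, matching the note about cross-Region buckets.

```python
import json
import shlex
from typing import List, Optional


def create_destination_location_command(
    destination_bucket: str,
    source_account_id: str,
    source_datasync_role: str,
    destination_bucket_region: Optional[str] = None,
) -> List[str]:
    """Assemble an `aws datasync create-location-s3` command as an argument list."""
    s3_config = {
        "BucketAccessRoleArn": (
            f"arn:aws:iam::{source_account_id}:role/{source_datasync_role}"
        )
    }
    command = [
        "aws", "datasync", "create-location-s3",
        "--s3-bucket-arn", f"arn:aws:s3:::{destination_bucket}",
        "--s3-config", json.dumps(s3_config),
    ]
    if destination_bucket_region is not None:
        # Only needed when the destination bucket is in a different Region.
        command += ["--region", destination_bucket_region]
    return command


if __name__ == "__main__":
    cmd = create_destination_location_command(
        destination_bucket="destination-bucket",
        source_account_id="source-account-id",
        source_datasync_role="source-datasync-role",
        destination_bucket_region="destination-bucket-region",
    )
    # shlex.join quotes the JSON argument so the line is safe to paste into a shell.
    print(shlex.join(cmd))
```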
If the command returns a DataSync location ARN (a string that begins with arn:aws:datasync: and contains location/loc-), you successfully created the location.
- Create Task
- Run the Task:
- Task History
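The three steps above are usually performed in the DataSync console. If you prefer to stay in CloudShell, a rough equivalent can be sketched with the `create-task`, `start-task-execution`, and `list-task-executions` subcommands of the AWS CLI. The helper names and the ARN and task-name values below are hypothetical placeholders; substitute the location ARNs returned when you created your locations and the task ARN returned by `create-task`.

```python
import shlex
from typing import List


def create_task_command(
    source_location_arn: str, destination_location_arn: str, name: str
) -> List[str]:
    """Command that creates a task linking the source and destination locations."""
    return [
        "aws", "datasync", "create-task",
        "--source-location-arn", source_location_arn,
        "--destination-location-arn", destination_location_arn,
        "--name", name,
    ]


def start_task_command(task_arn: str) -> List[str]:
    """Command that starts one execution (run) of the task."""
    return ["aws", "datasync", "start-task-execution", "--task-arn", task_arn]


def task_history_command(task_arn: str) -> List[str]:
    """Command that lists the executions of the task, i.e. its history."""
    return ["aws", "datasync", "list-task-executions", "--task-arn", task_arn]


if __name__ == "__main__":
    # Placeholder values, not real ARNs.
    commands = [
        create_task_command(
            "source-location-arn", "destination-location-arn", "s3-cross-account-copy"
        ),
        start_task_command("task-arn"),
        task_history_command("task-arn"),
    ]
    for cmd in commands:
        print(shlex.join(cmd))
```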
Conclusion
In this blog article, we walked step by step through setting up an AWS DataSync task that transfers objects between Amazon S3 buckets without installing an agent on Amazon EC2. The additional steps covered how to configure the task for cross-Region and cross-account use cases.
Drop a query if you have any questions regarding AWS DataSync and we will get back to you quickly.
About CloudThat
CloudThat is a leading provider of Cloud Training and Consulting services with a global presence in India, the USA, Asia, Europe, and Africa. Specializing in AWS, Microsoft Azure, GCP, VMware, Databricks, and more, the company serves mid-market and enterprise clients, offering comprehensive expertise in Cloud Migration, Data Platforms, DevOps, IoT, AI/ML, and more.
CloudThat is the first Indian Company to win the prestigious Microsoft Partner 2024 Award and is recognized as a top-tier partner with AWS and Microsoft, including the prestigious ‘Think Big’ partner award from AWS and the Microsoft Superstars FY 2023 award in Asia & India. Having trained 650k+ professionals in 500+ cloud certifications and completed 300+ consulting projects globally, CloudThat is an official AWS Advanced Consulting Partner, Microsoft Gold Partner, AWS Training Partner, AWS Migration Partner, AWS Data and Analytics Partner, AWS DevOps Competency Partner, AWS GenAI Competency Partner, Amazon QuickSight Service Delivery Partner, Amazon EKS Service Delivery Partner, AWS Microsoft Workload Partners, Amazon EC2 Service Delivery Partner, Amazon ECS Service Delivery Partner, AWS Glue Service Delivery Partner, Amazon Redshift Service Delivery Partner, AWS Control Tower Service Delivery Partner, AWS WAF Service Delivery Partner and many more.
To get started, see CloudThat’s Consultancy page and Managed Services Package.
FAQs
1. Can AWS DataSync handle large-scale migrations, and what are the best practices for optimizing performance?
ANS: – AWS DataSync is designed to handle large-scale migrations efficiently. To optimize performance, consider these best practices:
- Parallel Transfers: Configure tasks to use multiple parallel data transfer agents to speed up the migration process.
- Data Compression: Use compression to reduce the amount of data being transferred if supported and suitable for your use case.
- Network Bandwidth: Ensure sufficient bandwidth to handle the data transfer volume without impacting other network activities.
- Incremental Transfers: Take advantage of AWS DataSync’s capability to transfer only the changed data after the initial migration, which reduces the volume of data transferred in subsequent tasks.
2. How can I monitor and troubleshoot data migration tasks using AWS DataSync?
ANS: – You can monitor and troubleshoot data migration tasks using AWS DataSync through several methods:
- AWS Management Console: View detailed task status and logs in the AWS DataSync section of the AWS Management Console.
- Amazon CloudWatch Logs: AWS DataSync integrates with Amazon CloudWatch, where you can view logs and set up alarms for task metrics and errors.
- Task History and Metrics: Examine the history of task runs, success rates, and performance metrics in the AWS DataSync dashboard to identify any issues.
WRITTEN BY Ayush Agarwal
Ayush Agarwal works as a Research Associate at CloudThat. He has strong analytical thinking skills and an optimistic outlook on life. He has sound knowledge of AWS Cloud Services, infrastructure setup, security, WAR, and migration. He is always keen to learn and adopt new technologies.