Case Study

An E-Commerce Platform Achieves Seamless AWS Data Migration to Enhance Availability, Lower Costs, and Meet Evolving User Needs

Download the Case Study
Industry

E-Commerce

Expertise

Amazon EMR, Amazon VPC, Amazon S3, AWS Glue, Amazon Athena, Amazon Redshift, Amazon EMR Notebooks

Offerings/Solutions

Migration of data from GCP to Amazon S3, covering 47,965 tables verified within 5 days, with Amazon Athena now used for analytics.

About the Client

Quikr, launched in 2008, is a prominent online marketplace in India. It enables individuals and businesses to trade products and services, serves over 30 million monthly users, and is one of the country’s largest classifieds platforms.

Highlights

5 days

Rapid Data Migration

8000+ AWS Glue Crawlers

Scalable Data Structure

AWS Data Analytics Services

Cost-effective Analytics

The Challenge

Quikr’s goal was to boost platform availability, reduce costs, and adapt to the evolving needs of its 30+ million users. This involved migrating a massive 200 TB dataset, including 47,965 BigQuery tables and 82 GCS buckets, from GCP to AWS. Converting the BigQuery data to Parquet for better analytics was a particular challenge. Throughout the migration, Quikr's priorities were minimal disruption to its online platform and secure, reliable data transfer.

Solutions

  • We used Amazon EMR clusters to migrate 150 TB of data from GCP BigQuery to Amazon S3. 
  • The Amazon EMR cluster was set up in an Amazon VPC with Spot Instances for cost savings.
  • Amazon EC2 Spot Instances reduced compute costs by up to 90% compared to On-Demand Instances.
  • We migrated 82 GCS Coldline data buckets to a single Amazon S3 bucket as per client requirements. 
  • Custom Python scripts on Amazon EMR Notebooks validated table data before and after migration. 
  • AWS Glue resources were provisioned with Terraform to auto-configure databases and tables for Amazon Athena.
  • Amazon S3 was used for uploading sales orders, inventory, and trends data. 
  • AWS Glue provided availability, durability, and scalability for ETL.
  • Processed data from AWS Glue jobs was loaded into Amazon Redshift for reporting and business intelligence.
  • We created a custom PySpark file for data migration with configurable source and destination.
  • We wrote a Python script to assign PySpark jobs to the Amazon EMR cluster based on CSV values.
  • We used DistCp for GCS to Amazon S3 data migration, automated through Python scripting.
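
The CSV-driven job assignment and the DistCp automation described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the script used in the project: the CSV columns (cluster_id, kind, source, destination), the script path, the bucket names, and the cluster IDs are all hypothetical. The sketch only builds Amazon EMR step definitions in the dictionary shape that the boto3 EMR client accepts; it does not contact AWS.

```python
import csv
import io
from collections import defaultdict

# Hypothetical S3 location of the PySpark migration script (not from the case study).
MIGRATION_SCRIPT = "s3://example-scripts/migrate_table.py"


def spark_step(source_table, destination):
    """Build an EMR step that runs the PySpark job for one BigQuery table."""
    return {
        "Name": f"migrate-{source_table}",
        "ActionOnFailure": "CONTINUE",  # one failed table should not stop the rest
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit", "--deploy-mode", "cluster", MIGRATION_SCRIPT,
                # --source / --destination are assumed arguments of the script
                "--source", source_table, "--destination", destination,
            ],
        },
    }


def distcp_step(gcs_bucket, destination):
    """Build an EMR step that copies one GCS bucket to Amazon S3 with DistCp."""
    return {
        "Name": f"distcp-{gcs_bucket}",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["hadoop", "distcp", f"gs://{gcs_bucket}/", destination],
        },
    }


def plan_steps(csv_text):
    """Group step definitions by the cluster ID named in each CSV row."""
    steps = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["kind"] == "bigquery":
            step = spark_step(row["source"], row["destination"])
        elif row["kind"] == "gcs":
            step = distcp_step(row["source"], row["destination"])
        else:
            raise ValueError(f"unknown job kind: {row['kind']!r}")
        steps[row["cluster_id"]].append(step)
    return dict(steps)


# Made-up sample input; real values would come from the migration inventory.
SAMPLE_CSV = """cluster_id,kind,source,destination
j-EXAMPLE1,bigquery,sales.orders,s3://example-data/parquet/sales/orders/
j-EXAMPLE1,gcs,example-coldline-01,s3://example-data/gcs/example-coldline-01/
j-EXAMPLE2,bigquery,sales.inventory,s3://example-data/parquet/sales/inventory/
"""

plan = plan_steps(SAMPLE_CSV)
for cluster_id, cluster_steps in plan.items():
    print(cluster_id, [step["Name"] for step in cluster_steps])
```

Each list in the plan could then be submitted with the boto3 EMR client's add_job_flow_steps(JobFlowId=cluster_id, Steps=cluster_steps). Keeping the planning step free of AWS calls makes it easy to review the full job list before anything runs.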

The Results

Data migration from GCP to AWS was swift. The data was stored efficiently in Amazon S3 and converted to Parquet, 8000+ AWS Glue crawlers were created, and Amazon Athena was used for verification, enabling a seamless transition and cost-effective analytics for the client.
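
The case study does not say which checks the verification performed. As one plausible illustration, the sketch below compares per-table row counts collected before migration (for example from BigQuery) with counts collected afterwards (for example from Amazon Athena). The function name, the table names, and the counts are hypothetical, and the counts are assumed to have been gathered already.

```python
def compare_row_counts(source_counts, target_counts):
    """Classify each source table as matched, mismatched, or missing in the target."""
    report = {"matched": [], "mismatched": [], "missing": []}
    for table, expected in sorted(source_counts.items()):
        actual = target_counts.get(table)
        if actual is None:
            report["missing"].append(table)
        elif actual != expected:
            # Keep both numbers so the discrepancy can be investigated.
            report["mismatched"].append((table, expected, actual))
        else:
            report["matched"].append(table)
    return report


# Made-up counts for illustration only.
before = {"sales.orders": 1200, "sales.inventory": 450, "sales.trends": 90}
after = {"sales.orders": 1200, "sales.inventory": 449}

report = compare_row_counts(before, after)
print(report)
```

Row counts are only a coarse check; a stricter validation might also compare schemas or per-column checksums.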
