A high-traffic learning cards application was being held back by the response time of its backend MongoDB database. Although the application served thousands of users with high request rates, it became unresponsive under only a few concurrent requests. Expert advice was needed to improve database performance and to document the design and performance considerations for a MongoDB setup on AWS.
1. Improve database response time and, in turn, speed up application page loads, which initially took around 60 seconds
1. Investigate the learning app's queries and identify long-running and slow queries
2. Analyse the current MongoDB database schema, indexes and shard keys
3. Analyse the current MongoDB hardware on AWS
4. Suggest and implement database and hardware best practices for the MongoDB sharded cluster
1. An initial investigation of the database was done to determine:
a. Database schema
b. Shard keys used to partition the data across database nodes
c. Indexes created on the database and corresponding sizes
2. An initial investigation of the hardware showed that the setup on AWS did not make use of AWS Availability Zones and that every replica set consisted of only two members.
3. An initial investigation of the application was performed to narrow down the slow-running queries.
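One way to narrow down slow queries is to aggregate entries from MongoDB's database profiler. The sketch below is illustrative only: it assumes profiler-style documents carrying the `ns`, `op` and `millis` fields, and the sample entries, the `learning.*` namespaces and the 100 ms threshold are hypothetical values, not data from this investigation.

```python
from collections import defaultdict


def summarize_slow_ops(profile_docs, threshold_ms=100):
    """Group profiler entries at or above threshold_ms by (namespace, operation).

    Returns a list of ((ns, op), stats) pairs, slowest total time first.
    """
    stats = defaultdict(lambda: {"count": 0, "total_ms": 0, "max_ms": 0})
    for doc in profile_docs:
        millis = doc.get("millis", 0)
        if millis < threshold_ms:
            continue  # fast enough, not of interest
        entry = stats[(doc.get("ns"), doc.get("op"))]
        entry["count"] += 1
        entry["total_ms"] += millis
        entry["max_ms"] = max(entry["max_ms"], millis)
    return sorted(stats.items(), key=lambda kv: kv[1]["total_ms"], reverse=True)


# Hypothetical profiler entries, for illustration only.
sample_profile = [
    {"ns": "learning.cards", "op": "query", "millis": 5200},
    {"ns": "learning.cards", "op": "query", "millis": 4800},
    {"ns": "learning.users", "op": "update", "millis": 40},
    {"ns": "learning.decks", "op": "query", "millis": 900},
]

slow_ops = summarize_slow_ops(sample_profile)
for (namespace, op), op_stats in slow_ops:
    print(namespace, op, op_stats)
```

Sorting by total time rather than by the single slowest call brings the queries that cost the most overall to the top, which are usually the best candidates for index or shard key changes.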
- MongoDB automatic failover was failing because each replica set had too few members: with only two members, the surviving member cannot form the majority needed to elect a new primary
- The shard key in use did not match the application's query patterns
- Although the EBS volumes used on AWS were IOPS-enabled, the EC2 instances they were attached to were not EBS-optimized
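The failover finding follows from how replica set elections work: a primary can only be elected by a strict majority of the voting members. The arithmetic can be sketched as follows; the function names are illustrative and are not part of any MongoDB API.

```python
def votes_needed(voting_members):
    """Smallest strict majority of a replica set's voting members."""
    return voting_members // 2 + 1


def can_elect_primary(voting_members, reachable_members):
    """True if the reachable members can still form a majority."""
    return reachable_members >= votes_needed(voting_members)


# Two-member replica set: losing one member leaves 1 of 2 votes.
print(can_elect_primary(voting_members=2, reachable_members=1))

# Two data members plus an arbiter: losing one member leaves 2 of 3 votes.
print(can_elect_primary(voting_members=3, reachable_members=2))
```

With two members the majority is two, so a single failure leaves the set without a primary. Adding a third voting member, such as an arbiter, lets the set tolerate one failure.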
- Each member of every replica set was moved to a different Availability Zone on AWS, one arbiter was added per replica set, and the automatic failover setup was verified
- A new shard key was suggested based on the high-frequency queries and the data design. The affected collections were unsharded and then resharded with the new shard keys
- Based on the load metrics, the hardware profiles on EC2 were changed to High Memory EC2 instances with EBS optimization
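The replica set layout described above (data-bearing members in different Availability Zones plus one arbiter) can be expressed as a replica set configuration document. The sketch below only builds that document as a plain Python dictionary. The host names, zone names and the `build_replset_config` helper are hypothetical, and the actual configuration used in this project is not known.

```python
def build_replset_config(name, data_members, arbiter_host):
    """Build a replica set configuration document.

    data_members is a list of (host, availability_zone) tuples. The zones are
    only used for validation and are not written into the configuration.
    """
    zones = [zone for _, zone in data_members]
    if len(set(zones)) != len(zones):
        raise ValueError("data-bearing members must be in different Availability Zones")
    members = [{"_id": i, "host": host} for i, (host, _) in enumerate(data_members)]
    # The arbiter votes in elections but holds no data.
    members.append({"_id": len(members), "host": arbiter_host, "arbiterOnly": True})
    return {"_id": name, "members": members}


# Hypothetical hosts and zones, for illustration only.
config = build_replset_config(
    "shard0",
    [
        ("mongo-a.example.internal:27017", "us-east-1a"),
        ("mongo-b.example.internal:27017", "us-east-1b"),
    ],
    "arbiter-0.example.internal:27017",
)
print(config)
```

A document of this shape is what the `replSetInitiate` command expects. Placing the arbiter in a third zone, where one is available, avoids losing two votes in a single zone outage.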
- For best performance, only one mongod process per host is advised. Where several processes must share a server, use virtualization or container technologies with appropriate sizing and resource allocation.
- For availability, members of the same replica set should not be co-located on the same physical hardware or in the same data center, and should not share any single point of failure such as a power supply.
- MongoDB supports write-ahead journaling of operations to facilitate crash recovery and node durability. After the data migration to AWS, enabling journaling is recommended to ensure failsafe writes.
- The shard key for a sharded collection should be the one that ensures maximum query distribution across the shards.
- On AWS, backup scripts can be written and run on secondary members so that application performance is not affected during the backup process.
- On AWS, disk snapshots are the recommended way to take regular backups of the MongoDB databases and config servers.
- Disaster recovery scenarios should be reviewed regularly, and mock disaster recovery exercises can be performed.
- MongoDB Management Service (MMS) by MongoDB can be used to receive database usage and performance overviews and periodic alerts.
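To illustrate the shard key recommendation, the toy model below compares how evenly two candidate keys spread documents across shards. It is a deliberately simplified sketch: it assigns documents by hashing the key value, which is not how MongoDB's chunk-based balancer works, and the `user_id` and `deck_type` fields, the document set and the shard count are all hypothetical.

```python
import hashlib
from collections import Counter


def shard_for(value, num_shards):
    """Map a key value to a shard index with a stable hash (toy model)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribution(docs, key, num_shards):
    """Count how many documents land on each shard for a candidate key."""
    return Counter(shard_for(doc[key], num_shards) for doc in docs)


# Hypothetical documents: 1000 users, but only three distinct deck types.
DECK_TYPES = ("basic", "premium", "trial")
docs = [{"user_id": i, "deck_type": DECK_TYPES[i % 3]} for i in range(1000)]

by_deck_type = distribution(docs, "deck_type", num_shards=4)
by_user_id = distribution(docs, "user_id", num_shards=4)

print("deck_type:", dict(by_deck_type))
print("user_id:  ", dict(by_user_id))
```

A key with only three distinct values can never use more than three of the four shards, while a high-cardinality key spreads the same documents across all of them. Cardinality is only one criterion; as noted above, the key should also match the most frequent queries so that they can be routed to a single shard.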
- The application's page load time was reduced from more than 60 seconds to less than 5 seconds.
- The database became highly available thanks to functioning automatic failover and appropriate use of AWS Availability Zones.
- The better hardware profile also helped reduce hardware cost on AWS.