Sharding Strategies / Training
- Understanding the sharding architecture and deploying a sharded cluster
- Partitioning collections within a sharded cluster
- Optimizing a shard key for a partitioned collection
- Modifying your sharding approach over time through resharding
A sharded cluster consists of mongos routers, config servers, and shards. The following diagrams show how a query flows through a sharded cluster.
mongos routers act as an intermediary between client applications and the MongoDB shards. Applications always interact with the cluster through mongos, which accepts queries from applications and routes them to the appropriate shard based on metadata kept by the config servers: the collection locations, the data ranges on each shard, and the primary shard of each database.
Shards store the actual data and contain subdivisions of this data called chunks. Each shard is a replica set with a primary (P) and secondary (S) members.
In a sharded cluster, not all collections need to be sharded. Each database has a primary shard where all unsharded collections are stored by default.
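The routing behavior described above can be pictured with a small, self-contained sketch. This is a toy model in plain Python, not MongoDB's implementation or any driver API: the `ToyRouter` class, the `shop.orders` namespace, the shard names, and the chunk boundaries are all hypothetical, and it only models queries that include the shard key. It assumes range-based chunks where each chunk covers values from its lower bound up to (but not including) the next chunk's lower bound.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class ToyRouter:
    """Hypothetical stand-in for the metadata a mongos router consults."""

    # namespace -> list of (lower_bound, shard_name), sorted by lower_bound.
    # Each chunk covers [lower_bound, next chunk's lower_bound).
    chunk_map: dict = field(default_factory=dict)
    # database name -> primary shard holding that database's unsharded collections
    primary_shard: dict = field(default_factory=dict)

    def target_shard(self, namespace, shard_key_value=None):
        """Return the name of the shard that should receive the query."""
        db_name = namespace.split(".", 1)[0]
        chunks = self.chunk_map.get(namespace)
        if chunks is None:
            # Unsharded collection: it lives on the database's primary shard.
            return self.primary_shard[db_name]
        if shard_key_value is None:
            raise ValueError("this sketch only models queries that include the shard key")
        lower_bounds = [low for low, _ in chunks]
        # Find the last chunk whose lower bound is <= the shard key value.
        index = bisect_right(lower_bounds, shard_key_value) - 1
        if index < 0:
            raise ValueError("shard key value falls below every chunk range")
        return chunks[index][1]


# Illustrative metadata only; real clusters manage these ranges automatically.
router = ToyRouter(
    chunk_map={
        "shop.orders": [
            (float("-inf"), "shardA"),
            (1000, "shardB"),
            (5000, "shardC"),
        ]
    },
    primary_shard={"shop": "shardA"},
)

print(router.target_shard("shop.orders", 1500))  # sharded: routed by key range
print(router.target_shard("shop.settings"))      # unsharded: primary shard
```

In this sketch, a query on the sharded `shop.orders` collection is sent to whichever shard owns the chunk containing the shard key value, while a query on the unsharded `shop.settings` collection always goes to the primary shard of the `shop` database.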
Sharding is a living architecture that supports data management and scaling, operating continuously behind the scenes to ensure optimal performance.
When setting up your sharding architecture, you'll need to make a few foundational decisions that affect how MongoDB distributes data, manages workload, and scales.
Well done! You now understand the sharding architecture in MongoDB and how to deploy a sharded cluster. Here are some key points to remember:
- Sharded Cluster Architecture: Shards store data, mongos routes queries, and config servers keep metadata.
- Deployment: Use MongoDB Atlas for an easier setup or set up manually for more control.
- When to Add Shards: Consider adding more shards when resource utilization reaches 60-70%.
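As a rough illustration of the last point, the check below flags shards whose utilization has reached the lower end of the 60-70% range. It is a minimal sketch under stated assumptions: the function name `shards_needing_attention`, the metric names (`cpu`, `disk`), and the sample numbers are hypothetical, and in practice the figures would come from whatever monitoring your deployment provides.

```python
def shards_needing_attention(utilization, threshold=0.60):
    """Return the names of shards whose busiest resource is at or above threshold.

    utilization maps shard name -> {resource name: fraction in use (0.0 to 1.0)}.
    The 0.60 default reflects the lower end of the 60-70% guideline.
    """
    return sorted(
        name
        for name, metrics in utilization.items()
        if max(metrics.values()) >= threshold
    )


# Hypothetical sample readings, not real measurements.
sample_utilization = {
    "shardA": {"cpu": 0.45, "disk": 0.72},
    "shardB": {"cpu": 0.30, "disk": 0.41},
}

flagged = shards_needing_attention(sample_utilization)
print(flagged)
```

A shard is flagged as soon as any single resource crosses the threshold, which is a deliberately conservative choice for this sketch.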
Next, we’ll explore how to partition a collection across shards.