Code Recap: Troubleshooting Sharding Issues
Check sharded cluster health with the sh.status() method
Use the sh.status() method to display the current sharding status of a cluster, including information about shards, databases, and collections.
Code:
sh.status()
Example Output:
<dbname>.<collection>
shard key: { <shard_key> : <1_or_hashed> }
unique: <boolean>
balancing: <boolean>
chunks:
<shard_name1> <number_of_chunks>
<shard_name2> <number_of_chunks>
...
{ <shard_key>: <min_range1> } -->> { <shard_key> : <max_range1> } on : <shard_name> <last_modified_timestamp>
{ <shard_key>: <min_range2> } -->> { <shard_key> : <max_range2> } on : <shard_name> <last_modified_timestamp>
...
tag: <tag1> { <shard_key> : <min_range1> } -->> { <shard_key> : <max_range1> }
...
See more results by setting the verbose flag to true
When a collection has 20 or more chunks, sh.status() abbreviates the chunk list. Pass true as the verbose flag to display information for all available chunks:
sh.status(true)
Check data distribution with the sh.getShardedDataDistribution() method
Use the sh.getShardedDataDistribution() method to return a formatted report of the distribution of data across all sharded collections.
Code:
sh.getShardedDataDistribution()
Example Output:
[
  {
    ns: 'bookstore.customers',
    shards: [
      {
        shardName: 'atlas-2sc5re-shard-0',
        numOrphanedDocs: 0,
        numOwnedDocuments: 539619,
        ownedSizeBytes: 84180564,
        orphanedSizeBytes: 0
      },
      {
        shardName: 'atlas-2sc5re-shard-1',
        numOrphanedDocs: 0,
        numOwnedDocuments: 331178,
        ownedSizeBytes: 51663768,
        orphanedSizeBytes: 0
      }
    ]
  }
]
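To turn the report into a quick balance check, the per-shard document counts can be converted into shares of the collection total. The following is a minimal sketch in plain JavaScript, assuming each report entry has the shape shown in the example output; the helper name documentShareByShard is hypothetical and not part of mongosh.

```javascript
// Hypothetical helper (not part of mongosh): computes each shard's share of
// owned documents for one entry of the distribution report.
function documentShareByShard(entry) {
  const total = entry.shards.reduce((sum, s) => sum + s.numOwnedDocuments, 0);
  return entry.shards.map((s) => ({
    shardName: s.shardName,
    share: total === 0 ? 0 : s.numOwnedDocuments / total,
  }));
}

// Sample entry trimmed from the example output above.
const sampleEntry = {
  ns: "bookstore.customers",
  shards: [
    { shardName: "atlas-2sc5re-shard-0", numOwnedDocuments: 539619 },
    { shardName: "atlas-2sc5re-shard-1", numOwnedDocuments: 331178 },
  ],
};

const shares = documentShareByShard(sampleEntry);
for (const { shardName, share } of shares) {
  console.log(`${shardName}: ${(share * 100).toFixed(1)}%`);
}
```

A share far from 1 divided by the number of shards suggests uneven distribution. Document counts are only one signal; ownedSizeBytes could be compared the same way.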
Analyze query workload with the configureQueryAnalyzer admin command
Use the configureQueryAnalyzer admin command to enable query sampling for a collection on a replica set or sharded cluster. The sampled queries help analyzeShardKey compute metrics on the read and write distribution of a shard key.
Code:
db.adminCommand(
  {
    configureQueryAnalyzer: <string>,  // namespace, "<database>.<collection>"
    mode: <string>,                    // "full" to enable sampling, "off" to disable
    samplesPerSecond: <double>
  }
)
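As a concrete illustration of the template, the sketch below builds a command document that enables sampling. It assumes the bookstore.customers namespace from the other examples and an illustrative rate of 5 samples per second; the helper name buildQueryAnalyzerCommand is hypothetical.

```javascript
// Hypothetical helper: builds the command document for configureQueryAnalyzer.
function buildQueryAnalyzerCommand(namespace, mode, samplesPerSecond) {
  if (mode !== "full" && mode !== "off") {
    throw new Error(`mode must be "full" or "off", got "${mode}"`);
  }
  const command = { configureQueryAnalyzer: namespace, mode };
  // The sampling rate is only included while sampling is enabled.
  if (mode === "full") {
    command.samplesPerSecond = samplesPerSecond;
  }
  return command;
}

// Illustrative values: the namespace and rate are assumptions.
const enableSampling = buildQueryAnalyzerCommand("bookstore.customers", "full", 5);
console.log(enableSampling);

// In mongosh, the document would be passed to the admin command:
//   db.adminCommand(enableSampling)
```

Building the document separately makes it easy to switch sampling off later by calling the same helper with mode "off".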
Analyze a new potential shard key with the db.collection.analyzeShardKey() method
Use the db.collection.analyzeShardKey() method to evaluate the effectiveness of a shard key by calculating metrics on read and write distribution patterns for a collection.
Code:
db.collection.analyzeShardKey(
  <shard_key>,
  {
    keyCharacteristics: <bool>,
    readWriteDistribution: <bool>,
    sampleRate: <double>,
    sampleSize: <int>
  }
)
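A filled-in call might look like the sketch below. The wrapper name analyzeCandidateKey, the customerId field, and the sample size are all hypothetical; db is assumed to be the mongosh database object, passed in as a parameter so the function can also be exercised with a stub.

```javascript
// Hypothetical wrapper: runs analyzeShardKey for a candidate key on a collection.
// `db` is the mongosh database object (or any object exposing getCollection()).
function analyzeCandidateKey(db, collectionName, candidateKey) {
  return db.getCollection(collectionName).analyzeShardKey(candidateKey, {
    keyCharacteristics: true,    // request metrics on the key itself
    readWriteDistribution: true, // request metrics based on sampled queries
    sampleSize: 10000,           // illustrative value, not a recommendation
  });
}

// In mongosh, connected to the bookstore database:
//   analyzeCandidateKey(db, "customers", { customerId: 1 })
```

The read and write distribution metrics rely on sampled queries, so query sampling would need to be enabled with configureQueryAnalyzer beforehand.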
Resharding a collection
Stop the balancer
Use the sh.stopBalancer() method to stop the balancer during the resharding process:
sh.stopBalancer()
Reshard the collection to a new shard key or the same shard key
Use the sh.reshardCollection() method to reshard the collection to the same shard key or a new shard key, depending on your circumstances and desired outcome:
sh.reshardCollection( "<database>.<collection>", { <new_shard_key> } )
Start the balancer
Use the sh.startBalancer() method to start the balancer after the resharding process is complete:
sh.startBalancer()
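The three steps above can be sketched as one function that restarts the balancer even if resharding fails. This is an illustration rather than a prescribed procedure: the wrapper name reshardWithBalancerPaused is hypothetical, and sh is passed in as a parameter so the flow can be tried against a stub before running it in mongosh.

```javascript
// Hypothetical wrapper around the stop / reshard / start sequence.
// `sh` is the mongosh sharding helper object.
function reshardWithBalancerPaused(sh, namespace, newShardKey) {
  sh.stopBalancer();
  try {
    return sh.reshardCollection(namespace, newShardKey);
  } finally {
    // Runs whether resharding succeeded or threw, so the balancer
    // is not left stopped by accident.
    sh.startBalancer();
  }
}

// In mongosh (the namespace and key are illustrative):
//   reshardWithBalancerPaused(sh, "bookstore.customers", { customerId: 1 })
```

The try/finally structure is a design choice: without it, an error during resharding would skip the final step and leave the balancer disabled.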
Refining a shard key
Refine an existing shard key with the refineCollectionShardKey admin command
Use the refineCollectionShardKey admin command to refine the shard key for a sharded collection by adding one or more suffix fields to the existing shard key. A refined key can improve data distribution, query performance, or both:
Code:
db.adminCommand(
  {
    refineCollectionShardKey: "<database>.<collection>",
    key: { <existing_key_specification>, <suffix1>: <1_or_hashed>, ... }
  }
)
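Because the refined key must begin with the existing key, it can help to build the key document programmatically. The sketch below assumes a hypothetical existing key on state and a hypothetical suffix field customerId; the helper name buildRefineCommand is also an assumption.

```javascript
// Hypothetical helper: appends suffix fields to an existing shard key and
// returns the refineCollectionShardKey command document.
function buildRefineCommand(namespace, existingKey, suffixFields) {
  for (const field of Object.keys(suffixFields)) {
    if (Object.prototype.hasOwnProperty.call(existingKey, field)) {
      throw new Error(`suffix field "${field}" is already in the shard key`);
    }
  }
  // Spread order keeps the existing key fields first, as a prefix.
  const key = { ...existingKey, ...suffixFields };
  return { refineCollectionShardKey: namespace, key };
}

const refine = buildRefineCommand(
  "bookstore.customers",
  { state: 1 },      // hypothetical existing shard key
  { customerId: 1 }  // hypothetical suffix field
);
console.log(JSON.stringify(refine));

// In mongosh, the document would be passed to the admin command:
//   db.adminCommand(refine)
```

The duplicate-field check guards against a suffix overwriting part of the existing key, which would silently change the prefix.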