Lesson 9: How to Monitor Indexes / Learn
Code Summary: How to Monitor Indexes
To use the $indexStats aggregation operator to return statistics regarding the use of each index for the collection, use the aggregate method on the collection. In this example, we are using a database called sample_analytics and a collection called customers:
db.customers.aggregate([{ $indexStats: {} }])
The operator returns an array of objects, one for each index on the collection. In this case, based on the accesses.ops field values, you can determine which indexes are being used frequently and those that are not used at all.
[
{
name: '_id_',
key: { _id: 1 },
accesses: { ops: Long("0"), since: ISODate("2023-06-15T19:08:51.580Z") },
host: '<cluster>.mongodb.net:27017'
},
...
{
name: 'accounts_1',
key: { accounts: 1 },
accesses: { ops: Long("67"), since: ISODate("2023-06-21T20:20:25.955Z") },
host: '<cluster>.mongodb.net:27017'
},
{
name: 'email_1_username_1',
key: { email: 1, username: 1 },
accesses: { ops: Long("57"), since: ISODate("2023-06-21T20:20:25.997Z") },
host: '<cluster>.mongodb.net:27017'
},
{
name: 'username_1_email_1',
key: { username: 1, email: 1 },
accesses: { ops: Long("0"), since: ISODate("2023-06-21T20:20:26.040Z") },
host: '<cluster>.mongodb.net:27017'
}
]
To profile operations that happen on a MongoDB instance, you can enable the database profiler. In this example, the profiler is enabled by setting the first argument to 1 and setting a slowms threshold to 30, indicating that any operation that takes longer than 30 ms is considered slow. Slow operations are then recorded in a system.profile collection within the database. If we were to set the first argument to 0, the profiler would be disabled, and setting it to 2 would log all operations, regardless of how long they take.
db.setProfilingLevel(1, { slowms: 30 })
To test the profiler in action, run a query against the database that is rather taxing on resources. In this example, we are using the sample_airbnb database which is rather large. We are then sorting each document by host.
db.listingsAndReviews.find().sort({ host: 1 })
Run the following command to query the system.profile collection and return the most recently recorded query on the listingsAndReviews collection:
db.system.profile.find({ op: 'query', ns: 'sample_airbnb.listingsAndReviews'}).sort( {ts: -1}).limit(1)
After running the query, you should see the system.profile entry related to the query to sort all documents by host. The query exceeded the slowms threshold, required a collection scan (COLLSCAN), and an in memory sort (stage: 'SORT'). Also, take note of the ts field which gives a timestamp for the query.
{
op: 'query',
ns: 'sample_airbnb.listingsAndReviews',
command: {
find: 'listingsAndReviews',
filter: {},
sort: { host: 1 },
...
},
...
millis: 479,
planSummary: 'COLLSCAN',
execStats: {
stage: 'SORT',
...
}
},
ts: ISODate("2023-07-12T16:47:27.915Z"),
...
}
To speed up the query, create an index on the host field and run it again to see if it still shows up in the system.profile collection.
db.listingsAndReviews.createIndex({ host: 1 })
db.system.profile.find({ op: 'query', ns: 'sample_airbnb.listingsAndReviews'}).sort( {ts: -1}).limit(1)
After running query, you should see that same entry for the query before we created an index as evidenced by the timestamp (ts). The index appears to have helped this query avoid going past the 30ms threshold.
{
op: 'query',
ns: 'sample_airbnb.listingsAndReviews',
...
ts: ISODate("2023-07-12T16:47:27.915Z"),
...
}