Query Optimization / Identify and Resolve Slow Queries
In this video, we're going to identify and analyze slow queries using the MongoDB shell. This is a great option if you're on a workstation with restricted network access or you're running an on-prem MongoDB cluster.
You'll learn how to use the MongoDB database profiler, logs, and the explain command in the MongoDB shell to identify and analyze slow queries. Let's get started. The database profiler is a built-in tool in MongoDB that lets us collect and analyze data about the performance of operations executed against our database.
It records information about query execution times, the number of documents that were examined during the query, whether or not indexes were used, and other metrics that are crucial for performance tuning and optimization.
We can configure the profiler to collect data at different levels of detail, depending on what we want to gather.
The default setting is level zero, which does not profile any operations.
The most commonly used level is level one, which collects data for operations that take longer than a specified threshold.
The default threshold is one hundred milliseconds.
And finally, there's level two, which collects data for all operations regardless of execution time. This level provides the most detail but can impact performance due to the overhead of logging every operation.
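Before changing anything, we can check the current profiler configuration with the standard mongosh helper:

```javascript
// Check the current profiler configuration for this database.
// Returns the current level ("was"), the slowms threshold, and the sample rate.
db.getProfilingStatus()
```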
Let's look at an example of it in action now. In this case, we're only looking for slow queries, so we'll set the profiler to level one and execute this command in MongoSH.
This command sets the profiling level to capture all queries that take longer than fifty milliseconds to execute rather than the default of one hundred. We define this threshold using the slowms parameter.
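A minimal sketch of that command in mongosh, assuming we run it against the database we want to profile:

```javascript
// Level 1: profile only operations slower than the slowms threshold.
// Here we lower the threshold from the 100 ms default to 50 ms.
db.setProfilingLevel(1, { slowms: 50 })
```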
Once the profiler is enabled, data can be collected immediately.
The profiler writes all data it collects to the system.profile collection, a capped collection in each database that stores profiling data on database operations.
Now that our workload has been running for a while, we execute this command to retrieve profiler data from the system.profile collection and check which queries have the longest execution times. The command retrieves the five queries with the longest execution times and sorts them in descending order. Here are a few key fields we should focus on. The first is ns, short for namespace.
This indicates the database and collection where the query is performed.
Next is op, the operation type, such as query, insert, update, or delete.
Then we have command, which represents the query structure itself.
millis gives us the execution time of the query in milliseconds. And finally, planSummary shows a brief description of the query execution plan. Let's see what each of these looks like in our data. Here are the values for those fields from our top result.
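Put together, the retrieval command described above can be sketched like this in mongosh:

```javascript
// Fetch the five slowest recorded operations, longest first.
// millis holds each operation's execution time in milliseconds.
db.system.profile.find()
  .sort({ millis: -1 })
  .limit(5)
```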
We can see that the operation type was a query on sample_airbnb.listingsAndReviews_big_collection, and it took over three and a half seconds because a collection scan was used. Let's dig deeper into this slow query by running the explain method, which gives us much more detail about a query through its execution stats. The execution stats show that there were no rejected plans during the collection scan. This means the query planner was unable to compute alternative plans using different indexes, or no other suitable indexes were available.
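A sketch of running explain with execution stats on that collection; the filter shown is a hypothetical placeholder, so substitute the actual predicate of the slow query:

```javascript
// Run explain instead of executing the query normally. The output
// includes totalDocsExamined, executionTimeMillis, and rejectedPlans.
// "property_type" is a hypothetical filter field for illustration.
db.listingsAndReviews_big_collection
  .find({ property_type: "Apartment" })
  .explain("executionStats")
```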
Now we know our first step to optimize the query will be to add an index to the collection. We can iterate on this process for any other underperforming queries that we want to analyze.
Once we're done, we'll disable the profiler to prevent it from affecting future performance.
We can do that by setting the profiling level to zero.
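Disabling the profiler is a one-liner in mongosh:

```javascript
// Level 0: stop profiling operations for this database.
db.setProfilingLevel(0)
```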
Now let's shift our attention to logs, which we can also use to identify slow queries in MongoSH.
By default, MongoDB logs any queries with execution times longer than one hundred milliseconds. To find slow queries using log files, we search for the term "Slow query" in the logs themselves.
From the shell interface, we use the grep command to search for slow queries in the logs. We look for patterns in the log files to help isolate the queries that we want to focus on: queries that appear frequently in the logs, queries with high execution times, and queries that scan a large number of documents. For example, we can identify the same slow query as we did previously by searching the log files for these patterns. But how can we identify them? An easy way to identify a repeated query when looking at a log file is to look for a repeated query hash or a repeated filter across log entries. Queries that are executed often can consume significant resources, leading to increased load and slower response times for other operations.
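As an illustration, here is one way to grep for these entries. The log line below is a fabricated sample in MongoDB's structured JSON log format; in practice you would point grep at your deployment's actual log path (for example /var/log/mongodb/mongod.log):

```shell
# Write a sample structured log line to a temp file for illustration,
# then search it the same way we would search a real mongod log.
cat > /tmp/mongod_sample.log <<'EOF'
{"t":{"$date":"2024-01-01T00:00:00.000+00:00"},"s":"I","c":"COMMAND","msg":"Slow query","attr":{"ns":"sample_airbnb.listingsAndReviews_big_collection","durationMillis":3545}}
EOF
grep "Slow query" /tmp/mongod_sample.log
```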
Next, to identify queries with high execution times, we can look at the workingMillis and durationMillis values.
If we were investigating multiple query shapes, it would make sense to prioritize the queries that took the longest, as these would have the greatest impact on overall performance.
And finally, to find queries that scan a large number of documents, we can look at the docsExamined value.
A high number of documents examined often indicates that the query is not using indexes effectively. This leads to full collection scans, resulting in excessive CPU and I/O load. Okay. So you've learned how to examine slow queries with the database profiler and logs in the MongoDB shell. We can also use the execution stats from the explain method to get detailed information about a slow query and how it was executed. Let's execute the last query we looked at with the explain method to analyze it in more detail.
Analyzing the execution stats, we can see a collection scan was run, and then a projection was used. The execution time was well over a second, and there were no rejected plans. This information indicates that this query would also benefit from an index. And since it uses a projection, we may be able to optimize the index, which we'll do later. Nice work. In this video, we explored how to identify and analyze slow queries using the MongoDB database profiler, log files, and by checking the execution stats from the explain command.
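A sketch of explaining a query that uses a projection; the field names here are hypothetical placeholders:

```javascript
// Explain a find that projects only a few fields. If all projected
// fields were included in an index, this could become a covered query.
// "name" and "price" are hypothetical field names for illustration.
db.listingsAndReviews_big_collection
  .find({ property_type: "Apartment" }, { name: 1, price: 1, _id: 0 })
  .explain("executionStats")
```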
By leveraging these tools, we can pinpoint which queries need to be optimized and how to go about improving them.
