Performance Tools and Techniques / Tools for Analyzing Slow Queries

Moving from observation of performance issues to action, this lesson introduces tools to address slow queries. We'll explore command-line utilities and MongoDB Atlas features that give you granular insight into query execution, allowing you to identify slow queries, understand their impact, and take precise steps toward optimization. Let's get started.

A slow query is any read operation that exceeds an acceptable execution time. By default, MongoDB uses a threshold of 100 milliseconds. For resource-intensive queries, such as aggregations that generate large reports, it may be acceptable to go beyond this limit, but these should be carefully reviewed. Slow queries reduce application responsiveness and impact user satisfaction, especially during peak traffic, when delays can cascade across systems.

Beyond profiling queries, we can use additional tools for deeper query analysis, including the MongoDB logs and the explain method. Let's start with the logs. Operations that exceed the 100-millisecond threshold are logged with key details about query execution. We can change the threshold via db.setProfilingLevel() if needed. If we set it too high, we won't capture anything below the threshold; if we set it too low, we risk running out of disk space as we log a higher number of operations. For this example, we'll set the threshold to 150 milliseconds. This change will persist only until the next server restart.

When evaluating log output, we want to pay attention to particular fields. "Slow query" signifies that the command surpassed the slow operation threshold. The execution time in milliseconds shows the actual time the command took to execute. ns contains the database and collection name, and the command's filter indicates the query pattern used.

A simple way to extract slow query entries from MongoDB logs is to use a command-line tool. On UNIX systems, this could be a grep command: the grep command, the string being searched, and the path to the log file. This retrieves all lines where MongoDB flagged a slow query in the log file. Each JSON entry is a single line and is hard to parse. To make it easier to read, we can use the jq command. jq is a popular command-line processor for JSON. When we pipe the grep output into jq, the output is pretty-printed and easier to read, but it's still so much information that we had to split it across five slides just so you could read it.

Let's filter the log entries down further with jq. We can filter slow operations based on execution time. Here, we use grep to find the slow query entries, then pipe the output to jq and select only the objects whose workingMillis value is greater than 200 milliseconds. As seen from our counts here, this reduces the log results to just the queries exceeding our desired threshold. Finally, since our database supports multiple applications that access different collections, we can also filter log entries by namespace. We can narrow the output to just the information we want by supplying a document of fields to jq. Now we see only the log entries containing "Slow query" from our desired collection. Here, we see a collection scan was performed, and thousands of documents were examined for a small result set. This suggests adding an appropriate index could optimize the query by avoiding costly collection scans and improving efficiency.

The MongoDB log file is great for discovering and analyzing slow queries. The sketches that follow recap the commands from this section.
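To make the threshold change concrete, here is a minimal mongosh sketch. It assumes a self-managed mongod where we only want slower operations logged, not full profiling; the 150-millisecond value matches the example above.

```javascript
// Keep the profiler off (level 0) but lower the slow-operation
// threshold that controls which operations get written to the log.
db.setProfilingLevel(0, { slowms: 150 })

// Confirm the current settings; this change lasts only until
// the next server restart.
db.getProfilingStatus()
```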
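Next, a sketch of the extraction commands described above. The log path /var/log/mongodb/mongod.log is an assumption; substitute the path for your own deployment.

```sh
# Pull every structured log line that MongoDB flagged as slow.
grep '"Slow query"' /var/log/mongodb/mongod.log

# Each match is one JSON object per line; pipe it through jq
# to pretty-print it for easier reading.
grep '"Slow query"' /var/log/mongodb/mongod.log | jq '.'
```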
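And a sketch of the jq filters. The field paths (attr.workingMillis, attr.ns, and so on) follow MongoDB's structured log format, and the sample_airbnb.listingsAndReviews namespace is just an assumed example.

```sh
# Keep only slow queries that ran longer than 200 ms.
grep '"Slow query"' /var/log/mongodb/mongod.log \
  | jq 'select(.attr.workingMillis > 200)'

# Narrow further to one namespace and project just the fields
# we care about, supplied as a document of fields.
grep '"Slow query"' /var/log/mongodb/mongod.log \
  | jq 'select(.attr.ns == "sample_airbnb.listingsAndReviews")
        | { ns: .attr.ns,
            millis: .attr.workingMillis,
            docsExamined: .attr.docsExamined,
            nreturned: .attr.nreturned,
            plan: .attr.planSummary }'
```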
But if we already know the query we want to analyze, we can use the explain method in the MongoDB shell. For example, suppose we have a slow query that matches on fields like address.market and uses range scans on price. We can run explain to get information about how the query executes. The output includes detailed metrics and analytics on how the query was processed. Let's look at the key pieces of information it provides.

executionTimeMillis represents the total duration spent processing, filtering, sorting, and returning query results. High execution times often indicate inefficiencies in query execution. The explain output reveals whether indexes were used or ignored, which helps verify that indexes match the query's filter or sort criteria. totalDocsExamined versus nReturned shows the number of documents scanned versus those returned; a large mismatch signals possible indexing issues, and the ideal is a one-to-one ratio between these values.

Finally, we can see the query stages in the output. MongoDB processes queries in stages, which represent the various operations performed to retrieve the requested data. Here, we see COLLSCAN, where the database reads the entire collection to find matching documents. The different stages of a query are often tied to performance bottlenecks. Costly stages include collection scans, index scans using the wrong index, and sort stages; each of these can have a serious impact on query performance.

Collection scans happen when there is no index to support the query, forcing MongoDB to scan the entire collection. This is problematic for large datasets, and adding an appropriate index is key to improving performance. An index scan shows MongoDB used an index to find data, which is preferred over a collection scan; however, we should verify that the right index is being used for slow queries. Sorting large amounts of data without a matching index can cause slow sort stages and delays. To prevent this, follow the ESR (Equality, Sort, Range) guideline by building indexes specifically for the fields being sorted. This enables MongoDB to quickly retrieve data in the correct order directly from the index.

In our example query, the explain() method reveals inefficiencies. We see a collection scan because there's no appropriate index on address.market or price, and delays occur during the sort stage because, without an index, the entire sorting process has to happen in memory. By analyzing these indicators, we can determine actionable steps to optimize the query, such as adding a compound index on address.market and price to eliminate the collection scan and the sorting delays. The sketches below show the explain call and the suggested index.

Great job! In this video, we learned about analyzing and optimizing slow queries to improve database performance. First, we used the MongoDB logs with grep and jq to identify slow queries and filter relevant data. Then we utilized the db.collection.explain() method to dive deeper into query execution plans and identify inefficiencies like collection scans and missing indexes.
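Here is a sketch of the explain workflow, assuming the query above runs against a listingsAndReviews collection; the collection name and filter values are assumptions, not taken from the lesson's dataset.

```javascript
// "executionStats" runs the query and reports timing, documents
// examined, and the winning plan's stages.
db.listingsAndReviews.find(
  { "address.market": "New York", price: { $gte: 50, $lte: 200 } }
).sort({ price: 1 }).explain("executionStats")

// In the output, compare:
//   executionStats.executionTimeMillis  - total time spent
//   executionStats.totalDocsExamined    - documents scanned
//   executionStats.nReturned            - documents returned
// A COLLSCAN stage plus a large examined-to-returned gap points
// at a missing index.
```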
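Finally, a sketch of the fix the lesson suggests: a compound index on address.market and price, with the same assumed collection name. The equality field comes first, per the ESR guideline, followed by the field we sort and range-scan on.

```javascript
// Equality (address.market) before Sort/Range (price), per ESR.
db.listingsAndReviews.createIndex({ "address.market": 1, price: 1 })

// Re-running the explain above should now show an IXSCAN stage,
// no separate in-memory SORT stage, and a totalDocsExamined count
// close to nReturned.
```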