Monitoring Tooling / Utilizing Atlas Query Insights

When we ran a performance test on our application, we saw spikes in our query targeting and CPU metrics on the Atlas metrics panel. While this is useful, it doesn't give us all the information we need. In this video, we'll use Query Insights in Atlas to dig deeper and identify the cause of these spikes. Let's get started.

MongoDB Atlas's Query Insights allows users to monitor, manage, and optimize database performance directly from the Atlas user interface. Query Insights provides detailed namespace-level metrics, so you have a greater understanding of how well your database is performing. Query Insights comprises two features. The first is Namespace Insights, which provides collection-level latency statistics and a view of performance trends over time. The other is the Query Profiler, which provides a view of slow and inefficient queries in a cluster over an extended time period. Let's see it in action.

First, we navigate to the Query Insights page for our cluster. The page has two tabs, one for Namespace Insights and one for the Query Profiler. Here on the Namespace Insights tab, we see two charts and a table. These contain information for the most active namespaces in our cluster. The charts show different namespaces in different colors, and we can see the legend by hovering over the chart values. If we want to add more namespaces to the charts, we can pin them in the table right here. By clicking the dropdown, we can choose which metric we want to observe. We can also choose which types of operations we want to see. Since we are concerned about slow queries, let's focus primarily on read operations and look at the Total Latency and Operation Count metrics. These metrics were relatively stable until here, when a spike occurred on both charts. Our initial performance test shows a clear spike in total latency and operation count on that specific collection.
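To make the charts above concrete, here is a minimal Python sketch of the kind of aggregation Namespace Insights performs: grouping slow-operation records by namespace and summing read latency and operation counts. The records, their field names, and the numbers are illustrative (loosely shaped like MongoDB database profiler entries), not real Atlas output.

```python
from collections import defaultdict

# Hypothetical slow-operation records; "ns", "op", and "millis" mimic
# database-profiler fields, but the data itself is made up for illustration.
ops = [
    {"ns": "shop.orders", "op": "query",  "millis": 120},
    {"ns": "shop.orders", "op": "query",  "millis": 340},
    {"ns": "shop.users",  "op": "query",  "millis": 15},
    {"ns": "shop.orders", "op": "update", "millis": 60},
]

def summarize_by_namespace(ops, op_types=("query",)):
    """Total latency and operation count per namespace, busiest first."""
    totals = defaultdict(lambda: {"total_ms": 0, "count": 0})
    for entry in ops:
        if entry["op"] in op_types:  # focus on read operations, as in the video
            totals[entry["ns"]]["total_ms"] += entry["millis"]
            totals[entry["ns"]]["count"] += 1
    # Sort so the most latency-heavy namespace appears first
    return sorted(totals.items(), key=lambda kv: kv[1]["total_ms"], reverse=True)

for ns, stats in summarize_by_namespace(ops):
    print(ns, stats)
# The "shop.orders" namespace dominates total read latency here,
# just as one collection dominated the charts in our test.
```

Filtering by operation type before aggregating mirrors the tab's operation-type selector: the same records give different charts depending on which operations you include.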
Since we know when the spikes happened, we'll switch to the Query Profiler tab to investigate why they happened. The Query Profiler page shows metrics related to specific queries. While the default chart shows operation execution time, we can also select and view other available metrics. Let's analyze the documents examined to documents returned ratio during the period of increased query latency. Look here: an obvious outlier is apparent. Clicking on that data point shows more about the query, revealing a high examined-to-returned ratio. To dive deeper, let's click View More Details. It looks like this particular query had no index coverage, so creating an index could significantly improve its performance. We can pin this namespace on the Namespace Insights page, making it easy to find when we want to test the query again in the future, for example, after an index has been created to support it.

Great job. In this video, we used Query Insights to take a closer look at our performance spikes. Specifically, we learned how to use Namespace Insights to identify the problematic collection and then the Query Profiler to pinpoint the inefficient query causing the issue.
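The examined-to-returned ratio that flagged our outlier can also be computed from a query's `explain()` output. Below is a small Python sketch using the real `executionStats` field names (`nReturned`, `totalDocsExamined`); the numbers are hypothetical, chosen to contrast an unindexed query with the same query after an index is added.

```python
def examined_to_returned_ratio(execution_stats):
    """Documents-examined-to-returned ratio from explain()-style stats.

    A ratio near 1 means the query reads little more than it returns;
    a large ratio suggests a scan that an index could avoid.
    """
    returned = execution_stats["nReturned"]
    examined = execution_stats["totalDocsExamined"]
    if returned == 0:
        # Examined work with nothing returned is pure waste
        return float("inf") if examined else 0.0
    return examined / returned

# Illustrative stats: a query with no index coverage scans the collection...
unindexed = {"nReturned": 20, "totalDocsExamined": 100_000}
# ...while the same query with a supporting index examines only what it returns.
indexed = {"nReturned": 20, "totalDocsExamined": 20}

print(examined_to_returned_ratio(unindexed))  # 5000.0
print(examined_to_returned_ratio(indexed))    # 1.0
```

After identifying the query, the fix itself is a one-liner in mongosh, for example `db.orders.createIndex({ status: 1 })` (collection and field names here are hypothetical); re-running the pinned namespace afterward should show the ratio drop toward 1.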