Monitoring Tooling / Monitoring Query Performance with MongoDB Atlas Metrics and Real-Time Panel
MongoDB Atlas makes it easy to monitor all your essential metrics. In this video, we'll use the Atlas metrics and real time panel to monitor query performance. Let's get started.
We'll focus on a few key metrics for query performance, including query targeting, op counters, CPU usage, and memory utilization. And don't worry if you're unsure what these metrics are. We'll explain them later in this video. First, we'll use the Atlas metrics panel to select the metrics we want to monitor.
Then we'll performance test our application and view the impact this has on our metrics. To get to the Atlas metrics panel, we need to log in to Atlas.
Once logged in, from our clusters page, we navigate to our application's cluster. In this case, our cluster is the performance skill cluster.
When we find our cluster, we click on view monitoring to go to our metrics page.
The Atlas metrics page displays charts for a variety of hardware, database, and search metrics that are valuable for assessing the overall health of our system.
Most metrics related to query performance are automatically displayed on the Atlas metrics page.
This is where we see query targeting, op counters, process and system CPU, system memory, swap usage, and query execution times.
We can also specify the time period and granularity of these metrics, zooming in or out over time to identify trends or issues in query performance.
Let's take a closer look at each of these key metrics.
Query targeting tells us how efficiently queries are executing.
When we click on the chart information icon for query targeting, the chart displays two key ratios.
One is the ratio of index keys scanned to documents returned. This shows how efficiently indexes are being used.
The other is the ratio of scanned objects, or documents, to the number of objects returned.
This shows how many documents had to be examined to satisfy a query versus the number of documents actually returned.
For both metrics, a one to one ratio indicates high query efficiency and minimal unnecessary data operations.
If we see these values spike, it means the database is scanning large numbers of documents, even when using an index. This can severely impact query performance.
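To make the two ratios concrete, here is a minimal sketch that computes them for a single query. It assumes the counts come from the `nReturned`, `totalKeysExamined`, and `totalDocsExamined` fields that MongoDB reports in `explain` output with execution statistics. The function name and the sample numbers are hypothetical, and the Atlas chart aggregates across many operations rather than one query, so treat this as an illustration of the arithmetic only.

```python
def query_targeting_ratios(stats):
    """Return (keys scanned per returned doc, docs scanned per returned doc)."""
    returned = stats["nReturned"]
    if returned == 0:
        # Assumption for this sketch: treat "nothing returned" as 1 to avoid
        # dividing by zero. This is not necessarily how Atlas handles it.
        returned = 1
    keys_ratio = stats["totalKeysExamined"] / returned
    docs_ratio = stats["totalDocsExamined"] / returned
    return keys_ratio, docs_ratio


# Hypothetical numbers: a well-indexed query versus a full collection scan.
indexed = {"nReturned": 50, "totalKeysExamined": 50, "totalDocsExamined": 50}
collection_scan = {"nReturned": 50, "totalKeysExamined": 0, "totalDocsExamined": 100000}

print(query_targeting_ratios(indexed))          # (1.0, 1.0)
print(query_targeting_ratios(collection_scan))  # (0.0, 2000.0)
```

The first query is at the ideal one-to-one ratio. The second examines 2,000 documents for every document it returns, which is the kind of spike the query targeting chart would surface.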
Next, op counters show us how many CRUD operations are occurring during a specific time frame. Here, we can see that we have a few commands executing, but a significant spike in queries.
If we identify performance issues in the same window, we can most likely rule out issues related to inserts, updates, and deletes.
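The reasoning above can be sketched in code. MongoDB's `serverStatus` output includes an `opcounters` document with cumulative counts per operation type, so a per-second rate is the difference between two snapshots divided by the elapsed time. The helper names and snapshot values below are hypothetical, and this is only a rough approximation of what the op counters chart plots.

```python
OP_TYPES = ("insert", "query", "update", "delete", "getmore", "command")


def opcounter_rates(earlier, later, seconds):
    """Convert two cumulative opcounters snapshots into per-second rates."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return {op: (later[op] - earlier[op]) / seconds for op in OP_TYPES}


def dominant_operation(rates):
    """Return the operation type with the highest rate."""
    return max(rates, key=rates.get)


# Hypothetical snapshots taken 60 seconds apart.
earlier = {"insert": 1000, "query": 5000, "update": 800,
           "delete": 200, "getmore": 300, "command": 9000}
later = {"insert": 1000, "query": 35000, "update": 800,
         "delete": 200, "getmore": 300, "command": 9120}

rates = opcounter_rates(earlier, later, 60)
print(rates["query"], rates["command"])  # 500.0 2.0
print(dominant_operation(rates))         # query
```

With inserts, updates, and deletes flat at zero, any slowdown in this window most likely traces back to the query workload.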
Next, the process CPU and system CPU metrics can tell us if our operations are consuming excessive CPU resources.
Here's an example of how these metrics may look in Atlas.
It's normal to see the CPU values increase as our load increases, but excessive spiking could indicate issues with query performance.
Finally, we have the network traffic metric. This lets us see how many requests are coming into the database and how much data was transferred in or out over the wire during that period.
High levels of network traffic can affect query performance by increasing latency, so it's worth checking this metric out if you experience issues.
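As a rough illustration of what the network chart summarizes, the sketch below derives request and byte rates from two snapshots of the `network` section of `serverStatus`, which reports cumulative `bytesIn`, `bytesOut`, and `numRequests`. The function name and the sample values are hypothetical.

```python
def network_summary(earlier, later, seconds):
    """Summarize network activity between two cumulative snapshots."""
    requests = later["numRequests"] - earlier["numRequests"]
    bytes_in = later["bytesIn"] - earlier["bytesIn"]
    bytes_out = later["bytesOut"] - earlier["bytesOut"]
    return {
        "requests_per_sec": requests / seconds,
        "bytes_in_per_sec": bytes_in / seconds,
        "bytes_out_per_sec": bytes_out / seconds,
        # Large responses per request can hint at queries returning too much data.
        "avg_bytes_out_per_request": bytes_out / requests if requests else 0.0,
    }


# Hypothetical snapshots taken 60 seconds apart.
before = {"numRequests": 1000, "bytesIn": 200000, "bytesOut": 5000000}
after = {"numRequests": 4000, "bytesIn": 800000, "bytesOut": 65000000}

summary = network_summary(before, after, 60)
print(summary["requests_per_sec"])           # 50.0
print(summary["avg_bytes_out_per_request"])  # 20000.0
```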
These metrics give us a snapshot of the database performance at a given moment. But if we need to see what is happening right now, we need to use the real-time panel, which shows us live data about query performance.
The system metrics are displayed at the top of the page.
Operational metrics are on the left, and collection specific information is displayed on the right.
While we can see some variance in the chart data, the ranges are actually quite small, and we can see that the action is occurring in the admin and the config databases.
Let's start up our testing application.
Almost immediately, we can see a rise in the number of operations occurring against our database, and the products collection is at the top of the hottest collections.
If we look at our system metrics at the top, we can see the number of connections has risen significantly, indicating multiple users connecting from one of our application servers.
This leads to an expected spike in the CPU usage as those users begin sending queries to our database.
By looking at the hottest collections, we can see that the products collection in our database is now the most utilized collection on the server.
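One way to reason about "hottest collections" outside the panel is to rank namespaces by how much time the server spent on them between two samples. The sketch below assumes data shaped like the `totals` document returned by MongoDB's `top` admin command, where each namespace has a cumulative `total.time` in microseconds. The function name, the `store` database name, and the numbers are hypothetical, and this is not a claim about how Atlas itself computes its list.

```python
def hottest_collections(earlier, later, limit=5):
    """Rank namespaces by time spent between two `top`-style totals snapshots."""
    deltas = {}
    for namespace, stats in later.items():
        if not isinstance(stats, dict):
            continue  # skip non-namespace entries such as the "note" string
        before = earlier.get(namespace, {}).get("total", {}).get("time", 0)
        deltas[namespace] = stats["total"]["time"] - before
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Hypothetical snapshots, before and after the test application starts.
earlier = {
    "note": "all times in microseconds",
    "store.products": {"total": {"time": 1000, "count": 10}},
    "admin.system.version": {"total": {"time": 500, "count": 5}},
    "config.settings": {"total": {"time": 400, "count": 4}},
}
later = {
    "note": "all times in microseconds",
    "store.products": {"total": {"time": 901000, "count": 4010}},
    "admin.system.version": {"total": {"time": 700, "count": 7}},
    "config.settings": {"total": {"time": 450, "count": 5}},
}

for namespace, micros in hottest_collections(earlier, later):
    print(namespace, micros)
```

In this made-up sample, the products collection moves to the top of the ranking once the load begins, mirroring what we see in the panel.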
When the real time metrics show dramatic spikes in CPU or a large increase in query execution times, our next step is to identify the problematic queries.
Check out our skill on query optimization for more details.
Fantastic. In this video, we learned how to access and interpret key metrics in MongoDB Atlas to monitor query performance.
Then we explored using the real time panel to observe immediate impacts of application activity on these metrics.
