Query Optimization / Enhance Read and Write Operations

When optimizing query performance in MongoDB, it's essential to strike a balance between read and write operations. Indexes significantly enhance read efficiency, but they can introduce overhead when handling high-volume writes. To mitigate this, you can use bulk writes, which allow you to perform many write operations in a single batch, reducing the performance impact. In this video, we'll explore how to leverage bulk writes to streamline data ingestion and maintain index efficiency. Let's get started.

As we've already discussed, indexes can drastically improve query response times. However, they can have unwanted effects when it comes to write operations. Why? Each time a document is inserted, updated, or deleted, MongoDB must also adjust the indexes so they continue to accurately reflect the data in the collection. This means that every write operation carries additional overhead, which can impact performance. So when planning your database schema and indexing strategy, you need to consider this trade-off.

A simple way to enhance write performance is to limit the number of indexes to only those necessary for your application's queries. This both reduces write overhead and simplifies maintenance. But when you're inserting large volumes of data, such as data generated by ETL processes and data warehousing, you should consider using bulk write operations, as they may improve performance. Bulk write operations allow you to perform multiple write operations in a single command and can significantly reduce the time needed to write data.

There are many benefits to bulk write operations. First, they minimize the number of network round trips, which reduces interaction with the server. It's like filling one large commercial truck for a delivery instead of sending several small vehicles. Additionally, bulk writes reduce the server-side processing load, ensuring that resources are used more effectively.
This results in a smoother, less resource-intensive process. Plus, with bulk operations, index updates are processed in batches, leading to improved performance and efficiency. Finally, bulk write operations can speed up the availability of data for querying, allowing users and applications to retrieve and interact with data much more quickly.

MongoDB provides two primary commands for conducting bulk writes: insertMany and bulkWrite. Each caters to different needs and provides unique advantages. insertMany is the go-to solution for simple bulk insertions. It's excellent for inserting batches of documents when additions are your primary task. But bulkWrite is the superior choice if your operation involves a mix of inserts, updates, and deletes. By now, you should be familiar with the insertMany command. If you need a refresher, check out our videos on CRUD operations in MongoDB. In this video, we're going to focus on the bulkWrite command.

The bulkWrite command is quite flexible. It allows us to execute a wide range of operations, including inserts, updates, and deletes, in a single request. This versatility helps optimize operations across different collections or namespaces and streamlines multiple tasks into one coherent process.

Let's look at how a typical bulkWrite command is structured. We'll cover only the most commonly used command fields. For a full list of the available command fields, check out the MongoDB documentation. To make it easier to see the entire syntax, let's look at the command in two columns so we can walk through each section.

The bulkWrite command is run as an argument to db.adminCommand. The first section is the operations array, specified as ops. This is where we list the various insert, update, or delete tasks that we want to include in the bulk operation. For each operation, we specify the type followed by an integer value. This integer value is tied to a collection, which we'll go over in a moment.
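Putting these pieces together, here's a hedged sketch of the command's overall shape, built as a plain JavaScript object that mongosh would pass to db.adminCommand. The namespace "mydb.items", the documents, and the filter values are illustrative assumptions; the nsInfo array and the options shown are explained in the following sections.

```javascript
// A sketch of the bulkWrite command shape (MongoDB 8.0+ server command).
// The namespace "mydb.items" and the sample documents are assumptions.
const bulkWriteCmd = {
  bulkWrite: 1,
  ops: [
    // Each operation: its type, then an integer pointing into nsInfo.
    { insert: 0, document: { _id: 101, status: "new" } },
    { update: 0, filter: { _id: 42 }, updateMods: { $set: { status: "active" } } },
    { delete: 0, filter: { _id: 7 } },
  ],
  // The integer in each operation above is an index into this array.
  nsInfo: [{ ns: "mydb.items" }],
  // Options: run the operations in order and use a majority write concern.
  ordered: true,
  writeConcern: { w: "majority" },
};
// In mongosh: db.adminCommand(bulkWriteCmd)
```

The same object shape works whether the operations target one namespace or several; only the integers and the nsInfo entries change.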
Each of the operation types has varying fields that specify how to process the data. For example, an insert operation includes only a document, whereas updates and deletes include filter documents that specify which documents to apply the update or delete to. Updates also include an update modifier document that specifies the changes to be applied.

After the operations array is the nsInfo array. It's an array of namespaces, so it includes information on the database and collection. For an operation in the ops array to be successful, the corresponding namespace must be found here. Remember that integer value next to the operation type? That value corresponds to the array index of a namespace in this array. So the first namespace is zero, the next is one, and so on.

After the operations and namespace info, we have the options section of our bulk write. Every bulk operation is unique. Here, we can choose whether or not to require the operations to be run in an ordered fashion. We can also specify a write concern for the operation, as well as additional options that we won't cover here.

Now that we have a general understanding of the syntax, let's look at a few examples that use the bulkWrite command. We'll start with an example of using the bulkWrite command with a single collection in the MongoDB shell. First, we'll run a simple query looking for these two IDs. As we can see here, there is information for the first ID, but it's missing the market information, and no document exists for the second ID. We can insert the missing document and update the other using this bulkWrite command. The first operation inserts data for the new document, and the second operation updates the document that was missing market info; it includes the filter and the update modifications. To make sure we're performing our insert and update against the right collection, we specify the database and collection name in the nsInfo field.
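A minimal sketch of such a single-collection command is below. The namespace sample_db.listings, the _id values, and the market field are illustrative assumptions, not the video's exact data.

```javascript
// Sketch: one insert plus one update against a single namespace.
// Collection, _id, and field names are illustrative assumptions.
const singleCollectionCmd = {
  bulkWrite: 1,
  ops: [
    // Insert the document that didn't exist yet.
    { insert: 0, document: { _id: "listing_2", name: "Harbor Loft", market: "Porto" } },
    // Update the existing document that was missing its market info.
    { update: 0, filter: { _id: "listing_1" }, updateMods: { $set: { market: "Porto" } } },
  ],
  // Both operations use integer 0, so both target this one namespace.
  nsInfo: [{ ns: "sample_db.listings" }],
};
// In mongosh: db.adminCommand(singleCollectionCmd)
// The reply carries metadata for the bulk operation, including any errors
// and counts of documents inserted and modified.
```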
Once we execute the command, we get metadata for the bulk operation. There are no errors, and we can see that we inserted one document and updated one document. Now when we run our query again, we can see that the market has been updated for the first document, and the second document now exists.

We can also perform bulk write operations on multiple collections at a time. For example, let's say we want to keep track of all hosts in a separate collection for quick retrieval. Let's add some basic host information to the document we just added, and while we do that, let's add an entry for the host in a new collection called hosts. Our bulkWrite operation is very similar to the last one. The primary difference to be aware of is that our update has a value of zero, which corresponds to our first listed namespace, and our insert has a value of one, which corresponds to our new namespace. When we execute the command, we get the familiar success message. Then when we query for our corresponding documents, we can see that our updated document includes the host information, and our new collection contains the host entry we just added. By performing these operations with bulkWrite, we optimize performance by executing multiple write operations in a single, efficient round trip to the database, reducing latency and improving throughput.

Awesome work. We just performed bulk operations using bulkWrite on a single collection and on multiple collections. bulkWrite is a great tool to have at your disposal, but keep in mind that it may not always be the best choice. When deciding between bulkWrite and insertMany, it's best to consider your operation's needs. insertMany is the go-to solution for simple bulk insertions. It's excellent for inserting batches of documents when additions are your primary task. But if your operation involves a mix of inserts, updates, and deletes, bulkWrite is the superior choice.
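As a recap of the multi-collection example, here's a hedged sketch of how that command might look, with the update pointing at namespace index 0 and the insert at index 1. The collection names, _id values, and host fields are illustrative assumptions.

```javascript
// Sketch: a bulk write spanning two collections via two nsInfo entries.
// All names and values here are illustrative assumptions.
const multiCollectionCmd = {
  bulkWrite: 1,
  ops: [
    // update: 0 → first namespace (the listings collection).
    { update: 0, filter: { _id: "listing_2" }, updateMods: { $set: { host_id: "host_77" } } },
    // insert: 1 → second namespace (the new hosts collection).
    { insert: 1, document: { _id: "host_77", name: "Ana", listings: ["listing_2"] } },
  ],
  nsInfo: [
    { ns: "sample_db.listings" }, // index 0
    { ns: "sample_db.hosts" },    // index 1
  ],
};
// In mongosh: db.adminCommand(multiCollectionCmd)
```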