Search Fundamentals / Creating an Atlas Search Index: Data Types

Now, to ensure Atlas Search correctly interprets, stores, and optimizes that data for lightning-fast queries, we must understand how it uses data types. Let's dive into this crucial next step in building your powerful search index. In this video, we'll cover the data types that Atlas Search supports and define which data types to use in a search index.

Let's quickly review where we are in the process of building the keyword movie search feature for mFlix, our movie streaming platform. After defining the mappings in the search index as static, we listed the fields that we want to index. Let's focus on the mappings document. We now need to supply a data type for each field that we want to index. The data type dictates how the field is indexed, searched, and matched during query execution, and it ensures that operations such as filtering, scoring, and sorting behave as expected.

Atlas Search supports different data types depending on whether you use dynamic or static mappings. Since we're using static mappings for this feature, let's start by looking at some of the options supported for static mappings. To begin with, Atlas Search supports common data types such as strings, numbers, booleans, and dates. It also supports other data types for static mappings, including arrays, ObjectId, document, and embedded documents. Let's explore these data types to get a better sense of what they do.

First, let's talk about arrays. Arrays can hold all sorts of data types, like strings, dates, unique IDs, or numbers. Here's the trick: instead of indexing the array itself, Atlas Search indexes the individual items within the array based on their specific data type. For example, imagine our directors field is an array of strings containing Lana Wachowski and Lilly Wachowski. In the index definition, we simply tell Atlas Search to treat the directors field as a string type, and it handles indexing each name.
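To make the array case concrete, here is a minimal sketch of what such a static-mapping definition might look like, written as a plain Python dictionary. The variable names and the sample document are illustrative assumptions, not taken from the video. Note that the field is typed by its elements, and the array itself is never declared.

```python
# Sketch of a static-mapping index definition for an array-of-strings field.
# Variable names and sample values are illustrative assumptions.
directors_index_definition = {
    "mappings": {
        "dynamic": False,  # static mappings: only the listed fields are indexed
        "fields": {
            # The array is not declared as such; the field takes the type
            # of the items it contains.
            "directors": {"type": "string"},
        },
    }
}

# Shape of a document this definition is meant for (illustrative values).
sample_movie = {
    "title": "The Matrix",
    "directors": ["Lana Wachowski", "Lilly Wachowski"],
}
```

Each name in the directors array would be indexed as a string, so a query could match either director.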
But what happens if an array contains different types of data, for instance, if our directors array had both names and unique IDs mixed together? No problem. Atlas Search can handle that. We've defined the directors field as an array that holds both ObjectId and string types. Now you can search using either a director's name or their ObjectId. And this flexibility isn't just for arrays. You can assign multiple data types to other fields too.

Next up is the ObjectId data type. When you want to make a field that holds unique MongoDB IDs searchable, you must specifically assign it the ObjectId data type in your index definition.

What if the field you want to search on is a subdocument? A subdocument is basically a document embedded inside another document. For instance, think of an awards field that's a subdocument holding details like wins and nominations for a movie. This awards field is a perfect candidate for the document data type. When you define awards as a document type in your index, Atlas Search treats it as a single unit, allowing you to search within its contents. You even have the flexibility to index all of the fields inside that subdocument dynamically or to index only specific fields, like just the wins count.

It's important to note that the document data type cannot be used for fields that hold an array of subdocuments. Let's look at an example. For the sake of this demonstration, suppose the released field is actually an array where each item is a subdocument detailing release dates for different countries. In the index definition, you'd use the embedded documents data type instead. Similar to the document type, the embedded documents type allows you to either automatically index all supported fields within those subdocuments with dynamic mappings or use static mappings to index a few individual fields. Keep in mind that Atlas Search indexes embedded documents separately from their parent documents.
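The three cases just described (a field with multiple types, a subdocument, and an array of subdocuments) can be sketched together in one definition. This is a hedged illustration written as a Python dictionary: the variable name is hypothetical, and treating released as an array of subdocuments follows the demonstration scenario above rather than the real sample data.

```python
# Sketch of a static-mapping definition combining the cases discussed above.
# The variable name is hypothetical.
mixed_types_index_definition = {
    "mappings": {
        "dynamic": False,
        "fields": {
            # Multiple types for one field: a list of type definitions.
            "directors": [
                {"type": "objectId"},
                {"type": "string"},
            ],
            # A single subdocument: index only the wins count inside it.
            "awards": {
                "type": "document",
                "dynamic": False,
                "fields": {"wins": {"type": "number"}},
            },
            # An array of subdocuments (demonstration scenario only):
            # index every supported field inside each item dynamically.
            "released": {
                "type": "embeddedDocuments",
                "dynamic": True,
            },
        },
    }
}

# Collect the declared type names per field for a quick sanity check.
fields = mixed_types_index_definition["mappings"]["fields"]
director_types = [entry["type"] for entry in fields["directors"]]
```

With a definition like this, each subdocument in the released array would be stored as its own entry in the index, in addition to the parent movie document.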
This means that using embedded documents can significantly increase the number of documents in your index, which can impact performance.

You may be wondering about data types when using dynamic mappings. While you don't need to explicitly declare data types when using dynamic mappings, it's important to note that dynamic mappings support the following data types: strings, numbers, booleans, arrays, ObjectId, dates, and documents. More will likely be added in the future. Check out the MongoDB documentation for a full list of the data types that you can use with static mappings and the considerations for each type.

Now that we know more about supported data types, let's finish the search index for our keyword movie feature by adding data types. In the planning phase, when we looked at a sample movie document, we identified plot and title as strings and released as a date. Now we can add these types to our index definition. And with that, the search index for our feature is complete. When we run the createSearchIndex command in the shell, the name of our index should be returned.

In distributed deployments, such as replica sets or sharded clusters, there may be a brief delay before the search index is consistent across all nodes due to eventual consistency. Once the index is fully propagated, it will be available for search queries from all nodes.

It's also worth noting that while MongoDB allows for polymorphic data, search indexes only include documents whose field values match the data type specified when the index was created. For example, if we insert a document whose field value has a data type that differs from the one specified in the search index, that particular document won't be included in the index.

Nice work. Let's recap what we covered in this video. Data types dictate how the field is indexed, searched, and matched during query execution. We must use supported data types for search to work as expected.
You also learned that data type support in Atlas Search can vary depending on whether you're using dynamic or static mappings. You learned that we can assign a field multiple data types when defining a search index. Finally, we finished and created the search index for feature one of our mFlix streaming app.
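For reference, the completed index for the keyword movie search feature might look like the sketch below, again written as a Python dictionary. The index name movieKeywordSearch is a hypothetical placeholder, since the video does not state one. The field types for plot, title, and released are the ones identified in the planning phase.

```python
# Sketch of the finished search index for the keyword movie feature.
# "movieKeywordSearch" is a hypothetical name, not taken from the video.
keyword_search_index = {
    "name": "movieKeywordSearch",
    "definition": {
        "mappings": {
            "dynamic": False,  # static mappings, as chosen for this feature
            "fields": {
                "plot": {"type": "string"},
                "title": {"type": "string"},
                "released": {"type": "date"},
            },
        }
    },
}

# Summarize which type each indexed field was given.
indexed_types = {
    field: spec["type"]
    for field, spec in keyword_search_index["definition"]["mappings"]["fields"].items()
}

# Assumption: with PyMongo and an Atlas cluster, a dictionary of this shape
# could be passed to Collection.create_search_index, which returns the index
# name. That call needs a live deployment, so it is not executed here.
```

Keeping the definition as plain data makes it easy to review the field types before creating the index against a real cluster.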