Search Fundamentals / Implementing Atlas Search
To create a more intuitive and efficient search experience, you need a clear vision of your user experience. This vision will also guide better technical decisions during Atlas Search implementation, preventing costly redesigns later.
In this video, we'll guide you through the essential planning steps you need to take before implementing Atlas Search features. We'll give you a list of critical questions to ask upfront and key considerations that will shape your implementation strategy.
We'll be working again with an example application for a movie streaming platform called mFlix. Our job is to transform the basic catalog browsing experience into an intuitive search experience that helps users quickly discover films they'll actually want to watch.
We'll build two features to accomplish this. First, users will be able to search the movies collection for movies matching specific criteria. Second, the system will display a count of movies matching a genre search.
Before we begin building search indexes and queries for these features, we need to plan our approach by examining the movies collection and considering the search experience we want our users to have.
To do this, we'll apply this list of questions to the mFlix scenario.
Let's start by considering what our users are searching for. Understanding this will help us determine whether we need the $search stage, which returns documents matching the search criteria, or the $searchMeta stage, which returns metadata. That metadata can provide us with counts of matching documents.
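To make the difference between the two stages concrete, here is a minimal sketch of what each might look like as an aggregation pipeline. The helper names, the index name "default", the query text "heist", and the use of the plot field are all assumptions made for illustration, not details taken from the lesson.

```python
# Sketch only: plain dicts shaped like aggregation stages; nothing here talks to Atlas.
# Assumptions (not from the lesson): index name "default", query text "heist",
# the "plot" path, and the helper names below.

def search_stage(operator: dict, index: str = "default") -> dict:
    """$search returns the documents that match the operator."""
    return {"$search": {"index": index, **operator}}


def search_meta_stage(body: dict, index: str = "default") -> dict:
    """$searchMeta returns metadata about the matches, such as counts."""
    return {"$searchMeta": {"index": index, **body}}


query = {"text": {"query": "heist", "path": "plot"}}

# Document-returning style: matching movies come back as documents.
documents_pipeline = [search_stage(query), {"$limit": 10}]

# Metadata style: only a total count of matches comes back.
count_pipeline = [search_meta_stage({**query, "count": {"type": "total"}})]
```

With a driver such as PyMongo, either list could be passed to a collection's aggregate method. The only structural difference is the stage name and whether documents or metadata are returned.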
Remember, we're designing two features. The first feature will allow users to search for movies based on keywords or phrases, so they'll expect to see matching content in their results. In this case, we'll use the $search stage in our search query. But for the second feature, we only need to display a count of movies that match the criteria. For this, we'll use the $searchMeta stage in the search query.
Next, we need to determine which fields in the movie document contain likely search terms.
This will help us decide whether to index only a few fields using static mappings or all fields using dynamic mappings when we create a search index.
An index with static mappings is typically much smaller than an index with dynamic mappings.
We can also identify data types that we'll be working with.
To do that, let's look at this example document from the movies collection. For feature one, since we only want to index the title, plot, and released fields, we'll use static mappings for this index. As for the data types, title and plot are both strings, and released is a date. Knowing these data types will come in handy when we create our index.
For feature two, we only need to index the genres and released fields, so we'll again use static mappings for this index.
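As a rough illustration, the static mappings for the two features could be written as index definitions like the ones below. The helper name is hypothetical. The field name genres follows the sample_mflix dataset and is an assumption here, as is the choice of the token type for it.

```python
# Sketch only: index definitions as plain dicts. The helper name is hypothetical.

def static_index_definition(fields: dict) -> dict:
    """Static mappings: dynamic is off, so only the listed fields are indexed."""
    return {"mappings": {"dynamic": False, "fields": fields}}


# Feature one: keyword and phrase search over title and plot, plus a release date.
feature_one_index = static_index_definition({
    "title": {"type": "string"},
    "plot": {"type": "string"},
    "released": {"type": "date"},
})

# Feature two: counts by genre and release date.
# Assumption: the array-of-strings field is named "genres" and is indexed as token.
feature_two_index = static_index_definition({
    "genres": {"type": "token"},
    "released": {"type": "date"},
})

# For contrast, a dynamic mapping indexes every supported field in the document.
dynamic_index = {"mappings": {"dynamic": True}}
```

A definition like this would be supplied when the search index is created, for example in the Atlas UI or through a driver.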
For data types, genres is an array of strings and released is a date. For this feature, we'll use a special Atlas Search field type called token that we'll learn more about later.
Next, we need to decide how closely a user's search term should match our data.
This is going to help us determine which operators to use when we create our search query. Search terms may be exact, similar, or partial matches for the data in your Atlas cluster.
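As a rough orientation, here is one way these three matching styles might map onto Atlas Search operators, and how a compound operator could blend two of them. This is a sketch under assumptions: the helper names, the sample query strings, the date window, and the index name "default" are invented for illustration, and the lesson's actual queries may choose different operators.

```python
from datetime import datetime, timezone

# Sketch only: each helper returns an operator body as a plain dict.
# Helper names, query strings, and dates are hypothetical.

def exact_phrase(query: str, path) -> dict:
    """Exact-style match: the analyzed terms must appear together, in order.
    Strict case-sensitive equality would need a different setup, such as the
    equals operator on a token-typed field."""
    return {"phrase": {"query": query, "path": path}}


def similar_text(query: str, path, max_edits: int = 1) -> dict:
    """Similar-style match on text: tolerate small spelling differences."""
    return {"text": {"query": query, "path": path, "fuzzy": {"maxEdits": max_edits}}}


def within_range(path: str, start: datetime, end: datetime) -> dict:
    """Similar-style match on dates or numbers: accept anything in a window."""
    return {"range": {"path": path, "gte": start, "lt": end}}


def partial_prefix(query: str, path: str) -> dict:
    """Partial-style match; the field must be indexed with the autocomplete type."""
    return {"autocomplete": {"query": query, "path": path}}


def blend(must: list, filters: list) -> dict:
    """Combine clauses: must clauses affect the score, filter clauses do not."""
    return {"compound": {"must": must, "filter": filters}}


stage = {
    "$search": {
        "index": "default",
        **blend(
            must=[exact_phrase("space station", ["title", "plot"])],
            filters=[
                within_range(
                    "released",
                    datetime(2000, 1, 1, tzinfo=timezone.utc),
                    datetime(2010, 1, 1, tzinfo=timezone.utc),
                )
            ],
        ),
    }
}
```

Putting the date window in filter rather than must keeps it from influencing relevance scores, which is usually what you want for a pure yes-or-no constraint.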
Exact means that the search term must match the stored data exactly. For example, searching for apple will only match documents where apple appears exactly as written, including case and spelling.
Similar means that the search term can match data that is phonetically close to the query or falls within a range. With similar matches, searching for apple will match documents where apple is in a different case or appears in a compound word like pineapple.
And partial means that the search term can match parts of a word or phrase. For example, searching for app might match apple, application, or appealing depending on the search configuration.
In our case, both features will use exact matching to accommodate search terms and similar matching to accommodate the date range. We'll show you how you can use operators to blend matching behaviors when we build the search queries for each feature.
Now we'll decide if we need advanced text analysis.
Text analysis tools like analyzers are helpful if your application requires text normalization, multi-language support, or more. The type of analyzer you choose determines how text is broken into tokens when the data is indexed.
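To give a feel for what "broken into tokens" means, the sketch below imitates two analyzer styles in plain Python. These functions are deliberately crude stand-ins written for illustration. They are not how Lucene or Atlas Search implement analysis, and the sample title is an arbitrary choice. The final dict shows where an analyzer name such as lucene.standard would sit in a field mapping.

```python
import re

# Rough approximations for illustration only; real analyzers follow more
# detailed tokenization rules than these two functions.

def approx_standard_tokens(text: str) -> list:
    """Standard-style: lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def approx_keyword_tokens(text: str) -> list:
    """Keyword-style: keep the whole string as a single, unmodified token."""
    return [text]


title = "The Lord of the Rings: The Two Towers"

standard_tokens = approx_standard_tokens(title)
# ['the', 'lord', 'of', 'the', 'rings', 'the', 'two', 'towers']

keyword_tokens = approx_keyword_tokens(title)
# ['The Lord of the Rings: The Two Towers']

# Where the choice is recorded: an analyzer name on a string field mapping.
title_mapping = {"title": {"type": "string", "analyzer": "lucene.standard"}}
```

With word-level tokens, a query for rings can match this title. With a single keyword-style token, only the full, exact title would match.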
We'll stick with the default standard analyzer for both features here, but we recommend checking out the documentation to learn more about all the options provided by Atlas Search.
Now we need to consider how we plan to present search results to our users.
We can adjust the presentation of search results in a search query using the score, sort, searchBefore, and searchAfter query options or the facet collector. For example, we can use searchBefore and searchAfter to implement pagination in an application. For now, since we want to display the most relevant movies first, which is the standard behavior, we won't use these options to adjust how the results are returned for feature one. But for feature two, we will use the facet collector to collect movies based on a date range.
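A facet-based count for feature two might be shaped roughly like the pipeline below. This is a sketch under assumptions: the facet name releasedFacet, the index name "default", the decade boundaries, the equals operator on a token-indexed genres field, and the helper names are all illustrative choices rather than the lesson's actual query.

```python
from datetime import datetime, timezone

# Sketch only: builds a $searchMeta pipeline as plain dicts.
# Assumptions: "genres" is indexed as token and "released" as date; the facet
# name, index name, and boundaries are hypothetical.

def start_of_year(year: int) -> datetime:
    return datetime(year, 1, 1, tzinfo=timezone.utc)


def genre_counts_by_release(genre: str, boundaries: list) -> list:
    """Count movies in one genre, bucketed by release date ranges."""
    return [
        {
            "$searchMeta": {
                "index": "default",
                "facet": {
                    # Which documents are counted at all.
                    "operator": {"equals": {"path": "genres", "value": genre}},
                    # How the counted documents are grouped into buckets.
                    "facets": {
                        "releasedFacet": {
                            "type": "date",
                            "path": "released",
                            "boundaries": boundaries,
                            "default": "other",
                        }
                    },
                },
            }
        }
    ]


pipeline = genre_counts_by_release(
    "Comedy",
    [start_of_year(1990), start_of_year(2000), start_of_year(2010)],
)
```

Each pair of adjacent boundaries defines one bucket, so three boundaries yield two date ranges, and the default bucket collects matching movies that fall outside them.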
The facet collector categorizes and counts documents based on specific fields, often within defined ranges or categories. So it fits our need to use a date range.
Finally, we need to think about how we can optimize search performance. Atlas Search query performance is affected by your index configuration and the complexity of your queries.
Let's start by looking at index configuration. Search indexes take up space on disk and are continuously updated when documents are inserted, updated, or deleted. To be as efficient as possible, we'll only index the fields that are critical to our two features.
It's important to note that your data model can also impact index size. While we can index the entire document, doing so will increase the size of the search index and potentially degrade performance. MongoDB's documentation provides guidance for handling various data modeling patterns with Atlas Search.
Also, added complexity in your $search stage and subsequent pipeline stages will directly impact performance. Aim for a sensible balance between query complexity and performance in your queries.
It's also important to know that search queries will compete with regular database tasks like adding, changing, or deleting data for the same resources. If your MongoDB cluster is handling these different operations at the same time, it could result in contention over CPU, memory, and disk I/O and lead to overall performance problems for your database. To avoid this, MongoDB recommends using dedicated search nodes so that MongoDB and search processes run on different nodes.
Search nodes are specialized servers or instances in your MongoDB cluster that are only used for Atlas Search queries. This keeps search from slowing down your main database work. We'll be using an M10 cluster to prototype the two search features for our demo. However, once in production, we recommend using an M30 or higher cluster with dedicated search nodes.
Nice work. Let's recap what we covered in this lesson. Planning how you will implement Atlas Search will help you provide an excellent experience for users and avoid costly redesigns later on. We walked you through a framework that you can use to help you plan out how you want to implement Atlas Search in your application.
