Vector Search Fundamentals / Perform a Vector Search
Now that we've indexed our embeddings, let's see the real power of Atlas Vector Search. We'll query our data and find results based on semantic meaning. In this lesson, we'll generate an embedding vector for a query.
We'll use that vector in the $vectorSearch aggregation stage to perform a search. After that, we'll add a pre-filter to return results more efficiently and increase performance. Get ready, because this is the fun part.
Now you may be wondering how we can compare the text of a query with long lists of numbers in order to find similarity between our query and the vector embeddings. Good question.
Recall we had to generate embeddings for our plot field when we indexed our data.
We'll need to do the same thing with our query. Good thing we already have a function to do this. For the query to be successful, we must use the same model we used when indexing the data. If we used a different model, the query vector and the stored embeddings would live in different vector spaces, so we couldn't meaningfully compare them for similarity.
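As a reference, here's a minimal sketch of what that embedding helper might look like. It assumes the voyageai Python client; the function name get_embedding and its exact signature are illustrative, not necessarily the course's code.

```python
import voyageai

def get_embedding(text, model, api_key, input_type):
    """Generate an embedding vector for a single piece of text with Voyage AI."""
    client = voyageai.Client(api_key=api_key)
    # embed() accepts a list of texts; we pass one and take its embedding back out.
    result = client.embed([text], model=model, input_type=input_type)
    return result.embeddings[0]
```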
Now that we know how to generate an embedding for the query, let's talk about the $vectorSearch aggregation stage.
The $vectorSearch stage performs a nearest neighbor search on our embeddings. Like the $search stage, the $vectorSearch stage must be the first stage in our aggregation pipeline. For this demonstration, we'll use the PyMongo driver, but Vector Search also has driver support for Node.js, Java, and C.
Additional driver support is coming in the future, so keep an eye on the MongoDB documentation. We're almost ready to build our pipeline, but first, we need a query.
If you remember, vector search excels at extracting the semantic meaning from queries. So what does this look like? Lately, I've been really into movies about people who are escaping maximum security facilities, so let's use that as our query. Notice that this query isn't focused on a particular keyword, but is rather a description of what I want. This is an example of a semantic query.
Remember, we used the voyage-3-large model to generate vectors for our documents,
so we need to use the same model for our query to ensure the embeddings are in the same vector space.
The input_type parameter tells the model how to optimize the embedding: we use "query" for search queries and "document" when embedding the actual documents.
We'll plug this query into our embedding function along with the model, API key, and input_type.
We'll then use the resulting embedding in the $vectorSearch stage.
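Putting that together, the call might look like the sketch below; the VOYAGE_API_KEY environment variable name is an assumption.

```python
import os

query = "movies about people escaping from maximum security facilities"

query_embedding = get_embedding(
    query,
    model="voyage-3-large",                 # same model used to embed the plot field
    api_key=os.environ["VOYAGE_API_KEY"],   # assumption: API key stored in this env var
    input_type="query",                     # "query" for searches, "document" for indexing
)
```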
To start, we'll store our pipeline in a variable named pipeline.
Then we'll add the $vectorSearch stage, which has a few fields we need to define.
Next, we specify whether to perform an exact or approximate search using the exact field. If we perform an exact search, we must omit the numCandidates field later on. For this example, we'll set exact to false to perform an approximate search.
We'll use the index we created in a prior lesson.
Next, we specify the path to our embeddings, which is the plot_embeddings field. After that, we provide the vector embedding for the query in the queryVector field. Next comes numCandidates, but let's skip it for a moment and look at limit.
This is the number of documents we want to return. Let's set it to 10, so we receive the ten documents most similar to our query. Okay, let's go back to numCandidates, which stands for "number of candidates." What does that mean?
If you remember, to search our vectors with the approximate nearest neighbor (ANN) algorithm, we use an HNSW graph. This is a multilayered structure that maps our vectors. We then use a nearest neighbor algorithm to identify vectors that are similar to our query. The process begins at a random point on the top layer, which contains fewer vectors separated by longer distances. Then we keep moving down through layers that contain more and more vectors linked closer together,
and we search until we reach the bottom layer and get as close as possible to the query point. The number of times we repeat that process using a different random entry point at the top layer is the number of candidates, or numCandidates.
We recommend setting numCandidates to at least 20 times your limit value to boost accuracy.
If you want ten results, start with numCandidates at 200.
Higher values also reduce the differences between your ANN results and exact nearest neighbor results.
This over-request approach lets you balance speed and recall in your searches, but tune this parameter based on your specific needs, considering these key factors.
Index size: larger collections need higher numCandidates values. A million-vector collection requires significantly more candidates than a thousand-vector one.
Limit value: since numCandidates correlates with index size, lower limits need proportionally higher candidate counts to maintain recall.
Vector quantization: quantized vectors (int8 or int1) sacrifice some accuracy for storage savings, so they might need higher numCandidates values than full-precision float32 vectors to achieve similar recall.
With that in mind, I'll set numCandidates to 200, and my limit is 10. Now we've completed the $vectorSearch stage. Let's add a $project stage to make our results easier to read. Finally, let's run the pipeline. Now remember, our original query was about people who are trying to escape from a maximum security facility.
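Here's a sketch of the full pipeline as built so far, assuming a PyMongo collection handle named collection and an index named vector_index from the prior lesson (both placeholders):

```python
pipeline = [
    {
        "$vectorSearch": {
            "index": "vector_index",        # placeholder: the index created in the prior lesson
            "path": "plot_embeddings",      # field that holds the document embeddings
            "queryVector": query_embedding, # the embedding we generated for the query
            "exact": False,                 # approximate (ANN) search
            "numCandidates": 200,           # ~20x the limit
            "limit": 10,                    # return the 10 most similar documents
        }
    },
    {
        # Trim each result down to the fields we want, plus the similarity score.
        "$project": {
            "_id": 0,
            "title": 1,
            "plot": 1,
            "score": {"$meta": "vectorSearchScore"},
        }
    },
]

for doc in collection.aggregate(pipeline):
    print(doc)
```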
When we view our list of results, we see that each plot field describes a movie that involves a prison escape without using those exact words.
We see phrases like "inmate leading the rebellion" and "escape with help from the inside."
This is awesome because Atlas took our query and returned results based on the meaning of the query. All we had to do was describe what we wanted to watch.
Now suppose I'm in the mood for a more recent movie. Remember when we added a filter to the index we created? Well, we can use it in our query.
The filter will pre-filter documents before the vector search runs. This improves performance by removing irrelevant data up front, narrowing down the search space.
As you can imagine, searching an entire HNSW graph with thousands of dimensions is resource intensive.
Let's add the filter option to our $vectorSearch stage.
We'll filter on the year field and specify that I want to find movies released after 2010. We'll also update the $project stage.
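The updated pipeline might look like the sketch below; note that the year field must be declared as a filter field in the index definition for this to work:

```python
pipeline = [
    {
        "$vectorSearch": {
            "index": "vector_index",
            "path": "plot_embeddings",
            "queryVector": query_embedding,
            "exact": False,
            "numCandidates": 200,
            "limit": 10,
            # Pre-filter the search space to movies released after 2010.
            "filter": {"year": {"$gt": 2010}},
        }
    },
    {
        "$project": {
            "_id": 0,
            "title": 1,
            "year": 1,    # include year so we can verify the filter worked
            "plot": 1,
            "score": {"$meta": "vectorSearchScore"},
        }
    },
]
```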
Now, when we run the pipeline, we receive exactly what we're looking for.
Pre-filtering is a powerful tool, and we recommend using it whenever possible.
Great work. Let's recap what we learned.
First, we learned that we must vectorize our query in order to perform a search,
and that we need to use the same embedding model that we used for our indexed data.
We learned how the powerful $vectorSearch aggregation stage performs a search. Finally, we added a filter to the $vectorSearch stage, which pre-filtered our data, returning more specific results and improving performance.
Next, we'll learn how to combine keyword search and vector search to get the best of both worlds.
