RAG with MongoDB / Create a RAG Application

Now that we have chunks of data and their embeddings stored in Atlas, it's time to use them. In this video, we'll learn about the retriever component of a RAG system and how to implement it. Remember, all the code will be provided after this video.

First, I'll create a file named rag.py and include references to the database and collection that store the chunks we've created. Additionally, I'll specify the name of the vector search index.

Next, let's create that index in Atlas. If you're unsure how to create a vector search index, check out our video on vector search indexes. For now, we set the path to the field with embeddings and define the type as vector. We'll also include the has_code field as a filter. This will allow us to pre-filter on the has_code field when we query. Now I'll paste this Atlas Vector Search index definition into the Atlas UI off screen.

Okay, we have the setup out of the way. Let's zoom in on the retriever component. The main job of the retriever is to take the user's query and find chunks that relate to it. Does this sound familiar? That's because this is exactly what Atlas Vector Search does: we take the user's query, send it through the same embedding model that we used on our chunks of data, and then use $vectorSearch to find similarities.

With that in mind, let's implement the retriever in our app. First, we need to point LangChain to our data and embeddings in Atlas by setting up our vector store. To do this, we import MongoDBAtlasVectorSearch from the langchain-mongodb package at the top of the file. We'll also import the embedding model so it can be used to vectorize queries. Keep in mind that we have to use the same embedding model for both the queries and the chunks. Next, we'll create a variable named vector_store to set MongoDB Atlas as our vector store. This variable will use the from_connection_string method on the MongoDBAtlasVectorSearch class. Inside this method, we'll provide the Atlas connection string for our cluster.
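The index definition pasted into the Atlas UI isn't shown on screen, so here's a sketch of what it might look like. The field names (`embedding`, `has_code`), the dimension count (1536, typical for OpenAI's text-embedding models), and the cosine similarity metric are all assumptions; match them to your own chunk documents and embedding model.

```python
# Hypothetical Atlas Vector Search index definition matching the narration:
# one vector field for the chunk embeddings, plus has_code as a filter field
# so we can pre-filter on it at query time. Field names and numDimensions
# are illustrative assumptions.
index_definition = {
    "fields": [
        {
            "type": "vector",
            "path": "embedding",    # field that stores the chunk embeddings
            "numDimensions": 1536,  # must match your embedding model's output size
            "similarity": "cosine",
        },
        {
            "type": "filter",
            "path": "has_code",     # indexed so queries can pre-filter on it
        },
    ]
}
```

You can paste the equivalent JSON into the Atlas UI, as in the video, or create the index programmatically if you prefer.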
Next, we provide the name of the database and collection that holds our chunks. After that, we specify the embedding model and its API key. Lastly, we specify our vector search index. We can now use the vector_store variable anytime we need to interact with Atlas Vector Search.

Let's run a query to see how it works. We'll create a function named query_data that takes a query string as an argument. This function will hold all of our RAG system's logic. Next, we'll use the as_retriever method on the vector_store variable and assign the result to a variable named retriever, setting Atlas Vector Search as our retriever. The retriever expects a couple of arguments: the search type, which determines the kind of search to perform, and the search kwargs. For the search type, we can choose from similarity, similarity_score_threshold, or maximal marginal relevance. Let's use similarity for now, since we're not doing anything extra with the score yet. In the search kwargs, we'll define k, which specifies the number of documents to return, and set it to three. Finally, we'll invoke our retriever with the query string and add a print statement to display the results.

Now, let's test the query_data function by passing in a string as a query. Remember, our chunks are related to MongoDB, so I searched for the phrase "When did MongoDB begin supporting multi-document transactions?" The console output was a little messy, so I cleaned it up for us to view. As you can see, the page_content field is related to MongoDB transactions. We also see that metadata was returned.

Speaking of metadata, we can leverage it to improve the performance of our retriever by pre-filtering. If you recall, Atlas Vector Search can pre-filter documents before performing a vector search. This reduces the number of vectors Atlas has to search, which decreases the computational overhead and improves latency. Let's pre-filter for chunks that don't have code.
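Putting the steps above together, the retriever portion of rag.py might look like the sketch below. It assumes the langchain-mongodb and langchain-openai packages; the connection string, database, collection, and index names are placeholders, and query_data is the helper name used in the video. The third-party imports are deferred into the function only so the sketch can be read and loaded without those packages installed; in a real rag.py they would sit at the top of the file.

```python
# Sketch of the retriever from the video, with assumed names throughout.
ATLAS_CONNECTION_STRING = "mongodb+srv://<user>:<password>@<cluster>/"  # placeholder
DB_NAME = "rag_db"           # assumed database name
COLLECTION_NAME = "chunks"   # assumed collection name
INDEX_NAME = "vector_index"  # assumed vector search index name
SEARCH_KWARGS = {"k": 3}     # return the top 3 most similar chunks


def get_vector_store():
    # Deferred imports: in a real rag.py, put these at the top of the file.
    from langchain_mongodb import MongoDBAtlasVectorSearch
    from langchain_openai import OpenAIEmbeddings

    # Must be the same embedding model used when the chunks were embedded.
    # OpenAIEmbeddings reads the API key from the OPENAI_API_KEY env variable.
    return MongoDBAtlasVectorSearch.from_connection_string(
        ATLAS_CONNECTION_STRING,
        f"{DB_NAME}.{COLLECTION_NAME}",  # namespace: "<database>.<collection>"
        OpenAIEmbeddings(),
        index_name=INDEX_NAME,
    )


def query_data(query: str):
    # Set Atlas Vector Search as our retriever, plain similarity search for now.
    retriever = get_vector_store().as_retriever(
        search_type="similarity",
        search_kwargs=SEARCH_KWARGS,
    )
    documents = retriever.invoke(query)
    print(documents)
    return documents
```

With a live cluster, calling `query_data("When did MongoDB begin supporting multi-document transactions?")` should print the three chunks closest to the query, each with its page_content and metadata.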
Back in the editor, we'll add a pre_filter field to the search kwargs object. Here we specify the field we want to pre-filter on. To filter on a field, it must be indexed. We added the has_code field to our vector search index, so let's find every chunk without code. Now, when we run the function again, we can see the returned document does not have code, based on the has_code field. As you can see, pre-filtering is a great way to avoid searching through every chunk.

Another way we can influence our query results is with the score. We could modify our results by adding another filter that removes results below a certain score. This is where the similarity_score_threshold search type comes into play. To do this, let's update the search type field to use similarity_score_threshold. We also have to update the search kwargs to include the score threshold. We'll set it to 0.01, so no documents with a score below 0.01 are returned.

You might be thinking: why 0.01? That seems low compared to the relevance scores calculated by Atlas Vector Search. That's correct, but LangChain's similarity_score_threshold search type normalizes the vector score, which results in lower scores. LangChain also uses a slightly different similarity function than Atlas Vector Search, which explains some of the difference. So when using this option, it's a good idea to experiment with different thresholds.

Now, when we run it again, we can see that not much has changed. But behind the scenes, the search results that were returned were further filtered by the score of the returned document. In other words, every returned document scored higher than the score threshold that we set. Great work. Now we have a good idea of how the retriever works, but this is only one part of RAG. Next, we need to work on the answer-generation component. Before we do that, let's clean up our example so we can build on it in the next video.
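Both refinements above only change the arguments passed to as_retriever. Here's a sketch of the two search kwargs variants; the has_code field name comes from the index we created, and the pre_filter value uses MQL-style match syntax, which langchain-mongodb passes through to Atlas Vector Search's filter option.

```python
# Variant 1: pre-filter on indexed metadata before the vector search runs.
# Used with search_type="similarity".
prefilter_kwargs = {
    "k": 3,
    "pre_filter": {"has_code": {"$eq": False}},  # only chunks without code
}

# Variant 2: drop results whose normalized score falls below a threshold.
# Used with search_type="similarity_score_threshold". 0.01 is the low value
# chosen in the video, because LangChain normalizes the raw Atlas score.
threshold_kwargs = {
    "k": 3,
    "score_threshold": 0.01,
}
```

For example, the score-threshold variant would be wired up as `vector_store.as_retriever(search_type="similarity_score_threshold", search_kwargs=threshold_kwargs)`.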
I'll remove the pre_filter and score threshold. I'll also remove where we invoke the retriever. Lastly, I'll change the search type back to similarity, since we're not going to be using a score threshold.

With our code ready for the next video, let's recap what we learned. First, we learned that the retriever processes the user's query to find relevant chunks of information. Next, we discovered that Atlas Vector Search is ideally suited to serve as a retriever in a RAG system. Finally, we explored how to pre-filter on a metadata field and filter based on score. See you in the next video, where we'll finish putting our application together.