Vector Search Fundamentals / Store Embeddings
You may have heard about vector embeddings in the context of AI, but it's important to understand that dedicated embedding models are a different category of AI model from large language models (LLMs).
While LLMs like GPT focus on text generation with billions of parameters, embedding models are purpose-built for creating vector representations.
They're typically smaller, more efficient, and optimized specifically for semantic understanding.
The accuracy of your results is directly related to the embedding model you choose. In this lesson, we'll learn about embedding models, a type of model that generates vector embeddings for data.
We'll also use an embedding model to generate embeddings for our movie catalog data. Let's get started. There are a lot of embedding models to choose from, and by the time you're watching this, there will be even more.
When examining embedding models, you'll want to make note of the number of dimensions the embedding model provides. Currently, Atlas Vector Search supports up to 8,192 dimensions, but by the time you're watching this, that could increase.
A common misconception is that the more dimensions you have, the better the results will be. This is not necessarily true. It's important to experiment with different models to determine what works best for your use case.
For this lesson, we'll use an embedding model from Voyage AI to create embeddings for our movies. When we chose the embedding model, we also created an API key, which we'll use later to request the vector embeddings.
We have an existing movie catalog full of documents that need embeddings, and once we have them, we'll need to keep them up to date. So let's break this down into two steps. First, we'll set up an event trigger that keeps our embeddings up to date whenever a document changes or a new document arrives.
That way, we're covered as our collection grows.
Second, we need a batch job to generate embeddings for our existing documents.
Using this approach, our embeddings stay up to date from now on while we gradually work through the existing documents. We'll create a function that takes the plot field from a document and sends it to the embedding model.
The embedding model will generate a vector embedding and return it. Once complete, we'll add the vector embedding array for the movie document as a new field or update it if it already exists. By the way, this is one of the really cool things about Atlas Vector Search. Our vector embeddings are stored alongside the rest of our data, so we don't need a secondary database just for vector embeddings.
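To make that concrete, here's roughly what a movie document looks like once it has an embedding. The plot_embedding field name matches the field we'll add below; the title, plot, and embedding values are just illustrative placeholders.

```javascript
{
  _id: ObjectId("..."),
  title: "A Sample Movie",                                   // illustrative values
  plot: "A short plot summary used as the embedding input...",
  plot_embedding: [0.0213, -0.0142, 0.0981 /* ...one number per dimension... */]
}
```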
Alright. First up is our event trigger. There are many ways to do this, but Atlas makes it very easy using its trigger facility. We'll configure an Atlas trigger to run a JavaScript function every time a document is inserted, updated, or replaced.
Here on the triggers tab in the Atlas dashboard, click on add a trigger. Use the database trigger type, and we'll have this trigger watch for events at the collection level.
Now let's link the trigger to the cluster, which is labeled Cluster0.
Then we specify which database and collection to listen to.
We tell the trigger to execute every time a document is inserted, updated, or replaced.
Finally, we give the trigger access to the full document.
Now that we've configured the trigger, let's create the function for it. First, we access the document.
Next, we define the URL that we'll send the API request to. After that, we specify the API key for the embedding model; I stored mine in Atlas App Services. Now let's write the POST request to the embedding model.
We'll send the API key in the header. In the body, we specify the field that we want an embedding for. In this case, I'm using the plot field.
We also include the name of the model.
Finally, we specify whether we're sending a query or document. In this case, we're sending a document. Once we receive a response from the embedding model, we need to parse the body text. After that, we'll save the embedding we received to a variable named embedding.
Next, we specify the movie collection that holds our documents.
And now we update our document and set a new field named plot_embedding, which holds the embedding we just generated from the embedding model. Let's wrap up the function with some basic error handling, because you never know what can happen. Once we've added this function, we can give our trigger a name. We'll name it embeddings.
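Put together, the trigger function looks roughly like this. It's a minimal sketch: the Voyage AI endpoint and request shape reflect their docs at the time of writing, and the VOYAGE_API_KEY value name, the voyage-3 model, and the sample_mflix.movies namespace are assumptions you'd swap for your own setup.

```javascript
// Minimal sketch of the trigger function. The value name, model name, and
// database/collection names are assumptions; adjust them to your setup.
exports = async function (changeEvent) {
  // Access the full document from the insert, update, or replace event.
  const doc = changeEvent.fullDocument;
  if (!doc || !doc.plot) return; // nothing to embed

  // URL of the embedding API and the API key stored in Atlas App Services.
  const url = "https://api.voyageai.com/v1/embeddings";
  const apiKey = context.values.get("VOYAGE_API_KEY"); // hypothetical value name

  try {
    // POST the plot text to the embedding model.
    const response = await context.http.post({
      url: url,
      headers: {
        Authorization: [`Bearer ${apiKey}`],
        "Content-Type": ["application/json"],
      },
      body: {
        input: [doc.plot],      // the field we want an embedding for
        model: "voyage-3",      // the embedding model name
        input_type: "document", // "document" for stored data, "query" at search time
      },
      encodeBodyAsJSON: true,
    });

    // Parse the response body text and pull out the embedding array.
    const embedding = JSON.parse(response.body.text()).data[0].embedding;

    // Store the embedding on the same document as a plot_embedding field.
    const movies = context.services
      .get("mongodb-atlas")
      .db("sample_mflix")
      .collection("movies");

    await movies.updateOne({ _id: doc._id }, { $set: { plot_embedding: embedding } });
  } catch (err) {
    console.error(`Failed to generate or store embedding: ${err}`);
  }
};
```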
Finally, let's save the new function and trigger with the save button.
With the event trigger in place, it's time to update the existing corpus of documents.
We're going to use Python to create a function that retrieves the embeddings.
We can then loop through our data and add embeddings to each document.
Obviously, this is not the most efficient approach, especially with large data sets. A more efficient approach would be to batch multiple documents into a single request to the embedding model, but this is just for illustration purposes. Also, keep in mind you are not limited to Python. MongoDB supports a multitude of languages, and we're always adding more. Check out the MongoDB documentation for information on all supported programming languages.
Okay. Let's create a function named getEmbeddings, which accepts a string, the name of the model, our API key, and the input type.
First, we define the URL that we'll send the API request to. You can obtain this URL from the documentation of your embedding model.
Next, we assemble the header that has the API key for the embedding model. Keep this hidden and don't share it.
After that, we create the body of the request, which specifies the field we want the embedding for, the name of the model, and the input type. Remember, it can be either query or document.
Finally, we send the post request to the embedding model and receive the embedding.
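Here's a minimal sketch of that getEmbeddings helper in Python (snake_cased as get_embeddings). The Voyage AI URL and payload shape reflect their docs at the time of writing, so double-check them against the current API reference for your model.

```python
import requests

def get_embeddings(text: str, model: str, api_key: str, input_type: str) -> list[float]:
    """Request a vector embedding for `text` from the embedding model's API."""
    # The endpoint URL comes from the embedding model's documentation.
    url = "https://api.voyageai.com/v1/embeddings"

    # The API key travels in the Authorization header; keep it secret.
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

    # Request body: the text to embed, the model name, and the input type
    # ("document" for stored data, "query" for search queries).
    body = {"input": [text], "model": model, "input_type": input_type}

    response = requests.post(url, headers=headers, json=body, timeout=30)
    response.raise_for_status()
    return response.json()["data"][0]["embedding"]
```

And here's a simple, unbatched loop that backfills the existing documents, assuming a pymongo connection to the same movies collection and connection details in hypothetical environment variables:

```python
import os
from pymongo import MongoClient

client = MongoClient(os.environ["MONGODB_URI"])   # hypothetical environment variables
movies = client["sample_mflix"]["movies"]
api_key = os.environ["VOYAGE_API_KEY"]

# Embed only documents that have a plot but no embedding yet.
for doc in movies.find({"plot": {"$exists": True}, "plot_embedding": {"$exists": False}}):
    embedding = get_embeddings(doc["plot"], "voyage-3", api_key, "document")
    movies.update_one({"_id": doc["_id"]}, {"$set": {"plot_embedding": embedding}})
```

If you do want the more efficient batched approach, the API's input field accepts a list of strings, so you can send several plots per request and write the results back in one pass.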
Now we can take this function and add it to your application code anywhere you need to generate embeddings. Great work. Let's recap. In this video, you learned that an embedding model is a type of AI model that generates vector embeddings for data.
You then learned how to generate embeddings for your data and add the embedding array to your documents. We did this using an Atlas trigger and Python, but you can implement it any way you like.
Now that we have our embeddings, in the next video, we'll learn how they're indexed. See you there.
