AI Agents with MongoDB / Understand how tools are called by agents
On their own, large language models, or LLMs, can do some amazing things. They can write essays, explain complex topics, and even generate code. But they don't get everything right, and they're limited to the information they were trained on.
When we build an AI agent, we provide LLMs with tools they can use to access information and perform tasks beyond their training data.
In fact, you might have already seen this concept in action. Some LLMs, like Google's Gemini, already use tools by performing web searches to provide you with up-to-date information when answering your questions. In this lesson, we'll explore exactly how agents use tools to enhance their capabilities.
We'll also build two tools that our agents will use. At their core, tools are specialized functions that extend an agent's capabilities beyond what the language model can do on its own. Let's review our AI agent diagram to see how our agents extend their reasoning capabilities with tools to achieve their goals. The Tools node in our diagram highlights the tools we're building.
These are specialized components that our agent can access when needed. You'll see that we'll be creating two tools for our AI agent. Why do agents need tools? Well, while language models can generate text and reason about information they were trained on, they can't access real time information, private data, or perform functions beyond their native capabilities.
Tools bridge this gap by allowing agents to interact with external systems, data sources, and perform operations not supported by the LLM.
When an agent uses tools, it follows a logical process. It determines which tool is appropriate and if any are necessary.
It formats the input for the tool, calls the tool, and then incorporates the response into its reasoning to generate a complete answer. Many different types of tools exist in the Agent ecosystem.
Tools can make API calls to external services, perform database queries, add in plugins to extend functionality, and perform web searches for current information.
This is not an exhaustive list, but it gives you a sense of the kind of tools we can create.
Our MongoDB tools in this demo follow this pattern by giving the agent access to specific data it wouldn't otherwise have. We'll create two specialized tools. The first is a vector search tool that will query our MongoDB collection to find relevant information for answering questions.
The second is a MongoDB Query API tool that retrieves specific documentation pages by title for summarization.
By implementing these tools, we'll give our agent the ability to provide accurate, up-to-date information about MongoDB that goes beyond what the language model knows on its own. Now that we understand what tools are, let's build our own. For this, we'll use LangChain's @tool decorator. The @tool decorator registers a regular Python function as a tool in the LangChain ecosystem.
When we decorate a function with @tool, LangChain automatically makes it available to the agent, along with important metadata like the function's name, description, and parameter details. One important thing to note is that models will perform better if our tools have well-chosen names, clear descriptions, and properly defined schemas.
This context helps the model understand when and how to use the tool effectively.
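As a minimal sketch of how the decorator works (the import path matches recent LangChain releases, and the example function is purely illustrative):

```python
# Minimal sketch of the @tool decorator. The try/except fallback lets
# this sketch run even without LangChain installed.
try:
    from langchain_core.tools import tool
except ImportError:
    def tool(func):  # no-op stand-in for the decorator
        return func

@tool
def add_numbers(a: int, b: int) -> int:
    """Add two integers and return their sum."""
    return a + b

# LangChain derives the tool's metadata from the function itself: its
# name from the function name, its description from the docstring, and
# its argument schema from the type hints.
```

This is why the docstring matters so much: it becomes the description the model reads when deciding whether to call the tool.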
Let's begin by creating a component of our Vector Search tool. We'll create a helper function that retrieves embeddings.
This function will be used to generate vector embeddings for the user's query.
That way, we can perform a vector search against our data in MongoDB.
To start, we define a function called generate_embedding that takes a string input and returns a list of floating point numbers. Keep in mind, this is just a helper function, which won't be called directly by the LLM, so we aren't using the @tool decorator.
The function creates a Voyage AI client using our API key, then calls the embed method with our text. It specifies the voyage-3-lite model and the query as the input type. Finally, it returns the first embedding from the result.
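A sketch of that helper, assuming the voyageai SDK and an API key read from an environment variable (the variable name here is an illustrative choice):

```python
import os

def generate_embedding(text: str) -> list[float]:
    """Generate a vector embedding for `text` with Voyage AI."""
    # Imported lazily so this sketch loads even without the SDK installed.
    import voyageai

    # VOYAGE_API_KEY is an assumed environment variable name.
    client = voyageai.Client(api_key=os.environ["VOYAGE_API_KEY"])
    # voyage-3-lite with input_type="query" matches the settings
    # described in the lesson.
    result = client.embed([text], model="voyage-3-lite", input_type="query")
    return result.embeddings[0]
```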
With the helper function complete, let's build the Vector Search tool. Here, we've used the @tool decorator to craft the first tool. Remember, our function needs a clear, descriptive name and a concise description of its purpose. These elements will help the agent select the appropriate tool based on the query and expected response.
We'll name this function get_information_for_question_answering.
Inside this function, we start by generating the embedding for the user's query with the helper function. Next, we access the chunked_docs collection by retrieving the element at index 1 from the init_mongodb function, and then we build an aggregation pipeline.
The pipeline has two stages. First, a $vectorSearch stage that searches for documents with embeddings similar to the query embedding. Notice how we specify the index name, the field containing embeddings, the query vector, and parameters like numCandidates and limit.
Next, we have a $project stage that shapes the output. Here, we're only including the document body.
After running the aggregation, we join all the retrieved document bodies with newlines to create a single string of context, which we then return.
This means we're combining the body field of the five most relevant documents as determined by the similarity function selected during indexing, concatenated into a single string.
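Putting those stages together, the tool might be sketched as below. The index name ("vector_index"), the embedding field name, and the numCandidates value are assumptions, and generate_embedding and chunked_docs are expected to come from the earlier setup code:

```python
# Fallback decorator so this sketch runs without LangChain installed.
try:
    from langchain_core.tools import tool
except ImportError:
    def tool(func):
        return func

def build_vector_search_pipeline(query_embedding: list[float]) -> list[dict]:
    """Build the two-stage aggregation pipeline for the vector search."""
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",       # assumed index name
                "path": "embedding",           # assumed field holding vectors
                "queryVector": query_embedding,
                "numCandidates": 150,          # assumed candidate pool size
                "limit": 5,                    # top five matches
            }
        },
        {"$project": {"_id": 0, "body": 1}},   # keep only the document body
    ]

@tool
def get_information_for_question_answering(query: str) -> str:
    """Retrieve information relevant to answering a user's MongoDB question."""
    # `generate_embedding` and `chunked_docs` come from the earlier setup.
    pipeline = build_vector_search_pipeline(generate_embedding(query))
    results = chunked_docs.aggregate(pipeline)
    # Join the retrieved bodies into one context string for the LLM.
    return "\n".join(doc["body"] for doc in results)
```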
The LLM will use this context to generate a meaningful response to the user's query. Now that we have the Vector Search tool, let's create the second tool which will query a document by the title field.
We'll use the @tool decorator again, but this time, the function looks up a specific document by title. We'll give it a descriptive name like get_page_content_for_summarization.
We create a query that matches the title field to the user's input, and we use a projection to return only the document body. If a matching document is found, we return its body. Otherwise, we return an error message. Keep in mind that this is a find command, so the field needs to be an exact match to return anything. In our case, each title is unique, but that is something you may have to consider and plan for accordingly. With this tool, our agent will receive a full page of MongoDB documentation, which it will summarize to answer a user's query. With these two tools, our agent will be able to both search for relevant information across the documentation and retrieve specific pages for summarization.
Now that the tools are complete, let's use them. We'll need to update the main function to test each tool with a few example queries. The first thing we'll add is a list called tools that contains the two functions.
This list will be crucial later when we integrate the tools with the agent's prompt template. To test the tools, let's call each one directly with sample queries. We'll use the vector search tool to ask about MongoDB backup best practices.
With the document retrieval tool, we'll request the content from the page titled "Create a MongoDB Deployment".
These direct tool invocations allow us to quickly check that the tools are working before we integrate them with the agent's decision making process. We can check that the vector search is finding relevant content and the document lookup is retrieving the correct pages. When we run this code, we'll see substantial output from both tools. The vector search tool provides the five most relevant documents about MongoDB backup best practices, pulling from various documentation chunks. The document retrieval tool returns the full content of the "Create a MongoDB Deployment" page. This confirms that both tools are working for demonstration purposes, but proper testing is required for production environments.
Now, let's clean up the main function by removing the tool calls. In the next lesson, we'll integrate these tools into the agent's decision making process, so we'll want to leave the main function ready for that next step. Great job. Let's recap what we've learned in this lesson.
We started by understanding what tools are: specialized functions that extend an agent's capabilities beyond what the language model can do on its own. We saw that tools can take many forms, from API calls to database queries. We then explored how to create tools using LangChain's @tool decorator, which turns regular Python functions into tools that an agent can use. We built two specific tools.
First, we made a vector search tool that finds relevant information in our chunked_docs collection based on the user's query. The response from this tool will help the agent answer questions about MongoDB.
And second, we created a document retrieval tool that fetches complete documentation pages by title. This tool will help our agent to summarize entire pages of the MongoDB documentation.
We also learned the importance of providing good names, descriptions, and parameter definitions for tools to help the agent understand when and how to use them. By building tools, we've taken a significant step toward creating an agent. Our agent will access and use information beyond what's contained in the language model itself. This is a key capability that makes AI agents powerful and useful.
