AI Agents with MongoDB / Understand how tools are called by agents

5:33
Welcome back. In our previous lessons, we set up the environment, connected to MongoDB, and created two powerful tools to retrieve information from the database. Now we need to give the LLM access to those tools. In this lesson, we'll create a prompt that enables the LLM to call the tools we've created. This allows our AI agent to answer user queries by fetching the right information from our MongoDB collections. Let's get started.

To understand how LLMs can use tools, we first have to discuss a feature called function calling. This capability allows language models to interact with external tools by generating structured outputs that can be used as function arguments. To call a function, the LLM analyzes a user's request and decides which function would be most helpful. Then, it generates the specific inputs needed. When an LLM makes a function call, it produces a structured output containing the name of the function and the properly formatted arguments. Most major language models today support function calling, including models from OpenAI, Anthropic, and Google. However, since there is no open standard and the implementation details can vary significantly between providers, it's important to check your model's documentation for the exact syntax and capabilities.

Now let's implement function calling for our agent. Just a heads up: we'll be working in the main function of our application, which is where everything comes together. First, we specify the model. We're using GPT-4o with a temperature of zero. Setting the temperature this way gives very consistent, deterministic responses, which is exactly what we want for tool usage. Keep in mind that you can choose whichever model you like, as long as it supports function calling. Next, let's put together the prompt that will guide our LLM's behavior. This tells the model how to think and when to use tools. This is a simple prompt.
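To make the idea concrete, here is a hand-written sketch of the kind of structured output a model might generate when it decides to call a tool. The exact field names vary between providers (which is why checking your model's documentation matters), and the `user_query` argument name is an assumption for illustration:

```python
import json

# Hypothetical shape of a function call generated by an LLM.
# Field names ("name", "arguments") and the argument key "user_query"
# are illustrative; real providers use their own schemas.
tool_call = {
    "name": "get_information_for_question_answering",
    "arguments": json.dumps(
        {"user_query": "What are some best practices for data backups in MongoDB?"}
    ),
}

# The application parses the arguments and dispatches to the matching function.
args = json.loads(tool_call["arguments"])
print(tool_call["name"], "->", args["user_query"])
```

The key point is that the model never executes anything itself: it only emits the function name and arguments, and your code does the actual call.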
A more realistic prompt would include additional instructions that set clear objectives, provide context, specify the format of the final output, and handle potential errors. Back to our prompt. Notice the tool_names placeholder? This is where we inject the names of the tools. By using a placeholder, we can easily update the tools without rewriting the prompt.

Now we need to fill in this placeholder. To do this, we'll use partials in LangChain. Partials are a powerful way to prefill values in templates. Think of them like a form where some fields are already filled in. Here, we extract the names of the tools, join them with commas, and prefill the tool_names placeholder. This creates a reusable prompt template with these values already inserted.

Finally, let's bind the tools to the LLM so we can use them. LangChain makes this easy with the bind_tools method, which accepts a list of tools. After that, we create a chain from the prompt and the tool-bound LLM using the pipe operator (|). In LangChain, chains are a way to compose different components into a single processing pipeline. In this case, we're creating a chain that formats the prompt and then passes it to the LLM that has our tools bound to it.

Now that we've set up our LLM with tools, let's test whether it chooses the correct tool when prompted with different types of questions. To do this, we call invoke on our chain with test queries. We then access the tool_calls property of the response, which shows us which tools the LLM decided to use and what arguments it passed to them. This way, we can evaluate the LLM's ability to select the appropriate tool for different types of requests. Keep in mind, this test is just for demonstration purposes. The first query, "What are some best practices for data backups in MongoDB?", is designed to be broad and should lead to the activation of our vector search tool.
The second query, "Give me a summary of the page titled Create a MongoDB Deployment", explicitly requests a summary of a specific document. This should lead the agent to select the second tool. Let's run the code and see what happens.

We can see the LLM correctly identified that the first question about backup best practices should use our get_information_for_question_answering tool. And for the second question, asking for a summary of a specific page, it correctly chose the get_page_content_for_summarization tool. Notice how the LLM also properly formatted the arguments for each tool call, extracting the key information from the user's query. This is the power of function calling in action.

Awesome job. The agent is successfully identifying when to use the vector search tool for general questions about MongoDB and when to use the page content tool for summarization requests. Since this was just for testing purposes, let's remove those tool call checks from our code. We don't need them in our final implementation.

With that complete, let's take a moment to recap what we learned. We learned that function calling allows language models to interact with external tools by generating structured outputs that can be used as function arguments. After that, we used LangChain to give the LLM access to the tools we created. Finally, we tested that the LLM chose the correct tool based on the user's query.
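As a closing illustration, the prompt-then-LLM pipeline we composed with the pipe operator can be mocked in a few lines of plain Python. This is not the real LangChain API, just a conceptual stand-in: each stage exposes invoke, and | wires one stage's output into the next. The fake "LLM" here always picks the vector search tool, and the tool_calls shape and user_query key are illustrative:

```python
class Runnable:
    """Minimal stand-in for a LangChain runnable: one invoke() step."""
    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # The pipe operator composes stages: self runs first, then other.
        return Runnable(lambda value: other.invoke(self.invoke(value)))

# Stage 1: a "prompt" with the tool names prefilled (the partial idea).
tool_names = ", ".join([
    "get_information_for_question_answering",
    "get_page_content_for_summarization",
])
prompt = Runnable(lambda q: f"Tools: {tool_names}\nQuestion: {q}")

# Stage 2: a fake "LLM" that always selects the first tool.
# A real model would choose based on the question's content.
llm = Runnable(lambda text: {
    "tool_calls": [{
        "name": "get_information_for_question_answering",
        "args": {"user_query": text.split("Question: ", 1)[1]},
    }]
})

chain = prompt | llm
response = chain.invoke("What are some best practices for data backups in MongoDB?")
print(response["tool_calls"][0]["name"])
```

The real chain works the same way in spirit: invoke formats the prompt, hands it to the tool-bound LLM, and the response carries a tool_calls list you can inspect, exactly as we did in our test.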