Tool use
Function calling: how models select tools
The mechanics of tool selection and invocation.
What "function calling" actually is
The phrase "function calling" sounds like the model executes code. It doesn't. The model never executes anything. What it does is produce a structured request that asks you to call a function, and your code decides whether to honor the request.
A function call from the model is, fundamentally, JSON:
{
  "name": "get_weather",
  "arguments": { "city": "Tokyo" }
}
That's it. The model returns this object instead of free-form text, and your loop is responsible for matching the name to a real Python function and invoking it. The "calling" happens entirely on your side.
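To make "your code decides whether to honor it" concrete, here is a minimal sketch of the dispatch step. The `REGISTRY` dict, the `dispatch` helper, and the canned `get_weather` body are illustrative assumptions, not part of any particular library.

```python
import json

def get_weather(city: str) -> str:
    # Hypothetical stand-in for a real weather lookup.
    return f"(weather for {city})"

# Only functions listed here can ever run, whatever name the model asks for.
REGISTRY = {"get_weather": get_weather}

def dispatch(raw_call: str) -> str:
    """Parse the model's JSON request and decide whether to honor it."""
    call = json.loads(raw_call)
    func = REGISTRY.get(call["name"])
    if func is None:
        # The request names something we never registered, so we refuse.
        return f"refused: unknown tool {call['name']!r}"
    return func(**call["arguments"])

print(dispatch('{"name": "get_weather", "arguments": {"city": "Tokyo"}}'))
print(dispatch('{"name": "delete_files", "arguments": {}}'))
```

The second request is well-formed JSON, but nothing happens because the name is not in the registry. The model can ask for anything; only your code can act.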
How the model picks a tool
You hand the model a list of tool definitions. Each tool has a name, a description, and a JSON schema for its arguments:
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the public web and return result snippets.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                },
                "required": ["query"],
            },
        },
    },
]
The model sees those definitions in its context. When you ask "what's the weather in Tokyo?", the model:
- Reads the user's message
- Reads the tool list
- Decides whether any tool is relevant
- If yes, returns a tool call request. If no, returns a normal text response.
The decision is made the same way the model decides anything: it's a next-token prediction conditioned on the prompt. Tool descriptions are part of the prompt, so good descriptions matter as much as a good system prompt.
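From your loop's point of view, the model's decision shows up as one of two response shapes. The sketch below branches on them, assuming plain dicts shaped like the trace in this lesson; the `handle` helper is hypothetical, and a real client library may return objects rather than dicts.

```python
def handle(message: dict) -> str:
    """Describe what the loop should do next for a given assistant message."""
    tool_calls = message.get("tool_calls") or []
    if tool_calls:
        # The model decided a tool is relevant: it sent requests, not an answer.
        names = [call["function"]["name"] for call in tool_calls]
        return "run tools: " + ", ".join(names)
    # No tool was relevant: the content is an ordinary text reply.
    return "show text: " + message.get("content", "")

tool_reply = {
    "role": "assistant",
    "content": "",
    "tool_calls": [{"function": {"name": "get_weather", "arguments": {"city": "Tokyo"}}}],
}
text_reply = {"role": "assistant", "content": "Hello! How can I help?"}

print(handle(tool_reply))
print(handle(text_reply))
```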
What the model is actually trained on
Modern function-calling models (Llama 3, GPT-4, Claude, Mistral) are fine-tuned on examples of (tool list + user message + correct tool call). The training teaches the model two things:
1. When to call a tool. Given a question that needs external information, output a tool call instead of guessing.
2. How to format the call. Output valid JSON that matches the schema you provided.
Older or weaker models either skip step 1 (they hallucinate answers instead of calling tools) or fail at step 2 (they invent fields, mismatch types, or wrap JSON in prose). One of the practical reasons to use a recent model is that function calling becomes much more reliable.
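Because formatting failures do happen, it can help to check the arguments against the schema before running anything. The following is a minimal sketch that covers only `required`, unknown fields, and simple types; `validate_args` and `JSON_TYPES` are hypothetical names, and a full JSON Schema validator (for example the `jsonschema` package) handles far more cases.

```python
# Map JSON Schema type names to Python types (common cases only).
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}

def validate_args(schema: dict, args: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required field: {name}")
    for name, value in args.items():
        if name not in props:
            errors.append(f"unexpected field: {name}")  # an invented field
            continue
        type_name = props[name].get("type")
        expected = JSON_TYPES.get(type_name)
        # bool is a subclass of int in Python, so reject it for numeric fields.
        bool_as_number = isinstance(value, bool) and type_name in ("integer", "number")
        if expected and (bool_as_number or not isinstance(value, expected)):
            errors.append(f"wrong type for {name}: expected {type_name}")
    return errors

weather_schema = {
    "type": "object",
    "properties": {"city": {"type": "string", "description": "City name"}},
    "required": ["city"],
}

print(validate_args(weather_schema, {"city": "Tokyo"}))
print(validate_args(weather_schema, {"city": 42}))
print(validate_args(weather_schema, {"town": "Tokyo"}))
```

One option is to send the error list back to the model as the tool result, so it gets a chance to correct the call on the next turn.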
A trace of one tool call
Say your messages list contains:
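messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"},
]
You call:
import ollama

# Assumes a tool-capable model tag such as llama3.1; Ollama rejects the
# tools argument for models whose template has no tool support.
response = ollama.chat(model="llama3.1", messages=messages, tools=tools)
The response comes back as a structured object. Pseudocode of what's inside: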
response.message = {
    "role": "assistant",
    "content": "",
    "tool_calls": [
        {
            "id": "call_abc123",
            "function": {
                "name": "get_weather",
                "arguments": {"city": "Tokyo"},
            },
        },
    ],
}
content is empty because the model has nothing to say yet. It's making a request. Your loop then runs get_weather(city="Tokyo"), captures "22C, sunny," and appends:
messages.append(response.message)  # the assistant's tool call
messages.append({
    "role": "tool",
    "content": "22C, sunny",
    "tool_call_id": "call_abc123",
})
The next call to the model sees all four messages. Now it has the data and can produce a final text response.
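The whole trace can be run end to end without a model server. In this sketch, `fake_model` is a hypothetical stand-in for the real chat call that mimics the two responses described above, and `run`, `REGISTRY`, and the canned weather string are likewise assumptions for illustration.

```python
def get_weather(city: str) -> str:
    return "22C, sunny"  # hypothetical canned result

REGISTRY = {"get_weather": get_weather}

def fake_model(messages: list) -> dict:
    """Stand-in for the chat call: request a tool first, then answer."""
    if messages[-1]["role"] == "tool":
        return {"role": "assistant",
                "content": f"It is {messages[-1]['content']} in Tokyo."}
    return {
        "role": "assistant",
        "content": "",
        "tool_calls": [{
            "id": "call_abc123",
            "function": {"name": "get_weather", "arguments": {"city": "Tokyo"}},
        }],
    }

def run(messages: list, max_turns: int = 5) -> str:
    for _ in range(max_turns):
        reply = fake_model(messages)
        messages.append(reply)              # keep the assistant turn in history
        calls = reply.get("tool_calls")
        if not calls:
            return reply["content"]         # plain text means we are done
        for call in calls:
            fn = call["function"]
            result = REGISTRY[fn["name"]](**fn["arguments"])
            messages.append({"role": "tool", "content": str(result),
                             "tool_call_id": call["id"]})
    raise RuntimeError("model kept requesting tools")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Tokyo?"},
]
answer = run(messages)
print(answer)
print([m["role"] for m in messages])
```

The second call to `fake_model` sees four messages (system, user, assistant, tool), and its text reply becomes the fifth. The `max_turns` cap is a design choice to stop a model that never stops asking for tools.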
Tool calls vs prompting
A common alternative to function calling is to ask the model to output a special text format ("Action: search\nArgs: weather Tokyo") and parse it yourself. This was the standard approach pre-2023 and is still how some research papers describe ReAct. It works, but it costs you reliability and effort compared with the native route:
| Approach | Reliability | Effort | Notes |
|---|---|---|---|
| Native function calling | High | Low | The model is fine-tuned for this |
| Custom prompt format | Medium | Medium | Works but you parse and validate |
| Plain text "I'll call X" | Low | High | The model often skips it or makes things up |
For new agents, use native function calling whenever your model supports it. We'll build a manual ReAct parser in Module 4 because it's instructive, but in production you should default to the native API.
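To see what "you parse and validate" means for the custom prompt format, here is a toy parser for the `Action:` / `Args:` layout quoted above. It is only a sketch with a hypothetical `parse_action` helper, not the parser built later in the course, and it assumes the model emits exactly those two labels on consecutive lines.

```python
import re

# "Action: <name>" on one line, "Args: <free text>" on the next.
ACTION_RE = re.compile(r"^Action:[ \t]*(\w+)[ \t]*\nArgs:[ \t]*(.+)$", re.MULTILINE)

def parse_action(text: str):
    """Return (tool_name, raw_args), or None if the model ignored the format."""
    match = ACTION_RE.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()

print(parse_action("Action: search\nArgs: weather Tokyo"))
print(parse_action("I'll call search for you."))
```

Note that the arguments come back as an untyped string. Everything a schema would give you (field names, types, required values) is left for you to enforce.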
Multiple tool calls in one response
The model can request multiple tools in a single response. Most APIs surface this as a list:
# registry maps tool names to Python callables, e.g. {"get_weather": get_weather}
for call in response.message.tool_calls:
    result = registry[call.function.name](**call.function.arguments)
    messages.append({"role": "tool", "content": str(result), "tool_call_id": call.id})
This matters for performance. If a model needs to look up three different cities, it can request all three in one turn instead of three sequential turns. You can run them in parallel:
import concurrent.futures

with concurrent.futures.ThreadPoolExecutor() as ex:
    # Submit every requested call at once; keying by call.id keeps each result matched to its request.
    futures = {
        call.id: ex.submit(registry[call.function.name], **call.function.arguments)
        for call in response.message.tool_calls
    }
    # future.result() blocks until that particular call has finished.
    for call_id, future in futures.items():
        messages.append({"role": "tool", "content": str(future.result()), "tool_call_id": call_id})
Parallel tool execution is one of the easiest performance wins in an agent system.
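A small timing sketch shows where the win comes from. The `slow_weather` function is a hypothetical tool that simulates network latency with `time.sleep`; real numbers depend on your tools and network.

```python
import concurrent.futures
import time

def slow_weather(city: str) -> str:
    time.sleep(0.2)  # simulated network latency
    return f"{city}: sunny"

cities = ["Tokyo", "Paris", "Lima"]

# One after another: total time is roughly the sum of the waits.
start = time.perf_counter()
sequential = [slow_weather(c) for c in cities]
sequential_s = time.perf_counter() - start

# All at once: total time is roughly the longest single wait.
start = time.perf_counter()
with concurrent.futures.ThreadPoolExecutor() as ex:
    parallel = list(ex.map(slow_weather, cities))  # map preserves input order
parallel_s = time.perf_counter() - start

print(f"sequential {sequential_s:.2f}s, parallel {parallel_s:.2f}s")
```

Threads help here because the tools spend their time waiting on I/O. For CPU-bound tools in standard CPython, a thread pool will generally not give the same speedup.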
What 'tool' and 'function' really are
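The OpenAI, Anthropic, and Ollama APIs all use slightly different field names. OpenAI's current API uses tools and tool_calls where its older one used functions and function_call; Anthropic declares the argument schema under input_schema and returns tool_use content blocks. The shape is the same: a name, a JSON schema for arguments, and a structured response object. If you understand one, you can read all of them.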
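One way to see that the shape is shared is to reduce each provider's tool call to the same `(name, arguments)` pair. The sample payloads below are hand-written approximations of each API's format as I understand them, and `normalize` is a hypothetical helper; check the current provider documentation before relying on exact field names.

```python
import json

def normalize(call: dict) -> tuple:
    """Reduce a provider-specific tool call to (name, arguments_dict)."""
    if call.get("type") == "tool_use":      # Anthropic-style content block
        return call["name"], call["input"]
    fn = call["function"]                   # OpenAI- and Ollama-style
    args = fn["arguments"]
    if isinstance(args, str):               # OpenAI sends arguments as a JSON string
        args = json.loads(args)
    return fn["name"], args

openai_style = {"id": "call_abc123", "type": "function",
                "function": {"name": "get_weather", "arguments": '{"city": "Tokyo"}'}}
anthropic_style = {"type": "tool_use", "id": "toolu_01", "name": "get_weather",
                   "input": {"city": "Tokyo"}}
ollama_style = {"function": {"name": "get_weather", "arguments": {"city": "Tokyo"}}}

for sample in (openai_style, anthropic_style, ollama_style):
    print(normalize(sample))
```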
Key takeaway
Function calling is not "the model running your code." It's the model returning a structured JSON request that your code decides whether to honor. The model picks a tool by reading its description, so the quality of your tool descriptions controls how often the model picks the right one. The next lesson is entirely about that: how to write tool descriptions and schemas the model will actually use correctly.