The orchestration loop
Error handling inside the loop
Tool failures, hallucinated actions, and stuck loops.
Errors are normal, not exceptional
Inside an agent loop, things go wrong constantly. Tools time out. APIs return 500s. The model hallucinates a tool name that doesn't exist. The model passes the wrong arguments. A search returns zero results. JSON fails to parse.
In a normal program, these would crash the script. In an agent loop, they're just another observation to feed back to the model. Done well, the model recovers and tries something else. Done poorly, one error tears down the whole loop.
There are four classes of error you need to handle inside the loop.
1. Tool execution errors
The most common case. A tool you defined raises an exception. Maybe the network failed, maybe a file doesn't exist, maybe the database returned an error.
The wrong way is to let it propagate:
```python
# DON'T DO THIS
for call in message.tool_calls:
    result = tool_registry[call.function.name](**call.function.arguments)
    # If this raises, the entire loop dies
```

The right way is to catch it and return the error as a tool observation:
```python
for call in message.tool_calls:
    try:
        result = tool_registry[call.function.name](**call.function.arguments)
        content = str(result)
    except Exception as e:
        content = f"ERROR: {type(e).__name__}: {e}"
    messages.append({
        "role": "tool",
        "content": content,
        "tool_call_id": call.id,
    })
```

Now the model sees the error, can reason about it, and can try a different approach. Models are surprisingly good at recovering from clear error messages, especially if the message includes which arguments failed.
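One way to include the failing arguments is a small formatting helper. This is a sketch; `format_tool_error` is a hypothetical name, not part of any library:

```python
import json

def format_tool_error(tool_name, args, exc):
    """Format an exception as an observation the model can act on.

    Naming the tool and echoing the arguments that failed gives the
    model something concrete to correct on its next attempt.
    """
    return (
        f"ERROR calling {tool_name} with arguments {json.dumps(args)}: "
        f"{type(exc).__name__}: {exc}"
    )

# Example: a file read that fails at runtime
try:
    open("/no/such/file").read()
except Exception as e:
    msg = format_tool_error("read_file", {"path": "/no/such/file"}, e)
```

The resulting message names both the tool and the bad input, so the model can retry with a corrected path instead of guessing what went wrong.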
2. Hallucinated tool names
The model invents a tool that doesn't exist. This happens more often than you'd expect, especially with smaller models.
```python
for call in message.tool_calls:
    name = call.function.name
    if name not in tool_registry:
        messages.append({
            "role": "tool",
            "content": f"ERROR: Tool '{name}' does not exist. Available tools: {list(tool_registry.keys())}",
            "tool_call_id": call.id,
        })
        continue
    # ... execute as normal ...
```

Listing the available tools in the error message is the key part. Without it the model often invents a different nonexistent tool on the next turn. With it, the model usually picks a real one.
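You can go a step further and suggest the closest real name using the standard library's `difflib`, since hallucinated names are often near-misses of real ones. `unknown_tool_message` here is an illustrative helper, not an established API:

```python
import difflib

def unknown_tool_message(name, registry):
    """Build an error observation for a hallucinated tool name,
    suggesting the closest real tool when one exists."""
    available = list(registry)
    close = difflib.get_close_matches(name, available, n=1)
    hint = f" Did you mean '{close[0]}'?" if close else ""
    return f"ERROR: Tool '{name}' does not exist.{hint} Available tools: {available}"

registry = {"web_search": None, "read_file": None}
print(unknown_tool_message("web_serch", registry))
```

A "did you mean" hint tends to shortcut the recovery loop: the model corrects the typo in one turn instead of re-reading the tool list.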
3. Bad arguments
The model calls a real tool with wrong or malformed arguments. Maybe a required field is missing, maybe a string was passed where an integer was expected.
You can validate against the tool schema before executing:
```python
import jsonschema

def call_tool(tool_def, fn, args):
    try:
        jsonschema.validate(args, tool_def["parameters"])
    except jsonschema.ValidationError as e:
        return f"ERROR: Invalid arguments. {e.message}. Schema: {tool_def['parameters']}"
    try:
        return str(fn(**args))
    except Exception as e:
        return f"ERROR: {type(e).__name__}: {e}"
```

Including the schema in the error gives the model a fighting chance to fix its own mistake. We dig deeper into schema design in the next module.
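If you'd rather not take on the `jsonschema` dependency, a minimal hand-rolled check still catches the two most common model mistakes: missing required fields and wrong basic types. This is a sketch that assumes your tool schemas follow the usual JSON Schema shape (`properties`, `required`, `type`):

```python
def validate_args(schema, args):
    """Minimal validation: required keys present, basic types match.
    Returns None when valid, otherwise an error string for the model."""
    type_map = {"string": str, "integer": int, "number": (int, float),
                "boolean": bool, "array": list, "object": dict}
    for key in schema.get("required", []):
        if key not in args:
            return f"Missing required argument '{key}'"
    for key, value in args.items():
        spec = schema.get("properties", {}).get(key)
        if spec and "type" in spec:
            expected = type_map.get(spec["type"])
            if expected and not isinstance(value, expected):
                return f"Argument '{key}' should be {spec['type']}, got {type(value).__name__}"
    return None

schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}, "days": {"type": "integer"}},
    "required": ["city"],
}
print(validate_args(schema, {"days": 3}))               # missing 'city'
print(validate_args(schema, {"city": "Oslo", "days": 3}))  # None: valid
```

This covers far less than full JSON Schema (no nested objects, no enums), but for flat tool signatures it is often enough.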
4. Malformed model output
Modern function-calling APIs handle most parsing for you, but it still happens: the model returns invalid JSON, claims to call a tool but doesn't include the call, or returns an empty response.
```python
response = ollama.chat(model="llama3", messages=messages, tools=tools)
message = response.message

if not message.content and not message.tool_calls:
    # Model returned nothing useful. Nudge it.
    messages.append({
        "role": "user",
        "content": "Your last response was empty. Please continue or finalize your answer."
    })
    continue
```

This kind of nudge is cheap. One extra turn beats a silent failure.
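It is worth capping the nudges, though, so a persistently broken model can't spin forever. A sketch, where `model_call` is a stand-in for your `ollama.chat` wrapper and returns a plain message dict (both names are assumptions for illustration):

```python
def chat_with_nudges(model_call, messages, max_nudges=2):
    """Re-prompt on empty responses, but give up after max_nudges retries
    so a broken model can't loop forever."""
    for _ in range(max_nudges + 1):
        message = model_call(messages)
        if message.get("content") or message.get("tool_calls"):
            return message
        messages.append({
            "role": "user",
            "content": "Your last response was empty. Please continue or finalize.",
        })
    return {"content": "ERROR: model kept returning empty responses",
            "tool_calls": None}

# Fake model for illustration: empty once, then a real answer.
replies = iter([
    {"content": "", "tool_calls": None},
    {"content": "Done.", "tool_calls": None},
])
result = chat_with_nudges(lambda msgs: next(replies), [])
print(result["content"])  # Done.
```

One or two retries is usually plenty; if the model is still silent after that, something upstream (prompt, context length, model choice) is the real problem.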
Putting it together
Here is a loop with all four error classes handled:
```python
import ollama

def run_agent(goal, tools, tool_registry, max_iterations=15):
    messages = [{"role": "system", "content": f"Goal: {goal}"}]
    available = list(tool_registry.keys())

    for step in range(max_iterations):
        try:
            response = ollama.chat(model="llama3", messages=messages, tools=tools)
        except Exception as e:
            return {"ok": False, "reason": f"model_error: {e}"}

        message = response.message
        messages.append(message)

        # Class 4: empty response
        if not message.content and not message.tool_calls:
            messages.append({"role": "user", "content": "Empty response. Continue or finalize."})
            continue

        if not message.tool_calls:
            return {"ok": True, "answer": message.content}

        for call in message.tool_calls:
            name = call.function.name

            # Class 2: hallucinated tool
            if name not in tool_registry:
                messages.append({
                    "role": "tool",
                    "content": f"ERROR: Tool '{name}' not found. Available: {available}",
                    "tool_call_id": call.id,
                })
                continue

            # Class 1 + 3: execution and argument errors
            try:
                result = tool_registry[name](**call.function.arguments)
                content = str(result)
            except TypeError as e:
                content = f"ERROR: Bad arguments to {name}: {e}"
            except Exception as e:
                content = f"ERROR: {type(e).__name__} in {name}: {e}"

            messages.append({"role": "tool", "content": content, "tool_call_id": call.id})

    return {"ok": False, "reason": "max_iterations"}
```

This loop will not crash on a single tool failure. It also gives the model enough information to recover most of the time.
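For reference, here is what you might pass in: a registry of plain Python callables and a matching `tools` list whose schemas mirror the registry. `get_weather` is a made-up example tool, not a real API:

```python
def get_weather(city: str) -> str:
    # Stub implementation for illustration; a real tool would call an API.
    return f"Sunny in {city}"

tool_registry = {"get_weather": get_weather}

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# run_agent(goal="What's the weather in Oslo?", tools=tools,
#           tool_registry=tool_registry)
```

Keeping the registry and the schema list side by side makes drift between them easy to spot; some codebases generate the schema list from the registry to rule it out entirely.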
The recovery principle
The pattern across all four classes is the same: convert errors into observations. The model already knows how to react to information. It's much better at reasoning about an error message than it is at avoiding errors in the first place.
| Error class | What it looks like | What you do |
|---|---|---|
| Tool exception | Network, file, API failure | Catch, format as tool result |
| Hallucinated name | Tool doesn't exist | Return error with list of real tools |
| Bad arguments | Missing/malformed fields | Return error with the schema |
| Empty model output | No content, no tool calls | Inject a user-role nudge |
Don't swallow real bugs
"Convert errors into observations" applies to runtime errors the model can plausibly recover from. It does not apply to bugs in your own code (a typo in a function name, a missing import). For those, let the exception propagate during development. A useful split: catch the tool body, but not the registry lookup itself once you trust your registration code.
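That split can be made explicit in code. In this sketch, the registry lookup is deliberately unguarded so your own bugs crash loudly, while the tool body is caught and returned as an observation (`execute_call` is an illustrative name):

```python
def execute_call(name, args, tool_registry):
    # Registry lookup is NOT wrapped: a missing registration is a bug in
    # our own code and should crash loudly during development.
    fn = tool_registry[name]  # KeyError here means our bug -- let it raise

    # The tool body IS wrapped: runtime failures here are expected, and
    # the model gets them back as observations.
    try:
        return str(fn(**args))
    except Exception as e:
        return f"ERROR: {type(e).__name__}: {e}"

registry = {"divide": lambda a, b: a / b}
print(execute_call("divide", {"a": 1, "b": 0}, registry))  # ERROR: ZeroDivisionError: division by zero
```

In the full loop above, class-2 handling screens out unknown names before this function runs, so in practice a `KeyError` here really does mean a registration bug rather than a hallucination.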
Key takeaway
A robust loop treats errors like any other observation. Catch them, format them as a tool message, and let the model decide what to do. Four error classes cover almost everything: tool exceptions, missing tools, bad arguments, empty responses. Handle all four and your agent goes from "fragile demo" to "actually deployable."
The next module switches focus from the loop to the tools themselves: how the model picks them, how to describe them, and how to design schemas the model will use correctly.