The orchestration loop
Anatomy of the orchestration loop
Input, reasoning, action selection, execution, observation, repeat.
The five phases
In the last module we sketched a three-step loop: perceive, reason, act. That's the right intuition, but when you actually sit down to build a robust loop, three steps become five. Every iteration of a real agent moves through these phases in order:
- Input assembly. Build the message list the model will see.
- Reasoning. Send the messages to the model and get its decision.
- Action selection. Parse the response into either a tool call or a final answer.
- Execution. Run the tool and capture its output.
- Observation. Append the tool result to the message list. Loop back to phase 1.
Four of these are pure Python. Only one is an LLM call. That ratio matters. Most of what makes an agent reliable is the boring scaffolding around the model.
A loop you can actually read
Here is the same agent from the last module, rewritten to make every phase explicit.
```python
import ollama

def run_agent(goal, tools, tool_registry, max_iterations=10):
    # 1. INPUT: seed the conversation
    messages = [
        {"role": "system", "content": f"Your goal: {goal}"},
    ]

    for step in range(max_iterations):
        # 2. REASONING: model sees the full state, decides what to do
        response = ollama.chat(
            model="llama3",
            messages=messages,
            tools=tools,
        )
        message = response.message
        messages.append(message)

        # 3. ACTION SELECTION: tool call, or final answer?
        if not message.tool_calls:
            return {"answer": message.content, "steps": step + 1}

        # 4. EXECUTION: run each tool the model picked
        for call in message.tool_calls:
            fn = tool_registry[call.function.name]
            result = fn(**call.function.arguments)

            # 5. OBSERVATION: feed result back into the message list
            messages.append({
                "role": "tool",
                "content": str(result),
                "tool_name": call.function.name,
            })

    return {"answer": None, "steps": max_iterations, "reason": "max_iterations"}
```

Read it once and notice: the model only shows up in phase 2. Everything else is a list, a dictionary lookup, or a function call. The "intelligence" of your agent lives in the model. The "reliability" of your agent lives in everything else.
Why each phase exists
Input assembly
You decide what the model gets to see. That includes the system prompt, the goal, the conversation so far, and any tool results from previous iterations. This is the lever you pull when the agent is confused or distracted. Phase 1 is where context engineering happens (we cover this in Module 5).
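As a minimal sketch of what phase 1 can look like, here is an illustrative helper that pins the system prompt and goal while trimming older turns (the `assemble_input` name and the trimming rule are our illustration, not part of the loop above):

```python
def assemble_input(system_prompt, goal, history, max_history=20):
    """Build the message list for this iteration.

    The system prompt and goal are always included; older
    conversation turns are dropped so the context stays bounded.
    """
    pinned = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"Your goal: {goal}"},
    ]
    # Keep only the most recent turns; everything older is dropped.
    return pinned + history[-max_history:]

history = [{"role": "assistant", "content": f"step {i}"} for i in range(30)]
msgs = assemble_input("You are a helpful agent.", "Find the weather", history)
# 2 pinned messages + the 20 most recent history turns
```

Dropping old turns is the crudest possible policy; the point is that the decision lives in your code, before the model ever sees a token.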
Reasoning
The only phase where you spend tokens. Treat it as expensive. Every other phase should be designed to make this one count.
Action selection
Modern models return a structured response that either contains tool calls or doesn't. The absence of a tool call is the model saying "I'm done thinking." Most agent bugs are really action selection bugs: the model called the wrong tool, called a tool with bad arguments, or stopped when it shouldn't have.
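The selection logic itself is only a few lines. Here is an illustrative version operating on plain response dicts (the `select_action` name is ours, not a library API):

```python
def select_action(message):
    """Classify the model's response: a tool request or a final answer."""
    tool_calls = message.get("tool_calls")
    if tool_calls:
        return ("tool", tool_calls)
    # No tool calls is the model saying "I'm done thinking."
    return ("final", message.get("content", ""))

# A response that requests a tool:
action, _ = select_action(
    {"role": "assistant",
     "tool_calls": [{"function": {"name": "get_weather",
                                  "arguments": {"city": "Tokyo"}}}]}
)
# action == "tool"

# A response with no tool calls ends the loop:
action, answer = select_action({"role": "assistant", "content": "22C and sunny."})
# action == "final"
```

This is also the natural place to validate tool names and arguments before anything executes.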
Execution
This is your code, not the model's. You control timeouts, retries, error handling, sandboxing. The model never executes anything. It only requests.
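A sketch of what defensive execution can look like: unknown tool names and tool exceptions become error strings the model can observe and recover from, rather than crashes (the `execute_tool` helper is illustrative):

```python
def execute_tool(tool_registry, name, arguments):
    """Run a requested tool defensively. The model requests; we execute."""
    fn = tool_registry.get(name)
    if fn is None:
        # A hallucinated tool name becomes an observation, not a crash.
        return f"Error: unknown tool '{name}'"
    try:
        return str(fn(**arguments))
    except Exception as exc:
        # Tool failures are fed back so the model can try again.
        return f"Error: {type(exc).__name__}: {exc}"

registry = {"get_weather": lambda city: f"22C, sunny in {city}"}
execute_tool(registry, "get_weather", {"city": "Tokyo"})  # "22C, sunny in Tokyo"
execute_tool(registry, "fly_drone", {})                   # unknown-tool error string
```

Timeouts and sandboxing would layer on top of this same shape; the invariant is that phase 4 never lets a tool take the whole loop down.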
Observation
What you put back into the message list is what the model will reason about next. A 50KB tool output that gets dumped raw into the messages will pollute every future iteration. We'll talk about trimming, summarizing, and filtering observations in Module 5.
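One simple guard is a hard cap on observation size before it enters the message list (a sketch; the 2,000-character limit is arbitrary and the `make_observation` name is ours):

```python
def make_observation(result, limit=2000):
    """Cap a tool result before it enters the message list."""
    text = str(result)
    if len(text) > limit:
        # Keep the head and note the cut; the raw dump stays out of context.
        text = text[:limit] + f"\n...[truncated {len(text) - limit} chars]"
    return {"role": "tool", "content": text}

# A 50KB tool output shrinks to ~2KB before the model ever sees it.
obs = make_observation("x" * 50_000, limit=2000)
```

Truncation is the bluntest option; summarization and schema filtering do the same job with more finesse.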
A trace through the phases
Goal: "What's the weather in Tokyo?" Tools: get_weather(city).
| Step | Phase 1 (Input) | Phase 2 (Reasoning) | Phase 3 (Selection) | Phase 4 (Execution) | Phase 5 (Observation) |
|---|---|---|---|---|---|
| 1 | system + goal | "I should call get_weather" | tool_call: get_weather("Tokyo") | returns "22C, sunny" | append tool result |
| 2 | system + goal + assistant + tool result | "I have the answer" | no tool calls, final answer | (none) | (loop exits) |
Two iterations. One tool call. The model never touched the weather API. Your code did.
The loop is just a state transition
If you squint, the orchestration loop is a state machine where the state is messages and each iteration is a transition. We make this explicit in Track 2 when we model the loop as a real state machine with named states like PLANNING, EXECUTING, REFLECTING. For now, a list of messages is enough.
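To make the state-machine view concrete, here is a toy transition function (the state names here are our illustration, not the Track 2 vocabulary):

```python
from enum import Enum, auto

class State(Enum):
    REASONING = auto()
    EXECUTING = auto()
    DONE = auto()

def transition(state, has_tool_calls):
    """One iteration of the loop, viewed as a state transition."""
    if state is State.REASONING:
        # Phase 3: a tool call means execute; none means we're done.
        return State.EXECUTING if has_tool_calls else State.DONE
    if state is State.EXECUTING:
        # Phases 4-5: the observation feeds back into reasoning.
        return State.REASONING
    return State.DONE

# The Tokyo weather trace: reason, execute, reason, done.
s = State.REASONING
s = transition(s, has_tool_calls=True)   # -> EXECUTING
s = transition(s, has_tool_calls=False)  # -> REASONING
s = transition(s, has_tool_calls=False)  # -> DONE
```

Nothing here requires the enum; it just makes the transitions you already wrote with `if` statements visible as a diagram you can reason about.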
The phases in production code
Real agent frameworks (LangChain, OpenAI Agents SDK, Anthropic's tool use) all implement these same five phases. They differ in:
- How phase 1 is built. Some frameworks let you inject memory, retrievals, or summaries automatically.
- How phase 4 is sandboxed. Function calling vs subprocess vs container.
- How phase 5 is filtered. Truncation, summarization, schema enforcement.
But the skeleton is identical. Once you can see the five phases in your own code, you can read any framework's source by mapping its functions back onto this loop.
Key takeaway
An agent is a five-phase loop where only one phase calls the model. Everything else is plain code. If you want a more reliable agent, the answer is almost never "use a smarter model." It is "improve phases 1, 4, or 5." The next lesson covers the most important phase 3 question: when does the loop stop?