Supervisor/worker pattern

Orchestrator delegates to specialists.

One brain to delegate, many to specialize

Supervisor/worker is the workhorse multi-agent topology. One agent (the supervisor) reads the request, decides which specialist should handle it, dispatches the work, reads the result, and either returns it or dispatches the next step. The workers are the specialists from the last module: each owns a domain, a tool set, and a focused system prompt.

Think of it as prompt routing from Module 1 with the routes upgraded from "different system prompts" to "different agents." Same dispatch shape, but the workers can now have their own loops, tools, and even their own state.

The shape

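# Registry mapping each worker name the planner may emit to an agent object.
# code_agent, data_agent, comms_agent, plan_steps, Handoff, summarize, and
# synthesize are assumed to be defined elsewhere; this shows the shape only.
WORKERS = {
    "code": code_agent,
    "data": data_agent,
    "comms": comms_agent,
}


def supervisor(user_request):
    # Plan: break the request into steps, each assigned to one worker.
    plan = plan_steps(user_request)
    results = []
    for step in plan.steps:
        worker = WORKERS[step.worker]
        # Dispatch: a structured handoff carrying a summary of earlier results.
        handoff = Handoff(
            from_agent="supervisor",
            to_agent=step.worker,
            intent=step.intent,
            context=summarize(results),
            artifacts=step.artifacts,
            return_to="supervisor",
        )
        results.append(worker.run(handoff))
    # Synthesize: combine the worker outputs into a final answer.
    # Simplified: the plan is fixed here; an adaptive supervisor re-plans
    # after each result.
    return synthesize(user_request, results)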

The supervisor:

Plans. Breaks the request into steps and assigns each to a worker.
Dispatches. Issues structured handoffs.
Synthesizes. Combines worker outputs into a final answer.

The workers do not know that the others exist. They only see handoffs from the supervisor.
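The Handoff object used in the shape above is not defined in this lesson. A minimal sketch, assuming a plain dataclass with the six fields the supervisor passes, might look like this. The field types, the WorkerResult container, and the EchoWorker stand-in are illustrative assumptions, not part of the course code.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class Handoff:
    """Structured message from the supervisor to one worker (assumed shape)."""
    from_agent: str
    to_agent: str
    intent: str                       # what the worker is being asked to do
    context: str = ""                 # summary of earlier results, not raw logs
    artifacts: Dict[str, Any] = field(default_factory=dict)
    return_to: str = "supervisor"     # workers always report back upward


@dataclass(frozen=True)
class WorkerResult:
    """Hypothetical structured result a worker returns to the supervisor."""
    worker: str
    summary: str
    ok: bool = True


class EchoWorker:
    """Stand-in worker; a real one would run its own tool-using loop."""

    def __init__(self, name: str) -> None:
        self.name = name

    def run(self, handoff: Handoff) -> WorkerResult:
        # The worker sees only the handoff, never other workers or their output.
        return WorkerResult(worker=self.name, summary=f"handled: {handoff.intent}")


handoff = Handoff(from_agent="supervisor", to_agent="code", intent="find the bug")
result = EchoWorker("code").run(handoff)
print(result.summary)  # handled: find the bug
```

Freezing both dataclasses is a design choice in this sketch: a handoff that cannot be mutated after dispatch is easier to log and replay.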

Why supervisor/worker beats prompt routing

Prompt routing picks one persona for the whole request. Supervisor/worker picks one worker per step, and can revise the plan based on what each worker finds.

Three concrete benefits:

1. Multi-step requests work

A request like "find the bug, propose a fix, and draft a Slack note about it" needs three different specialists. Prompt routing can only pick one. The supervisor dispatches three handoffs in sequence.

2. Plans can adapt to findings

If the code worker reports "no bug found, this looks like a config issue," the supervisor can re-plan and dispatch to the ops worker instead of continuing to the comms worker. Pipelines cannot do this; supervisor/worker can.
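The config-issue scenario above can be sketched as a loop that re-plans after every result. This is a toy under stated assumptions: Step, Finding, the `suggests` field, and the rule-based `replan` are hypothetical stand-ins for what would normally be a planner LLM call, and `run_worker` is a stub rather than a real agent.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass
class Step:
    worker: str
    intent: str


@dataclass
class Finding:
    worker: str
    summary: str
    suggests: Optional[str] = None   # worker name the finding points to, if any


def run_worker(step: Step) -> Finding:
    # Stub: the code worker finds no bug and points at configuration instead.
    if step.worker == "code":
        return Finding("code", "no bug found, this looks like a config issue",
                       suggests="ops")
    return Finding(step.worker, f"{step.worker} done: {step.intent}")


def replan(remaining: Deque[Step], finding: Finding) -> Deque[Step]:
    # Rule-based stand-in for a planner call: a redirect replaces the rest
    # of the plan; otherwise the plan continues unchanged.
    if finding.suggests is None:
        return remaining
    return deque([Step(finding.suggests, f"follow up on: {finding.summary}")])


def supervise(plan: List[Step], max_steps: int = 10) -> List[Finding]:
    remaining = deque(plan)
    findings: List[Finding] = []
    # max_steps bounds the loop in case re-planning keeps adding work.
    while remaining and len(findings) < max_steps:
        finding = run_worker(remaining.popleft())
        findings.append(finding)
        remaining = replan(remaining, finding)
    return findings


plan = [Step("code", "find the bug"), Step("comms", "draft a Slack note")]
visited = [f.worker for f in supervise(plan)]
print(visited)  # ['code', 'ops']
```

The comms step never runs because the code worker's finding replaced the rest of the plan, which is exactly what a fixed pipeline cannot do.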

3. Each worker stays clean

The code worker only ever sees handoffs about code. Its context never gets polluted with deploy logs or Slack drafts. The same holds for the other workers. This is the multi-agent benefit you actually wanted from Module 2.

Designing the supervisor

The supervisor is doing two distinct jobs and it pays to think of them separately:

Job 1: planning

Given a request, decompose it into steps. Each step has a worker, an intent, and any artifacts. This is hard for the model to do well without structure. Two practical patterns:

Constrain the plan with a schema. The supervisor must output a list of {worker, intent, artifacts} objects. Anything else is rejected. Use structured output (tool-call format) rather than free-form text.
Limit the worker vocabulary. Show the supervisor a one-line summary of each worker's domain in its system prompt. The plan can only reference workers that exist.
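Both patterns can be sketched as a validation layer that sits between the planner's output and the dispatch code. This example assumes the planner returns a JSON string; a provider's structured-output or tool-call feature would enforce much of this upstream. The one-line worker summaries, PlanStep, PlanError, and the function names are all illustrative assumptions.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List

# Hypothetical one-line domain summaries shown to the planner.
WORKER_SUMMARIES = {
    "code": "Reads and edits source code.",
    "data": "Queries and analyzes datasets.",
    "comms": "Drafts messages for people.",
}
REQUIRED_KEYS = {"worker", "intent", "artifacts"}


class PlanError(ValueError):
    """Raised when planner output does not match the plan schema."""


@dataclass(frozen=True)
class PlanStep:
    worker: str
    intent: str
    artifacts: Dict[str, Any]


def planner_system_prompt() -> str:
    # Limit the vocabulary: the planner only ever sees workers that exist.
    lines = [f"- {name}: {summary}" for name, summary in WORKER_SUMMARIES.items()]
    return "Plan using only these workers:\n" + "\n".join(lines)


def parse_plan(raw: str) -> List[PlanStep]:
    """Accept a JSON list of {worker, intent, artifacts}; reject anything else."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise PlanError(f"plan is not valid JSON: {exc}") from exc
    if not isinstance(data, list) or not data:
        raise PlanError("plan must be a non-empty list of steps")
    steps = []
    for i, item in enumerate(data):
        if not isinstance(item, dict) or set(item) != REQUIRED_KEYS:
            raise PlanError(f"step {i}: expected exactly keys {sorted(REQUIRED_KEYS)}")
        worker, intent, artifacts = item["worker"], item["intent"], item["artifacts"]
        if not isinstance(worker, str) or worker not in WORKER_SUMMARIES:
            raise PlanError(f"step {i}: unknown worker {worker!r}")
        if not isinstance(intent, str) or not intent.strip():
            raise PlanError(f"step {i}: intent must be a non-empty string")
        if not isinstance(artifacts, dict):
            raise PlanError(f"step {i}: artifacts must be an object")
        steps.append(PlanStep(worker, intent, artifacts))
    return steps


raw = '[{"worker": "code", "intent": "find the bug", "artifacts": {}}]'
steps = parse_plan(raw)
print(steps[0].worker)  # code
```

A rejected plan can be fed back to the planner with the PlanError message as a retry hint, which tends to be cheaper than trying to repair free-form text.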

Job 2: synthesis

Once all workers have reported, the supervisor writes the final answer. Synthesis is where the hidden quality comes from. A bad supervisor concatenates worker outputs and calls it a day. A good supervisor cross-checks: does the code worker's "fix" align with what the ops worker found? Are there contradictions? What does the user actually need?

Synthesis is its own prompt and often deserves its own LLM call, separate from the plan-and-dispatch logic.

The 'planner = executor' anti-pattern

A common mistake: writing the supervisor as a single prompt that plans, dispatches, and synthesizes in one big system message. Two reasons it fails:

The plan field gets polluted with worker outputs as the loop runs, so re-planning becomes confused.
Synthesis runs in a context full of dispatch decisions rather than worker findings.

Cleaner: split the supervisor into two roles, a planner and a synthesizer, each with its own prompt. The dispatch is just code in between. You can use the same model for both, just don't conflate the prompts.
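A minimal sketch of that split, assuming a hypothetical `llm(system_prompt, user_message)` callable in place of any real model client. The prompt texts, the JSON plan format, and the lambda workers are placeholders chosen to keep the example short.

```python
import json
from typing import Callable, Dict, List, Tuple

LLM = Callable[[str, str], str]  # (system_prompt, user_message) -> reply

PLANNER_PROMPT = "You are a planner. Reply with a JSON list of {worker, intent} steps."
SYNTHESIZER_PROMPT = "You are a synthesizer. Cross-check the findings, then answer."


def plan(llm: LLM, request: str) -> List[Tuple[str, str]]:
    # The planner sees the request only; no worker output reaches this prompt.
    steps = json.loads(llm(PLANNER_PROMPT, request))
    return [(s["worker"], s["intent"]) for s in steps]


def synthesize(llm: LLM, request: str, findings: List[str]) -> str:
    # The synthesizer sees the request and the findings, not dispatch decisions.
    listing = "\n".join(f"- {f}" for f in findings)
    return llm(SYNTHESIZER_PROMPT, f"Request: {request}\nFindings:\n{listing}")


def supervisor(llm: LLM, workers: Dict[str, Callable[[str], str]], request: str) -> str:
    steps = plan(llm, request)                                     # LLM call
    findings = [workers[name](intent) for name, intent in steps]   # plain code
    return synthesize(llm, request, findings)                      # LLM call


calls = []


def fake_llm(system: str, user: str) -> str:
    # Test double that records each call and returns canned replies.
    calls.append((system, user))
    if system == PLANNER_PROMPT:
        return json.dumps([
            {"worker": "code", "intent": "find the bug"},
            {"worker": "comms", "intent": "draft a Slack note"},
        ])
    return "final answer"


workers = {
    "code": lambda intent: f"code: {intent} -> done",
    "comms": lambda intent: f"comms: {intent} -> done",
}
answer = supervisor(fake_llm, workers, "fix and announce")
print(len(calls))  # 2
```

Because the two roles share nothing but the request, either prompt can be tuned or swapped to a different model without touching the other.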

Supervisor/worker vs orchestrator/specialist

These names are interchangeable in practice. "Orchestrator" tends to imply more sophisticated planning; "supervisor" tends to imply more direct delegation. The shape is the same. Don't get hung up on the vocabulary; pay attention to whether your supervisor plans (multi-step, adaptive) or merely routes (single-shot dispatch).

If yours just routes, you have prompt routing with extra steps. Add real planning or simplify to single-agent with routing.

The handoff loop in detail

Every supervisor/worker system has a loop with the following structure:

1. supervisor receives user request
2. supervisor plans first step
3. supervisor dispatches handoff to worker
4. worker runs its own loop (tool use, reasoning, etc)
5. worker returns a structured result
6. supervisor reads result; updates plan or dispatches next step
7. when plan is done, supervisor synthesizes final answer

Step 4 is itself an entire ReAct loop from Track 1. The supervisor does not see the worker's tool calls or intermediate reasoning. It sees only the structured result.

This is the outer loop / inner loop split that we revisit in Module 4. The supervisor's loop is the outer. Each worker has its own inner loop.
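The visibility boundary between the two loops can be sketched as follows. The Worker class, its `trace` attribute, and the lambda tools are hypothetical; the inner loop here simply runs every tool once, where a real worker would let a model choose tools turn by turn.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Result:
    """The only thing that crosses from the inner loop to the outer loop."""
    worker: str
    summary: str


class Worker:
    def __init__(self, name: str, tools: Dict[str, Callable[[str], str]]) -> None:
        self.name = name
        self.tools = tools
        self.trace: List[str] = []   # inner-loop detail, kept on the worker

    def run(self, intent: str) -> Result:
        # Inner loop (step 4): call tools and collect observations.
        observations = []
        for tool_name, tool in self.tools.items():
            observation = tool(intent)
            self.trace.append(f"{tool_name} -> {observation}")
            observations.append(observation)
        # Step 5: return a structured result without the trace.
        return Result(self.name, "; ".join(observations))


def outer_loop(plan: List[Tuple[str, str]], workers: Dict[str, Worker]) -> List[Result]:
    # Outer loop (steps 2, 3, and 6): dispatch each step and read its result.
    results = []
    for worker_name, intent in plan:
        results.append(workers[worker_name].run(intent))
    return results


workers = {
    "code": Worker("code", {
        "grep": lambda query: "match in app.py",
        "run_tests": lambda query: "2 tests failing",
    }),
}
results = outer_loop([("code", "find the bug")], workers)
print(results[0].summary)  # match in app.py; 2 tests failing
```

The supervisor holds only Result objects, so its context grows by one summary per step rather than by every tool call a worker made.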

When supervisor/worker is overkill

Despite being the workhorse pattern, supervisor/worker is not always the right answer. Skip it when:

The work is single-domain. A specialist agent without a supervisor is fine; the supervisor adds nothing.
The work is fixed and sequential. A pipeline is simpler and faster.
The "plan" is always the same. If your supervisor always dispatches to A, then B, then C, regardless of input, you have a hardcoded pipeline pretending to be a planner. Just build the pipeline.
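One way to check the last condition is to log the worker sequence of each plan and count how many distinct shapes appear. This is a rough diagnostic under stated assumptions: the log format (one list of worker names per request) and the function name are hypothetical, and any threshold for calling a plan "always the same" is left to the reader.

```python
from collections import Counter
from typing import Any, Dict, List, Sequence


def plan_shapes(logged_plans: Sequence[Sequence[str]]) -> Dict[str, Any]:
    """Summarize how varied the planner's worker sequences have been."""
    if not logged_plans:
        raise ValueError("need at least one logged plan")
    shapes = Counter(tuple(plan) for plan in logged_plans)
    shape, count = shapes.most_common(1)[0]
    return {
        "distinct_shapes": len(shapes),
        "dominant_shape": shape,
        "dominant_share": count / len(logged_plans),
    }


# Three requests, three identical plans: a pipeline pretending to be a planner.
logs: List[List[str]] = [["code", "data", "comms"]] * 3
report = plan_shapes(logs)
print(report["distinct_shapes"])  # 1
```

A dominant share near 1.0 across varied inputs suggests the planning call is not earning its cost.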

If none of those apply, supervisor/worker is probably right.

Cost shape: supervisors pay per step

A supervisor/worker system pays for one planning call, one worker run per step, and one synthesis call; dispatch itself is free because it is just code. With N steps, that is N+2 LLM calls at minimum, and usually more, since each worker run is its own loop and any re-planning adds planner calls. Compare this to a monolith doing the same work in a single loop with roughly N tool calls and no separate planning or synthesis calls. Multi-agent buys you cleaner state at the cost of more LLM calls. Make sure that trade is worth it for your workload before committing.
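The call count can be written down as a small calculation. The first function is the N+2 floor from the paragraph above; the second is a rough estimate that assumes every worker run makes the same number of model calls and every re-plan costs one planner call. Both function names and the extra parameters are illustrative.

```python
def min_llm_calls(n_steps: int) -> int:
    """Floor on LLM calls: one planning call, one per step, one synthesis call."""
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative")
    return 1 + n_steps + 1


def estimated_llm_calls(n_steps: int, calls_per_worker_run: int = 1,
                        replans: int = 0) -> int:
    """Rough estimate once worker inner loops and re-planning are counted."""
    if n_steps < 0 or calls_per_worker_run < 1 or replans < 0:
        raise ValueError("counts must be non-negative, with at least 1 call per run")
    return 1 + replans + n_steps * calls_per_worker_run + 1


print(min_llm_calls(3))                                              # 5
print(estimated_llm_calls(3, calls_per_worker_run=4, replans=1))     # 15
```

With three steps the floor is 5 calls, but four model calls per worker run and a single re-plan already triple that.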

Key takeaway

Supervisor/worker is the workhorse multi-agent pattern: one agent plans and synthesizes, others specialize. It earns its complexity when you need multi-step, multi-domain work that adapts to findings. Build the supervisor as two roles (planner and synthesizer) sharing a structured handoff schema with the workers. The next lesson scales this idea up: what happens when the workers are themselves supervisors over their own teams.
