
Multi-Agent Workflow — 6 Patterns from Supervisor to Swarm

M. · 9 min read

Six core patterns of multi-agent workflow (Supervisor / Sequential / Hierarchical / Network / Swarm / Map-Reduce), grounded in primary sources from LangGraph, CrewAI, OpenAI, and Anthropic. Each pattern's topology and fit, plus a decision framework for production.

The previous article Single vs Multi-Agent — Same Sources, Opposite Conclusions defined a multi-agent system as “multiple LLMs collaborating via distributed decision-making and delegation.” The two cases compared in that article — Anthropic Multi-Agent Research System and Cognition Devin — were both dynamic-delegation cases, where the lead agent decided routing on the fly.

But the majority of multi-agent systems running in production are different. The more common structure places multiple agents within a predefined flow — a hybrid. LangGraph Supervisor, CrewAI Sequential Process, AutoGen GroupChat — different names for the same territory. Anthropic calls this region “Workflows,” LangGraph calls it “Multi-Agent Workflow,” and the OpenAI Agents SDK calls it “Manager-style orchestration.”

This article catalogs six core patterns from this hybrid territory, grounded in primary sources. Each pattern’s topology, fit, no-fit, and reference implementation, followed by a decision framework for pattern selection.

What Multi-Agent Workflow Is

Bringing back the three-way classification from the previous article:

  • Workflow: “LLMs and tools are orchestrated through predefined code paths.” (Anthropic)
  • Single Agent: a single LLM dynamically decides its own procedure and tool use
  • Multi-Agent System (MAS): multiple LLMs collaborating via distributed decision-making and delegation

Multi-Agent Workflow lives in the empty space between these three. The flow is predefined code (Workflow trait), but multiple agents are placed within it (MAS trait). Extending the previous article’s definition table:

| Dimension | Workflow | Multi-Agent Workflow | Multi-Agent System (dynamic) |
| --- | --- | --- | --- |
| Decides next action | Code (predefined) | Code (predefined) | LLM (dynamic delegation) |
| Number of agents | Usually 1 | Multiple | Multiple |
| Routing mechanism | Conditionals / sequential | Supervisor selects from predefined candidates, or explicit graph | Lead delegates freely |
| Debugging difficulty | Low | Medium | High |
| Cost predictability | Very high | High | Low (15x, high variance) |
| Reference implementations | LangChain Chain, prompt chaining | LangGraph Supervisor, CrewAI Sequential, AutoGen GroupChat | Anthropic Multi-Agent Research, custom orchestrator |

The LangGraph official documentation makes the same distinction explicit:

“Workflows have predetermined code paths and are designed to operate in a certain order. Agents are dynamic and define their own processes and tool usage.” (LangGraph Docs)

The core difference is who decides the flow. If code decides the next action, it’s Multi-Agent Workflow. If an LLM decides, it’s a true Multi-Agent System. This single difference determines debugging difficulty, cost predictability, and operational complexity.

Why Multi-Agent Workflow Is More Common in Production

The trade-offs of dynamic delegation laid out in the previous article — cost explosion, sync bottleneck, rainbow deployment, non-deterministic debugging, cascade failure — all stem from the flow depending on LLM decisions. Multi-Agent Workflow nails the flow into code and mitigates most of these trade-offs.

  • Debuggability: The flow is defined as a graph or sequential code, so traces are deterministic. LangGraph allows step-by-step tracing of state changes at each node.
  • Cost predictability: The upper bound on call count is explicit in the flow. Dynamic delegation can spawn 50+ subagents based on the lead’s judgment, while a Supervisor pattern has the worker count nailed in code.
  • Sync bottleneck avoidance: Explicit fan-out/fan-in shows where to parallelize and where to synchronize, in code.
  • Permission separation: Which tools each agent can use is determined at flow-definition time. Dynamic delegation tends to allow the lead to use any tool freely.

The OpenAI Agents SDK guide captures this operational advantage:

“Manager-style orchestration: A central manager agent… routing tasks to specialist agents while maintaining oversight. Recommended for predictable cost and easier debugging.” (OpenAI Agents SDK)

This is why production teams willingly forgo the 90.2% gain of dynamic delegation and choose Multi-Agent Workflow instead. They buy predictability over potential.

Six Core Patterns

The LangGraph official documentation presents five categories (Network / Supervisor / Hierarchical / Custom / Swarm), and adding Sequential and Map-Reduce — both common in production — gives six core patterns.

Pattern 1 — Supervisor (Hub-and-Spoke)

The most common and simplest pattern. A single supervisor agent sits at the center, and multiple worker agents form a hub-and-spoke topology around it.

```mermaid
---
config:
  look: handDrawn
  theme: neutral
---
flowchart TD
    User --> Supervisor
    Supervisor --> Worker1[Worker 1]
    Supervisor --> Worker2[Worker 2]
    Supervisor --> Worker3[Worker 3]
    Worker1 --> Supervisor
    Worker2 --> Supervisor
    Worker3 --> Supervisor
    Supervisor --> Done

    classDef entry fill:#e0f2fe,stroke:#0284c7,stroke-width:2px,color:#0c4a6e
    classDef exit fill:#ecfccb,stroke:#65a30d,stroke-width:2px,color:#365314
    class User entry
    class Done exit
```

The flow is simple. The supervisor receives a user request and picks one item from its tool list (= worker agent names) to call. The worker returns a result to the supervisor, and the supervisor decides whether to call another worker or terminate.

LangGraph standardizes this pattern with the langgraph-supervisor-py library. CrewAI calls the same structure Hierarchical Process (specifying the supervisor LLM via the manager_llm argument).
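The supervisor loop can be sketched framework-free. In this minimal sketch the keyword router and the worker functions are illustrative stand-ins for LLM calls (none of these names are LangGraph or CrewAI API); in a real system the supervisor would be a model choosing a worker name via tool calling.

```python
def billing_worker(task: str) -> str:
    return f"billing handled: {task}"

def shipping_worker(task: str) -> str:
    return f"shipping handled: {task}"

# The worker set is fixed in code, unlike dynamic delegation.
WORKERS = {"billing": billing_worker, "shipping": shipping_worker}

def route(task: str) -> str:
    # Stub routing decision; the real supervisor's routing accuracy
    # is this pattern's critical decision point.
    return "billing" if "invoice" in task else "shipping"

def run_supervisor(task: str, max_steps: int = 5) -> str:
    # The step cap makes the upper bound on calls explicit in code,
    # which is the cost-predictability property described above.
    result = task
    for _ in range(max_steps):
        result = WORKERS[route(result)](result)
        if "handled" in result:  # stub termination check
            return result
    return result
```

The key design choice is that both the candidate set (`WORKERS`) and the loop bound live in code; only the routing decision is delegated to the model.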

| Item | Detail |
| --- | --- |
| Fit | Routing per task across multiple specialists (e.g., customer inquiry → billing / shipping / refund) |
| No-fit | Tasks where every worker must be called every time, tasks requiring direct worker-to-worker communication |
| Reference implementations | LangGraph Supervisor, CrewAI Hierarchical Process, OpenAI Agents SDK Manager pattern |
| Critical decision point | Supervisor's routing accuracy. Wrong routing → wrong worker attempts the task |

Pattern 2 — Sequential (Pipeline)

The simplest flow — a predefined order, agents passing tasks down a chain. CrewAI’s default Process is this.

```mermaid
---
config:
  look: handDrawn
  theme: neutral
---
flowchart LR
    User --> A1[Researcher Agent]
    A1 --> A2[Writer Agent]
    A2 --> A3[Editor Agent]
    A3 --> Done

    classDef entry fill:#e0f2fe,stroke:#0284c7,stroke-width:2px,color:#0c4a6e
    classDef exit fill:#ecfccb,stroke:#65a30d,stroke-width:2px,color:#365314
    class User entry
    class Done exit
```

Each agent takes the previous agent’s output as input. A linear chain with no branching or conditionals. CrewAI’s Sequential Process documentation states:

“Sequential Process: Tasks are executed one after another, where each task can build upon the results of previous tasks.” (CrewAI Docs)
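The chain reduces to a fold over a list of agent functions. A minimal sketch (the agent functions are illustrative stand-ins for LLM calls, not CrewAI API):

```python
from typing import Callable

# Each stand-in agent takes the previous agent's output as its input,
# exactly one step of the linear chain.
def researcher(text: str) -> str:
    return text + " | research notes"

def writer(text: str) -> str:
    return text + " | draft"

def editor(text: str) -> str:
    return text + " | edited"

def run_pipeline(task: str, agents: list[Callable[[str], str]]) -> str:
    out = task
    for agent in agents:
        out = agent(out)  # linear chain: no branching, no conditionals
    return out
```

Because the composition is this simple, the only real failure surface is the one the table below names: each agent must reliably parse the previous agent's output format.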

| Item | Detail |
| --- | --- |
| Fit | Tasks with defined steps — research → write → edit / data collection → analysis → report |
| No-fit | Tasks needing branches, conditionals, or retries; tasks where only some steps run |
| Reference implementations | CrewAI Sequential Process, LangChain SequentialChain (legacy) |
| Critical decision point | Output format consistency between steps. Each agent must reliably parse the previous output |

The simplest is also the most robust. A common production path is to start with Sequential for a first multi-agent system, then evolve toward Supervisor once routing needs emerge.

Pattern 3 — Hierarchical (Multi-level Supervisor)

The Supervisor pattern stacked into multiple levels. Sub-team supervisors sit beneath a top-level supervisor, and each sub-team has its own workers.

```mermaid
---
config:
  look: handDrawn
  theme: neutral
---
flowchart TD
    Top[Top Supervisor]
    Top --> S1[Research Team Supervisor]
    Top --> S2[Writing Team Supervisor]
    S1 --> W1[Web Researcher]
    S1 --> W2[Fact Checker]
    S2 --> W3[Writer]
    S2 --> W4[Editor]

    classDef entry fill:#e0f2fe,stroke:#0284c7,stroke-width:2px,color:#0c4a6e
    class Top entry
```

The LangGraph official tutorial “Hierarchical Agent Teams” is the reference implementation. research_team and writing_team each have their own sub-supervisor, and a top_level_supervisor coordinates between the two teams.
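The two-level routing can be sketched in a few lines. This is an illustrative skeleton, not the tutorial's code: team names, worker stubs, and the stub routing functions are assumptions, and in the real tutorial each sub-team is itself a compiled supervisor graph.

```python
# Workers are stand-ins for LLM-backed agents.
RESEARCH_TEAM = {
    "web_researcher": lambda t: f"web findings on {t}",
    "fact_checker":   lambda t: f"verified {t}",
}
WRITING_TEAM = {
    "writer": lambda t: f"draft about {t}",
    "editor": lambda t: f"polished {t}",
}
TEAMS = {"research": RESEARCH_TEAM, "writing": WRITING_TEAM}

def top_supervisor(task: str) -> str:
    # Stub domain routing; wrong domain cuts here are the pattern's
    # main failure mode (see the table below).
    return "research" if "find" in task else "writing"

def sub_supervisor(team: dict, task: str) -> str:
    worker = next(iter(team))  # stub worker choice within the team
    return team[worker](task)

def run_hierarchical(task: str) -> str:
    return sub_supervisor(TEAMS[top_supervisor(task)], task)
```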

| Item | Detail |
| --- | --- |
| Fit | Domain separation + multiple workers collaborating within each domain. Maps to large-organization workflow structure |
| No-fit | Simple tasks, tasks with only 1-2 domains (unnecessary layer) |
| Reference implementations | LangGraph Hierarchical Agent Teams |
| Critical decision point | Domain boundary setting. Wrong cuts → frequent inter-team communication, top supervisor becomes a bottleneck |

The complexity is high, so it’s over-engineering for small organizations or simple tasks. Anthropic’s “start simple” emphasis is partly a warning against the temptation of these multi-layer structures.

Pattern 4 — Network (Pre-defined DAG)

A pattern where agents are called along an explicitly defined directed acyclic graph (DAG). There is no supervisor; which agent runs after which is hardcoded into the graph's edges.

The general form of this pattern is defining an arbitrary graph with LangGraph’s StateGraph. Nodes are agents, edges are explicit call relationships. If the supervisor pattern is hub-and-spoke, the network pattern is mesh.
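A framework-free sketch of DAG execution, using the standard library's `graphlib` in place of LangGraph's `StateGraph` (node names and the `run_dag` helper are illustrative assumptions). `TopologicalSorter` raises `CycleError` if the graph has a cycle, which is exactly the class of error this pattern must eliminate at design time.

```python
from graphlib import TopologicalSorter

def run_dag(nodes, predecessors, task):
    # predecessors maps each node name to the nodes whose outputs it
    # consumes; the execution order is derived once from the graph,
    # never decided by an LLM at run time.
    results = {}
    for name in TopologicalSorter(predecessors).static_order():
        inputs = [results[p] for p in predecessors.get(name, ())]
        results[name] = nodes[name](task, inputs)
    return results

# A three-node pipeline-shaped DAG as a usage example.
nodes = {
    "parse":   lambda task, inp: f"parsed {task}",
    "analyze": lambda task, inp: f"analyzed [{inp[0]}]",
    "report":  lambda task, inp: f"report on [{inp[0]}]",
}
order = {"parse": [], "analyze": ["parse"], "report": ["analyze"]}
```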

| Item | Detail |
| --- | --- |
| Fit | Tasks with complex but clearly definable dependencies. Compilation pipelines, multi-stage data processing |
| No-fit | Tasks where the flow varies per task (if the graph differs per task, dynamic is a better fit) |
| Reference implementations | LangGraph custom StateGraph |
| Critical decision point | Correctness of the graph itself. Cycles, dead-ends, infinite loops must be eliminated at design time |

This pattern has the highest expressive power but the highest design overhead. Going to network when supervisor and sequential would suffice is over-engineering. Conversely, using supervisor for tasks that genuinely need mesh dependencies forces the supervisor to absorb all graph logic, concentrating complexity in one node.

Pattern 5 — Swarm (Handoff-based)

A pattern where the currently active agent explicitly hands off control to the next agent. Not a supervisor — the agent currently doing the work directly designates the next agent.

```mermaid
---
config:
  look: handDrawn
  theme: neutral
---
flowchart LR
    Triage[Triage Agent] -->|handoff| Math[Math Agent]
    Triage -->|handoff| Code[Code Agent]
    Math -->|handoff| Code
    Code -->|handoff| Math

    classDef entry fill:#e0f2fe,stroke:#0284c7,stroke-width:2px,color:#0c4a6e
    class Triage entry
```

OpenAI announced this pattern in late 2024 with the experimental swarm library, which later settled into the production-ready Agents SDK as the handoff primitive. LangGraph supports the same pattern via langgraph-swarm-py.

The OpenAI Agents SDK official documentation states:

“Handoffs allow an agent to delegate to another agent, transferring full message history. This is different from manager-style orchestration, where the manager retains control.” (OpenAI Agents SDK)
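A minimal handoff sketch, assuming a convention where each agent returns `(message, next_agent)` and `None` means done (the agent functions and keyword triggers are illustrative, not the Agents SDK API). The hard cap on handoffs is the code-level guard against endless ping-pong.

```python
def math_agent(msg: str):
    if "code" in msg:
        return msg, "code"          # hand off, full message carried along
    return msg + " | solved", None

def code_agent(msg: str):
    if "math" in msg:
        return msg, "math"
    return msg + " | implemented", None

AGENTS = {"math": math_agent, "code": code_agent}

def run_swarm(msg: str, start: str = "math", max_handoffs: int = 4) -> str:
    current = start
    for _ in range(max_handoffs + 1):
        msg, nxt = AGENTS[current](msg)
        if nxt is None:
            return msg
        current = nxt
    # An A -> B -> A -> B loop eventually trips this cap.
    raise RuntimeError("handoff limit exceeded")
```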

| Item | Detail |
| --- | --- |
| Fit | Tasks needing natural transfer between specialized agents. Customer inquiries triaged → directly transferred to a specialist |
| No-fit | Tasks needing central control or oversight; regulated industries where the audit trail must be visible from a single control point |
| Reference implementations | OpenAI Agents SDK Handoff, LangGraph Swarm, original OpenAI Swarm (experimental) |
| Critical decision point | Preventing infinite ping-pong. Endless A → B → A → B handoffs must be blocked at the code level |

Swarm’s biggest advantage is no supervisor bottleneck; its biggest disadvantage is that handoff decisions are distributed, making accountability unclear. If the supervisor pattern is “central control,” swarm is “peer handoff.”

Pattern 6 — Map-Reduce (Fan-out + Aggregator)

A pattern that fans out parallel tasks and aggregates results. The multi-agent version of classic Map-Reduce.

```mermaid
---
config:
  look: handDrawn
  theme: neutral
---
flowchart TD
    Start --> Splitter[Splitter Agent]
    Splitter --> W1[Worker 1]
    Splitter --> W2[Worker 2]
    Splitter --> W3[Worker 3]
    W1 --> Agg[Aggregator Agent]
    W2 --> Agg
    W3 --> Agg
    Agg --> Done

    classDef entry fill:#e0f2fe,stroke:#0284c7,stroke-width:2px,color:#0c4a6e
    classDef exit fill:#ecfccb,stroke:#65a30d,stroke-width:2px,color:#365314
    class Start entry
    class Done exit
```

LangGraph standardizes this pattern with the Send API. Returning multiple Send(node, state) objects from a conditional edge invokes the same node once per state in parallel, and the results are gathered back into a list on the shared state.
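The same fan-out/fan-in shape can be sketched with the standard library alone (the worker and `run_map_reduce` helper are illustrative assumptions, not the Send API):

```python
from concurrent.futures import ThreadPoolExecutor

def search_worker(query: str) -> str:
    # Stand-in for one parallel subagent call.
    return f"findings for '{query}'"

def run_map_reduce(queries: list[str], aggregate) -> str:
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(search_worker, queries))  # fan-out
    return aggregate(mapped)                             # fan-in

# A trivial concatenating aggregator; semantic merging is where
# the real design work sits.
report = run_map_reduce(["topic A", "topic B"], lambda xs: "\n".join(xs))
```

Note that `pool.map` preserves input order, so a simple aggregator can rely on position; an LLM-based aggregator would not need to.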

| Item | Detail |
| --- | --- |
| Fit | Tasks that can be parallelized and require result merging. Multi-source research, large-scale document classification, batch processing |
| No-fit | Tasks with strong sequential dependencies, tasks with frequent sub-task communication |
| Reference implementations | LangGraph Send (Parallelization); partially used in Anthropic's multi-agent research |
| Critical decision point | Aggregator's merge logic. Raw concatenation can be handled by an LLM, but semantic merging makes the aggregator itself a design challenge |

Anthropic’s Research System partially uses this pattern — the lead fans out search queries, subagents search in parallel, and the lead aggregates results. The difference from pure Map-Reduce (= a fixed splitter) is that the lead dynamically determines fan-out count.

Auxiliary Patterns — Group Chat and Evaluator-Optimizer

Beyond the six above, two auxiliary patterns appear in production.

  • Group Chat (Round-Robin) — The pattern AutoGen standardized. Multiple agents speak in a defined turn order, accumulating messages in a shared context. AutoGen GroupChat + GroupChatManager. The fit case is brainstorming or debate where multiple perspectives accumulate. This is also the pattern most often (incorrectly) used as a synonym for “multi-agent” in popular material — in fact it’s a very specific pattern.
  • Evaluator-Optimizer (Critic-Actor) — One of the six patterns in Anthropic’s Building Effective Agents guide. A Generator agent produces output, an Evaluator judges it; the loop terminates when the result is satisfactory, otherwise the Generator retries. Natural for coding tasks (write + review loop) and writing (draft + edit loop). But without clear evaluator criteria, it loops indefinitely.

Both patterns reduce to variants of the six above. Group Chat is “Sequential with explicit turn-taking,” and Evaluator-Optimizer is “Network with a conditional loop between two nodes.”
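The Evaluator-Optimizer loop, for instance, can be sketched in a few lines. The generator and evaluator below are stubs for LLM calls (all names are illustrative), and the round cap is the guard against the infinite loop that vague evaluator criteria produce:

```python
def generator(task: str, attempt: int) -> str:
    return f"{task} draft v{attempt}"

def evaluator(draft: str) -> bool:
    # Stub criterion; a real evaluator needs explicit, checkable criteria,
    # or the loop never converges.
    return draft.endswith("v3")

def run_evaluator_optimizer(task: str, max_rounds: int = 5) -> str:
    draft = ""
    for attempt in range(1, max_rounds + 1):
        draft = generator(task, attempt)
        if evaluator(draft):
            return draft
    return draft  # best effort once the round cap is hit
```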

Pattern Selection — The Decision

The first question for a production team facing 6+2 patterns is “which pattern fits my task?” Drawing a matrix on two axes — topology + coordination — makes the choice visible.

| Topology \ Coordination | Sync (synchronous wait) | Async fan-out | Conditional / branching |
| --- | --- | --- | --- |
| Hub-and-spoke | Supervisor | Supervisor + Map-Reduce inside | Supervisor (routing) |
| Linear | Sequential | (rare) | Sequential w/ conditions |
| Mesh / DAG | Network | Map-Reduce | Network |
| Multi-level | Hierarchical | Hierarchical + Map-Reduce | Hierarchical |
| Handoff (peer-to-peer) | Swarm | (rare) | Swarm |

```mermaid
---
config:
  look: handDrawn
  theme: neutral
---
flowchart TD
    A[Task definition] --> B{Routing across multiple workers?}
    B -->|Yes| C{Multiple domain layers?}
    C -->|Yes| D[Hierarchical]
    C -->|No| E[Supervisor]
    B -->|No| F{Predefined order?}
    F -->|Yes| G[Sequential]
    F -->|No| H{Parallel fan-out + merge?}
    H -->|Yes| I[Map-Reduce]
    H -->|No| J{Explicit graph for dependencies?}
    J -->|Yes| K[Network]
    J -->|No| L{Autonomous handoff between agents?}
    L -->|Yes| M[Swarm]
    L -->|No| E

    classDef entry fill:#e0f2fe,stroke:#0284c7,stroke-width:2px,color:#0c4a6e
    classDef exit fill:#ecfccb,stroke:#65a30d,stroke-width:2px,color:#365314
    class A entry
    class D,E,G,I,K,M exit
```

The first branch in the decision tree is “routing across multiple workers, or a fixed flow?” That single branch separates half of the patterns.
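The decision tree transcribes directly into a function. The boolean flags are assumptions standing in for the questions you ask of the task definition:

```python
def pick_pattern(*, routing: bool, multi_domain: bool = False,
                 fixed_order: bool = False, fan_out: bool = False,
                 explicit_graph: bool = False, handoff: bool = False) -> str:
    # Branch order mirrors the decision tree: routing first, then
    # order, fan-out, explicit graph, handoff, with Supervisor as
    # the final fallback.
    if routing:
        return "Hierarchical" if multi_domain else "Supervisor"
    if fixed_order:
        return "Sequential"
    if fan_out:
        return "Map-Reduce"
    if explicit_graph:
        return "Network"
    return "Swarm" if handoff else "Supervisor"
```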

Production Is Hybrid — Combining Patterns

It is rare for production to use one of the 6+2 patterns alone. Most combine two or more.

  • Supervisor + Map-Reduce: The supervisor receives a task, fans some stages out to workers in parallel, and the aggregator merges. Closest to the actual structure of Anthropic Research System.
  • Hierarchical + Sequential: A top supervisor routes between sub-teams, and each sub-team runs Sequential internally. Maps to large-organization workflow.
  • Sequential + Evaluator-Optimizer: A pipeline with a critic-actor loop inserted at one stage. Write → (draft+review loop) → edit.
  • Supervisor + Swarm: The supervisor handles initial classification, then swarm handoff between workers takes over. Common in customer-service systems.

The OpenAI Agents SDK official guide summarizes this hybrid:

“These patterns are not mutually exclusive. A common production setup combines a manager (Supervisor) for top-level routing with handoffs (Swarm) at the leaf level for specialist-to-specialist transfer.” (OpenAI Agents SDK)

The most common production decision-making mistake is “trying to solve everything with one pattern.” A single pattern is a reference point; the actual system is a hybrid. You need to know the patterns to design the hybrid.

Next Area — Evaluation

This article limited itself to the pattern catalog and selection criteria. The next stage after picking a pattern and building a system is evaluation.

  • How to evaluate multi-agent systems — Evaluation methodology for non-deterministic systems. The limits of LLM-as-judge, human eval, and benchmarks (UC Berkeley CRDI in April 2026 reported that SWE-bench, GAIA, and AgentBench are vulnerable to reward hacking). Unlike single-call evaluation, multi-agent requires trace-level evaluation, and the reliability of the evaluator agent itself becomes part of the evaluation.

Evaluation is its own large area, separate from pattern selection. This article’s scope is limited to the patterns themselves.
