Multi-Agent Systems: Orchestrating AI for Complex Tasks
Single agents are great for focused tasks. But some problems are too complex for one agent to handle well.
A customer complaint might need: sentiment analysis, account lookup, policy review, response drafting, and approval routing. One agent doing all this becomes unwieldy. Multiple specialised agents, coordinated properly, can handle it elegantly.
Here's how to design multi-agent systems that actually work.
When to Use Multi-Agent
Multi-agent adds complexity. Only use it when justified:
Good reasons for multi-agent:
- Tasks require genuinely different capabilities (reasoning vs retrieval vs execution)
- Subtasks have different security/permission requirements
- You want to reuse agents across different workflows
- Sequential human review makes sense between stages
- Different parts of the workflow have different response-time requirements
Bad reasons for multi-agent:
- It sounds more sophisticated
- You saw a demo that looked cool
- You think multiple agents are "smarter"
A well-designed single agent often beats a poorly-designed multi-agent system. Default to simplicity.
Architecture Patterns
Pattern 1: Pipeline
Agents in sequence, each transforming output for the next.
Input → Agent A → Output A → Agent B → Output B → Agent C → Final Output
Use when:
- Tasks have clear sequential stages
- Each stage's output is the next stage's input
- No feedback loops needed
Example: Document processing
Document → Extraction Agent → Structured Data →
           Validation Agent → Validated Data →
           Action Agent → System Updates
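The document-processing pipeline above can be sketched in Python as a list of callables applied in order. This is a minimal sketch under stated assumptions, not a definitive implementation: the three agent functions, the `Stage` alias, `run_pipeline`, and the comma-separated input format are hypothetical stand-ins for real LLM-backed agents.

```python
from typing import Callable

# A stage takes the previous stage's output and returns its own.
Stage = Callable[[dict], dict]

def extraction_agent(doc: dict) -> dict:
    # Hypothetical stand-in for an LLM call that extracts fields from raw text.
    name, amount = doc["text"].split(",")
    return {"customer": name.strip(), "amount": amount.strip()}

def validation_agent(data: dict) -> dict:
    # Reject records whose amount is not a plain decimal number.
    if not data["amount"].replace(".", "", 1).isdigit():
        raise ValueError(f"invalid amount: {data['amount']!r}")
    return {**data, "amount": float(data["amount"])}

def action_agent(data: dict) -> dict:
    # Hypothetical stand-in for a system update; it only reports what it would do.
    return {"updated": data["customer"], "amount": data["amount"]}

def run_pipeline(stages: list[Stage], payload: dict) -> dict:
    # Feed each stage's output into the next stage.
    for stage in stages:
        payload = stage(payload)
    return payload

result = run_pipeline(
    [extraction_agent, validation_agent, action_agent],
    {"text": "Acme Ltd, 120.50"},
)
```

Because each stage only sees the previous stage's output, any stage can be tested or replaced on its own.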
Pattern 2: Router/Dispatcher
Central agent routes to specialists based on request type.
                     ┌→ Specialist A →┐
Input → Router Agent →→ Specialist B →→ Output
                     └→ Specialist C →┘
Use when:
- Requests fall into distinct categories
- Specialists have different capabilities
- You want to add specialists without changing routing logic
Example: Customer service
Customer Query → Router →→ Billing Specialist
                        →→ Technical Specialist
                        →→ Sales Specialist
                        →→ General Agent
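A router can be as small as a classifier plus a registry of handlers. The sketch below assumes a keyword-based `classify` function in place of an LLM classifier; the specialist functions, the `SPECIALISTS` registry, and the keywords are all hypothetical.

```python
from typing import Callable

Handler = Callable[[str], str]

# Hypothetical specialists; each would wrap its own prompt and tools.
def billing_specialist(query: str) -> str:
    return f"[billing] {query}"

def technical_specialist(query: str) -> str:
    return f"[technical] {query}"

def sales_specialist(query: str) -> str:
    return f"[sales] {query}"

def general_agent(query: str) -> str:
    return f"[general] {query}"

# Registry of specialists: adding one is a new entry, not a change to route().
SPECIALISTS: dict[str, Handler] = {
    "billing": billing_specialist,
    "technical": technical_specialist,
    "sales": sales_specialist,
}

def classify(query: str) -> str:
    # Keyword matching stands in for an LLM classification call.
    lowered = query.lower()
    if "invoice" in lowered or "refund" in lowered:
        return "billing"
    if "error" in lowered or "crash" in lowered:
        return "technical"
    if "pricing" in lowered or "upgrade" in lowered:
        return "sales"
    return "general"

def route(query: str) -> str:
    # Unknown categories fall through to the general agent.
    handler = SPECIALISTS.get(classify(query), general_agent)
    return handler(query)
```

With an LLM classifier that returns a category name, the registry lookup is the only routing logic, so new specialists do not require changes to `route`.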
Pattern 3: Orchestrator/Worker
Orchestrator agent plans and coordinates worker agents.
                    ┌→ Worker A
Task → Orchestrator →→ Worker B → Orchestrator → Output
                    └→ Worker C
Use when:
- Tasks require dynamic planning
- Multiple workers may be needed in varying combinations
- Coordination logic is complex
Example: Research task
Research Question → Orchestrator → Plans approach
                                 → Dispatches to Web Search Agent
                                 → Dispatches to Document Search Agent
                                 → Synthesises results
                                 → Identifies gaps
                                 → Dispatches for follow-up
                                 → Compiles final answer
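The research loop above (plan, dispatch, look for gaps, follow up) might look like the following. This is a rough sketch: `plan` and `find_gaps` are hypothetical stand-ins for LLM calls, the two workers return canned strings, and `max_rounds` is an assumed safety limit so the loop always terminates.

```python
from typing import Callable

Worker = Callable[[str], list[str]]

def web_search_agent(question: str) -> list[str]:
    return [f"web result for '{question}'"]

def document_search_agent(question: str) -> list[str]:
    return [f"document result for '{question}'"]

WORKERS: dict[str, Worker] = {
    "web": web_search_agent,
    "docs": document_search_agent,
}

def plan(question: str) -> list[str]:
    # Stand-in for an LLM planning step: decide which workers to use.
    return ["web", "docs"]

def find_gaps(question: str, findings: list[str]) -> list[str]:
    # Stand-in for an LLM gap check: ask one follow-up while evidence is thin.
    return [f"{question} (follow-up)"] if len(findings) < 3 else []

def orchestrate(question: str, max_rounds: int = 3) -> list[str]:
    findings: list[str] = []
    pending = [question]
    for _ in range(max_rounds):
        if not pending:
            break
        # Dispatch every pending question to the workers chosen by the plan.
        for item in pending:
            for worker_name in plan(item):
                findings.extend(WORKERS[worker_name](item))
        pending = find_gaps(question, findings)
    return findings

findings = orchestrate("battery recycling rates")
```

The round limit matters in practice: an orchestrator that can always find another gap will otherwise loop indefinitely.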
Pattern 4: Supervisor/Review
Work agents do the tasks; supervisor agents review and approve the results.
Task → Work Agent → Draft → Supervisor Agent → Approved? → Yes → Output
                                                         → No → Feedback → Work Agent
Use when:
- Quality control is important
- Human-like review is valuable
- Different expertise needed for doing vs reviewing
Example: Content generation
Content Request → Writer Agent → Draft →
                  Editor Agent → Feedback →
                  Writer Agent → Revised →
                  Compliance Agent → Approved → Output
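The review cycle is essentially a bounded loop: draft, review, revise, repeat. The sketch below assumes a single supervisor and a hypothetical `max_revisions` cap; `writer_agent`, `supervisor_agent`, and the disclaimer rule are illustrative stand-ins for LLM-backed agents.

```python
from dataclasses import dataclass

@dataclass
class Review:
    approved: bool
    feedback: str = ""

def writer_agent(request: str, feedback: str = "") -> str:
    # Stand-in for an LLM writer that revises when given feedback.
    draft = f"Draft about {request}"
    if feedback:
        draft += " (with disclaimer)"
    return draft

def supervisor_agent(draft: str) -> Review:
    # Stand-in for a reviewer enforcing a single rule.
    if "disclaimer" in draft:
        return Review(approved=True)
    return Review(approved=False, feedback="Add a disclaimer.")

def review_loop(request: str, max_revisions: int = 3) -> str:
    feedback = ""
    # One initial draft plus up to max_revisions revisions.
    for _ in range(max_revisions + 1):
        draft = writer_agent(request, feedback)
        review = supervisor_agent(draft)
        if review.approved:
            return draft
        feedback = review.feedback
    raise RuntimeError("draft not approved within the revision limit")

approved_draft = review_loop("pricing changes")
```

Capping revisions keeps a writer and reviewer that disagree from looping forever; the exception is the natural place to hand over to a human.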
Coordination Mechanisms
Shared State
Agents communicate through shared state:
from dataclasses import dataclass

@dataclass
class WorkflowState:
    task_id: str
    input_data: dict
    agent_outputs: dict  # {agent_name: output}
    current_stage: str
    metadata: dict
Each agent reads what it needs and writes its own output. The orchestrator manages transitions between stages.
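To illustrate how agents and an orchestrator might share such a state object, here is a minimal sketch. The default values, the `extract` and `summarise` agents, the `STAGES` list, and the `run` function are assumptions added for the example.

```python
from dataclasses import dataclass, field

@dataclass
class WorkflowState:
    task_id: str
    input_data: dict
    agent_outputs: dict = field(default_factory=dict)  # {agent_name: output}
    current_stage: str = "pending"
    metadata: dict = field(default_factory=dict)

def extract(state: WorkflowState) -> dict:
    # Reads only the workflow input.
    return {"words": state.input_data["text"].split()}

def summarise(state: WorkflowState) -> dict:
    # Reads the output that the extract agent wrote earlier.
    return {"word_count": len(state.agent_outputs["extract"]["words"])}

STAGES = [("extract", extract), ("summarise", summarise)]

def run(state: WorkflowState) -> WorkflowState:
    # The orchestrator owns stage transitions; agents only return output.
    for name, agent in STAGES:
        state.current_stage = name
        state.agent_outputs[name] = agent(state)
    state.current_stage = "done"
    return state

state = run(WorkflowState(task_id="t1", input_data={"text": "refund not received"}))
```

Keying `agent_outputs` by agent name means no agent can overwrite another's result by accident.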
Message Passing
Agents communicate through explicit messages:
from dataclasses import dataclass

@dataclass
class AgentMessage:
    sender: str
    recipient: str
    message_type: str  # "request", "response", "error"
    content: dict
    correlation_id: str
This is more explicit than shared state and works better for distributed systems, where agents may not share memory.
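A request/response exchange using messages like these could be sketched with in-process queues as mailboxes. In a distributed system the queues would be replaced by a message broker; the `mailboxes` dict, `send`, and `billing_step` are hypothetical names for illustration.

```python
import uuid
from dataclasses import dataclass
from queue import Queue

@dataclass
class AgentMessage:
    sender: str
    recipient: str
    message_type: str  # "request", "response", "error"
    content: dict
    correlation_id: str

# One mailbox per agent; a stand-in for a real message broker.
mailboxes: dict[str, Queue] = {"router": Queue(), "billing": Queue()}

def send(message: AgentMessage) -> None:
    mailboxes[message.recipient].put(message)

def billing_step() -> None:
    # Take one request and reply to its sender with the same correlation ID.
    request = mailboxes["billing"].get_nowait()
    send(AgentMessage(
        sender="billing",
        recipient=request.sender,
        message_type="response",
        content={"account": request.content["account"], "balance": 42.0},
        correlation_id=request.correlation_id,
    ))

correlation_id = str(uuid.uuid4())
send(AgentMessage("router", "billing", "request", {"account": "A-1"}, correlation_id))
billing_step()
reply = mailboxes["router"].get_nowait()
```

Echoing the `correlation_id` on the response lets the sender match replies to requests even when several are in flight.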
Event-Driven
Agents react to events:
# `event_bus`, `Event`, and `subscribe` are placeholders for your event framework

# Agent A completes and publishes an event
event_bus.publish(Event(
    type="extraction_complete",
    data=extracted_data,
))

# Agent B subscribes and reacts
@subscribe("extraction_complete")
def handle_extraction(event):
    # Process the extracted data
    pass
Good for loose coupling and async processing.
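For a self-contained picture of the publish/subscribe mechanics, here is a minimal in-process event bus. It is a sketch only: the `EventBus` class is hypothetical and synchronous, whereas a production system would typically use a message broker or an async framework.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Event:
    type: str
    data: dict = field(default_factory=dict)

class EventBus:
    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, event_type: str):
        # Decorator that registers a handler for one event type.
        def decorator(handler: Callable[[Event], None]):
            self._handlers[event_type].append(handler)
            return handler
        return decorator

    def publish(self, event: Event) -> None:
        # Deliver to every registered handler; unknown types are ignored.
        for handler in self._handlers.get(event.type, []):
            handler(event)

event_bus = EventBus()
validated: list[str] = []

@event_bus.subscribe("extraction_complete")
def handle_extraction(event: Event) -> None:
    # Agent B reacts without Agent A knowing it exists.
    validated.append(event.data["invoice_id"])

event_bus.publish(Event(type="extraction_complete", data={"invoice_id": "INV-7"}))
event_bus.publish(Event(type="unrelated_event"))
```

The publisher never references the subscriber, which is what makes it easy to add or remove agents later.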
Implementation Considerations
Agent Identity and Scope
Each agent should have:
- Clear responsibility: One thing it does well
- Defined inputs/outputs: Contract with other agents
- Appropriate permissions: Only what it needs
- Independent testability: Can test in isolation
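One way to make these four properties concrete is to declare them in code and check them at call time. The sketch below is an assumption about how that could look; `AgentSpec`, `invoke`, `check_tool`, and the sentiment example are hypothetical names, not part of any particular framework.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class AgentSpec:
    name: str                    # clear responsibility
    required_inputs: frozenset   # input side of the contract
    produced_outputs: frozenset  # output side of the contract
    allowed_tools: frozenset     # permissions: only what the agent needs
    run: Callable[[dict], dict]

def check_tool(spec: AgentSpec, tool: str) -> None:
    if tool not in spec.allowed_tools:
        raise PermissionError(f"{spec.name} may not use {tool}")

def invoke(spec: AgentSpec, payload: dict) -> dict:
    # Enforce the contract on the way in and on the way out.
    missing = spec.required_inputs - set(payload)
    if missing:
        raise ValueError(f"{spec.name} missing inputs: {sorted(missing)}")
    output = spec.run(payload)
    absent = spec.produced_outputs - set(output)
    if absent:
        raise ValueError(f"{spec.name} did not produce: {sorted(absent)}")
    return output

sentiment_spec = AgentSpec(
    name="sentiment",
    required_inputs=frozenset({"message"}),
    produced_outputs=frozenset({"sentiment"}),
    allowed_tools=frozenset(),  # read-only agent, no tools
    run=lambda p: {"sentiment": "negative" if "angry" in p["message"] else "neutral"},
)

result = invoke(sentiment_spec, {"message": "I am angry about my bill"})
```

Because the spec carries its own `run` callable, the agent can be exercised in isolation with a plain dict, which covers the testability point.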
Error Handling Across Agents
What happens when Agent B fails mid-workflow?
Options:
- Retry: Agent B tries again (for transient failures)
- Fallback: Alternative agent or approach
- Compensate: Undo what Agent A did
- Escalate: Human takes over
- Partial success: Return what completed
Design error handling explicitly. Multi-agent failures are complex.
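Three of these options (retry, fallback, escalate) can be combined in one wrapper. This is a sketch under assumptions: `TransientError`, `run_with_recovery`, and the status strings are hypothetical, and errors other than `TransientError` from the primary agent are deliberately left to propagate.

```python
import time
from typing import Callable

class TransientError(Exception):
    """Hypothetical marker for failures worth retrying, such as timeouts."""

def run_with_recovery(
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
    payload: str,
    retries: int = 2,
    base_delay: float = 0.0,
) -> dict:
    # Retry: give the primary agent a few attempts with exponential backoff.
    for attempt in range(retries + 1):
        try:
            return {"status": "ok", "result": primary(payload)}
        except TransientError:
            if attempt < retries:
                time.sleep(base_delay * 2 ** attempt)
    # Fallback: retries are exhausted, so try the alternative agent.
    try:
        return {"status": "fallback", "result": fallback(payload)}
    except Exception as exc:
        # Escalate: both paths failed, so flag the task for a human.
        return {"status": "escalated", "reason": str(exc)}

calls: list[str] = []

def flaky_agent(payload: str) -> str:
    calls.append(payload)
    raise TransientError("model timeout")

def rules_agent(payload: str) -> str:
    return f"handled {payload} with rules"

outcome = run_with_recovery(flaky_agent, rules_agent, "ticket-1")
```

Returning an explicit status, rather than raising, lets the orchestrator decide whether a fallback result is good enough for the rest of the workflow.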
Observability
Debugging a multi-agent system is harder than debugging a single agent. Three things help:
Log everything:
- Inter-agent messages
- State transitions
- Agent decisions and reasoning
- Timing information
Correlation IDs: Track a request across all agents
Visualisation: Tools to see workflow execution
Without good observability, debugging production issues is close to impossible.
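As one possible shape for this, the sketch below wraps each agent call so that inputs, outputs, and timing are logged as JSON under a shared correlation ID. The `traced` and `log_event` helpers and the in-memory `trace_log` list are hypothetical; a real system would ship these records to a log or tracing backend.

```python
import json
import logging
import time
import uuid
from typing import Callable

logger = logging.getLogger("workflow")
trace_log: list[dict] = []  # in-memory stand-in for a real log sink

def log_event(correlation_id: str, agent: str, event: str, **fields) -> None:
    record = {"correlation_id": correlation_id, "agent": agent, "event": event, **fields}
    trace_log.append(record)
    logger.info(json.dumps(record))

def traced(agent_name: str, agent: Callable[[str], str], payload: str, correlation_id: str) -> str:
    # Log the input, run the agent, then log the output and elapsed time.
    log_event(correlation_id, agent_name, "start", input=payload)
    started = time.perf_counter()
    output = agent(payload)
    elapsed_ms = round((time.perf_counter() - started) * 1000, 2)
    log_event(correlation_id, agent_name, "finish", output=output, elapsed_ms=elapsed_ms)
    return output

correlation_id = str(uuid.uuid4())
intent = traced("intent_classifier", lambda text: "billing", "Where is my refund?", correlation_id)
reply = traced("billing_agent", lambda label: f"handling {label}", intent, correlation_id)
```

Filtering the records by `correlation_id` reconstructs the full path of a single request across every agent it touched.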
Performance
Multi-agent adds latency. Each agent hop has overhead:
- LLM inference time
- Network latency
- Serialisation/deserialisation
Strategies:
- Parallel execution where possible
- Cache agent outputs
- Right-size LLMs (don't use GPT-4 for simple routing)
- Set timeout budgets per workflow
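Parallel execution and a timeout budget can both be expressed with the standard `asyncio` module. In this sketch the two agents are hypothetical and simulate latency with `asyncio.sleep`; `run_parallel` and `budget_seconds` are names invented for the example.

```python
import asyncio

async def account_agent(query: str) -> str:
    await asyncio.sleep(0.05)  # simulated LLM and network latency
    return "account: active"

async def knowledge_agent(query: str) -> str:
    await asyncio.sleep(0.05)
    return "policy: 30-day refunds"

async def run_parallel(query: str, budget_seconds: float) -> list[str]:
    # gather() runs both agents concurrently; wait_for() enforces one
    # shared budget and raises asyncio.TimeoutError when it is exceeded.
    return await asyncio.wait_for(
        asyncio.gather(account_agent(query), knowledge_agent(query)),
        timeout=budget_seconds,
    )

results = asyncio.run(run_parallel("refund status", budget_seconds=1.0))
```

Run concurrently, the two independent calls take roughly as long as the slower one rather than the sum of both. `asyncio.gather` returns results in the order the awaitables were passed, regardless of which finishes first.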
Real-World Example
Here's a multi-agent system we designed for complex customer service:
┌───────────────────────────────────────────────────────────────┐
│                         Orchestrator                          │
│  - Receives customer message                                  │
│  - Maintains conversation state                               │
│  - Decides which specialists to invoke                        │
│  - Synthesises final response                                 │
└───────────────────────────────┬───────────────────────────────┘
                                │
        ┌───────────────────────┼───────────────────────┐
        ↓                       ↓                       ↓
┌───────────────┐       ┌───────────────┐       ┌───────────────┐
│    Intent     │       │    Account    │       │   Knowledge   │
│  Classifier   │       │     Agent     │       │     Agent     │
└───────┬───────┘       └───────┬───────┘       └───────┬───────┘
        │                       │                       │
        │               ┌───────┴───────┐     ┌─────────┴─────────┐
        │               │ CRM, Billing, │     │  Knowledge Base,  │
        │               │ Order Systems │     │ Policies, Product │
        │               └───────────────┘     │   Documentation   │
        │                                     └───────────────────┘
        │               ┌───────────────┐
        └──────────────→│    Action     │
                        │     Agent     │
                        └───────┬───────┘
                                │
                       ┌────────┴────────┐
                       │ Execute:        │
                       │ - Updates       │
                       │ - Escalations   │
                       │ - Notifications │
                       └─────────────────┘
The orchestrator coordinates specialists. Each specialist is focused and testable. The system handles more complexity than a single agent could manage coherently.
When Multi-Agent Fails
Common failure modes:
Over-engineering: Building multi-agent for simple tasks. Adds latency and complexity without benefit.
Poor coordination: Agents work at cross-purposes and shared state becomes inconsistent.
Unclear responsibility: Multiple agents think they should handle something. Or none do.
Cascading failures: One agent fails and, without retries or fallbacks, the whole workflow fails with it.
Debugging nightmares: Can't figure out what went wrong across agent interactions.
Start simple. Add agents only when single-agent is clearly insufficient.
Getting Started with Multi-Agent
If you're considering multi-agent:
- Exhaust single-agent options first: Is it really necessary?
- Map the workflow: What are the distinct stages/responsibilities?
- Choose the pattern: Pipeline, router, orchestrator, or hybrid?
- Design contracts: Clear inputs/outputs for each agent
- Build observability first: You'll need it
- Test extensively: Especially error paths
We've built multi-agent systems for complex workflows. Happy to discuss your architecture.