====== Single vs Multi-Agent Architectures ======
Choosing between single-agent and multi-agent architectures is a critical design decision that affects cost, latency, reliability, and maintainability. This guide provides a research-backed framework for that choice, based on published benchmarks and real-world deployments.
===== Overview =====
* **Single Agent** — One LLM-powered agent handles the entire workflow with access to all tools and context. Simple, fast, predictable.
* **Orchestrator + Specialists(([[https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ai-agents/single-agent-multiple-agents|Microsoft Azure - Single Agent vs Multiple Agents]]))** — A central coordinator delegates subtasks to domain-specific agents. Modular, scalable, but adds coordination overhead.
* **Peer-to-Peer** — Autonomous agents collaborate via message passing or shared memory. Maximum parallelism, hardest to debug.
===== Decision Tree =====
<code>
graph TD
    A[Start: Define Your Task] --> B{How many distinct\ndomains or skills?}
    B -->|1-2| C{Workflow\npredictable?}
    B -->|3+| D{Tasks\nindependent?}
    C -->|Yes| E[Single Agent]
    C -->|No| F{Context window\nsufficient?}
    F -->|Yes| E
    F -->|No| G[Orchestrator + Specialists]
    D -->|Yes| H{Need real-time\nparallelism?}
    D -->|No| G
    H -->|Yes| I[Peer-to-Peer]
    H -->|No| G
    E --> J{Scaling beyond\n10K ops per day?}
    J -->|Yes| K[Consider Multi-Agent]
    J -->|No| L[Stay Single Agent]
    style E fill:#4CAF50,color:#fff
    style G fill:#FF9800,color:#fff
    style I fill:#9C27B0,color:#fff
    style L fill:#4CAF50,color:#fff
    style K fill:#FF9800,color:#fff
</code>
===== Architecture Comparison =====
^ Factor ^ Single Agent ^ Orchestrator + Specialists ^ Peer-to-Peer ^
| **Complexity** | Low | Medium | High |
| **Latency** | Lowest (1 LLM call) | Medium (2-5 LLM calls) | Variable (parallel) |
| **Token Cost** | 1x baseline | 3-5x baseline | 10-15x baseline |
| **Debugging** | Simple, unified logs | Moderate, trace per agent | Hard, distributed tracing |
| **Failure Mode** | Single point of failure | Isolated failures, graceful degradation | Partial ops continue |
| **Scalability** | Limited by context window | Good with agent specialization | Best for high parallelism |
| **Accuracy (SWE-bench)** | ~65% | ~72% | Similar to orchestrator |
| **Best For** | Sequential, well-defined tasks | Multi-domain workflows | High-throughput parallel tasks |
//Sources: SWE-bench Verified 2025, Redis engineering blog, Microsoft Azure architecture guidance//
===== Complexity Thresholds =====
Use these rules of thumb based on published research:
^ Indicator ^ Single Agent ^ Multi-Agent ^
| Tool/function count | 1-5 | 6+ |
| Distinct knowledge domains | 1-2 | 3+ |
| Daily operations | Less than 10,000 | Over 10,000 |
| Context required | Fits one window | Exceeds context limits |
| Task independence | Sequential | Parallelizable |
| Error tolerance | Can retry whole workflow | Needs isolated recovery |
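The thresholds above can be encoded as a simple scoring helper. This is a hypothetical sketch, not a published heuristic: the field names, the two-signal cutoff, and ''recommend_architecture'' itself are illustrative choices.

<code python>
# Hypothetical helper encoding the threshold table above.
# The ">= 2 signals" cutoff is an assumption, not a published rule.
from dataclasses import dataclass

@dataclass
class TaskProfile:
    tool_count: int          # Number of tools/functions the agent needs
    knowledge_domains: int   # Distinct domains of expertise required
    daily_ops: int           # Expected operations per day
    fits_one_context: bool   # Does all required context fit one window?
    parallelizable: bool     # Can subtasks run independently?

def recommend_architecture(p: TaskProfile) -> str:
    multi_agent_signals = [
        p.tool_count >= 6,
        p.knowledge_domains >= 3,
        p.daily_ops > 10_000,
        not p.fits_one_context,
        p.parallelizable,
    ]
    # Two or more indicators pointing multi-agent tips the recommendation
    return "multi-agent" if sum(multi_agent_signals) >= 2 else "single agent"

print(recommend_architecture(TaskProfile(3, 1, 500, True, False)))   # single agent
</code>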
===== Pattern Details =====
=== Single Agent ===
One agent, one context window, all tools available. Start here.
<code python>
# Single agent with tool access (assumes the OpenAI Python SDK;
# search_tool, calculator_tool, database_tool are tool schemas defined elsewhere)
from openai import OpenAI

client = OpenAI()
tools = [search_tool, calculator_tool, database_tool]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a research assistant with access to search, calculation, and database tools."},
        {"role": "user", "content": user_query},
    ],
    tools=tools,
)
# The agent decides which tools to call, in sequence
</code>
**Strengths**: Low latency, simple debugging, minimal token overhead, easy to deploy.
**Weaknesses**: Context window limits, single point of failure, struggles with 6+ tools.
=== Orchestrator + Specialists ===
A coordinator routes tasks to domain experts. Most common multi-agent pattern.
<code python>
# Orchestrator pattern (ResearchAgent, CodingAgent, ReviewAgent are
# domain-specific agent wrappers defined elsewhere)
class Orchestrator:
    def __init__(self):
        self.agents = {
            "researcher": ResearchAgent(model="gpt-4o"),
            "coder": CodingAgent(model="claude-sonnet"),
            "reviewer": ReviewAgent(model="gpt-4o"),
        }

    def process(self, task):
        plan = self.plan(task)  # Decompose into subtasks
        results = {}
        for step in plan:
            agent = self.agents[step.agent_type]
            results[step.id] = agent.execute(
                step.instruction,
                context=self.gather_context(step, results),
            )
        return self.synthesize(results)
</code>
**Strengths**: Modular, each agent optimized for its domain, isolated failures, scalable.
**Weaknesses**: Orchestrator becomes bottleneck, 3-5x token cost, coordination latency.
=== Peer-to-Peer ===
Agents communicate directly without central control. Best for embarrassingly parallel tasks.
<code python>
# Peer-to-peer with shared memory (the shared-memory pub/sub store and
# llm_call/should_respond/respond methods are defined elsewhere)
import asyncio

class PeerAgent:
    def __init__(self, role, shared_memory):
        self.role = role
        self.memory = shared_memory  # Shared state store

    async def work(self, task):
        result = await self.llm_call(task)
        await self.memory.publish(self.role, result)
        # React to other agents' outputs
        async for update in self.memory.subscribe():
            if self.should_respond(update):
                await self.respond(update)

# Launch agents in parallel
async def main(task, mem):
    agents = [PeerAgent("analyst", mem), PeerAgent("writer", mem), PeerAgent("critic", mem)]
    await asyncio.gather(*[a.work(task) for a in agents])

asyncio.run(main(task, mem))
</code>
**Strengths**: Maximum parallelism, no central bottleneck, up to 64% throughput improvement.
**Weaknesses**: 10-15x token cost, chaotic coordination, extremely hard to debug.
===== Real-World Frameworks =====
^ Framework ^ Pattern ^ Best For ^ Key Feature ^
| **LangGraph** | Orchestrator | Stateful multi-step workflows | Graph-based state machines |
| **CrewAI** | Orchestrator + Specialists | Role-based team workflows | Agent roles and delegation |
| **AutoGen** | Peer-to-Peer | Research and collaborative tasks | Dynamic agent conversations |
| **OpenAI Swarm** | Orchestrator | Lightweight agent handoffs | Minimal coordination overhead |
| **Claude Tools** | Single Agent | Tool-heavy sequential tasks | Native tool use, large context |
===== Failure Modes and Mitigation =====
=== Single Agent Failures ===
* **Context overflow** — Mitigate with summarization or RAG
* **Tool selection errors** — Mitigate with better tool descriptions
* **Total crash** — Mitigate with retry logic and checkpointing
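The retry-and-checkpoint mitigation can be sketched as follows. This is an illustrative pattern, not a specific library: ''run_with_checkpoints'', the step callables, and the JSON checkpoint file are all placeholder names.

<code python>
# Illustrative retry-with-checkpoint loop for a single-agent workflow.
# Each step is a (name, callable) pair; results persist to a JSON file
# so a crashed run resumes from the last completed step.
import json
import time

def run_with_checkpoints(steps, checkpoint_path="checkpoint.json", max_retries=3):
    try:
        with open(checkpoint_path) as f:
            state = json.load(f)  # Resume from a previous run
    except FileNotFoundError:
        state = {"completed": [], "results": {}}

    for name, step in steps:
        if name in state["completed"]:
            continue  # Already done in a previous run
        for attempt in range(max_retries):
            try:
                state["results"][name] = step(state["results"])
                state["completed"].append(name)
                with open(checkpoint_path, "w") as f:
                    json.dump(state, f)  # Persist after every step
                break
            except Exception:
                if attempt == max_retries - 1:
                    raise  # Out of retries; surface the failure
                time.sleep(2 ** attempt)  # Exponential backoff
    return state["results"]
</code>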
=== Multi-Agent Failures ===
* **Coordination deadlock** — Mitigate with timeouts and fallback agents
* **Message storms** — Mitigate with rate limiting and turn-taking protocols
* **Inconsistent state** — Mitigate with shared memory and consensus mechanisms
* **Cascading failures** — Mitigate with circuit breakers per agent
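A per-agent circuit breaker, the last mitigation above, can be sketched like this. The class, thresholds, and cooldown values are illustrative assumptions; it wraps any agent call and stops routing to an agent that keeps failing until a cooldown elapses.

<code python>
# Minimal per-agent circuit breaker (illustrative; thresholds and
# cooldowns would be tuned per deployment).
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=3, cooldown=30.0):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open: agent temporarily disabled")
            self.opened_at = None  # Cooldown elapsed; allow a trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()  # Trip the breaker
            raise
        self.failures = 0  # Any success closes the circuit
        return result
</code>

The orchestrator would hold one breaker per specialist agent, so a failing coder agent trips its own breaker without taking the researcher or reviewer offline.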
===== Performance Benchmarks =====
From published 2025-2026 research:
* Multi-agent coding systems score **72.2% on SWE-bench Verified** vs ~65% for single agents using the same base model(([[https://arxiv.org/html/2509.10769v1|AgentArch Benchmark - Evaluating Agent Architectures]])) (([[https://redis.io/blog/single-agent-vs-multi-agent-systems/|Redis - Single Agent vs Multi-Agent Systems]]))
* Multi-agent systems show **23% higher accuracy** on complex reasoning tasks(([[https://vibecoding.app/blog/multi-agent-vs-single-agent-coding|VibeCoding - Multi-Agent vs Single-Agent Coding]]))
* Multi-agent throughput improvement of **up to 64%** for parallelizable workloads
* Multi-agent token cost is **10-15x higher** for complex requests
* Single-agent latency is typically **2-5x lower** than orchestrated systems
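The token-cost multipliers above translate directly into budget. A back-of-envelope sketch, where the per-1K-token price and request volumes are placeholder numbers, not provider quotes:

<code python>
# Back-of-envelope monthly cost comparison using the multipliers above.
# The $/1K-token price and traffic figures are placeholders.
def monthly_token_cost(requests_per_day, tokens_per_request,
                       cost_per_1k_tokens, multiplier=1.0):
    daily_tokens = requests_per_day * tokens_per_request * multiplier
    return daily_tokens / 1000 * cost_per_1k_tokens * 30

single = monthly_token_cost(1000, 5000, 0.01, multiplier=1.0)   # 1x baseline
multi  = monthly_token_cost(1000, 5000, 0.01, multiplier=4.0)   # orchestrator, ~3-5x
print(f"single agent: ${single:,.0f}/mo, orchestrator: ${multi:,.0f}/mo")
</code>

At these placeholder numbers the orchestrator costs four times as much per month, so the 7-point SWE-bench gain has to be worth that delta for your workload.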
===== Key Takeaways =====
- **Default to single agent**. It covers 80% of use cases with lower cost and complexity.
- **Add agents for specialization**, not just because you can. Each agent should have a clear, distinct role.
- **Orchestrator + Specialists** is the most practical multi-agent pattern for production.
- **Peer-to-peer** is rarely needed outside high-throughput parallel processing.
- **Measure the tradeoff**: multi-agent gains 7-23% accuracy but costs 3-15x more tokens.
===== See Also =====
* [[when_to_use_rag_vs_fine_tuning|When to Use RAG vs Fine-Tuning]] — Choosing knowledge strategies
* [[how_to_structure_system_prompts|How to Structure System Prompts]] — Prompt design for agents
* [[how_to_choose_chunk_size|How to Choose Chunk Size]] — RAG optimization
===== References =====