This shows you the differences between two versions of the page.
| agent_design_patterns [2026/03/31 02:02] – Comprehensive AI Agent Design Patterns index page agent | agent_design_patterns [2026/03/31 02:04] (current) – Create comprehensive agent design patterns article agent | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | ====== | + | ====== Agent Design Patterns ====== |
| - | **AI Agent Design Patterns** | + | Agent design patterns |
| - | This page serves as the **definitive index** of agent design | + | The patterns |
| - | > "I think AI agent workflows will drive massive AI progress this year -- perhaps even more than the next generation of foundation models." | + | ===== Core Agentic Patterns ===== |
| - | ---- | + | These are the foundational patterns identified by Andrew Ng and widely adopted across the agent-building community((Andrew Ng, "Four AI Agent Strategies That Improve GPT-4 and GPT-3.5", DeepLearning.AI, The Batch newsletter, 2024 [[https://www.deeplearning.ai/the-batch/|DeepLearning.AI]])). They represent the building blocks from which more complex agent architectures are composed. |
| - | + | ||
| - | ===== Core Agentic Patterns | + | |
| - | + | ||
| - | In early 2024, Andrew Ng identified four foundational agentic design patterns that enable LLM-based agents to dramatically outperform zero-shot prompting -- in some cases allowing | + | |
| ==== Reflection ==== | ==== Reflection ==== | ||
| - | An agent critiques its own output and uses that feedback to iteratively | + | Reflection is a pattern where an agent critiques its own output and iteratively |
| - | + | ||
| - | * **When | + | |
| - | * **Frameworks:** LangGraph (Evaluator-Optimizer workflow), Reflexion, AutoGen self-critique loops. | + | |
| - | * **See also: | + | |
| ==== Tool Use / ReAct ==== | ==== Tool Use / ReAct ==== | ||
| - | The agent dynamically selects and invokes | + | The Tool Use pattern enables agents to call external tools — APIs, databases, code interpreters, search engines — within a structured reason-act |
| - | + | ||
| - | * **When to use:** Any task requiring | + | |
| - | * **Frameworks: | + | |
| - | * **See also: | + | |
| ==== Planning ==== | ==== Planning ==== | ||
| - | The agent decomposes | + | Planning is the pattern of decomposing |
| - | + | ||
| - | * **When to use:** Multi-step tasks, research workflows, software development, | + | |
| - | * **Frameworks: | + | |
| - | * **See also: | + | |
| ==== Multi-Agent Collaboration ==== | ==== Multi-Agent Collaboration ==== | ||
| - | Multiple | + | Multi-Agent Collaboration involves multiple |
| - | * **When to use:** Complex workflows exceeding a single agent' | + | ==== Human-in-the-Loop ==== |
| - | * **Frameworks: | + | |
| - | * **See also:** [[multi_agent_systems|Multi-Agent Systems]] | + | |
| - | ---- | + | Human-in-the-Loop (HITL) is the pattern of incorporating human oversight, approval gates, or feedback injection into the agent' |
| ===== Reasoning Patterns ===== | ===== Reasoning Patterns ===== | ||
| - | Reasoning patterns | + | Reasoning patterns |
| ==== Chain of Thought (CoT) ==== | ==== Chain of Thought (CoT) ==== | ||
| - | The foundational reasoning pattern. The model generates explicit | + | Chain of Thought prompts the model to produce |
| - | + | ||
| - | * **When to use:** Arithmetic, commonsense reasoning, any task where step-by-step reasoning reduces errors. | + | |
| - | * **Frameworks: | + | |
| ==== Tree of Thoughts (ToT) ==== | ==== Tree of Thoughts (ToT) ==== | ||
| - | Extends | + | Tree of Thoughts extends |
| - | + | ||
| - | * **When to use:** Creative problem-solving, | + | |
| - | * **Frameworks: | + | |
| ==== Graph of Thoughts (GoT) ==== | ==== Graph of Thoughts (GoT) ==== | ||
| - | Generalizes ToT by modeling | + | Graph of Thoughts generalizes Tree of Thoughts |
| - | + | ||
| - | * **When to use:** Complex | + | |
| - | * **Frameworks: | + | |
| - | + | ||
| - | ==== Chain of Draft ==== | + | |
| - | A token-efficient variant of CoT where the model generates minimal, concise intermediate steps rather than verbose reasoning traces. Achieves comparable accuracy to CoT while using significantly fewer tokens. ((source [[https:// | + | ==== Chain of Draft (CoD) ==== |
| - | * **When to use:** Cost-sensitive or latency-sensitive applications | + | Chain of Draft is an efficiency-oriented variant of CoT where the agent produces minimal, abbreviated reasoning steps rather than verbose explanations. Each intermediate step contains only the essential information needed to advance the reasoning. This preserves the accuracy benefits of CoT while significantly reducing token usage and latency. Use CoD when you need CoT-level |
| - | * **Frameworks: | + | |
| ==== Self-Consistency ==== | ==== Self-Consistency ==== | ||
| - | Samples | + | Self-Consistency generates |
| - | * **When to use:** High-stakes decisions where a single reasoning trace may be unreliable; math and logic problems. | + | ==== ReAct ==== |
| - | * **Frameworks: | + | |
| - | ==== ReAct (Reason + Act) ==== | + | ReAct interleaves |
| - | + | ||
| - | Interleaves | + | |
| - | + | ||
| - | * **When to use:** The default | + | |
| - | * **Frameworks: | + | |
| - | * **See also: | + | |
| ==== Reflexion ==== | ==== Reflexion ==== | ||
| - | Extends ReAct with an explicit | + | Reflexion adds a verbal |
| - | + | ||
| - | * **When to use:** Multi-attempt tasks like coding challenges, where learning from mistakes significantly improves success rate. | + | |
| - | * **Frameworks: | + | |
| - | * **See also: | + | |
| ==== Self-Refine ==== | ==== Self-Refine ==== | ||
| - | A single-model iterative refinement loop: the LLM generates output, critiques it, then refines it -- without external | + | Self-Refine is a single-agent iterative refinement loop: generate, get feedback, refine. The same model both produces output and critiques it, using structured |
| - | + | ||
| - | * **When to use:** Tasks where iterative polishing | + | |
| - | * **Frameworks: | + | |
| - | + | ||
| - | ---- | + | |
| ===== Orchestration Patterns ===== | ===== Orchestration Patterns ===== | ||
| - | Orchestration patterns | + | Orchestration patterns |
| - | + | ||
| - | ==== Supervisor / Manager Pattern ==== | + | |
| - | A central supervisor agent receives user requests, decomposes them into sub-tasks, delegates to specialized worker agents, reviews results, and synthesizes a final response. The supervisor has delegation tools; workers have domain-specific tools. ((source [[https:// | + | ==== Supervisor |
| - | * **When to use:** Production systems requiring deterministic control, quality review of worker | + | A central supervisor agent receives tasks, delegates them to specialized |
| - | * **Frameworks: | + | |
| ==== Peer-to-Peer / Swarm ==== | ==== Peer-to-Peer / Swarm ==== | ||
| - | Agents communicate directly with each other without a central coordinator. Each agent decides independently when to hand off work, creating | + | In the swarm pattern, agents operate as equals |
| - | + | ||
| - | * **When to use:** Creative/ | + | |
| - | * **Frameworks: | + | |
| ==== Hierarchical Delegation ==== | ==== Hierarchical Delegation ==== | ||
| - | A multi-level tree of supervisor | + | Hierarchical delegation extends the supervisor |
| - | + | ||
| - | * **When to use:** Enterprise-scale workflows with a clearly defined organizational structure (e.g., research, then analysis, then writing, then review). | + | |
| - | * **Frameworks: | + | |
| ==== Pipeline / Sequential ==== | ==== Pipeline / Sequential ==== | ||
| - | Agents process tasks in a fixed linear order, with each agent' | + | The pipeline pattern chains agents |
| - | + | ||
| - | * **When to use:** Data ETL, document processing pipelines, any workflow with clear sequential dependencies. | + | |
| - | * **Frameworks: | + | |
| - | + | ||
| - | ==== Map-Reduce for Agents ==== | + | |
| - | + | ||
| - | A coordinator fans out identical or varied sub-tasks | + | |
| - | + | ||
| - | * **When to use:** Bulk document analysis, parallel research across multiple sources, any task with independent sub-problems. | + | |
| - | * **Frameworks: | + | |
| - | ==== Router Pattern (Semantic Routing) | + | ==== Map-Reduce |
| - | An LLM-based router classifies incoming requests and directs them to the most appropriate specialized agent or workflow. It acts as an intelligent dispatcher, using semantic understanding rather than rule-based routing. | + | Map-Reduce distributes independent subtasks across multiple agents in parallel (map phase), then aggregates their results into a final output (reduce phase). This pattern excels at processing large datasets |
| - | * **When to use:** Systems handling diverse request types that require different capabilities or toolsets. | + | ==== Router |
| - | * **Frameworks: | + | |
| - | ---- | + | A router agent analyzes incoming requests and directs them to the most appropriate specialized agent or pipeline based on the request' |
| ===== Memory Patterns ===== | ===== Memory Patterns ===== | ||
| - | Memory patterns | + | Memory patterns |
| ==== Short-Term Memory (Context Window) ==== | ==== Short-Term Memory (Context Window) ==== | ||
| - | The most basic memory: the LLM's context | + | Short-term |
| - | + | ||
| - | * **When | + | |
| - | * **Frameworks: | + | |
| ==== Long-Term Memory (Vector Store) ==== | ==== Long-Term Memory (Vector Store) ==== | ||
| - | Persistent | + | Long-term memory persists information across conversations using external |
| - | + | ||
| - | * **When to use:** Personalization, knowledge accumulation, | + | |
| - | * **Frameworks: | + | |
| ==== Episodic Memory ==== | ==== Episodic Memory ==== | ||
| - | Records | + | Episodic memory stores records of specific past experiences — complete |
| - | + | ||
| - | * **When to use:** Agents that need to learn from past successes/ | + | |
| - | * **Frameworks: | + | |
| ==== Working Memory / Scratchpad ==== | ==== Working Memory / Scratchpad ==== | ||
| - | A temporary workspace | + | Working memory provides the agent with an explicit scratchpad |
| - | + | ||
| - | * **When to use:** Complex | + | |
| - | * **Frameworks: | + | |
| - | + | ||
| - | ---- | + | |
| ===== Communication Patterns ===== | ===== Communication Patterns ===== | ||
| - | Communication patterns | + | Communication patterns |
| - | + | ||
| - | ==== Human-in-the-Loop (HITL) ==== | + | |
| - | The agent handles routine operations autonomously but escalates edge cases, high-stakes decisions, or low-confidence outputs to a human for review and approval. The level of autonomy can be tuned from full human oversight to sparse supervision. | + | ==== Human-in-the-Loop ==== |
| - | * **When to use:** Production systems | + | Human-in-the-loop communication establishes structured interaction points where the agent requests |
| - | * **Frameworks: | + | |
| ==== Agent-to-Agent Messaging ==== | ==== Agent-to-Agent Messaging ==== | ||
| - | Agents communicate | + | Agent-to-agent messaging enables direct communication between agents |
| - | + | ||
| - | * **When to use:** Cross-framework agent collaboration, enterprise systems with agents | + | |
| - | * **Frameworks: | + | |
| ==== Shared Blackboard ==== | ==== Shared Blackboard ==== | ||
| - | All agents read from and write to a shared knowledge store (the " | + | The shared blackboard pattern provides a common knowledge store that all agents |
| - | + | ||
| - | * **When to use:** Problems requiring incremental, | + | |
| - | * **Frameworks: | + | |
| ==== Event-Driven ==== | ==== Event-Driven ==== | ||
| - | Agents subscribe to event streams and react to relevant | + | Event-driven communication uses an event bus or message queue to decouple agent interactions. |
| - | + | ||
| - | * **When to use:** Real-time monitoring, workflow automation triggered by external events, scalable microservice-style agent systems. | + | |
| - | * **Frameworks: | + | |
| - | + | ||
| - | ---- | + | |
| ===== Reliability Patterns ===== | ===== Reliability Patterns ===== | ||
| - | Reliability patterns ensure | + | Reliability patterns ensure |
| ==== Retry with Backoff ==== | ==== Retry with Backoff ==== | ||
| - | When a tool call or LLM request fails, the agent retries with exponentially increasing delays. | + | When an LLM call, tool invocation, |
| - | + | ||
| - | * **When to use:** Any agent making | + | |
| - | * **Frameworks: | + | |
| ==== Fallback Chains ==== | ==== Fallback Chains ==== | ||
| - | Defines | + | Fallback chains define |
| - | + | ||
| - | * **When to use:** High-availability systems where agent failure must not block the user. | + | |
| - | * **Frameworks: | + | |
| ==== Circuit Breaker ==== | ==== Circuit Breaker ==== | ||
| - | After repeated failures of a particular tool or service, the agent stops calling | + | The circuit breaker pattern monitors failure rates for external services and temporarily |
| - | * **When to use:** Agents depending on unreliable external services; production systems with strict latency budgets. | + | ==== Guardrails / Validation ==== |
| - | * **Frameworks: | + | |
| - | ==== Guardrails | + | Guardrails |
| - | + | ||
| - | Structured checks | + | |
| - | + | ||
| - | * **When | + | |
| - | * **Frameworks: | + | |
| ==== Dual LLM (Planner + Executor) ==== | ==== Dual LLM (Planner + Executor) ==== | ||
| - | Two models | + | The dual LLM pattern separates planning from execution using two different |
| - | + | ||
| - | * **When to use:** Cost-sensitive production systems, high-throughput pipelines where not every step requires frontier-model | + | |
| - | * **Frameworks: | + | |
| - | + | ||
| - | ---- | + | |
| ===== Efficiency Patterns ===== | ===== Efficiency Patterns ===== | ||
| - | Efficiency patterns | + | Efficiency patterns |
| ==== Caching (Semantic + Exact) ==== | ==== Caching (Semantic + Exact) ==== | ||
| - | **Exact caching** stores responses keyed by identical inputs. **Semantic caching** uses embedding similarity to reuse responses | + | Caching stores the results of previous LLM calls or tool invocations for reuse. |
| - | + | ||
| - | * **When to use:** High-traffic agents with repetitive queries; cost optimization for expensive models. | + | |
| - | * **Frameworks: | + | |
| ==== Speculative Execution ==== | ==== Speculative Execution ==== | ||
| - | The agent generates | + | Speculative execution runs multiple |
| - | + | ||
| - | * **When to use:** Latency-critical | + | |
| - | * **Frameworks: | + | |
| ==== Budget-Aware Reasoning ==== | ==== Budget-Aware Reasoning ==== | ||
| - | The agent monitors its token usage, API costs, or wall-clock time and adjusts its reasoning depth accordingly. It may use cheaper models for simple sub-tasks, skip optional refinement steps, or stop early once it is confident. | + | Budget-aware reasoning constrains the agent's resource consumption by limiting the number of LLM calls, total tokens, tool invocations, or wall-clock time. The agent must reason about how to allocate its budget across subtasks |
| - | + | ||
| - | * **When to use:** Production | + | |
| - | * **Frameworks: | + | |
| ==== Parallel Tool Calling ==== | ==== Parallel Tool Calling ==== | ||
| - | The agent invokes | + | Parallel tool calling executes |
| - | + | ||
| - | * **When | + | |
| - | * **Frameworks: | + | |
| - | + | ||
| - | ---- | + | |
| - | + | ||
| - | ===== Pattern Selection Guide ===== | + | |
| - | + | ||
| - | ^ Concern ^ Start With ^ Scale To ^ | + | |
| - | | Single-agent reasoning | CoT / ReAct | ToT / Reflexion / Self-Consistency | | + | |
| - | | Multi-step tasks | Planning + Tool Use | Map-Reduce / Hierarchical Delegation | | + | |
| - | | Multi-agent coordination | Supervisor | Hierarchical / Swarm (depending on control | + | |
| - | | Memory | Short-term + RAG | Episodic + Working Memory + Knowledge Graphs | | + | |
| - | | Reliability | Guardrails + Retry | Circuit Breaker + Fallback Chains + Dual LLM | | + | |
| - | | Efficiency | Caching | Budget-Aware + Parallel Tool Calling | | + | |
| - | + | ||
| - | ---- | + | |
| ===== See Also ===== | ===== See Also ===== | ||
| - | * [[agent_loop|Agent Loop]] | ||
| * [[react_framework|ReAct Framework]] | * [[react_framework|ReAct Framework]] | ||
| + | * [[multi_agent_systems|Multi-Agent Systems]] | ||
| * [[planning|Planning]] | * [[planning|Planning]] | ||
| - | * [[multi_agent_systems|Multi-Agent Systems]] | + | * [[tool_use|Tool Use]] |
| - | * [[modular_architectures|Modular Architectures]] | + | * [[human_in_the_loop|Human-in-the-Loop]] |
| + | * [[chain_of_thought|Chain of Thought]] | ||
| + | * [[critic_self_correction|Critic & Self-Correction]] | ||
| + | * [[guardrails|Guardrails & Validation]] | ||
| + | * [[agentic_ai|Agentic AI Overview]] | ||
| ===== References ===== | ===== References ===== | ||
| - | |||
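
The Retry with Backoff entry in the diff above describes retrying a failed tool call or LLM request with exponentially increasing delays. A minimal Python sketch of that idea follows. It is an illustration under stated assumptions, not code from the wiki page: the names `retry_with_backoff` and `flaky_tool`, the default delays, the cap, and the optional jitter are all hypothetical choices made for this example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    jitter: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on any exception with exponentially growing delays.

    All parameter names and defaults are illustrative assumptions.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Delay doubles each attempt (base, 2*base, 4*base, ...), capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Optional random jitter spreads out retries from concurrent agents.
            delay += random.uniform(0.0, jitter)
            sleep(delay)
    raise AssertionError("unreachable")


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_tool() -> str:
        # Hypothetical tool that fails twice, then succeeds.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise ConnectionError("transient failure")
        return "ok"

    delays: list[float] = []
    # Record the delays instead of sleeping so the demo finishes instantly.
    result = retry_with_backoff(flaky_tool, sleep=delays.append)
    print(result, delays)  # prints: ok [0.5, 1.0]
```

Passing `sleep` as a parameter is a design choice for testability: production code would keep the default `time.sleep`, while tests can record the delays. A real agent would likely also restrict retries to errors known to be transient, rather than catching every exception as this sketch does.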