====== Modular Architectures ======
[[modular|Modular]] architectures for AI agents organize system components into discrete, interchangeable modules such as planning, memory, tool use, and perception layers. This separation of concerns allows developers to independently upgrade, test, and swap individual components without rebuilding the entire agent. By 2025-2026, modular design has become the dominant paradigm for production agent systems, supported by standardized protocols like MCP and A2A that enable interoperability across vendors and frameworks.(([[https://arxiv.org/abs/2309.02427|Sumers, T. et al. "Cognitive Architectures for Language Agents." arXiv:2309.02427, 2023.]])) Perception layers in these architectures handle multimodal input understanding and structured observation extraction, while separate reasoning layers process complex decision-making.(([[https://cobusgreyling.substack.com/p/nvidia-nemotron-3-nano-omni|Cobus Greyling (LLMs). "Agent Architecture (Perception + Reasoning Layers)." 2026]]))
<code>
graph TD
User[User] --> Orch[Orchestrator]
subgraph Agent Modules
Orch --> Planner[Planner]
Orch --> Mem[Memory Manager]
Orch --> TR[Tool Router]
Orch --> CM[Context Manager]
end
subgraph External Services
TR --> API[External APIs]
TR --> DB[Databases]
TR --> Code[Code Execution]
Mem --> VDB[Vector Store]
end
Planner -->|Plan| Orch
Mem -->|Context| Orch
TR -->|Results| Orch
CM -->|Managed context| Orch
Orch -->|Response| User
</code>
===== Design Principles =====
[[modular|Modular]] agent architectures follow several core principles:
* **Separation of Concerns**: Each module handles a single responsibility: planning, tool execution, memory, or user interaction
* **Standardized Interfaces**: Modules communicate through well-defined protocols, enabling mix-and-match composition
* **Independent Scaling**: Components can be scaled, replaced, or upgraded without affecting the rest of the system
* **Progressive Complexity**: Start with a monolithic agent and decompose into modules as complexity demands
The typical progression runs from a single agent with tools (prototyping), to structured workflows (production), to multi-[[agent_orchestration|agent orchestration]] (enterprise scale).
===== Architectural Patterns =====
==== Single Agent with Tools ====
The simplest [[modular|modular]] pattern: a central LLM augmented by 10-20 external functions for search, code execution, data retrieval, and other [[tool_using_agents|tool-use]] capabilities. Quick to build and debug, but scales poorly beyond the tool limit due to [[context_window_management|context constraints]] and tool selection errors.
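The pattern can be sketched as a short loop: the model picks a tool, the runtime executes it, and the result feeds back into the next decision. The ''llm_decide'' stub below stands in for a real model call, and the tool names are illustrative.

```python
# Minimal single-agent tool loop. llm_decide is a stub standing in for a
# real LLM call; the tool registry and names are illustrative.

def search(query: str) -> str:
    return f"Results for: {query}"

TOOLS = {"search": search}

def llm_decide(goal: str, history: list) -> dict:
    """Stub for the model's tool-selection step."""
    if not history:
        return {"tool": "search", "input": goal}
    return {"final": f"Answer based on: {history[-1]}"}

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = []
    for _ in range(max_steps):
        decision = llm_decide(goal, history)
        if "final" in decision:           # model chose to answer
            return decision["final"]
        result = TOOLS[decision["tool"]](decision["input"])
        history.append(result)            # observation fed back next turn
    return "Step limit reached"

print(run_agent("climate change impacts"))
```

The ''max_steps'' cap is the usual guard against runaway loops; real frameworks add tracing and error handling around the tool call.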
==== Agentic Workflows ====
Directed graphs of specialized processing steps, where each node performs a specific function (generate, evaluate, refine). Unlike free-form [[agent_loop|agent loops]], workflows follow predetermined paths with conditional branching. Suitable for predictable multi-step tasks like content pipelines, data processing, and approval workflows.
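A workflow of this kind can be sketched as named node functions plus a routing rule; the node logic and the acceptance check below are toy placeholders.

```python
# Sketch of an agentic workflow as a directed graph: each node returns the
# updated state plus the name of the next node. Node bodies are placeholders.

def generate(state):
    state["version"] = state.get("version", 0) + 1
    state["draft"] = f"Draft about {state['topic']} (v{state['version']})"
    return state, "evaluate"

def evaluate(state):
    # Conditional branch: accept after the second revision (toy rule).
    return state, "done" if state["version"] >= 2 else "refine"

def refine(state):
    state["draft"] += " [refined]"
    return state, "generate"

NODES = {"generate": generate, "evaluate": evaluate, "refine": refine}

def run_workflow(state, start="generate"):
    node = start
    while node != "done":
        state, node = NODES[node](state)
    return state

final = run_workflow({"topic": "modular agents"})
print(final["draft"])
```

Unlike a free-form agent loop, the set of reachable paths here is fixed in advance, which is what makes workflows easier to test and audit.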
==== Multi-Agent Systems ====
Multiple specialized agents collaborate on complex tasks:
* **Hub-and-Spoke**: A central orchestrator delegates to specialized worker agents and synthesizes their results
* **Pipeline**: Sequential chains where each agent handles a distinct stage
* **Peer-to-Peer**: Agents communicate directly, handing off tasks based on expertise
* **Hierarchical**: Multi-level supervisor-worker trees for complex decomposition
Research shows that compiling [[multi_agent_systems|multi-agent systems]] into single agents with **skills** (reusable capability modules) can reduce token usage by 54% and latency by 50% while matching accuracy, suggesting that the overhead of inter-agent communication is often unnecessary.(([[https://arxiv.org/abs/2308.11432|Wang, L. et al. "A Survey on Large Language Model based Autonomous Agents." arXiv:2308.11432, 2023.]]))
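The hub-and-spoke pattern above can be sketched as an orchestrator that fans a task out to specialist workers and synthesizes their outputs; the worker functions and the string-join synthesis are stand-ins for model calls.

```python
# Hub-and-spoke sketch: a central orchestrator delegates to specialist
# workers and merges results. Worker logic is an illustrative placeholder.

WORKERS = {
    "researcher": lambda task: f"findings on {task}",
    "analyst": lambda task: f"analysis of {task}",
}

def orchestrate(task: str) -> str:
    # Delegate to every worker, then synthesize (a real system would
    # use an LLM for both routing and synthesis).
    results = {name: worker(task) for name, worker in WORKERS.items()}
    return "; ".join(f"{name}: {out}" for name, out in results.items())

print(orchestrate("supply-chain risk"))
```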
===== Framework Implementations =====
**[[langgraph|LangGraph]] ([[langchain|LangChain]])** provides graph-based workflow composition with StateGraph nodes for agents, tools, and decision points; conditional edges for dynamic routing based on state; explicit [[human_in_the_loop|human-in-the-loop]] nodes for enterprise oversight; shared state management across graph traversals; and 4.2M+ downloads with strong enterprise adoption.
**[[crewai|CrewAI]]** focuses on role-based multi-agent collaboration with agents having defined roles, goals, and tool access; sequential and parallel task execution within crews; built-in delegation between agents; and good production readiness with moderate learning curve.
**[[microsoft|Microsoft]] [[autogen|AutoGen]]** enables conversational [[multi_agent_systems|multi-agent systems]] where agents interact through message passing; supports dynamic orchestration and peer handoffs; built-in [[human_in_the_loop|human-in-the-loop]] modes (ALWAYS, TERMINATE, NEVER); best for exploratory multi-agent collaboration.
**[[openai_agents_sdk|OpenAI Agents SDK]]** provides a minimal but production-focused approach with agents, tools, and handoffs as core primitives; built-in tracing and observability; provider-agnostic design despite OpenAI branding; native tools including web search, file search, and [[computer_use|computer use]].
===== Interoperability Protocols =====
Three protocols have emerged as the interoperability stack for [[modular|modular]] agents in 2025-2026:
==== MCP (Model Context Protocol) ====
[[anthropic_context_protocol|Anthropic's Model Context Protocol]] for connecting agents to tools and data sources. MCP provides standardized tool discovery, universal connectivity (replacing custom integrations), and context sharing across systems.(([[https://www.anthropic.com/news/model-context-protocol|Anthropic. "Introducing the Model Context Protocol." November 2024.]])) Adopted by OpenAI (March 2025), Google (April 2025), and donated to the Linux Foundation (December 2025). By early 2026: 97 million monthly SDK downloads, 5,800+ available servers.(([[https://modelcontextprotocol.io/specification/2025-03-26|Model Context Protocol (MCP) Specification, Anthropic, 2025]]))(([[https://github.com/modelcontextprotocol/modelcontextprotocol|MCP GitHub Repository]]))
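At the wire level, MCP is JSON-RPC 2.0: clients discover tools with ''tools/list'' and invoke them with ''tools/call''. The sketch below shows the shape of these messages per the specification; the tool name and arguments are illustrative.

```python
import json

# Shape of the JSON-RPC 2.0 messages MCP uses for tool discovery and
# invocation. Method names follow the MCP spec; payload values are examples.

list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "Berlin"}},
}

print(json.dumps(call_request, indent=2))
```

Because every MCP server answers the same two methods, an agent can discover and call tools from any vendor's server without custom integration code.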
==== Google A2A Protocol ====
Agent-to-[[agent_protocol|Agent protocol]] facilitating collaboration between agents across different systems and vendors. Complementary to MCP: while MCP connects agents to tools, A2A connects agents to each other, enabling cross-organizational agent ecosystems.(([[https://google.github.io/A2A/specification/|Agent2Agent (A2A) Protocol Specification, Google, 2025]]))
==== AG-UI ====
Agent-User Interface protocol for standardizing how agents communicate with humans: feedback collection, approval workflows, and status reporting. Completes the protocol stack alongside MCP (agent-to-tool) and A2A (agent-to-agent).(([[https://github.com/ag-ui-protocol/ag-ui|GitHub: ag-ui-protocol/ag-ui, AG-UI: Agent-User Interaction Protocol]]))
===== Plugin Systems =====
[[modular|Modular]] architectures often expose plugin interfaces for extending capabilities:
* **[[autogpt|AutoGPT]] Plugins**: Community-extensible modules for adding tools, memory backends, and integrations
* **ChatGPT Plugins (legacy)**: Demonstrated the plugin model for LLM-based agents, later absorbed into native tool use
* **[[mcp_servers|MCP Servers]]**: The modern equivalent of plugins, standardized, discoverable tool providers that any MCP-compatible agent can use
* **[[langchain|LangChain]] Tools**: Typed tool definitions that can be shared across agents and workflows
===== Composable Agent Design =====
The trend toward composable agents treats agent capabilities as building blocks:
* **Skills as Modules**: Reusable capability packages (e.g., "web research," "code review," "data analysis") that agents can load dynamically
* **Hierarchical Tool Routing**: For catalogs exceeding 50-100 capabilities, hierarchical selection prevents errors and reduces reasoning overhead
* **Context Layers**: Separating data context from model configuration for flexibility across deployments
* **Open Architectures**: Avoiding vendor lock-in through standard protocols and interchangeable components
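Hierarchical tool routing can be sketched as a two-stage lookup: the router first narrows to a category, then selects within it, so the model never reasons over the full catalog at once. Category and tool names here are illustrative.

```python
# Two-stage routing sketch for large tool catalogs. Stage 1 picks a
# category; stage 2 picks a tool within it. Names are illustrative.

CATALOG = {
    "data": {"query_db": lambda q: f"rows for {q}",
             "export_csv": lambda q: f"csv of {q}"},
    "web": {"search": lambda q: f"results for {q}",
            "fetch_page": lambda q: f"html of {q}"},
}

def route(category: str, tool: str, arg: str) -> str:
    # An LLM router would see only category names first, then the handful
    # of tools inside the chosen category, rather than all tools at once.
    tools = CATALOG.get(category)
    if tools is None or tool not in tools:
        raise KeyError(f"unknown route {category}/{tool}")
    return tools[tool](arg)

print(route("web", "search", "agent protocols"))
```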
===== Code Example: Agent with Pluggable Modules =====
<code python>
from abc import ABC, abstractmethod

class Module(ABC):
    """Common interface every pluggable module implements."""
    @abstractmethod
    def run(self, context: dict) -> dict:
        pass

class PlannerModule(Module):
    def run(self, context: dict) -> dict:
        goal = context["goal"]
        steps = [f"Research {goal}", f"Analyze findings for {goal}", "Summarize results"]
        return {"plan": steps}

class MemoryModule(Module):
    def __init__(self):
        self.store = []

    def run(self, context: dict) -> dict:
        if "save" in context:
            self.store.append(context["save"])
        query = context.get("query", "")
        relevant = [m for m in self.store if query.lower() in m.lower()]
        return {"memories": relevant}

class ToolModule(Module):
    def __init__(self, tools: dict):
        self.tools = tools

    def run(self, context: dict) -> dict:
        tool_name = context.get("tool")
        if tool_name in self.tools:
            return {"result": self.tools[tool_name](context.get("input", ""))}
        return {"error": f"Tool '{tool_name}' not found"}

class ModularAgent:
    """Registry that lets modules be registered and swapped by name."""
    def __init__(self):
        self.modules: dict[str, Module] = {}

    def register(self, name: str, module: Module):
        self.modules[name] = module

    def execute(self, module_name: str, context: dict) -> dict:
        return self.modules[module_name].run(context)

agent = ModularAgent()
agent.register("planner", PlannerModule())
agent.register("memory", MemoryModule())
agent.register("tools", ToolModule({"search": lambda q: f"Results for: {q}"}))

plan = agent.execute("planner", {"goal": "climate change impacts"})
print("Plan:", plan["plan"])

agent.execute("memory", {"save": "User prefers concise answers", "query": ""})
memories = agent.execute("memory", {"query": "concise"})
print("Memories:", memories["memories"])

result = agent.execute("tools", {"tool": "search", "input": plan["plan"][0]})
print("Tool result:", result["result"])
</code>
===== Production Trends (2025-2026) =====
* Emphasis on guardrails, tracing, and observability for production reliability
* Smaller, specialized models per module rather than one large model for everything
* Compilation techniques that flatten [[multi_agent_systems|multi-agent systems]] into efficient single-agent execution when possible
* Evaluation tooling addressing the testing bottleneck for complex agent systems
* Open-source agent OS paradigms treating the agent framework as an operating system for AI capabilities
===== See Also =====
* [[all_layers_flexibility_vs_control|Flexibility vs Control Across All Layers]]
* [[single_agent_architecture|Single Agent Architecture: Design Patterns for Solo AI Agents]]
* [[ai_agents|AI Agents]]
* [[agent_interface_design|Agent Interface Design]]
* [[agent_first_architecture|Agent-First Architecture]]
===== References =====