====== MetaGPT: Multi-Agent Framework with SOPs ======

MetaGPT is a multi-agent collaboration framework introduced by Hong et al. (2023) that assigns LLM-powered agents to specialized software company roles (Product Manager, Architect, Engineer, and QA) and coordinates them through Standardized Operating Procedures (SOPs) encoded as structured prompts. By mimicking real-world software development workflows in an assembly-line paradigm, MetaGPT significantly reduces cascading hallucinations and produces higher-quality software artifacts than chat-based multi-agent approaches.

<code mermaid>
graph TD
    REQ[User Requirement] --> PM[Product Manager]
    PM -->|PRD| ARCH[Architect]
    ARCH -->|Design Docs| ENG[Engineer]
    ENG -->|Code| QA[QA Engineer]
    QA -->|Bug Reports| ENG
    QA -->|Pass| DEPLOY[Deployed Code]
</code>

===== Architecture and Role Specialization =====

MetaGPT organizes agents into a pipeline reflecting a real software company:

  * **Product Manager**: Analyzes user requirements and produces PRDs (Product Requirement Documents)
  * **Architect**: Designs the system architecture and generates technical design documents and API specifications
  * **Engineer**: Implements code based on the architectural specifications, handling cross-file dependencies
  * **QA Engineer**: Tests the generated code and provides bug reports for iterative refinement

Each role is encoded as a specialized prompt template that constrains the agent's behavior to its domain expertise. The SOPs define not just what each agent does, but also the handoff protocols between stages.
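The handoff idea can be illustrated with a minimal, framework-free sketch. It does not use the MetaGPT library: ''Artifact'', ''Stage'', ''run_sop'' and the lambda "workers" are hypothetical names invented for illustration, and each worker is a plain string function standing in for an LLM call. The point it tries to show is that every stage declares which artifact type it consumes and which it produces, so an out-of-order handoff fails loudly instead of silently propagating.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Artifact:
    """A structured output passed between roles (hypothetical type)."""
    kind: str      # e.g. "Requirement", "PRD", "Design", "Code"
    content: str
    author: str


@dataclass(frozen=True)
class Stage:
    """One SOP step: a role that turns one artifact kind into another."""
    role: str
    consumes: str
    produces: str
    work: Callable[[str], str]  # stand-in for an LLM call


def run_sop(requirement: str, stages: List[Stage]) -> List[Artifact]:
    """Run the stages in order, checking each handoff against the SOP."""
    artifacts = [Artifact("Requirement", requirement, "User")]
    for stage in stages:
        upstream = artifacts[-1]
        # Enforce the handoff protocol: a stage only accepts the
        # artifact kind it was specified to consume.
        if upstream.kind != stage.consumes:
            raise ValueError(
                f"{stage.role} expects {stage.consumes}, got {upstream.kind}"
            )
        artifacts.append(
            Artifact(stage.produces, stage.work(upstream.content), stage.role)
        )
    return artifacts


# Toy SOP mirroring the Product Manager -> Architect -> Engineer chain.
SOP = [
    Stage("Product Manager", "Requirement", "PRD", lambda t: f"PRD for: {t}"),
    Stage("Architect", "PRD", "Design", lambda t: f"Design based on [{t}]"),
    Stage("Engineer", "Design", "Code", lambda t: f"# code implementing [{t}]"),
]

trail = run_sop("a CLI todo app", SOP)
for artifact in trail:
    print(f"{artifact.author:>16} -> {artifact.kind}")
```

Because every intermediate artifact is kept in ''trail'', each stage's output can be inspected on its own, which is the property the SOP design relies on to keep errors from cascading.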
===== Shared Message Pool and Communication Protocol =====

Rather than allowing unconstrained agent-to-agent chat (which leads to role confusion and cascading hallucinations), MetaGPT uses a **shared message pool** architecture:

  * Agents publish structured outputs to a central message repository
  * Each agent subscribes only to the message types relevant to its role
  * This publish-subscribe model eliminates redundant cross-talk
  * Intermediate outputs serve as verifiable artifacts (PRDs, design docs, code)

This structured communication is key to MetaGPT's advantage over frameworks like AutoGPT and ChatDev, where free-form conversation often leads to degraded outputs.

===== Executable Feedback Mechanism =====

MetaGPT incorporates an executable feedback loop that runs and debugs generated code, feeding runtime results back to the Engineer agent:

  * Boosts Pass@1 by **4.2%** on HumanEval and **5.4%** on MBPP
  * Improves feasibility scores from 3.67 to 3.75
  * Reduces human revision cost from 2.25 to 0.83

===== Code Example =====

<code python>
# Simplified MetaGPT role definition pattern.
# Method names follow recent MetaGPT releases and may differ between versions.
from metagpt.roles import Role
from metagpt.actions import WriteCode, WriteDesign, WritePRD


class Architect(Role):
    name: str = "Alice"
    profile: str = "Architect"
    goal: str = "Design a concise, usable, complete software system"

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # The Architect's only action is producing a design document
        self.set_actions([WriteDesign])
        # Subscribe only to Product Manager outputs (PRDs)
        self._watch([WritePRD])


class Engineer(Role):
    name: str = "Bob"
    profile: str = "Engineer"
    goal: str = "Write elegant, readable, extensible code"

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # The Engineer's only action is writing code
        self.set_actions([WriteCode])
        # Subscribe to Architect design documents
        self._watch([WriteDesign])
</code>

===== Benchmark Results =====

At publication, MetaGPT reported state-of-the-art performance on standard coding benchmarks using GPT-4 as the base LLM:

^ Benchmark ^ MetaGPT Pass@1 ^ Improvement with Feedback ^
| HumanEval | 85.9% | +4.2% Pass@1 |
| MBPP | 87.7% | +5.4% Pass@1 |

On collaborative software engineering tasks, MetaGPT scores **3.9/5** compared to ChatDev (2.1) and AutoGPT (1.0), with a 100% task completion rate and lower time and token costs.

===== Mathematical Formulation =====

The forward SOP workflow (setting aside the QA-to-Engineer feedback loop) can be modeled as a directed acyclic graph (DAG) of role transitions:

$$G = (V, E) \text{ where } V = \{r_1, r_2, \ldots, r_n\} \text{ are roles}$$

Each edge represents an artifact handoff:

$$e_{ij} = (r_i, r_j, a_{ij}) \text{ where } a_{ij} \text{ is the structured artifact from role } r_i \text{ to role } r_j$$

===== References =====

  * [[https://arxiv.org/abs/2308.00352|Hong et al. "MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework" (arXiv:2308.00352)]]
  * [[https://github.com/geekan/MetaGPT|MetaGPT GitHub Repository (45K+ stars)]]
  * [[https://arxiv.org/abs/2307.07924|Qian et al. "Communicative Agents for Software Development" (ChatDev, arXiv:2307.07924)]]

===== See Also =====

  * [[camel|CAMEL: Role-playing multi-agent communication framework]]
  * [[swe_agent|SWE-agent: Agent-Computer Interface for software engineering]]
  * [[self_play_agents|Self-Play Agents: Self-improvement through competitive interaction]]