====== MetaGPT: Multi-Agent Framework with SOP-Based Collaboration ======
MetaGPT is a multi-agent collaboration framework introduced by Hong et al. (2023) that assigns LLM-powered agents to specialized software company roles (Product Manager, Architect, Engineer, and QA Engineer) and coordinates them through Standard Operating Procedures (SOPs) encoded as structured prompts.(([[https://arxiv.org/abs/2308.00352|Hong et al. "MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework" (arXiv:2308.00352)]])) By mimicking real-world software development workflows in an assembly-line paradigm, MetaGPT reduces cascading hallucinations and, in the authors' evaluation, produces higher-quality software artifacts than chat-based multi-agent approaches.


<mermaid>
graph TD
    REQ[User Requirement] --> PM[Product Manager]
    PM -->|PRD| ARCH[Architect]
    ARCH -->|Design Docs| ENG[Engineer]
    ENG -->|Code| QA[QA Engineer]
    QA -->|Bug Reports| ENG
    QA -->|Pass| DEPLOY[Deployed Code]
</mermaid>

===== Architecture and Role Specialization =====
MetaGPT organizes agents into a pipeline reflecting a real software company:

  * **Product Manager** — Analyzes user requirements, produces PRDs (Product Requirement Documents)
  * **Architect** — Designs system architecture, generates technical design documents and API specifications
  * **Engineer** — Implements code based on architectural specifications, handling cross-file dependencies
  * **QA Engineer** — Tests generated code and provides bug reports for iterative refinement

Each role is encoded as a specialized prompt template that constrains the agent's behavior to its domain expertise. The SOPs define not just what each agent does, but the handoff protocols between stages.
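
The handoff idea can be sketched as a small table that maps each role to the artifact it consumes and the artifact it produces. This is a minimal illustration rather than MetaGPT's actual implementation: the ''RoleSpec'' dataclass, the ''SOP'' list, and the ''check_handoffs'' helper are hypothetical names, and the artifact labels are taken from the pipeline diagram above.

<code python>
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleSpec:
    """Hypothetical description of one SOP stage."""
    role: str
    consumes: str  # artifact type this role waits for
    produces: str  # artifact type this role hands to the next stage


# Stage order and artifact names follow the pipeline diagram.
SOP = [
    RoleSpec("Product Manager", "User Requirement", "PRD"),
    RoleSpec("Architect", "PRD", "Design Docs"),
    RoleSpec("Engineer", "Design Docs", "Code"),
    RoleSpec("QA Engineer", "Code", "Bug Reports"),
]


def check_handoffs(sop):
    """Return True if every stage consumes what the previous stage produces."""
    return all(prev.produces == nxt.consumes for prev, nxt in zip(sop, sop[1:]))


# A pipeline with the Architect removed has a broken handoff (PRD -> Design Docs).
BROKEN_SOP = [SOP[0], SOP[2], SOP[3]]
</code>

Checking the handoffs up front is one way to catch a pipeline in which a role would wait forever for an artifact that no earlier role produces.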

===== Shared Message Pool and Communication Protocol =====
Rather than allowing unconstrained agent-to-agent chat (which leads to role confusion and cascading hallucinations), MetaGPT uses a **shared message pool** architecture:(([[https://github.com/geekan/MetaGPT|MetaGPT GitHub Repository (45K+ stars)]]))

  * Agents publish [[structured_outputs|structured outputs]] to a central message repository
  * Each agent subscribes only to message types relevant to its role
  * This publish-subscribe model eliminates redundant cross-talk
  * Intermediate outputs serve as verifiable artifacts (PRDs, design docs, code)
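
The publish-subscribe pattern described in the list above can be sketched in a few lines. The ''Message'' and ''MessagePool'' classes here are hypothetical stand-ins written for illustration; they do not reproduce MetaGPT's internal classes or method names.

<code python>
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Message:
    kind: str     # artifact type, e.g. "PRD", "Design Docs", "Code"
    sender: str   # role that published the message
    content: str


class MessagePool:
    """Hypothetical shared pool: roles publish to it and read only what they subscribed to."""

    def __init__(self):
        self._messages = []
        self._subscriptions = defaultdict(set)  # role -> set of message kinds

    def subscribe(self, role, kinds):
        self._subscriptions[role].update(kinds)

    def publish(self, message):
        self._messages.append(message)

    def inbox(self, role):
        """Return only the messages whose kind the role has subscribed to."""
        wanted = self._subscriptions[role]
        return [m for m in self._messages if m.kind in wanted]


pool = MessagePool()
pool.subscribe("Architect", {"PRD"})
pool.subscribe("Engineer", {"Design Docs"})
pool.publish(Message("PRD", "Product Manager", "Build a snake game"))

architect_inbox = pool.inbox("Architect")  # sees the PRD
engineer_inbox = pool.inbox("Engineer")    # empty until a design doc is published
</code>

Because filtering happens on read, a role never sees artifacts outside its subscriptions, which is the property the list above credits with reducing cross-talk.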

This structured communication is key to MetaGPT's advantage over frameworks like [[autogpt|AutoGPT]] and ChatDev, where free-form conversation often leads to degraded outputs.(([[https://arxiv.org/abs/2307.04721|Qian et al. "Communicative Agents for Software Development" (ChatDev)]]))

===== Executable Feedback Mechanism =====
MetaGPT incorporates a code execution feedback loop that runs and debugs generated code, feeding runtime results back to the Engineer agent:

  * Boosts Pass@1 by **4.2%** on HumanEval and **5.4%** on MBPP
  * Improves feasibility scores from 3.67 to 3.75
  * Reduces human revision cost from 2.25 to 0.83
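
The general shape of such a loop can be sketched as follows. This is an illustrative sketch under stated assumptions, not the framework's code: ''generate'' stands in for the LLM-backed Engineer, ''run_code'' and ''refine_with_feedback'' are hypothetical helpers, the retry limit of 3 is an arbitrary choice, and the generated code is executed in-process with ''exec'' purely for brevity (a real system would isolate it).

<code python>
import traceback


def run_code(source):
    """Execute source in a fresh namespace; return (ok, error_text)."""
    try:
        exec(compile(source, "<generated>", "exec"), {"__name__": "__generated__"})
        return True, ""
    except Exception:
        return False, traceback.format_exc()


def refine_with_feedback(generate, max_rounds=3):
    """Call generate(feedback) until its code runs cleanly or rounds run out.

    Returns (source, rounds_used), or (None, max_rounds) if every attempt failed.
    """
    feedback = ""
    for attempt in range(1, max_rounds + 1):
        source = generate(feedback)
        ok, error_text = run_code(source)
        if ok:
            return source, attempt
        feedback = error_text  # the runtime error becomes the next prompt's context
    return None, max_rounds


def stub_engineer(feedback):
    """Stand-in for the LLM: emits buggy code first, then a fix once it sees the error."""
    if "ZeroDivisionError" in feedback:
        return "result = 1 / 1"
    return "result = 1 / 0"


final_source, rounds_used = refine_with_feedback(stub_engineer)
</code>

In this toy run the first attempt fails with a ''ZeroDivisionError'', the traceback is passed back to the generator, and the second attempt succeeds.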

===== Code Example =====
<code python>
# Simplified MetaGPT role definition pattern.
# Assumes a MetaGPT release in which Role exposes set_actions() and _watch();
# method names have changed between versions, so check the installed release.
from metagpt.roles import Role
from metagpt.actions import WriteCode, WriteDesign, WritePRD

class Architect(Role):
    name: str = "Alice"
    profile: str = "Architect"
    goal: str = "Design a concise, usable, complete software system"

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # The Architect's only action is producing the design document
        self.set_actions([WriteDesign])
        # Subscribe only to ProductManager outputs (the PRD)
        self._watch([WritePRD])

class Engineer(Role):
    name: str = "Bob"
    profile: str = "Engineer"
    goal: str = "Write elegant, readable, extensible code"

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # The Engineer's only action is writing code
        self.set_actions([WriteCode])
        # Subscribe to Architect design documents
        self._watch([WriteDesign])
</code>

===== Benchmark Results =====
MetaGPT achieves state-of-the-art performance on standard coding benchmarks using GPT-4 as the base LLM:

^ Benchmark ^ MetaGPT Pass@1 ^ Improvement with Feedback ^
| HumanEval | 85.9% | +4.2% Pass@1 |
| MBPP | 87.7% | +5.4% Pass@1 |
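
For readers unfamiliar with the metric, Pass@k is commonly computed with the unbiased estimator from the HumanEval paper (Chen et al., 2021): given ''n'' generated samples for a problem, of which ''c'' pass the unit tests, it estimates the probability that at least one of ''k'' samples passes. The sketch below shows that estimator; it is an assumption that MetaGPT's reported numbers were computed this way, and the sample counts in the usage lines are made up for illustration.

<code python>
from math import comb


def pass_at_k(n, c, k):
    """Unbiased Pass@k for one problem: 1 - C(n - c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # fewer than k failing samples, so any draw of k contains a pass
    return 1.0 - comb(n - c, k) / comb(n, k)


# Illustrative counts: 10 samples, 3 of which pass.
example_pass_at_1 = pass_at_k(n=10, c=3, k=1)  # reduces to c / n
example_pass_at_5 = pass_at_k(n=10, c=3, k=5)
</code>

For ''k = 1'' the estimator reduces to the fraction of samples that pass, and a benchmark score is the mean of this value over all problems.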

On collaborative software engineering tasks, MetaGPT scores **3.9/5** compared to [[chatdev|ChatDev]] (2.1) and [[autogpt|AutoGPT]] (1.0), with a 100% task completion rate and lower time and token costs.

===== Mathematical Formulation =====
The forward path of the SOP-guided workflow can be modeled as a directed acyclic graph (DAG) of role transitions; the QA-to-Engineer bug-report loop is the one back edge outside this DAG:

<latex>G = (V, E) \text{ where } V = \{r_1, r_2, \ldots, r_n\} \text{ are roles}</latex>

Each edge represents an artifact handoff:

<latex>e_{ij} = (r_i, r_j, a_{ij}) \text{ where } a_{ij} \text{ is the structured artifact from role } i \text{ to } j</latex>
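
As a small illustration, the edges can be written as ''(r_i, r_j, a_ij)'' triples and a valid execution order recovered with a topological sort. The ''EDGES'' list and ''execution_order'' helper are hypothetical names; the roles and artifacts come from the pipeline diagram, and the QA-to-Engineer bug-report edge is left out so that the graph stays acyclic. The sketch relies on the standard-library ''graphlib'' module (Python 3.9+).

<code python>
from graphlib import TopologicalSorter

# Each edge is (source role, target role, artifact handed over).
EDGES = [
    ("Product Manager", "Architect", "PRD"),
    ("Architect", "Engineer", "Design Docs"),
    ("Engineer", "QA Engineer", "Code"),
]


def execution_order(edges):
    """Return the roles in an order that respects every artifact handoff."""
    sorter = TopologicalSorter()
    for source, target, _artifact in edges:
        sorter.add(target, source)  # target depends on source
    return list(sorter.static_order())


order = execution_order(EDGES)
</code>

Adding the bug-report edge back would make ''static_order()'' raise ''graphlib.CycleError'', which is why iterative refinement is better treated as a loop around the Engineer and QA stages than as part of the graph itself.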

===== See Also =====
  * [[agentverse|AgentVerse: Facilitating Multi-Agent Collaboration]]
  * [[agentgpt|AgentGPT]]
  * [[meta_harness|Meta-Harness]]
  * [[langroid|Langroid]]
  * [[agenttuning|AgentTuning: Enabling Generalized Agent Capabilities in LLMs]]

===== References =====