====== MetaGPT: Multi-Agent Framework with SOPs ======
MetaGPT is a multi-agent collaboration framework introduced by Hong et al. (2023). It assigns LLM-powered agents to specialized software-company roles (Product Manager, Architect, Engineer, and QA Engineer) and coordinates them through Standardized Operating Procedures (SOPs) encoded as structured prompts. By mimicking real-world software development workflows in an assembly-line paradigm, MetaGPT aims to reduce cascading hallucinations and, according to its authors, produces higher-quality software artifacts than chat-based multi-agent approaches.
graph TD
REQ[User Requirement] --> PM[Product Manager]
PM -->|PRD| ARCH[Architect]
ARCH -->|Design Docs| ENG[Engineer]
ENG -->|Code| QA[QA Engineer]
QA -->|Bug Reports| ENG
QA -->|Pass| DEPLOY[Deployed Code]
===== Architecture and Role Specialization =====
MetaGPT organizes agents into a pipeline reflecting a real software company:
* **Product Manager** — Analyzes user requirements, produces PRDs (Product Requirement Documents)
* **Architect** — Designs system architecture, generates technical design documents and API specifications
* **Engineer** — Implements code based on architectural specifications, handling cross-file dependencies
* **QA Engineer** — Tests generated code and provides bug reports for iterative refinement
Each role is encoded as a specialized prompt template that constrains the agent's behavior to its domain expertise. The SOPs define not just what each agent does, but the handoff protocols between stages.
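The idea of a role as a prompt template plus a handoff contract can be sketched in a few lines of Python. This is an illustrative model only, not MetaGPT's own code: the `RoleSpec` class, its fields, and `handoffs_are_consistent` are hypothetical names introduced here, and the goal strings are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleSpec:
    """Hypothetical description of one SOP stage (not a MetaGPT class)."""
    profile: str   # role title, e.g. "Architect"
    goal: str      # what the role is asked to achieve
    consumes: str  # artifact type received from the previous stage
    produces: str  # artifact type handed to the next stage

    def render_prompt(self, upstream_artifact: str) -> str:
        """Build a prompt that confines the agent to its role and output type."""
        return (
            f"Role: {self.profile}\n"
            f"Goal: {self.goal}\n"
            f"Input ({self.consumes}):\n{upstream_artifact}\n"
            f"Respond only with a {self.produces}."
        )


# Assumed four-stage pipeline mirroring the roles listed above.
PIPELINE = [
    RoleSpec("Product Manager", "turn the requirement into a PRD", "User Requirement", "PRD"),
    RoleSpec("Architect", "design the system", "PRD", "Design Doc"),
    RoleSpec("Engineer", "implement the design", "Design Doc", "Code"),
    RoleSpec("QA Engineer", "test the code", "Code", "Bug Report"),
]


def handoffs_are_consistent(pipeline):
    """Check that every stage consumes exactly what the previous stage produces."""
    return all(a.produces == b.consumes for a, b in zip(pipeline, pipeline[1:]))


if __name__ == "__main__":
    print(handoffs_are_consistent(PIPELINE))
    print(PIPELINE[1].render_prompt("PRD: a command-line snake game"))
```

Declaring `consumes` and `produces` explicitly is what makes the handoff protocol checkable: a mismatch between adjacent stages can be detected before any LLM call is made.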
===== Shared Message Pool and Communication Protocol =====
Rather than allowing unconstrained agent-to-agent chat, which tends to cause role confusion and cascading hallucinations, MetaGPT uses a **shared message pool** architecture:
* Agents publish structured outputs to a central message repository
* Each agent subscribes only to message types relevant to its role
* This publish-subscribe model eliminates redundant cross-talk
* Intermediate outputs serve as verifiable artifacts (PRDs, design docs, code)
This structured communication is key to MetaGPT's advantage over frameworks like AutoGPT and ChatDev, where free-form conversation often leads to degraded outputs.
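The publish-subscribe pattern described above can be sketched as follows. This is a minimal in-memory model under simplifying assumptions, not MetaGPT's implementation: `Message`, `MessagePool`, and their methods are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    kind: str     # artifact type, e.g. "PRD" or "Design Doc"
    sender: str   # role that produced the artifact
    content: str  # the structured artifact itself


class MessagePool:
    """Hypothetical shared pool: roles publish once, subscribers filter by kind."""

    def __init__(self):
        self._log = []                         # every published message, in order
        self._subscribers = defaultdict(list)  # kind -> subscribed role names
        self._inboxes = defaultdict(list)      # role name -> delivered messages

    def subscribe(self, role, kinds):
        for kind in kinds:
            self._subscribers[kind].append(role)

    def publish(self, message):
        self._log.append(message)
        # Deliver only to roles that subscribed to this artifact type.
        for role in self._subscribers[message.kind]:
            self._inboxes[role].append(message)

    def inbox(self, role):
        return list(self._inboxes[role])


pool = MessagePool()
pool.subscribe("Architect", ["PRD"])
pool.subscribe("Engineer", ["Design Doc"])
pool.publish(Message("PRD", "Product Manager", "Build a CLI snake game"))
pool.publish(Message("Design Doc", "Architect", "Modules: game.py, ui.py"))

if __name__ == "__main__":
    print([m.kind for m in pool.inbox("Architect")])
    print([m.kind for m in pool.inbox("Engineer")])
```

In this sketch the Engineer never sees the PRD and the Architect never sees its own design document echoed back, which is the filtering effect the subscription mechanism is meant to provide.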
===== Executable Feedback Mechanism =====
MetaGPT incorporates an executable feedback loop: generated code is run, and runtime results such as errors and failing tests are fed back to the Engineer agent for debugging. The paper reports that adding this mechanism:
* Raises Pass@1 by **4.2** percentage points on HumanEval and **5.4** on MBPP
* Improves feasibility scores from 3.67 to 3.75
* Reduces human revision cost from 2.25 to 0.83
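A run-then-revise loop of this kind can be sketched as below. It is a simplified illustration, not MetaGPT's mechanism: `run_code`, `refine_with_feedback`, and `toy_revise` are hypothetical names, and `toy_revise` is a stand-in for the LLM call that would normally rewrite the code from the error output.

```python
import subprocess
import sys


def run_code(source, timeout=10.0):
    """Run `source` in a fresh Python interpreter; return (ok, stderr)."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timeout"
    return result.returncode == 0, result.stderr


def refine_with_feedback(source, revise, max_rounds=3):
    """Execute the code; on failure pass stderr to `revise` and try again."""
    for _ in range(max_rounds):
        ok, stderr = run_code(source)
        if ok:
            return source, True
        source = revise(source, stderr)
    # Verify the last revision so the returned flag reflects the returned code.
    ok, _ = run_code(source)
    return source, ok


def toy_revise(source, stderr):
    """Stand-in for the Engineer agent: fixes one known typo (assumption)."""
    return source.replace("pritn", "print")


if __name__ == "__main__":
    fixed, ok = refine_with_feedback('pritn("hello")', toy_revise)
    print(ok, fixed)
```

Capping the number of rounds keeps cost bounded when the reviser cannot fix the failure; a real system would also want sandboxing before executing model-written code.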
===== Code Example =====
# Simplified MetaGPT role definition pattern.
# Method names (set_actions, _watch) follow recent MetaGPT releases and may
# differ in other versions.
from metagpt.actions import WriteCode, WriteDesign, WritePRD
from metagpt.roles import Role


class Architect(Role):
    name: str = "Alice"
    profile: str = "Architect"
    goal: str = "Design a concise, usable, complete software system"

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # The Architect's only action is producing the design document
        self.set_actions([WriteDesign])
        # Subscribe only to Product Manager output (messages caused by WritePRD)
        self._watch([WritePRD])


class Engineer(Role):
    name: str = "Bob"
    profile: str = "Engineer"
    goal: str = "Write elegant, readable, extensible code"

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # The Engineer's only action is writing code
        self.set_actions([WriteCode])
        # Subscribe to Architect design documents (messages caused by WriteDesign)
        self._watch([WriteDesign])
===== Benchmark Results =====
At the time of publication, MetaGPT reported state-of-the-art performance on standard coding benchmarks using GPT-4 as the base LLM:
^ Benchmark ^ MetaGPT Pass@1 ^ Gain from Executable Feedback ^
| HumanEval | 85.9% | +4.2% Pass@1 |
| MBPP | 87.7% | +5.4% Pass@1 |
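For readers unfamiliar with the metric, Pass@k is the probability that at least one of k sampled solutions passes all unit tests for a problem. The sketch below uses the unbiased estimator commonly used with HumanEval (Chen et al., 2021); whether MetaGPT's numbers were computed with exactly this estimator is an assumption, and the function name `pass_at_k` is chosen here for illustration.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased Pass@k estimate for one problem.

    n: number of samples generated, c: number that passed, k: budget.
    Computes 1 - C(n - c, k) / C(n, k), the chance that a random
    size-k subset of the n samples contains at least one passing sample.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # With k = 1 the estimate reduces to the fraction of passing samples.
    print(pass_at_k(n=10, c=3, k=1))
    print(pass_at_k(n=10, c=3, k=5))
```

A benchmark-level score is the mean of this per-problem value over all problems.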
On collaborative software engineering tasks, MetaGPT scores **3.9/5** compared to ChatDev (2.1) and AutoGPT (1.0), with a 100% task completion rate and lower time and token costs.
===== Mathematical Formulation =====
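The forward pass of the SOP-guided workflow can be modeled as a directed acyclic graph (DAG) of role transitions; the QA-to-Engineer bug-report loop in the diagram above adds a back edge that sits outside this acyclic core: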
$$G = (V, E) \text{ where } V = \{r_1, r_2, \ldots, r_n\} \text{ are roles}$$
Each edge represents an artifact handoff:
$$e_{ij} = (r_i, r_j, a_{ij}) \text{ where } a_{ij} \text{ is the structured artifact passed from role } r_i \text{ to role } r_j$$
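The formulation can be made concrete by storing each edge as a triple (r_i, r_j, a_ij) and deriving an execution order with a topological sort. The sketch below uses Python's standard `graphlib` module; the edge list mirrors the diagram at the top of the page, while `EDGES`, `execution_order`, and `is_acyclic` are names introduced here for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Each edge is (source role, target role, artifact handed off).
EDGES = [
    ("Product Manager", "Architect", "PRD"),
    ("Architect", "Engineer", "Design Docs"),
    ("Engineer", "QA Engineer", "Code"),
]


def execution_order(edges):
    """Return roles in an order where every producer precedes its consumer."""
    sorter = TopologicalSorter()
    for source, target, _artifact in edges:
        sorter.add(target, source)  # target depends on source
    return list(sorter.static_order())


def is_acyclic(edges):
    """True if the role graph has no cycles, i.e. it is a DAG."""
    try:
        execution_order(edges)
        return True
    except CycleError:
        return False


if __name__ == "__main__":
    print(execution_order(EDGES))
    # Adding the bug-report feedback edge introduces a cycle.
    print(is_acyclic(EDGES + [("QA Engineer", "Engineer", "Bug Reports")]))
```

The forward edges alone sort cleanly, whereas including the QA-to-Engineer bug-report edge makes the graph cyclic, so an iterative refinement loop has to be handled outside the topological ordering.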
===== References =====
* [[https://arxiv.org/abs/2308.00352|Hong et al. "MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework" (arXiv:2308.00352)]]
* [[https://github.com/geekan/MetaGPT|MetaGPT GitHub Repository (45K+ stars)]]
* [[https://arxiv.org/abs/2307.07924|Qian et al. "Communicative Agents for Software Development" (ChatDev, arXiv:2307.07924)]]
===== See Also =====
* [[camel|CAMEL — Role-playing multi-agent communication framework]]
* [[swe_agent|SWE-agent — Agent-Computer Interface for software engineering]]
* [[self_play_agents|Self-Play Agents — Self-improvement through competitive interaction]]