
Multi-Agent Debate

Multi-Agent Debate (MAD) is a reasoning framework introduced by Du et al. (2023) in which multiple LLM instances independently generate responses to a query, then critique and refine each other's answers over several rounds of structured debate, typically converging on a shared answer. Du et al. report that the approach improves factuality and reasoning accuracy by drawing on diverse reasoning paths and cross-verification.

graph TD
    Q[Question] --> A1[Agent A Proposes]
    Q --> A2[Agent B Proposes]
    Q --> A3[Agent C Proposes]
    A1 & A2 & A3 --> R1[Round 1: Debate]
    R1 --> R2[Round 2: Refine]
    R2 --> C[Consensus]
    C --> ANS[Final Answer]

Motivation

Individual LLM instances are prone to overconfidence, hallucination, and reasoning errors that go uncorrected in single-pass generation. Inspired by Minsky's “Society of Mind” concept, MAD treats multiple LLM copies as a deliberative group where errors in one agent's reasoning are challenged by others, driving convergence toward correct answers through argumentative pressure.

Framework

The debate protocol operates in three phases:

Independent Generation — Each of $n$ agents independently produces an initial response with reasoning for the given query
Debate Rounds — In each round, every agent receives all other agents' responses via a consensus prompt, critiques them, verifies consistency, and updates its own response
Convergence — After $r$ rounds, the final answer is determined by majority vote or consensus among agents

The standard configuration uses 3 agents debating for 2 rounds, balancing computational cost against accuracy gains.
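The cost side of that trade-off can be estimated directly from the protocol, since each agent makes one call in the generation phase and one call per debate round. The sketch below counts calls under that assumption; the function name `debate_cost` is illustrative and does not come from Du et al.

```python
def debate_cost(num_agents: int, num_rounds: int) -> dict:
    """Count LLM calls for one debate, assuming one call per agent per phase."""
    if num_agents < 1 or num_rounds < 0:
        raise ValueError("num_agents must be >= 1 and num_rounds >= 0")
    generation_calls = num_agents            # Phase 1: one call per agent
    debate_calls = num_agents * num_rounds   # Phase 2: one call per agent per round
    # Each debate prompt carries the agent's own response plus all the others',
    # i.e. num_agents responses, so prompt length grows with the group size.
    responses_per_prompt = num_agents if num_rounds > 0 else 0
    return {
        "total_calls": generation_calls + debate_calls,
        "responses_per_debate_prompt": responses_per_prompt,
    }


print(debate_cost(3, 2))  # {'total_calls': 9, 'responses_per_debate_prompt': 3}
```

For the standard 3-agent, 2-round setup this gives 9 calls, compared with 1 for single-pass generation.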

Debate Dynamics

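The convergence mechanism relies on cross-verification pressure. The claim, which is empirical rather than guaranteed, is that debate raises the probability of a correct final answer: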

$$P(\text{correct after debate}) > P(\text{correct single agent})$$

Initially diverse responses (especially on uncertain queries) shift toward agreement as agents verify each other's claims. Incorrect answers typically stem from isolated reasoning errors that other agents identify and challenge.

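from collections import Counter


def majority_vote(answers):
    # Most common answer; ties resolve to the answer seen first
    return Counter(answers).most_common(1)[0][0]


class MultiAgentDebate:
    def __init__(self, model, num_agents=3, num_rounds=2):
        # `model` is assumed to expose generate(prompt) -> str
        self.model = model
        self.num_agents = num_agents
        self.num_rounds = num_rounds

    def debate(self, question):
        # Phase 1: Independent generation
        # (with a single shared model, diversity comes from sampling)
        responses = []
        for _ in range(self.num_agents):
            response = self.model.generate(
                f"Answer this question with detailed reasoning:\n{question}"
            )
            responses.append(response)

        # Phase 2: Iterative debate rounds
        for _ in range(self.num_rounds):
            new_responses = []
            for i in range(self.num_agents):
                # Every agent sees the previous-round responses of all other agents
                other_responses = [r for j, r in enumerate(responses) if j != i]
                prompt = self._build_debate_prompt(
                    question, responses[i], other_responses
                )
                updated = self.model.generate(prompt)
                new_responses.append(updated)
            # Swap in the new responses only after all agents have updated
            responses = new_responses

        # Phase 3: Majority vote for final answer
        answers = [self._extract_answer(r) for r in responses]
        return majority_vote(answers)

    def _build_debate_prompt(self, question, own_response, others):
        other_text = "\n---\n".join(others)
        return (
            f"Question: {question}\n\n"
            f"Your previous response:\n{own_response}\n\n"
            f"Other agents' responses:\n{other_text}\n\n"
            f"Examine the other responses carefully. Where do you agree "
            f"or disagree? Update your answer based on this discussion."
        )

    def _extract_answer(self, response):
        # Placeholder assumption: the final answer is the last non-empty line.
        # Replace with task-specific parsing (e.g., a regex for a numeric answer).
        lines = [ln.strip() for ln in response.splitlines() if ln.strip()]
        return lines[-1] if lines else ""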
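The shift toward agreement described above can be made observable by tracking, for each round, the fraction of agents that hold the most common answer. The sketch below assumes final answers have already been extracted from each response as strings; `agreement_rate` and the sample history are illustrative and are not measurements from the paper.

```python
from collections import Counter


def agreement_rate(answers: list[str]) -> float:
    """Fraction of agents whose answer equals the most common answer."""
    if not answers:
        raise ValueError("answers must be non-empty")
    top_count = Counter(answers).most_common(1)[0][1]
    return top_count / len(answers)


# Hypothetical extracted answers for 3 agents: initial responses, then two rounds.
history = [
    ["42", "40", "36"],  # initial: all agents disagree
    ["42", "42", "36"],  # round 1: one agent switches
    ["42", "42", "42"],  # round 2: full agreement
]
rates = [agreement_rate(round_answers) for round_answers in history]
print(rates)
```

Agreement alone does not establish correctness, since agents can also converge on a shared wrong answer, so a metric like this is best read alongside task accuracy.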

Key Properties

Black-box compatible — Works with any LLM API without requiring access to model internals like logits or gradients
Combinable — Orthogonal to other prompting techniques; can be combined with CoT, zero-shot, or few-shot prompting
Cross-model — Agents can use different models (e.g., ChatGPT + Gemini debating together)
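The black-box and cross-model properties follow from the fact that the protocol only needs a text-in, text-out call. A minimal sketch, assuming each backend is wrapped in an object with a `generate(prompt)` method: the `TextModel` protocol and the two stub backends are hypothetical stand-ins, not real vendor APIs.

```python
from typing import Protocol


class TextModel(Protocol):
    """Anything with generate(prompt) -> str; no logits or gradients needed."""
    def generate(self, prompt: str) -> str: ...


class StubModelA:
    """Stand-in for one vendor's model (hypothetical, returns canned text)."""
    def generate(self, prompt: str) -> str:
        return "Model A: the answer is 12"


class StubModelB:
    """Stand-in for a different vendor's model (hypothetical)."""
    def generate(self, prompt: str) -> str:
        # Adopt the peer's answer if one appears in the prompt; otherwise guess.
        if "Model A" in prompt:
            return "Model B: I agree, 12"
        return "Model B: the answer is 10"


def debate_round(agents: list[TextModel], question: str, responses: list[str]) -> list[str]:
    """One round: each agent sees its own and the other agents' previous responses."""
    updated = []
    for i, agent in enumerate(agents):
        others = "\n---\n".join(r for j, r in enumerate(responses) if j != i)
        prompt = (
            f"Question: {question}\n\nYour previous response:\n{responses[i]}\n\n"
            f"Other agents' responses:\n{others}\n\nUpdate your answer."
        )
        updated.append(agent.generate(prompt))
    return updated


agents: list[TextModel] = [StubModelA(), StubModelB()]
question = "What is 3 * 4?"
initial = [agent.generate(question) for agent in agents]
after_one_round = debate_round(agents, question, initial)
print(initial)
print(after_one_round)
```

Because the debate loop depends only on the `generate` method, swapping a stub for a wrapper around a real API should not require changes to the loop itself.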

Key Results

Math reasoning (GSM8K): Significant accuracy gains over single-agent baselines and zero-shot CoT
Factuality: Reduced hallucination rates on biography generation and factual QA tasks
Cross-model debate: ChatGPT and Bard individually solved 14/20 and 11/20 problems, respectively; debating jointly, they solved 17/20
Performance scales with more agents and rounds, though with diminishing returns

Societies of Thought

Kim et al. (2026) discovered that modern reasoning models like DeepSeek-R1 and QwQ-32B internally simulate multi-agent debate without explicit prompting — a phenomenon they term “Societies of Thought” (arXiv:2601.10825).

Key findings from their analysis of 8,000+ reasoning traces:

Reasoning models exhibit dramatically more question-answer sequences, perspective shifts, and explicit conflicts between viewpoints compared to standard instruction-tuned models
Internal “personas” emerge with distinct personality traits (measured via Big Five) and domain expertise
The model catches its own errors through self-debate: e.g., “But here, it's cyclohexa-1,3-diene, not benzene”
Controlled RL experiments show that base models increase conversational/debate behaviors when rewarded solely for reasoning accuracy
Fine-tuning with conversational scaffolding accelerates reasoning improvement

This suggests that multi-agent debate is not merely an external prompting strategy but reflects a fundamental computational pattern for effective reasoning — paralleling collective intelligence in human groups where diversity enables superior problem-solving.

Mathematical Formulation

For $n$ agents over $r$ rounds, with initial responses $a_i^{(0)} = \text{LLM}(q)$, agent $i$'s response at round $t \in \{1, \dots, r\}$ is:

$$a_i^{(t)} = \text{LLM}\left(q, a_i^{(t-1)}, \{a_j^{(t-1)}\}_{j \neq i}\right)$$

The final answer is selected by majority vote:

$$a^* = \text{mode}\left(\{\text{extract}(a_i^{(r)})\}_{i=1}^{n}\right)$$
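The mode in this final step is not unique when agents split evenly, and raw strings that differ only in whitespace or a trailing period would be counted as different answers. The sketch below is one possible tie-aware variant, assuming simple string normalization is adequate for the task; `normalize` and `vote` are illustrative names rather than part of the original method.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Assumed normalization: trim whitespace and trailing periods, lowercase."""
    return answer.strip().rstrip(".").strip().lower()


def vote(answers: list[str]) -> tuple[str, bool]:
    """Return (winning answer, is_tie). Ties go to the earliest-listed answer."""
    if not answers:
        raise ValueError("answers must be non-empty")
    counts = Counter(normalize(a) for a in answers)
    ranked = counts.most_common()  # equal counts keep first-seen order
    winner, top = ranked[0]
    is_tie = len(ranked) > 1 and ranked[1][1] == top
    return winner, is_tie


print(vote(["42", " 42.", "36"]))  # ('42', False)
print(vote(["A", "B"]))            # ('a', True)
```

Returning the tie flag lets a caller decide whether to run another debate round or fall back to a different selection rule instead of silently picking the first answer.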
