====== CAMEL: Communicative Agents for Mind Exploration ======
CAMEL (Communicative Agents for "Mind" Exploration of Large Language Model Society) is a role-playing framework by Li et al. (2023) that enables autonomous cooperation between LLM agents through **inception prompting**. By assigning agents distinct roles — AI User and AI Assistant — and embedding recursive task instructions into prompts, CAMEL demonstrates that LLM agents can collaborate on complex tasks with minimal human intervention while generating valuable conversational datasets for research.
===== Inception Prompting =====
Inception prompting is CAMEL's core mechanism for guiding autonomous agent behavior. It works by embedding layered instructions within the initial prompt that:
* Define each agent's role, constraints, and communication protocol
* Recursively reinforce task focus to prevent conversation drift
* Include termination conditions and role-boundary rules
* Specify output format requirements for structured collaboration
The key insight is that by "planting the seed" of behavior deeply within the prompt structure, agents maintain coherent role-playing across extended multi-turn conversations without requiring human intervention at each step.
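The layered prompt structure described above can be sketched as a small builder function. This is a hypothetical helper for illustration, not part of the CAMEL library; the exact wording and the `[TASK_DONE]` marker are assumptions modeled on the role-playing example later in this article.

```python
def build_inception_prompt(role, counterpart, task, max_turns=40):
    """Compose a layered inception prompt: role definition, recursive
    task focus, role-boundary rule, output format, and termination."""
    return (
        f"Never forget you are a {role} and I am a {counterpart}. "
        "Never flip roles.\n"  # role-boundary rule
        f"We share a common interest in completing this task: {task}\n"
        "Always remind yourself of the task and never deviate from it.\n"  # recursive task focus
        "Give exactly one instruction or solution per message.\n"  # output format requirement
        f"Reply with [TASK_DONE] when the task is complete, "
        f"or after at most {max_turns} turns.\n"  # termination condition
    )

prompt = build_inception_prompt(
    "python programmer", "project manager", "implement merge sort"
)
```

Because every rule is baked into the system prompt itself, the agent carries these constraints through every subsequent turn without per-step human reminders.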
===== Role-Playing Architecture =====
CAMEL's architecture centers on a three-agent system:
* **Task Specifier Agent** — Turns a vague, high-level task idea into a specific, detailed task prompt before role-playing begins, so both agents start from a concrete, well-scoped goal.
* **AI User** — Proposes ideas, refines instructions, and drives the task forward by simulating a user's intent. Provides feedback and course corrections.
* **AI Assistant** — Executes, critiques, and improves upon the AI User's inputs to progress toward task completion.
The conversation protocol enforces:
* Strict turn alternation between AI User and AI Assistant
* Role labels in each message (e.g., "AI User:", "AI Assistant:") to prevent role flipping
* Maximum turn limits and completion detection for termination
* Deviation correction when agents stray from assigned roles
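These protocol rules can be sketched as a single classification function. The helper name, status strings, and `[TASK_DONE]` marker are illustrative assumptions, not the CAMEL API.

```python
def protocol_status(message, expected_role, turn, max_turns=40):
    """Classify one turn under the conversation protocol rules above."""
    if not message.startswith(expected_role + ":"):
        return "deviation"   # role label missing or flipped -> correct the agent
    if "[TASK_DONE]" in message:
        return "done"        # completion detection
    if turn + 1 >= max_turns:
        return "turn_limit"  # hard termination at the maximum turn count
    return "continue"        # strict alternation proceeds to the other agent

status = protocol_status("AI Assistant: here is step one.", "AI Assistant", turn=3)
```

A driver loop would call this after every message, handing control back and forth only while the status is `"continue"` and issuing a corrective reminder on `"deviation"`.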
===== Code Example =====
<code python>
# Simplified CAMEL-style role-playing interaction
from camel.agents import ChatAgent
from camel.messages import BaseMessage

# Define inception prompts for each role
user_inception = (
    "You are an AI user tasked with directing an AI assistant "
    "to implement a sorting algorithm. Give clear instructions "
    "one step at a time. Do not write code yourself."
)
assistant_inception = (
    "You are an AI assistant that implements code based on "
    "the AI user's instructions. Write clean Python code and "
    "explain your implementation decisions."
)

# Initialize agents with role-specific inception prompts
user_agent = ChatAgent(
    system_message=BaseMessage.make_user_message(
        role_name="Algorithm_Expert", content=user_inception
    )
)
assistant_agent = ChatAgent(
    system_message=BaseMessage.make_assistant_message(
        role_name="Python_Developer", content=assistant_inception
    )
)

# Seed the loop with an opening message, then enforce strict
# turn alternation with completion detection and a turn limit
max_turns = 40
assistant_msg = BaseMessage.make_assistant_message(
    role_name="Python_Developer",
    content="Ready. Please give the first instruction.",
)
for turn in range(max_turns):
    user_msg = user_agent.step(assistant_msg).msg
    assistant_msg = assistant_agent.step(user_msg).msg
    if "[TASK_DONE]" in assistant_msg.content:
        break
</code>
===== Generated Datasets =====
CAMEL's role-playing framework generates large-scale conversational datasets across multiple domains:
* **AI Society** — 25K conversations on general collaborative tasks
* **CAMEL Code** — Programming-focused dialogues for code generation
* **CAMEL Math** — Mathematical reasoning conversations
* **CAMEL Science** — Scientific problem-solving dialogues
* **Misalignment** — Adversarial conversations for safety research
===== Benchmark Results =====
^ Evaluation ^ CAMEL ^ Baseline ^
| Human eval win rate vs GPT-3.5-turbo | 76.3% | 10.4% loss |
| CAMEL-7B (LLaMA fine-tuned) HumanEval Pass@1 | 57.9% | 14.0% (base LLaMA-7B) |
| HumanEval+ | Outperforms Vicuna-7B | — |
These results demonstrate that conversational data generated through structured role-playing can effectively fine-tune smaller models to competitive performance levels.
===== Theoretical Framework =====
CAMEL models agent interaction as a Markov decision process with role constraints:
$$\pi_{\text{user}}(a_t \mid s_t, r_{\text{user}}) \quad \text{and} \quad \pi_{\text{asst}}(a_t \mid s_t, r_{\text{asst}})$$
where r represents the role-conditioned inception prompt and s_t is the conversation state at turn t. The inception prompt acts as a persistent prior that constrains the policy space to role-appropriate actions.
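This "persistent prior" view can be illustrated with a toy sketch: the role prompt r is prepended to every conversation state before the model acts, so the same base model yields two distinct role-conditioned policies. The stub model and helper names below are illustrative assumptions, not the CAMEL implementation.

```python
def make_policy(base_model, role_prompt):
    """Return pi(a_t | s_t, r) with the role prior r fixed to role_prompt."""
    def policy(state):
        # The inception prompt is prepended to every context, so the
        # role prior conditions every action the agent takes.
        context = [("system", role_prompt)] + list(state)
        return base_model(context)
    return policy

# Stub "model": just reports which role prior conditioned the call
def stub_model(context):
    return f"[{context[0][1]}] after {len(context) - 1} turns"

pi_user = make_policy(stub_model, "AI User")
pi_asst = make_policy(stub_model, "AI Assistant")

# Same conversation state s_t, different role-conditioned actions
s_t = [("user", "Sort this list."), ("assistant", "Which algorithm?")]
action_user = pi_user(s_t)
action_asst = pi_asst(s_t)
```

The closure fixes r once at construction time, mirroring how the inception prompt constrains the policy space for the entire session rather than being re-supplied each turn.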
===== References =====
* [[https://arxiv.org/abs/2303.17760|Li et al. "CAMEL: Communicative Agents for Mind Exploration of Large Language Model Society" (arXiv:2303.17760)]]
* [[https://github.com/camel-ai/camel|CAMEL GitHub Repository]]
* [[https://www.camel-ai.org/|CAMEL-AI Official Website]]
===== See Also =====
* [[metagpt|MetaGPT — SOP-based multi-agent collaboration]]
* [[self_play_agents|Self-Play Agents — Competitive and cooperative agent training]]
* [[agent_distillation|Agent Distillation — Compressing agent behaviors into smaller models]]