Core Concepts
Reasoning
Memory & Retrieval
Agent Types
Design Patterns
Training & Alignment
Frameworks
Tools
Safety & Security
Evaluation
Meta
This landmark survey by Xi et al. (2023)1) from the Fudan NLP Group provides the most comprehensive overview of LLM-based agents, proposing a unifying conceptual framework of brain, perception, and action modules.2) With over 1,500 citations, it is the most influential survey in the LLM agent space.
The survey traces the concept of agents from philosophical origins (Descartes, Locke, Hume) through AI history (symbolic AI, reinforcement learning) to the modern era where LLMs serve as the foundation for general-purpose agents.3) The central thesis: LLMs possess the versatile capabilities needed to serve as a starting point for designing AI agents that can adapt to diverse scenarios.
Published in Science China Information Sciences (2025)4), the paper covers single-agent systems, multi-agent cooperation, and human-agent interaction.
The brain is the LLM itself, providing core cognitive functions:5)
The agent's decision at each step can be formalized as:
<latex>a_t = \pi_\theta(o_t, m_t, g)</latex>
where <latex>\pi_\theta</latex> is the LLM-based policy, <latex>o_t</latex> is the current observation, <latex>m_t</latex> is the memory state, and <latex>g</latex> is the goal.
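As a rough sketch, the policy <latex>\pi_\theta</latex> can be read as prompt assembly followed by a single LLM call: the observation <latex>o_t</latex>, memory state <latex>m_t</latex>, and goal <latex>g</latex> are serialized into the prompt, and the model's completion is taken as the action <latex>a_t</latex>. The `llm_complete` callable and the prompt layout below are illustrative assumptions, not the paper's interface.

```python
# Hypothetical sketch of a_t = pi_theta(o_t, m_t, g): serialize the
# observation, memory, and goal into a prompt, let the LLM pick the action.
def policy_step(llm_complete, observation: str, memory: list, goal: str) -> str:
    """Assemble o_t, m_t, and g into a prompt and return the chosen action a_t."""
    prompt = (
        f"Goal: {goal}\n"
        "Memory:\n" + "\n".join(f"- {m}" for m in memory) + "\n"
        f"Observation: {observation}\n"
        "Next action:"
    )
    return llm_complete(prompt)

# Stub LLM for illustration only; a real agent would call a chat model here.
stub_llm = lambda prompt: "search('weather Paris')"
action = policy_step(stub_llm, "User asks about weather in Paris",
                     ["user is in Paris"], "answer the user")
```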
Perception extends the agent beyond text:
Actions are the agent's interface with the world:
The survey categorizes agent applications into three paradigms:6)
| Paradigm | Description | Examples |
|---|---|---|
| Single Agent | One LLM agent solving tasks autonomously | AutoGPT, HuggingGPT, WebGPT |
| Multi-Agent | Multiple agents cooperating or competing | Generative Agents, CAMEL, AgentVerse |
| Human-Agent | Collaboration between humans and LLM agents | Copilot, interactive assistants |
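The multi-agent paradigm in the table above can be sketched as role-prompted agents taking turns over a shared transcript, in the spirit of CAMEL-style role play. Everything below (the `dialogue` helper, the stub model, the role names) is an illustrative assumption, not an API from any of the cited systems.

```python
# Minimal sketch of multi-agent cooperation: role-prompted agents take turns
# appending messages to a shared transcript for a fixed number of rounds.
def dialogue(llm, roles, task, rounds=2):
    """Run `rounds` turn-taking rounds over the given roles; return the transcript."""
    transcript = [f"Task: {task}"]
    for _ in range(rounds):
        for role in roles:
            # Each agent sees the full transcript so far, conditioned on its role.
            msg = llm(role, "\n".join(transcript))
            transcript.append(f"{role}: {msg}")
    return transcript

# Stub model for illustration; a real system would call an LLM per role.
stub = lambda role, ctx: f"({role} responds to {len(ctx.splitlines())} lines)"
log = dialogue(stub, ["Planner", "Coder"], "write a sorting function", rounds=1)
```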
The survey further examines agent societies, covering:
```python
# Conceptual implementation of the Brain-Perception-Action framework
class LLMAgent:
    def __init__(self, llm, tools, memory_store):
        self.brain = BrainModule(llm)
        self.perception = PerceptionModule(modalities=['text', 'vision'])
        self.action = ActionModule(tools=tools)
        self.memory = MemoryModule(memory_store)

    def step(self, observation, goal):
        # Perception: process multimodal input
        processed_obs = self.perception.process(observation)

        # Brain: reason and plan with memory context
        memory_context = self.memory.retrieve(processed_obs)
        plan = self.brain.reason(processed_obs, memory_context, goal)

        # Action: execute the plan
        result = self.action.execute(plan)

        # Update memory
        self.memory.store(processed_obs, plan, result)
        return result
```
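A driving loop for such an agent feeds observations one step at a time while memory accumulates across steps. The `TinyAgent` below is a self-contained stand-in for the conceptual class, not the paper's implementation; its names and output format are assumptions for illustration.

```python
# Illustrative driving loop for a step-based agent. TinyAgent is a stand-in
# assumption: it only accumulates observations in memory and emits a label.
class TinyAgent:
    def __init__(self):
        self.memory = []

    def step(self, observation, goal):
        self.memory.append(observation)  # memory update, as in the framework
        return f"act(goal={goal}, seen={len(self.memory)})"

def run_episode(agent, observations, goal):
    """Feed observations one per step, collecting the resulting actions."""
    return [agent.step(obs, goal) for obs in observations]

actions = run_episode(TinyAgent(), ["page A", "page B"], "find pricing")
```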