Advanced reasoning and planning encompasses the techniques and architectures that enable AI agents to break down complex problems, formulate multi-step strategies, and adapt their approach based on intermediate results. These capabilities are fundamental to building agents that can operate autonomously on open-ended tasks, moving beyond simple prompt-response interactions to exhibit goal-directed behavior.
Chain-of-Thought (CoT) prompting, introduced by Wei et al. (2022), remains the foundational technique for eliciting step-by-step reasoning from LLMs. By including intermediate reasoning steps in the prompt, CoT dramatically improves performance on arithmetic, commonsense, and symbolic reasoning tasks.
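A minimal sketch of few-shot CoT prompting, assuming the prompt is later passed to some LLM completion API (not shown): the worked example's explicit intermediate steps cue the model to reason the same way before answering.

```python
# One worked example with explicit intermediate steps (few-shot CoT).
COT_EXAMPLE = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
    "How many balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend the worked example so the model imitates step-by-step reasoning."""
    return f"{COT_EXAMPLE}\nQ: {question}\nA:"

prompt = build_cot_prompt("A farm has 3 fields of 12 cows each. How many cows?")
print(prompt)
```

Without the worked example, the same question often elicits a bare (and more error-prone) final answer; the demonstration is what shifts the model toward emitting its reasoning first.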
Key variants and extensions include:
Modern reasoning models like OpenAI o3, DeepSeek-R1, and Claude 3.7 Sonnet use extended CoT with inference-time compute scaling, where additional computation at generation time yields deeper, more accurate reasoning.
Tree of Thoughts (ToT), introduced by Yao et al. (2023), organizes reasoning into a tree in which multiple reasoning paths are explored in parallel via breadth-first or depth-first search. Each node represents an intermediate “thought” that the LLM evaluates for progress toward the goal.
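The breadth-first variant can be sketched as a beam search over thoughts. Here `propose_thoughts` and `score_thought` are hypothetical stand-ins for the two LLM calls (candidate generation and self-evaluation); deterministic stubs are used so the sketch runs as-is.

```python
def propose_thoughts(state: str, k: int = 3) -> list[str]:
    # In a real agent, an LLM call proposing k candidate next thoughts.
    return [f"{state}->{i}" for i in range(k)]

def score_thought(state: str) -> float:
    # In a real agent, an LLM call rating progress toward the goal.
    return -len(state)  # stub: deterministic placeholder score

def tot_bfs(root: str, depth: int, beam: int = 2) -> str:
    """Breadth-first ToT: expand every frontier state, keep the best `beam`."""
    frontier = [root]
    for _ in range(depth):
        candidates = [t for s in frontier for t in propose_thoughts(s)]
        frontier = sorted(candidates, key=score_thought, reverse=True)[:beam]
    return max(frontier, key=score_thought)

best = tot_bfs("start", depth=2)
print(best)
```

The `beam` parameter is the key cost/quality knob: a wider beam explores more alternative reasoning paths per level at the price of more LLM calls.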
Graph of Thoughts (GoT), proposed by Besta et al. (2024, ETH Zurich), generalizes CoT and ToT by modeling reasoning as an arbitrary directed graph. This enables:
Matrix of Thought (MoT) (Tang et al., 2025) re-evaluates the chain-vs-tree tradeoff and proposes structured matrices that capture both sequential and parallel reasoning dimensions.
A comprehensive taxonomy by Besta et al. (2025) titled “Demystifying Chains, Trees, and Graphs of Thoughts” provides a unified framework comparing these topologies across efficiency, accuracy, and cost dimensions.
For complex, long-horizon tasks, agents employ hierarchical decomposition:
Modern agents like OpenAI Deep Research and Anthropic Claude use hierarchical planning to break hours-long research tasks into manageable sub-tasks, coordinating tool use, memory retrieval, and synthesis.
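Hierarchical decomposition of this kind can be sketched as a recursive planner: split a goal into subtasks until they are primitive, then execute the leaves in plan order. `decompose` and `execute_primitive` are hypothetical stand-ins for an LLM planning call and a tool/execution step; the plan table below is a made-up stub so the sketch runs.

```python
def decompose(task: str) -> list[str]:
    # In a real agent, an LLM call; here a fixed illustrative plan table.
    plans = {
        "write report": ["gather sources", "draft sections", "edit draft"],
        "draft sections": ["draft intro", "draft body"],
    }
    return plans.get(task, [])

def execute_primitive(task: str) -> str:
    # In a real agent: tool use, retrieval, or generation for a leaf task.
    return f"done:{task}"

def run(task: str) -> list[str]:
    """Depth-first execution: recurse into subtasks, execute leaves in order."""
    subtasks = decompose(task)
    if not subtasks:  # no further decomposition: a primitive task
        return [execute_primitive(task)]
    results: list[str] = []
    for sub in subtasks:
        results.extend(run(sub))
    return results

print(run("write report"))
# ['done:gather sources', 'done:draft intro', 'done:draft body', 'done:edit draft']
```

The recursion keeps each planning call small: the top-level plan never needs to enumerate every leaf, only its immediate children, which is what makes hours-long tasks tractable for a bounded context window.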
As of 2025, frontier models employ distinct reasoning strategies:
Key benchmarks for evaluating reasoning and planning: