====== Chain of Thought Agents ======

Chain of Thought (CoT) agents are AI systems that explicitly decompose complex problems into intermediate reasoning steps before arriving at a final answer or action. By verbalizing their thought process, these agents achieve significantly improved performance on tasks requiring multi-step logic, arithmetic, and commonsense reasoning(([[https://arxiv.org/abs/2201.11903|Wei et al. "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models" arXiv:2201.11903, 2022]])). CoT prompting has become a foundational technique in modern agent design, enabling more transparent and reliable decision-making across [[autonomous_agents|autonomous agent]] systems.

===== Origins and Evolution =====

CoT prompting was introduced by [[https://arxiv.org/abs/2201.11903|Wei et al., 2022]] at [[google|Google]], who demonstrated that providing few-shot examples with explicit reasoning chains dramatically improved performance on math and logic benchmarks. The technique evolved rapidly:

  * **Zero-Shot CoT (2022)**: [[https://arxiv.org/abs/2205.11916|Kojima et al., 2022]] showed that simply appending "Let's think step by step" to a prompt elicits reasoning without any examples(([[https://arxiv.org/abs/2205.11916|Kojima et al. "Large Language Models are Zero-Shot Reasoners" arXiv:2205.11916, 2022]]))
  * **Self-Consistency (2023)**: [[https://arxiv.org/abs/2203.11171|Wang et al., 2023]] proposed generating multiple reasoning paths and selecting the most consistent answer via majority voting, reducing hallucinations(([[https://arxiv.org/abs/2203.11171|Wang et al. "Self-Consistency Improves Chain of Thought Reasoning in Language Models" arXiv:2203.11171, 2023]]))
  * **Tree-of-Thought (2023)**: [[https://arxiv.org/abs/2305.10601|Yao et al., 2023]] extended CoT to explore branching reasoning paths like a search tree, with self-evaluation at each node for complex decision-making(([[https://arxiv.org/abs/2305.10601|Yao et al. "Tree of Thoughts: Deliberate Problem Solving with Large Language Models" arXiv:2305.10601, 2023]]))
  * **Graph-of-Thought (2023)**: [[https://arxiv.org/abs/2308.09687|Besta et al., 2023]] organized reasoning into directed acyclic graphs, enabling interconnected reasoning paths for multifaceted problems beyond linear chains(([[https://arxiv.org/abs/2308.09687|Besta et al. "Graph of Thoughts: Solving Elaborate Problems with Large Language Models" arXiv:2308.09687, 2023]]))

By 2025, CoT had evolved from a simple prompting technique into a core architectural component of reasoning models.

===== Reasoning Models =====

The most significant evolution of CoT is its internalization within dedicated reasoning models:

  * **[[openai|OpenAI]] o1/o3**: Use internal long [[chain_of_thought|chain-of-thought reasoning]], spending additional compute on "thinking" before responding. These models excel at mathematics, coding, and scientific reasoning by generating hundreds of intermediate reasoning steps internally.
  * **DeepSeek R1**: Open-source reasoning model that sustains extended chains of logical reasoning, trained with [[rlhf|GRPO]] (Group Relative Policy Optimization) to develop its reasoning capabilities(([[https://arxiv.org/abs/2501.12948|DeepSeek-AI et al. "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning" arXiv:2501.12948, 2025]])).
  * **Claude's Extended Thinking**: [[anthropic|Anthropic]]'s approach enables prolonged step-by-step deliberation via an API parameter, with separate pricing for reasoning tokens. Claude works through the logic in a dedicated thinking [[block|block]] before generating its final response, making the reasoning chain inspectable for debugging and trust, and shifting the standard LLM contract from fire-and-forget to a transparent process in which the model shows its work (see the sketch after this section)(([[https://cobusgreyling.substack.com/p/building-with-claude-extended-thinking|Cobus Greyling "Extended Thinking LLMs" (2026)]])).

These models demonstrate that CoT is not merely a prompting technique but a fundamental capability that can be trained into models through [[reinforcement_learning|reinforcement learning]].
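The following is a minimal sketch of requesting an inspectable reasoning chain through Anthropic's Python SDK, assuming the ''anthropic'' package and its documented ''thinking'' request parameter; the model ID, token budgets, and prompt are placeholders rather than recommendations.

<code python>
# Minimal sketch: eliciting an inspectable reasoning chain via extended thinking.
# Assumes the `anthropic` SDK; model ID and budgets are placeholder values.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",   # placeholder model ID
    max_tokens=16000,                   # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 8000},
    messages=[
        {"role": "user", "content": "How many prime numbers lie between 100 and 150?"}
    ],
)

# The response interleaves typed content blocks: thinking blocks expose the
# chain of thought, text blocks carry the final answer.
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking)
    elif block.type == "text":
        print("[answer]", block.text)
</code>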
===== CoT in Agent Systems =====

Within [[autonomous_agents|autonomous agent]] architectures, CoT serves multiple roles:

  * **Planning**: Agents use CoT to decompose objectives into subtasks, forming the reasoning backbone of [[plan_and_execute_agents|plan-and-execute]] patterns
  * **Tool Selection**: [[react_agents|ReAct agents]] use CoT-style reasoning to determine which [[tool_using_agents|tools]] to invoke and in what order
  * **Self-Verification**: Agents apply CoT to check their own outputs for consistency and correctness before committing to actions
  * **Error Recovery**: When actions fail, CoT enables agents to reason about what went wrong and generate alternative approaches

The combination of CoT with tool use, as in the [[react_agents|ReAct pattern]], produces agents that are both more capable and more interpretable than either pure reasoning or pure action approaches.

===== Code Example: CoT Agent with Step-by-Step Reasoning =====

<code python>
import re

from openai import OpenAI

client = OpenAI()

COT_SYSTEM_PROMPT = """You are a reasoning agent. For every question:
1. Break the problem into explicit reasoning steps
2. Show your work for each step inside <step> tags
3. Verify your reasoning in a <verification> tag
4. Give your final answer in an <answer> tag

Example format:
<step>First, I identify that...</step>
<step>Next, I calculate...</step>
<verification>Checking: ...</verification>
<answer>The answer is...</answer>"""

def cot_agent(question: str) -> dict:
    """Run a chain-of-thought agent that shows explicit reasoning steps."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": COT_SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
        temperature=0.2,
    )
    raw = response.choices[0].message.content

    # Parse the structured reasoning output
    steps = re.findall(r"<step>(.*?)</step>", raw, re.DOTALL)
    verification = re.findall(r"<verification>(.*?)</verification>", raw, re.DOTALL)
    answer = re.findall(r"<answer>(.*?)</answer>", raw, re.DOTALL)

    return {
        "steps": [s.strip() for s in steps],
        "verification": verification[0].strip() if verification else None,
        "answer": answer[0].strip() if answer else raw,
        "raw": raw,
    }

result = cot_agent(
    "A store has 3 types of boxes. Small holds 4 items, medium holds 9, large holds 15. "
    "I need to pack exactly 58 items using the fewest boxes. What combination should I use?"
)

print("Reasoning steps:")
for i, step in enumerate(result["steps"], 1):
    print(f"  {i}. {step}")
if result["verification"]:
    print(f"\nVerification: {result['verification']}")
print(f"\nFinal answer: {result['answer']}")
</code>
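The agent above commits to a single reasoning chain. A natural hardening step is self-consistency (Wang et al., described earlier): sample several chains at higher temperature and majority-vote over the extracted answers. The sketch below reuses ''client'', ''COT_SYSTEM_PROMPT'', and the ''<answer>'' tag from the example above; the helper name and the exact-string voting rule are illustrative simplifications.

<code python>
# Self-consistency sketch: vote across several sampled reasoning chains.
# Reuses `client`, `COT_SYSTEM_PROMPT`, and `re` from the example above;
# the helper name and exact-string voting rule are illustrative.
from collections import Counter

def cot_agent_self_consistent(question: str, paths: int = 5) -> dict:
    """Sample independent reasoning chains and majority-vote the final answer."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": COT_SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
        temperature=0.8,  # higher temperature diversifies the reasoning paths
        n=paths,          # one completion per reasoning path
    )

    # Extract the <answer> tag from each sampled chain.
    answers = []
    for choice in response.choices:
        found = re.findall(r"<answer>(.*?)</answer>", choice.message.content, re.DOTALL)
        if found:
            answers.append(found[0].strip())

    votes = Counter(answers)
    best, count = votes.most_common(1)[0] if votes else (None, 0)
    return {
        "answer": best,
        "votes": dict(votes),
        "agreement": count / max(len(answers), 1),
    }
</code>

Exact-string voting suffices for short numeric answers like the box-packing question above; production systems typically normalize answers (stripping units, canonicalizing numbers) before counting votes.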
{step}") if result["verification"]: print(f"\nVerification: {result['verification']}") print(f"\nFinal answer: {result['answer']}") ===== Multimodal and Structured CoT ===== Recent advances extend CoT beyond text: * **Multimodal CoT**: Reasoning over images, audio, and video alongside text, grounding chain-of-thought in diverse data modalities * **Structured CoT for Code Generation**: Research published in ACM TOSEM (2025) proposes structured CoT variants specifically optimized for code reasoning tasks * **Action Chain-of-Thought**: [[agibot|AGIBOT]]'s GO-2 model employs action-oriented CoT where the AI generates a sequence of executable action intents before committing to raw control commands, translating high-level reasoning into reliable physical movements for robotic control through a semantic path that guides fast-following control modules(([[https://www.rohan-paul.com/p/[[cursor|cursor]]-just-turned-its-agent-workflow|Rohan's Bytes "Action Chain-of-Thought" (2026]])) * **Multi-Agent CoT**: Multiple LLMs collaborating through shared reasoning chains, with each agent contributing domain-specific reasoning steps ===== See Also ===== * [[chain_of_thought|Chain-of-Thought Reasoning]] * [[cognitive_architectures_language_agents|Cognitive Architectures for Language Agents (CoALA)]] * [[agent_loop|Agent Loop]] * [[hypothesis_testing_iteration|Hypothesis Testing and Iteration]] * [[advanced_reasoning_planning|Advanced Reasoning and Planning]] ===== References =====