ReAct Agents

ReAct agents combine reasoning and acting in an interleaved fashion, allowing large language models to generate verbal reasoning traces alongside task-specific actions. 1) Introduced by Yao et al. (2022) in the paper “ReAct: Synergizing Reasoning and Acting in Language Models,” the ReAct paradigm lets agents plan dynamically, retrieve information, and adjust their approach based on observations from the environment. This synergy between thinking and doing has become a common architectural pattern for production AI agents.

The ReAct Pattern

ReAct operates through an iterative Thought → Action → Observation loop:

  1. Thought: The agent reasons about the current state, what information is needed, and what action to take next, using chain-of-thought reasoning
  2. Action: The agent selects and invokes a tool or takes an action based on its reasoning (e.g., search a database, call an API, execute code)
  3. Observation: The result of the action is fed back to the agent as new context
  4. The loop repeats until the task is complete or a stopping condition is met

This agent loop differs fundamentally from pure chain-of-thought (CoT) prompting, which reasons without acting, and from pure action generation, which acts without explicit reasoning. By interleaving the two, ReAct agents ground their reasoning in real observations and make their tool-use decisions interpretable.
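
The control flow of the loop can be sketched without any model call. In the minimal sketch below, `Step`, `run_loop`, `scripted_policy`, and the `lookup` tool are hypothetical names introduced only for illustration; a hard-coded policy stands in for the language model so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Step:
    """One iteration of the loop (hypothetical structure for illustration)."""
    thought: str
    action: str                      # tool name, or "finish"
    argument: str
    observation: Optional[str] = None


def run_loop(
    policy: Callable[[List[Step]], Step],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> Tuple[Optional[str], List[Step]]:
    """Run Thought -> Action -> Observation until finish or max_steps."""
    trace: List[Step] = []
    for _ in range(max_steps):
        step = policy(trace)         # Thought + Action, given all prior steps
        trace.append(step)
        if step.action == "finish":
            return step.argument, trace
        tool = tools.get(step.action)
        # Observation: the tool result becomes context for the next step
        step.observation = tool(step.argument) if tool else f"Unknown tool: {step.action}"
    return None, trace               # stopping condition: step budget exhausted


def scripted_policy(trace: List[Step]) -> Step:
    """Stand-in for the model: look something up, then answer from the result."""
    if not trace:
        return Step("I need the release year first.", "lookup", "python_release_year")
    year = int(trace[-1].observation)
    return Step(f"The lookup returned {year}; I can answer.", "finish", str(2025 - year))


tools = {"lookup": lambda key: {"python_release_year": "1991"}.get(key, "not found")}
answer, trace = run_loop(scripted_policy, tools)
print(answer)
```

The second step's reasoning depends on the first step's observation, which is the property that separates this loop from reasoning-only or action-only generation.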

Python Example

from openai import OpenAI
import json
 
client = OpenAI()
 
# Define available tools
TOOLS = {
    "search": lambda query: f"Results for '{query}': Python was created by Guido van Rossum in 1991.",
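    # Demo only: eval is unsafe on untrusted input. Builtins are stripped as a
    # minimal precaution; this is not a sandbox.
    "calculate": lambda expr: str(eval(expr, {"__builtins__": {}}, {})),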
}
 
TOOL_DESCRIPTIONS = "\n".join([
    "- search(query): Search for factual information",
    "- calculate(expr): Evaluate a math expression",
    "- finish(answer): Return the final answer",
])
 
def react_agent(question: str, max_steps: int = 5) -> str:
    """Simple ReAct agent: Thought -> Action -> Observation loop."""
    messages = [{"role": "system", "content": (
        f"You are a ReAct agent. Available tools:\n{TOOL_DESCRIPTIONS}\n\n"
        "For each step, output exactly:\n"
        "Thought: <your reasoning>\n"
        "Action: <tool_name>(argument)\n"
        "When ready, use: Action: finish(your answer)"
    )}]
    messages.append({"role": "user", "content": question})
 
    for step in range(max_steps):
        resp = client.chat.completions.create(
            model="gpt-4o", messages=messages
        )
        output = resp.choices[0].message.content
        messages.append({"role": "assistant", "content": output})
        print(f"--- Step {step + 1} ---\n{output}")
 
        # Parse the action
        for line in output.split("\n"):
            if line.startswith("Action:"):
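                action_str = line.split("Action:", 1)[1].strip()
                if "(" not in action_str:
                    continue  # malformed action line; keep scanning
                func_name = action_str.split("(")[0].strip()
                # Drop only the final closing parenthesis so nested ones survive
                arg = action_str.split("(", 1)[1].rsplit(")", 1)[0]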
 
                if func_name == "finish":
                    return arg
 
                # Execute tool and feed observation back
                observation = TOOLS.get(func_name, lambda x: "Unknown tool")(arg)
                messages.append({"role": "user", "content": f"Observation: {observation}"})
                print(f"Observation: {observation}\n")
                break
    return "Max steps reached"
 
answer = react_agent("Who created Python and what year was that? How many years ago from 2025?")
print(f"Final answer: {answer}")
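
Two fragile spots in the example above are the string-splitting action parser and the `eval`-based `calculate` tool. The following is a minimal sketch of sturdier replacements using only the standard library. The names `parse_action` and `safe_calculate` are hypothetical, and the supported operator set is an assumption chosen for illustration.

```python
import ast
import operator
import re
from typing import Optional, Tuple

# Matches a line of the form "Action: tool_name(argument)"
ACTION_RE = re.compile(r"^Action:\s*(\w+)\((.*)\)\s*$", re.MULTILINE)


def parse_action(output: str) -> Optional[Tuple[str, str]]:
    """Return (tool_name, argument) for the first Action line, or None."""
    match = ACTION_RE.search(output)
    if match is None:
        return None
    name, arg = match.group(1), match.group(2).strip()
    # Strip one pair of surrounding quotes if the model added them
    if len(arg) >= 2 and arg[0] == arg[-1] and arg[0] in "\"'":
        arg = arg[1:-1]
    return name, arg


_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_calculate(expr: str) -> str:
    """Evaluate +, -, *, / arithmetic by walking the AST instead of calling eval."""

    def evaluate(node: ast.AST) -> float:
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](evaluate(node.operand))
        raise ValueError("unsupported expression")

    try:
        return str(evaluate(ast.parse(expr, mode="eval").body))
    except (SyntaxError, ValueError, ZeroDivisionError) as exc:
        # Return the error as text so the agent can see it as an observation
        return f"Error: {exc}"
```

Returning errors as strings rather than raising keeps the loop alive: the failure becomes an observation the agent can reason about on its next step.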

Original Research Results

The original ReAct paper demonstrated effectiveness across four benchmarks: HotpotQA, FEVER, ALFWorld, and WebShop. 2)

On HotpotQA and FEVER, ReAct outperformed standard action generation and remained competitive with pure chain-of-thought approaches. The researchers found that combining ReAct with CoT, allowing the model to fall back to CoT when ReAct fails to converge, produced the most robust results.
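
The fallback idea can be sketched as a thin wrapper around two solvers. This is an illustration under stated assumptions, not the paper's implementation: `react_solver` and `cot_sampler` are hypothetical callables, `react_solver` is assumed to return `None` when it fails to converge within its step budget, and the majority vote over several CoT samples is an assumption about how the fallback answer is chosen.

```python
from collections import Counter
from typing import Callable, Optional


def answer_with_fallback(
    question: str,
    react_solver: Callable[[str], Optional[str]],
    cot_sampler: Callable[[str], str],
    n_samples: int = 5,
) -> str:
    """Try ReAct first; if it does not converge, fall back to sampled CoT."""
    answer = react_solver(question)
    if answer is not None:
        return answer
    # Fallback: sample several reasoning-only answers and take the most common
    samples = [cot_sampler(question) for _ in range(n_samples)]
    return Counter(samples).most_common(1)[0][0]


# Usage with stand-in solvers (no model calls)
canned = iter(["34", "33", "34"])
result = answer_with_fallback(
    "How many years from 1991 to 2025?",
    react_solver=lambda q: None,          # simulate a run that hit max steps
    cot_sampler=lambda q: next(canned),   # simulate three sampled CoT answers
    n_samples=3,
)
print(result)
```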

Framework Implementations

ReAct has become the default agent pattern across major frameworks:

ReAct vs. Other Patterns

ReAct occupies a middle ground between simpler and more complex agent architectures:

vs. Pure Chain-of-Thought: CoT provides 10-40% accuracy improvements on multi-step tasks3) but relies entirely on the model's training knowledge. ReAct surpasses pure CoT by enabling tool-use loops that verify reasoning against external data, reducing hallucination. However, CoT is cheaper in token usage since it requires no tool calls.

vs. Plan-and-Execute: Plan-and-execute generates a complete plan upfront, then executes steps sequentially. ReAct plans adaptively, one step at a time. Key tradeoffs:
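
The difference in control flow can be made concrete with a toy example. In this sketch, `plan_and_execute`, `react_style`, `choose_next`, and the redirecting `lookup` tool are all hypothetical and exist only for illustration; real plan-and-execute systems may add replanning steps that this simplification leaves out.

```python
from typing import Callable, Dict, List, Optional, Tuple

Action = Tuple[str, str]  # (tool name, argument)
Tools = Dict[str, Callable[[str], str]]


def plan_and_execute(plan: List[Action], tools: Tools) -> List[str]:
    """Run a plan fixed upfront; later steps cannot react to observations."""
    return [tools[name](arg) for name, arg in plan]


def react_style(
    choose_next: Callable[[List[str]], Optional[Action]],
    tools: Tools,
    max_steps: int = 5,
) -> List[str]:
    """Pick each action only after seeing every observation so far."""
    observations: List[str] = []
    for _ in range(max_steps):
        action = choose_next(observations)
        if action is None:
            break
        name, arg = action
        observations.append(tools[name](arg))
    return observations


# A toy lookup table where one key redirects to another
table = {"py": "see: python", "python": "1991"}
tools: Tools = {"lookup": lambda key: table.get(key, "not found")}


def choose_next(observations: List[str]) -> Optional[Action]:
    if not observations:
        return ("lookup", "py")
    if observations[-1].startswith("see: "):
        # Adapt: follow the redirect revealed by the previous observation
        return ("lookup", observations[-1][len("see: "):])
    return None


fixed_result = plan_and_execute([("lookup", "py")], tools)
adaptive_result = react_style(choose_next, tools)
print(fixed_result, adaptive_result)
```

The fixed plan stops at the redirect because it was written before the redirect was known, while the adaptive loop follows it at the cost of one extra decision step.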

vs. Function-Calling Agents: Function-calling agents use structured JSON outputs to invoke tools but may lack explicit reasoning traces. ReAct provides more interpretable decision-making through its verbal Thought steps, at the cost of additional tokens.
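
To illustrate the structured side of this comparison, the sketch below dispatches a tool call expressed as JSON. The `{"name": ..., "arguments": {...}}` shape, `JSON_TOOLS`, and `dispatch_json_call` are assumptions made for illustration and are not any vendor's exact wire format.

```python
import json
from typing import Callable, Dict

# Hypothetical tool registry; keyword arguments mirror the JSON "arguments" object
JSON_TOOLS: Dict[str, Callable[..., str]] = {
    "search": lambda query: f"Results for '{query}'",
}


def dispatch_json_call(raw: str) -> str:
    """Validate and execute a tool call encoded as a JSON object."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return f"Error: invalid JSON ({exc.msg})"
    if not isinstance(call, dict):
        return "Error: tool call must be a JSON object"
    name = call.get("name")
    tool = JSON_TOOLS.get(name) if isinstance(name, str) else None
    if tool is None:
        return f"Error: unknown tool {name!r}"
    args = call.get("arguments", {})
    if not isinstance(args, dict):
        return "Error: arguments must be a JSON object"
    try:
        return tool(**args)
    except TypeError as exc:
        # Raised when argument names do not match the tool's parameters
        return f"Error: bad arguments ({exc})"


print(dispatch_json_call('{"name": "search", "arguments": {"query": "ReAct"}}'))
```

The structured call is easy to validate mechanically, but nothing in it records why the tool was chosen; in the text-based format, that explanation lives in the Thought line.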

Real-World Usage

In production, ReAct agents handle:

Influence on Modern Agent Design

ReAct's core contribution, interleaving reasoning with action, established the architectural standard for production AI agents. Its influence extends to:

See Also

References