====== ReAct Agents ======

ReAct agents combine reasoning and acting in an interleaved fashion, allowing large language models to generate verbal reasoning traces alongside task-specific actions.(([[https://arxiv.org/abs/2210.03629|Yao et al. - ReAct: Synergizing Reasoning and Acting in Language Models (2022)]])) Introduced by [[https://arxiv.org/abs/2210.03629|Yao et al., 2022]] in the paper "ReAct: Synergizing Reasoning and Acting in Language Models," the ReAct paradigm enables agents to dynamically plan, retrieve information, and adjust their approach based on observations from the environment. This synergy between thinking and doing has become a de facto architectural standard for production AI agents.

===== The ReAct Pattern =====

ReAct operates through an iterative **Thought -> Action -> Observation** loop:

  - **Thought**: The agent reasons about the current state, what information is needed, and what action to take next, using [[chain_of_thought_agents|chain-of-thought reasoning]]
  - **Action**: The agent selects and invokes a tool or takes an action based on its reasoning (e.g., search a database, call an API, execute code)
  - **Observation**: The result of the action is fed back to the agent as new context
  - The loop repeats until the task is complete or a stopping condition is met

This [[agent_loop|agent loop]] is fundamentally different from pure CoT (which reasons without acting) and from pure action generation (which acts without explicit reasoning). By interleaving the two, ReAct agents ground their reasoning in real observations and make their tool-use decisions interpretable.

===== Python Example =====

<code python>
from openai import OpenAI

client = OpenAI()

# Define available tools (stub implementations for the demo;
# eval() is unsafe on untrusted input and is used here only for brevity)
TOOLS = {
    "search": lambda query: f"Results for '{query}': Python was created by Guido van Rossum in 1991.",
    "calculate": lambda expr: str(eval(expr)),
}

TOOL_DESCRIPTIONS = "\n".join([
    "- search(query): Search for factual information",
    "- calculate(expr): Evaluate a math expression",
    "- finish(answer): Return the final answer",
])


def react_agent(question: str, max_steps: int = 5) -> str:
    """Simple ReAct agent: Thought -> Action -> Observation loop."""
    messages = [{"role": "system", "content": (
        f"You are a ReAct agent. Available tools:\n{TOOL_DESCRIPTIONS}\n\n"
        "For each step, output exactly:\n"
        "Thought: <your reasoning>\n"
        "Action: <tool>(<argument>)\n"
        "When ready, use: Action: finish(your answer)"
    )}]
    messages.append({"role": "user", "content": question})

    for step in range(max_steps):
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
        )
        output = resp.choices[0].message.content
        messages.append({"role": "assistant", "content": output})
        print(f"--- Step {step + 1} ---\n{output}")

        # Parse the first Action line in the model's output
        for line in output.split("\n"):
            if line.startswith("Action:"):
                action_str = line.split("Action:")[1].strip()
                func_name = action_str.split("(")[0]
                arg = action_str.split("(", 1)[1].rstrip(")")

                if func_name == "finish":
                    return arg

                # Execute the tool and feed the observation back as new context
                observation = TOOLS.get(func_name, lambda x: "Unknown tool")(arg)
                messages.append({"role": "user", "content": f"Observation: {observation}"})
                print(f"Observation: {observation}\n")
                break

    return "Max steps reached"


answer = react_agent("Who created Python and what year was that? How many years ago from 2025?")
print(f"Final answer: {answer}")
</code>

===== Original Research Results =====

The original ReAct paper demonstrated effectiveness across four benchmarks:(([[https://arxiv.org/abs/2210.03629|Yao et al. - ReAct: Synergizing Reasoning and Acting in Language Models (2022)]]))

  * **HotPotQA**: Multi-hop question answering with Wikipedia API access
  * **FEVER**: Fact verification tasks
  * **ALFWorld**: Text-based game environments requiring sequential decision-making
  * **WebShop**: Web navigation and shopping tasks

On HotPotQA and FEVER, ReAct outperformed standard action generation and remained competitive with pure [[chain_of_thought_agents|chain-of-thought]] approaches. The researchers found that combining ReAct with CoT, allowing the model to fall back to CoT when ReAct fails to converge, produced the most robust results.

===== Framework Implementations =====

ReAct has become the default agent pattern across major frameworks:

  * **[[langchain|LangChain]]**: The original AgentExecutor uses ReAct as its core reasoning pattern, with the agent receiving tool descriptions and generating Thought/Action/Observation sequences
  * **[[langgraph|LangGraph]]**: Builds on ReAct principles for stateful, graph-based agent workflows with more control over execution flow (see the sketch after this list)
  * **[[llamaindex|LlamaIndex]]**: Implements ReAct agents for retrieval-augmented tasks, combining reasoning with document search
  * **[[claude|Claude]]/[[anthropic|Anthropic]]**: Claude's agentic systems implement ReAct-style loops for tool-calling workflows
  * **[[openai_agents_sdk|OpenAI Agents SDK]]**: Uses ReAct-influenced reasoning-action cycles as the foundation for autonomous agent behavior

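As a concrete illustration of the LangGraph entry above, the following is a minimal sketch of a prebuilt ReAct agent. It assumes the langgraph and langchain-openai packages are installed, uses a stub search tool, and exact APIs may differ between releases; it is not taken from the original example.

<code python>
# Minimal sketch: prebuilt ReAct agent via LangGraph (assumed packages:
# langgraph, langchain-openai). The search tool is a stand-in stub.
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent


@tool
def search(query: str) -> str:
    """Search for factual information."""
    return f"Results for '{query}': Python was created by Guido van Rossum in 1991."


# create_react_agent wires up the Thought -> Action -> Observation loop:
# the model decides when to call the tool and when to answer directly.
agent = create_react_agent(ChatOpenAI(model="gpt-4o"), tools=[search])

result = agent.invoke(
    {"messages": [{"role": "user", "content": "Who created Python, and when?"}]}
)
print(result["messages"][-1].content)
</code>

Compared with the hand-rolled loop in the Python example, the framework handles prompt construction, action parsing, and observation routing, which is the main practical benefit of adopting one of these implementations.
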
===== ReAct vs. Other Patterns =====

ReAct occupies a middle ground between simpler and more complex agent architectures:

**vs. Pure Chain-of-Thought:** CoT provides 10-40% accuracy improvements on multi-step tasks(([[https://arxiv.org/abs/2201.11903|Wei et al. - Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (2022)]])) but relies entirely on the model's training knowledge. ReAct surpasses pure CoT by enabling tool-use loops that verify reasoning against external data, reducing hallucination. However, CoT is cheaper in token usage since it requires no tool calls.

**vs. [[plan_and_execute_agents|Plan-and-Execute]]:** Plan-and-execute generates a complete plan upfront, then executes the steps sequentially, whereas ReAct plans adaptively, one step at a time. Key tradeoffs:

  * ReAct is faster and cheaper for simple tasks (2,000-3,000 tokens vs. 3,000-4,500)
  * Plan-and-execute achieves higher accuracy on complex but predictable tasks (92% vs. 85%)
  * ReAct excels in unpredictable environments where the solution path depends on intermediate results
  * Plan-and-execute provides better visibility into the overall approach before execution begins

**vs. Function-Calling Agents:** Function-calling agents use structured JSON outputs to invoke tools but may lack explicit reasoning traces. ReAct provides more interpretable decision-making through its verbal Thought steps, at the cost of additional tokens. A minimal contrast between the two styles is sketched below.

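For concreteness, here is a minimal, hedged sketch of the function-calling style using the OpenAI Python SDK: the tool is exposed as a JSON schema and the model returns a structured tool call rather than a text line to parse. The tool schema, stub result, and model name are illustrative assumptions, not part of the original example.

<code python>
# Sketch: one "search" step in the function-calling style (OpenAI Python SDK).
# Tool schema and stub result are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "search",
        "description": "Search for factual information",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [{"role": "user", "content": "Who created Python?"}]
resp = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
msg = resp.choices[0].message

# The tool call arrives as structured JSON (no "Action: ..." line to parse),
# and any intermediate reasoning stays implicit unless explicitly prompted for.
if msg.tool_calls:
    call = msg.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = f"Results for '{args['query']}': Python was created by Guido van Rossum in 1991."
    messages.append(msg)
    messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    final = client.chat.completions.create(model="gpt-4o", messages=messages)
    print(final.choices[0].message.content)
</code>
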
===== Real-World Usage =====

In production, ReAct agents handle:

  * **Multi-hop information retrieval**: Queries where intermediate search results guide subsequent searches
  * **Dynamic tool selection**: Tasks requiring the agent to reason about which tool to use based on context
  * **Sequential decision-making**: Workflows requiring real-time adaptation based on observed outcomes
  * **Customer support**: Agents that reason about customer issues, look up account data, and take actions

===== Influence on Modern Agent Design =====

ReAct's core contribution, interleaving reasoning with action, established the architectural standard for production AI agents. Its influence extends to:

  * Tool-calling implementations across all major LLM APIs(([[https://arxiv.org/abs/2302.04761|Schick et al. - Toolformer: Language Models Can Teach Themselves to Use Tools (2023)]]))
  * The design of [[agent_orchestration|agent orchestration]] platforms emphasizing reasoning transparency
  * Hybrid approaches that combine ReAct with plan-and-execute for different task phases
  * [[extended_thinking|Extended Thinking]] features in models like Claude, which formalize the reasoning step with dedicated compute(([[https://arxiv.org/abs/2303.11366|Shinn et al. - Reflexion: Language Agents with Verbal Reinforcement Learning (2023)]]))

===== See Also =====

  * [[react_framework|ReAct: Reasoning and Acting]]
  * [[reasoning_models|Reasoning Models]]
  * [[reasoning_via_planning|RAP: Reasoning via Planning with LLM as World Model]]
  * [[chain_of_thought_agents|Chain of Thought Agents]]
  * [[inner_monologue_agents|Inner Monologue: Embodied Reasoning with Language Models]]

===== References =====