ReAct is a prompting framework that synergizes reasoning (chain-of-thought style) and acting (tool use and environment interaction) within large language models.1) Proposed by Yao et al., 2022 in “ReAct: Synergizing Reasoning and Acting in Language Models,” ReAct interleaves the generation of reasoning traces with task-specific actions, allowing the model to dynamically plan, retrieve information, and adjust its approach based on observations from the environment. This tight coupling of thought and action has proven effective for tasks such as question answering, fact verification, and interactive decision-making, and has become a foundational pattern for LLM agent architectures.
ReAct operates through an iterative Thought-Action-Observation cycle:2)

1. Thought: the model generates a free-form reasoning trace that decomposes the task and plans the next step.
2. Action: the model emits a task-specific action such as search[query], lookup[term], or finish[answer].
3. Observation: the environment returns the result of the action, which is appended to the context before the next thought.
This loop repeats until the model decides to terminate (e.g., by calling finish[answer]). The full trajectory of thoughts, actions, and observations is maintained in the prompt context, giving the model a working memory of its reasoning and actions so far.
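The loop described above can be sketched in a few lines of plain Python. This is an illustrative stand-in, not the paper's implementation: `fake_llm` scripts the model's outputs and `search` is a toy one-entry knowledge base.

```python
# Minimal sketch of the ReAct Thought-Action-Observation loop.
# `fake_llm` and the toy search tool are illustrative stand-ins.
import re

WIKI = {"SpaceX": "SpaceX was founded in 2002 by Elon Musk."}

def search(query):
    return WIKI.get(query, "No results found.")

def fake_llm(context):
    # A real LLM would generate the next thought/action from the context;
    # here two steps are scripted to show the loop mechanics.
    if "Observation:" not in context:
        return "Thought: I should look up SpaceX.\nAction: search[SpaceX]"
    return "Thought: I have what I need.\nAction: finish[Elon Musk, 2002]"

def react_loop(question, max_steps=5):
    context = f"Question: {question}\n"   # trajectory = working memory
    for _ in range(max_steps):
        step = fake_llm(context)
        context += step + "\n"
        match = re.search(r"Action: (\w+)\[(.*)\]", step)
        name, arg = match.group(1), match.group(2)
        if name == "finish":              # model decided to terminate
            return arg, context
        observation = search(arg) if name == "search" else "Unknown action."
        context += f"Observation: {observation}\n"  # fed back next iteration
    return None, context

answer, trajectory = react_loop("Who founded SpaceX and when?")
print(answer)  # → Elon Musk, 2002
```

Note that the entire trajectory string is passed back to the model on every step, which is exactly the "working memory" behavior described above.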
The key innovation is that thoughts and actions are interleaved, not separated. The reasoning traces ground the model's decisions in explicit logic, while the actions ground the reasoning in real-world information, reducing hallucination.
| Approach | Mechanism | Strengths | Weaknesses |
|---|---|---|---|
| Chain-of-Thought | Pure internal reasoning without external actions | Strong on knowledge tasks; no environment needed3) | No external grounding; hallucinations on factual queries |
| Action-Only | Direct tool calls without explicit reasoning | Low overhead; fast execution | Opaque decisions; poor error recovery |
| ReAct | Interleaved reasoning + actions | Interpretable; robust to failures; self-correcting | Higher token cost; potential reasoning loops |
On HotpotQA (multi-hop question answering), ReAct with PaLM-540B achieved 41% exact match, compared to 37% for CoT alone.4) The reasoning traces allowed the model to decompose multi-hop questions and correct course when initial searches returned irrelevant results.
On FEVER (fact verification), ReAct similarly outperformed both reasoning-only and action-only baselines by using search actions to verify claims and reasoning to synthesize the retrieved evidence.
On ALFWorld (text-based household tasks), ReAct completed 30-40% of episodes, far exceeding imitation learning baselines (~10%), by reasoning through environment feedback to plan multi-step actions like finding and cleaning objects.
ReAct's effectiveness depends on the design of the action space. In the original paper, actions were simple text commands (search, lookup, finish) interfacing with Wikipedia. In practice, the action space can include much richer tools, such as web search APIs, code execution, and database or retrieval queries.
The action space should be well-defined with clear semantics, as ambiguous action definitions lead to the model making incorrect tool calls. Each tool should return structured observations that the model can reason about in subsequent thoughts.
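One way to enforce clear semantics and structured observations is to register each tool with an explicit name and description, and to wrap every result (including errors) in a fixed observation type. The registry and `Observation` dataclass below are illustrative assumptions, not an API from the paper:

```python
# Sketch of a well-defined action space: each tool declares its semantics,
# and every call returns a structured observation the model can reason over.
from dataclasses import dataclass

@dataclass
class Observation:
    tool: str
    ok: bool
    content: str

TOOLS = {}

def register(name, description):
    """Attach a name and usage description to a tool function."""
    def wrap(fn):
        TOOLS[name] = {"fn": fn, "description": description}
        return fn
    return wrap

@register("search", "search[query]: return a short summary for the query")
def search(query):
    data = {"ReAct": "ReAct interleaves reasoning traces with actions."}
    return data.get(query, "No results found.")

def execute(name, arg):
    # Unknown or ambiguous actions become explicit, recoverable observations
    # instead of silent failures.
    if name not in TOOLS:
        return Observation(name, False, f"Unknown tool; available: {sorted(TOOLS)}")
    return Observation(name, True, TOOLS[name]["fn"](arg))

obs = execute("search", "ReAct")
print(obs.ok, obs.content)
bad = execute("lookup", "ReAct")
print(bad.content)  # tells the model which tools actually exist
```

Returning the list of available tools on a bad call gives the model something concrete to reason about in its next thought, rather than a bare error.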
ReAct has become the dominant pattern for LLM agent implementations:
LangChain5) provides create_react_agent (maintained in LangGraph's prebuilt module in current versions), which parses LLM outputs for thought/action pairs and manages the execution loop. It supports custom tools and integrates with the broader LangChain ecosystem for chains and memory.
The following example demonstrates a LangChain ReAct agent with a search tool that follows the thought/action/observation loop:
```python
# ReAct agent using LangChain with a search tool
from langchain_openai import ChatOpenAI
from langchain_community.tools import DuckDuckGoSearchRun
from langgraph.prebuilt import create_react_agent

llm = ChatOpenAI(model="gpt-4o")
tools = [DuckDuckGoSearchRun()]

# create_react_agent builds the thought-action-observation loop automatically
agent = create_react_agent(llm, tools)

# The agent reasons step-by-step, calling search as needed
result = agent.invoke(
    {"messages": [{"role": "user", "content": "Who founded SpaceX and when?"}]}
)
print(result["messages"][-1].content)
```
LlamaIndex implements ReAct-style agents for retrieval-augmented generation, where the agent reasons about which query engine or index to use, executes the retrieval, and reasons about the results before answering.
Production patterns commonly seen in ReAct deployments include:

- Iteration caps and timeouts to prevent unbounded reasoning loops.
- Retry logic when the model emits malformed or unparseable action strings.
- Argument validation against tool schemas before execution.
- Logging of full thought/action/observation trajectories for debugging and evaluation.
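Two safeguards that recur in production ReAct loops, an iteration cap and retries on malformed action strings, can be sketched as a wrapper around the loop. The `Action: name[arg]` format and the helper names here are assumptions for illustration, not a fixed standard:

```python
# Sketch of two common production safeguards for a ReAct loop:
# a hard iteration cap (breaks reasoning loops) and a retry when the
# model emits an action string the parser cannot understand.
import re

ACTION_RE = re.compile(r"Action: (\w+)\[(.*)\]")  # assumed action format

def parse_action(text):
    m = ACTION_RE.search(text)
    return (m.group(1), m.group(2)) if m else None

def run_agent(llm_step, max_steps=8, max_retries=2):
    context = ""
    for _ in range(max_steps):
        for _attempt in range(max_retries + 1):
            step = llm_step(context)
            action = parse_action(step)
            if action:
                break
            # Feed the failure back so the model can reformat its output.
            context += "Observation: could not parse action, use Action: name[arg]\n"
        else:
            return None  # give up after repeated malformed outputs
        context += step + "\n"
        name, arg = action
        if name == "finish":
            return arg
    return None  # iteration cap reached without finish[...]

# Scripted stand-in for an LLM: malformed on the first call, valid after.
calls = []
def flaky_llm(context):
    calls.append(1)
    return "I think the answer is 42" if len(calls) == 1 else "Action: finish[42]"

print(run_agent(flaky_llm))  # → 42
```

Feeding the parse failure back as an observation, rather than raising, lets the model self-correct in the same way it recovers from unhelpful search results.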
By 2025, ReAct has evolved into a family of architectures incorporating multi-agent collaboration, vision capabilities (for temporal action detection), and hybrid approaches with reward models. It remains the most widely implemented agent reasoning pattern, though newer frameworks like Reflexion6) and planning-based approaches extend it with self-improvement and formal search.