Plan-and-execute agents separate the planning phase from the execution phase, first generating a complete step-by-step plan and then carrying out each step individually.1) This two-phase architecture addresses limitations of purely reactive agents like ReAct by providing a structured roadmap before any actions are taken. The approach improves reliability on complex multi-step tasks by reducing compounding errors that arise from step-by-step improvisation.
The plan-and-execute pattern consists of two core components operating in sequence: a planner that decomposes the objective into an ordered list of concrete steps, and an executor that carries out each step in turn, passing accumulated results forward.
State management persists the plan, current step index, and accumulated results, enabling resumability if execution is interrupted. This separation of concerns allows different models or configurations for planning versus execution, for example, using a more capable model for planning and a faster model for routine execution steps.2)
```python
from langgraph.graph import StateGraph, START, END
from langchain_openai import ChatOpenAI
from typing import TypedDict, Annotated
import operator

class PlanExecuteState(TypedDict):
    objective: str
    plan: list[str]
    current_step: int
    results: Annotated[list[str], operator.add]  # appended to, not overwritten
    final_answer: str

# A more capable model for planning, a faster model for routine execution
planner_llm = ChatOpenAI(model="gpt-4o")
executor_llm = ChatOpenAI(model="gpt-4o-mini")

def plan_step(state: PlanExecuteState) -> dict:
    resp = planner_llm.invoke(
        f"Break this objective into 3-5 concrete steps:\n{state['objective']}\n"
        "Return each step on a new line, numbered."
    )
    steps = [line.strip() for line in resp.content.split("\n") if line.strip()]
    return {"plan": steps, "current_step": 0}

def execute_step(state: PlanExecuteState) -> dict:
    idx = state["current_step"]
    step = state["plan"][idx]
    context = "\n".join(state["results"]) if state["results"] else "No prior results."
    resp = executor_llm.invoke(
        f"Execute this step: {step}\nPrior results:\n{context}"
    )
    return {"results": [resp.content], "current_step": idx + 1}

def should_continue(state: PlanExecuteState) -> str:
    return "execute" if state["current_step"] < len(state["plan"]) else "finalize"

def finalize(state: PlanExecuteState) -> dict:
    all_results = "\n".join(state["results"])
    resp = planner_llm.invoke(f"Synthesize these results:\n{all_results}")
    return {"final_answer": resp.content}

# Build the LangGraph state machine
graph = StateGraph(PlanExecuteState)
graph.add_node("plan", plan_step)
graph.add_node("execute", execute_step)
graph.add_node("finalize", finalize)
graph.add_edge(START, "plan")
graph.add_edge("plan", "execute")
graph.add_conditional_edges("execute", should_continue)
graph.add_edge("finalize", END)
app = graph.compile()

result = app.invoke({"objective": "Research and summarize HNSW algorithms"})
print(result["final_answer"])
```
BabyAGI pioneered a dynamic variant of plan-and-execute where the task queue is continuously regenerated.3) Rather than creating a fixed plan upfront, BabyAGI's task creation agent generates new tasks based on execution results, and a prioritization agent reorders them. This produces adaptive planning that evolves with the task, combining the structure of plan-and-execute with some of the flexibility of ReAct.
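BabyAGI's regenerating task queue can be reduced to a short loop. The sketch below is a minimal illustration, not BabyAGI's actual code: `execute_task`, `create_tasks`, and `prioritize_tasks` are hypothetical stubs standing in for the LLM-backed execution, task-creation, and prioritization agents.

```python
from collections import deque

def execute_task(task: str, context: list[str]) -> str:
    # Stub: a real implementation would call an LLM with the task and context.
    return f"result of: {task}"

def create_tasks(result: str, objective: str, existing: list[str]) -> list[str]:
    # Stub: a real task-creation agent derives new tasks from the latest result.
    return []

def prioritize_tasks(tasks: list[str], objective: str) -> list[str]:
    # Stub: a real prioritization agent reorders tasks toward the objective.
    return sorted(tasks)

def babyagi_loop(objective: str, initial_task: str, max_iters: int = 5) -> list[str]:
    queue = deque([initial_task])
    results: list[str] = []
    for _ in range(max_iters):
        if not queue:
            break
        task = queue.popleft()
        result = execute_task(task, results)
        results.append(result)
        # Regenerate and reprioritize the queue after every execution step.
        new_tasks = create_tasks(result, objective, list(queue))
        queue = deque(prioritize_tasks(list(queue) + new_tasks, objective))
    return results

print(babyagi_loop("summarize HNSW", "find key papers"))
```

The key contrast with static plan-and-execute is that the queue is rebuilt after every step, so the plan can grow or reorder itself as results come in.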
Modern frameworks have adopted this pattern with hierarchical supervisors for multi-level task decomposition, where high-level goals are broken into sub-plans, each managed by specialized agents.
LangChain implements plan-and-execute through LangGraph's stateful graphs, which provide typed shared state, conditional edges for control flow, and checkpointing for resumable execution.4)
Plan-and-execute and ReAct represent fundamentally different strategies for agent decision-making:5)
| Aspect | Plan-and-Execute | ReAct |
|---|---|---|
| Process | Upfront full plan, then sequential execution | Iterative reason-act cycles per step |
| LLM Calls | Fewer (planning happens once) | More (continuous reasoning at each step) |
| Token Usage | 3,000-4,500 per task | 2,000-3,000 per task |
| Task Accuracy | ~92% on complex predictable tasks | ~85% on similar tasks |
| Strengths | Predictable, cost-efficient for structured tasks, visible progress | Adaptive, flexible, lower overhead for simple tasks |
| Weaknesses | Plans can become stale if environment changes | Higher cumulative cost, less structured |
| Security | Better isolation (control flow separated from execution) | More exposed to prompt injection during reasoning |
In practice, hybrid approaches dominate: plan-and-execute for the overall structure with ReAct-style execution within individual steps, combining strategic planning with tactical flexibility.6)
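The hybrid shape can be sketched as an outer fixed plan whose steps each run a small inner reason-act loop. This is an illustrative skeleton only: `plan_objective` and the stubbed reasoning, tool call, and termination inside `react_step` are hypothetical stand-ins for LLM and tool invocations.

```python
def plan_objective(objective: str) -> list[str]:
    # Stub planner: a real agent would ask an LLM to decompose the objective.
    return [f"step 1 for {objective}", f"step 2 for {objective}"]

def react_step(step: str, max_turns: int = 3) -> str:
    # Inner ReAct-style loop: reason, act (tool call), observe, until done.
    observations = []
    for turn in range(max_turns):
        thought = f"thinking about '{step}' (turn {turn})"  # stub reasoning
        observation = f"observation {turn}"                 # stub tool result
        observations.append(observation)
        if turn == 1:  # stub termination: a real loop stops when the LLM says done
            break
    return f"{step}: {'; '.join(observations)}"

def hybrid_agent(objective: str) -> list[str]:
    plan = plan_objective(objective)      # strategic: fixed outer plan
    return [react_step(s) for s in plan]  # tactical: adaptive inner loops

for line in hybrid_agent("research HNSW"):
    print(line)
```

The outer loop keeps the cost and predictability benefits of a fixed plan, while each inner loop retains ReAct's ability to react to tool output within a step.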
Static plans fail when the environment changes or steps produce unexpected results. Modern plan-and-execute agents incorporate feedback loops for mid-execution adaptation, evaluating each step's outcome and revising the remaining plan when needed.
LangGraph implements this through graph cycles where execution nodes can route back to planning nodes based on conditional logic.
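A minimal sketch of that cycle, independent of any framework: after each step, a check decides whether to continue or regenerate the remaining steps. `needs_replan` and `replan` are hypothetical stubs for LLM calls; in LangGraph the same routing would be a conditional edge from the execute node back to the plan node.

```python
def needs_replan(result: str) -> bool:
    # Stub: a real check would ask an LLM whether the result invalidates the plan.
    return "unexpected" in result

def replan(objective: str, done: list[str]) -> list[str]:
    # Stub replanner: regenerate the remaining steps from results so far.
    return [f"recovery step after {len(done)} results"]

def run_with_replanning(objective: str, plan: list[str]) -> list[str]:
    results: list[str] = []
    remaining = list(plan)
    while remaining:
        step = remaining.pop(0)
        result = f"ok: {step}"  # stub execution
        results.append(result)
        if needs_replan(result):
            remaining = replan(objective, results)  # route back to the planner
    return results

print(run_with_replanning("demo", ["gather sources", "write summary"]))
```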
Complex tasks benefit from multi-level planning hierarchies:8)
Patterns include hub-and-spoke (central orchestrator), pipeline (sequential handoffs), and peer-to-peer (collaborative decomposition). These hierarchies scale plan-and-execute to handle workflows that would overwhelm a single planning step.
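The hub-and-spoke variant can be sketched as a supervisor that fans a goal out into sub-goals, each handled by a sub-agent running its own plan-and-execute cycle. All functions below are hypothetical stubs for LLM-backed agents, shown only to make the two planning levels concrete.

```python
def supervisor_plan(goal: str) -> list[str]:
    # Stub high-level decomposition into sub-goals.
    return [f"{goal} / research", f"{goal} / write-up"]

def sub_agent(sub_goal: str) -> str:
    # Each spoke runs its own plan-and-execute cycle on its sub-goal.
    sub_plan = [f"{sub_goal}: step {i}" for i in (1, 2)]  # stub sub-plan
    sub_results = [f"done {s}" for s in sub_plan]         # stub execution
    return f"{sub_goal} -> {len(sub_results)} steps completed"

def hierarchical_agent(goal: str) -> list[str]:
    # Hub-and-spoke: the supervisor fans out to sub-agents and collects results.
    return [sub_agent(g) for g in supervisor_plan(goal)]

for summary in hierarchical_agent("launch docs site"):
    print(summary)
```

Pipeline and peer-to-peer variants differ only in how sub-agents hand results to one another; the supervisor-level plan stays small because each spoke plans for itself.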
Late 2025 research emphasizes making plan-and-execute production-grade.