Plan-and-execute agents separate the planning phase from the execution phase, first generating a complete step-by-step plan and then carrying out each step individually.1) This two-phase architecture addresses limitations of purely reactive agents like ReAct by providing a structured roadmap before any actions are taken. The approach improves reliability on complex multi-step tasks by reducing compounding errors that arise from step-by-step improvisation.
The plan-and-execute pattern consists of two core components operating in sequence: a planner that decomposes the objective into an ordered list of concrete steps, and an executor that carries out each step in turn, passing accumulated results forward.
State management persists the plan, current step index, and accumulated results, enabling resumability if execution is interrupted. This separation of concerns allows different models or configurations for planning versus execution, for example, using a more capable model for planning and a faster model for routine execution steps.2)
```python
from langgraph.graph import StateGraph, START, END
from langchain_openai import ChatOpenAI
from typing import TypedDict, Annotated
import operator

class PlanExecuteState(TypedDict):
    objective: str
    plan: list[str]
    current_step: int
    results: Annotated[list[str], operator.add]  # appended to, not overwritten
    final_answer: str

# A more capable model for planning, a faster model for routine execution
planner_llm = ChatOpenAI(model="gpt-4o")
executor_llm = ChatOpenAI(model="gpt-4o-mini")

def plan_step(state: PlanExecuteState) -> dict:
    resp = planner_llm.invoke(
        f"Break this objective into 3-5 concrete steps:\n{state['objective']}\n"
        "Return each step on a new line, numbered."
    )
    steps = [line.strip() for line in resp.content.split("\n") if line.strip()]
    return {"plan": steps, "current_step": 0}

def execute_step(state: PlanExecuteState) -> dict:
    idx = state["current_step"]
    step = state["plan"][idx]
    context = "\n".join(state["results"]) if state["results"] else "No prior results."
    resp = executor_llm.invoke(
        f"Execute this step: {step}\nPrior results:\n{context}"
    )
    return {"results": [resp.content], "current_step": idx + 1}

def should_continue(state: PlanExecuteState) -> str:
    return "execute" if state["current_step"] < len(state["plan"]) else "finalize"

def finalize(state: PlanExecuteState) -> dict:
    all_results = "\n".join(state["results"])
    resp = planner_llm.invoke(f"Synthesize these results:\n{all_results}")
    return {"final_answer": resp.content}

# Build the LangGraph state machine
graph = StateGraph(PlanExecuteState)
graph.add_node("plan", plan_step)
graph.add_node("execute", execute_step)
graph.add_node("finalize", finalize)
graph.add_edge(START, "plan")
graph.add_edge("plan", "execute")
graph.add_conditional_edges("execute", should_continue)
graph.add_edge("finalize", END)
app = graph.compile()

result = app.invoke({"objective": "Research and summarize HNSW algorithms"})
print(result["final_answer"])
```
BabyAGI pioneered a dynamic variant of plan-and-execute where the task queue is continuously regenerated.3) Rather than creating a fixed plan upfront, BabyAGI's task creation agent generates new tasks based on execution results, and a prioritization agent reorders them. This produces adaptive planning that evolves with the task, combining the structure of plan-and-execute with some of the flexibility of ReAct.
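BabyAGI's regenerating task queue can be reduced to a short loop. The sketch below is a minimal illustration, not BabyAGI's actual code: `execute_task`, `create_tasks`, and `prioritize_tasks` are hypothetical stubs standing in for the LLM-backed execution, task-creation, and prioritization agents.

```python
from collections import deque

def execute_task(task: str, context: list[str]) -> str:
    # Stub: a real implementation would call an LLM with the task and context.
    return f"result of: {task}"

def create_tasks(result: str, objective: str, existing: list[str]) -> list[str]:
    # Stub: a real task-creation agent derives new tasks from the latest result.
    return []

def prioritize_tasks(tasks: list[str], objective: str) -> list[str]:
    # Stub: a real prioritization agent reorders tasks toward the objective.
    return sorted(tasks)

def babyagi_loop(objective: str, initial_task: str, max_iters: int = 5) -> list[str]:
    queue = deque([initial_task])
    results: list[str] = []
    for _ in range(max_iters):
        if not queue:
            break
        task = queue.popleft()
        result = execute_task(task, results)
        results.append(result)
        # Regenerate and reprioritize the queue after every execution step.
        new_tasks = create_tasks(result, objective, list(queue))
        queue = deque(prioritize_tasks(list(queue) + new_tasks, objective))
    return results

print(babyagi_loop("summarize HNSW", "find key papers"))
```

The key contrast with static plan-and-execute is that the queue is rebuilt after every step, so the plan can grow or reorder itself as results come in.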
Modern frameworks have adopted this pattern with hierarchical supervisors for multi-level task decomposition, where high-level goals are broken into sub-plans, each managed by specialized agents.
LangChain implements plan-and-execute through LangGraph's stateful graphs, which provide typed shared state, conditional edges for control flow, and checkpointing for resumable execution.4)
Plan-and-execute and ReAct represent fundamentally different strategies for agent decision-making:5)
| Aspect | Plan-and-Execute | ReAct |
|---|---|---|
| Process | Upfront full plan, then sequential execution | Iterative reason-act cycles per step |
| LLM Calls | Fewer (planning happens once) | More (continuous reasoning at each step) |
| Token Usage | 3,000-4,500 per task | 2,000-3,000 per task |
| Task Accuracy | ~92% on complex predictable tasks | ~85% on similar tasks |
| Strengths | Predictable, cost-efficient for structured tasks, visible progress | Adaptive, flexible, lower overhead for simple tasks |
| Weaknesses | Plans can become stale if environment changes | Higher cumulative cost, less structured |
| Security | Better isolation (control flow separated from execution) | More exposed to prompt injection during reasoning |
In practice, hybrid approaches dominate: plan-and-execute for the overall structure with ReAct-style execution within individual steps, combining strategic planning with tactical flexibility.6)
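The hybrid shape can be sketched as an outer fixed plan whose steps each run a small inner reason-act loop. This is an illustrative skeleton only: `plan_objective` and the stubbed reasoning, tool call, and termination inside `react_step` are hypothetical stand-ins for LLM and tool invocations.

```python
def plan_objective(objective: str) -> list[str]:
    # Stub planner: a real agent would ask an LLM to decompose the objective.
    return [f"step 1 for {objective}", f"step 2 for {objective}"]

def react_step(step: str, max_turns: int = 3) -> str:
    # Inner ReAct-style loop: reason, act (tool call), observe, until done.
    observations = []
    for turn in range(max_turns):
        thought = f"thinking about '{step}' (turn {turn})"  # stub reasoning
        observation = f"observation {turn}"                 # stub tool result
        observations.append(observation)
        if turn == 1:  # stub termination: a real loop stops when the LLM says done
            break
    return f"{step}: {'; '.join(observations)}"

def hybrid_agent(objective: str) -> list[str]:
    plan = plan_objective(objective)      # strategic: fixed outer plan
    return [react_step(s) for s in plan]  # tactical: adaptive inner loops

for line in hybrid_agent("research HNSW"):
    print(line)
```

The outer loop keeps the cost and predictability benefits of a fixed plan, while each inner loop retains ReAct's ability to react to tool output within a step.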
Static plans fail when the environment changes or steps produce unexpected results. Modern plan-and-execute agents incorporate feedback loops for mid-execution adaptation, evaluating each step's outcome and revising the remaining plan when needed.
LangGraph implements this through graph cycles where execution nodes can route back to planning nodes based on conditional logic.
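A minimal sketch of that cycle, independent of any framework: after each step, a check decides whether to continue or regenerate the remaining steps. `needs_replan` and `replan` are hypothetical stubs for LLM calls; in LangGraph the same routing would be a conditional edge from the execute node back to the plan node.

```python
def needs_replan(result: str) -> bool:
    # Stub: a real check would ask an LLM whether the result invalidates the plan.
    return "unexpected" in result

def replan(objective: str, done: list[str]) -> list[str]:
    # Stub replanner: regenerate the remaining steps from results so far.
    return [f"recovery step after {len(done)} results"]

def run_with_replanning(objective: str, plan: list[str]) -> list[str]:
    results: list[str] = []
    remaining = list(plan)
    while remaining:
        step = remaining.pop(0)
        result = f"ok: {step}"  # stub execution
        results.append(result)
        if needs_replan(result):
            remaining = replan(objective, results)  # route back to the planner
    return results

print(run_with_replanning("demo", ["gather sources", "write summary"]))
```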
Complex tasks benefit from multi-level planning hierarchies:8)
Patterns include hub-and-spoke (central orchestrator), pipeline (sequential handoffs), and peer-to-peer (collaborative decomposition). These hierarchies scale plan-and-execute to handle workflows that would overwhelm a single planning step.
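The hub-and-spoke variant can be sketched as a supervisor that fans a goal out into sub-goals, each handled by a sub-agent running its own plan-and-execute cycle. All functions below are hypothetical stubs for LLM-backed agents, shown only to make the two planning levels concrete.

```python
def supervisor_plan(goal: str) -> list[str]:
    # Stub high-level decomposition into sub-goals.
    return [f"{goal} / research", f"{goal} / write-up"]

def sub_agent(sub_goal: str) -> str:
    # Each spoke runs its own plan-and-execute cycle on its sub-goal.
    sub_plan = [f"{sub_goal}: step {i}" for i in (1, 2)]  # stub sub-plan
    sub_results = [f"done {s}" for s in sub_plan]         # stub execution
    return f"{sub_goal} -> {len(sub_results)} steps completed"

def hierarchical_agent(goal: str) -> list[str]:
    # Hub-and-spoke: the supervisor fans out to sub-agents and collects results.
    return [sub_agent(g) for g in supervisor_plan(goal)]

for summary in hierarchical_agent("launch docs site"):
    print(summary)
```

Pipeline and peer-to-peer variants differ only in how sub-agents hand results to one another; the supervisor-level plan stays small because each spoke plans for itself.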
Late 2025 research emphasizes making plan-and-execute production-grade.