Agent Loop

The agent loop, also known as the perception-thought-action cycle, is the fundamental execution pattern underlying most AI agent architectures. In each iteration, the agent perceives its environment through observations, reasons about the current state and its goals, and then selects and executes an action that modifies the environment. This continuous cycle enables autonomous agents to operate over multiple steps, adapting their behavior based on feedback from each action.

graph TD
    P[Perceive] -->|Gather context| T[Think / Reason]
    T -->|Select action| A[Act]
    A -->|Execute tool or response| O[Observe]
    O -->|Capture result| D{Task Complete?}
    D -->|No| P
    D -->|Yes| F[Return Result]

The Core Cycle

Every agent loop implementation shares the same basic structure:

Perceive: Gather context from user inputs, prior action results, API responses, error messages, or environmental observations
Think/Reason: The LLM analyzes accumulated context, reasons about the next step using chain-of-thought reasoning, and determines which action to take
Act: Execute the selected action – invoke a tool, generate a response, modify a file, or delegate to another agent
Observe: Capture the result of the action as new context for the next iteration
Repeat: Continue until the task is complete, a failure condition is met, or a maximum iteration limit is reached

This cycle is typically implemented as a bounded loop (a “while” loop, or a “for” loop with an iteration cap) and forms the backbone of most agentic systems. The ReAct pattern is a specific instantiation of this loop in which the Think step produces an explicit verbal reasoning trace.

The following example implements a minimal agent loop in Python, using the OpenAI SDK, that reasons, calls tools, and iterates until the task is done:

# Minimal agent loop with tool use (plain Python plus the OpenAI SDK)
from openai import OpenAI
import json
 
client = OpenAI()
 
def calculator(expression: str) -> str:
    # Demo only: eval() with emptied builtins is not a real sandbox; never pass it untrusted input
    return str(eval(expression, {"__builtins__": {}}, {}))
 
TOOLS = [{"type": "function", "function": {
    "name": "calculator", "description": "Evaluate a math expression",
    "parameters": {
        "type": "object",
        "properties": {"expression": {"type": "string"}},
        "required": ["expression"],
    },
}}]
TOOL_FNS = {"calculator": calculator}
 
def agent_loop(user_query, max_iterations=5):
    messages = [{"role": "user", "content": user_query}]
    for _ in range(max_iterations):
        response = client.chat.completions.create(
            model="gpt-4o", messages=messages, tools=TOOLS
        )
        msg = response.choices[0].message
        messages.append(msg)
        if not msg.tool_calls:
            return msg.content  # No more tool calls -- task complete
        for tc in msg.tool_calls:
            result = TOOL_FNS[tc.function.name](**json.loads(tc.function.arguments))
            messages.append({"role": "tool", "tool_call_id": tc.id, "content": result})
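    # Iteration limit reached: the last message is a tool result stored as a dict,
    # so it needs key access rather than attribute access
    return messages[-1]["content"]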
 
print(agent_loop("What is (17 * 31) + (45 * 22)?"))

OODA Loop Analogy

The agent loop closely mirrors the OODA loop (Observe-Orient-Decide-Act) from military strategy, developed by Colonel John Boyd:

Observe aligns with perception – gathering data from the environment
Orient matches reasoning – analyzing context, understanding the situation relative to goals
Decide covers planning – selecting the best action from available options
Act executes the chosen action, with results feeding back to the next Observe phase

Both frameworks emphasize rapid iteration and adaptation. The OODA loop's insight that faster iteration cycles create strategic advantage carries over to agent design: agents that process observations and act on them quickly tend to outperform those with slower loops, particularly in dynamic environments where the state changes between iterations.
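The mapping between the four OODA phases and an agent loop can be sketched with a toy example. The thermostat environment, the function names, and the tolerance value below are all assumptions made for illustration; none of them come from Boyd's work or from any agent framework.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    """Hypothetical toy environment: a room the agent can heat or cool."""
    temperature: float
    target: float


def observe(env):
    # Observe: read raw data from the environment
    return env.temperature


def orient(reading, target):
    # Orient: interpret the reading relative to the goal
    return target - reading


def decide(error, tolerance=0.5):
    # Decide: pick one action from the available options
    if abs(error) <= tolerance:
        return "idle"
    return "heat" if error > 0 else "cool"


def act(env, action, step=1.0):
    # Act: change the environment; the effect is seen by the next observe()
    if action == "heat":
        env.temperature += step
    elif action == "cool":
        env.temperature -= step


def ooda_loop(env, max_cycles=20):
    history = []
    for _ in range(max_cycles):
        action = decide(orient(observe(env), env.target))
        if action == "idle":
            break
        act(env, action)
        history.append(action)
    return history


room = Thermostat(temperature=18.0, target=21.0)
actions = ooda_loop(room)
print(actions, room.temperature)
```

Each pass through the loop body is one OODA cycle. A smaller step or a slower loop would need more cycles to reach the same target.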

Framework Implementations

Modern frameworks implement the agent loop with varying levels of sophistication:

LangGraph uses a StateGraph where nodes represent agents or tools and edges manage state transitions. The loop is expressed as graph traversal, with supervisor nodes routing between worker agents and conditional edges controlling iteration. It supports hierarchical loops, parallel execution, and shared knowledge bases.

CrewAI implements role-based crews with sequential or parallel task execution. The agent loop includes built-in delegation – agents can hand off subtasks to teammates – and refinement cycles where outputs are reviewed and improved.

AutoGen uses conversational multi-agent loops where agents communicate by passing messages. Inner loops handle individual step execution while outer loops manage overall strategy. It supports peer handoff patterns and dynamic orchestration.

OpenAI Agents SDK implements a manager pattern where a central agent delegates via tool calls in a single loop. Sub-agents are invoked as tools, creating nested loops for multi-agent workflows. It is designed for cost-optimized, observable iterations.
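The "sub-agents as tools" idea can be illustrated without any framework. The sketch below is not the API of the OpenAI Agents SDK or of any other library named above. `run_loop`, the policy functions, and the decision dictionaries are hypothetical stand-ins, with scripted policies playing the role an LLM would normally fill.

```python
def run_loop(policy, tools, task, max_iterations=5):
    """Generic loop: the policy either requests a tool call or returns a final answer."""
    observations = []
    for _ in range(max_iterations):
        decision = policy(task, observations)
        if decision["type"] == "final":
            return decision["content"]
        tool = tools[decision["tool"]]
        observations.append(tool(decision["input"]))
    return "stopped: iteration limit reached"


def worker_policy(task, observations):
    # Hypothetical worker: calls its one tool, then reports the result
    if not observations:
        return {"type": "tool", "tool": "upper", "input": task}
    return {"type": "final", "content": observations[-1]}


def worker_agent(task):
    # The worker runs its own inner loop
    return run_loop(worker_policy, {"upper": str.upper}, task)


def manager_policy(task, observations):
    # Hypothetical manager: delegates once, then wraps up
    if not observations:
        return {"type": "tool", "tool": "worker", "input": task}
    return {"type": "final", "content": f"worker said: {observations[-1]}"}


# The manager sees the worker as just another tool, which nests one loop in another
result = run_loop(manager_policy, {"worker": worker_agent}, "hello")
print(result)
```

Because the worker is exposed through the same tool interface as any other function, the manager loop needs no special handling for delegation.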

Event-Driven vs. Polling Loops

Agent loops can be implemented using two fundamental execution models:

Polling Loops: Traditional “while True” structures that check state periodically. Simple to implement and debug, used in LangGraph's iterative graph execution. Can be resource-intensive for long-running tasks.
Event-Driven Loops: React to asynchronous events like tool callbacks, messages from other agents, or external triggers. More efficient for production systems, used in AutoGen and the OpenAI Agents SDK for real-time responsiveness.

Hybrid approaches are common in production systems as of 2025, combining polling for the main loop with event-driven handling for tool responses and inter-agent communication.
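The difference between the two execution models can be sketched with the standard library's asyncio module. The function names, the event dictionary shape, and the simulated `slow_tool` are assumptions for illustration and do not reflect how any particular framework implements its loop.

```python
import asyncio


async def polling_loop(state, interval=0.01, max_checks=100):
    """Polling: wake up periodically and check whether the result has arrived."""
    checks = 0
    while state["result"] is None and checks < max_checks:
        checks += 1
        await asyncio.sleep(interval)
    return state["result"], checks


async def event_driven_loop(queue):
    """Event-driven: stay suspended until an event arrives, then react to it."""
    handled = []
    while True:
        event = await queue.get()
        if event["type"] == "done":
            return handled
        handled.append(event["payload"])


async def main():
    state = {"result": None}
    queue = asyncio.Queue()

    async def slow_tool():
        # Hypothetical tool call that takes a while to finish
        await asyncio.sleep(0.05)
        state["result"] = "42"
        await queue.put({"type": "tool_result", "payload": "42"})
        await queue.put({"type": "done"})

    tool_task = asyncio.create_task(slow_tool())
    polled, checks = await polling_loop(state)
    handled = await event_driven_loop(queue)
    await tool_task
    return polled, checks, handled


polled, checks, handled = asyncio.run(main())
print(polled, handled)
```

The polling version spends several wake-ups finding nothing to do, while the event-driven version runs only when an event is delivered. That is the efficiency difference described above.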

Streaming Agent Loops

Modern agent loops often stream partial outputs in real time rather than waiting for complete iterations:

Token-by-Token Streaming: Reasoning traces and action selections appear as they are generated, improving user experience for long tasks
Iteration Events: Frameworks like LangGraph expose each loop iteration as a discrete event for monitoring and debugging
Observability Traces: The OpenAI Agents SDK provides structured traces of each loop step, including token usage, tool calls, and timing

Streaming helps manage the high token costs of agentic loops (up to 15x standard chat) by giving early visibility into agent behavior, so that a run that is going wrong can be stopped before it consumes more tokens.
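Iteration-level streaming can be sketched with a plain Python generator that yields one event per phase. The event fields, the `SCRIPT` trace standing in for model output, and the `delete_files` tool name are all hypothetical; real frameworks define their own event schemas.

```python
from typing import Iterator


def streaming_agent_loop(steps, max_iterations=10) -> Iterator[dict]:
    """Yield an event for each phase instead of returning only the final answer."""
    for i, step in enumerate(steps[:max_iterations], start=1):
        yield {"iteration": i, "type": "reasoning", "data": step["thought"]}
        if step.get("tool") is None:
            yield {"iteration": i, "type": "final", "data": step["answer"]}
            return
        yield {"iteration": i, "type": "tool_call", "data": step["tool"]}
        yield {"iteration": i, "type": "tool_result", "data": step["result"]}


# Hypothetical scripted trace standing in for what a model would produce
SCRIPT = [
    {"thought": "I need the product first", "tool": "calculator", "result": "527"},
    {"thought": "I have what I need", "tool": None, "answer": "527"},
]

events = []
for event in streaming_agent_loop(SCRIPT):
    events.append(event)
    # Early intervention: the consumer can stop the run as soon as it sees
    # an action it does not want executed (hypothetical tool name)
    if event["type"] == "tool_call" and event["data"] == "delete_files":
        break

print([e["type"] for e in events])
```

Because the consumer pulls events one at a time, breaking out of the `for` loop stops the generator before any later step runs.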

Error Recovery

Robust agent loops handle failures gracefully through several mechanisms:

Observation-Based Recovery: Error messages from failed tool calls feed back into the reasoning step, allowing the agent to diagnose and retry
Maximum Iteration Limits: Hard caps prevent infinite loops when agents get stuck in repetitive patterns
Dual-Loop Systems: Outer loops manage overall strategy while inner loops handle step execution, enabling strategic resets without losing all progress
Graceful Degradation: Falling back to simpler strategies (e.g., from ReAct to pure CoT) when tool-based approaches fail repeatedly
Checkpointing: Saving loop state at each iteration for resumability after crashes or timeouts
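Three of the mechanisms above (observation-based recovery, an iteration cap, and checkpointing) can be combined in a short sketch. The `divide` tool, the state layout, and the hard-coded plan revision are assumptions for illustration. In a real agent the revision would come from the model after it reads the error observation.

```python
import json
import os
import tempfile


def divide(a, b):
    # Hypothetical tool that fails on some inputs
    return a / b


def recovering_loop(plan, checkpoint_path, max_iterations=5):
    """Run planned steps, feed errors back as observations, checkpoint every iteration."""
    state = {"step": 0, "plan": [list(p) for p in plan], "observations": []}
    if os.path.exists(checkpoint_path):
        # Resume from the last saved iteration after a crash or timeout
        with open(checkpoint_path) as f:
            state = json.load(f)
    for _ in range(max_iterations):  # hard cap prevents infinite retries
        if state["step"] >= len(state["plan"]):
            break
        a, b = state["plan"][state["step"]]
        try:
            state["observations"].append({"ok": True, "value": divide(a, b)})
            state["step"] += 1
        except ZeroDivisionError as exc:
            # The error becomes an observation; the step is revised and retried
            state["observations"].append({"ok": False, "error": str(exc)})
            state["plan"][state["step"]] = [a, 1]  # stand-in for a model's fix
        with open(checkpoint_path, "w") as f:
            json.dump(state, f)
    return state


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "checkpoint.json")
    final_state = recovering_loop([(6, 3), (4, 0)], path)

print(final_state["observations"])
```

Storing the revised plan inside the checkpointed state means a resumed run continues with the corrected step rather than repeating the failure.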

Human-in-the-Loop

Production agent loops incorporate human oversight at strategic points:

Approval Gates: The loop pauses at critical decision points for human review before executing high-impact actions
Feedback Injection: Humans can modify the agent's plan or provide additional context mid-loop
Escalation Triggers: Automatic handoff to humans when the agent's confidence drops below a threshold or errors accumulate
Supervision Modes: Ranging from full autonomy (no human involvement) to always-approve (human confirms every action), with intermediate modes for production flexibility
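An approval gate with supervision modes and an error-based escalation trigger might look like the sketch below. The `Supervision` enum, the action dictionaries, and the `approve` callback are hypothetical. In production the callback would block on a real human decision (a UI prompt or a ticket) rather than a lambda.

```python
from enum import Enum


class Supervision(Enum):
    AUTONOMOUS = "autonomous"                    # no human involvement
    APPROVE_HIGH_IMPACT = "approve_high_impact"  # intermediate mode
    ALWAYS_APPROVE = "always_approve"            # human confirms every action


def needs_approval(action, mode):
    if mode is Supervision.ALWAYS_APPROVE:
        return True
    if mode is Supervision.APPROVE_HIGH_IMPACT:
        return action["high_impact"]
    return False


def supervised_loop(actions, mode, approve, max_errors=2):
    """Execute actions, pausing at approval gates and escalating after repeated errors."""
    log, errors = [], 0
    for action in actions:
        if needs_approval(action, mode) and not approve(action):
            log.append(("skipped", action["name"]))
            continue
        try:
            action["run"]()
            log.append(("done", action["name"]))
        except Exception:
            errors += 1
            log.append(("failed", action["name"]))
            if errors >= max_errors:
                # Escalation trigger: stop and hand off to a human
                log.append(("escalated", action["name"]))
                break
    return log


actions = [
    {"name": "read_report", "high_impact": False, "run": lambda: None},
    {"name": "delete_database", "high_impact": True, "run": lambda: None},
]
log = supervised_loop(actions, Supervision.APPROVE_HIGH_IMPACT, approve=lambda a: False)
print(log)
```

Passing the approver in as a callback keeps the loop itself independent of how the human is reached, and makes the gate easy to test.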
