====== Common Agent Failure Modes ======

A systematic catalog of how LLM-based agents fail in production. For each failure mode: symptoms, root causes, and actionable fixes. Based on real incident reports from 2024-2026, including the Kiro AWS outage, Claude Code infinite loop bugs, and enterprise deployment statistics.

===== The Reality of Agent Failures =====

Agent failures are fundamentally different from traditional software bugs. Traditional software fails predictably (null pointers, timeouts). Agents fail probabilistically: the same input can succeed nine times and fail catastrophically on the tenth.

**Production statistics (2025-2026):**
  * 88-95% of AI agent pilots never reach production (Gartner, Deloitte, MIT 2025)(([[https://hypersense-software.com/blog/2026/01/12/why-88-percent-ai-agents-fail-production/|HyperSense Software: "Why 88% of AI Agents Fail in Production," 2026]]))
  * 42% of started AI initiatives are abandoned (S&P Global 2025)
  * 40% of multi-agent deployments fail within 6 months (TechAhead 2025)
  * Amazon's Kiro AI agent autonomously deleted a production AWS environment, causing a 13-hour outage (2026)(([[https://particula.tech/blog/ai-agent-production-safety-kiro-incident|Particula Tech: "When AI Agents Delete Production: Lessons from Amazon's Kiro Incident," 2026]]))
  * A Claude Code sub-agent consumed 27M tokens in an infinite loop over 4.6 hours (GitHub Issue #15909)(([[https://github.com/anthropics/claude-code/issues/15909|GitHub anthropics/claude-code Issue #15909: "Sub-agent stuck in infinite loop, consumed 27M tokens," 2025]]))

===== Failure Mode Catalog =====

==== 1. Reasoning Failures ====

**Symptoms:**
  * Agent makes illogical decisions on edge cases
  * Multi-step plans break at unexpected points
  * Agent confidently executes the wrong plan

**Root Causes:**
  * Model struggles with multi-hop reasoning chains
  * Ambiguous problem decomposition
  * Insufficient few-shot examples for the task type

**Fixes:**
  * Add chain-of-thought prompting with explicit reasoning steps
  * Implement human-in-the-loop checkpoints for critical decisions
  * Break complex tasks into smaller, verifiable sub-tasks
  * Test with adversarial edge cases before deployment

==== 2. Tool Use Errors ====

**Symptoms:**
  * Agent calls tools with wrong parameters
  * Agent calls non-existent tools (hallucinated tool names)
  * Agent misinterprets tool output and takes the wrong action

**Root Causes:**
  * Incomplete or ambiguous tool descriptions
  * Tool output format changes not reflected in prompts
  * Too many tools available (decision fatigue)

**Fixes:**
  * Write precise tool descriptions with parameter types and examples
  * Validate tool call parameters before execution
  * Limit available tools to those relevant to the current task
  * Add tool output parsing with error handling

<code python>
class SafeToolExecutor:
    """Validate and execute tool calls with error handling."""

    def __init__(self, tools: dict):
        self.tools = tools
        self.max_retries = 3
        self.call_log = []

    def execute(self, tool_name: str, params: dict) -> dict:
        # Reject hallucinated tool names up front
        if tool_name not in self.tools:
            return {"error": f"Tool '{tool_name}' not found. "
                             f"Available: {list(self.tools.keys())}"}
        tool = self.tools[tool_name]

        # Validate parameters against the tool's schema
        required = tool.get("required_params", [])
        missing = [p for p in required if p not in params]
        if missing:
            return {"error": f"Missing required params: {missing}"}

        # Execute with bounded retries
        for attempt in range(self.max_retries):
            try:
                result = tool["function"](**params)
                self.call_log.append({"tool": tool_name, "params": params,
                                      "result": "success",
                                      "attempt": attempt + 1})
                return {"result": result}
            except Exception as e:
                if attempt == self.max_retries - 1:
                    self.call_log.append({"tool": tool_name, "params": params,
                                          "result": f"failed: {e}",
                                          "attempt": attempt + 1})
                    return {"error": str(e)}
        return {"error": "Max retries exceeded"}

    def get_call_report(self) -> dict:
        return {
            "total_calls": len(self.call_log),
            "failures": sum(1 for c in self.call_log if "failed" in c["result"]),
            "tools_used": list(set(c["tool"] for c in self.call_log)),
        }
</code>

==== 3. Context Overflow ====

**Symptoms:**
  * Agent "forgets" earlier instructions mid-conversation
  * Quality degrades as the conversation grows
  * Agent contradicts its own earlier statements
  * Tool results from early calls are silently dropped

**Root Causes:**
  * Conversation history exceeds the context window
  * No summarization or pruning of old context
  * Large tool outputs consume disproportionate context

**Fixes:**
  * Implement a sliding window with summarization of older turns
  * Compress tool outputs before adding them to context
  * Monitor token usage per turn and alert before overflow
  * Use a memory system (short-term + long-term retrieval)

==== 4. Infinite Loops ====

**Real incident:** A Claude Code sub-agent ran ''npm install'' 300+ times over 4.6 hours, consuming 27M tokens at 128K context per iteration. A LangGraph agent processed 2,847 iterations at $400+ cost for a $5 task(([[https://docs.bswen.com/blog/2026-03-11-prevent-ai-agent-infinite-loops/|BSWEN: "How Do You Stop AI Agents From Infinite Loops?" 2026]])).
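Both runaways burned hours of wall-clock time before anyone intervened, so independent of any per-action loop detection, a hard deadline on the whole task caps the blast radius. A minimal sketch using the standard library (the ''run_with_deadline'' name and the 10-minute default are illustrative assumptions, not from the incidents above):

<code python>
import concurrent.futures

# One worker; a real deployment would scope a pool per agent task.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)

def run_with_deadline(task_fn, timeout_s: float = 600.0) -> dict:
    """Run an agent task with a hard wall-clock deadline.

    Python threads cannot be force-killed, so a timed-out worker may
    keep running in the background; the caller, however, regains
    control and can stop issuing new LLM and tool calls.
    """
    future = _pool.submit(task_fn)
    try:
        return {"ok": True, "result": future.result(timeout=timeout_s)}
    except concurrent.futures.TimeoutError:
        future.cancel()  # best effort; has no effect once running
        return {"ok": False, "reason": f"deadline of {timeout_s}s exceeded"}
</code>

For agents that shell out to subprocesses, a process-level timeout (e.g. ''subprocess.run(..., timeout=...)'') gives a true kill, which is preferable when destructive tools are in play.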
**Symptoms:**
  * Agent repeats the same action with the same parameters
  * Token usage spikes without progress
  * Agent alternates between two states without converging

**Root Causes:**
  * No loop detection or iteration limits
  * Agent receives an ambiguous error and retries identically
  * Circular dependency between tools (Tool A calls Tool B calls Tool A)
  * Agent cannot recognize task completion

**Fixes:**

<code python>
import hashlib

class LoopDetector:
    """Detect and prevent infinite loops in agent execution."""

    def __init__(self, max_iterations: int = 50, max_cost_usd: float = 10.0):
        self.max_iterations = max_iterations
        self.max_cost_usd = max_cost_usd
        self.iteration = 0
        self.total_tokens = 0
        self.action_hashes = []
        self.cost_per_1k_tokens = 0.01  # Adjust per model

    def check(self, action: str, params: dict, tokens_used: int) -> dict:
        self.iteration += 1
        self.total_tokens += tokens_used
        estimated_cost = (self.total_tokens / 1000) * self.cost_per_1k_tokens

        # Check iteration limit
        if self.iteration > self.max_iterations:
            return {"halt": True,
                    "reason": f"Max iterations ({self.max_iterations}) exceeded"}

        # Check cost limit
        if estimated_cost > self.max_cost_usd:
            return {"halt": True,
                    "reason": f"Cost limit (${self.max_cost_usd}) exceeded: "
                              f"${estimated_cost:.2f}"}

        # Check for repeated actions (same action + params = loop)
        action_hash = hashlib.md5(f"{action}{params}".encode()).hexdigest()
        recent_hashes = self.action_hashes[-10:]  # Check last 10
        repeat_count = recent_hashes.count(action_hash)
        self.action_hashes.append(action_hash)
        if repeat_count >= 3:
            return {"halt": True,
                    "reason": f"Action '{action}' repeated {repeat_count}x "
                              f"with same params"}

        return {"halt": False, "iteration": self.iteration,
                "cost": f"${estimated_cost:.2f}"}
</code>

==== 5. Goal Drift ====

**Symptoms:**
  * Agent starts performing task A but gradually shifts to task B
  * Output addresses a tangentially related topic
  * Agent gets sidetracked by interesting but irrelevant information from tools

**Root Causes:**
  * System prompt gets diluted by conversation length
  * Tool results introduce attractive distractions
  * No mechanism to periodically re-anchor to the original goal

**Fixes:**
  * Repeat the goal in every prompt (not just the system message)
  * Add a "goal check" step every N iterations
  * Use structured output that must reference the original task
  * Implement a planning step that produces a checklist, then track progress

==== 6. Prompt Injection ====

**Symptoms:**
  * Agent performs unexpected actions after processing user input
  * Agent ignores system instructions and follows user-injected instructions
  * Sensitive data leaked through crafted queries

**Root Causes:**
  * No input sanitization or boundary between instructions and data
  * Agent processes untrusted content (emails, web pages) as instructions
  * Insufficient separation between system and user context

**Fixes:**
  * Separate data from instructions using clear delimiters
  * Treat all tool outputs and user inputs as untrusted data
  * Implement output filtering for sensitive content
  * Use canary tokens to detect instruction override attempts
  * Apply the principle of least privilege to tool permissions

==== 7. Hallucination ====

See [[why_is_my_agent_hallucinating|Why Is My Agent Hallucinating?]] for the dedicated guide.

**Quick summary:** The agent generates plausible but wrong information. Fix with RAG grounding, chain-of-verification, low temperature, and constrained decoding.

==== 8. Cost Runaway ====

**Symptoms:**
  * API bills orders of magnitude higher than expected
  * Agent makes far more LLM calls than necessary
  * Large context windows used for simple tasks

**Root Causes:**
  * No cost monitoring or budget caps
  * Agent retries failures without backoff
  * Verbose tool outputs inflate context (and cost) per call
  * No model routing (using GPT-4 for tasks GPT-4o-mini could handle)

**Fixes:**

<code python>
class CostGuard:
    """Monitor and limit agent API costs in real time."""

    PRICING = {  # USD per 1M tokens (input/output)
        "gpt-4o": {"input": 2.50, "output": 10.00},
        "gpt-4o-mini": {"input": 0.15, "output": 0.60},
        "claude-sonnet-4": {"input": 3.00, "output": 15.00},
        "claude-haiku-3.5": {"input": 0.80, "output": 4.00},
    }

    def __init__(self, budget_usd: float = 5.0):
        self.budget = budget_usd
        self.total_cost = 0.0
        self.calls = []

    def track(self, model: str, input_tokens: int, output_tokens: int) -> dict:
        # Unknown models fall back to conservative (high) pricing
        pricing = self.PRICING.get(model, {"input": 5.0, "output": 15.0})
        cost = (input_tokens * pricing["input"]
                + output_tokens * pricing["output"]) / 1_000_000
        self.total_cost += cost
        self.calls.append({"model": model, "cost": cost})
        if self.total_cost > self.budget:
            return {"allowed": False,
                    "reason": f"Budget exceeded: "
                              f"${self.total_cost:.4f} / ${self.budget}"}
        return {"allowed": True,
                "total_cost": f"${self.total_cost:.4f}",
                "remaining": f"${self.budget - self.total_cost:.4f}"}

    def recommend_model(self, task_complexity: str) -> str:
        """Route to the cheapest sufficient model."""
        routing = {
            "simple": "gpt-4o-mini",        # Classification, extraction, formatting
            "moderate": "claude-haiku-3.5", # Summarization, Q&A
            "complex": "gpt-4o",            # Multi-step reasoning, code generation
            "critical": "claude-sonnet-4",  # High-stakes decisions
        }
        return routing.get(task_complexity, "gpt-4o-mini")
</code>

===== Failure Mode Decision Diagram =====

<code>
graph TD
    A[Agent Misbehaving] --> B{What type of failure?}
    B --> C[Wrong output]
    B --> D[Stuck/looping]
    B --> E[Unexpected behavior]
    B --> F[Cost explosion]
    C --> C1{Is output fabricated?}
    C1 -->|Yes| C2[Hallucination - see dedicated guide]
    C1 -->|No| C3{Is reasoning wrong?}
    C3 -->|Yes| C4[Add chain-of-thought + verification]
    C3 -->|No| C5[Tool misuse - fix tool descriptions]
    D --> D1{Same action repeating?}
    D1 -->|Yes| D2[Infinite loop - add loop detector]
    D1 -->|No| D3{Agent oscillating?}
    D3 -->|Yes| D4[Circular dependency - break cycle]
    D3 -->|No| D5[Context overflow - add summarization]
    E --> E1{After processing external input?}
    E1 -->|Yes| E2[Prompt injection - sanitize inputs]
    E1 -->|No| E3{Doing unrelated tasks?}
    E3 -->|Yes| E4[Goal drift - re-anchor to objective]
    E3 -->|No| E5[Check system prompt and tool config]
    F --> F1[Add CostGuard + model routing]
    F --> F2[Add iteration limits]
    F --> F3[Compress tool outputs]
</code>

===== Production Safety Checklist =====

  * **Before deployment:**
    - [ ] Set iteration limits (max 50-100 per task)
    - [ ] Set a cost budget per task and per day
    - [ ] Implement loop detection
    - [ ] Add human-in-the-loop for destructive actions
    - [ ] Test with adversarial inputs
    - [ ] Validate tool descriptions and parameter schemas
  * **During operation:**
    - [ ] Monitor token usage per session
    - [ ] Alert on repeated identical tool calls
    - [ ] Log all tool calls with parameters and results
    - [ ] Track goal alignment score
    - [ ] Monitor cost per task vs. baseline
  * **Incident response:**
    - [ ] Kill switch to halt the agent immediately
    - [ ] Audit trail of all actions taken
    - [ ] Rollback capability for destructive actions
    - [ ] Post-mortem template for agent incidents

===== See Also =====

  * [[why_is_my_agent_hallucinating|Why Is My Agent Hallucinating?]]
  * [[why_is_my_rag_returning_bad_results|Why Is My RAG Returning Bad Results?]]
  * [[how_to_handle_rate_limits|How to Handle Rate Limits]]

===== References =====