====== Common Agent Failure Modes ======
A systematic catalog of how LLM-based agents fail in production. For each failure mode: symptoms, root causes, and actionable fixes. Based on real incident reports from 2024-2026 including the Kiro AWS outage, Claude Code infinite loop bugs, and enterprise deployment statistics.
===== The Reality of Agent Failures =====
Agent failures are fundamentally different from traditional software bugs. Traditional software fails predictably (null pointers, timeouts). Agents fail probabilistically — the same input can succeed 9 times and fail catastrophically on the 10th.
**Production statistics (2025-2026):**
* 88-95% of AI agent pilots never reach production (Gartner, Deloitte, MIT 2025)(([[https://hypersense-software.com/blog/2026/01/12/why-88-percent-ai-agents-fail-production/|HyperSense Software: "Why 88% of AI Agents Fail in Production," 2026]]))
* 42% of started AI initiatives are abandoned (S&P Global 2025)
* 40% of multi-agent deployments fail within 6 months (TechAhead 2025)
* Amazon's Kiro AI agent autonomously deleted a production AWS environment, causing a 13-hour outage (2026)(([[https://particula.tech/blog/ai-agent-production-safety-kiro-incident|Particula Tech: "When AI Agents Delete Production: Lessons from Amazon's Kiro Incident," 2026]]))
* Claude Code sub-agent consumed 27M tokens in an infinite loop over 4.6 hours (GitHub Issue #15909)(([[https://github.com/anthropics/claude-code/issues/15909|GitHub anthropics/claude-code Issue #15909: "Sub-agent stuck in infinite loop, consumed 27M tokens," 2025]]))
===== Failure Mode Catalog =====
==== 1. Reasoning Failures ====
**Symptoms:**
* Agent makes illogical decisions on edge cases
* Multi-step plans break at unexpected points
* Agent confidently executes wrong plan
**Root Causes:**
* Model struggles with multi-hop reasoning chains
* Ambiguous problem decomposition
* Insufficient few-shot examples for the task type
**Fixes:**
* Add chain-of-thought prompting with explicit reasoning steps
* Implement human-in-the-loop checkpoints for critical decisions
* Break complex tasks into smaller, verifiable sub-tasks
* Test with adversarial edge cases before deployment
==== 2. Tool Use Errors ====
**Symptoms:**
* Agent calls tools with wrong parameters
* Agent calls non-existent tools (hallucinated tool names)
* Agent misinterprets tool output and takes wrong action
**Root Causes:**
* Incomplete or ambiguous tool descriptions
* Tool output format changes not reflected in prompts
* Too many tools available (decision fatigue)
**Fixes:**
* Write precise tool descriptions with parameter types and examples
* Validate tool call parameters before execution
* Limit available tools to those relevant to current task
* Add tool output parsing with error handling
import json
from typing import Any
class SafeToolExecutor:
"""Validate and execute tool calls with error handling."""
def __init__(self, tools: dict):
self.tools = tools
self.max_retries = 3
self.call_log = []
def execute(self, tool_name: str, params: dict) -> dict:
# Validate tool exists
if tool_name not in self.tools:
return {"error": f"Tool '{tool_name}' not found. Available: {list(self.tools.keys())}"}
tool = self.tools[tool_name]
# Validate parameters against schema
required = tool.get("required_params", [])
missing = [p for p in required if p not in params]
if missing:
return {"error": f"Missing required params: {missing}"}
# Execute with retry and timeout
for attempt in range(self.max_retries):
try:
result = tool["function"](**params)
self.call_log.append({
"tool": tool_name, "params": params,
"result": "success", "attempt": attempt + 1
})
return {"result": result}
except Exception as e:
if attempt == self.max_retries - 1:
self.call_log.append({
"tool": tool_name, "params": params,
"result": f"failed: {e}", "attempt": attempt + 1
})
return {"error": str(e)}
return {"error": "Max retries exceeded"}
def get_cost_report(self) -> dict:
return {
"total_calls": len(self.call_log),
"failures": sum(1 for c in self.call_log if "failed" in c["result"]),
"tools_used": list(set(c["tool"] for c in self.call_log)),
}
==== 3. Context Overflow ====
**Symptoms:**
* Agent "forgets" earlier instructions mid-conversation
* Quality degrades as conversation grows
* Agent contradicts its own earlier statements
* Tool results from early calls are silently dropped
**Root Causes:**
* Conversation history exceeds context window
* No summarization or pruning of old context
* Large tool outputs consume disproportionate context
**Fixes:**
* Implement sliding window with summarization of older turns
* Compress tool outputs before adding to context
* Monitor token usage per turn and alert before overflow
* Use a memory system (short-term + long-term retrieval)
==== 4. Infinite Loops ====
**Real incident:** A Claude Code sub-agent ran npm install 300+ times over 4.6 hours, consuming 27M tokens at 128K context per iteration. A LangGraph agent processed 2,847 iterations at $400+ cost for a $5 task(([[https://docs.bswen.com/blog/2026-03-11-prevent-ai-agent-infinite-loops/|BSWEN: "How Do You Stop AI Agents From Infinite Loops?" 2026]])).
**Symptoms:**
* Agent repeats the same action with same parameters
* Token usage spikes without progress
* Agent alternates between two states without converging
**Root Causes:**
* No loop detection or iteration limits
* Agent receives ambiguous error and retries identically
* Circular dependency between tools (Tool A calls Tool B calls Tool A)
* Agent cannot recognize task completion
**Fixes:**
import time
import hashlib
class LoopDetector:
"""Detect and prevent infinite loops in agent execution."""
def __init__(self, max_iterations: int = 50, max_cost_usd: float = 10.0):
self.max_iterations = max_iterations
self.max_cost_usd = max_cost_usd
self.iteration = 0
self.total_tokens = 0
self.action_hashes = []
self.cost_per_1k_tokens = 0.01 # Adjust per model
def check(self, action: str, params: dict, tokens_used: int) -> dict:
self.iteration += 1
self.total_tokens += tokens_used
estimated_cost = (self.total_tokens / 1000) * self.cost_per_1k_tokens
# Check iteration limit
if self.iteration > self.max_iterations:
return {"halt": True, "reason": f"Max iterations ({self.max_iterations}) exceeded"}
# Check cost limit
if estimated_cost > self.max_cost_usd:
return {"halt": True, "reason": f"Cost limit (${self.max_cost_usd}) exceeded: ${estimated_cost:.2f}"}
# Check for repeated actions (same action + params = loop)
action_hash = hashlib.md5(f"{action}{params}".encode()).hexdigest()
recent_hashes = self.action_hashes[-10:] # Check last 10
repeat_count = recent_hashes.count(action_hash)
self.action_hashes.append(action_hash)
if repeat_count >= 3:
return {"halt": True, "reason": f"Action '{action}' repeated {repeat_count}x with same params"}
return {"halt": False, "iteration": self.iteration, "cost": f"${estimated_cost:.2f}"}
==== 5. Goal Drift ====
**Symptoms:**
* Agent starts performing task A but gradually shifts to task B
* Output addresses tangentially related topic
* Agent gets sidetracked by interesting but irrelevant information from tools
**Root Causes:**
* System prompt gets diluted by conversation length
* Tool results introduce attractive distractions
* No mechanism to periodically re-anchor to original goal
**Fixes:**
* Repeat the goal in every prompt (not just the system message)
* Add a "goal check" step every N iterations
* Use structured output that must reference the original task
* Implement a planning step that produces a checklist, then track progress
==== 6. Prompt Injection ====
**Symptoms:**
* Agent performs unexpected actions after processing user input
* Agent ignores system instructions and follows user-injected instructions
* Sensitive data leaked through crafted queries
**Root Causes:**
* No input sanitization or boundary between instructions and data
* Agent processes untrusted content (emails, web pages) as instructions
* Insufficient separation between system and user context
**Fixes:**
* Separate data from instructions using clear delimiters
* Treat all tool outputs and user inputs as untrusted data
* Implement output filtering for sensitive content
* Use canary tokens to detect instruction override attempts
* Apply principle of least privilege to tool permissions
==== 7. Hallucination ====
See [[why_is_my_agent_hallucinating|Why Is My Agent Hallucinating?]] for the dedicated guide.
**Quick summary:** Agent generates plausible but wrong information. Fix with RAG grounding, chain-of-verification, low temperature, and constrained decoding.
==== 8. Cost Runaway ====
**Symptoms:**
* API bills orders of magnitude higher than expected
* Agent makes far more LLM calls than necessary
* Large context windows used for simple tasks
**Root Causes:**
* No cost monitoring or budget caps
* Agent retries failures without backoff
* Verbose tool outputs inflate context (and cost) per call
* No model routing (using GPT-4 for tasks GPT-4o-mini could handle)
**Fixes:**
class CostGuard:
"""Monitor and limit agent API costs in real-time."""
PRICING = { # USD per 1M tokens (input/output)
"gpt-4o": {"input": 2.50, "output": 10.00},
"gpt-4o-mini": {"input": 0.15, "output": 0.60},
"claude-sonnet-4": {"input": 3.00, "output": 15.00},
"claude-haiku-3.5": {"input": 0.80, "output": 4.00},
}
def __init__(self, budget_usd: float = 5.0):
self.budget = budget_usd
self.total_cost = 0.0
self.calls = []
def track(self, model: str, input_tokens: int, output_tokens: int) -> dict:
pricing = self.PRICING.get(model, {"input": 5.0, "output": 15.0})
cost = (input_tokens * pricing["input"] + output_tokens * pricing["output"]) / 1_000_000
self.total_cost += cost
self.calls.append({"model": model, "cost": cost})
if self.total_cost > self.budget:
return {"allowed": False, "reason": f"Budget exceeded: ${self.total_cost:.4f} / ${self.budget}"}
return {"allowed": True, "total_cost": f"${self.total_cost:.4f}", "remaining": f"${self.budget - self.total_cost:.4f}"}
def recommend_model(self, task_complexity: str) -> str:
"""Route to cheapest sufficient model."""
routing = {
"simple": "gpt-4o-mini", # Classification, extraction, formatting
"moderate": "claude-haiku-3.5", # Summarization, Q&A
"complex": "gpt-4o", # Multi-step reasoning, code generation
"critical": "claude-sonnet-4", # High-stakes decisions
}
return routing.get(task_complexity, "gpt-4o-mini")
===== Failure Mode Decision Diagram =====
graph TD
A[Agent Misbehaving] --> B{What type of failure?}
B --> C[Wrong output]
B --> D[Stuck/looping]
B --> E[Unexpected behavior]
B --> F[Cost explosion]
C --> C1{Is output fabricated?}
C1 -->|Yes| C2[Hallucination - see dedicated guide]
C1 -->|No| C3{Is reasoning wrong?}
C3 -->|Yes| C4[Add chain-of-thought + verification]
C3 -->|No| C5[Tool misuse - fix tool descriptions]
D --> D1{Same action repeating?}
D1 -->|Yes| D2[Infinite loop - add loop detector]
D1 -->|No| D3{Agent oscillating?}
D3 -->|Yes| D4[Circular dependency - break cycle]
D3 -->|No| D5[Context overflow - add summarization]
E --> E1{After processing external input?}
E1 -->|Yes| E2[Prompt injection - sanitize inputs]
E1 -->|No| E3{Doing unrelated tasks?}
E3 -->|Yes| E4[Goal drift - re-anchor to objective]
E3 -->|No| E5[Check system prompt and tool config]
F --> F1[Add CostGuard + model routing]
F --> F2[Add iteration limits]
F --> F3[Compress tool outputs]
===== Production Safety Checklist =====
* **Before deployment:**
- [ ] Set iteration limits (max 50-100 per task)
- [ ] Set cost budget per task and per day
- [ ] Implement loop detection
- [ ] Add human-in-the-loop for destructive actions
- [ ] Test with adversarial inputs
- [ ] Validate tool descriptions and parameter schemas
* **During operation:**
- [ ] Monitor token usage per session
- [ ] Alert on repeated identical tool calls
- [ ] Log all tool calls with parameters and results
- [ ] Track goal alignment score
- [ ] Monitor cost per task vs. baseline
* **Incident response:**
- [ ] Kill switch to halt agent immediately
- [ ] Audit trail of all actions taken
- [ ] Rollback capability for destructive actions
- [ ] Post-mortem template for agent incidents
===== See Also =====
* [[why_is_my_agent_hallucinating|Why Is My Agent Hallucinating?]]
* [[why_is_my_rag_returning_bad_results|Why Is My RAG Returning Bad Results?]]
* [[how_to_handle_rate_limits|How to Handle Rate Limits]]
===== References =====