====== Common Agent Failure Modes ======

A systematic catalog of how LLM-based agents fail in production. For each failure mode: symptoms, root causes, and actionable fixes. Based on real incident reports from 2024-2026, including the Kiro AWS outage, Claude Code infinite loop bugs, and enterprise deployment statistics.

===== The Reality of Agent Failures =====

Agent failures are fundamentally different from traditional software bugs. Traditional software fails predictably (null pointers, timeouts). Agents fail probabilistically: the same input can succeed nine times and fail catastrophically on the tenth.

**Production statistics (2025-2026):**
  * 88-95% of AI agent pilots never reach production (Gartner, Deloitte, MIT 2025)(([[https://hypersense-software.com/blog/2026/01/12/why-88-percent-ai-agents-fail-production/|HyperSense Software: "Why 88% of AI Agents Fail in Production," 2026]]))
  * 42% of started AI initiatives are abandoned (S&P Global 2025)
  * 40% of multi-agent deployments fail within 6 months (TechAhead 2025)
  * Amazon's Kiro AI agent autonomously deleted a production AWS environment, causing a 13-hour outage (2026)(([[https://particula.tech/blog/ai-agent-production-safety-kiro-incident|Particula Tech: "When AI Agents Delete Production: Lessons from Amazon's Kiro Incident," 2026]]))
  * A Claude Code sub-agent consumed 27M tokens in an infinite loop over 4.6 hours (GitHub Issue #15909)(([[https://github.com/anthropics/claude-code/issues/15909|GitHub anthropics/claude-code Issue #15909: "Sub-agent stuck in infinite loop, consumed 27M tokens," 2025]]))

===== Failure Mode Catalog =====

==== 1. Reasoning Failures ====

**Symptoms:**
  * Agent makes illogical decisions on edge cases
  * Multi-step plans break at unexpected points
  * Agent confidently executes the wrong plan

**Root Causes:**
  * Model struggles with multi-hop reasoning chains
  * Ambiguous problem decomposition
  * Insufficient few-shot examples for the task type

**Fixes:**
  * Add chain-of-thought prompting with explicit reasoning steps
  * Implement human-in-the-loop checkpoints for critical decisions
  * Break complex tasks into smaller, verifiable sub-tasks
  * Test with adversarial edge cases before deployment

==== 2. Tool Use Errors ====

**Symptoms:**
  * Agent calls tools with wrong parameters
  * Agent calls non-existent tools (hallucinated tool names)
  * Agent misinterprets tool output and takes the wrong action

**Root Causes:**
  * Incomplete or ambiguous tool descriptions
  * Tool output format changes not reflected in prompts
  * Too many tools available (decision fatigue)

**Fixes:**
  * Write precise tool descriptions with parameter types and examples
  * Validate tool call parameters before execution
  * Limit available tools to those relevant to the current task
  * Add tool output parsing with error handling

<code python>
class SafeToolExecutor:
    """Validate and execute tool calls with error handling."""

    def __init__(self, tools: dict):
        self.tools = tools
        self.max_retries = 3
        self.call_log = []

    def execute(self, tool_name: str, params: dict) -> dict:
        # Reject hallucinated tool names up front
        if tool_name not in self.tools:
            return {"error": f"Tool '{tool_name}' not found. "
                             f"Available: {list(self.tools.keys())}"}
        tool = self.tools[tool_name]

        # Validate parameters against the tool's schema
        required = tool.get("required_params", [])
        missing = [p for p in required if p not in params]
        if missing:
            return {"error": f"Missing required params: {missing}"}

        # Execute with bounded retries
        for attempt in range(self.max_retries):
            try:
                result = tool["function"](**params)
                self.call_log.append({"tool": tool_name, "params": params,
                                      "result": "success",
                                      "attempt": attempt + 1})
                return {"result": result}
            except Exception as e:
                if attempt == self.max_retries - 1:
                    self.call_log.append({"tool": tool_name, "params": params,
                                          "result": f"failed: {e}",
                                          "attempt": attempt + 1})
                    return {"error": str(e)}
        return {"error": "Max retries exceeded"}

    def get_call_report(self) -> dict:
        return {
            "total_calls": len(self.call_log),
            "failures": sum(1 for c in self.call_log if "failed" in c["result"]),
            "tools_used": list(set(c["tool"] for c in self.call_log)),
        }
</code>

==== 3. Context Overflow ====

**Symptoms:**
  * Agent "forgets" earlier instructions mid-conversation
  * Quality degrades as the conversation grows
  * Agent contradicts its own earlier statements
  * Tool results from early calls are silently dropped

**Root Causes:**
  * Conversation history exceeds the context window
  * No summarization or pruning of old context
  * Large tool outputs consume disproportionate context

**Fixes:**
  * Implement a sliding window with summarization of older turns
  * Compress tool outputs before adding them to context
  * Monitor token usage per turn and alert before overflow
  * Use a memory system (short-term + long-term retrieval)

==== 4. Infinite Loops ====

**Real incident:** A Claude Code sub-agent ran ''npm install'' 300+ times over 4.6 hours, consuming 27M tokens at 128K context per iteration. A LangGraph agent processed 2,847 iterations at $400+ cost for a $5 task(([[https://docs.bswen.com/blog/2026-03-11-prevent-ai-agent-infinite-loops/|BSWEN: "How Do You Stop AI Agents From Infinite Loops?" 2026]])).
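Both runaways burned hours of wall-clock time before anyone intervened, so independent of any per-action loop detection, a hard deadline on the whole task caps the blast radius. A minimal sketch using the standard library (the ''run_with_deadline'' name and the 10-minute default are illustrative assumptions, not from the incidents above):

<code python>
import concurrent.futures

# One worker; a real deployment would scope a pool per agent task.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)

def run_with_deadline(task_fn, timeout_s: float = 600.0) -> dict:
    """Run an agent task with a hard wall-clock deadline.

    Python threads cannot be force-killed, so a timed-out worker may
    keep running in the background; the caller, however, regains
    control and can stop issuing new LLM and tool calls.
    """
    future = _pool.submit(task_fn)
    try:
        return {"ok": True, "result": future.result(timeout=timeout_s)}
    except concurrent.futures.TimeoutError:
        future.cancel()  # best effort; has no effect once running
        return {"ok": False, "reason": f"deadline of {timeout_s}s exceeded"}
</code>

For agents that shell out to subprocesses, a process-level timeout (e.g. ''subprocess.run(..., timeout=...)'') gives a true kill, which is preferable when destructive tools are in play.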
**Symptoms:**
  * Agent repeats the same action with the same parameters
  * Token usage spikes without progress
  * Agent alternates between two states without converging

**Root Causes:**
  * No loop detection or iteration limits
  * Agent receives an ambiguous error and retries identically
  * Circular dependency between tools (Tool A calls Tool B calls Tool A)
  * Agent cannot recognize task completion

**Fixes:**

<code python>
import hashlib

class LoopDetector:
    """Detect and prevent infinite loops in agent execution."""

    def __init__(self, max_iterations: int = 50, max_cost_usd: float = 10.0):
        self.max_iterations = max_iterations
        self.max_cost_usd = max_cost_usd
        self.iteration = 0
        self.total_tokens = 0
        self.action_hashes = []
        self.cost_per_1k_tokens = 0.01  # Adjust per model

    def check(self, action: str, params: dict, tokens_used: int) -> dict:
        self.iteration += 1
        self.total_tokens += tokens_used
        estimated_cost = (self.total_tokens / 1000) * self.cost_per_1k_tokens

        # Check iteration limit
        if self.iteration > self.max_iterations:
            return {"halt": True,
                    "reason": f"Max iterations ({self.max_iterations}) exceeded"}

        # Check cost limit
        if estimated_cost > self.max_cost_usd:
            return {"halt": True,
                    "reason": f"Cost limit (${self.max_cost_usd}) exceeded: "
                              f"${estimated_cost:.2f}"}

        # Check for repeated actions (same action + params = loop)
        action_hash = hashlib.md5(f"{action}{params}".encode()).hexdigest()
        recent_hashes = self.action_hashes[-10:]  # Check last 10
        repeat_count = recent_hashes.count(action_hash)
        self.action_hashes.append(action_hash)
        if repeat_count >= 3:
            return {"halt": True,
                    "reason": f"Action '{action}' repeated {repeat_count}x "
                              f"with same params"}

        return {"halt": False, "iteration": self.iteration,
                "cost": f"${estimated_cost:.2f}"}
</code>

==== 5. Goal Drift ====

**Symptoms:**
  * Agent starts performing task A but gradually shifts to task B
  * Output addresses a tangentially related topic
  * Agent gets sidetracked by interesting but irrelevant information from tools

**Root Causes:**
  * System prompt gets diluted by conversation length
  * Tool results introduce attractive distractions
  * No mechanism to periodically re-anchor to the original goal

**Fixes:**
  * Repeat the goal in every prompt (not just the system message)
  * Add a "goal check" step every N iterations
  * Use structured output that must reference the original task
  * Implement a planning step that produces a checklist, then track progress

==== 6. Prompt Injection ====

**Symptoms:**
  * Agent performs unexpected actions after processing user input
  * Agent ignores system instructions and follows user-injected instructions
  * Sensitive data leaked through crafted queries

**Root Causes:**
  * No input sanitization or boundary between instructions and data
  * Agent processes untrusted content (emails, web pages) as instructions
  * Insufficient separation between system and user context

**Fixes:**
  * Separate data from instructions using clear delimiters
  * Treat all tool outputs and user inputs as untrusted data
  * Implement output filtering for sensitive content
  * Use canary tokens to detect instruction override attempts
  * Apply the principle of least privilege to tool permissions

==== 7. Hallucination ====

See [[why_is_my_agent_hallucinating|Why Is My Agent Hallucinating?]] for the dedicated guide.

**Quick summary:** The agent generates plausible but wrong information. Fix with RAG grounding, chain-of-verification, low temperature, and constrained decoding.

==== 8. Cost Runaway ====

**Symptoms:**
  * API bills orders of magnitude higher than expected
  * Agent makes far more LLM calls than necessary
  * Large context windows used for simple tasks

**Root Causes:**
  * No cost monitoring or budget caps
  * Agent retries failures without backoff
  * Verbose tool outputs inflate context (and cost) per call
  * No model routing (using GPT-4 for tasks GPT-4o-mini could handle)

**Fixes:**

<code python>
class CostGuard:
    """Monitor and limit agent API costs in real time."""

    PRICING = {  # USD per 1M tokens (input/output)
        "gpt-4o": {"input": 2.50, "output": 10.00},
        "gpt-4o-mini": {"input": 0.15, "output": 0.60},
        "claude-sonnet-4": {"input": 3.00, "output": 15.00},
        "claude-haiku-3.5": {"input": 0.80, "output": 4.00},
    }

    def __init__(self, budget_usd: float = 5.0):
        self.budget = budget_usd
        self.total_cost = 0.0
        self.calls = []

    def track(self, model: str, input_tokens: int, output_tokens: int) -> dict:
        # Unknown models fall back to conservative (high) pricing
        pricing = self.PRICING.get(model, {"input": 5.0, "output": 15.0})
        cost = (input_tokens * pricing["input"]
                + output_tokens * pricing["output"]) / 1_000_000
        self.total_cost += cost
        self.calls.append({"model": model, "cost": cost})
        if self.total_cost > self.budget:
            return {"allowed": False,
                    "reason": f"Budget exceeded: "
                              f"${self.total_cost:.4f} / ${self.budget}"}
        return {"allowed": True,
                "total_cost": f"${self.total_cost:.4f}",
                "remaining": f"${self.budget - self.total_cost:.4f}"}

    def recommend_model(self, task_complexity: str) -> str:
        """Route to the cheapest sufficient model."""
        routing = {
            "simple": "gpt-4o-mini",        # Classification, extraction, formatting
            "moderate": "claude-haiku-3.5", # Summarization, Q&A
            "complex": "gpt-4o",            # Multi-step reasoning, code generation
            "critical": "claude-sonnet-4",  # High-stakes decisions
        }
        return routing.get(task_complexity, "gpt-4o-mini")
</code>

===== Failure Mode Decision Diagram =====

<code>
graph TD
    A[Agent Misbehaving] --> B{What type of failure?}
    B --> C[Wrong output]
    B --> D[Stuck/looping]
    B --> E[Unexpected behavior]
    B --> F[Cost explosion]
    C --> C1{Is output fabricated?}
    C1 -->|Yes| C2[Hallucination - see dedicated guide]
    C1 -->|No| C3{Is reasoning wrong?}
    C3 -->|Yes| C4[Add chain-of-thought + verification]
    C3 -->|No| C5[Tool misuse - fix tool descriptions]
    D --> D1{Same action repeating?}
    D1 -->|Yes| D2[Infinite loop - add loop detector]
    D1 -->|No| D3{Agent oscillating?}
    D3 -->|Yes| D4[Circular dependency - break cycle]
    D3 -->|No| D5[Context overflow - add summarization]
    E --> E1{After processing external input?}
    E1 -->|Yes| E2[Prompt injection - sanitize inputs]
    E1 -->|No| E3{Doing unrelated tasks?}
    E3 -->|Yes| E4[Goal drift - re-anchor to objective]
    E3 -->|No| E5[Check system prompt and tool config]
    F --> F1[Add CostGuard + model routing]
    F --> F2[Add iteration limits]
    F --> F3[Compress tool outputs]
</code>

===== Production Safety Checklist =====

  * **Before deployment:**
    - [ ] Set iteration limits (max 50-100 per task)
    - [ ] Set a cost budget per task and per day
    - [ ] Implement loop detection
    - [ ] Add human-in-the-loop for destructive actions
    - [ ] Test with adversarial inputs
    - [ ] Validate tool descriptions and parameter schemas
  * **During operation:**
    - [ ] Monitor token usage per session
    - [ ] Alert on repeated identical tool calls
    - [ ] Log all tool calls with parameters and results
    - [ ] Track goal alignment score
    - [ ] Monitor cost per task vs. baseline
  * **Incident response:**
    - [ ] Kill switch to halt the agent immediately
    - [ ] Audit trail of all actions taken
    - [ ] Rollback capability for destructive actions
    - [ ] Post-mortem template for agent incidents

===== See Also =====

  * [[why_is_my_agent_hallucinating|Why Is My Agent Hallucinating?]]
  * [[why_is_my_rag_returning_bad_results|Why Is My RAG Returning Bad Results?]]
  * [[how_to_handle_rate_limits|How to Handle Rate Limits]]

===== References =====