Core Concepts
Reasoning Techniques
Memory Systems
Retrieval
Agent Types
Design Patterns
Training & Alignment
Frameworks
Tools & Products
Safety & Governance
Evaluation
Research
Development
Meta
Core Concepts
Reasoning Techniques
Memory Systems
Retrieval
Agent Types
Design Patterns
Training & Alignment
Frameworks
Tools & Products
Safety & Governance
Evaluation
Research
Development
Meta
Agent threat modeling is the systematic analysis of security vulnerabilities in LLM-based autonomous agents. As agents gain capabilities to execute code, access tools, and interact with external systems, they introduce novel attack surfaces that extend far beyond traditional prompt injection. The OWASP Top 10 for Agentic Applications (2026) and research by Schneier et al. frame these as multi-stage “Promptware Kill Chains” that hijack planning, tools, and propagation across systems.
In agentic systems, prompt injections evolve from isolated manipulations into coordinated multi-tool, multi-step attacks:
The Promptware Kill Chain (Schneier et al., 2026) models five stages of agentic prompt injection attacks:
Agents inherit user privileges for tools, creating dangerous attack vectors:
Compromised agents can leak sensitive data through multiple channels:
Agent supply chains introduce multiple points of compromise:
Defense-in-depth strategies for securing LLM agents:
Input/Output Validation:
Tool Sandboxing and Privilege Minimization:
Goal-Lock and Human-in-the-Loop:
Monitoring and Detection:
# Example: Agent threat detection middleware class AgentSecurityMiddleware: def __init__(self, policy): self.policy = policy self.injection_detector = InjectionClassifier() self.anomaly_detector = BehaviorAnomalyDetector() def validate_tool_call(self, agent_id, tool_name, arguments): """Validate a tool call before execution.""" # Check tool is in agent's allowlist if tool_name not in self.policy.allowed_tools(agent_id): raise SecurityViolation(f"Unauthorized tool: {tool_name}") # Scan arguments for injection attempts if self.injection_detector.scan(str(arguments)): raise SecurityViolation("Potential injection in tool args") # Check for anomalous behavior patterns if self.anomaly_detector.is_anomalous(agent_id, tool_name): self.escalate_to_human(agent_id, tool_name, arguments) return True # Allow execution