====== Agent Index ======

**The AI Agent Index** is a systematic documentation effort that catalogues and analyzes deployed agentic AI systems. The 2025 AI Agent Index, hosted at aiagentindex.mit.edu, documents the origins, design, capabilities, ecosystem, and safety features of 30 prominent AI agents. Seven expert annotators analyzed each agent across 45 fields in six categories, using publicly available data.

===== Methodology =====

The Agent Index follows a structured annotation methodology:

  * **Selection criteria**: Agents are chosen based on three factors: degree of agency (autonomous decision-making capability), real-world impact (deployment scale and user base), and practicality (availability for actual use)
  * **Annotation framework**: Each agent is analyzed across 45 fields organized into six categories, for a total of 1,350 data points (30 agents x 45 fields)
  * **Expert review**: Seven domain experts annotate agents using only publicly available information, following the methodology established in the 2024 index with revised criteria
  * **Agent types**: Three primary categories: chat agents (12), browser agents (5), and enterprise agents (13)
  * **Timeliness**: 24 of 30 indexed agents were released or received major updates in 2024-2025

===== Taxonomy of Deployed Agents =====

The index categorizes deployed agentic systems by their primary function:

**Information Synthesis (12/30 agents):**

  * Chat-based agents that process queries, retrieve information, and generate comprehensive responses
  * Examples include ChatGPT Agent, Claude, Gemini, and domain-specific assistants

**Workflow Automation (11/30 agents, mostly enterprise):**

  * Agents targeting business process automation across HR, sales, customer support, and IT operations
  * Represent the "second wave" of agent deployment after initial chat agent proliferation

**GUI / Browser Operations (7/30 agents):**

  * Agents that interact with web interfaces to complete tasks like form filling, ordering, booking, and data extraction
  * Require visual understanding and complex multi-step web navigation

===== Capability Comparisons =====

The index compares agents across multiple capability dimensions:

  * **Technical capabilities**: Underlying models, available tools, system architecture, and memory mechanisms
  * **Autonomy levels**: Degree of independent operation, from fully supervised to largely autonomous
  * **Approval requirements**: Whether human confirmation is needed for specific action types
  * **Monitoring features**: Real-time observability into agent decision-making and actions
  * **Emergency controls**: Kill switches and emergency stop mechanisms
  * **Ecosystem factors**: Identification protocols, interoperability with other systems, and API availability
  * **Model selection**: Whether agents can automatically select between multiple underlying models or are locked to a single provider

===== Safety Features =====

A critical finding of the Agent Index is the significant gap in safety documentation:

  * **Inconsistent reporting**: Safety feature documentation varies widely across agents, with many providing minimal or no information
  * **Limited evaluations**: Only a handful of agents (ChatGPT Agent, OpenAI Codex, Claude Code, Gemini 2.5) publish agent-specific safety evaluations
  * **Enterprise compliance focus**: Enterprise platforms tend to emphasize general compliance certifications (SOC 2, GDPR) rather than agent-specific safety measures
  * **Persistent gaps**: Across all 1,350 annotated fields, safety, evaluation, and ecosystem transparency show the most significant documentation deficiencies
  * **Guardrails variability**: Sandboxing, output filtering, and behavioral constraints are implemented inconsistently, with no industry standard

===== Market Landscape =====

The index reveals a rapidly evolving market:

**Research Growth:**

  * In 2025, the number of "AI agent" research papers was more than double the combined total from 2020-2024
  * Academic and industry interest has accelerated sharply

**Enterprise Adoption:**

  * McKinsey's mid-2025 survey found that 62% of 1,993 surveyed companies were experimenting with AI agents
  * Salesforce Agentic Enterprise Index: 119% agent creation growth (Jan-Jun 2025) among first-movers
  * 22x rise in agent-led customer service conversations
  * 80% monthly growth in agent-initiated actions

**Deployment Sectors:**

  * Customer service leads enterprise adoption
  * Retail, travel/hospitality, and financial services lead consumer-facing deployment
  * Travel/hospitality shows a 133% monthly growth rate
  * 94% of consumers opted to interact with agents when available

**Workforce Integration:**

  * Employee interactions with agents grew 65% monthly
  * Agent-initiated actions from employee use grew 76% monthly
  * Escalations to humans rose to 32% in Q2 2025, suggesting appropriate boundary-setting

<code python>
# Example: Agent Index data structure for cataloguing agents.
# The entry is illustrative; its values do not describe a real indexed agent.
agent_entry = {
    "name": "Example Agent",
    "type": "enterprise",  # chat | browser | enterprise
    "version": "2.0",
    "release_date": "2025-03",
    # What the agent does and where it runs
    "product_overview": {
        "primary_function": "workflow_automation",
        "domains": ["customer_support", "sales"],
        "deployment": "cloud_saas",
    },
    # Underlying models, tools, memory, and architecture
    "technical_capabilities": {
        "base_models": ["gpt-4o", "claude-sonnet"],
        "model_selection": "automatic",
        "tools": ["web_search", "code_exec", "email", "crm_api"],
        "memory": "persistent_conversation",
        "architecture": "react_loop",
    },
    # How independently the agent acts and how humans can intervene
    "autonomy_and_control": {
        "autonomy_level": "semi-autonomous",
        "human_approval_required": ["financial_transactions", "data_deletion"],
        "kill_switch": True,
        "monitoring": "real_time_dashboard",
    },
    # Published safety evidence, guardrails, and compliance claims
    "safety_and_evaluation": {
        "published_evals": False,
        "guardrails": ["output_filtering", "input_validation"],
        "sandboxing": "container_isolation",
        "third_party_audit": False,
        "compliance": ["SOC2", "GDPR"],
    },
}
</code>

===== References =====

  * [[https://arxiv.org/abs/2602.17753|The 2025 AI Agent Index (arXiv:2602.17753)]]
  * [[https://aiagentindex.mit.edu|AI Agent Index (MIT)]]
  * [[https://aiagentindex.mit.edu/data/2025-AI-Agent-Index.pdf|2025 AI Agent Index Full Report (PDF)]]
  * [[https://www.salesforce.com/news/stories/agentic-enterprise-index-insights-h1-2025/|Agentic Enterprise Index H1 2025 (Salesforce)]]

===== See Also =====

  * [[agent_as_a_judge|Agent-as-a-Judge]]
  * [[swe_bench|SWE-bench]]
  * [[web_arena_benchmark|WebArena Benchmark]]
  * [[agent_governance_frameworks|Agent Governance Frameworks]]
  * [[agent_threat_modeling|Agent Threat Modeling]]
  * [[agent_sandbox_security|Agent Sandbox Security]]
  * [[reasoning_reward_models|Reasoning Reward Models]]
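
===== Worked Example: Checking Coverage Figures =====

The coverage figures quoted under Methodology (30 agents, 45 fields, 1,350 data points, and the 12/5/13 split by agent type) can be checked with a short sketch. It also shows one possible way to flag safety fields that an entry leaves undocumented, which is the kind of gap the Safety Features section describes. The function names are hypothetical, and the safety field names are borrowed from the illustrative entry on this page; they are assumptions, not the index's actual schema or tooling.

```python
from collections import Counter

# Figures quoted in the Methodology section of this page.
NUM_AGENTS = 30
NUM_FIELDS = 45
TYPE_COUNTS = {"chat": 12, "browser": 5, "enterprise": 13}

# Assumption: safety keys taken from the illustrative entry on this page,
# not from the index's published annotation schema.
SAFETY_FIELDS = (
    "published_evals",
    "guardrails",
    "sandboxing",
    "third_party_audit",
    "compliance",
)


def total_data_points(num_agents: int, num_fields: int) -> int:
    """Every agent is annotated on every field, so the total is a product."""
    return num_agents * num_fields


def tally_by_type(entries):
    """Count entries per agent type (chat, browser, enterprise)."""
    return Counter(entry["type"] for entry in entries)


def undocumented_safety_fields(entry):
    """Return safety fields that are absent or empty in an entry.

    An explicit False counts as documented: it records that something
    (for example a third-party audit) is known not to exist.
    """
    safety = entry.get("safety_and_evaluation", {})
    return [name for name in SAFETY_FIELDS if safety.get(name) in (None, "", [])]


if __name__ == "__main__":
    # The quoted totals are internally consistent.
    assert total_data_points(NUM_AGENTS, NUM_FIELDS) == 1350
    assert sum(TYPE_COUNTS.values()) == NUM_AGENTS

    # Hypothetical entries, for illustration only.
    entries = [
        {"name": "Agent A", "type": "chat",
         "safety_and_evaluation": {"published_evals": True,
                                   "guardrails": ["output_filtering"]}},
        {"name": "Agent B", "type": "enterprise",
         "safety_and_evaluation": {"compliance": ["SOC2"]}},
        {"name": "Agent C", "type": "enterprise"},
    ]
    print(dict(tally_by_type(entries)))
    for entry in entries:
        print(entry["name"], undocumented_safety_fields(entry))
```

Treating an explicit ''False'' as documented is a deliberate choice in this sketch: it separates "the vendor states there is no audit" from "nothing is published either way", and only the second case is a documentation gap.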