Agentic Skills

Agentic Skills are composable, procedural modules that LLM agents load on demand to dynamically extend their capabilities without retraining. Formalized in arXiv:2602.12430, the agentic skills paradigm shifts from monolithic models — where all procedural knowledge is baked into weights — to modular agents that acquire specialized abilities at runtime through progressive context injection.¹⁾²⁾

Overview

As LLM agents are deployed across increasingly diverse domains (software engineering, GUI interaction, data analysis), encoding all necessary procedural knowledge into model weights becomes impractical. Agentic skills address this by defining portable, standardized packages of instructions, code, and resources that agents load into their context window when relevant tasks are detected. Skills are typically defined in plain English and can be triggered through natural language commands, abstracting implementation complexity.³⁾ Domain-specific capabilities can be attached to agents to improve their responses with specialized knowledge—for example, authentication-related skills provide domain expertise in SSO, directory sync, and RBAC to enable more accurate code generation.⁴⁾

Skills follow three design principles:

Progressive Disclosure — Content is loaded incrementally, from lightweight metadata to full resources
Portable Definitions — Skills are stored as standardized SKILL.md files transferable across agents
MCP Integration — Skills complement the Model Context Protocol for tool use; skills provide the “how” while MCP tools provide the “what”

Skill Structure

Each skill is formalized with applicability conditions, procedural instructions, and resources, organized into three progressive loading tiers:

Level 1: Frontmatter

Lightweight metadata always available for quick matching — name, description, version, trigger phrases, and dependencies. This enables efficient skill selection without loading full content into the context window.

Level 2: Instructions

Procedural knowledge including workflows, step-by-step logic, and decision trees (typically 200-5,000 tokens). Loaded only upon skill activation.

Level 3: Resources

Auxiliary assets such as scripts, templates, schemas, and reference data. Loaded on-demand when instructions reference them.

Skill Definition Example

SKILL.md frontmatter (Level 1)
skill_definition = {
    "name": "visual-layout-critic",
    "description": "Evaluate rendered visuals for spatial clarity, "
                   "text readability, and element occlusions",
    "version": "1.0.0",
    "triggers": [
        "review layout",
        "check visual quality",
        "refine positioning"
    ],
    "dependencies": ["vision-language-model", "PIL"],
    "applicability": {
        "contexts": ["ui_design", "document_layout", "presentation"],
        "requires_vision": True
    }
}
 
Level 2: Instructions (loaded on activation)
instructions = """
Evaluation Workflow
1. Capture the current visual state of the target element
2. Analyze spatial relationships between all visible components
3. Check text elements for minimum readable font size (>= 12px)
4. Identify any overlapping or occluded elements
5. Score layout on clarity (1-10), readability (1-10), balance (1-10)
6. Generate specific improvement recommendations
 
Decision Criteria
- If any text < 12px: flag as critical
- If overlap > 15% of any element: flag as warning
- If whitespace ratio < 0.2: suggest spacing improvements
"""
 
Level 3: Resources (loaded on-demand)
evaluation_template = {
    "clarity_score": None,
    "readability_score": None,
    "balance_score": None,
    "critical_issues": [],
    "warnings": [],
    "recommendations": []
}

Skill Creation Pipeline

Skills can be created through multiple methods, including a key automated pipeline that extracts skills from open-source repositories:⁵⁾

Repository Structural Analysis — Map codebases to identify recurring procedural patterns
Semantic Identification — Two-stage process using dense retrieval for latent skill candidates, then cross-encoder ranking based on recurrence, verifiability, non-obviousness, and generalizability
Translation to SKILL.md — Generate frontmatter, draft generalizable instructions (avoiding repo-specific details), and bundle necessary assets

Skill Invocation Process

Selection — Agent matches user queries or task context to skill trigger phrases and applicability conditions
Loading — Progressive disclosure injects only the necessary tiers into the context window
Execution — Instructions guide the agent's reasoning; resources and tools activate via MCP for concrete actions
Persistence — Skills are stored durably on the filesystem, separate from ephemeral session data

Ecosystem Scale

The agentic skills ecosystem has grown rapidly:⁶⁾

Over 84,192 skills created in 136 days across the community
In software engineering benchmarks (OSWorld, SWE-bench), skills improve reasoning efficiency — of 24 perfect-score cases, 8 used fewer tokens with skills injected
Security analysis found 26.1% of community-contributed skills contain vulnerabilities, motivating the proposed Skill Trust and Lifecycle Governance Framework with a four-tier permission model

References

¹⁾

“Agentic Skills: Dynamic Capability Extension for LLM Agents.” arXiv:2602.12430

²⁾

Ben's Bites (2026

³⁾

“Agent Skill Definition.” Ben's Bites (2026)

⁴⁾

TLDR AI (2026

⁵⁾

“Automated Skill Extraction from Repositories.” arXiv:2603.11808

⁶⁾

“Agentic Skills Ecosystem Analysis.” arXiv:2603.15401

AI Agent Knowledge Base

Sidebar

Table of Contents

Agentic Skills

Overview

Skill Structure

Level 1: Frontmatter

Level 2: Instructions

Level 3: Resources

Skill Definition Example

Skill Creation Pipeline

Skill Invocation Process

Ecosystem Scale

See Also

References

AI Agent Knowledge Base

User Tools

Site Tools

Sidebar

Table of Contents

Agentic Skills

Overview

Skill Structure

Level 1: Frontmatter

Level 2: Instructions

Level 3: Resources

Skill Definition Example

Skill Creation Pipeline

Skill Invocation Process

Ecosystem Scale

See Also

References

Page Tools