====== Agentic Skills ======

**Agentic Skills** are composable, procedural modules that LLM agents load on demand to dynamically extend their capabilities without retraining. Formalized in [[https://arxiv.org/abs/2602.12430|arXiv:2602.12430]], the agentic skills paradigm shifts from monolithic models, where all procedural knowledge is baked into weights, to modular agents that acquire specialized abilities at runtime through progressive context injection.

===== Overview =====

As LLM agents are deployed across increasingly diverse domains (software engineering, GUI interaction, data analysis), encoding all necessary procedural knowledge into model weights becomes impractical. Agentic skills address this by defining portable, standardized packages of instructions, code, and resources that agents load into their context window when relevant tasks are detected.

Skills follow three design principles:

  * **Progressive Disclosure**: Content is loaded incrementally, from lightweight metadata to full resources
  * **Portable Definitions**: Skills are stored as standardized SKILL.md files transferable across agents
  * **MCP Integration**: Skills complement the Model Context Protocol for tool use; skills provide the "how" while MCP tools provide the "what"

===== Skill Structure =====

Each skill is formalized with applicability conditions, procedural instructions, and resources, organized into three progressive loading tiers:

=== Level 1: Frontmatter ===

Lightweight metadata that is always available for quick matching: name, description, version, trigger phrases, and dependencies. This enables efficient skill selection without loading full content into the context window.

=== Level 2: Instructions ===

Procedural knowledge including workflows, step-by-step logic, and decision trees (typically 200-5,000 tokens). Loaded only upon skill activation.

=== Level 3: Resources ===

Auxiliary assets such as scripts, templates, schemas, and reference data. Loaded on demand when instructions reference them.

===== Skill Definition Example =====

<code python>
# SKILL.md frontmatter (Level 1)
skill_definition = {
    "name": "visual-layout-critic",
    "description": "Evaluate rendered visuals for spatial clarity, "
                   "text readability, and element occlusions",
    "version": "1.0.0",
    "triggers": [
        "review layout",
        "check visual quality",
        "refine positioning"
    ],
    "dependencies": ["vision-language-model", "PIL"],
    "applicability": {
        "contexts": ["ui_design", "document_layout", "presentation"],
        "requires_vision": True
    }
}

# Level 2: Instructions (loaded on activation)
instructions = """
## Evaluation Workflow
1. Capture the current visual state of the target element
2. Analyze spatial relationships between all visible components
3. Check text elements for minimum readable font size (>= 12px)
4. Identify any overlapping or occluded elements
5. Score layout on clarity (1-10), readability (1-10), balance (1-10)
6. Generate specific improvement recommendations

## Decision Criteria
- If any text < 12px: flag as critical
- If overlap > 15% of any element: flag as warning
- If whitespace ratio < 0.2: suggest spacing improvements
"""

# Level 3: Resources (loaded on-demand)
evaluation_template = {
    "clarity_score": None,
    "readability_score": None,
    "balance_score": None,
    "critical_issues": [],
    "warnings": [],
    "recommendations": []
}
</code>

===== Skill Creation Pipeline =====

Skills can be created through multiple methods, including a key automated pipeline that extracts skills from open-source repositories:

  - **Repository Structural Analysis**: Map codebases to identify recurring procedural patterns
  - **Semantic Identification**: Two-stage process using dense retrieval for latent skill candidates, then cross-encoder ranking based on recurrence, verifiability, non-obviousness, and generalizability
  - **Translation to SKILL.md**: Generate frontmatter, draft generalizable instructions (avoiding repo-specific details), and bundle necessary assets

===== Skill Invocation Process =====

  - **Selection**: Agent matches user queries or task context to skill trigger phrases and applicability conditions
  - **Loading**: Progressive disclosure injects only the necessary tiers into the context window
  - **Execution**: Instructions guide the agent's reasoning; resources and tools activate via MCP for concrete actions
  - **Persistence**: Skills are stored durably on the filesystem, separate from ephemeral session data

===== Ecosystem Scale =====

The agentic skills ecosystem has grown rapidly:

  * Over **84,192 skills** created in 136 days across the community
  * On agent benchmarks (OSWorld, SWE-bench), skills improve reasoning efficiency: of 24 perfect-score cases, 8 used fewer tokens with skills injected
  * Security analysis found 26.1% of community-contributed skills contain vulnerabilities, motivating the proposed **Skill Trust and Lifecycle Governance Framework** with a four-tier permission model

===== References =====

  * [[https://arxiv.org/abs/2602.12430|arXiv:2602.12430 - Agentic Skills: Dynamic Capability Extension for LLM Agents]]
  * [[https://arxiv.org/abs/2603.11808|arXiv:2603.11808 - Automated Skill Extraction from Repositories]]
  * [[https://arxiv.org/abs/2603.15401|arXiv:2603.15401 - Agentic Skills Ecosystem Analysis]]

===== See Also =====

  * [[self_evolving_agents|Self-Evolving Agents]]
  * [[model_context_protocol|Model Context Protocol (MCP)]]
  * [[tool_use|Tool Use in LLM Agents]]
  * [[agent_memory|Agent Memory Systems]]
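
===== Sketch: Selection and Progressive Loading =====

The Selection and Loading steps of the skill invocation process can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the ''Skill'' dataclass, the ''select_skills'' and ''build_context'' helpers, and the naive substring matching are hypothetical and are not taken from the cited papers or from any real skill runtime. The sketch only shows the idea that Level 1 triggers are checked first, Level 2 instructions are injected for matched skills, and Level 3 resources are injected only when the instructions mention them by name.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """Hypothetical in-memory view of one skill's three tiers."""
    name: str
    triggers: list[str]                      # Level 1: frontmatter
    instructions: str = ""                   # Level 2: loaded on activation
    resources: dict[str, str] = field(default_factory=dict)  # Level 3


def select_skills(query: str, skills: list[Skill]) -> list[Skill]:
    """Selection: naive case-insensitive substring match on trigger phrases."""
    q = query.lower()
    return [s for s in skills if any(t.lower() in q for t in s.triggers)]


def build_context(query: str, skills: list[Skill]) -> list[str]:
    """Loading: add Level 2 text for matched skills, then only those
    Level 3 resources whose name appears in the instructions."""
    context: list[str] = []
    for skill in select_skills(query, skills):
        context.append(skill.instructions)
        for res_name, body in skill.resources.items():
            if res_name in skill.instructions:
                context.append(body)
    return context


# Assumed example data, loosely modeled on the visual-layout-critic skill.
critic = Skill(
    name="visual-layout-critic",
    triggers=["review layout", "check visual quality"],
    instructions="Score the layout, then fill in evaluation_template.",
    resources={
        "evaluation_template": '{"clarity_score": null}',
        "unused_schema": "{}",
    },
)

context = build_context("Please review layout of slide 3", [critic])
print(len(context))  # 2: the instructions plus the one referenced resource
```

A real agent would likely replace the substring match with embedding-based retrieval or model-driven selection, and would read each tier from the filesystem rather than holding all tiers in memory. The sketch keeps everything in one object only to make the tier boundaries visible.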