smolagents

smolagents is a lightweight, open-source Python library from Hugging Face for building AI agents with minimal code. Released on December 31, 2024, it serves as a successor to the heavier Transformer Agents library, emphasizing simplicity, code-based agent actions, and broad LLM compatibility.

The core philosophy of smolagents is that agents should write and execute Python code directly rather than producing JSON tool calls — leading to more expressive, composable, and efficient agent behavior.

Code Agents vs. Tool Agents

smolagents distinguishes between two primary agent types:

CodeAgent writes actions directly in Python code, enabling loops, conditionals, and nested operations for complex orchestration. Code is executed in secure sandboxes (Docker, Modal, or E2B). This is the preferred approach for complex tasks because LLMs are better at generating Python (abundant in training data) than structured JSON.

ToolCallingAgent uses traditional JSON/text-based tool calls for simpler scenarios where the agent selects a function and provides arguments. This matches the standard approach used by most other frameworks.

Type	Mechanism	Best For
CodeAgent	Writes Python code directly	Complex orchestration, multi-step logic
ToolCallingAgent	JSON-based tool selection	Simple tool routing, standard patterns

Key Features

Minimal codebase: Core logic is small and readable — agents run in just a few lines
Model-agnostic: Works with HuggingFace Hub models, OpenAI, Anthropic, LiteLLM, and local models via Ollama
Modality-agnostic: Handles text, vision, video, and audio
Tool-agnostic: Integrates MCP servers, LangChain tools, and HuggingFace Spaces
Hub integration: Share and load agents and tools via Hugging Face Hub as Gradio Spaces
Secure execution: Code agents run in sandboxed environments to prevent unsafe operations

Architecture

smolagents follows a minimalist agentic progression:

Level 1: LLM makes simple routing decisions (if/else paths)
Level 2: LLM selects tools and arguments (tool-calling pattern)
Level 3: LLM runs multi-step loops with sub-agents and complex workflows

The framework prioritizes code generation over JSON for agent actions because:

LLMs are better trained on Python than JSON schemas
Python is inherently composable (variables, loops, functions)
Code is more general and expressive than structured tool calls

Code Example

from smolagents import CodeAgent, InferenceClientModel, tool, WebSearchTool
 
# Use a HuggingFace-hosted model
model = InferenceClientModel()
 
# Simple CodeAgent with no tools
agent = CodeAgent(tools=[], model=model)
result = agent.run('Calculate the sum of numbers from 1 to 10')
print(result)  # 55
 
# Agent with custom tool and web search
@tool
def translate(text: str, target_lang: str) -> str:
    """Translate text to the target language."""
    return model.generate(f'Translate to {target_lang}: {text}')
 
agent = CodeAgent(
    tools=[WebSearchTool(), translate],
    model=model
)
result = agent.run('Search for top AI news and translate the headline to French')

Differences from Heavier Frameworks

Unlike LangChain, AutoGen, or CrewAI, smolagents is deliberately minimal:

No complex graph definitions or state machines
No heavy dependency chains
Code generation instead of JSON tool schemas
Focus on developer understanding — the entire framework is readable in an afternoon
Direct integration with Hugging Face ecosystem

AI Agent Knowledge Base

Sidebar

Table of Contents

smolagents

Code Agents vs. Tool Agents

Key Features

Architecture

Code Example

Differences from Heavier Frameworks

References

See Also

AI Agent Knowledge Base

User Tools

Site Tools

Sidebar

Table of Contents

smolagents

Code Agents vs. Tool Agents

Key Features

Architecture

Code Example

Differences from Heavier Frameworks

References

See Also

Page Tools