Instructor

Instructor is a lightweight Python library for extracting structured, validated data from large language models by leveraging Pydantic models and function calling. Rather than parsing free-form text, Instructor patches LLM client libraries to return data conforming to user-defined schemas, with automatic retry logic and validation. Created by Jason Liu (jxnl), it supports 15+ LLM providers.¹⁾

Website: python.useinstructor.com²⁾
GitHub: github.com/jxnl/instructor
Install: pip install instructor
License: MIT
Ports: Python (primary), Go, TypeScript, Ruby, Elixir

How It Works

Instructor patches provider SDKs (like OpenAI's client) to add a response_model parameter that accepts any Pydantic model. The library handles schema generation, prompt construction, response parsing, and validation automatically.

import instructor
from pydantic import BaseModel
from openai import OpenAI

class Person(BaseModel):
    name: str
    age: int
    occupation: str

client = instructor.from_openai(OpenAI())
person = client.chat.completions.create(
    model="gpt-4o",
    response_model=Person,
    messages=[{"role": "user", "content": "John is a 30-year-old software engineer."}]
)
# Returns: Person(name='John', age=30, occupation='software engineer')
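
The schema generation and validation steps described in this section can be reproduced with Pydantic alone, without calling any provider. The sketch below is illustrative and not a view into Instructor's internals: raw_response is a hypothetical stand-in for the JSON a model might return, and the exact schema Instructor sends to each provider may differ.

```python
import json
from pydantic import BaseModel, ValidationError

class Person(BaseModel):
    name: str
    age: int
    occupation: str

# JSON Schema derived from the model's type hints. A schema of this kind
# is what a function-calling request describes to the model.
schema = Person.model_json_schema()
print(json.dumps(schema, indent=2))

# Hypothetical raw JSON returned by a model, parsed and validated in one step.
raw_response = '{"name": "John", "age": 30, "occupation": "software engineer"}'
person = Person.model_validate_json(raw_response)
print(person)

# A response that does not fit the schema raises ValidationError:
# here "age" is not an integer and "occupation" is missing.
try:
    Person.model_validate_json('{"name": "John", "age": "thirty"}')
    error_count = 0
except ValidationError as exc:
    error_count = exc.error_count()
print(f"validation errors: {error_count}")
```

Seen this way, response_model is mostly a convenience: the same Pydantic class serves as the schema handed to the model and as the validator for what comes back.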

Pydantic Integration

Instructor builds directly on Pydantic for schema definitions, providing:

Type safety: Standard Python type hints with IDE autocompletion
Nested structures: Complex models with lists, optionals, and nested objects
Custom validators: Pydantic validators for business logic constraints
Semantic validation: LLM-powered validation for subjective criteria (e.g., “is this summary accurate?”)
Zero new syntax: Uses standard Pydantic models, no framework-specific DSL

The following example extracts structured data from unstructured text with nested models and validation:

# Extract structured contact info with nested models and auto-retry on validation failure
import instructor
from pydantic import BaseModel, field_validator
from openai import OpenAI

class Address(BaseModel):
    street: str
    city: str
    state: str

class Contact(BaseModel):
    name: str
    email: str
    address: Address

    @field_validator("email")
    @classmethod
    def validate_email(cls, v):
        if "@" not in v:
            raise ValueError("Invalid email format")
        return v

client = instructor.from_openai(OpenAI())
contact = client.chat.completions.create(
    model="gpt-4o",
    response_model=Contact,
    max_retries=3,  # auto-retries on validation failure
    messages=[{"role": "user", "content": "Jane Doe, jane@acme.com, lives at 123 Main St, Austin, TX"}],
)
print(contact.model_dump_json(indent=2))
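
To see what a validation failure looks like for nested models, the same Contact schema can be exercised offline. This is a minimal sketch using only Pydantic. The bad_payload dictionary is a hypothetical model output, not something Instructor produced.

```python
from pydantic import BaseModel, ValidationError, field_validator

class Address(BaseModel):
    street: str
    city: str
    state: str

class Contact(BaseModel):
    name: str
    email: str
    address: Address

    @field_validator("email")
    @classmethod
    def validate_email(cls, v: str) -> str:
        if "@" not in v:
            raise ValueError("Invalid email format")
        return v

# Hypothetical output with a malformed email and a missing nested field.
bad_payload = {
    "name": "Jane Doe",
    "email": "jane.acme.com",
    "address": {"street": "123 Main St", "city": "Austin"},
}

errors = []
try:
    Contact.model_validate(bad_payload)
except ValidationError as exc:
    errors = exc.errors()

# Each error records the path to the offending field and a readable message.
for err in errors:
    location = ".".join(str(part) for part in err["loc"])
    print(f"{location}: {err['msg']}")
```

Pydantic reports every failing field in one pass, including fields inside nested models, so a retry can address all problems at once.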

Supported Providers

Instructor exposes a unified interface via instructor.from_provider(), alongside provider-specific patchers such as instructor.from_openai():

OpenAI (GPT-4o, GPT-5, o3) - core integration
Anthropic (Claude 3.5, Claude 4)
Google (Gemini)
Cohere
Ollama (local models like Llama 3)
DeepSeek, Together, Groq
llama-cpp-python (local inference)
Writer
Any OpenAI-compatible API

Key Features

Retry Logic: Automatic retries on validation failure using Tenacity integration, with configurable max attempts
Streaming: Support for partial responses and real-time list building
Low Abstraction: Zero-overhead patch that can be enabled/disabled without refactoring
Multimodal: Support for vision inputs alongside text
llms.txt: Implements the llms.txt specification for documentation discoverability
Iterable responses: Stream lists of objects as they are generated
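
The retry-on-validation-failure idea from the feature list can be sketched as a plain loop. This is an illustration of the general technique, not Instructor's implementation (which, as noted, builds on Tenacity). The names extract_with_retries, fake_model, and max_attempts are hypothetical, and fake_model stands in for a real LLM call so the example runs offline.

```python
from pydantic import BaseModel, ValidationError, field_validator

class Contact(BaseModel):
    name: str
    email: str

    @field_validator("email")
    @classmethod
    def validate_email(cls, v: str) -> str:
        if "@" not in v:
            raise ValueError("Invalid email format")
        return v

def extract_with_retries(call_model, messages, max_attempts=3):
    """Call the model, validate its output, and re-ask with the error on failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    history = list(messages)  # copy so the caller's list is not mutated
    for attempt in range(1, max_attempts + 1):
        raw = call_model(history)
        try:
            return Contact.model_validate_json(raw)
        except ValidationError as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last validation error
            # Feed the failed output and the error text back to the model.
            history.append({"role": "assistant", "content": raw})
            history.append({"role": "user", "content": f"Validation error, please fix:\n{exc}"})

calls = []

def fake_model(history):
    """Hypothetical stand-in for an LLM: corrects itself once it sees feedback."""
    calls.append(len(history))
    if any(m["content"].startswith("Validation error") for m in history):
        return '{"name": "Jane Doe", "email": "jane@acme.com"}'
    return '{"name": "Jane Doe", "email": "jane.acme.com"}'

contact = extract_with_retries(
    fake_model,
    [{"role": "user", "content": "Extract the contact: Jane Doe, jane at acme.com"}],
)
print(contact, "after", len(calls), "calls")
```

Passing the validation error back as a message is what lets the second attempt differ from the first. Without that feedback, a retry would simply repeat the same request.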

Use Cases

Extracting structured data from unstructured text (invoices, emails, documents)
Building reliable data pipelines from LLM outputs
Classification and categorization with validated outputs
Content generation with schema-enforced structure
RAG systems requiring structured query decomposition
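
For the classification use case, a constrained schema is what makes the output "validated". The sketch below assumes a hypothetical TicketLabel model with made-up categories, and the JSON strings stand in for model responses. In practice such a class would be passed as response_model.

```python
from typing import Literal
from pydantic import BaseModel, Field, ValidationError

class TicketLabel(BaseModel):
    # Only these labels are accepted; anything else fails validation.
    category: Literal["billing", "bug", "feature_request"]
    # Confidence must lie in the closed interval [0, 1].
    confidence: float = Field(ge=0.0, le=1.0)

# A well-formed (hypothetical) model response.
ok = TicketLabel.model_validate_json('{"category": "bug", "confidence": 0.92}')
print(ok.category, ok.confidence)

# A label outside the allowed set is rejected instead of passing silently.
try:
    TicketLabel.model_validate_json('{"category": "complaint", "confidence": 0.5}')
    rejected = False
except ValidationError:
    rejected = True
print("out-of-set label rejected:", rejected)
```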

References

¹⁾ github.com/jxnl/instructor
²⁾ python.useinstructor.com
