Instructor is a lightweight Python library for extracting structured, validated data from large language models by leveraging Pydantic models and function calling. Rather than parsing free-form text, Instructor patches LLM client libraries to return data conforming to user-defined schemas, with automatic retry logic and validation. Created by Jason Liu (jxnl), it supports 15+ LLM providers (source: github.com/jxnl/instructor).
pip install instructor
Instructor patches provider SDKs (like OpenAI's client) to add a response_model parameter that accepts any Pydantic model. The library handles schema generation, prompt construction, response parsing, and validation automatically.
import instructor
from pydantic import BaseModel
from openai import OpenAI


class Person(BaseModel):
    name: str
    age: int
    occupation: str


client = instructor.from_openai(OpenAI())
person = client.chat.completions.create(
    model="gpt-4o",
    response_model=Person,
    messages=[{"role": "user", "content": "John is a 30-year-old software engineer."}],
)
# Returns: Person(name='John', age=30, occupation='software engineer')
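The schema-generation and validation steps described above can be observed in isolation with Pydantic alone, without any LLM call. The following is a minimal sketch assuming Pydantic v2; the JSON strings are hypothetical stand-ins for model output, and the way Instructor wires these steps together internally is not shown.

```python
from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    name: str
    age: int
    occupation: str


# JSON Schema derived from the model's type annotations.
schema = Person.model_json_schema()
print(sorted(schema["properties"]))

# Hypothetical raw LLM output (an assumption for illustration).
raw_ok = '{"name": "John", "age": 30, "occupation": "software engineer"}'
person = Person.model_validate_json(raw_ok)
print(person)

# Output with a wrong type fails validation instead of passing through silently.
raw_bad = '{"name": "John", "age": "thirty", "occupation": "software engineer"}'
try:
    Person.model_validate_json(raw_bad)
    failed = False
except ValidationError as exc:
    failed = True
    print(exc.error_count(), "validation error(s)")
```

Because validation happens against the same class that defines the schema, the shape requested from the model and the shape checked on return cannot drift apart.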
Instructor builds directly on Pydantic for schema definitions, providing:
The following example extracts structured data from unstructured text with nested models and validation:
# Extract structured contact info with nested models and auto-retry on validation failure
import instructor
from pydantic import BaseModel, field_validator
from openai import OpenAI


class Address(BaseModel):
    street: str
    city: str
    state: str


class Contact(BaseModel):
    name: str
    email: str
    address: Address

    @field_validator("email")
    @classmethod
    def validate_email(cls, v):
        if "@" not in v:
            raise ValueError("Invalid email format")
        return v


client = instructor.from_openai(OpenAI())
contact = client.chat.completions.create(
    model="gpt-4o",
    response_model=Contact,
    max_retries=3,  # auto-retries on validation failure
    messages=[{"role": "user", "content": "Jane Doe, jane@acme.com, lives at 123 Main St, Austin, TX"}],
)
print(contact.model_dump_json(indent=2))
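Conceptually, retrying on validation failure means asking the model again with the validation error included in the conversation. The loop below is a simplified, standard-library-only sketch of that idea and not Instructor's actual implementation. The names fake_llm, validate_contact and extract_with_retries are hypothetical, and fake_llm is a stub that deliberately returns an invalid email on its first answer.

```python
import json


def validate_contact(data: dict) -> dict:
    """Stand-in for Pydantic validation: reject emails without '@'."""
    if "@" not in data.get("email", ""):
        raise ValueError("Invalid email format")
    return data


def fake_llm(messages: list) -> str:
    """Hypothetical stub: answers badly until it sees an error message."""
    saw_error = any("Invalid email" in m["content"] for m in messages)
    email = "jane@acme.com" if saw_error else "jane at acme.com"
    return json.dumps({"name": "Jane Doe", "email": email})


def extract_with_retries(prompt: str, max_retries: int = 3):
    messages = [{"role": "user", "content": prompt}]
    for attempt in range(1, max_retries + 1):
        raw = fake_llm(messages)
        try:
            return validate_contact(json.loads(raw)), attempt
        except ValueError as exc:
            # Feed the validation error back so the next attempt can correct it.
            messages.append({"role": "assistant", "content": raw})
            messages.append(
                {"role": "user", "content": f"Validation failed: {exc}. Please fix."}
            )
    raise RuntimeError(f"No valid response after {max_retries} attempts")


contact, attempts = extract_with_retries("Jane Doe, jane@acme.com")
print(contact, attempts)
```

In this sketch the stub succeeds on the second attempt because the error text reaches it through the message history, which is the feedback mechanism that makes retries more useful than simply repeating the same request.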
Instructor uses a unified interface via instructor.from_provider() or provider-specific patchers: