====== Outlines ======

**Outlines** is an open-source Python library by **.txt** (dottxt) for reliable structured generation with large language models.((https://github.com/dottxt-ai/outlines|GitHub Repository)) With over **14,000 stars** on GitHub, it constrains LLM outputs to conform to JSON schemas, regex patterns, Pydantic models, or context-free grammars. Constraints are compiled ahead of time into token-level masks, so the per-token cost at generation time is close to zero.

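Unlike post-generation parsing, which fails on malformed output, Outlines constrains decoding at inference time by masking next-token logits, so every completed generation matches the requested structure. It works with Hugging Face Transformers, llama.cpp, vLLM, and other local backends. With hosted APIs such as OpenAI, which do not expose logits for masking, it relies on the provider's own structured-output feature instead.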

===== How Structured Generation Works =====

At each decoding step, Outlines masks the model's **next-token logits** so that only tokens consistent with the user-defined structure can be sampled.((https://dottxt-ai.github.io/outlines/latest/|Official Documentation)) Schemas, regex patterns, and grammars are compiled into token-level guides in an upfront compilation step. The guides are then applied autoregressively during generation, adding only microseconds of overhead per token.

This rules out invalid outputs such as malformed JSON, and with them the retries and parsing failures common in unconstrained generation.
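
The compile-then-mask process described above can be illustrated with a toy example. This is a minimal sketch and not Outlines' actual implementation: the hand-written state machine for the pattern ''\d{3}-\d{4}'', the tiny vocabulary, and the helper names (''step'', ''compile_index'', ''mask_logits'', ''fake_logits'') are all assumptions made for illustration, and a stub stands in for the model.

<code python>
import math

# Toy constraint: the pattern \d{3}-\d{4}, hand-coded as a state machine.
# State n means "n characters of the pattern have been matched so far".
DASH_POS = 3
FINAL_STATE = 8
EOS = "<eos>"

# Hypothetical vocabulary; real tokenizers have tens of thousands of entries.
VOCAB = ["hello", "555", "-", "0199", "42", "7", " ", EOS]


def step(state, ch):
    """Advance the state machine by one character; None means rejected."""
    if state >= FINAL_STATE:
        return None
    if state == DASH_POS:
        return state + 1 if ch == "-" else None
    return state + 1 if ch in "0123456789" else None


def compile_index(vocab):
    """Upfront step: for each state, map allowed token ids to the next state."""
    index = {}
    for state in range(FINAL_STATE + 1):
        allowed = {}
        for token_id, token in enumerate(vocab):
            if token == EOS:
                if state == FINAL_STATE:  # may only stop once the pattern is complete
                    allowed[token_id] = state
                continue
            current = state
            for ch in token:
                current = step(current, ch)
                if current is None:
                    break
            if current is not None:
                allowed[token_id] = current
        index[state] = allowed
    return index


def fake_logits(vocab):
    """Stand-in for a model: scores longer tokens higher, so it prefers 'hello'."""
    return [0.0 if tok == EOS else float(len(tok)) for tok in vocab]


def mask_logits(logits, allowed_ids):
    """Per-token step: disallowed tokens get -inf and can never be picked."""
    return [x if i in allowed_ids else -math.inf for i, x in enumerate(logits)]


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def generate(vocab, index):
    """Greedy decoding with the mask applied at every step."""
    state, pieces = 0, []
    while True:
        token_id = argmax(mask_logits(fake_logits(vocab), index[state]))
        if vocab[token_id] == EOS:
            return "".join(pieces)
        pieces.append(vocab[token_id])
        state = index[state][token_id]


index = compile_index(VOCAB)
print("unconstrained first token:", VOCAB[argmax(fake_logits(VOCAB))])  # hello
print("constrained output:", generate(VOCAB, index))  # 555-0199
</code>

The index is built once per pattern and vocabulary. During generation each step is only a dictionary lookup plus a mask, which is why the per-token cost stays small. The single-digit token ''7'' keeps this toy vocabulary free of dead ends; a real implementation has to handle that case explicitly.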

===== Key Features =====

  * **JSON schema compliance** — Generate valid JSON matching any schema or Pydantic model
  * **Regex constraints** — Guide output to match any regular expression pattern
  * **Context-free grammars** — Constrain output to a formal grammar, for nested or recursive structures
  * **Near-zero runtime overhead** — Constraints are compiled once and applied in microseconds per token
  * **Model-agnostic** — OpenAI, Transformers, llama.cpp, vLLM, ExLlamaV2, mlx-lm
  * **Minimal abstractions** — Integrates with Python control flow, no framework lock-in
  * **Robust prompting** — Template system for few-shot, ReAct, and agent patterns((https://dottxt-ai.github.io/outlines/welcome/|Getting Started Guide))

===== Installation and Usage =====

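<code python>
# Install Outlines plus the backends used below
# pip install outlines openai transformers torch

from typing import List, Literal

import openai
import outlines
from outlines.types import Regex
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# NOTE: this example assumes the Outlines 1.x interface (from_openai,
# from_transformers, Generator). The pre-1.0 interface used
# outlines.models.* and outlines.generate.* instead.

# --- Structured output with Pydantic and OpenAI ---
class Customer(BaseModel):
    name: str
    urgency: Literal["high", "medium", "low"]
    issue: str

client = openai.OpenAI()  # reads OPENAI_API_KEY from the environment
model = outlines.from_openai(client, "gpt-4o")
raw = model("Alice needs help with login issues ASAP", Customer)
# The call returns a JSON string; validate it into a Customer object
customer = Customer.model_validate_json(raw)

# --- Regex-constrained generation with a local model ---
model_name = "mistralai/Mistral-7B-v0.1"
local_model = outlines.from_transformers(
    AutoModelForCausalLM.from_pretrained(model_name),
    AutoTokenizer.from_pretrained(model_name),
)

# The output is constrained to the US phone number pattern
phone_pattern = Regex(r"\(\d{3}\) \d{3}-\d{4}")
phone = local_model("Please provide a US phone number: ", phone_pattern)

# --- JSON generation from a schema ---
class Task(BaseModel):
    title: str
    priority: Literal["high", "medium", "low"]
    tags: List[str]

# A Generator compiles the constraint once and can be reused across prompts
task_gen = outlines.Generator(local_model, Task)
raw_task = task_gen("Create a task for fixing the login bug", max_new_tokens=200)
task = Task.model_validate_json(raw_task)
</code>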

===== Architecture =====

<code>
%%{init: {'theme': 'dark'}}%%
graph TB
    Dev([Developer]) -->|Schema / Regex / Grammar| Compiler[Outlines Compiler]
    Compiler -->|Token Constraints| Guide[Token Guide]
    Dev -->|Prompt| Model[LLM Backend]
    Model -->|Next-Token Logits| Mask[Logit Masking]
    Guide -->|Valid Token Set| Mask
    Mask -->|Constrained Logits| Model
    Model -->|Output| Result[Structured Output]
    subgraph Backends
        OAI[OpenAI]
        HF[Transformers]
        LLAMA[llama.cpp]
        VLLM[vLLM]
    end
    Model --- Backends
    subgraph ConstraintTypes["Constraint Types"]
        JSON[JSON Schema]
        Regex[Regex Pattern]
        Pydantic[Pydantic Model]
        CFG[Context-Free Grammar]
    end
    Compiler --- ConstraintTypes
</code>

===== Comparison with Alternatives =====

^ Feature ^ Outlines ^ Guidance ^ Instructor ^ Raw Prompting ^
| Guaranteed structure | Yes | Yes | Retry-based | No |
| Runtime overhead | Near-zero | Low | Medium (retries) | None |
| Model support | Many backends | Local models | OpenAI-focused | Any |
| Constraint types | JSON, Regex, CFG | Select, Regex, CFG | JSON only | None |
| Framework dependency | None | Guidance | OpenAI SDK | None |
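
The "Retry-based" entry in the table can be pictured as a validate-and-retry loop around an unconstrained model. The sketch below is a generic illustration and not Instructor's actual implementation: ''make_flaky_model'', ''generate_with_retries'', and the canned responses are hypothetical, and a stub replaces the real model call.

<code python>
import json

REQUIRED_KEYS = {"title", "priority"}


def make_flaky_model(responses):
    """Hypothetical model stub that returns canned responses in order."""
    remaining = iter(responses)

    def model(prompt):
        return next(remaining)

    return model


def generate_with_retries(model, prompt, max_retries=3):
    """Call the model until its output parses and has the required keys."""
    for attempt in range(1, max_retries + 1):
        text = model(prompt)
        try:
            data = json.loads(text)
        except json.JSONDecodeError:
            continue  # malformed JSON costs another full generation
        if isinstance(data, dict) and REQUIRED_KEYS <= data.keys():
            return data, attempt
    raise ValueError(f"no valid output after {max_retries} attempts")


model = make_flaky_model([
    '{"title": "Fix login bug", "priority": ',        # truncated JSON
    '{"title": "Fix login bug"}',                      # valid JSON, missing key
    '{"title": "Fix login bug", "priority": "high"}',  # valid
])
task, attempts = generate_with_retries(model, "Create a task for fixing the login bug")
print(task["priority"], attempts)  # high 3
</code>

Each failed attempt costs a full generation, which is the "Medium (retries)" overhead in the table. Constrained decoding avoids the loop because invalid tokens are never sampled in the first place.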


===== See Also =====

  * [[guidance|Guidance — Microsoft's Structured Generation Language]]
  * [[promptfoo|Promptfoo — LLM Evaluation and Red Teaming]]
  * [[deepeval|DeepEval — Unit-Test Style LLM Evaluation]]

===== References =====