Guidance is an open-source Python library by Microsoft for controlling and constraining the outputs of large language models. With over 21,000 stars on GitHub, it implements constrained decoding — steering token generation at the inference layer to guarantee outputs match specified formats like JSON, Python, HTML, SQL, and more.1)2)3)
Rather than relying on prompt engineering, retry loops, or post-processing, Guidance enforces structural constraints directly during model inference, guaranteeing 100% structural validity of the output while reducing latency and cost by 30-50% compared to conventional prompting techniques.
Guidance implements constrained decoding by steering the language model token by token during inference. Instead of generating text freely and hoping it matches a desired format, Guidance manipulates the token probability distribution at each step to ensure only valid tokens can be selected.
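The token-level masking described above can be illustrated with a small, library-independent sketch. This is not Guidance's implementation: the vocabulary, the `fake_logits` scores, and the helper names `mask_logits` and `constrained_sample` are all hypothetical, chosen only to show how disallowed tokens end up with zero probability before sampling.

```python
import math
import random

# Toy vocabulary (assumption); a real tokenizer has tens of thousands of entries.
VOCAB = ["yes", "no", "maybe", "{", "}", "hello", "42"]


def softmax(logits):
    """Convert raw scores to probabilities; a score of -inf maps to probability 0."""
    peak = max(x for x in logits if x != -math.inf)
    exps = [math.exp(x - peak) if x != -math.inf else 0.0 for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mask_logits(logits, allowed):
    """Set the score of every token outside `allowed` to -inf."""
    return [x if tok in allowed else -math.inf for tok, x in zip(VOCAB, logits)]


def constrained_sample(logits, allowed, rng):
    """Sample one token, restricted to the allowed set."""
    probs = softmax(mask_logits(logits, allowed))
    return rng.choices(VOCAB, weights=probs, k=1)[0]


# Hypothetical model scores: left unconstrained, the model would favor "hello".
fake_logits = [1.0, 0.5, 0.2, 0.1, 0.1, 3.0, 0.3]
allowed = {"yes", "no"}

rng = random.Random(0)
samples = [constrained_sample(fake_logits, allowed, rng) for _ in range(200)]
masked_probs = softmax(mask_logits(fake_logits, allowed))

print(set(samples))  # only tokens from the allowed set appear
```

Because the mask is applied before sampling, an invalid token cannot be selected no matter how strongly the model prefers it. In a real system the allowed set would be recomputed at every step from a grammar or schema.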
The library batches any text the user appends as execution unfolds together with the generated text, so the entire program behaves like a single API call rather than a series of sequential calls. Because the output is valid by construction, expensive retries and fine-tuning for format compliance become unnecessary.
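As a rough illustration of why no retry loop is needed, the sketch below interleaves fixed template text with constrained slots in a single pass. It is a toy under stated assumptions, not Guidance's API: `run_template`, `pick_first`, and the segment format are hypothetical, and `pick_first` stands in for a model by always choosing the alphabetically first option.

```python
import json


def pick_first(prefix, options):
    """Hypothetical stand-in for a model: returns the alphabetically first option."""
    return sorted(options)[0]


def run_template(segments, choose):
    """Build the output in one pass over fixed text and constrained slots.

    A plain string is appended verbatim without consulting the model.
    A tuple of options is a slot: `choose` picks exactly one of them.
    """
    text = ""
    constrained_steps = 0
    for seg in segments:
        if isinstance(seg, str):
            text += seg  # fixed text supplied by the user
        else:
            text += choose(text, seg)  # model-chosen text, limited to the options
            constrained_steps += 1
    return text, constrained_steps


segments = [
    '{"sentiment": "',
    ("positive", "negative"),
    '", "score": ',
    ("1", "2", "3"),
    "}",
]

result, steps = run_template(segments, pick_first)
parsed = json.loads(result)  # parses on the first attempt; no retry loop needed
print(result)
```

The fixed segments carry the JSON punctuation, so the only freedom left to the model is which option fills each slot. Any combination of choices yields a parseable result.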