DSPy

DSPy (Declarative Self-improving Python) is a framework developed by Stanford NLP for programming — not prompting — language models. Rather than manually crafting prompts, DSPy lets developers define the behavior of LM-powered programs through structured signatures, composable modules, and algorithmic optimizers that automatically tune prompts, few-shot examples, and even model weights.

As of 2025, DSPy has reached version 3.0 and represents a paradigm shift in how developers build LM applications — treating language models as programmable modules analogous to layers in a neural network.

Core Concepts

Signatures define the input/output contract for a language model call, similar to type annotations. Instead of writing a prompt, you declare what goes in and what comes out. DSPy automatically expands signatures into optimized prompts.

Modules are composable building blocks that implement specific LM invocation strategies. Built-in modules include ChainOfThought, ReAct, ProgramOfThought, and MultiChainComparison. Developers compose modules into programs like building blocks.

Optimizers (formerly called teleprompters or compilers) algorithmically tune the entire program. Given a metric and a small set of training examples (as few as 10-20), optimizers like BootstrapFewShot, MIPRO, and MIPROv2 automatically select the best instructions, few-shot demonstrations, and configurations. This eliminates manual prompt engineering.

Programming vs. Prompting

The fundamental insight of DSPy is that prompt engineering is brittle and non-transferable. When you change your LLM, pipeline, or data, hand-tuned prompts break. DSPy addresses this by separating program logic (signatures and modules) from the prompts themselves, and by re-running its optimizers to automatically re-tune instructions and demonstrations for the new model or data.

This approach can yield significant improvements: optimizers such as MIPROv2 target performance gains of 20% or more on representative tasks, using only limited labeled data.

Code Example

import dspy
 
# Configure the language model
lm = dspy.LM('openai/gpt-4o-mini')
dspy.configure(lm=lm)
 
# Define a signature for question answering
class GenerateAnswer(dspy.Signature):
    "Answer questions with short factoid answers."
    question: str = dspy.InputField()
    answer: str = dspy.OutputField(desc='often between 1 and 5 words')
 
# Create a module using ChainOfThought
qa = dspy.ChainOfThought(GenerateAnswer)
 
# Use it directly
pred = qa(question='What is the capital of France?')
print(pred.answer)  # Paris
 
# Optimize with training data and a metric
from dspy.teleprompt import BootstrapFewShot
 
def accuracy_metric(example, pred, trace=None):
    return example.answer.lower() == pred.answer.lower()
 
# A small training set of dspy.Example objects (inputs marked explicitly)
trainset = [
    dspy.Example(question='What is the capital of France?', answer='Paris').with_inputs('question'),
    dspy.Example(question='Who wrote 1984?', answer='George Orwell').with_inputs('question'),
]

optimizer = BootstrapFewShot(metric=accuracy_metric, max_bootstrapped_demos=4)
compiled_qa = optimizer.compile(qa, trainset=trainset)

How DSPy Differs from LangChain

Paradigm: DSPy uses declarative programming with automatic optimization; LangChain uses chain-based orchestration with manual prompts.

Prompt engineering: DSPy eliminates it via optimizers; LangChain leaves it manual and user-managed.

Portability: DSPy programs transfer across LMs; LangChain prompts are model-specific.

Abstraction: DSPy builds on signatures and modules; LangChain on chains and agents.

Optimization: DSPy ships built-in algorithmic compilers; LangChain has no native self-improving capability.

Philosophy: DSPy treats LMs like neural network layers; LangChain treats LMs like APIs to chain together.
