Self-Refine, introduced by Madaan et al. (2023), is a framework that improves LLM outputs through iterative self-feedback: the same model generates an initial output, critiques it for weaknesses, and refines it based on that feedback. No external training, reinforcement learning, or additional models are required; the approach works purely at inference time with a single LLM.
Humans rarely produce perfect first drafts; we revise through iterative critique and improvement. Self-Refine brings this natural revision process to LLMs. While a model's initial output may contain errors, the same model can often identify those issues when explicitly prompted to critique, and fix them when prompted to revise. This leverages an asymmetry: evaluation is often easier than generation.
Self-Refine iterates three steps (initial generation, feedback, and refinement) until convergence:
$$y_0 = \text{LLM}(x), \quad f_t = \text{LLM}_{\text{fb}}(x, y_t), \quad y_{t+1} = \text{LLM}_{\text{refine}}(x, y_t, f_t)$$
The loop repeats for $T$ iterations or until the model indicates no further improvements are possible.
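A minimal sketch of this loop in Python, assuming a `model` object that exposes a single `generate(prompt) -> str` method and plays all three roles: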
```python
def self_refine(model, task_prompt, max_iters=3):
    """Iterative self-refinement loop."""
    output = model.generate(task_prompt)
    for i in range(max_iters):
        critique = model.generate(
            f"Review and identify specific issues:\n"
            f"Task: {task_prompt}\nOutput: {output}"
        )
        if "no improvements needed" in critique.lower():
            break
        output = model.generate(
            f"Improve based on feedback:\n"
            f"Task: {task_prompt}\nOutput: {output}\n"
            f"Feedback: {critique}"
        )
    return output
```
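For concreteness, here is one hypothetical adapter that puts an OpenAI-style chat client behind the assumed `generate()` interface; the model name and client setup are illustrative, not part of the original paper:

```python
from openai import OpenAI  # assumes OPENAI_API_KEY is set in the environment


class ChatModel:
    """Thin adapter exposing the generate() interface expected by self_refine()."""

    def __init__(self, model_name="gpt-4o-mini"):  # illustrative model choice
        self.client = OpenAI()
        self.model_name = model_name

    def generate(self, prompt: str) -> str:
        # Single-turn chat completion; the whole prompt goes in one user message.
        response = self.client.chat.completions.create(
            model=self.model_name,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content


refined = self_refine(ChatModel(), "Write a one-paragraph summary of Self-Refine.")
print(refined)
```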
Two mechanisms determine when to stop: a fixed iteration budget ($T$, or `max_iters` in the code above), and a stop signal in the feedback itself, where the critique indicates that no further improvements are needed. A slightly more robust version of the second check is sketched below.
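The substring check in the loop above is brittle. One common alternative (an assumption here, not prescribed by the paper) is to ask the feedback prompt for an explicit verdict line and parse it:

```python
def critique_with_verdict(model, task_prompt, output):
    """Request feedback plus an explicit STOP/CONTINUE verdict, then parse it."""
    critique = model.generate(
        f"Review the output and list specific issues.\n"
        f"Task: {task_prompt}\nOutput: {output}\n"
        f"End your reply with a single line: 'VERDICT: STOP' if the output "
        f"needs no changes, otherwise 'VERDICT: CONTINUE'."
    )
    should_stop = "VERDICT: STOP" in critique.upper()
    return critique, should_stop
```

Inside `self_refine`, the `break` condition would then test `should_stop` rather than searching the free-form critique for a fixed phrase.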
Self-Refine was evaluated on 7 diverse tasks, with improvements of roughly 5-40% over direct single-pass generation:
| Task | Improvement |
|------|-------------|
| Code optimization | ~31% relative gain |
| Sentiment reversal | ~9% absolute gain |
| Math reasoning | Consistent improvement |
| Dialogue response | Quality gains in coherence |
| Code readability | Multi-iteration gains |
| Acronym generation | Strong human preference |
| Review rewriting | ~20% average improvement |
Results hold across GPT-3.5 and GPT-4, with quality typically improving monotonically over 2-3 rounds. Human evaluators consistently prefer Self-Refine outputs over single-pass generations.
Self-Refine demonstrates that significant quality improvements are achievable without any additional training, purely through structured inference-time computation. This suggests that frontier model capabilities are underutilized by single-pass generation, and that iterative refinement is a general-purpose method for extracting better performance.