====== Self-Refine ======

**Self-Refine**, introduced by Madaan et al. (2023), is a framework that improves LLM outputs through iterative self-feedback: the same model **generates** an initial output, **critiques** it for weaknesses, and **refines** it based on that feedback. No external training, reinforcement learning, or additional models are required --- the approach works purely at inference time with a single LLM.

===== Motivation =====

Humans rarely produce perfect first drafts --- we revise through iterative critique and improvement. Self-Refine brings this revision process to LLMs. While a model's initial output may contain errors, the same model can often //identify// those issues when explicitly prompted to critique them, and //fix// them when prompted to revise. This exploits an asymmetry: evaluating an output is often easier than generating it.

===== The Generate-Critique-Refine Loop =====

Self-Refine operates in three steps, iterated until convergence:

  - **Generate**: The LLM produces an initial output $y_0$ given a task prompt $x$.
  - **Critique**: The LLM provides multi-aspect feedback $f_t$ on the current output $y_t$, identifying specific weaknesses.
  - **Refine**: The LLM generates an improved output $y_{t+1}$ conditioned on $x$, $y_t$, and $f_t$.

$$y_0 = \text{LLM}(x), \quad f_t = \text{LLM}_{\text{fb}}(x, y_t), \quad y_{t+1} = \text{LLM}_{\text{refine}}(x, y_t, f_t)$$

The loop repeats for $T$ iterations or until the model indicates no further improvements are possible.
<code python>
def self_refine(model, task_prompt, max_iters=3):
    """Iterative self-refinement loop: generate, critique, refine."""
    output = model.generate(task_prompt)
    for i in range(max_iters):
        # Critique: ask the same model for specific weaknesses.
        critique = model.generate(
            f"Review and identify specific issues:\n"
            f"Task: {task_prompt}\nOutput: {output}"
        )
        # Self-assessment stopping criterion.
        if "no improvements needed" in critique.lower():
            break
        # Refine: condition on the task, the current output, and the feedback.
        output = model.generate(
            f"Improve based on feedback:\n"
            f"Task: {task_prompt}\nOutput: {output}\n"
            f"Feedback: {critique}"
        )
    return output
</code>

===== Key Design Principles =====

  * **Single model**: The same LLM serves as generator, critic, and refiner.
  * **No training**: Operates purely at inference time via prompting.
  * **Task-agnostic**: Works across diverse tasks with task-specific critique templates.
  * **Multi-aspect feedback**: The critique evaluates multiple dimensions (correctness, style, completeness).

===== Stopping Criteria =====

Two mechanisms determine when to stop:

  * **Self-assessment**: The model states that no further improvements are possible.
  * **Fixed budget**: A maximum iteration count (typically 2-4), balancing quality against latency.

===== Experimental Results =====

Self-Refine was evaluated on **7 diverse tasks**, with improvements of **5-40%** over direct generation:

^ Task ^ Improvement ^
| Code optimization | ~31% relative gain |
| Sentiment reversal | ~9% absolute gain |
| Math reasoning | Consistent improvement |
| Dialogue response | Quality gains in coherence |
| Code readability | Multi-iteration gains |
| Acronym generation | Strong human preference |
| Review rewriting | ~20% average improvement |

Results hold across GPT-3.5 and GPT-4, with quality improving monotonically over 2-3 rounds. Human evaluators consistently prefer Self-Refine outputs.
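The task-specific, multi-aspect critique templates mentioned under Key Design Principles can be realized as simple prompt builders. A minimal sketch follows; the aspect lists, task-type names, and prompt wording here are illustrative assumptions, not taken from the paper:

```python
# Illustrative multi-aspect critique templates. The aspect lists and
# wording are assumptions for this sketch, not from the Self-Refine paper.
CRITIQUE_ASPECTS = {
    "code": ["correctness", "efficiency", "readability"],
    "writing": ["clarity", "coherence", "completeness"],
}

def build_critique_prompt(task_type, task_prompt, output):
    """Build a multi-aspect critique prompt for the given task type."""
    aspects = CRITIQUE_ASPECTS[task_type]
    lines = [f"Task: {task_prompt}", f"Output: {output}", ""]
    for aspect in aspects:
        lines.append(f"- Evaluate {aspect} and list specific issues.")
    # Give the model an explicit way to trigger the self-assessment stop.
    lines.append("If nothing needs fixing, reply 'no improvements needed'.")
    return "\n".join(lines)
```

Such a builder would replace the generic critique string in the loop above, letting each task supply its own evaluation dimensions while the refinement logic stays unchanged.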
===== Limitations =====

  * Performance depends on the model's self-critique ability --- weaker models may not reliably identify their own errors.
  * Each iteration adds latency (roughly 3x the tokens per refinement round).
  * Returns diminish after 2-3 iterations on most tasks.
  * Refinement occasionally regresses, introducing new errors.

===== Significance =====

Self-Refine demonstrates that significant quality improvements are achievable //without any additional training// --- purely through structured inference-time computation. This suggests that single-pass generation underutilizes frontier model capabilities, and that iterative refinement is a general-purpose method for extracting better performance.

===== References =====

  * [[https://arxiv.org/abs/2303.17651|Madaan et al. "Self-Refine: Iterative Refinement with Self-Feedback" (2023)]]
  * [[https://arxiv.org/abs/2303.11366|Shinn et al. "Reflexion: Language Agents with Verbal Reinforcement Learning" (2023)]]
  * [[https://arxiv.org/abs/2304.05128|Chen et al. "Teaching Large Language Models to Self-Debug" (2023)]]

===== See Also =====

  * [[chain_of_verification|Chain-of-Verification (CoVe)]]
  * [[constitutional_ai|Constitutional AI]]
  * [[cognitive_architectures_language_agents|Cognitive Architectures for Language Agents]]