Self-Refine

Self-Refine, introduced by Madaan et al. (2023), is a framework that improves LLM outputs through iterative self-feedback: the same model generates an initial output, critiques it for weaknesses, and refines it based on that feedback. No external training, reinforcement learning, or additional models are required; the approach works purely at inference time with a single LLM.1)

Motivation

Humans rarely produce perfect first drafts; we revise through iterative critique and improvement. Self-Refine brings this natural revision process to LLMs. While a model's initial output may contain errors, the same model can often identify those issues when explicitly prompted to critique, and fix them when prompted to revise. This leverages an asymmetry: evaluating an output is often easier than generating it.

The Generate-Critique-Refine Loop

Self-Refine operates in three steps, iterated until convergence:

  1. Generate: The LLM produces an initial output $y_0$ given a task prompt $x$
  2. Critique: The LLM provides multi-aspect feedback $f_t$ on the current output $y_t$, identifying specific weaknesses
  3. Refine: The LLM generates an improved output $y_{t+1}$ conditioned on $x$, $y_t$, and $f_t$

$$y_0 = \text{LLM}(x), \quad f_t = \text{LLM}_{\text{fb}}(x, y_t), \quad y_{t+1} = \text{LLM}_{\text{refine}}(x, y_t, f_t)$$

The loop repeats for $T$ iterations or until the model indicates no further improvements are possible.

def self_refine(model, task_prompt, max_iters=3):
    """Iterative self-refinement loop."""
    # Step 1: generate the initial output y_0.
    output = model.generate(task_prompt)

    for i in range(max_iters):
        # Step 2: critique the current output y_t.
        critique = model.generate(
            f"Review and identify specific issues:\n"
            f"Task: {task_prompt}\nOutput: {output}"
        )
        if "no improvements needed" in critique.lower():
            break
        # Step 3: refine y_t into y_{t+1} conditioned on the feedback.
        output = model.generate(
            f"Improve based on feedback:\n"
            f"Task: {task_prompt}\nOutput: {output}\n"
            f"Feedback: {critique}"
        )
    return output
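As a runnable illustration of the loop, the listing above can be exercised with a stub model. MockModel is a hypothetical stand-in for a real LLM client (not part of the original), returning one canned critique and then the stop phrase; self_refine is reproduced so the example is self-contained:

```python
def self_refine(model, task_prompt, max_iters=3):
    """Iterative self-refinement loop (as in the listing above)."""
    output = model.generate(task_prompt)
    for i in range(max_iters):
        critique = model.generate(
            f"Review and identify specific issues:\n"
            f"Task: {task_prompt}\nOutput: {output}"
        )
        if "no improvements needed" in critique.lower():
            break
        output = model.generate(
            f"Improve based on feedback:\n"
            f"Task: {task_prompt}\nOutput: {output}\n"
            f"Feedback: {critique}"
        )
    return output

class MockModel:
    """Hypothetical stub: one critique flagging an issue, then a stop signal."""
    def __init__(self):
        self.calls = 0

    def generate(self, prompt):
        self.calls += 1
        if prompt.startswith("Review"):
            # The second call is the first critique; later critiques converge.
            return "Variable names are unclear." if self.calls == 2 else "No improvements needed."
        if prompt.startswith("Improve"):
            return "refined draft"
        return "initial draft"

result = self_refine(MockModel(), "Write a sorting function.")
print(result)  # refined draft
```

The stub terminates after one critique-refine round via the stop signal, without exhausting the iteration budget.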

Key Design Principles

Stopping Criteria

Two mechanisms determine when to stop:

  1. Fixed budget: the loop runs for at most $T$ iterations
  2. Stop signal: the feedback step indicates that no further improvements are possible
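As a minimal sketch, both checks can be combined in one predicate. The should_stop helper and the sentinel phrase are illustrative assumptions, not part of the original framework:

```python
MAX_ITERS = 3  # fixed iteration budget T (illustrative value)

def should_stop(iteration, feedback):
    """Stop when the budget T is exhausted or the model signals convergence."""
    budget_exhausted = iteration >= MAX_ITERS
    # Assumes the feedback prompt asks the model for this sentinel when satisfied.
    converged = "no improvements needed" in feedback.lower()
    return budget_exhausted or converged

print(should_stop(1, "Tighten the second paragraph."))  # False
print(should_stop(1, "No improvements needed."))        # True
```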

Experimental Results

Self-Refine was evaluated on 7 diverse tasks with improvements of 5-40% over direct generation:2)

Task               | Improvement
-------------------|---------------------------
Code optimization  | ~31% relative gain
Sentiment reversal | ~9% absolute gain
Math reasoning     | Consistent improvement
Dialogue response  | Quality gains in coherence
Code readability   | Multi-iteration gains
Acronym generation | Strong human preference
Review rewriting   | ~20% average improvement

Results hold across GPT-3.5 and GPT-4, with quality improving monotonically over 2-3 rounds. Human evaluators consistently prefer Self-Refine outputs.

Limitations

Because the same model both produces and critiques the output, Self-Refine inherits that model's blind spots: errors the model cannot recognize go uncorrected. Each round also adds inference cost, since every iteration requires an additional feedback call and refinement call.

Significance

Self-Refine demonstrates that significant quality improvements are achievable without any additional training, purely through structured inference-time computation. This suggests that frontier model capabilities are underutilized by single-pass generation, and that iterative refinement is a general-purpose method for extracting better performance.

See Also

References