====== Self-Refine ======

**Self-Refine**, introduced by Madaan et al. (2023), is a framework that improves LLM outputs through iterative self-feedback: the same model **generates** an initial output, **critiques** it for weaknesses, and **refines** it based on that feedback. No external training, reinforcement learning, or additional models are required --- the approach works purely at inference time with a single LLM.

===== Motivation =====

Humans rarely produce perfect first drafts --- we revise through iterative critique and improvement. Self-Refine brings this revision process to LLMs. While a model's initial output may contain errors, the same model can often //identify// those issues when explicitly prompted to critique them, and //fix// them when prompted to revise. This exploits an asymmetry: evaluating an output is often easier than generating it.

===== The Generate-Critique-Refine Loop =====

Self-Refine operates in three steps, iterated until convergence:

  - **Generate**: The LLM produces an initial output $y_0$ given a task prompt $x$.
  - **Critique**: The LLM provides multi-aspect feedback $f_t$ on the current output $y_t$, identifying specific weaknesses.
  - **Refine**: The LLM generates an improved output $y_{t+1}$ conditioned on $x$, $y_t$, and $f_t$.

$$y_0 = \text{LLM}(x), \quad f_t = \text{LLM}_{\text{fb}}(x, y_t), \quad y_{t+1} = \text{LLM}_{\text{refine}}(x, y_t, f_t)$$

The loop repeats for $T$ iterations or until the model indicates no further improvements are possible.
<code python>
def self_refine(model, task_prompt, max_iters=3):
    """Iterative self-refinement loop: generate, critique, refine."""
    output = model.generate(task_prompt)
    for i in range(max_iters):
        # Critique: ask the same model for specific weaknesses.
        critique = model.generate(
            f"Review and identify specific issues:\n"
            f"Task: {task_prompt}\nOutput: {output}"
        )
        # Self-assessment stopping criterion.
        if "no improvements needed" in critique.lower():
            break
        # Refine: condition on the task, the current output, and the feedback.
        output = model.generate(
            f"Improve based on feedback:\n"
            f"Task: {task_prompt}\nOutput: {output}\n"
            f"Feedback: {critique}"
        )
    return output
</code>

===== Key Design Principles =====

  * **Single model**: The same LLM serves as generator, critic, and refiner.
  * **No training**: Operates purely at inference time via prompting.
  * **Task-agnostic**: Works across diverse tasks with task-specific critique templates.
  * **Multi-aspect feedback**: The critique evaluates multiple dimensions (correctness, style, completeness).

===== Stopping Criteria =====

Two mechanisms determine when to stop:

  * **Self-assessment**: The model states that no further improvements are possible.
  * **Fixed budget**: A maximum iteration count (typically 2-4), balancing quality against latency.

===== Experimental Results =====

Self-Refine was evaluated on **7 diverse tasks**, with improvements of **5-40%** over direct generation:

^ Task ^ Improvement ^
| Code optimization | ~31% relative gain |
| Sentiment reversal | ~9% absolute gain |
| Math reasoning | Consistent improvement |
| Dialogue response | Quality gains in coherence |
| Code readability | Multi-iteration gains |
| Acronym generation | Strong human preference |
| Review rewriting | ~20% average improvement |

Results hold across GPT-3.5 and GPT-4, with quality improving monotonically over 2-3 rounds. Human evaluators consistently prefer Self-Refine outputs.
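The task-specific, multi-aspect critique templates mentioned under Key Design Principles can be realized as simple prompt builders. A minimal sketch follows; the aspect lists, task-type names, and prompt wording here are illustrative assumptions, not taken from the paper:

```python
# Illustrative multi-aspect critique templates. The aspect lists and
# wording are assumptions for this sketch, not from the Self-Refine paper.
CRITIQUE_ASPECTS = {
    "code": ["correctness", "efficiency", "readability"],
    "writing": ["clarity", "coherence", "completeness"],
}

def build_critique_prompt(task_type, task_prompt, output):
    """Build a multi-aspect critique prompt for the given task type."""
    aspects = CRITIQUE_ASPECTS[task_type]
    lines = [f"Task: {task_prompt}", f"Output: {output}", ""]
    for aspect in aspects:
        lines.append(f"- Evaluate {aspect} and list specific issues.")
    # Give the model an explicit way to trigger the self-assessment stop.
    lines.append("If nothing needs fixing, reply 'no improvements needed'.")
    return "\n".join(lines)
```

Such a builder would replace the generic critique string in the loop above, letting each task supply its own evaluation dimensions while the refinement logic stays unchanged.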
===== Limitations =====

  * Performance depends on the model's self-critique ability --- weaker models may not reliably identify their own errors.
  * Each iteration adds latency (roughly 3x the tokens per refinement round).
  * Returns diminish after 2-3 iterations on most tasks.
  * Refinement occasionally regresses, introducing new errors.

===== Significance =====

Self-Refine demonstrates that significant quality improvements are achievable //without any additional training// --- purely through structured inference-time computation. This suggests that single-pass generation underutilizes frontier model capabilities, and that iterative refinement is a general-purpose method for extracting better performance.

===== References =====

  * [[https://arxiv.org/abs/2303.17651|Madaan et al. "Self-Refine: Iterative Refinement with Self-Feedback" (2023)]]
  * [[https://arxiv.org/abs/2303.11366|Shinn et al. "Reflexion: Language Agents with Verbal Reinforcement Learning" (2023)]]
  * [[https://arxiv.org/abs/2304.05128|Chen et al. "Teaching Large Language Models to Self-Debug" (2023)]]

===== See Also =====

  * [[chain_of_verification|Chain-of-Verification (CoVe)]]
  * [[constitutional_ai|Constitutional AI]]
  * [[cognitive_architectures_language_agents|Cognitive Architectures for Language Agents]]