Automated Program Repair

LLM-powered agents are extending automated program repair (APR) beyond static code rewriting to interactive debugging, context-aware refinement, and specification-centric repair. This page covers InspectCoder's debugger collaboration, REFINE's patch refinement framework, and VibeRepair's specification-centric approach.

The APR Challenge

Automated program repair aims to fix software bugs automatically, given issue descriptions, codebases, and test suites. LLM-based APR faces several challenges:

InspectCoder: Dynamic Analysis with Debugger Collaboration

InspectCoder (arXiv:2510.18327) is presented as the first agentic program repair system that lets LLMs actively conduct dynamic analysis through interactive debugger control.
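
To make "dynamic analysis" concrete, the sketch below captures the local variables of a function at a chosen line while a failing input runs. It is not InspectCoder's implementation: it uses Python's standard `sys.settrace` hook as a stand-in for a debugger breakpoint, and `capture_locals`, `buggy_mean`, and the line-offset convention are hypothetical names chosen for illustration.

```python
import sys


def capture_locals(func, args, target_offsets):
    """Run func(*args) and snapshot its locals just before each target line runs.

    target_offsets are line offsets relative to the `def` line (offset 0).
    This plays the role of a breakpoint plus a "print locals" debugger command.
    """
    snapshots = []
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace frames of other functions
        if event == "line":
            offset = frame.f_lineno - code.co_firstlineno
            if offset in target_offsets:
                snapshots.append((offset, dict(frame.f_locals)))
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    except Exception as exc:  # keep the failure so the caller can inspect it
        result = exc
    finally:
        sys.settrace(None)
    return result, snapshots


def buggy_mean(values):
    total = 0
    for v in values:
        total += v
    return total / (len(values) - 1)  # bug: should divide by len(values)


# "Breakpoint" on the return statement (offset 4 from the def line).
result, snapshots = capture_locals(buggy_mean, ([2, 4, 6],), target_offsets={4})
print("result:", result)
for offset, local_vars in snapshots:
    print("offset", offset, "locals", local_vars)
```

The snapshot shows `total` equal to 12 for three inputs while the function returns 6.0, which points at the divisor rather than the accumulation loop. A static reading of the source alone would not surface this kind of runtime evidence.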

Dual-Agent Architecture

Key Capabilities

Results

REFINE: Context-Aware Patch Refinement

REFINE (Pabba et al., 2025) transforms partially correct “Draft Patches” into correct ones through a systematic refinement framework.

Three Key Challenges Addressed

  1. Context disambiguation: Resolves vague issue descriptions and unclear code context by enriching the repair prompt with structured context
  2. Candidate diversification: Uses test-time scaling to generate diverse patch candidates, increasing the probability of including a correct fix
  3. Partial fix aggregation: An LLM-powered code review process combines insights from multiple partial fixes into a complete solution
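
The three steps above can be sketched as a small pipeline. This is a toy illustration under stated assumptions, not REFINE's actual code: patches are modeled as sets of named fixes, and `enrich`, `generate`, `evaluate`, and `aggregate` are hypothetical stand-ins for the LLM calls and test runs a real system would make.

```python
from dataclasses import dataclass

# Toy ground truth: the bug needs both fixes to be fully resolved.
REQUIRED_FIXES = frozenset({"null_check", "off_by_one"})


@dataclass(frozen=True)
class Candidate:
    patch: frozenset  # set of named fixes the patch applies
    passed: int       # how many required behaviors the patch satisfies


def enrich(draft_patch):
    """Step 1 (context disambiguation): attach structured context to the draft."""
    return {"draft": draft_patch, "hints": sorted(REQUIRED_FIXES)}


def generate(context, seed):
    """Step 2 (candidate diversification): a different seed yields a different patch."""
    pool = [{"null_check"}, {"off_by_one"}, {"rename_var"}, set()]
    return frozenset(context["draft"] | pool[seed % len(pool)])


def evaluate(patch):
    """Stand-in for running the test suite against a patch."""
    return len(patch & REQUIRED_FIXES)


def aggregate(candidates):
    """Step 3 (partial fix aggregation): merge every candidate that helped."""
    merged = set()
    for candidate in candidates:
        if candidate.passed > 0:
            merged |= candidate.patch
    return frozenset(merged)


def refine(draft_patch, n_candidates=4):
    context = enrich(draft_patch)
    candidates = []
    for seed in range(n_candidates):
        patch = generate(context, seed)
        candidates.append(Candidate(patch, evaluate(patch)))
    return aggregate(candidates), candidates


final_patch, candidates = refine(frozenset())
print("best single candidate:", max(c.passed for c in candidates))
print("aggregated patch:", sorted(final_patch), "score:", evaluate(final_patch))
```

In this toy run no single candidate contains both fixes, but the aggregated patch does, which is the motivation for combining partial fixes rather than only picking the top-ranked candidate.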

Integration Architecture

REFINE is designed as a general refinement module that plugs into existing APR systems:

Results

Specification Vibing (VibeRepair)

VibeRepair (Zhu et al., 2026) shifts the focus from code-centric repair to specification-centric repair, treating bug fixing as behavior-specification alignment rather than ad-hoc code editing.

The Specification-Centric Approach

  1. Code to Specification: Translates buggy code into a structured behavior specification capturing intended runtime behavior
  2. Specification Repair: Infers and repairs misalignments in the specification (not the code)
  3. Specification to Code: Synthesizes corrected code strictly guided by the repaired behavior specification
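
A minimal sketch of this three-stage loop follows. It is a deliberately small illustration, not VibeRepair's method: the "specification" is reduced to a function name, parameters, and one return expression; `code_to_spec` uses Python's `ast` module where a real system would use an LLM; and `PROPOSALS` is a hypothetical list standing in for repaired specifications an LLM might propose.

```python
import ast
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BehaviorSpec:
    name: str
    params: tuple
    returns: str  # expression describing the intended return value


def code_to_spec(source):
    """Stage 1: translate code into a structured behavior specification."""
    func = ast.parse(source).body[0]
    ret = next(n for n in ast.walk(func) if isinstance(n, ast.Return))
    params = tuple(a.arg for a in func.args.args)
    return BehaviorSpec(func.name, params, ast.unparse(ret.value))


def spec_holds(spec, examples):
    """Check the specification against (args, expected) behavior examples."""
    for args, expected in examples:
        env = dict(zip(spec.params, args))
        # Toy evaluation only; never eval untrusted expressions in practice.
        if eval(spec.returns, {"__builtins__": {}}, env) != expected:
            return False
    return True


def repair_spec(spec, examples, proposals):
    """Stage 2: repair the specification, not the code."""
    for expr in [spec.returns, *proposals]:
        candidate = replace(spec, returns=expr)
        if spec_holds(candidate, examples):
            return candidate
    return None


def spec_to_code(spec):
    """Stage 3: synthesize code strictly from the repaired specification."""
    return f"def {spec.name}({', '.join(spec.params)}):\n    return {spec.returns}\n"


BUGGY = "def clamp_low(x, lo):\n    return x if x < lo else lo\n"
EXAMPLES = [((5, 3), 5), ((1, 3), 3)]  # intended behavior: never go below lo
PROPOSALS = ["lo if x > lo else x", "x if x > lo else lo"]

spec = code_to_spec(BUGGY)
fixed_spec = repair_spec(spec, EXAMPLES, PROPOSALS)
print(spec_to_code(fixed_spec))
```

The design point the sketch tries to show is that the buggy source is read only once, in stage 1. After that, repair and synthesis operate on the specification, so the final code cannot inherit an edit that the specification does not justify.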

On-Demand Reasoning

For difficult cases, an enrichment component provides:

Results

Code Example

# InspectCoder-style debugger-driven APR (simplified)
class InspectCoderAgent:
    def __init__(self, inspector_llm, coder_llm, debugger):
        self.inspector = inspector_llm
        self.coder = coder_llm
        self.debugger = debugger
 
    def repair(self, buggy_code, test_suite, max_iterations=5):
        for iteration in range(max_iterations):
            # Phase 1: Dynamic analysis via Program Inspector
            diagnosis = self.inspect(buggy_code, test_suite)
 
            # Phase 2: Patch generation via Patch Coder
            patch = self.coder.generate_patch(buggy_code, diagnosis)
 
            # Phase 3: Verification
            results = test_suite.run(patch)
            if results.all_pass:
                return patch
            # Tests still fail: continue from the latest patch so the next
            # iteration re-inspects it against the remaining failing tests
            buggy_code = patch
 
        return None  # Could not repair within budget
 
    def inspect(self, code, test_suite):
        # Pick the first test that fails on the current version of the code
        failing_test = test_suite.get_first_failing(code)
        # Inspector decides breakpoint strategy
        breakpoints = self.inspector.plan_breakpoints(code, failing_test)
        self.debugger.set_breakpoints(breakpoints)
 
        # Run under debugger and collect state
        states = self.debugger.run(code, failing_test)
 
        # Inspector analyzes runtime state
        root_cause = self.inspector.analyze(states, code, failing_test)
 
        # Optional: perturbation experiment
        if root_cause.confidence < 0.7:
            perturbed = self.inspector.perturb_state(states, root_cause.hypothesis)
            root_cause = self.inspector.refine_hypothesis(perturbed)
 
        return root_cause

Comparison of APR Approaches

| Method | Paradigm | Benchmark | Key Result | Innovation |
|---|---|---|---|---|
| InspectCoder | Dynamic analysis | BigCodeBench-R | +60% relative improvement | Debugger collaboration |
| REFINE | Patch refinement | SWE-Bench Lite | 51.67% resolution | Draft patch aggregation |
| VibeRepair | Specification-centric | Defects4J v1.2 | 174 bugs (+19%) | Behavior specification repair |
