AutoCodeRover
AutoCodeRover is an autonomous program improvement agent that resolves reported issues in software repositories. Developed by researchers at the National University of Singapore, it combines LLM-based reasoning with program structure-aware code search to generate patches for real-world GitHub issues. The project has over 3,100 GitHub stars and reports resolving 37.3% of SWE-bench Lite tasks and 46.2% of SWE-bench Verified tasks at under $0.70 per task. [1]
GitHub: AutoCodeRoverSG/auto-code-rover
Key Features
Structure-Aware Code Search — Navigates project structure using AST-level understanding rather than simple text search
Autonomous Issue Resolution — Takes a GitHub issue description and produces a working patch without human intervention
Cost Efficient — Resolves tasks at less than $0.70 per issue on average
SWE-bench Performance — 37.3% pass@1 on SWE-bench Lite, 46.2% on SWE-bench Verified [2]
Multi-Stage Pipeline — Systematic approach: context retrieval, fault localization, patch generation, and validation
Docker-Based Execution — Runs in containerized environments for reproducibility and safety
Multiple LLM Support — Works with GPT-4, Claude, and other large language models
Architecture
AutoCodeRover is built in Python with a pipeline architecture:
Issue Parser — Extracts intent, error descriptions, and expected behavior from GitHub issue text
Code Search Engine — AST-aware search across classes, methods, and code blocks using program structure
Context Collector — Gathers relevant code snippets, test files, and documentation
Patch Generator — LLM-based reasoning to produce minimal, correct patches
Validation Engine — Runs existing test suites to verify patches don't introduce regressions
SWE-bench Integration — Direct support for SWE-bench evaluation framework via Docker
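The structure-aware search described above can be illustrated with Python's standard `ast` module. This is a minimal sketch, not AutoCodeRover's actual implementation: the function names `search_class` and `search_method_in_class` and the sample `SOURCE` are assumptions chosen for illustration. It shows how a query for "method X in class Y" returns the exact method body rather than every line that happens to mention the name.

```python
import ast

# Hypothetical sample file contents, used only for this illustration.
SOURCE = '''
class Cache:
    def get(self, key):
        return self._store[key]

    def has_key(self, key):
        return key in self._store

def helper():
    pass
'''


def search_class(tree, class_name):
    """Return the ClassDef node named class_name, or None if absent."""
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return node
    return None


def search_method_in_class(tree, class_name, method_name, source):
    """Return the source text of class_name.method_name, or None if absent."""
    cls = search_class(tree, class_name)
    if cls is None:
        return None
    for node in cls.body:
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and node.name == method_name:
            # get_source_segment slices the original text by the node's position.
            return ast.get_source_segment(source, node)
    return None


tree = ast.parse(SOURCE)
snippet = search_method_in_class(tree, "Cache", "has_key", SOURCE)
print(snippet)
```

Because the lookup works on syntax-tree nodes, an agent can ask for a class first and then drill into one method, keeping the context sent to the LLM small.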
Usage Example
# Clone the repository
git clone https://github.com/AutoCodeRoverSG/auto-code-rover.git
cd auto-code-rover
# Build the Docker image
docker build -t acr .
# Run on a specific SWE-bench instance
python3 ACR.py --task django__django-16379 \
--model gpt-4 \
--output results/
# Run on a custom GitHub issue
python3 ACR.py --repo https://github.com/user/project \
--issue 42 \
--model claude-3.5-sonnet
How It Works
graph TD
A[GitHub Issue] --> B[Issue Parser]
B --> C[Extract Intent & Error Info]
C --> D[Code Search Engine]
D --> E[AST-Level Structure Analysis]
E --> F[Relevant Classes & Methods]
F --> G[Context Collection]
G --> H[Code Snippets + Tests + Docs]
H --> I[LLM Reasoning]
I --> J[Fault Localization]
J --> K[Patch Generation]
K --> L[Minimal Code Patch]
L --> M[Test Validation]
M --> N{Tests Pass?}
N -->|Yes| O[Output Patch]
N -->|No| P[Refine with Feedback]
P --> I
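The feedback loop at the end of the diagram (generate a patch, run the tests, refine on failure) can be sketched as below. This is an assumed simplification, not the project's code: `repair_loop`, `propose_patch`, `run_tests`, and `max_rounds` are hypothetical names, and the LLM call and test runner are replaced by stub functions so the example runs on its own.

```python
from typing import Callable, Optional, Tuple


def repair_loop(
    propose_patch: Callable[[str, str], str],
    run_tests: Callable[[str], Tuple[bool, str]],
    issue_text: str,
    max_rounds: int = 3,
) -> Optional[str]:
    """Return the first patch that passes the tests, or None after max_rounds."""
    feedback = ""
    for _ in range(max_rounds):
        patch = propose_patch(issue_text, feedback)
        passed, log = run_tests(patch)
        if passed:
            return patch
        # Feed the failing test log into the next generation round.
        feedback = log
    return None


# Stubs standing in for the LLM and the project's test suite.
def fake_propose(issue_text: str, feedback: str) -> str:
    return "patch-v1" if not feedback else "patch-v2"


def fake_run_tests(patch: str) -> Tuple[bool, str]:
    if patch == "patch-v2":
        return True, ""
    return False, "test_cache failed: KeyError"


result = repair_loop(fake_propose, fake_run_tests, "has_key raises KeyError")
print(result)
```

Bounding the number of rounds keeps cost predictable, which matters for a per-task budget like the one reported above.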
Research
AutoCodeRover was introduced in the paper “AutoCodeRover: Autonomous Program Improvement” (arXiv:2404.05427) by Yuntong Zhang, Haifeng Ruan, Zhiyu Fan, and Abhik Roychoudhury. [3] Key contributions:
Program Structure Awareness — Demonstrated that AST-level code navigation significantly outperforms text-based search for bug localization
Spectrum-Based Fault Localization — When a test suite is available, uses test-coverage spectra to rank suspicious code and feeds that ranking into the LLM's reasoning
Iterative Refinement — Multi-round patch generation with test feedback loops
Cost Analysis — Showed autonomous patching is economically viable at scale
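To make the spectrum-based fault localization item concrete, the sketch below scores methods with the Ochiai formula, a common choice in the SBFL literature; it is an assumption here that Ochiai is the metric in use, and the function name `ochiai`, the test names, and the coverage data are all hypothetical. A method scores higher when failing tests execute it and passing tests do not.

```python
import math


def ochiai(coverage, outcomes):
    """Score each covered element by Ochiai suspiciousness.

    coverage: test name -> set of elements (e.g. methods) the test executes
    outcomes: test name -> True if the test passed, False if it failed
    """
    total_failed = sum(1 for passed in outcomes.values() if not passed)
    elements = set().union(*coverage.values())
    scores = {}
    for elem in elements:
        # ef / ep: failing / passing tests that execute this element.
        ef = sum(1 for t, cov in coverage.items() if elem in cov and not outcomes[t])
        ep = sum(1 for t, cov in coverage.items() if elem in cov and outcomes[t])
        denom = math.sqrt(total_failed * (ef + ep))
        scores[elem] = ef / denom if denom else 0.0
    return scores


# Hypothetical coverage data: one failing test, two passing tests.
coverage = {
    "t1": {"parse", "save"},
    "t2": {"parse"},
    "t3": {"load"},
}
outcomes = {"t1": False, "t2": True, "t3": True}

scores = ochiai(coverage, outcomes)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Here `save` is executed only by the failing test, so it ranks first; `parse` is shared with a passing test and ranks lower; `load` is never executed by a failing test and scores zero.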
The team later developed the Sonar Foundation Agent, scoring 79.2% on SWE-bench Verified.
See Also
Agentless — Lightweight localize-then-repair approach
Trae Agent — ByteDance's research-friendly CLI agent
Devon — Open-source pair programmer
Cline — Model-agnostic autonomous coding agent
References