====== SWE-agent: Agent-Computer Interface for Software Engineering ======
SWE-agent is a language model agent system by Yang et al. (Princeton, 2024) that autonomously resolves real-world [[github|GitHub]] issues through a purpose-built **Agent-Computer Interface (ACI)**.(([[https://arxiv.org/abs/2405.15793|Yang et al. "SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering" (arXiv:2405.15793)]])) Rather than giving the LLM raw terminal access, SWE-agent provides a small set of custom shell commands for searching, viewing, and editing code. The authors report that this interface design improves the agent's ability to navigate large codebases and produce correct patches compared with an unmodified shell. The paper was accepted at NeurIPS 2024.


<mermaid>
graph TD
    ISS[GitHub Issue] --> SEARCH[Search Codebase]
    SEARCH --> VIEW[View File]
    VIEW --> EDIT[Edit Code]
    EDIT --> TEST[Run Tests]
    TEST --> CHECK{Tests Pass?}
    CHECK -->|No| SEARCH
    CHECK -->|Yes| SUBMIT[Submit Patch]
</mermaid>

===== Agent-Computer Interface (ACI) Design =====
The ACI is SWE-agent's central contribution: the insight that **how** an agent interacts with a computer matters as much as the capability of the underlying model. The interface provides:

  * **Structured observations**: File contents displayed with line numbers, surrounding context indicators (e.g., "400 lines above, 2684 lines below"), and syntax highlighting
  * **Paginated viewing**: Prevents token overflow by showing manageable chunks of code rather than entire files
  * **Action-observation loop**: Agent issues a command, receives structured output, reasons about next steps, and iterates
  * **Repository isolation**: Each task runs in a cloned [[github|GitHub]] repo preserving the exact pre-fix state

The interface is deliberately simple. A small set of well-defined commands gives the model fewer ways to issue a malformed or hallucinated action than unrestricted shell access does, which makes agent behavior more reliable.
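The structured, paginated observation described above can be sketched as a small formatting function. This is an illustrative sketch only: the name ''render_window'', the default window size, and the exact wording of the context markers are assumptions, not SWE-agent's actual implementation.

```python
def render_window(lines, start, window=100):
    """Return a numbered slice of `lines` with above/below context markers.

    `start` is the 0-based index of the first line to show (hypothetical API).
    """
    # Clamp the starting index so it always falls inside the file.
    start = max(0, min(start, max(len(lines) - 1, 0)))
    end = min(start + window, len(lines))
    out = []
    if start > 0:
        out.append(f"({start} more lines above)")
    # Prefix every visible line with its 1-based line number.
    for number, text in enumerate(lines[start:end], start=start + 1):
        out.append(f"{number}:{text}")
    remaining = len(lines) - end
    if remaining > 0:
        out.append(f"({remaining} more lines below)")
    return "\n".join(out)


sample = [f"line {i}" for i in range(1, 11)]
print(render_window(sample, start=3, window=4))
```

Because only one window is ever returned, the size of an observation stays bounded no matter how long the file is.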

===== Custom Command Set =====
SWE-agent uses three core tool categories:

=== Search ===
<code>
search <regex> [--filename <regex>]
</code>
Finds files or code matching patterns across the repository. Supports regex for both content and filename filtering.
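A repository search with an optional filename filter might look roughly like the following. The function name ''search_repo'', the ''path:line:text'' output format, and the result cap are assumptions made for illustration; they are not taken from the SWE-agent codebase.

```python
import os
import re
import tempfile


def search_repo(root, pattern, filename_pattern=None, max_results=50):
    """Return 'path:line_number:text' hits for `pattern` under `root`."""
    content_re = re.compile(pattern)
    name_re = re.compile(filename_pattern) if filename_pattern else None
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip version-control metadata and keep traversal order deterministic.
        dirnames[:] = sorted(d for d in dirnames if d != ".git")
        for name in sorted(filenames):
            if name_re and not name_re.search(name):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as handle:
                    for number, line in enumerate(handle, start=1):
                        if content_re.search(line):
                            rel = os.path.relpath(path, root)
                            hits.append(f"{rel}:{number}:{line.rstrip()}")
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
    # Assumed policy: summarize instead of flooding the context window.
    if len(hits) > max_results:
        return f"{len(hits)} matches found; narrow the search pattern."
    return "\n".join(hits) if hits else "No matches found."


# Usage on a throwaway two-file repository.
with tempfile.TemporaryDirectory() as repo:
    with open(os.path.join(repo, "utils.py"), "w", encoding="utf-8") as f:
        f.write("def parse(x):\n    return int(x)\n")
    with open(os.path.join(repo, "notes.txt"), "w", encoding="utf-8") as f:
        f.write("parse is defined in utils\n")
    result = search_repo(repo, r"def \w+\(", filename_pattern=r"\.py$")
    all_hits = search_repo(repo, r"parse")
    print(result)
```

Returning a short summary when there are too many hits keeps the observation compact and nudges the agent toward a more specific query.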

=== File Viewer ===
<code>
open <filename> [<line_number>]
scroll_up / scroll_down
goto <line_number>
</code>
Paginated file navigation with line numbers and context awareness. The viewer maintains state across commands, showing the agent's current position in the file.
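One way to picture the viewer's state is a class that remembers which window of the file is on screen. The sketch below is hypothetical: the class ''FileViewer'', its methods, and its clamping behavior are assumptions, and it takes a list of lines rather than a filename so that it stays self-contained.

```python
class FileViewer:
    """Tracks which window of a file is currently visible (illustrative only)."""

    def __init__(self, window=100):
        self.window = window
        self.lines = []
        self.top = 0  # 0-based index of the first visible line

    def open(self, lines, line_number=1):
        self.lines = list(lines)
        return self.goto(line_number)

    def goto(self, line_number):
        # Clamp so the window always starts inside the file.
        last_top = max(len(self.lines) - self.window, 0)
        self.top = min(max(line_number - 1, 0), last_top)
        return self.visible_range()

    def scroll_down(self):
        return self.goto(self.top + 1 + self.window)

    def scroll_up(self):
        return self.goto(self.top + 1 - self.window)

    def visible_range(self):
        """Return the 1-based (first, last) line numbers on screen."""
        last = min(self.top + self.window, len(self.lines))
        return (self.top + 1, last)


viewer = FileViewer(window=10)
print(viewer.open(["x"] * 35))  # first window of a 35-line file
print(viewer.scroll_down())     # position persists between commands
```

Since the position lives in the viewer rather than in the prompt, the agent can issue ''scroll_down'' repeatedly without restating where it is.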

=== Edit ===
<code>
edit <filename> <line_start> <line_end>
<new_content>
end_of_edit
</code>
Replaces specific line ranges with new content. This precise, line-addressed editing avoids the ambiguity of natural language edit instructions.
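The line-range replacement can be illustrated with a pure function over a list of lines. This is a minimal sketch under stated assumptions: ''apply_edit'' and its 1-based inclusive range convention are chosen here for illustration and do not reproduce SWE-agent's own edit logic.

```python
def apply_edit(lines, line_start, line_end, new_content):
    """Replace 1-based inclusive lines [line_start, line_end] with new_content."""
    if not (1 <= line_start <= line_end <= len(lines)):
        raise ValueError(
            f"invalid range {line_start}-{line_end} for a {len(lines)}-line file"
        )
    # An empty replacement deletes the addressed lines.
    replacement = new_content.splitlines()
    return lines[: line_start - 1] + replacement + lines[line_end:]


source = [
    "def add(a, b):",
    "    return a - b",
    "",
    "print(add(1, 2))",
]
fixed = apply_edit(source, 2, 2, "    return a + b")
print("\n".join(fixed))
```

Rejecting out-of-range edits with an explicit error gives the agent a concrete message to react to instead of silently corrupting the file.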

Standard Unix utilities (''ls'', ''grep'', ''git'') remain available for auxiliary tasks.

===== Code Example =====
<code python>
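# SWE-agent style ACI interaction loop (illustrative sketch, not the project's code)
import subprocess


class SWEAgentLoop:
    def __init__(self, model, repo_path, issue_description):
        # `model` is any object exposing generate_action(issue, observation, history)
        self.model = model
        self.repo = repo_path
        self.issue = issue_description
        self.history = []

    def run(self, max_steps: int = 30) -> str:
        observation = self.setup_environment()
        for _ in range(max_steps):
            # Model reasons about current state and selects action
            action = self.model.generate_action(
                issue=self.issue,
                observation=observation,
                history=self.history,
            )
            # Stop before dispatching: "submit" ends the episode and is not a shell command
            if action.startswith("submit"):
                break
            # Execute command through ACI
            observation = self.execute_aci_command(action)
            self.history.append((action, observation))
        # Return the current diff on submit or when the step budget runs out
        return self.generate_patch()

    def execute_aci_command(self, command: str) -> str:
        if command.startswith("search"):
            return self.search_codebase(command)
        elif command.startswith("open"):
            return self.open_file_viewer(command)
        elif command.startswith("edit"):
            return self.apply_edit(command)
        else:
            return self.run_shell(command)

    def setup_environment(self) -> str:
        # Assumed starting observation: the state of the working tree
        return self.run_shell("git status --short")

    def generate_patch(self) -> str:
        return self.run_shell("git diff")

    def run_shell(self, command: str) -> str:
        result = subprocess.run(
            command, shell=True, cwd=self.repo, capture_output=True, text=True
        )
        return result.stdout + result.stderr

    # Placeholders: the concrete search, viewer, and edit tools plug in here
    def search_codebase(self, command: str) -> str:
        raise NotImplementedError

    def open_file_viewer(self, command: str) -> str:
        raise NotImplementedError

    def apply_edit(self, command: str) -> str:
        raise NotImplementedError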
</code>

===== SWE-bench Results =====
[[swe_bench|SWE-bench]] is a benchmark of 2,294 real GitHub issues from 12 popular Python repositories, requiring full repository-level bug fixing and feature implementation.(([[https://arxiv.org/abs/2310.06770|Jimenez et al. "SWE-bench: Can Language Models Resolve Real-World GitHub Issues?"]]))

SWE-agent was among the first agent systems to demonstrate strong autonomous performance on [[swe_bench|SWE-bench]].(([[https://github.com/swe-agent/swe-agent|SWE-agent GitHub Repository]])) The benchmark has since become the standard evaluation for coding agents:

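^ System ^ Resolve rate (benchmark split) ^
| SWE-agent (GPT-4) | ~18% on [[swe_bench|SWE-bench]] Lite (early 2024) |
| SWE-agent + [[claude|Claude]] 3.5 Sonnet | ~33% on [[swe_bench|SWE-bench]] Verified, 500 tasks (late 2024) |
| Current SOTA (2026) | ~79% on [[swe_bench|SWE-bench]] Verified, 500 tasks (with advanced scaffolding) |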

SWE-agent's key contribution is not only its benchmark scores but the demonstration that **ACI design is a first-class research problem**: the same underlying LLM performs significantly better with well-designed tool interfaces.

===== Comparison with Other Approaches =====
^ Approach ^ Key Difference ^
| **[[agentless|Agentless]]** | Three-phase pipeline (localize, repair, validate) with no agentic loop |
| **[[openhands|OpenHands]]** | Broader action space including web browsing and code writing |
| **HyperAgent** | Multi-agent architecture for multi-language tasks |
| **SWE-agent** | Minimal ACI focused on search/view/edit reliability |

===== Design Principles =====
The ACI design embodies several principles for effective agent-tool interaction:

<latex>P(\text{correct patch} \mid \text{ACI}) > P(\text{correct patch} \mid \text{raw shell})</latex>

  * **Minimal action space**: Fewer, well-defined commands reduce action selection errors
  * **Structured observations**: Formatted output with line numbers provides unambiguous context
  * **Stateful navigation**: The file viewer remembers position, reducing redundant exploration
  * **Precise editing**: Line-addressed edits eliminate ambiguity in code modification

===== See Also =====
  * [[swe_chat|SWE-chat]]
  * [[swe_verified|SWE-Verified]]
  * [[how_to_build_a_coding_agent|How to Build a Coding Agent]]
  * [[agentless|Agentless]]
  * [[code_generation_agents|Code Generation Agents]]

===== References =====