====== OpenHands ======

OpenHands (formerly OpenDevin) is an open-source autonomous software engineering platform in which AI agents write, debug, test, and refactor code through executable actions. Built on the CodeAct paradigm, it is a prominent open-source alternative to proprietary coding agents such as Devin, and its CodeAct 2.1 agent reported state-of-the-art open-source results on SWE-Bench at the time of its release.

===== Overview =====

OpenHands originated as OpenDevin, an open-source effort to replicate and surpass the capabilities demonstrated by Cognition's proprietary Devin agent. The project later rebranded to OpenHands and evolved into a comprehensive Software Agent SDK. Hosted at [[https://github.com/All-Hands-AI/OpenHands|github.com/All-Hands-AI/OpenHands]], it provides a modular framework for building autonomous software engineering agents that can interact with codebases, terminals, and browsers.

The platform supports multiple LLM backends, including Anthropic Claude, OpenAI GPT models, and open-source alternatives, so agents are not tied to a single model provider.

===== CodeAct Paradigm =====

CodeAct is the core paradigm underlying OpenHands. Rather than generating natural-language plans that a separate layer must parse and execute, CodeAct agents directly generate and execute Python code and bash commands. This approach provides several advantages:

  * **Direct execution**: actions are code, so there is no plan-to-action translation step that can introduce errors
  * **Rich feedback**: execution output provides concrete signals for self-correction
  * **Tool composability**: any command-line tool or Python library becomes available
  * **Reproducibility**: executed code serves as an auditable trace

CodeAct 2.1 (November 2024) introduced function calling for precise tool specification, adopted Anthropic Claude 3.5 Sonnet as the default model, and fixed issues with directory navigation and looping behavior.
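The generate, execute, observe cycle behind the "rich feedback" advantage can be sketched in a few lines. This is a minimal illustration, not OpenHands code: ''Observation'', ''execute_action'', ''scripted_policy'', and ''run_episode'' are hypothetical names, the scripted policy stands in for an LLM, and the action runs in-process with ''exec'' rather than in a sandbox.

```python
import contextlib
import io
import traceback
from dataclasses import dataclass


@dataclass
class Observation:
    """Result of running one code action (hypothetical type for this sketch)."""
    output: str
    error: bool


def execute_action(code: str, namespace: dict) -> Observation:
    """Run a Python code action and capture its stdout or its traceback."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)  # assumption: no sandboxing in this sketch
        return Observation(output=buffer.getvalue(), error=False)
    except Exception:
        # The traceback is the concrete signal the agent can use to self-correct.
        return Observation(output=buffer.getvalue() + traceback.format_exc(), error=True)


def scripted_policy(history: list) -> str:
    """Stand-in for an LLM: choose the next code action from past observations."""
    if not history:
        return "print(total / count)"  # first attempt uses undefined names
    if history[-1].error and "NameError" in history[-1].output:
        return "total, count = 10, 4\nprint(total / count)"  # corrected attempt
    return ""  # an empty action means the task is considered done


def run_episode(max_steps: int = 5) -> list:
    """Alternate between choosing an action and executing it."""
    namespace: dict = {}
    history: list = []
    for _ in range(max_steps):
        action = scripted_policy(history)
        if not action:
            break
        history.append(execute_action(action, namespace))
    return history


if __name__ == "__main__":
    for step, obs in enumerate(run_episode(), start=1):
        status = "error" if obs.error else "ok"
        print(f"step {step} [{status}]: {obs.output.strip().splitlines()[-1]}")
```

The first action fails with a ''NameError'', the traceback is fed back as an observation, and the second action succeeds. A real agent would replace ''scripted_policy'' with a model call and run actions in an isolated environment.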
A single CodeAct action is itself runnable code. For example, to explore a repository the agent emits a script rather than a description of its intent:

<code python>
# CodeAct paradigm: agent generates executable code as actions
# Rather than: "I should look at the file structure"
# The agent directly executes:
import os

# Explore repository structure
for root, dirs, files in os.walk("/workspace/project", topdown=True):
    # Prune directories in place so os.walk does not descend into them
    dirs[:] = [d for d in dirs if d not in [".git", "node_modules", "__pycache__"]]
    # Depth of the current directory relative to the project root
    level = root.replace("/workspace/project", "").count(os.sep)
    indent = " " * 2 * level
    print(f"{indent}{os.path.basename(root)}/")
    # List files only for the top two levels to keep the output short
    if level < 2:
        for file in files:
            print(f"{indent}  {file}")
</code>

===== Architecture =====

OpenHands evolved from a monolithic V0 design to V1, a modular Software Agent SDK with several architectural innovations:

==== Event-Sourced State Management ====

All agent actions and observations are recorded as immutable events, enabling deterministic replay, pause/resume, and debugging. This event-driven execution loop supports incremental steps, security checks, and native handling of model reasoning content such as Anthropic's ''ThinkingBlock''.

==== Sandboxed Execution ====

Agents can operate in isolated sandbox environments (Docker containers) that prevent unintended side effects on the host system. In the V1 architecture sandboxing is opt-in: by default the agent runs in a single unified process, a choice described as aligning with the Model Context Protocol (MCP).

==== Multi-Agent Workflows ====

OpenHands supports parallel multi-agent execution for large-scale tasks. For example, an Angular-to-React migration can be decomposed along its dependency tree, with separate agents handling independent components simultaneously in non-interfering cloud sandboxes.

==== Composable Interfaces ====

A two-layer composability model decouples research from production, supporting CLI, Web UI, and GitHub App interfaces. This enables both rapid prototyping and enterprise deployment.
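==== Illustrative Sketch: Event Log Replay ====

The event-sourced state management described above can be illustrated with an append-only log of immutable events whose state is derived by replay. This is a sketch under stated assumptions, not the OpenHands implementation: the ''Event'' schema, ''EventLog'' class, and ''replay'' function are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """One immutable record in the log (hypothetical schema for this sketch)."""
    seq: int
    kind: str      # "action" or "observation"
    payload: str


class EventLog:
    """Append-only log; state is never stored, only derived by replay."""

    def __init__(self) -> None:
        self._events = []

    def append(self, kind: str, payload: str) -> Event:
        event = Event(seq=len(self._events), kind=kind, payload=payload)
        self._events.append(event)
        return event

    def events(self, upto: Optional[int] = None) -> Tuple[Event, ...]:
        """Return the first `upto` events (all events when `upto` is None)."""
        return tuple(self._events[:upto])


def replay(events: Iterable[Event]) -> dict:
    """Fold events into a state summary; the same events give the same state."""
    state = {"steps": 0, "last_action": None, "last_observation": None}
    for event in events:
        if event.kind == "action":
            state["steps"] += 1
            state["last_action"] = event.payload
        else:
            state["last_observation"] = event.payload
    return state


if __name__ == "__main__":
    log = EventLog()
    log.append("action", "ls /workspace")
    log.append("observation", "README.md src/")
    log.append("action", "cat README.md")

    print(replay(log.events()))        # state after the full history
    print(replay(log.events(upto=2)))  # state at an earlier point, as after a pause
```

Because events are frozen and the log is append-only, replaying a prefix of the log reconstructs the state at that point, which is what makes pause/resume and step-by-step debugging straightforward in this style of design.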
===== Benchmarks =====

OpenHands CodeAct 2.1 reported state-of-the-art open-source results on SWE-Bench at its release:

^ Benchmark ^ Score ^ Significance ^
| SWE-Bench Verified | 53% resolve rate | Top open-source agent |
| SWE-Bench Lite | 41.7% resolve rate | Surpassed prior SOTA |

These results indicate that open-source agents can be competitive with proprietary alternatives on real-world software engineering tasks.

===== Comparison with Alternatives =====

^ Aspect ^ OpenHands ^ Devin ^ SWE-Agent ^
| Open Source | Yes, fully reproducible | No (closed, proprietary) | Yes |
| Architecture | Modular SDK, multi-agent, event-driven | Single-agent, proprietary sandbox | Simpler agent loop |
| SWE-Bench Verified | 53% (top open) | Proprietary scores | Competitive but lower |
| Multi-Agent | Yes, parallel execution | No | No |
| LLM Flexibility | Any LLM backend | Fixed model | Limited backends |
| Deployment | CLI, Web UI, GitHub App | Cloud only | CLI primarily |

===== Key Features =====

  * **Typed tools with Pydantic validation**: separates action specification from execution for distributed use
  * **Native reasoning support**: integrates extended thinking from Claude and reasoning fields from OpenAI
  * **GitHub integration**: operates as a GitHub App for PR-based workflows
  * **Micro-agents**: specialized sub-agents for task decomposition within larger projects
  * **Cloud sandboxes**: isolated environments for parallel agent execution

===== Recent Developments =====

  * **November 2024**: CodeAct 2.1 release with Claude 3.5 integration and top SWE-Bench scores
  * **November 2025**: V1 SDK architectural overhaul for scalability, modularity, and MCP alignment
  * **February 2026**: Support for advanced models including GPT-5-Codex; demonstrated large-scale automated refactoring
  * **Ongoing**: AMD integration for local developer use; multi-agent parallel refactors via dependency graphs

===== Getting Started =====

The snippet below sketches a high-level workflow. The ''Agent'' class, its constructor arguments, and the ''resolve()'' method are illustrative and may not match the released package; consult the repository documentation for the current install command and SDK interface.

<code python>
# Install and run OpenHands
# pip install openhands
from openhands import Agent  # illustrative import; verify against the current SDK

agent = Agent(
    model="anthropic/claude-sonnet-4-20250514",
    workspace="/path/to/project",
    sandbox="docker",
)

# Agent autonomously resolves a GitHub issue
result = agent.resolve(
    task="Fix the TypeError in utils.py when input is None",
    repo="https://github.com/example/project",
)
print(result.summary)
</code>

===== References =====

  * [[https://github.com/All-Hands-AI/OpenHands|OpenHands GitHub Repository]]
  * [[https://openhands.dev/blog/openhands-codeact-21-an-open-state-of-the-art-software-development-agent|CodeAct 2.1 Blog Post]]
  * [[https://arxiv.org/abs/2511.03690|OpenHands V1 Architecture Paper (arXiv:2511.03690)]]
  * [[https://www.swebench.com/|SWE-Bench Leaderboard]]

===== See Also =====

  * [[agent_trajectory_optimization|Agent Trajectory Optimization]]
  * [[long_horizon_agents|Long-Horizon Agents]]
  * [[durable_execution_for_agents|Durable Execution for Agents]]