====== OpenHands ======

OpenHands (formerly OpenDevin) is an open-source autonomous software engineering platform that enables AI agents to write, debug, test, and refactor code through executable actions.((https://github.com/All-Hands-AI/OpenHands|OpenHands GitHub Repository)) Built on the CodeAct paradigm, it is a prominent open-source alternative to proprietary coding agents such as Devin, and its CodeAct 2.1 agent reported state-of-the-art open-source results on [[swe_bench|SWE-Bench]] benchmarks.

===== Overview =====

OpenHands originated as OpenDevin, an open-source effort to replicate and surpass the capabilities demonstrated by Cognition's proprietary Devin agent. The project was later renamed OpenHands and evolved into a comprehensive Software Agent SDK. Hosted at [[https://github.com/All-Hands-AI/OpenHands|github.com/All-Hands-AI/OpenHands]], it provides a [[modular|modular]] framework for building autonomous software engineering agents that can interact with codebases, terminals, and browsers.((https://sub.thursdai.news/p/thursdai-ai-engineer-europe-mythos|ThursdAI (2026)))

The platform supports multiple LLM backends, including [[anthropic|Anthropic]] Claude, [[openai|OpenAI]] GPT models, and open-source alternatives, so agents are not tied to a single model provider.((https://arxiv.org/abs/2511.03690|OpenHands V1 Architecture Paper (arXiv:2511.03690)))

===== CodeAct Paradigm =====

CodeAct is the core paradigm underlying OpenHands.
Rather than generating natural-language plans that must be parsed and executed, CodeAct agents directly generate and execute Python code and bash commands.((https://openhands.dev/blog/openhands-codeact-21-an-open-state-of-the-art-software-development-agent|CodeAct 2.1 Blog Post)) This approach provides several advantages:

  * **Direct execution**: actions are code, which removes the plan-to-action translation step and the errors it can introduce
  * **Rich feedback**: execution output provides concrete signals for self-correction
  * **Tool composability**: any command-line tool or Python library becomes available
  * **Reproducibility**: executed code serves as an auditable trace

CodeAct 2.1 (November 2024) introduced [[function_calling|function calling]] for precise tool specification, adopted [[anthropic|Anthropic]] [[claude|Claude]] 3.5 Sonnet as the default model, and fixed issues with directory navigation and looping behavior.

<code python>
# CodeAct paradigm: agent generates executable code as actions
# Rather than: "I should look at the file structure"
# The agent directly executes:
import os

# Explore repository structure
for root, dirs, files in os.walk("/workspace/project", topdown=True):
    # Prune directories in place so os.walk does not descend into them
    dirs[:] = [d for d in dirs if d not in [".git", "node_modules", "__pycache__"]]
    level = root.replace("/workspace/project", "").count(os.sep)
    indent = " " * 2 * level
    print(f"{indent}{os.path.basename(root)}/")
    # Only list files for the top two levels to keep the output short
    if level < 2:
        for file in files:
            print(f"{indent}  {file}")
</code>

===== Architecture =====

OpenHands evolved from a monolithic V0 design to V1, a [[modular|modular]] Software Agent SDK with several architectural innovations:((https://arxiv.org/abs/2511.03690|OpenHands V1 Architecture Paper (arXiv:2511.03690)))

==== Event-Sourced State Management ====

All agent actions and observations are recorded as immutable events, enabling deterministic replay, pause/resume, and debugging. This event-driven execution loop supports incremental steps, security checks, and native handling of model reasoning content such as [[anthropic|Anthropic]]'s ThinkingBlock.
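The event-sourcing idea can be illustrated with a minimal, self-contained sketch. It is not the OpenHands implementation: the ''Event'', ''EventLog'', and ''replay'' names, the two event kinds, and the shape of the derived state are all assumptions chosen for illustration. The sketch shows only the general pattern: events are immutable, the log is append-only, and state is always rebuilt by replaying the log.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)  # frozen=True makes each recorded event immutable
class Event:
    seq: int      # position in the log
    kind: str     # hypothetical kinds: "action" or "observation"
    payload: str  # command text or captured output


class EventLog:
    """Append-only log (hypothetical helper, not an OpenHands class)."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, kind: str, payload: str) -> Event:
        event = Event(seq=len(self._events), kind=kind, payload=payload)
        self._events.append(event)
        return event

    def events(self, upto: Optional[int] = None) -> Tuple[Event, ...]:
        # Return a read-only snapshot; upto=None means the whole log
        return tuple(self._events[:upto])


def replay(events: Tuple[Event, ...]) -> Dict[str, object]:
    """Rebuild state purely from events, so replay is deterministic."""
    state: Dict[str, object] = {"steps": 0, "last_observation": None}
    for event in events:
        if event.kind == "action":
            state["steps"] = int(state["steps"]) + 1
        elif event.kind == "observation":
            state["last_observation"] = event.payload
    return state


log = EventLog()
log.append("action", "ls /workspace")
log.append("observation", "README.md src/")
log.append("action", "cat README.md")

full_state = replay(log.events())           # state after every event
paused_state = replay(log.events(upto=2))   # state as of a pause after two events
print(full_state)
print(paused_state)
```

Because state is derived only from the log, pausing amounts to stopping appends, resuming amounts to replaying the stored events, and debugging can inspect the state at any prefix of the log.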
==== Sandboxed Execution ====

Agents can run in isolated sandbox environments (Docker containers) that prevent unintended side effects on the host system. In the V1 architecture sandboxing is opt-in: by default the agent and its tools run in a single unified process, a design aligned with the Model Context Protocol (MCP).

==== Multi-Agent Workflows ====

OpenHands supports parallel multi-agent execution for large-scale tasks. For example, an Angular-to-React migration can be decomposed along its dependency tree, with separate agents handling independent components simultaneously in non-interfering cloud sandboxes.

==== Composable Interfaces ====

A two-layer composability model decouples research from production, supporting CLI, Web UI, and [[github|GitHub]] app interfaces. This enables both rapid prototyping and enterprise deployment.

===== Benchmarks =====

OpenHands CodeAct 2.1 achieved state-of-the-art open-source results on [[swe_bench|SWE-Bench]] at the time of its release:((https://www.swebench.com/|SWE-Bench Leaderboard))

^ Benchmark ^ Score ^ Significance ^
| [[swe_bench|SWE-Bench]] Verified | 53% resolve rate | Top open-source agent |
| [[swe_bench|SWE-Bench]] Lite | 41.7% resolve rate | Surpassed prior SOTA |

These results suggest that open-source agents can be competitive with proprietary alternatives on real-world software engineering tasks.
===== Comparison with Alternatives =====

^ Aspect ^ OpenHands ^ Devin ^ SWE-Agent ^
| Open Source | Yes, fully reproducible | No (closed, proprietary) | Yes |
| Architecture | [[modular|Modular]] SDK, multi-agent, event-driven | Single-agent, proprietary sandbox | Simpler [[agent_loop|agent loop]] |
| [[swe_bench|SWE-Bench]] Verified | 53% (top open) | Proprietary scores | Competitive but lower |
| Multi-Agent | Yes, parallel execution | No | No |
| LLM Flexibility | Any LLM backend | Fixed model | Limited backends |
| Deployment | CLI, Web UI, [[github|GitHub]] App | Cloud only | CLI primarily |

===== Key Features =====

  * **Typed tools with Pydantic validation**: separates action specification from execution for distributed use
  * **Native reasoning support**: integrates [[extended_thinking|extended thinking]] from Claude and reasoning fields from [[openai|OpenAI]]
  * **[[github|GitHub]] integration**: operates as a GitHub App for PR-based workflows
  * **Micro-agents**: specialized sub-agents for [[task_decomposition|task decomposition]] within larger projects
  * **Cloud sandboxes**: isolated environments for [[parallel_agents|parallel agent execution]]

===== Industry Adoption and Impact =====

By 2026, OpenHands was being described as a cornerstone project in the broader ecosystem of self-improving software, with adoption reported across AI laboratories.((https://sub.thursdai.news/p/the-lopopolo-zechner-spectrum|Self-Improving Software Trends (2026))) Its open-source nature and [[modular|modular]] architecture have made it a common foundation for autonomous code-improvement efforts, particularly continuous refactoring and maintenance.
===== Recent Developments =====

  * **November 2024**: CodeAct 2.1 release with [[claude|Claude]] 3.5 integration and top [[swe_bench|SWE-Bench]] scores((https://openhands.dev/blog/openhands-codeact-21-an-open-state-of-the-art-software-development-agent|CodeAct 2.1 Blog Post))
  * **November 2025**: V1 SDK architectural overhaul for scalability, modularity, and MCP alignment((https://arxiv.org/abs/2511.03690|OpenHands V1 Architecture Paper (arXiv:2511.03690)))
  * **February 2026**: support for advanced models including GPT-5-[[codex|Codex]]; demonstrations of large-scale automated refactoring
  * **Ongoing**: AMD integration for local developer use; multi-agent parallel refactors via dependency graphs

===== Getting Started =====

<code python>
# Install and run OpenHands
# Illustrative sketch only: the package name, the Agent constructor arguments,
# and the resolve() method are simplified and not verified against the
# published SDK; consult the official documentation for the exact API.
# pip install openhands
from openhands import Agent

agent = Agent(
    model="anthropic/claude-sonnet-4-20250514",
    workspace="/path/to/project",
    sandbox="docker",
)

# Agent autonomously resolves a GitHub issue
result = agent.resolve(
    task="Fix the TypeError in utils.py when input is None",
    repo="https://github.com/example/project",
)
print(result.summary)
</code>

===== See Also =====

  * [[opendev|OpenDev]]
  * [[open_claws|Open Claws]]
  * [[openagents|OpenAgents: An Open Platform for Language Agents in the Wild]]
  * [[open_interpreter|Open Interpreter: Local AI Agent & Terminal AI Agent]]
  * [[openrouter|OpenRouter]]

===== References =====