OpenHands

OpenHands (formerly OpenDevin) is an open-source autonomous software engineering platform in which AI agents write, debug, test, and refactor code through executable actions. Built on the CodeAct paradigm, it is an open-source alternative to proprietary coding agents such as Devin, and its CodeAct 2.1 agent reported the top open-source results on SWE-Bench at the time of its release.

Overview

OpenHands originated as OpenDevin, an open-source effort to replicate and surpass the capabilities demonstrated by Cognition's proprietary Devin agent. The project rebranded to OpenHands and evolved into a comprehensive Software Agent SDK. Hosted at github.com/All-Hands-AI/OpenHands, it provides a modular framework for building autonomous software engineering agents that can interact with codebases, terminals, and browsers.

The platform supports multiple LLM backends, including Anthropic Claude, OpenAI GPT models, and open-source alternatives, so an agent is not tied to a single model provider.

CodeAct Paradigm

CodeAct is the core paradigm underlying OpenHands. Rather than generating natural language plans that must be parsed and executed, CodeAct agents directly generate and execute Python and bash commands. This approach provides several advantages:

Direct execution — actions are code, eliminating translation errors
Rich feedback — execution output provides concrete signals for self-correction
Tool composability — any command-line tool or Python library becomes available
Reproducibility — executed code serves as an auditable trace

CodeAct 2.1 (November 2024) introduced function calling for precise tool specification, integrated Anthropic Claude 3.5 Sonnet as the default model, and fixed issues with directory navigation and looping behaviors.

# CodeAct paradigm: agent generates executable code as actions
# Rather than: "I should look at the file structure"
# The agent directly executes:
 
import os
 
# Explore repository structure
for root, dirs, files in os.walk("/workspace/project", topdown=True):
    dirs[:] = [d for d in dirs if d not in [".git", "node_modules", "__pycache__"]]
    level = root.replace("/workspace/project", "").count(os.sep)
    indent = " " * 2 * level
    print(f"{indent}{os.path.basename(root)}/")
    if level < 2:
        for file in files:
            print(f"{indent}  {file}")
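
The feedback loop behind CodeAct (code as the action, execution output as the observation) can be sketched in a few lines. This is a minimal illustration rather than OpenHands' implementation: `Observation` and `run_python_action` are hypothetical names, and the snippet runs in a plain local subprocess with no sandboxing.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Observation:
    """Result of executing one code action (hypothetical type for illustration)."""
    exit_code: int
    stdout: str
    stderr: str


def run_python_action(code: str, timeout: float = 10.0) -> Observation:
    """Execute a Python snippet in a fresh interpreter and capture its output."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return Observation(proc.returncode, proc.stdout, proc.stderr)


# A successful action yields output the agent can read directly.
ok = run_python_action("print(2 + 2)")

# A failing action yields a traceback, which is the signal for self-correction.
bad = run_python_action("int('not a number')")
```

In a real agent loop, the `Observation` would be appended to the conversation so the model can decide on its next action.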

Architecture

OpenHands evolved from a monolithic V0 design to V1, a modular Software Agent SDK with several architectural innovations:

Event-Sourced State Management

All agent actions and observations are recorded as immutable events, enabling deterministic replay, pause/resume, and debugging. The event-driven execution loop supports incremental steps, security checks, and native handling of model reasoning content such as Anthropic's ThinkingBlock.
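
A minimal sketch of the event-sourcing idea, assuming a simplified schema: the `Event` fields, the `EventLog` class, and the `replay` function are hypothetical and do not reflect the actual OpenHands event types. The point it illustrates is that state is derived from the log, so replaying a prefix reproduces the state at that step.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Event:
    """One immutable record in the log (hypothetical schema)."""
    seq: int
    kind: str      # "action" or "observation"
    payload: str


class EventLog:
    """Append-only log; state is never stored, only derived by replay."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, kind: str, payload: str) -> Event:
        event = Event(seq=len(self._events), kind=kind, payload=payload)
        self._events.append(event)
        return event

    def events(self) -> Tuple[Event, ...]:
        return tuple(self._events)


def replay(events: Iterable[Event]) -> dict:
    """Rebuild derived state deterministically from the event sequence."""
    state = {"steps": 0, "last_observation": None}
    for event in events:
        if event.kind == "action":
            state["steps"] += 1
        elif event.kind == "observation":
            state["last_observation"] = event.payload
    return state


log = EventLog()
log.append("action", "ls /workspace")
log.append("observation", "README.md src/")
log.append("action", "cat README.md")

# Replaying a prefix of the log reconstructs the state at that point,
# which is what makes pause/resume and step-by-step debugging possible.
state_full = replay(log.events())
state_prefix = replay(log.events()[:2])
```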

Sandboxed Execution

Agents can run in isolated sandbox environments (Docker containers) that prevent unintended side effects on the host system. In the V1 architecture sandboxing is opt-in: by default the agent and its tools run in a single unified process, an arrangement the project describes as aligned with the Model Context Protocol (MCP).
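
To make the container isolation concrete, the sketch below builds a `docker run` command line that mounts only the project directory and disables networking. It is an assumption-laden illustration, not how OpenHands launches its sandboxes: `build_sandbox_command` is a hypothetical helper and the `python:3.12-slim` image is chosen only for the example.

```python
import shlex
from typing import List


def build_sandbox_command(
    workspace: str,
    command: str,
    image: str = "python:3.12-slim",   # assumed image, chosen for illustration
    network: bool = False,
) -> List[str]:
    """Return the argv for running `command` in a throwaway container."""
    argv = [
        "docker", "run",
        "--rm",                              # remove the container when it exits
        "-v", f"{workspace}:/workspace",     # mount only the project directory
        "-w", "/workspace",                  # start in the mounted workspace
    ]
    if not network:
        argv += ["--network", "none"]        # no network access unless requested
    argv += [image, "bash", "-c", command]
    return argv


argv = build_sandbox_command("/path/to/project", "ls -la")
print(shlex.join(argv))
```

On a machine with Docker installed, the returned list can be passed to `subprocess.run(argv)`; building it as a list rather than a shell string avoids quoting problems in the agent-supplied command.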

Multi-Agent Workflows

OpenHands supports parallel multi-agent execution for large-scale tasks. For example, Angular-to-React migrations can be decomposed using dependency trees, with separate agents handling independent components simultaneously in non-interfering cloud sandboxes.
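
The dependency-tree decomposition can be sketched with Python's standard `graphlib` module: components whose dependencies are already migrated form a batch that can be handled in parallel. The component names and the `run_in_dependency_order` helper are hypothetical, and the `worker` callable stands in for dispatching a real agent to a sandbox.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def run_in_dependency_order(
    deps: Dict[str, Set[str]],
    worker: Callable[[str], str],
    max_workers: int = 4,
) -> List[List[str]]:
    """Run `worker` on every node, batching nodes whose dependencies are done.

    Returns the batches in execution order; nodes inside one batch ran in parallel.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()                      # raises CycleError on circular dependencies
    batches: List[List[str]] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())   # all currently unblocked nodes
            list(pool.map(worker, ready))        # one "agent" per component
            sorter.done(*ready)
            batches.append(ready)
    return batches


# Hypothetical component graph: each key depends on the components in its set.
component_deps = {
    "Button": set(),
    "Icon": set(),
    "Toolbar": {"Button", "Icon"},
    "App": {"Toolbar"},
}

batches = run_in_dependency_order(component_deps, lambda name: f"migrated {name}")
print(batches)  # [['Button', 'Icon'], ['Toolbar'], ['App']]
```

Leaf components are migrated first and in parallel; a component is only scheduled once everything it depends on has finished.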

Composable Interfaces

A two-layer composability model decouples research from production, supporting CLI, Web UI, and GitHub app interfaces. This enables both rapid prototyping and enterprise deployment.

Benchmarks

OpenHands CodeAct 2.1 achieved state-of-the-art open-source results on SWE-Bench:

Benchmark	Score	Significance
SWE-Bench Verified	53% resolve rate	Top open-source agent
SWE-Bench Lite	41.7% resolve rate	Surpassed prior SOTA

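These results suggest that open-source agents can be competitive with proprietary alternatives on real-world software engineering tasks, although a direct comparison is limited where proprietary systems do not publish scores under the same conditions.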

Comparison with Alternatives

Aspect	OpenHands	Devin	SWE-Agent
Open Source	Yes, fully reproducible	No (closed, proprietary)	Yes
Architecture	Modular SDK, multi-agent, event-driven	Single-agent, proprietary sandbox	Simpler agent loop
SWE-Bench Verified	53% (top open)	Proprietary scores	Competitive but lower
Multi-Agent	Yes, parallel execution	No	No
LLM Flexibility	Any LLM backend	Fixed model	Limited backends
Deployment	CLI, Web UI, GitHub App	Cloud only	CLI primarily

Key Features

Typed tools with Pydantic validation — separates action specification from execution for distributed use
Native reasoning support — integrates extended thinking from Claude, reasoning fields from OpenAI
GitHub integration — operates as a GitHub App for PR-based workflows
Micro-agents — specialized sub-agents for task decomposition within larger projects
Cloud sandboxes — isolated environments for parallel agent execution
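
The separation of action specification from execution in the typed-tools feature can be illustrated as follows. This sketch uses standard-library dataclasses as a stand-in for Pydantic models so that it runs without extra dependencies; `ReadFileAction` and `execute_read_file` are hypothetical names, not OpenHands tool definitions.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class ReadFileAction:
    """Specification of a tool call: data only, no behaviour (hypothetical schema)."""
    path: str
    max_bytes: int = 4096

    def __post_init__(self) -> None:
        # Validate at construction time, before anything is executed.
        if not isinstance(self.path, str) or not self.path:
            raise ValueError("path must be a non-empty string")
        if not isinstance(self.max_bytes, int) or self.max_bytes <= 0:
            raise ValueError("max_bytes must be a positive integer")


def execute_read_file(action: ReadFileAction) -> str:
    """Executor: the only place that touches the filesystem."""
    with open(action.path, "rb") as handle:
        return handle.read(action.max_bytes).decode("utf-8", errors="replace")


# The validated action serializes to JSON, so it can be sent to a remote executor
# and reconstructed (and re-validated) on the other side.
wire = json.dumps(asdict(ReadFileAction(path="README.md", max_bytes=100)))
restored = ReadFileAction(**json.loads(wire))
```

Because the action is plain validated data, the process that decides what to do and the process that executes it do not need to be the same, which is what makes distributed use possible.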

Recent Developments

November 2024: CodeAct 2.1 release with Claude 3.5 Sonnet integration and top open-source SWE-Bench scores
November 2025 — V1 SDK architectural overhaul for scalability, modularity, and MCP alignment
February 2026 — Support for advanced models including GPT-5-Codex; demonstrated massive automated refactoring capabilities
Ongoing — AMD integration for local developer use; multi-agent parallel refactors via dependency graphs

Getting Started

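# Illustrative sketch of the intended workflow. The Agent class, its
# constructor arguments, and resolve() are simplified placeholders and may not
# match the released package; consult the OpenHands documentation for the
# current installation steps and SDK entry points.
# pip install openhands-ai

from openhands import Agent

agent = Agent(
    model="anthropic/claude-sonnet-4-20250514",
    workspace="/path/to/project",
    sandbox="docker",
)

# Agent autonomously resolves a GitHub issue
result = agent.resolve(
    task="Fix the TypeError in utils.py when input is None",
    repo="https://github.com/example/project",
)
print(result.summary)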
