====== Harness (Orchestration Loop) ======

A **harness**, in the context of AI orchestration, is a stateless orchestration layer that manages the execution of agentic workflows by coordinating Large Language Model (LLM) invocations, routing tool calls to the appropriate infrastructure components, and maintaining an event log of all operations. The harness architecture enables fault-tolerant agent execution through persistent session logging: a system can recover from a crash without data loss by replaying the session [[event_stream|event stream]] (([[https://arxiv.org/abs/2210.03629|Yao et al. - ReAct: Synergizing Reasoning and Acting in Language Models (2022)]])).

===== Architectural Components =====

The harness operates as coordinating middleware between three primary layers: the LLM reasoning engine, the execution infrastructure, and the persistent session state. Rather than embedding orchestration logic within the LLM itself, the harness maintains a separation of concerns by handling all non-reasoning operations through an external control plane.

The core function of the harness is to:
  * receive completion requests from an [[agent_loop|agent loop]],
  * forward these requests to Claude or another LLM endpoint,
  * parse tool invocation responses from the model,
  * validate tool calls against the available schemas,
  * route invocations to the appropriate infrastructure handlers (such as API gateways, database connectors, or external services), and
  * record all state transitions as events in a session log.

This event-centric design ensures that every significant state change is persisted before execution proceeds (([[https://arxiv.org/abs/2005.11401|Lewis et al. - Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (2020)]])).

===== Statelessness and Fault Tolerance =====

The **stateless** nature of the harness is fundamental to its resilience properties.
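The orchestration cycle described above can be sketched in Python. This is a minimal illustration, not a reference implementation: the ''SessionLog'', ''harness_cycle'', and handler names are hypothetical, the LLM client is a stand-in callable, and a production harness would persist events to durable storage rather than an in-memory list.

```python
import time
import uuid

class SessionLog:
    """Append-only event log. A real system would back this with a
    database, object store, or distributed log rather than memory."""
    def __init__(self):
        self.events = []

    def append(self, kind, payload):
        # Persist the event BEFORE execution proceeds, so a crash
        # never loses a state transition that already happened.
        event = {"id": str(uuid.uuid4()), "ts": time.time(),
                 "kind": kind, "payload": payload}
        self.events.append(event)
        return event

def harness_cycle(log, llm_complete, tool_handlers, messages):
    """One orchestration cycle: forward the request to the LLM, parse
    tool calls from the response, route them to handlers, and record
    every state transition as an event."""
    log.append("llm_request", {"messages": messages})
    response = llm_complete(messages)
    log.append("llm_response", response)

    for call in response.get("tool_calls", []):
        name, args = call["name"], call["arguments"]
        if name not in tool_handlers:  # validate against available tools
            log.append("tool_result", {"name": name, "ok": False,
                                       "error": "unknown tool"})
            continue
        log.append("tool_invocation", {"name": name, "arguments": args})
        try:
            result = tool_handlers[name](**args)  # route to infrastructure
            log.append("tool_result", {"name": name, "ok": True,
                                       "result": result})
        except Exception as exc:  # capture failures as events too
            log.append("tool_result", {"name": name, "ok": False,
                                       "error": str(exc)})
    return response
```

Note that the log is written before each LLM call and before each tool execution; this ordering is what makes replay-based recovery possible, since every action is recorded before its side effects occur.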
By storing all necessary context in the session log rather than in volatile memory, the orchestration layer can be instantiated, scaled, or recovered without complex state synchronization or distributed consensus mechanisms. When a harness instance fails during execution, a new instance can be started and directed to read from the session log, rebuilding the execution context from the persisted [[event_stream|event stream]].

This design pattern follows principles from event sourcing architectures, where state is derived from a complete historical record of state-modifying events. The session log serves as both an audit trail and a complete reconstruction mechanism, enabling deterministic replay of operations up to the point of failure (([[https://dl.acm.org/doi/10.1145/3394486.3412911|Google Research - Language Models as Zero-Shot Planners for Task Management (2023)]])).

===== Tool Invocation Routing =====

When [[claude|Claude]] generates tool use blocks in its response, the harness is responsible for interpreting these invocations and routing them to the appropriate infrastructure layer. This involves:
  * parsing the tool name,
  * extracting and validating parameters,
  * checking authorization and quota constraints,
  * selecting the correct backend handler based on tool type,
  * executing the tool with timeout and error handling, and
  * capturing the result for return to the LLM in the next turn.

The routing layer abstracts away infrastructure heterogeneity, allowing the LLM to express intentions through a unified tool interface while the harness handles the complexity of multi-system coordination. Different tools may be implemented as HTTP APIs, local functions, message queue workers, or asynchronous batch processes, yet the harness presents a consistent invocation interface (([[https://arxiv.org/abs/2301.00109|Schick et al.
- Toolformer: Language Models Can Teach Themselves to Use Tools (2023)]])).

===== Session State and Recovery =====

The session log captures all events necessary to reconstruct the execution state at any point in time. These events include:
  * LLM requests (with system context and conversation history),
  * LLM responses (with token counts and reasoning steps),
  * tool invocations (with parameters and expected side effects),
  * tool results (with success or failure status), and
  * control events (such as loop termination or error-handling decisions).

Upon recovery from a crash, the harness reads the session log sequentially, reconstructing the conversation history and execution state. Once the log has been fully replayed, execution resumes with the next orchestration cycle. This approach ensures that work is never duplicated and that the system remains consistent across failures. The session log may be stored in databases, object storage systems, or distributed log systems, depending on durability and throughput requirements (([[https://arxiv.org/abs/2305.15334|Sumers et al. - In-Context Learning Creates Task Vectors (2023)]])).

===== Applications and Current Status =====

The harness architecture has emerged as a practical pattern for implementing [[claude|Claude]]-managed agents, where orchestration concerns are cleanly separated from the reasoning capabilities of the language model. This pattern is particularly valuable in production settings where fault tolerance, auditability, and operational visibility are critical requirements.

Current implementations of the harness concept support complex multi-turn agentic workflows, including sequential tool calls, conditional branching based on tool results, iterative refinement loops, and error recovery with automatic retry strategies. The stateless design enables horizontal scaling: multiple harness instances can process different agent sessions concurrently without coordination overhead.
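The replay mechanism described under Session State and Recovery can also be sketched. The event kinds (''llm_request'', ''llm_response'', ''tool_result'') and field names here are illustrative assumptions, not a fixed schema; the point is that conversation state is a pure function of the event stream, so any fresh harness instance can rebuild it deterministically.

```python
def recover_state(events):
    """Deterministically rebuild the conversation history from a
    persisted event stream, so a fresh harness instance can resume
    where a crashed one left off."""
    messages = []
    for event in events:
        kind, payload = event["kind"], event["payload"]
        if kind == "llm_request":
            # Each request carries the full context at that point;
            # the latest snapshot wins.
            messages = list(payload["messages"])
        elif kind == "llm_response":
            messages.append({"role": "assistant",
                             "content": payload.get("text", "")})
        elif kind == "tool_result":
            # Re-attach tool outcomes so the next LLM turn sees them.
            messages.append({"role": "tool", "name": payload["name"],
                             "content": repr(payload.get("result",
                                                         payload.get("error")))})
        # Control events (termination, retries) would be handled here.
    return messages
```

Because replay is a read-only fold over the log, recovery requires no coordination with other harness instances, which is what permits the horizontal scaling noted above.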
===== See Also =====

  * [[stateless_harness_orchestration|Stateless Harness Orchestration]]
  * [[agent_orchestration|Agent Orchestration]]
  * [[agent_harness_design|Agent Harness Design]]
  * [[stateful_vs_stateless_harness|Stateful Harness vs Stateless Harness]]
  * [[minimal_scaffolding_maximal_operational_harness|Minimal Scaffolding, Maximal Operational Harness]]

===== References =====