Core Concepts
Reasoning
Memory & Retrieval
Agent Types
Design Patterns
Training & Alignment
Frameworks
Tools
Safety & Security
Evaluation
Meta
LangSmith is a framework-agnostic observability, evaluation, and deployment platform by LangChain for developing, debugging, and deploying AI agents and LLM applications. It provides end-to-end tracing, testing, prompt management, and production monitoring — whether you use LangChain, LlamaIndex, or a custom stack.
LangSmith addresses the core challenge of LLM application development: non-deterministic outputs are hard to debug and optimize. It captures detailed execution traces of every LLM call, chain, agent step, and tool invocation, giving developers full visibility into what their applications are actually doing in production.
The platform is HIPAA, SOC 2 Type 2, and GDPR compliant, making it suitable for regulated enterprise environments.
Key capabilities:
- End-to-end tracing of every LLM call, chain, agent step, and tool invocation
- Dataset-based testing and evaluation
- Prompt management
- Production monitoring (latency, token usage, cost, errors)
- Deployment of agents and LLM applications
LangSmith requires zero code changes for LangChain/LangGraph apps — just set environment variables:
```python
import os

# Enable tracing (works with LangChain/LangGraph automatically)
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "ls-your-api-key"
os.environ["LANGCHAIN_PROJECT"] = "my-project"

# For programmatic access to runs, datasets, and metrics
from langsmith import Client

client = Client()

# List successful production runs with token/cost data
runs = client.list_runs(
    project_name="production-agents",
    execution_order=1,
    error=False,
)
for run in runs:
    print(f"Run: {run.name}, Tokens: {run.total_tokens}")
    print(f"Latency: {run.latency}s, Cost: ${run.total_cost}")

# Create an evaluation dataset from production traces
dataset = client.create_dataset("eval-golden-set")
for run in client.list_runs(project_name="production-agents", limit=50):
    client.create_example(
        inputs=run.inputs,
        outputs=run.outputs,
        dataset_id=dataset.id,
    )
```
For non-LangChain frameworks, use the SDK's @traceable decorator or manual span creation for full instrumentation.
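To make the idea of decorator-based instrumentation concrete, here is a toy, stdlib-only sketch of what such a decorator does under the hood: it records each call's inputs, output, latency, and errors into a trace log. This is an illustration only, not the real LangSmith SDK (the actual `@traceable` sends spans to the LangSmith backend rather than a local list):

```python
import functools
import time

def traceable(func):
    """Toy stand-in for an instrumentation decorator: records
    inputs, output, latency, and errors for each call."""
    trace_log = []

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        span = {"name": func.__name__, "inputs": {"args": args, "kwargs": kwargs}}
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
            span["outputs"] = result
            return result
        except Exception as exc:
            span["error"] = repr(exc)
            raise
        finally:
            span["latency_s"] = time.perf_counter() - start
            trace_log.append(span)

    wrapper.trace_log = trace_log  # expose captured spans for inspection
    return wrapper

@traceable
def summarize(text: str) -> str:
    # Stand-in for an LLM call
    return text[:20] + "..."

summarize("LangSmith captures every step of an agent run.")
print(summarize.trace_log[0]["name"])  # summarize
```

The real SDK follows the same wrap-and-record pattern but nests spans into a run tree and ships them to the platform asynchronously.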
While LangSmith integrates automatically with the LangChain ecosystem, it supports any LLM application:
- @traceable decorator or RunTree API for manual instrumentation

LangSmith's evaluation system supports systematic quality measurement:
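The shape of such an evaluation loop can be sketched in plain Python. This is a conceptual sketch only, not the real LangSmith `evaluate` API; the `exact_match` evaluator, the dataset, and the lookup-table target are made up for illustration:

```python
def exact_match(output: str, reference: str) -> float:
    # Toy evaluator: 1.0 if the output matches the reference exactly
    return 1.0 if output == reference else 0.0

def evaluate(target, dataset, evaluators):
    """Run every example through `target` and score it with each evaluator."""
    results = []
    for example in dataset:
        output = target(example["inputs"])
        scores = {ev.__name__: ev(output, example["reference"]) for ev in evaluators}
        results.append({"inputs": example["inputs"], "output": output, "scores": scores})
    return results

dataset = [
    {"inputs": "2 + 2", "reference": "4"},
    {"inputs": "capital of France", "reference": "Paris"},
]

# Stand-in target: a lookup table instead of an LLM call
answers = {"2 + 2": "4", "capital of France": "Paris"}
results = evaluate(lambda q: answers[q], dataset, [exact_match])
print(sum(r["scores"]["exact_match"] for r in results) / len(results))  # 1.0
```

In LangSmith proper, the dataset lives on the platform (e.g. one built from production traces as in the snippet above), evaluators can be heuristics or LLM-as-judge, and aggregate scores are tracked across experiment runs.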