Core Concepts
Reasoning Techniques
Memory Systems
Retrieval
Agent Types
Design Patterns
Training & Alignment
Frameworks
Tools & Products
Safety & Governance
Evaluation
Research
Development
Meta
Synthetic Data Generation Agents are agentic AI pipelines that autonomously create high-quality training datasets by decomposing complex data generation tasks into manageable subtasks executed by specialized LLM-based agents. The AgentSynth framework, published as a conference paper at ICLR 2026, demonstrates this approach for generating diverse computer-use task trajectories at scale.
Training capable AI agents requires large volumes of high-quality, diverse task data with corresponding trajectories. Human annotation is expensive (often hundreds of dollars per trajectory) and difficult to scale. Agentic synthetic data generation addresses this by leveraging information asymmetry — the principle that executing a task step-by-step is significantly easier than reasoning about the complete solution at once.
By decomposing generation into forward-execution subtasks, agentic pipelines produce datasets that are simple to create but challenging to solve, providing both training data and discriminative benchmarks.
AgentSynth is a scalable, cost-efficient pipeline for automatically synthesizing task and trajectory datasets for generalist computer-use agents. Developed at UC Berkeley by Jingxu Xie, Dylan Xu, Xuandong Zhao, and Dawn Song.
AgentSynth coordinates six distinct LLM-based agents in a single pipeline, each responsible for a different stage of task proposal, execution, and verification.
A key innovation is precise control over task complexity by varying the number of composed subtasks. Each individual subtask is straightforward, but chaining them creates increasingly challenging long-horizon tasks:
| Difficulty Level | Subtasks | Agent Success Rate |
|---|---|---|
| Level 1 | 1 | 18% |
| Level 2 | 2 | 12% |
| Level 3 | 3 | 8% |
| Level 6 | 6 | 4% |
This steep performance degradation demonstrates the benchmark's discriminative power and highlights substantial room for agent improvement.
A simplified sketch of an agentic synthetic data generation pipeline (illustrative of the approach, not the paper's actual implementation):

```python
from dataclasses import dataclass, field

@dataclass
class SubTask:
    description: str
    tools_required: list[str]
    trajectory: list[dict] = field(default_factory=list)
    verified: bool = False

class AgentSynthPipeline:
    def __init__(self, llm_client, environment):
        self.llm = llm_client
        self.env = environment

    def propose_subtask(self, context: dict) -> SubTask:
        """Ask the LLM to propose a simple task using the available tools."""
        prompt = f"Propose a simple computer task using: {context['available_tools']}"
        response = self.llm.generate(prompt)
        return SubTask(description=response.text, tools_required=response.tools)

    def execute_subtask(self, subtask: SubTask) -> SubTask:
        """Execute the subtask step by step, recording (state, action) pairs."""
        trajectory = []
        state = self.env.reset()
        for _ in range(self.env.max_steps):
            action = self.llm.generate(
                f"Task: {subtask.description}\nState: {state}\nNext action:"
            )
            # Record the state the action was taken in, not the resulting state.
            trajectory.append({"state": state, "action": action})
            state, done = self.env.step(action)
            if done:
                break
        subtask.trajectory = trajectory
        return subtask

    def verify_subtask(self, subtask: SubTask) -> bool:
        """LLM-as-judge check that the trajectory actually completed the task."""
        verification = self.llm.generate(
            f"Did this trajectory complete the task?\n"
            f"Task: {subtask.description}\n"
            f"Trajectory: {subtask.trajectory}"
        )
        subtask.verified = verification.text.lower().startswith("yes")
        return subtask.verified

    def compose_tasks(self, subtasks: list[SubTask], difficulty: int) -> dict:
        """Chain the first `difficulty` subtasks into one long-horizon task."""
        selected = subtasks[:difficulty]
        composed_trajectory = []
        for s in selected:
            composed_trajectory.extend(s.trajectory)
        return {
            "description": " Then, ".join(s.description for s in selected),
            "difficulty": difficulty,
            "trajectory": composed_trajectory,
            "num_subtasks": len(selected),
        }
```
AgentSynth achieves an average cost of $0.60 per trajectory, orders of magnitude cheaper than human annotation. The pipeline generated over 6,000 diverse and realistic tasks, integrated with the OSWorld environment for authentic computer tool interactions.
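The cost gap can be made concrete with a back-of-envelope calculation. The $0.60-per-trajectory figure and the 6,000-task count come from the text; the $300 human-annotation rate is an illustrative assumption within the "hundreds of dollars per trajectory" range mentioned above:

```python
# Back-of-envelope cost comparison for a 6,000-trajectory dataset.
# $300/trajectory for human annotation is an assumed midpoint, not a reported figure.
num_trajectories = 6_000
agentsynth_cost = num_trajectories * 0.60   # $3,600 total
human_cost = num_trajectories * 300         # $1,800,000 total

print(f"AgentSynth:       ${agentsynth_cost:,.0f}")
print(f"Human annotation: ${human_cost:,.0f} ({human_cost / agentsynth_cost:.0f}x more)")
```

Even if human annotation were several times cheaper than the assumed rate, the pipeline would remain two to three orders of magnitude less expensive.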
Other frameworks complement AgentSynth in the synthetic data generation space: