====== Deep Search Agents ======
LLM-based Deep Search Agents represent a paradigm shift from static retrieval-augmented generation toward autonomous, multi-step information seeking with dynamic planning. As surveyed by Xi et al. (2025), these agents comprehend user intentions, execute multi-turn retrieval across diverse sources, and adaptively refine their search strategies -- extending capabilities far beyond traditional web search or single-pass RAG systems. OpenAI's Deep Research exemplifies this paradigm in practice.
<code mermaid>
graph TD
    Q[Query] --> PLAN[Plan Search Strategy]
    PLAN --> S1[Search Round 1]
    S1 --> EVAL{Evaluate Results}
    EVAL -->|Insufficient| REF[Refine Query]
    REF --> S2[Search Round N]
    S2 --> EVAL
    EVAL -->|Sufficient| SYN[Synthesize Report]
    SYN --> ANS[Final Report]
</code>
===== Background =====
The evolution of search follows three stages:
- **Traditional Web Search** -- Users manually select and consolidate results from ranked document lists
- **LLM-Enhanced Search** -- LLMs rewrite queries or summarize results in a single pass (basic RAG)
- **Search Agents** -- Autonomous agents control the entire search process with adaptive reasoning and dynamic retrieval
The critical limitation of LLM-enhanced search is its static, single-turn nature. Complex queries requiring multi-hop reasoning, cross-source synthesis, or iterative refinement cannot be handled by retrieve-once-then-generate pipelines.
===== Architecture =====
A deep search agent is formally defined as an LLM agent capable of:
* **Intent comprehension** -- Understanding the full scope of user information needs
* **Dynamic planning** -- Generating and revising multi-step search plans based on intermediate results
* **Multi-source retrieval** -- Searching across web, databases, APIs, private knowledge bases, and internal memory
* **Adaptive reasoning** -- Evaluating retrieval quality and adjusting strategy in real time
The agent operates as a sequential decision process. At each step $t$, the agent observes state $s_t$ (accumulated evidence and search history) and selects action $a_t$ from the action space:
$$a_t = \pi(s_t) \in \{\text{search}(q), \text{refine}(q), \text{synthesize}, \text{verify}, \text{terminate}\}$$
The policy $\pi$ is implemented by the LLM backbone, conditioned on the full trajectory $\tau_t = (s_0, a_0, \ldots, s_t)$.
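This decision loop can be sketched in Python; `Action`, `Step`, and `rollout` are illustrative names, with the LLM backbone abstracted as any callable policy:

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

class Action(Enum):
    """The action space {search, refine, synthesize, verify, terminate}."""
    SEARCH = auto()
    REFINE = auto()
    SYNTHESIZE = auto()
    VERIFY = auto()
    TERMINATE = auto()

@dataclass
class Step:
    action: Action
    argument: Optional[str] = None  # e.g. the query q for SEARCH / REFINE

def rollout(policy, state, max_steps=10):
    """Run the agent as a sequential decision process.

    `policy` maps (state, trajectory-so-far) to the next (Action, argument)
    pair; in a real agent the LLM backbone plays this role.
    """
    trajectory = []
    for _ in range(max_steps):
        action, arg = policy(state, trajectory)
        trajectory.append(Step(action, arg))
        if action == Action.TERMINATE:
            break
    return trajectory
```

A policy that issues one search and then terminates yields a two-step trajectory, mirroring the trajectory $\tau_t$ above.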
===== Dynamic Planning =====
Unlike static pipelines, deep search agents employ planning with revision. Given a complex query $Q$, the agent generates an initial plan:
$$P_0 = \text{Decompose}(Q) = \{(q_1, \text{src}_1), \ldots, (q_k, \text{src}_k)\}$$
After executing sub-query $q_i$ and receiving documents $D_i$, the agent evaluates sufficiency:
$$\text{sufficient}(D_{1:i}, Q) = \begin{cases} \text{true} & \text{if } \text{coverage}(D_{1:i}, Q) \geq \theta \\ \text{false} & \text{otherwise} \end{cases}$$
On insufficiency, the agent revises the remaining plan: $P_{i+1} = \text{Revise}(P_i, D_{1:i}, Q)$, potentially adding new sub-queries, switching sources, or reformulating failed queries.
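A minimal sketch of the sufficiency check and revision step, assuming a crude lexical stand-in for $\text{coverage}$ (a production agent would use LLM judgment or embedding similarity; all names here are illustrative):

```python
def coverage(docs, query_terms):
    """Fraction of required query terms found in the gathered documents;
    a lexical stand-in for coverage(D_{1:i}, Q)."""
    text = " ".join(docs).lower()
    hits = sum(1 for t in query_terms if t.lower() in text)
    return hits / len(query_terms) if query_terms else 1.0

def sufficient(docs, query_terms, theta=0.8):
    """The case split above: true iff coverage(D_{1:i}, Q) >= theta."""
    return coverage(docs, query_terms) >= theta

def revise(plan, docs):
    """Revise(P_i, D_{1:i}, Q): drop sub-queries already answered by the
    evidence, keeping the rest for further search rounds."""
    text = " ".join(docs).lower()
    return [(q, src) for q, src in plan if q.lower() not in text]
```

With evidence covering only two of three required terms, `sufficient` fails at the default threshold and `revise` retains only the unanswered sub-query.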
===== Multi-Hop Traversal =====
Deep search agents build knowledge chains through multi-hop traversal. Each hop uses results from previous hops as context:
$$q_{i+1} = \text{Generate}(Q, D_{1:i}, \text{gap}(D_{1:i}, Q))$$
where $\text{gap}(D_{1:i}, Q)$ identifies the information still needed to fully answer $Q$. This enables complex reasoning chains such as entity identification, then relation lookup, then temporal filtering.
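The hop chain can be illustrated with a toy knowledge base; `kb`, the slot-based `gen`, and `multi_hop` are hypothetical stand-ins for the $\text{gap}$ and $\text{Generate}$ functions above:

```python
def multi_hop(kb, required, generate_query, max_hops=5):
    """Build a knowledge chain: each hop issues a query generated from
    the remaining gap, conditioned on evidence from earlier hops."""
    evidence = {}
    for _ in range(max_hops):
        missing = [s for s in required if s not in evidence]  # gap(D_{1:i}, Q)
        if not missing:
            break
        query = generate_query(missing[0], evidence)  # q_{i+1}
        evidence[missing[0]] = kb[query]              # retrieve for this hop
    return evidence

# Toy two-hop chain: identify an entity, then look up a relation on it.
kb = {
    "director of Inception": "Christopher Nolan",
    "birth year of Christopher Nolan": "1970",
}

def gen(slot, evidence):
    if slot == "director":
        return "director of Inception"
    # hop 2 reuses the hop-1 answer as context
    return f"birth year of {evidence['director']}"
```

The second query cannot even be formed until the first hop resolves the entity, which is exactly the dependency that retrieve-once pipelines miss.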
===== Code Example =====
<code python>
from dataclasses import dataclass, field

@dataclass
class SearchState:
    query: str
    evidence: list = field(default_factory=list)
    plan: list = field(default_factory=list)      # pending (sub_query, source) pairs
    history: list = field(default_factory=list)   # executed (sub_query, source, docs)

class DeepSearchAgent:
    def __init__(self, llm, tools):
        self.llm = llm
        self.tools = tools  # {"web": web_search, "db": db_query, ...}

    def search(self, query, max_hops=10):
        state = SearchState(query=query)
        state.plan = self.decompose(query)
        for _hop in range(max_hops):
            if not state.plan:
                break
            sub_query, source = state.plan.pop(0)
            docs = self.tools[source](sub_query)
            state.evidence.extend(docs)
            state.history.append((sub_query, source, docs))
            if self.is_sufficient(state):
                break
            state.plan = self.revise_plan(state)
        return self.synthesize(state)

    def decompose(self, query):
        # The backbone is expected to return a parsed list of
        # (sub_query, source) pairs, not raw text.
        return self.llm.generate(
            f"Decompose into sub-queries with sources: {query}"
        )

    def is_sufficient(self, state):
        # LLM-judged coverage score in [0, 1]; 0.8 is the threshold theta.
        score = self.llm.evaluate(state.query, state.evidence)
        return score >= 0.8

    def revise_plan(self, state):
        # Likewise expected to return (sub_query, source) pairs.
        return self.llm.generate(f"Revise search plan given gaps: {state}")

    def synthesize(self, state):
        return self.llm.generate(
            f"Synthesize answer from evidence: {state.evidence}"
        )
</code>
===== Taxonomy of Approaches =====
The survey categorizes deep search agent architectures along several dimensions:
^ Dimension ^ Variants ^
| Planning | Static decomposition, dynamic revision, hierarchical |
| Retrieval | Single-source, multi-source, tool-augmented |
| Reasoning | Single-agent, dual-agent (Reasoner-Purifier), multi-agent |
| Optimization | SFT, reinforcement learning, hybrid SFT+RL |
| Evaluation | Single-hop QA, multi-hop QA, open-ended research |
===== Applications =====
* **Academic Research** -- Automated literature review and evidence synthesis
* **Market Intelligence** -- Multi-source competitive analysis with verification
* **Investigative Journalism** -- Cross-referencing claims across public records
* **Technical Support** -- Deep diagnostic search across documentation and logs
===== References =====
* [[https://arxiv.org/abs/2508.05668|Xi et al. (2025) -- A Survey of LLM-based Deep Search Agents: Paradigm, Optimization, Evaluation, and Challenges]]
* [[https://github.com/YunjiaXi/Awesome-Search-Agent-Papers|Awesome Search Agent Papers Repository]]
* [[https://www.anthropic.com/engineering/multi-agent-research-system|Anthropic Multi-Agent Research System Architecture]]
===== See Also =====
* [[agentic_rag]] -- Agentic retrieval-augmented generation
* [[mcts_llm_reasoning]] -- Tree search for deliberative reasoning
* [[task_decomposition_strategies]] -- Formal decomposition of complex tasks