Agentic Retrieval-Augmented Generation (Agentic RAG) is an advanced paradigm that integrates autonomous AI agents into the retrieval-augmented generation pipeline. Unlike basic RAG, which follows a static retrieve-then-generate pattern, Agentic RAG employs agents capable of autonomous decision-making, iterative refinement, and dynamic workflow orchestration to handle complex, multi-step information needs.
Traditional RAG systems retrieve documents from a knowledge base and concatenate them into the LLM context for generation. This single-pass approach fails on tasks requiring multi-hop reasoning, adaptive query strategies, or cross-source synthesis. Agentic RAG addresses these limitations by embedding agent capabilities directly into the retrieval loop.
The key insight is that retrieval should be treated as an agentic action rather than a static preprocessing step. The agent decides when to retrieve, what to retrieve, and how to reformulate queries based on intermediate reasoning.
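To make the "retrieval as an action" idea concrete, here is a minimal sketch of an agent loop in which a policy picks the next action at every step. Everything in it is an assumption for illustration: `KNOWN_ANSWERS`, `CORPUS`, `toy_retrieve`, `toy_policy`, and `run_agent` are hypothetical names, and the rule-based policy stands in for the LLM's decision.

```python
from typing import Callable, Dict, List, Optional, Tuple

Action = Tuple[str, Optional[str]]  # ("retrieve", query) or ("answer", None)

# Hypothetical stand-ins: facts the model already "knows" and a tiny corpus.
KNOWN_ANSWERS: Dict[str, str] = {"what is 2 + 2": "4"}
CORPUS: List[str] = [
    "Agentic RAG embeds agents in the retrieval loop.",
    "Basic RAG retrieves once and then generates.",
]


def toy_retrieve(query: str) -> List[str]:
    """Return corpus entries that share at least one lowercase token with the query."""
    tokens = set(query.lower().split())
    return [doc for doc in CORPUS if tokens & set(doc.lower().rstrip(".").split())]


def toy_policy(query: str, evidence: List[str]) -> Action:
    """Rule-based stand-in for the LLM's choice of the next action."""
    if query in KNOWN_ANSWERS:   # when: no retrieval is needed at all
        return ("answer", None)
    if not evidence:             # what: retrieve for the query itself
        return ("retrieve", query)
    return ("answer", None)      # evidence gathered, stop retrieving


def run_agent(
    query: str,
    policy: Callable[[str, List[str]], Action],
    retrieve: Callable[[str], List[str]],
    max_steps: int = 5,
) -> Tuple[List[str], List[Action]]:
    """Let the policy choose an action each step; retrieval is just one of them."""
    evidence: List[str] = []
    trace: List[Action] = []
    for _ in range(max_steps):
        action = policy(query, evidence)
        trace.append(action)
        kind, arg = action
        if kind == "answer":
            break
        for doc in retrieve(arg):
            if doc not in evidence:
                evidence.append(doc)
    return evidence, trace


evidence, trace = run_agent("how does agentic rag work", toy_policy, toy_retrieve)
print(trace)
```

In a static pipeline the `retrieve` call would run exactly once before generation. Here the query `"what is 2 + 2"` triggers no retrieval at all, while the second query triggers one retrieval step before the agent answers.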
Singh et al. (2025) propose a comprehensive taxonomy of Agentic RAG architectures organized around four core agentic design patterns: reflection, planning, tool use, and multi-agent collaboration.
The taxonomy classifies systems into six categories: single-agent RAG, multi-agent RAG, hierarchical RAG, corrective RAG, adaptive RAG, and graph-based RAG.
In Agentic RAG, the agent autonomously plans retrieval strategies based on query complexity. Given a user query $q$, the agent generates a retrieval plan $P = \{(q_1, s_1), (q_2, s_2), \ldots, (q_n, s_n)\}$ where each $q_i$ is a sub-query and $s_i$ is the selected retrieval source.
The planning process uses a reward signal $R(P, q)$ that estimates plan quality:
$$R(P, q) = \sum_{i=1}^{n} \alpha_i \cdot \text{rel}(q_i, q) \cdot \text{cov}(D_i, q)$$
where $\text{rel}(q_i, q)$ measures sub-query relevance to the original query, $\text{cov}(D_i, q)$ measures coverage of retrieved documents $D_i$, and $\alpha_i$ are learned weighting coefficients.
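As a rough illustration of the reward above, the sketch below scores a two-step plan. The text does not specify how $\text{rel}$ and $\text{cov}$ are computed, so the token-overlap versions here are assumptions, as are the names `PlanStep` and `plan_reward` and the fixed $\alpha_i$ values (which would be learned in practice).

```python
from dataclasses import dataclass
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Lowercase whitespace tokenization; a deliberately crude stand-in."""
    return set(text.lower().split())


def rel(sub_query: str, query: str) -> float:
    """Assumed relevance: Jaccard overlap between sub-query and query tokens."""
    a, b = tokens(sub_query), tokens(query)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cov(docs: List[str], query: str) -> float:
    """Assumed coverage: fraction of query tokens found in the retrieved docs."""
    q = tokens(query)
    if not q:
        return 0.0
    seen: Set[str] = set()
    for doc in docs:
        seen |= tokens(doc)
    return len(q & seen) / len(q)


@dataclass
class PlanStep:
    sub_query: str      # q_i
    source: str         # s_i, the selected retrieval source
    docs: List[str]     # D_i, documents retrieved for this step
    alpha: float = 1.0  # alpha_i, fixed here for illustration


def plan_reward(plan: List[PlanStep], query: str) -> float:
    """R(P, q) = sum_i alpha_i * rel(q_i, q) * cov(D_i, q)."""
    return sum(
        step.alpha * rel(step.sub_query, query) * cov(step.docs, query)
        for step in plan
    )


query = "agentic rag planning"
plan = [
    PlanStep("agentic rag", "wiki", ["agentic rag uses agents"], alpha=0.5),
    PlanStep("planning", "papers", ["planning selects sources"], alpha=0.5),
]
print(plan_reward(plan, query))
```

With these toy definitions the first step contributes $0.5 \cdot \tfrac{2}{3} \cdot \tfrac{2}{3}$ and the second $0.5 \cdot \tfrac{1}{3} \cdot \tfrac{1}{3}$, for a total of $\tfrac{5}{18}$. A real system would more likely use embedding similarity or an LLM judge for both terms.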
Unlike static query expansion, Agentic RAG reformulates queries iteratively using feedback from previous retrieval rounds. The agent maintains a belief state $b_t$ at each step $t$:
$$b_{t+1} = \text{Update}(b_t, D_t, \text{Eval}(D_t, q))$$
If the evaluation function $\text{Eval}(D_t, q)$ indicates insufficient coverage or relevance, the agent generates a reformulated query $q_{t+1}$ conditioned on both the original query and the gap analysis.
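The update rule can be sketched as a small loop. This is only one possible reading of it: the belief state is assumed to hold the documents gathered so far plus the query terms still uncovered, $\text{Eval}$ is assumed to be a token-coverage check, and `Belief`, `Evaluation`, `evaluate`, `update`, `reformulate`, `top1_retrieve`, and `iterative_retrieve` are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple

# Toy corpus; contents are illustrative only.
CORPUS: List[str] = [
    "corrective rag grades retrieved documents",
    "adaptive rag routes queries by complexity",
]


@dataclass(frozen=True)
class Belief:                     # b_t
    docs: Tuple[str, ...] = ()
    missing: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Evaluation:                 # Eval(D_t, q)
    coverage: float
    gap: FrozenSet[str]


def top1_retrieve(query: str) -> List[str]:
    """Return the single best document by token overlap, or nothing on zero overlap."""
    q = set(query.lower().split())
    best = max(CORPUS, key=lambda doc: len(q & set(doc.split())))
    return [best] if q & set(best.split()) else []


def evaluate(docs: List[str], query: str) -> Evaluation:
    """Report which original query terms this round's documents fail to cover."""
    q_terms = set(query.lower().split())
    seen = set()
    for doc in docs:
        seen |= set(doc.lower().split())
    gap = frozenset(q_terms - seen)
    coverage = 1.0 - len(gap) / len(q_terms) if q_terms else 0.0
    return Evaluation(coverage, gap)


def update(belief: Belief, docs: List[str], evaluation: Evaluation) -> Belief:
    """b_{t+1} = Update(b_t, D_t, Eval(D_t, q)): merge docs, shrink the gap."""
    new_docs = belief.docs + tuple(d for d in docs if d not in belief.docs)
    return Belief(docs=new_docs, missing=belief.missing & evaluation.gap)


def reformulate(query: str, belief: Belief) -> str:
    """q_{t+1}: the original query restricted to its still-uncovered terms."""
    return " ".join(t for t in query.lower().split() if t in belief.missing)


def iterative_retrieve(
    query: str, retrieve: Callable[[str], List[str]], max_rounds: int = 4
) -> Tuple[Belief, List[str]]:
    belief = Belief(missing=frozenset(query.lower().split()))
    current, history = query, [query]
    for _ in range(max_rounds):
        docs = retrieve(current)
        belief = update(belief, docs, evaluate(docs, query))
        if not belief.missing:    # nothing left to cover, stop
            break
        current = reformulate(query, belief)
        history.append(current)
    return belief, history


belief, history = iterative_retrieve("corrective adaptive rag", top1_retrieve)
print(history)
```

In this toy run the first round retrieves only the corrective-RAG document, the evaluation flags `adaptive` as uncovered, and the second round queries for that gap alone. An LLM-based agent would typically rewrite the query in natural language rather than filter tokens.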
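```python
# Assumes the classic LangChain agent API (AgentExecutor + create_react_agent).
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import Tool
from langchain_core.prompts import PromptTemplate


def build_agentic_rag(llm, retriever, tools):
    """Build an Agentic RAG pipeline with autonomous retrieval planning."""
    # Expose the retriever to the agent as a callable tool.
    retrieval_tool = Tool(
        name="knowledge_retrieval",
        func=retriever.invoke,
        description="Retrieve relevant documents from the knowledge base",
    )
    all_tools = [retrieval_tool] + tools

    # create_react_agent requires the prompt to contain {tools}, {tool_names},
    # and {agent_scratchpad}; the ReAct output format is spelled out as well.
    prompt = PromptTemplate.from_template(
        "You are a retrieval agent. Analyze the query complexity, "
        "plan retrieval steps, and iteratively refine until sufficient "
        "evidence is gathered.\n\n"
        "Available tools:\n{tools}\n\n"
        "Use this format:\n"
        "Thought: reason about what to do next\n"
        "Action: one of [{tool_names}]\n"
        "Action Input: the input to the action\n"
        "Observation: the result of the action\n"
        "... (repeat Thought/Action/Action Input/Observation as needed)\n"
        "Thought: I now have sufficient evidence\n"
        "Final Answer: the answer to the original query\n\n"
        "Query: {input}\n{agent_scratchpad}"
    )

    agent = create_react_agent(llm, all_tools, prompt)
    # max_iterations caps the number of reasoning/retrieval rounds.
    return AgentExecutor(agent=agent, tools=all_tools, max_iterations=10)
```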
Agentic RAG has demonstrated strong results across domains: