====== R2R: Production Agentic RAG System ======
R2R (Retrieval to Response) by SciPhi is an open-source, production-ready [[agentic_rag|agentic RAG]] system with over 8,000 GitHub stars, backed by Y Combinator.(([[https://www.ycombinator.com|Y Combinator — startup accelerator backing SciPhi/R2R]]))((https://github.com/SciPhi-AI/R2R)) It provides a complete infrastructure for transforming unstructured data into actionable intelligence through automated [[knowledge_graphs|knowledge graphs]], [[hybrid_search|hybrid search]], and multi-step reasoning, all exposed via a RESTful API.
According to the project's documentation, R2R sustains ingestion throughput above 160,000 tokens per second, which it reports as 2.3x faster than [[llamaindex|LlamaIndex]] and 3.1x faster than [[haystack|Haystack]].((https://r2r-docs.sciphi.ai)) The architecture targets enterprises that process large document collections (tens to hundreds of gigabytes) and need reasoning capabilities beyond simple vector search.
===== Architecture =====
R2R employs a [[modular|modular]], service-oriented architecture with four core pipelines:((https://r2r-docs.sciphi.ai))
* **Ingestion Pipeline**: transforms multimodal data (PDF, TXT, JSON, PNG, MP3) into embeddable documents via native parsing, chunking, and LLM-based entity extraction
* **Embedding Pipeline**: handles text transformation, [[chunking_strategies|chunking strategies]], and vector embedding generation
* **RAG Pipeline**: builds on the embedding pipeline with LLM completions, supporting agentic workflows with tool use
* **Knowledge Graph Pipeline**: builds document-level and collection-level graphs using entity/relationship extraction with community detection
All pipelines are orchestrated through a centralized RESTful API for language-agnostic integration. The system includes built-in user management, collection organization, and conversation tracking with branching support.
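Because every pipeline sits behind the REST API, any HTTP client can drive it without the Python SDK. The sketch below builds (but does not send) a search request using only the standard library. The ''/v3/retrieval/search'' path and the payload field names are assumptions modeled on the SDK call in the code example on this page; check them against the API reference for the deployed version.

```python
import json
import urllib.request

# Assumed values: the base URL matches the local server used in the SDK example;
# the endpoint path and payload keys are illustrative, not confirmed by this page.
BASE_URL = "http://localhost:7272"
SEARCH_PATH = "/v3/retrieval/search"


def build_search_request(query: str, use_hybrid_search: bool = True) -> urllib.request.Request:
    """Build a JSON POST request for a search call without sending it."""
    payload = {
        "query": query,
        "search_settings": {"use_hybrid_search": use_hybrid_search},
    }
    return urllib.request.Request(
        url=BASE_URL + SEARCH_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("What were Q3 revenue trends?")
print(request.get_method(), request.full_url)

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before it reaches the server, and the same JSON body can be reused from any other language.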
===== Code Example =====
from r2r import R2RClient
# Connect to R2R server (launch with: r2r serve --docker)
client = R2RClient(base_url="http://localhost:7272")
# Ingest documents - supports PDF, TXT, JSON, PNG, MP3
client.documents.create(
    file_path="./quarterly_report.pdf",
    metadata={"department": "finance", "quarter": "Q3"}
)
# Ingest from raw text
client.documents.create(
    raw_text="R2R supports multimodal ingestion and knowledge graphs.",
    metadata={"source": "manual"}
)
# Build knowledge graph from ingested documents
client.graphs.build(collection_id="default")
# Perform hybrid search (vector + keyword)
# Search is called through the same retrieval namespace as rag() below
search_results = client.retrieval.search(
    query="What were Q3 revenue trends?",
    search_settings={"use_hybrid_search": True}
)
# Agentic RAG with knowledge graph reasoning
rag_response = client.retrieval.rag(
    query="Compare revenue trends across all quarters and identify patterns",
    rag_generation_config={"model": "gpt-4o"},
    search_settings={"use_graph_search": True}
)
print(rag_response.results.generated_answer)
# Deep Research API for multi-step reasoning
research = client.retrieval.deep_research(
    query="What strategic recommendations emerge from our financial data?",
)
print(research.results.answer)  # Citation-backed answer with reasoning logs
===== System Flow =====
%%{init: {"theme": "base", "themeVariables": {"primaryColor": "#27AE60"}}}%%
graph TD
A[Documents: PDF, TXT, PNG, MP3] --> B[Ingestion Pipeline]
B --> C[Parsing + Chunking]
C --> D[Entity Extraction]
D --> E[Vector Embeddings]
D --> F[Knowledge Graph]
E --> G[(Vector Store)]
F --> H[(Graph Database)]
I[User Query] --> J[REST API]
J --> K{Search Strategy}
K -->|Vector| G
K -->|Keyword| L[Full-Text Index]
K -->|Graph| H
G --> M[Hybrid Results Fusion]
L --> M
H --> M
M --> N[LLM Generation]
N --> O[Citation-Backed Answer]
J --> P[Deep Research API]
P --> Q[Multi-Step Reasoning]
Q --> R[Tool-Based Search Loops]
R --> O
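The "Hybrid Results Fusion" step in the diagram merges ranked lists from the vector, keyword, and graph branches. This page does not say which fusion method R2R applies, so the sketch below uses Reciprocal Rank Fusion (RRF), a common choice for this job, purely as an illustration. The function name, the constant ''k = 60'', and the document IDs are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. A larger k flattens the advantage of top ranks.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the three retrieval branches.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d"]
graph_hits = ["doc_b", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits, graph_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Here ''doc_b'' ranks first because it appears in all three lists. RRF only needs rank positions, so it avoids having to normalize vector similarity scores against keyword scores.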
===== Key Features =====
* **[[hybrid_search|Hybrid Search]]**: combines semantic vector search, keyword search, and HyDE (Hypothetical Document [[embeddings|Embeddings]])
* **[[knowledge_graphs|Knowledge Graphs]]**: automated entity/relationship extraction with community detection at document and collection levels
* **Deep Research API**: multi-step agentic reasoning with citation-backed answers and reasoning transparency
* **Multimodal Ingestion**: native support for PDF, TXT, JSON, PNG, MP3 without external tools
* **RESTful API**: complete API coverage for all operations, with Python and JavaScript SDKs
* **R2R Dashboard**: open-source React/Next.js UI for document management, playground, and analytics
* **Enterprise-Grade**: user management, collection-based access control, observability, and audit logs
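To make the knowledge graph feature more concrete, the sketch below merges extracted (subject, relation, object) triples into an undirected graph and groups related entities. It uses connected components as a deliberately simple stand-in for community detection; this page does not name the algorithm R2R uses, and production systems typically rely on modularity-based methods instead. The triples and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical triples, as an LLM extraction step might emit them.
triples = [
    ("Acme Corp", "reported", "Q3 revenue"),
    ("Q3 revenue", "compared_with", "Q2 revenue"),
    ("Jane Doe", "leads", "Finance team"),
]


def build_adjacency(triples):
    """Build an undirected adjacency map from (subject, relation, object) triples."""
    adjacency = defaultdict(set)
    for subject, _relation, obj in triples:
        adjacency[subject].add(obj)
        adjacency[obj].add(subject)
    return adjacency


def connected_components(adjacency):
    """Group entities that are reachable from one another (iterative DFS)."""
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(sorted(component))
    return components


groups = connected_components(build_adjacency(triples))
for group in groups:
    print(group)
```

Each resulting group could then be summarized by an LLM so that collection-level questions are answered from group summaries instead of individual chunks.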
===== See Also =====
* [[ragas|RAGAS: RAG Evaluation Framework]]
* [[agentic_rag|Agentic RAG]]
* [[xagent|XAgent: Autonomous LLM Agent for Complex Tasks]]
* [[rag_framework_comparison|RAG Framework Comparison]]
* [[rag_system_production_deployment|RAG System Production Deployment]]
===== References =====
* [[https://github.com/SciPhi-AI/R2R|R2R GitHub Repository]]
* [[https://r2r-docs.sciphi.ai|Official Documentation]]
* [[https://github.com/SciPhi-AI/R2R-Application|R2R Dashboard Repository]] (open-source React/Next.js management dashboard)
* [[https://www.ycombinator.com|Y Combinator (Backer)]]