====== R2R: Production Agentic RAG System ======
R2R (Retrieval to Response) by SciPhi is an open-source, production-ready agentic RAG system backed by Y Combinator, with over 8,000 GitHub stars. It provides the infrastructure for turning unstructured data into queryable knowledge through automated knowledge graphs, hybrid search, and multi-step reasoning -- all exposed via a RESTful API.
R2R's ingestion throughput exceeds 160,000 tokens per second, which is 2.3x faster than LlamaIndex and 3.1x faster than Haystack. Its architecture is designed for enterprises that process large document collections (tens to hundreds of gigabytes) and need reasoning capabilities beyond simple vector search.
===== Architecture =====
R2R employs a modular, service-oriented architecture with four core pipelines:
* **Ingestion Pipeline** -- Transforms multimodal data (PDF, TXT, JSON, PNG, MP3) into embeddable documents via native parsing, chunking, and entity extraction using LLMs
* **Embedding Pipeline** -- Handles text transformation, chunking strategies, and vector embedding generation
* **RAG Pipeline** -- Extends retrieval over the embedded chunks with LLM completions, supporting agentic workflows with tool use
* **Knowledge Graph Pipeline** -- Builds document-level and collection-level graphs using entity/relationship extraction with community detection
All pipelines are orchestrated through a centralized RESTful API for language-agnostic integration. The system includes built-in user management, collection organization, and conversation tracking with branching support.
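
The chunking step mentioned in the ingestion and embedding pipelines can be illustrated with a small sketch. This is not R2R's chunker: the function name ''chunk_text'', the character-based windows, and the default sizes are assumptions chosen for illustration, and a production system would more likely split on tokens or document structure.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration; not part of the R2R API.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "R2R parses documents, splits them into chunks, and embeds each chunk. " * 3
for index, chunk in enumerate(chunk_text(sample, chunk_size=80, overlap=20)):
    print(index, repr(chunk))
```

The overlap repeats the tail of one chunk at the head of the next, so a sentence that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.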
===== Code Example =====
from r2r import R2RClient

# Connect to the R2R server (launch with: r2r serve --docker)
client = R2RClient(base_url="http://localhost:7272")

# Ingest a file (supported formats include PDF, TXT, JSON, PNG, MP3)
client.documents.create(
    file_path="./quarterly_report.pdf",
    metadata={"department": "finance", "quarter": "Q3"},
)

# Ingest from raw text
client.documents.create(
    raw_text="R2R supports multimodal ingestion and knowledge graphs.",
    metadata={"source": "manual"},
)

# Build a knowledge graph from the ingested documents
client.graphs.build(collection_id="default")

# Perform hybrid search (vector + keyword)
search_results = client.retrieval.search(
    query="What were Q3 revenue trends?",
    search_settings={"use_hybrid_search": True},
)

# Agentic RAG with knowledge graph reasoning
rag_response = client.retrieval.rag(
    query="Compare revenue trends across all quarters and identify patterns",
    rag_generation_config={"model": "gpt-4o"},
    search_settings={"use_graph_search": True},
)
print(rag_response.results.generated_answer)

# Deep Research API for multi-step reasoning
research = client.retrieval.deep_research(
    query="What strategic recommendations emerge from our financial data?",
)
print(research.results.answer)  # Citation-backed answer with reasoning logs
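
Because every operation is exposed over REST, the same search can in principle be issued without the SDK. The sketch below only builds the HTTP request with the Python standard library. The route ''/v3/retrieval/search'' and the payload shape are assumptions inferred from the SDK call names, so check them against the official API reference before relying on them; ''build_search_request'' and ''send'' are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7272"  # same server address as the SDK example


def build_search_request(query: str, use_hybrid: bool = True) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a search call."""
    payload = {
        "query": query,
        "search_settings": {"use_hybrid_search": use_hybrid},
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v3/retrieval/search",  # ASSUMED route, not confirmed by this article
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (requires a running server)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_search_request("What were Q3 revenue trends?")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Keeping request construction separate from ''send'' makes the payload easy to inspect or port to another language, which is the practical benefit of a language-agnostic API.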
===== System Flow =====
%%{init: {"theme": "base", "themeVariables": {"primaryColor": "#27AE60"}}}%%
graph TD
A[Documents: PDF, TXT, PNG, MP3] --> B[Ingestion Pipeline]
B --> C[Parsing + Chunking]
C --> D[Entity Extraction]
D --> E[Vector Embeddings]
D --> F[Knowledge Graph]
E --> G[(Vector Store)]
F --> H[(Graph Database)]
I[User Query] --> J[REST API]
J --> K{Search Strategy}
K -->|Vector| G
K -->|Keyword| L[Full-Text Index]
K -->|Graph| H
G --> M[Hybrid Results Fusion]
L --> M
H --> M
M --> N[LLM Generation]
N --> O[Citation-Backed Answer]
J --> P[Deep Research API]
P --> Q[Multi-Step Reasoning]
Q --> R[Tool-Based Search Loops]
R --> O
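
The "Hybrid Results Fusion" node in the diagram merges ranked lists from the vector, keyword, and graph searches. The article does not say which fusion method R2R uses; the sketch below shows reciprocal rank fusion (RRF), a common choice for this step, purely as an illustration. The function name, the chunk ids, and the constant ''k = 60'' are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked id lists; each appearance adds 1 / (k + rank) to an id's score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the order is deterministic
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Invented result lists standing in for the three search strategies
vector_hits = ["chunk_a", "chunk_b", "chunk_c"]
keyword_hits = ["chunk_b", "chunk_d", "chunk_a"]
graph_hits = ["chunk_b", "chunk_c"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits, graph_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

RRF uses only rank positions, so it does not require the vector, keyword, and graph scores to be on a comparable scale.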
===== Key Features =====
* **Hybrid Search** -- Combines semantic vector search, keyword search, and HyDE (Hypothetical Document Embeddings)
* **Knowledge Graphs** -- Automated entity/relationship extraction with community detection at document and collection levels
* **Deep Research API** -- Multi-step agentic reasoning with citation-backed answers and reasoning transparency
* **Multimodal Ingestion** -- Native support for PDF, TXT, JSON, PNG, MP3 without external tools
* **RESTful API** -- Complete API coverage for all operations with Python and JavaScript SDKs
* **R2R Dashboard** -- Open-source React/Next.js UI for document management, playground, and analytics
* **Enterprise-Grade** -- User management, collection-based access control, observability, and audit logs
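
The multi-step reasoning behind the Deep Research feature can be pictured as a bounded loop that issues search tool calls and accumulates citations together with a reasoning log. The sketch below is a toy model under heavy assumptions: the corpus is invented, the planner that an LLM would normally provide is replaced by a fixed list of sub-questions, and none of the names (''search_tool'', ''ResearchState'', ''research_loop'') come from R2R.

```python
import re
from dataclasses import dataclass, field

# Invented corpus standing in for a real search backend
CORPUS = {
    "doc1": "Q1 revenue grew on subscription sales.",
    "doc2": "Q2 revenue was flat; churn increased.",
    "doc3": "Q3 revenue rose after the pricing change.",
}


def search_tool(query: str) -> list[str]:
    """Return ids of documents that share at least one word with the query."""
    terms = set(re.findall(r"\w+", query.lower()))
    return [
        doc_id
        for doc_id, text in CORPUS.items()
        if terms & set(re.findall(r"\w+", text.lower()))
    ]


@dataclass
class ResearchState:
    pending: list[str]                                   # sub-questions still to search
    citations: list[str] = field(default_factory=list)   # document ids found so far
    log: list[str] = field(default_factory=list)         # human-readable reasoning trace


def research_loop(sub_questions: list[str], max_steps: int = 5) -> ResearchState:
    """Run one search per step until the questions or the step budget run out."""
    state = ResearchState(pending=list(sub_questions))
    for step in range(1, max_steps + 1):
        if not state.pending:
            break
        query = state.pending.pop(0)
        hits = search_tool(query)
        state.log.append(f"step {step}: searched {query!r}, found {hits}")
        for doc_id in hits:
            if doc_id not in state.citations:
                state.citations.append(doc_id)
    return state


result = research_loop(["churn", "pricing"])
print(result.citations)
print("\n".join(result.log))
```

The step budget is what keeps an agentic loop from running indefinitely, and the log is the kind of trace that makes the final answer auditable.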
===== References =====
* [[https://github.com/SciPhi-AI/R2R|R2R GitHub Repository]]
* [[https://r2r-docs.sciphi.ai|Official Documentation]]
* [[https://github.com/SciPhi-AI/R2R-Application|R2R Dashboard Repository]]
* [[https://www.ycombinator.com|Y Combinator (Backer)]]
===== See Also =====
* [[ragas|RAGAS]] -- Evaluate your R2R pipeline with standardized metrics
* [[vanna|Vanna]] -- Text-to-SQL using similar RAG principles
* [[mastra|Mastra]] -- TypeScript agent framework with RAG support