====== R2R: Production Agentic RAG System ======
R2R (Retrieval to Response) by SciPhi is an open-source, production-ready [[agentic_rag|agentic RAG]] system with over 8,000 GitHub stars, backed by Y Combinator.(([[https://www.ycombinator.com|Y Combinator — startup accelerator backing SciPhi/R2R]]))((https://github.com/SciPhi-AI/R2R)) It provides a complete infrastructure for transforming unstructured data into actionable intelligence through automated [[knowledge_graphs|knowledge graphs]], [[hybrid_search|hybrid search]], and multi-step reasoning, all exposed via a RESTful API.

According to its documentation, R2R ingests over 160,000 tokens per second, which it reports as 2.3x faster than [[llamaindex|LlamaIndex]] and 3.1x faster than [[haystack|Haystack]].((https://r2r-docs.sciphi.ai)) Its architecture targets enterprises that process large document collections (tens to hundreds of gigabytes) and need reasoning capabilities beyond simple vector search.

===== Architecture =====
R2R employs a [[modular|modular]], service-oriented architecture with four core pipelines:((https://r2r-docs.sciphi.ai))

  * **Ingestion Pipeline**: transforms multimodal data (PDF, TXT, JSON, PNG, MP3) into embeddable documents via native parsing, chunking, and LLM-based entity extraction
  * **Embedding Pipeline**: handles text transformation, [[chunking_strategies|chunking strategies]], and vector embedding generation
  * **RAG Pipeline**: builds on the embedding pipeline by adding LLM completions, supporting agentic workflows with tool use
  * **Knowledge Graph Pipeline**: builds document-level and collection-level graphs using entity/relationship extraction with community detection

All pipelines are orchestrated through a centralized RESTful API for language-agnostic integration. The system includes built-in user management, collection organization, and conversation tracking with branching support.
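
The chunking step of the ingestion pipeline can be illustrated with a minimal sketch. This is a generic fixed-size, overlapping character splitter, not R2R's actual implementation; the function name ''chunk_text'' and its parameters are hypothetical and chosen only for illustration.

<code python>
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration; R2R's own chunking
    strategies are configurable and may work differently.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
</code>

Overlap keeps a little context from the end of one chunk at the start of the next, which tends to help when a relevant sentence straddles a chunk boundary.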

===== Code Example =====
<code python>
from r2r import R2RClient

# Connect to R2R server (launch with: r2r serve --docker)
client = R2RClient(base_url="http://localhost:7272")

# Ingest documents - supports PDF, TXT, JSON, PNG, MP3
client.documents.create(
    file_path="./quarterly_report.pdf",
    metadata={"department": "finance", "quarter": "Q3"}
)

# Ingest from raw text
client.documents.create(
    raw_text="R2R supports multimodal ingestion and knowledge graphs.",
    metadata={"source": "manual"}
)

# Build knowledge graph from ingested documents
client.graphs.build(collection_id="default")

# Perform hybrid search (vector + keyword)
search_results = client.retrieval.search(
    query="What were Q3 revenue trends?",
    search_settings={"use_hybrid_search": True}
)

# Agentic RAG with knowledge graph reasoning
rag_response = client.retrieval.rag(
    query="Compare revenue trends across all quarters and identify patterns",
    rag_generation_config={"model": "gpt-4o"},
    search_settings={"use_graph_search": True}
)
print(rag_response.results.generated_answer)

# Deep Research API for multi-step reasoning
research = client.retrieval.deep_research(
    query="What strategic recommendations emerge from our financial data?",
)
print(research.results.answer)  # Citation-backed answer with reasoning logs
</code>

===== System Flow =====
<code>
%%{init: {"theme": "base", "themeVariables": {"primaryColor": "#27AE60"}}}%%
graph TD
    A[Documents: PDF, TXT, PNG, MP3] --> B[Ingestion Pipeline]
    B --> C[Parsing + Chunking]
    C --> D[Entity Extraction]
    D --> E[Vector Embeddings]
    D --> F[Knowledge Graph]
    E --> G[(Vector Store)]
    F --> H[(Graph Database)]
    I[User Query] --> J[REST API]
    J --> K{Search Strategy}
    K -->|Vector| G
    K -->|Keyword| L[Full-Text Index]
    K -->|Graph| H
    G --> M[Hybrid Results Fusion]
    L --> M
    H --> M
    M --> N[LLM Generation]
    N --> O[Citation-Backed Answer]
    J --> P[Deep Research API]
    P --> Q[Multi-Step Reasoning]
    Q --> R[Tool-Based Search Loops]
    R --> O
</code>
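
The "Hybrid Results Fusion" step in the diagram merges ranked lists from the vector, keyword, and graph searches. One common way to do this is reciprocal rank fusion (RRF); the sketch below assumes that technique purely for illustration and does not claim to reproduce R2R's internal fusion logic. The function name and the constant ''k = 60'' are assumptions.

<code python>
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in
    (rank starts at 1); the scores are summed across lists.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by document ID for determinism
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]   # hypothetical IDs from vector search
    keyword_hits = ["b", "d", "a"]  # hypothetical IDs from keyword search
    for doc_id, score in reciprocal_rank_fusion([vector_hits, keyword_hits]):
        print(doc_id, round(score, 5))
</code>

Because RRF uses only rank positions, it can combine result lists whose raw scores are on different scales, such as cosine similarity and full-text relevance.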

===== Key Features =====
  * **[[hybrid_search|Hybrid Search]]**: combines semantic vector search, keyword search, and HyDE (Hypothetical Document [[embeddings|Embeddings]])
  * **[[knowledge_graphs|Knowledge Graphs]]**: automated entity/relationship extraction with community detection at document and collection levels
  * **Deep Research API**: multi-step agentic reasoning with citation-backed answers and reasoning transparency
  * **Multimodal Ingestion**: native support for PDF, TXT, JSON, PNG, MP3 without external tools
  * **RESTful API**: complete API coverage for all operations with Python and JavaScript SDKs
  * **R2R Dashboard**: open-source React/Next.js UI for document management, playground, and analytics
  * **Enterprise-Grade**: user management, collection-based access control, observability, and audit logs
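
To make the knowledge graph feature more concrete, the sketch below groups extracted (subject, relation, object) triples into communities. It uses plain connected components as a deliberately simplified stand-in for community detection; production systems typically use modularity-based algorithms, and nothing here reflects R2R's actual implementation. The function name and the sample triples are hypothetical.

<code python>
from collections import defaultdict


def connected_communities(
    triples: list[tuple[str, str, str]]
) -> list[set[str]]:
    """Group entities into communities via connected components.

    Simplified stand-in for community detection: two entities share a
    community if any chain of relationships links them.
    """
    adjacency: dict[str, set[str]] = defaultdict(set)
    for subject, _relation, obj in triples:
        adjacency[subject].add(obj)
        adjacency[obj].add(subject)

    seen: set[str] = set()
    communities: list[set[str]] = []
    for node in sorted(adjacency):  # sorted for deterministic output order
        if node in seen:
            continue
        # Depth-first traversal collects everything reachable from node
        stack, component = [node], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            component.add(current)
            stack.extend(adjacency[current] - seen)
        communities.append(component)
    return communities


if __name__ == "__main__":
    sample = [  # hypothetical extracted triples
        ("SciPhi", "develops", "R2R"),
        ("R2R", "exposes", "REST API"),
        ("Q3 report", "mentions", "revenue"),
    ]
    for community in connected_communities(sample):
        print(sorted(community))
</code>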

===== See Also =====
  * [[ragas|RAGAS: RAG Evaluation Framework]]
  * [[agentic_rag|Agentic RAG]]
  * [[xagent|XAgent: Autonomous LLM Agent for Complex Tasks]]
  * [[rag_framework_comparison|RAG Framework Comparison]]
  * [[rag_system_production_deployment|RAG System Production Deployment]]

===== References =====
  * [[https://github.com/SciPhi-AI/R2R|R2R GitHub Repository]]
  * [[https://r2r-docs.sciphi.ai|Official Documentation]]
  * [[https://github.com/SciPhi-AI/R2R-Application|R2R Dashboard Repository]]
  * [[https://www.ycombinator.com|Y Combinator (Backer)]]