====== R2R: Production Agentic RAG System ======

R2R (Retrieval to Response) by SciPhi is an open-source, production-ready [[agentic_rag|agentic RAG]] system with over 8,000 GitHub stars, backed by Y Combinator.(([[https://www.ycombinator.com|Y Combinator — startup accelerator backing SciPhi/R2R]]))(([[https://github.com/SciPhi-AI/R2R|R2R GitHub Repository]])) It provides complete infrastructure for transforming unstructured data into actionable intelligence through automated [[knowledge_graphs|knowledge graphs]], [[hybrid_search|hybrid search]], and multi-step reasoning, all exposed via a RESTful API. In ingestion throughput, R2R outperforms competing frameworks such as [[llamaindex|LlamaIndex]] (2.3x faster) and [[haystack|Haystack]] (3.1x faster), processing over 160,000 tokens per second.(([[https://r2r-docs.sciphi.ai|R2R Official Documentation]])) Its architecture targets enterprises that process large document collections (tens to hundreds of GBs) and need reasoning capabilities beyond simple vector search.

===== Architecture =====

R2R employs a [[modular|modular]], service-oriented architecture with four core pipelines:(([[https://r2r-docs.sciphi.ai|R2R Official Documentation]]))

  * **Ingestion Pipeline**: transforms multimodal data (PDF, TXT, JSON, PNG, MP3) into embeddable documents via native parsing, chunking, and LLM-based entity extraction
  * **Embedding Pipeline**: handles text transformation, [[chunking_strategies|chunking strategies]], and vector embedding generation
  * **RAG Pipeline**: extends the embedding pipeline with LLM completions, supporting agentic workflows with tool use
  * **Knowledge Graph Pipeline**: builds document-level and collection-level graphs using entity/relationship extraction with community detection

All pipelines are orchestrated through a centralized RESTful API for language-agnostic integration. The system includes built-in user management, collection organization, and conversation tracking with branching support.
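To make the ingestion pipeline's chunking step concrete, here is a minimal sketch of fixed-size chunking with overlap in plain Python. This is illustrative only, not R2R's internal chunker; the chunk size and overlap values are arbitrary, and R2R additionally performs parsing and entity extraction around this step.

```python
# Minimal fixed-size chunking with overlap - illustrative only,
# not R2R's internal implementation.

def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # final window already covers the tail of the text
    return chunks

document = "R2R ingests documents, chunks them, and embeds each chunk. " * 20
chunks = chunk_text(document)
print(len(chunks), len(chunks[0]))
```

The overlap preserves context across chunk boundaries, so a sentence cut in half by one window still appears whole in its neighbor; each chunk would then be sent to the embedding pipeline.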
===== Code Example =====

<code python>
from r2r import R2RClient

# Connect to R2R server (launch with: r2r serve --docker)
client = R2RClient(base_url="http://localhost:7272")

# Ingest documents - supports PDF, TXT, JSON, PNG, MP3
client.documents.create(
    file_path="./quarterly_report.pdf",
    metadata={"department": "finance", "quarter": "Q3"}
)

# Ingest from raw text
client.documents.create(
    raw_text="R2R supports multimodal ingestion and knowledge graphs.",
    metadata={"source": "manual"}
)

# Build knowledge graph from ingested documents
client.graphs.build(collection_id="default")

# Perform hybrid search (vector + keyword)
search_results = client.search(
    query="What were Q3 revenue trends?",
    search_settings={"use_hybrid_search": True}
)

# Agentic RAG with knowledge graph reasoning
rag_response = client.retrieval.rag(
    query="Compare revenue trends across all quarters and identify patterns",
    rag_generation_config={"model": "gpt-4o"},
    search_settings={"use_graph_search": True}
)
print(rag_response.results.generated_answer)

# Deep Research API for multi-step reasoning
research = client.retrieval.deep_research(
    query="What strategic recommendations emerge from our financial data?",
)
print(research.results.answer)  # Citation-backed answer with reasoning logs
</code>

===== System Flow =====

<code mermaid>
%%{init: {"theme": "base", "themeVariables": {"primaryColor": "#27AE60"}}}%%
graph TD
    A[Documents: PDF, TXT, PNG, MP3] --> B[Ingestion Pipeline]
    B --> C[Parsing + Chunking]
    C --> D[Entity Extraction]
    D --> E[Vector Embeddings]
    D --> F[Knowledge Graph]
    E --> G[(Vector Store)]
    F --> H[(Graph Database)]
    I[User Query] --> J[REST API]
    J --> K{Search Strategy}
    K -->|Vector| G
    K -->|Keyword| L[Full-Text Index]
    K -->|Graph| H
    G --> M[Hybrid Results Fusion]
    L --> M
    H --> M
    M --> N[LLM Generation]
    N --> O[Citation-Backed Answer]
    J --> P[Deep Research API]
    P --> Q[Multi-Step Reasoning]
    Q --> R[Tool-Based Search Loops]
    R --> O
</code>

===== Key Features =====

  * **[[hybrid_search|Hybrid Search]]**: combines semantic vector search, keyword search, and HyDE (Hypothetical Document [[embeddings|Embeddings]])
  * **[[knowledge_graphs|Knowledge Graphs]]**: automated entity/relationship extraction with community detection at document and collection levels
  * **Deep Research API**: multi-step agentic reasoning with citation-backed answers and reasoning transparency
  * **Multimodal Ingestion**: native support for PDF, TXT, JSON, PNG, MP3 without external tools
  * **RESTful API**: complete API coverage for all operations, with Python and JavaScript SDKs
  * **R2R Dashboard**: open-source React/Next.js UI for document management, playground, and analytics
  * **Enterprise-Grade**: user management, collection-based access control, observability, and audit logs

===== See Also =====

  * [[ragas|RAGAS: RAG Evaluation Framework]]
  * [[agentic_rag|Agentic RAG]]
  * [[xagent|XAgent: Autonomous LLM Agent for Complex Tasks]]
  * [[rag_framework_comparison|RAG Framework Comparison]]
  * [[rag_system_production_deployment|RAG System Production Deployment]]

===== References =====

  * [[https://github.com/SciPhi-AI/R2R|R2R GitHub Repository]]
  * [[https://r2r-docs.sciphi.ai|Official Documentation]]
  * [[https://github.com/SciPhi-AI/R2R-Application|R2R Dashboard Repository]]
  * [[https://www.ycombinator.com|Y Combinator (Backer)]]
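The hybrid-results-fusion step in the system flow above can be approximated with reciprocal rank fusion (RRF), a common technique for merging vector, keyword, and graph rankings into one list. This sketch is illustrative only; R2R's actual fusion logic is not documented here, and the `k` constant and document IDs are arbitrary.

```python
# Reciprocal rank fusion (RRF) - merges several ranked result lists
# by scoring each document as sum(1 / (k + rank)) across lists.
# Illustrative sketch, not R2R's actual fusion code.

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked doc-id lists into a single ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc_a", "doc_b", "doc_c"]   # semantic ranking
keyword_hits = ["doc_b", "doc_d", "doc_a"]  # full-text ranking
print(rrf_fuse([vector_hits, keyword_hits]))
```

Documents that appear near the top of multiple rankings (here ''doc_b'') rise above documents favored by only one retriever, which is why rank-based fusion is robust to the incomparable score scales of vector and keyword search.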