
RAGFlow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine developed by InfiniFlow. It specializes in deep document understanding through parsing capabilities that include OCR, table structure recognition, and document layout analysis.1) With over 76,000 GitHub stars, it is aimed at complex documents that other RAG systems struggle with.

Repository github.com/infiniflow/ragflow
License Apache 2.0
Language Python
Stars 76K+
Category RAG Engine

Key Features

Architecture

RAGFlow decouples data extraction from chunking (since v0.17.0), allowing independent selection of visual models for each processing task. The pipeline flows through ingestion, parsing, embedding, retrieval, and generation stages.4)

graph TB
    subgraph Input["Document Input"]
        PDF[PDF Documents]
        DOCX[Word / Excel]
        IMG[Images]
        TXT[Text / Email]
    end
    subgraph Parsing["Deep Document Parsing"]
        DeepDoc[DeepDoc Engine]
        OCR[OCR Module]
        TSR[Table Structure Recognition]
        DLR[Layout Recognition]
    end
    subgraph Processing["Processing Pipeline"]
        Chunk[Chunking Engine]
        TOC[TOC Extraction]
        Embed[Embedding Generator]
    end
    subgraph Storage["Storage Layer"]
        VDB[(Vector Database)]
        Meta[(Metadata Store)]
    end
    subgraph Query["Query Pipeline"]
        Retrieve[Hybrid Retrieval]
        Rerank[Reranking]
        Generate[LLM Generation]
    end
    Input --> Parsing
    DeepDoc --> OCR
    DeepDoc --> TSR
    DeepDoc --> DLR
    Parsing --> Processing
    Processing --> Storage
    Storage --> Query
    TOC --> Retrieve
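
The decoupling of extraction from chunking described above can be illustrated with a minimal sketch. This is not RAGFlow's code: `IngestPipeline`, `plain_text_parser`, `paragraph_chunker`, and `fixed_size_chunker` are hypothetical names invented here, and the parser is a trivial stand-in for OCR or layout analysis. The point is only that the parsing step and the chunking step are chosen independently of each other.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-ins: none of these names come from RAGFlow itself.
Parser = Callable[[bytes], str]        # raw file bytes -> extracted text
Chunker = Callable[[str], List[str]]   # extracted text -> list of chunks


def plain_text_parser(raw: bytes) -> str:
    """Stand-in for a parsing step such as OCR or layout analysis."""
    return raw.decode("utf-8")


def paragraph_chunker(text: str) -> List[str]:
    """Split on blank lines; one chunk per non-empty paragraph."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def fixed_size_chunker(size: int) -> Chunker:
    """Return a chunker that cuts text into pieces of at most `size` words."""
    def chunk(text: str) -> List[str]:
        words = text.split()
        return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]
    return chunk


@dataclass
class IngestPipeline:
    parser: Parser    # chosen independently of the chunker
    chunker: Chunker

    def run(self, raw: bytes) -> List[str]:
        # Extraction first, then chunking on the extracted text
        return self.chunker(self.parser(raw))


doc = b"Revenue rose in Q3.\n\nCosts were flat."
by_paragraph = IngestPipeline(plain_text_parser, paragraph_chunker).run(doc)
by_size = IngestPipeline(plain_text_parser, fixed_size_chunker(3)).run(doc)
print(by_paragraph)
print(by_size)
```

Swapping the chunker leaves the parser untouched, and vice versa, which is the property the architecture description attributes to RAGFlow since v0.17.0.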

Document Parsing Details

RAGFlow's parsing capabilities are the core differentiator:5)

OCR and Vision Models

RAGFlow integrates multiple OCR and vision-based approaches for robust document understanding. Beyond its built-in OCR capabilities, the system can leverage complementary open-source models. One example is MinerU-Diffusion, a 2.5B-parameter open-source OCR model released by researchers from Shanghai AI Lab and Peking University. It supports layout detection, plain text recognition, LaTeX formula output, and table recognition, with throughput high enough for document processing pipelines.6)

RAGFlow's visual model flexibility allows users to configure which OCR and parsing models suit their specific document types and performance requirements, enabling integration with specialized open-source models where appropriate.

Code Example

import requests
 
RAGFLOW_API = "http://localhost:9380/api/v1"
API_KEY = "ragflow-your-api-key"
HEADERS = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}
 
# Create a knowledge base (dataset)
dataset = requests.post(f"{RAGFLOW_API}/datasets",
    headers=HEADERS,
    json={"name": "technical_docs", "chunk_method": "naive"}
).json()
 
dataset_id = dataset["data"]["id"]
 
# Upload a document
with open("complex_report.pdf", "rb") as f:
    upload = requests.post(
        f"{RAGFLOW_API}/datasets/{dataset_id}/documents",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": f}
    ).json()
 
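# NOTE: the endpoints below follow the RAGFlow HTTP API reference as
# understood at the time of writing; verify them against your version.

# Start parsing the uploaded document. Parsing runs asynchronously,
# so wait for it to finish before creating the chat assistant.
document_ids = [doc["id"] for doc in upload["data"]]
requests.post(f"{RAGFLOW_API}/datasets/{dataset_id}/chunks",
    headers=HEADERS,
    json={"document_ids": document_ids}
)

# Create a chat assistant bound to the dataset (dataset_ids must be a list)
chat = requests.post(f"{RAGFLOW_API}/chats",
    headers=HEADERS,
    json={"name": "technical_docs_assistant", "dataset_ids": [dataset_id]}
).json()
chat_id = chat["data"]["id"]

# Query the knowledge base with RAG
answer = requests.post(f"{RAGFLOW_API}/chats/{chat_id}/completions",
    headers=HEADERS,
    json={"question": "What were the Q3 revenue figures?", "stream": False}
).json()
print(answer["data"]["answer"])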
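
The example above indexes straight into `["data"]` and will raise an unhelpful `KeyError` or `TypeError` if a request fails. A small guard can make failures explicit. The sketch below assumes the response envelope carries a `code` field where `0` means success and a `message` field on failure; that shape is an assumption to check against the API reference for your RAGFlow version, and `unwrap` and `RagflowApiError` are hypothetical helper names, not part of any RAGFlow client.

```python
from typing import Any, Dict


class RagflowApiError(RuntimeError):
    """Raised when a response envelope reports a non-zero code (hypothetical helper)."""


def unwrap(payload: Dict[str, Any]) -> Any:
    """Return payload["data"] if the envelope reports success, else raise.

    Assumes {"code": 0, "data": ...} on success and
    {"code": <non-zero>, "message": "..."} on failure.
    """
    code = payload.get("code")
    if code != 0:
        message = payload.get("message", "no message")
        raise RagflowApiError(f"API error {code}: {message}")
    return payload.get("data")


# Usage with an already-decoded JSON body (no network call is made here)
ok = unwrap({"code": 0, "data": {"id": "abc123"}})
print(ok["id"])
```

With this helper, a line such as `dataset_id = unwrap(dataset)["id"]` fails with the server's own error message instead of a lookup error.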

See Also

References

1)
https://github.com/infiniflow/ragflow
3)
https://github.com/infiniflow/ragflow/blob/main/deepdoc/README.md
6)
https://alphasignalai.substack.com/p/mineru-diffusion-ocr-has-been-reading|AlphaSignal AI - MinerU-Diffusion: OCR Has Been Reading (Year