LightRAG is a knowledge-graph-based retrieval-augmented generation (RAG) framework that integrates graph structures with vector representations to enable efficient, dual-level information retrieval from document collections. With over 30,000 GitHub stars and an associated EMNLP 2025 paper, it extends classical chunk-based RAG with graph-structured indexing and retrieval.
| Field | Value |
| --- | --- |
| Repository | github.com/HKUDS/LightRAG |
| License | MIT |
| Language | Python |
| Stars | 30K+ |
| Category | Knowledge Graph RAG |
LightRAG's architecture consists of three primary components that work together to provide graph-enhanced retrieval:
1. Graph-Based Text Indexing: Documents are segmented into chunks, and LLMs extract entities (names, dates, locations, events) and their relationships. This constructs a comprehensive knowledge graph with key-value data structures for optimized retrieval.
2. Dual-Level Retrieval Paradigm: The system generates query keys at both detailed and abstract levels to accommodate diverse query types.
3. Retrieval-Augmented Answer Generation: A general-purpose LLM generates answers by processing concatenated values from relevant entities, relations, names, descriptions, and text excerpts.
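The indexing step (component 1) can be pictured with a small sketch. This is an illustration under stated assumptions, not LightRAG's implementation: `toy_extract` is a hypothetical stand-in for the LLM extraction call (it simply treats capitalized words as entities), and `chunk_text`, `build_index`, and the two dictionaries are names invented for the example.

```python
from collections import defaultdict


def chunk_text(text, max_words=40):
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_extract(chunk):
    """Hypothetical stand-in for the LLM extraction call.

    Treats capitalized words as entities and links each entity to the
    next one that appears in the same chunk.
    """
    entities = []
    for word in chunk.split():
        token = word.strip(".,;:()")
        if token[:1].isupper() and token not in entities:
            entities.append(token)
    relations = [(a, b, "co-occurs with") for a, b in zip(entities, entities[1:])]
    return entities, relations


def build_index(text, extract=toy_extract, max_words=40):
    """Chunk the text and build key-value stores for entities and relations."""
    entity_kv = defaultdict(list)    # entity name -> ids of chunks mentioning it
    relation_kv = defaultdict(list)  # (source, target) -> relation descriptions
    chunks = chunk_text(text, max_words)
    for chunk_id, chunk in enumerate(chunks):
        entities, relations = extract(chunk)
        for name in entities:
            entity_kv[name].append(chunk_id)
        for src, dst, description in relations:
            relation_kv[(src, dst)].append(description)
    return chunks, dict(entity_kv), dict(relation_kv)


doc = "Alice founded Acme in Berlin. Bob joined Acme as an engineer."
chunks, entities, relations = build_index(doc, max_words=5)
print(entities)
print(relations)
```

In a real pipeline the extraction function would prompt an LLM and return richer descriptions; the key-value layout is what lets later retrieval look up an entity or relation directly instead of scanning chunks.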
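The retrieval paradigm operates at two distinct levels: a low (detailed) level that targets specific entities and their relationships, and a high (abstract) level that targets broader topics and themes.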
By generating query keys at both levels, LightRAG ensures comprehensive, contextually relevant responses regardless of query complexity.
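A toy version of dual-level lookup may make the idea concrete. The sketch below assumes the query keys have already been produced (in LightRAG an LLM generates them) and swaps vector similarity for plain word overlap so it runs without any model. The index contents and the names `overlap`, `retrieve`, and `dual_level_retrieve` are hypothetical.

```python
def overlap(a, b):
    """Jaccard overlap between the lowercase word sets of two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def retrieve(keys, index, top_k=2):
    """Return the values of the top_k index entries that best match any query key."""
    scored = []
    for index_key, value in index.items():
        score = max((overlap(k, index_key) for k in keys), default=0.0)
        if score > 0:
            scored.append((score, index_key, value))
    # Highest score first; ties broken alphabetically by index key.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [value for _, _, value in scored[:top_k]]


# Hypothetical indexes: entity names for detailed keys, themes for abstract keys.
entity_index = {
    "acme": "Acme: company founded by Alice in Berlin.",
    "alice": "Alice: founder of Acme.",
}
relation_index = {
    "company founding": "Alice -> Acme: founded the company.",
    "employment": "Bob -> Acme: joined as an engineer.",
}


def dual_level_retrieve(low_keys, high_keys):
    """Match detailed keys against entities and abstract keys against relation themes."""
    return {
        "low": retrieve(low_keys, entity_index),
        "high": retrieve(high_keys, relation_index),
    }


context = dual_level_retrieve(low_keys=["Alice"], high_keys=["company founding history"])
print(context)
```

Keeping the two lookups separate means a narrow question can be answered from entity entries alone, while a broad question can still draw on theme-level relation entries.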
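LightRAG supports five distinct query modes, selected through the `mode` argument of `QueryParam`. The example below exercises three of them (`naive`, `local`, and `hybrid`):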
```python
import os

from lightrag import LightRAG, QueryParam
from lightrag.llm import openai_complete, openai_embedding

# Import paths and constructor arguments have changed between LightRAG
# releases; check them against the version you have installed.
os.environ["OPENAI_API_KEY"] = "your-api-key"

# Initialize LightRAG with a working directory
rag = LightRAG(
    working_dir="./lightrag_data",
    llm_model_func=openai_complete,
    llm_model_name="gpt-4o",
    embedding_func=openai_embedding,
    embedding_model_name="text-embedding-3-small",
)

# Index documents (builds the knowledge graph automatically)
with open("research_paper.txt", "r", encoding="utf-8") as f:
    rag.insert(f.read())

# Query with different modes
result_naive = rag.query("What are the main findings?", param=QueryParam(mode="naive"))
result_local = rag.query("What did the authors conclude about X?", param=QueryParam(mode="local"))
result_hybrid = rag.query("How does this relate to the broader field?", param=QueryParam(mode="hybrid"))

print(result_hybrid)
```
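One way to think about the modes is as a lookup table from mode name to the retrieval sources it draws on. The sketch below is a conceptual simplification, not LightRAG code: the mode names `global` and `mix`, and the mapping itself, are assumptions drawn from the project's documentation rather than from the text above, and `RetrievalPlan`, `MODE_PLANS`, and `plan_for` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalPlan:
    low_level: bool      # entity-centred graph lookup (detailed level)
    high_level: bool     # theme/relation-centred graph lookup (abstract level)
    chunk_vectors: bool  # plain vector search over text chunks


# Assumed mapping; verify against the LightRAG version in use.
MODE_PLANS = {
    "naive": RetrievalPlan(low_level=False, high_level=False, chunk_vectors=True),
    "local": RetrievalPlan(low_level=True, high_level=False, chunk_vectors=False),
    "global": RetrievalPlan(low_level=False, high_level=True, chunk_vectors=False),
    "hybrid": RetrievalPlan(low_level=True, high_level=True, chunk_vectors=False),
    "mix": RetrievalPlan(low_level=True, high_level=True, chunk_vectors=True),
}


def plan_for(mode: str) -> RetrievalPlan:
    """Return the retrieval plan for a mode, rejecting unknown names."""
    try:
        return MODE_PLANS[mode]
    except KeyError:
        raise ValueError(
            f"unknown mode {mode!r}; expected one of {sorted(MODE_PLANS)}"
        ) from None


for name in ("naive", "local", "hybrid"):
    print(name, plan_for(name))
```

Read this way, `naive` corresponds to classical chunk-based RAG, while the remaining modes differ only in which graph levels they consult and whether chunk search is added on top.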