

Microsoft GraphRAG

GraphRAG is a retrieval-augmented generation approach developed by Microsoft Research that integrates knowledge graphs with hierarchical community structures, enabling AI systems to reason over complex, interconnected data. Introduced in early 2024, GraphRAG substantially outperforms naive vector-based RAG on multi-hop queries, with reported results of 80% vs. 50% correct answers and 72-83% win rates on comprehensiveness in Microsoft's evaluations.

How It Works

GraphRAG extends standard RAG by building a knowledge graph from source text and layering it with community hierarchies:

Indexing (Graph Creation)

  1. Text Chunking: Divide input into TextUnits
  2. Entity/Relationship Extraction: Use LLMs with domain-specific prompts to extract entities (nodes) and relationships (edges with weights)
  3. Community Detection: Cluster entities hierarchically using the Leiden algorithm
  4. Community Summarization: Generate LLM summaries for each community covering entities, themes, events, and evidence
  5. Embedding Generation: Create embeddings for entities, summaries, and TextUnits; store in vector databases (LanceDB, Azure AI Search)

Version 1.0 reduced storage requirements by 43% through an optimized data model.
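The indexing stages above can be sketched end-to-end in plain Python. Everything here is a stand-in: capitalized-word matching replaces LLM entity extraction, and connected components replace hierarchical Leiden clustering, but the chunk → extract → graph → community flow mirrors steps 1-4.

```python
from collections import defaultdict
from itertools import combinations

def chunk_text(text):
    """1. Chunking: split the input into sentence-level TextUnits."""
    return [s for s in text.split(". ") if s]

def extract_entities(chunk):
    """2. Extraction stand-in: treat capitalized words as entities
    (the real pipeline uses LLM prompts here)."""
    return {w.strip(".,") for w in chunk.split() if w[0].isupper()}

def build_graph(chunks):
    """3. Graph creation: weighted edges for entities co-occurring in a chunk."""
    weights = defaultdict(int)
    for chunk in chunks:
        for a, b in combinations(sorted(extract_entities(chunk)), 2):
            weights[(a, b)] += 1
    return weights

def communities(weights):
    """4. Community detection stand-in: connected components via union-find
    (GraphRAG uses the hierarchical Leiden algorithm instead)."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for a, b in weights:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
    groups = defaultdict(set)
    for node in parent:
        groups[find(node)].add(node)
    return list(groups.values())

text = "Acme hired Bob. Bob met Carol. Dana runs Umbrella. Umbrella sued Dana."
for community in communities(build_graph(chunk_text(text))):
    print(sorted(community))
```

On this toy corpus the graph splits into two communities (the Acme/Bob/Carol cluster and the Dana/Umbrella cluster), each of which would then receive its own LLM-written summary in step 4's real counterpart.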

Query Modes

  • Local Search: fans out from specific entities to their neighbors and retrieves subgraph context. Best for entity-specific questions (“What about CompanyX?”).
  • Global Search: aggregates community summaries for corpus-wide insights. Best for holistic questions across the entire dataset.
  • DRIFT Search: a hybrid that generates follow-up queries from communities, then iterates via local search. Best for dynamic multi-hop reasoning.
  • Basic Search: standard vector RAG fallback. Best for simple semantic queries.
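The two main retrieval strategies can be contrasted in a toy sketch over a plain adjacency dict; this is an illustration of the idea, not the library's API. Local Search collects a seed entity's k-hop neighborhood, while Global Search aggregates community summaries regardless of any seed.

```python
def local_search_context(graph, seed, hops=2):
    """Local Search sketch: fan out from a seed entity to its k-hop neighborhood."""
    frontier, seen = {seed}, {seed}
    for _ in range(hops):
        frontier = {n for node in frontier for n in graph.get(node, ())} - seen
        seen |= frontier
    return seen

def global_search_context(community_summaries):
    """Global Search sketch: aggregate all community summaries as corpus-wide context."""
    return "\n".join(community_summaries.values())

# Invented example data: entity adjacency plus per-community summaries
graph = {
    "CompanyX": ["Alice", "ProductY"],
    "Alice": ["CompanyX", "BoardZ"],
    "ProductY": ["CompanyX"],
    "BoardZ": ["Alice"],
}
summaries = {
    0: "Community 0: CompanyX, its product line, and leadership.",
    1: "Community 1: Regulators and oversight boards.",
}

print(local_search_context(graph, "CompanyX"))  # subgraph context for one entity
print(global_search_context(summaries))         # corpus-wide context
```

Note how widening `hops` from 1 to 2 pulls in BoardZ, an entity two relationships away from the seed, which is exactly the context a "What about CompanyX?" question needs.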

Python Example

import asyncio

import pandas as pd
import tiktoken

from graphrag.query.indexer_adapters import (
    read_indexer_entities,
    read_indexer_reports,
)
from graphrag.query.llm.oai.chat_openai import ChatOpenAI
from graphrag.query.llm.oai.typing import OpenaiApiType
from graphrag.query.structured_search.global_search.community_context import GlobalCommunityContext
from graphrag.query.structured_search.global_search.search import GlobalSearch

# Step 1: Index documents (run once to build the knowledge graph)
# graphrag index --root ./my_project

# Step 2: Query the indexed graph programmatically.
# Parquet file names below match earlier (pre-1.0) releases; later versions
# rename them (e.g. entities.parquet, community_reports.parquet), so check
# your output folder.
OUTPUT_DIR = "./my_project/output"
COMMUNITY_LEVEL = 2  # depth in the Leiden community hierarchy to load

entity_df = pd.read_parquet(f"{OUTPUT_DIR}/create_final_nodes.parquet")
entity_embedding_df = pd.read_parquet(f"{OUTPUT_DIR}/create_final_entities.parquet")
report_df = pd.read_parquet(f"{OUTPUT_DIR}/create_final_community_reports.parquet")

# Load indexed artifacts; the adapters take DataFrames, not file paths
community_reports = read_indexer_reports(report_df, entity_df, COMMUNITY_LEVEL)
entities = read_indexer_entities(entity_df, entity_embedding_df, COMMUNITY_LEVEL)

llm = ChatOpenAI(
    api_key="your-key",
    model="gpt-4o",
    api_type=OpenaiApiType.OpenAI,
)
token_encoder = tiktoken.get_encoding("cl100k_base")

# Build the global search context from community summaries
context_builder = GlobalCommunityContext(
    community_reports=community_reports,
    entities=entities,
    token_encoder=token_encoder,
)

# Run a global search query across the entire corpus
search_engine = GlobalSearch(
    llm=llm,
    context_builder=context_builder,
    token_encoder=token_encoder,
    max_data_tokens=12000,
)

async def main():
    result = await search_engine.asearch(
        "What are the main themes and relationships in this dataset?"
    )
    print(result.response)
    print(f"LLM calls: {result.llm_calls}, Tokens: {result.prompt_tokens}")

asyncio.run(main())

Improvements Over Naive RAG

Traditional vector-based RAG retrieves individual text chunks by semantic similarity, struggling with:

  • Questions requiring information scattered across multiple documents
  • Multi-hop reasoning chains
  • Global pattern recognition across large corpora

GraphRAG addresses these through:

  • Graph reasoning: Connects disparate facts across documents
  • Hierarchical summaries: Provide scalable, corpus-level context while staying within model token limits
  • Dynamic community selection: Automatically picks optimal hierarchy levels (added November 2024)
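A minimal sketch of why graph reasoning helps with multi-hop questions: facts asserted in separate documents become edges in a single graph, so a breadth-first search can surface the chain linking two entities that never co-occur in any one chunk (and would therefore never land in the same vector-retrieved passage). The entities and graph below are invented for illustration.

```python
from collections import deque

def connect(graph, start, goal):
    """Breadth-first search for a chain of relationships linking two entities."""
    prev = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Reconstruct the explanation path by walking predecessors back
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in prev:
                prev[neighbor] = node
                queue.append(neighbor)
    return None  # no chain of relationships connects the two entities

# Each edge comes from a different "document"; only the graph joins them
graph = {
    "Alice": ["Acme"],         # doc 1: Alice works at Acme
    "Acme": ["Umbrella"],      # doc 2: Acme acquired Umbrella
    "Umbrella": ["Lawsuit7"],  # doc 3: Umbrella is party to Lawsuit7
}

print(connect(graph, "Alice", "Lawsuit7"))
# → ['Alice', 'Acme', 'Umbrella', 'Lawsuit7']
```

The returned path is the three-hop reasoning chain ("How is Alice connected to Lawsuit7?") that chunk-similarity retrieval would have to assemble by luck.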

LazyGraphRAG

LazyGraphRAG (announced November 2024) optimizes the approach by skipping upfront LLM summarization during indexing. It builds lightweight graphs using fast NLP techniques (noun-phrase extraction and co-occurrence) and computes summaries dynamically at query time, reducing indexing costs to roughly 0.1% of full GraphRAG while maintaining query quality.
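The lazy split can be sketched as follows, with a naive capitalized-word heuristic standing in for LazyGraphRAG's actual NLP pipeline: indexing only records cheap co-occurrence structure, and the expensive summarization work is deferred until a query actually needs it.

```python
from collections import defaultdict

def lazy_index(chunks):
    """Index time: build only a cheap co-occurrence graph — no LLM calls."""
    graph = defaultdict(set)
    membership = defaultdict(list)  # entity -> indices of chunks mentioning it
    for i, chunk in enumerate(chunks):
        ents = {w.strip(".,") for w in chunk.split() if w[0].isupper()}
        for e in ents:
            membership[e].append(i)
        for a in ents:
            graph[a] |= ents - {a}
    return graph, membership

def lazy_query(query, graph, membership, chunks):
    """Query time: expand query entities over the graph, then summarize just
    the matched chunks — the expensive step, deferred until now."""
    seeds = {w.strip("?.,") for w in query.split() if w[0].isupper()}
    relevant = set()
    for s in seeds & graph.keys():
        relevant |= {s} | graph[s]
    chunk_ids = sorted({i for e in relevant for i in membership[e]})
    # Joining raw chunks stands in for an on-demand LLM summary
    return " ".join(chunks[i] for i in chunk_ids)

chunks = ["Acme hired Bob", "Bob met Carol", "Dana runs Umbrella"]
graph, membership = lazy_index(chunks)
print(lazy_query("Who knows Bob?", graph, membership, chunks))
```

Only the chunks reachable from the query's entities are ever summarized, which is where the claimed ~1000x indexing-cost reduction comes from: the per-community LLM work simply never happens for communities no query touches.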

Implementation

  • Pipeline: Python-based, configurable via YAML/JSON
  • LLM Support: GPT-4 Turbo recommended for graph extraction; configurable for other models
  • Cost Considerations: Indexing is 100-1000x more expensive than vector RAG (mitigated by LazyGraphRAG)
  • Auto-tuning: Domain-specific prompt auto-tuning since September 2024
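Running `graphrag init --root ./my_project` generates a `settings.yaml` that drives the pipeline. A representative fragment is shown below; the key names and defaults vary across versions, so treat this as illustrative rather than authoritative.

```yaml
# settings.yaml — illustrative fragment, not a complete or version-exact config
llm:
  type: openai_chat
  api_key: ${GRAPHRAG_API_KEY}   # resolved from the environment
  model: gpt-4-turbo-preview     # recommended for graph extraction
chunks:
  size: 1200                     # tokens per TextUnit
  overlap: 100
entity_extraction:
  prompt: "prompts/entity_extraction.txt"   # target of prompt auto-tuning
community_reports:
  max_length: 2000               # token budget per community summary
```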

