GraphRAG is a retrieval-augmented generation approach developed by Microsoft Research that integrates knowledge graphs with hierarchical community structures, enabling AI systems to reason over complex, interconnected data. Introduced in early 2024, GraphRAG substantially outperforms naive vector-based RAG on multi-hop queries: reported benchmarks include accuracy gains of up to 3.4x, 80% vs. 50% correct answers on multi-hop questions, and 72-83% win rates on comprehensiveness. [1]
GraphRAG extends standard RAG by building a knowledge graph from source text and layering it with community hierarchies: an LLM extracts entities and relationships from the documents, hierarchical clustering (Leiden) groups related entities into communities, and each community receives an LLM-written summary report.
Version 1.0 reduced storage requirements by 43% through an optimized data model.
| Mode | Description | Best For |
|---|---|---|
| Local Search | Fans out from specific entities to neighbors, retrieves subgraph context | Entity-specific questions (“What about CompanyX?”) |
| Global Search | Aggregates community summaries for corpus-wide insights | Holistic questions across entire dataset |
| DRIFT Search | Hybrid: generates follow-up queries from communities, then iterates via local search | Dynamic multi-hop reasoning |
| Basic Search | Standard vector RAG fallback | Simple semantic queries |
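Local search's fan-out from a named entity can be illustrated in plain Python. This is a toy sketch with a hypothetical adjacency list and `local_context` helper; real GraphRAG local search also pulls in relationship descriptions, claims, and source text units:

```python
# Toy adjacency list standing in for the indexed knowledge graph.
graph = {
    "CompanyX": {"StartupY", "Alice"},
    "StartupY": {"CompanyX", "Bob"},
    "Alice": {"CompanyX"},
    "Bob": {"StartupY"},
    "CompanyZ": {"CompanyW"},
    "CompanyW": {"CompanyZ"},
}

def local_context(graph, entity, hops=1):
    """Collect the entity plus everything reachable within `hops` edges;
    this neighborhood is the subgraph context local search hands to the LLM."""
    context = {entity}
    frontier = {entity}
    for _ in range(hops):
        frontier = {nbr for n in frontier for nbr in graph.get(n, ())} - context
        context |= frontier
    return context

print(sorted(local_context(graph, "CompanyX", hops=1)))
# ['Alice', 'CompanyX', 'StartupY']
print(sorted(local_context(graph, "CompanyX", hops=2)))
# ['Alice', 'Bob', 'CompanyX', 'StartupY']
```

A question like "What about CompanyX?" is answered from this entity-centered subgraph rather than from whichever chunks happen to be semantically similar to the query string.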
Example: querying an indexed project through the Python API (shown for graphrag ~0.x/1.x; artifact filenames and adapter signatures vary across releases, so check your installed version):

```python
import asyncio
from pathlib import Path

import pandas as pd
import tiktoken

from graphrag.query.indexer_adapters import (
    read_indexer_entities,
    read_indexer_reports,
)
from graphrag.query.llm.oai.chat_openai import ChatOpenAI
from graphrag.query.structured_search.global_search.community_context import (
    GlobalCommunityContext,
)
from graphrag.query.structured_search.global_search.search import GlobalSearch

# Step 1: Index documents (run once to build the knowledge graph)
#   graphrag index --root ./my_project

# Step 2: Query the indexed graph programmatically
output_dir = Path("./my_project/output")
COMMUNITY_LEVEL = 2  # how deep in the community hierarchy to load

llm = ChatOpenAI(model="gpt-4o", api_key="your-key")
token_encoder = tiktoken.get_encoding("cl100k_base")

# Load indexed artifacts; the adapters take dataframes, not paths
# (parquet names here match graphrag 1.x output and differ in older versions)
report_df = pd.read_parquet(output_dir / "community_reports.parquet")
entity_df = pd.read_parquet(output_dir / "entities.parquet")
node_df = pd.read_parquet(output_dir / "nodes.parquet")

community_reports = read_indexer_reports(report_df, node_df, COMMUNITY_LEVEL)
entities = read_indexer_entities(node_df, entity_df, COMMUNITY_LEVEL)

# Build the global search context from community summaries
context_builder = GlobalCommunityContext(
    community_reports=community_reports,
    entities=entities,
    token_encoder=token_encoder,
)

# Run a global search query across the entire corpus
search_engine = GlobalSearch(
    llm=llm,
    context_builder=context_builder,
    max_data_tokens=12000,
)

async def main():
    result = await search_engine.asearch(
        "What are the main themes and relationships in this dataset?"
    )
    print(result.response)
    print(f"LLM calls: {result.llm_calls}, Tokens: {result.prompt_tokens}")

asyncio.run(main())
```
Traditional vector-based RAG retrieves individual text chunks by semantic similarity, struggling with:

- Multi-hop questions whose answer spans entities mentioned in different documents
- Holistic, corpus-wide questions ("What are the main themes?") that no single chunk answers
- Connecting scattered facts that are never stated together in any one passage
GraphRAG addresses these through:

- LLM-extracted entities and relationships assembled into an explicit knowledge graph
- Hierarchical community detection (Leiden clustering) that groups related entities
- Pre-computed, LLM-written community summaries that support corpus-level questions
- Graph traversal at query time, so multi-hop connections are followed rather than guessed
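Global search applies a map-reduce shape over those community summaries: score every summary against the query, then combine the best into the answer context. A toy sketch, with keyword overlap as a stand-in for the LLM's relevance scoring (all names hypothetical):

```python
# Toy community reports (in real GraphRAG these are LLM-written summaries).
reports = {
    "c0": "CompanyX acquired StartupY to expand its cloud business",
    "c1": "CompanyZ and CompanyW compete in the chip market",
}

def map_step(query, report):
    """Score one report's relevance to the query by keyword overlap;
    real GraphRAG asks the LLM to rate and extract key points instead."""
    q_words = set(query.lower().split())
    r_words = set(report.lower().split())
    return len(q_words & r_words)

def global_search(query, reports, top_k=1):
    """Map over all community reports, reduce the best-scoring ones
    into the context used to draft the final answer."""
    scored = sorted(
        reports.items(), key=lambda kv: map_step(query, kv[1]), reverse=True
    )
    return [cid for cid, _ in scored[:top_k]]

print(global_search("what did CompanyX acquire", reports))  # ['c0']
```

Because every community is considered in the map stage, corpus-wide questions get answers assembled from the whole dataset instead of from a handful of nearest-neighbor chunks.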
LazyGraphRAG (June 2025) optimizes the approach by skipping upfront LLM summarization during indexing. It builds lightweight graphs using NLP techniques and computes summaries dynamically at query time, reducing indexing costs to 0.1% of full GraphRAG while maintaining query quality.
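The cost difference comes down to when the expensive summarization happens. A toy sketch (with a hypothetical `summarize` function standing in for an LLM call) contrasting full GraphRAG's eager, index-time summarization with LazyGraphRAG-style query-time summarization:

```python
summarize_calls = 0

def summarize(community_id, docs):
    """Stand-in for an LLM summarization call (the expensive step)."""
    global summarize_calls
    summarize_calls += 1
    return f"summary of {community_id}: " + "; ".join(docs)

communities = {
    "c0": ["doc about CompanyX", "doc about StartupY"],
    "c1": ["doc about CompanyZ"],
    "c2": ["doc about an unrelated topic"],
}

# Full GraphRAG would summarize everything up front at indexing time:
#   eager = {cid: summarize(cid, docs) for cid, docs in communities.items()}
# i.e. one LLM call per community, whether or not it is ever queried.

# LazyGraphRAG defers: summarize only communities a query actually touches.
cache = {}

def lazy_summary(cid):
    if cid not in cache:
        cache[cid] = summarize(cid, communities[cid])
    return cache[cid]

relevant = ["c0"]  # pretend a cheap NLP ranker picked this community
answers = [lazy_summary(cid) for cid in relevant]
print(summarize_calls)  # 1 expensive call instead of 3
```

With most communities never queried, nearly all index-time LLM spend is avoided, which is the intuition behind the reported 0.1% indexing cost.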