RAG operates in three stages:

  - **Retrieval** — A query is embedded into a vector and used to search a knowledge base (vector database, keyword index, or hybrid) for relevant document chunks
  - **Augmentation** — Retrieved chunks are injected into the LLM prompt alongside the user query to provide grounding context
  - **Generation** — The LLM synthesizes a response using both its training knowledge and the retrieved context
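The three stages above can be sketched end to end with a toy bag-of-words retriever. This is a minimal illustration, not a production pipeline: the `embed` and `cosine` functions stand in for a real embedding model and vector index, and the final prompt would normally be sent to an LLM.

<code python>
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words term counts (real systems use dense vectors).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Stage 1: score every chunk against the query and keep the top-k.
    q = embed(query)
    return sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def augment(query: str, chunks: list[str]) -> str:
    # Stage 2: inject the retrieved chunks into the prompt as grounding context.
    context = "\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion: {query}"

corpus = [
    "RAG grounds generation in retrieved documents.",
    "Transformers rely on self-attention layers.",
    "Vector databases index embeddings for similarity search.",
]
query = "How does RAG use retrieved documents?"
prompt = augment(query, retrieve(query, corpus, k=1))
# Stage 3 would pass `prompt` to an LLM for generation.
</code>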
===== RAG Variants =====
==== Naive RAG ====
The simplest implementation: embed the corpus, retrieve the top-k chunks by vector similarity, and generate directly from them, with no pre- or post-retrieval enhancement.
==== Advanced RAG ====
  * **Pre-retrieval** — Query rewriting (HyDE, ITER-RETGEN)
  * **Retrieval** — Hybrid search combining semantic vectors with BM25 keyword matching, plus fine-tuned embedding models
  * **Post-retrieval** — Reranking retrieved results (Cohere Rerank, cross-encoders)
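One common way to fuse the vector and BM25 result lists in hybrid search is Reciprocal Rank Fusion (RRF). The sketch below assumes the two retrievers have already returned ranked document IDs; the constant ''k=60'' is the value conventionally used in the RRF literature.

<code python>
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    # Reciprocal Rank Fusion: each list contributes 1/(k + rank) per document,
    # so documents ranked highly by several retrievers rise to the top.
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical ranked outputs from a vector search and a BM25 search.
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_c", "doc_a", "doc_d"]
fused = rrf([vector_hits, bm25_hits])
</code>

''doc_a'' wins because it appears near the top of both lists, even though neither retriever ranked it first in isolation.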
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=512,
    separators=["\n\n", "\n", ". ", " "],
)
chunks = splitter.split_documents(documents)
llm = ChatOpenAI(model="gpt-4o")  # model name truncated in the source; assumed
docs = retriever.invoke("How does GraphRAG improve retrieval?")
context = "\n\n".join(doc.page_content for doc in docs)
response = llm.invoke(f"Context: {context}\n\nQuestion: How does GraphRAG improve retrieval?")
</code>
Four core metrics are used to evaluate RAG pipelines:
  * **Faithfulness** — Are generated claims supported by retrieved context?
  * **Answer relevance** — Does the response address the actual question?
  * **Context precision** — How much of the retrieved context is relevant?
  * **Context recall** — Were all necessary documents retrieved?
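Given relevance labels, context precision and context recall reduce to set arithmetic. The sketch below is a plain set-based illustration of the two definitions, not the LLM-judged scoring an evaluation framework would actually perform:

<code python>
def context_precision(retrieved: list[str], relevant: set[str]) -> float:
    # Of the chunks we retrieved, what fraction are actually relevant?
    if not retrieved:
        return 0.0
    return sum(chunk in relevant for chunk in retrieved) / len(retrieved)

def context_recall(retrieved: list[str], relevant: set[str]) -> float:
    # Of the chunks we needed, what fraction did we retrieve?
    if not relevant:
        return 1.0
    return len(relevant & set(retrieved)) / len(relevant)

retrieved = ["c1", "c2", "c3", "c4"]   # hypothetical retrieved chunk IDs
relevant = {"c1", "c3", "c5"}          # hypothetical ground-truth relevant IDs
precision = context_precision(retrieved, relevant)  # 2 of 4 retrieved are relevant
recall = context_recall(retrieved, relevant)        # 2 of 3 relevant were retrieved
</code>

High precision with low recall suggests the retriever is conservative but missing documents; the reverse suggests it is returning noise alongside the right chunks.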
===== References =====
  * [[agent_memory_frameworks]] — Memory systems that build on RAG patterns
  * [[vector_databases]] — Storage infrastructure for RAG
| - | |||