This shows you the differences between two versions of the page.
| Next revision | Previous revision | ||
| when_to_use_rag_vs_fine_tuning [2026/03/25 15:41] – Move from architecture: namespace to flat agent | when_to_use_rag_vs_fine_tuning [2026/03/30 22:39] (current) – Restructure: footnotes as references agent | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== When to Use RAG vs Fine-Tuning vs Prompt Engineering ====== | ====== When to Use RAG vs Fine-Tuning vs Prompt Engineering ====== | ||
| - | Choosing between RAG, fine-tuning, | + | Choosing between RAG, fine-tuning, |
| ===== Overview of Approaches ===== | ===== Overview of Approaches ===== | ||
| Line 64: | Line 64: | ||
| * **Best for**: Dynamic knowledge, large document sets, citation requirements, | * **Best for**: Dynamic knowledge, large document sets, citation requirements, | ||
| * **Choose when**: Knowledge base > 10K tokens, data updates frequently, you need grounded answers | * **Choose when**: Knowledge base > 10K tokens, data updates frequently, you need grounded answers | ||
| - | * **Cost**: $500-5K setup (vector DB + embeddings pipeline), ~$0.005-0.05/ | + | * **Cost**: $500-5K setup (vector DB + embeddings pipeline), ~$0.005-0.05/ |
| * **Example**: | * **Example**: | ||
| Line 71: | Line 71: | ||
| * **Best for**: Domain-specific reasoning, consistent structured output, brand voice, specialized terminology | * **Best for**: Domain-specific reasoning, consistent structured output, brand voice, specialized terminology | ||
| * **Choose when**: Prompt engineering fails consistency, | * **Choose when**: Prompt engineering fails consistency, | ||
| - | * **Cost**: $1K-100K+ depending on model size; GPT-4o mini fine-tuning ~$3/1M training tokens | + | * **Cost**: $1K-100K+ depending on model size; GPT-4o mini fine-tuning ~$3/1M training tokens(([[https:// |
| * **Example**: | * **Example**: | ||
| ===== Hybrid Approaches ===== | ===== Hybrid Approaches ===== | ||
| - | Most production systems in 2025-2026 combine approaches: | + | Most production systems in 2025-2026 combine approaches:(([[https:// |
| === Prompt Engineering + RAG (Most Common) === | === Prompt Engineering + RAG (Most Common) === | ||
| Line 102: | Line 102: | ||
| === Fine-Tuning + RAG (Enterprise) === | === Fine-Tuning + RAG (Enterprise) === | ||
| - | Fine-tune for domain reasoning and output consistency, and use RAG for current data. Best for high-stakes domains such as healthcare, legal, and finance. | + | Fine-tune for domain reasoning and output consistency, and use RAG for current data. Best for high-stakes domains such as healthcare, legal, and finance.(([[https:// |
| === All Three (Maximum Quality) === | === All Three (Maximum Quality) === | ||
| Line 124: | Line 124: | ||
| - **Hybrid is the default**: 70%+ of production AI systems in 2026 use at least two approaches. | - **Hybrid is the default**: 70%+ of production AI systems in 2026 use at least two approaches. | ||
| - **Measure before deciding**: A/B test approaches on your specific use case. | - **Measure before deciding**: A/B test approaches on your specific use case. | ||
| - | |||
| - | ===== References ===== | ||
| - | |||
| - | * [[https:// | ||
| - | * [[https:// | ||
| - | * [[https:// | ||
| - | * [[https:// | ||
| - | * [[https:// | ||
| ===== See Also ===== | ===== See Also ===== | ||
| Line 138: | Line 130: | ||
| * [[how_to_structure_system_prompts|How to Structure System Prompts]] — Maximize prompt engineering effectiveness | * [[how_to_structure_system_prompts|How to Structure System Prompts]] — Maximize prompt engineering effectiveness | ||
| * [[single_vs_multi_agent|Single vs Multi-Agent Architectures]] — Choosing agent patterns | * [[single_vs_multi_agent|Single vs Multi-Agent Architectures]] — Choosing agent patterns | ||
| + | |||
| + | ===== References ===== | ||
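
The "Choose when" criteria visible in the diff above (knowledge base > 10K tokens, frequently updated data, a need for grounded answers, and prompt engineering failing on consistency) can be read as a rough decision rule. The sketch below is only an illustration of that reading, not the page's own method. The `UseCase` fields and the `suggest_approaches` function are hypothetical names. Treating the RAG criteria as "any one is enough" and treating prompt engineering as the always-present baseline are both assumptions. Criteria that are cut off in the diff are deliberately left out.

```python
from dataclasses import dataclass

# Threshold taken from the page: "Knowledge base > 10K tokens".
RAG_TOKEN_THRESHOLD = 10_000


@dataclass
class UseCase:
    """Hypothetical description of a project; field names are illustrative."""
    knowledge_tokens: int            # approximate knowledge-base size in tokens
    data_updates_frequently: bool    # does the underlying data change often?
    needs_grounded_answers: bool     # are grounded answers required?
    prompting_is_inconsistent: bool  # does prompt engineering alone fail on consistency?


def suggest_approaches(case: UseCase) -> list[str]:
    """Return a list of approaches suggested by the page's visible criteria."""
    # Assumption: prompt engineering is always the starting point.
    approaches = ["prompt engineering"]

    # Assumption: any single RAG criterion is enough to suggest RAG.
    if (
        case.knowledge_tokens > RAG_TOKEN_THRESHOLD
        or case.data_updates_frequently
        or case.needs_grounded_answers
    ):
        approaches.append("RAG")

    # Fine-tuning is suggested only when prompting fails on consistency.
    if case.prompting_is_inconsistent:
        approaches.append("fine-tuning")

    return approaches


if __name__ == "__main__":
    support_bot = UseCase(
        knowledge_tokens=50_000,
        data_updates_frequently=True,
        needs_grounded_answers=True,
        prompting_is_inconsistent=False,
    )
    print(suggest_approaches(support_bot))
```

A rule like this is at most a first filter. The page's own takeaway, "Measure before deciding", still applies: the suggested combination should be A/B tested on the specific use case.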