====== Vanna: AI Text-to-SQL via RAG ======
Vanna is an open-source Python framework that converts natural language questions into accurate SQL queries using Retrieval-Augmented Generation (RAG). With over 23,000 GitHub stars and MIT licensing, Vanna enables anyone to query databases in plain English without writing SQL. The framework learns your specific schema through a training process, then uses that knowledge combined with LLM inference to generate precise, executable SQL statements.
Vanna works with virtually any SQL database and any LLM provider, including local models via Ollama. Version 2.0 introduced enterprise features like user-aware security, streaming responses, row-level access control, and a modern web interface built on FastAPI.
===== Architecture =====
Vanna operates in two phases: **training** and **querying**. During training, you feed the RAG model your database schema (DDL statements), documentation, and example question-SQL pairs. This information is stored in a vector database for later retrieval. During querying, the user's natural language question is embedded and matched against the trained knowledge, which is then injected into the LLM context to produce accurate SQL.
The framework is built around a modular architecture with swappable components for the LLM service, vector store, SQL runner, and agent memory. Key API tools include:
* **RunSqlTool** -- Executes generated SQL against your database
* **VisualizeDataTool** -- Transforms results into charts and tables
* **SaveQuestionToolArgsTool** -- Enables learning from correct past queries
* **SaveTextMemoryTool** -- Stores textual context for future retrieval
===== Code Example =====
from vanna import Agent
from vanna.core.registry import ToolRegistry
from vanna.tools import RunSqlTool, VisualizeDataTool
from vanna.integrations.ollama import OllamaLlmService
from vanna.integrations.sqlite import SqliteRunner
from vanna.integrations.local.agent_memory import DemoAgentMemory
# Configure local LLM via Ollama
llm = OllamaLlmService(model="llama3.2", host="http://localhost:11434")
# Set up database runner and agent memory
db_tool = RunSqlTool(sql_runner=SqliteRunner(database_path="./analytics.db"))
agent_memory = DemoAgentMemory(max_items=1000)
# Register tools with access control
tools = ToolRegistry()
tools.register_local_tool(db_tool, access_groups=["admin", "analyst"])
tools.register_local_tool(VisualizeDataTool(), access_groups=["admin", "analyst"])
# Create the agent
agent = Agent(llm_service=llm, tool_registry=tools, agent_memory=agent_memory)
# Train on your schema
agent.train(ddl="CREATE TABLE sales (id INT, product TEXT, amount DECIMAL, date DATE)")
agent.train(question="total sales by product", sql="SELECT product, SUM(amount) FROM sales GROUP BY product")
# Ask questions in natural language
result = agent.ask("What were the top 5 products by revenue last quarter?")
print(result)
===== System Flow =====
%%{init: {"theme": "base", "themeVariables": {"primaryColor": "#4A90D9"}}}%%
graph TD
A[User Question] --> B[Embedding Service]
B --> C[Vector Store Lookup]
C --> D[Retrieve Schema + Examples]
D --> E[LLM Context Assembly]
E --> F[SQL Generation]
F --> G{Valid SQL?}
G -->|Yes| H[Execute on Database]
G -->|No| I[Error Feedback Loop]
I --> E
H --> J[Results + Visualization]
J --> K[Agent Memory Save]
K --> L[Return to User]
===== Key Features =====
* **Database Agnostic** -- PostgreSQL, MySQL, SQLite, BigQuery, Snowflake, and more
* **LLM Flexibility** -- OpenAI, Anthropic, Ollama (local), or any provider
* **Self-Learning** -- Improves accuracy through user feedback loops
* **Version 2.0** -- User-aware security, streaming responses, FastAPI integration
* **Deployment Options** -- Jupyter, Streamlit, Flask, Teams bot, or embedded widget
* **Privacy-First** -- Run fully local with Ollama and ChromaDB
===== References =====
* [[https://github.com/vanna-ai/vanna|Vanna GitHub Repository]]
* [[https://vanna.ai/docs/|Official Documentation]]
* [[https://vanna.ai|Vanna AI Website]]
* Shahul Es et al., "Evaluating Retrieval-Augmented Generation" (2023)
===== See Also =====
* [[ragas|RAGAS]] -- Evaluate the quality of your RAG-powered SQL pipeline
* [[r2r|R2R]] -- Production agentic RAG with knowledge graphs
* [[mastra|Mastra]] -- TypeScript framework for AI agent workflows