====== Vanna: AI Text-to-SQL via RAG ====== Vanna is an open-source Python framework that converts natural language questions into accurate SQL queries using Retrieval-Augmented Generation (RAG). With over 23,000 GitHub stars and MIT licensing, Vanna enables anyone to query databases in plain English without writing SQL. The framework learns your specific schema through a training process, then uses that knowledge combined with LLM inference to generate precise, executable SQL statements. Vanna works with virtually any SQL database and any LLM provider, including local models via Ollama. Version 2.0 introduced enterprise features like user-aware security, streaming responses, row-level access control, and a modern web interface built on FastAPI. ===== Architecture ===== Vanna operates in two phases: **training** and **querying**. During training, you feed the RAG model your database schema (DDL statements), documentation, and example question-SQL pairs. This information is stored in a vector database for later retrieval. During querying, the user's natural language question is embedded and matched against the trained knowledge, which is then injected into the LLM context to produce accurate SQL. The framework is built around a modular architecture with swappable components for the LLM service, vector store, SQL runner, and agent memory. Key API tools include: * **RunSqlTool** -- Executes generated SQL against your database * **VisualizeDataTool** -- Transforms results into charts and tables * **SaveQuestionToolArgsTool** -- Enables learning from correct past queries * **SaveTextMemoryTool** -- Stores textual context for future retrieval ===== Code Example ===== from vanna import Agent from vanna.core.registry import ToolRegistry from vanna.tools import RunSqlTool, VisualizeDataTool from vanna.integrations.ollama import OllamaLlmService from vanna.integrations.sqlite import SqliteRunner from vanna.integrations.local.agent_memory import DemoAgentMemory # Configure local LLM via Ollama llm = OllamaLlmService(model="llama3.2", host="http://localhost:11434") # Set up database runner and agent memory db_tool = RunSqlTool(sql_runner=SqliteRunner(database_path="./analytics.db")) agent_memory = DemoAgentMemory(max_items=1000) # Register tools with access control tools = ToolRegistry() tools.register_local_tool(db_tool, access_groups=["admin", "analyst"]) tools.register_local_tool(VisualizeDataTool(), access_groups=["admin", "analyst"]) # Create the agent agent = Agent(llm_service=llm, tool_registry=tools, agent_memory=agent_memory) # Train on your schema agent.train(ddl="CREATE TABLE sales (id INT, product TEXT, amount DECIMAL, date DATE)") agent.train(question="total sales by product", sql="SELECT product, SUM(amount) FROM sales GROUP BY product") # Ask questions in natural language result = agent.ask("What were the top 5 products by revenue last quarter?") print(result) ===== System Flow ===== %%{init: {"theme": "base", "themeVariables": {"primaryColor": "#4A90D9"}}}%% graph TD A[User Question] --> B[Embedding Service] B --> C[Vector Store Lookup] C --> D[Retrieve Schema + Examples] D --> E[LLM Context Assembly] E --> F[SQL Generation] F --> G{Valid SQL?} G -->|Yes| H[Execute on Database] G -->|No| I[Error Feedback Loop] I --> E H --> J[Results + Visualization] J --> K[Agent Memory Save] K --> L[Return to User] ===== Key Features ===== * **Database Agnostic** -- PostgreSQL, MySQL, SQLite, BigQuery, Snowflake, and more * **LLM Flexibility** -- OpenAI, Anthropic, Ollama (local), or any provider * **Self-Learning** -- Improves accuracy through user feedback loops * **Version 2.0** -- User-aware security, streaming responses, FastAPI integration * **Deployment Options** -- Jupyter, Streamlit, Flask, Teams bot, or embedded widget * **Privacy-First** -- Run fully local with Ollama and ChromaDB ===== References ===== * [[https://github.com/vanna-ai/vanna|Vanna GitHub Repository]] * [[https://vanna.ai/docs/|Official Documentation]] * [[https://vanna.ai|Vanna AI Website]] * Shahul Es et al., "Evaluating Retrieval-Augmented Generation" (2023) ===== See Also ===== * [[ragas|RAGAS]] -- Evaluate the quality of your RAG-powered SQL pipeline * [[r2r|R2R]] -- Production agentic RAG with knowledge graphs * [[mastra|Mastra]] -- TypeScript framework for AI agent workflows