====== Weaviate ======

**Weaviate** is an open-source vector database written in **Go** that stores both data objects and their vector embeddings, enabling semantic search, hybrid search, and structured filtering at scale. With over **16,000 stars** on GitHub,(([[https://github.com/weaviate/weaviate|GitHub Repository]])) it provides a cloud-native, real-time vector search engine built on Hierarchical Navigable Small World (HNSW) graphs, achieving >95% recall with millisecond latency.(([[https://weaviate.io/blog/vector-search-explained|Vector Search Explained]])) Weaviate combines vector similarity search with traditional structured data management, offering GraphQL and REST APIs, built-in AI model integration for automatic embedding generation, and horizontal scaling to billions of objects.(([[https://weaviate.io|Official Website]]))

===== How It Works =====

Weaviate stores data objects alongside their vector embeddings in an HNSW index — a hierarchical, multi-layered graph structure optimized for approximate nearest neighbor (ANN) search.(([[https://weaviate.io/blog/what-is-a-vector-database|What Is a Vector Database]])) When a query arrives, Weaviate can perform **semantic search** (vector similarity), **keyword search** (BM25), or **hybrid search** (combining both), with optional structured filters on object properties. The database supports automatic vectorization through **modules** — pluggable vectorizers that generate embeddings during data ingestion using models such as BERT, SBERT, OpenAI, or Cohere.
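To build intuition for how hybrid search combines the two result sets, here is a toy sketch of alpha-weighted score fusion. This is a simplification for exposition, not Weaviate's actual implementation (Weaviate ships ranked and relative score fusion strategies), and the helper names ''min_max'' and ''hybrid_fuse'' are hypothetical:

```python
# Illustrative sketch of alpha-weighted hybrid score fusion.
# NOT Weaviate's internal code; names and logic are simplified for exposition.

def min_max(scores):
    """Normalize scores to [0, 1] so vector and BM25 scales are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_fuse(vector_scores, bm25_scores, alpha=0.75):
    """alpha=1.0 -> pure vector search, alpha=0.0 -> pure keyword search."""
    v = min_max(vector_scores)
    k = min_max(bm25_scores)
    return [alpha * vs + (1 - alpha) * ks for vs, ks in zip(v, k)]

# Three candidate documents scored by both retrievers:
fused = hybrid_fuse([0.9, 0.5, 0.1], [2.0, 8.0, 4.0], alpha=0.75)
print(fused)  # doc 0 ranks first: its strong vector score dominates at alpha=0.75
```

At ''alpha=0.75'' the vector score carries three times the weight of the keyword score, which is why the document with the best embedding similarity wins even though it has the worst BM25 score.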
This eliminates the need for a separate embedding pipeline.(([[https://www.datacamp.com/tutorial/weaviate-tutorial|DataCamp Weaviate Tutorial]]))

===== Key Features =====

  * **Vector search** — HNSW-based ANN search with >95% recall at millisecond latency
  * **Hybrid search** — Combine semantic vector search with BM25 keyword search
  * **Structured filtering** — Blend similarity search with property-based constraints
  * **Auto-vectorization** — Built-in modules for OpenAI, Cohere, BERT, and SBERT embeddings
  * **GraphQL API** — Complex queries with nested references and aggregations
  * **Real-time CRUD** — Full create, read, update, and delete with live index updates
  * **Horizontal scaling** — Distributed architecture for billions of objects
  * **Multi-tenancy** — Isolated data per tenant on shared infrastructure
  * **Module ecosystem** — Vectorizers, readers, generators, and rankers

===== Installation and Usage =====

<code bash>
# Install the Weaviate Python client
pip install weaviate-client

# Start Weaviate with Docker
docker run -d -p 8080:8080 -p 50051:50051 \
  cr.weaviate.io/semitechnologies/weaviate:latest
</code>

<code python>
import weaviate
import weaviate.classes as wvc

# Connect to a local Weaviate instance
client = weaviate.connect_to_local()

# Create a collection with auto-vectorization
collection = client.collections.create(
    name="Article",
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_openai(),
    properties=[
        wvc.config.Property(name="title", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="content", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="category", data_type=wvc.config.DataType.TEXT),
    ],
)

# Insert data (embeddings are generated automatically)
articles = client.collections.get("Article")
articles.data.insert({"title": "Intro to ML", "content": "Machine learning...", "category": "AI"})
articles.data.insert({"title": "Go Concurrency", "content": "Goroutines...", "category": "Programming"})

# Semantic search (nearest neighbors by meaning)
results = articles.query.near_text(
    query="artificial intelligence basics",
    limit=5,
)

# Hybrid search (vector + keyword)
results = articles.query.hybrid(
    query="machine learning tutorial",
    alpha=0.75,  # 0 = keyword only, 1 = vector only
    limit=5,
)

# Filtered vector search
results = articles.query.near_text(
    query="deep learning",
    filters=wvc.query.Filter.by_property("category").equal("AI"),
    limit=5,
)

client.close()
</code>

===== Architecture =====

<code>
%%{init: {'theme': 'dark'}}%%
graph TB
    App([Application]) -->|GraphQL / REST / gRPC| API[Weaviate API Layer]
    API -->|Query| QE[Query Engine]
    QE -->|Vector Search| HNSW[HNSW Index]
    QE -->|Keyword Search| BM25[Inverted Index BM25]
    QE -->|Hybrid| Fusion[Score Fusion]
    QE -->|Filters| Filter[Property Filters]
    HNSW -->|ANN Results| Fusion
    BM25 -->|BM25 Results| Fusion
    Filter -->|Filtered Set| Fusion
    Fusion -->|Ranked Results| API
    API -->|Ingest| Ingest[Data Ingestion]
    Ingest -->|Auto-Vectorize| Modules{Vectorizer Modules}
    Modules -->|OpenAI| OAI[text2vec-openai]
    Modules -->|Cohere| Cohere[text2vec-cohere]
    Modules -->|Local| SBERT[text2vec-transformers]
    Ingest -->|Store| Storage[Object Store + HNSW]
    subgraph Written in Go
        API
        QE
        HNSW
        BM25
        Storage
    end
</code>

===== Search Modes =====

^ Mode ^ Method ^ Description ^
| Semantic | ''near_text'' / ''near_vector'' | Find objects by meaning using vector similarity |
| Keyword | ''bm25'' | Traditional full-text search with BM25 ranking |
| Hybrid | ''hybrid'' | Combine vector and keyword search with a configurable alpha |
| Filtered | Any + ''filters'' | Add property constraints to any search mode |

===== Deployment Options =====

  * **Docker** — Single-node for development and small workloads
  * **Kubernetes** — Distributed deployment with horizontal scaling
  * **Weaviate Cloud Services (WCS)** — Managed cloud with auto-scaling
  * **Embedded** — In-process for testing and prototyping

===== See Also =====

  * [[arize_phoenix|Arize Phoenix — AI Observability]]
  * [[outlines|Outlines — Structured Output via Constrained Decoding]]
  * [[chainlit|Chainlit — Conversational AI Framework]]

===== References =====