====== Milvus ======
**Milvus** is a cloud-native, open-source vector database written primarily in Go and C++ and optimized for high-performance similarity search over very large embedding datasets. With over **43,000 GitHub stars** and graduated-project status at the LF AI & Data Foundation, it is one of the more mature and widely deployed vector databases available.(([[https://github.com/milvus-io/milvus|Milvus GitHub Repository]]))

| **Repository** | [[https://github.com/milvus-io/milvus|github.com/milvus-io/milvus]] |
| **License** | Apache 2.0 |
| **Language** | Go, C++ |
| **Stars** | 43K+ |
| **Category** | Vector Database |

===== Key Features =====
  * **Billion-Scale Search**, Handles billion-scale vector datasets with millisecond latency
  * **Disaggregated Architecture**, Independent scaling of storage and compute with microservices design
  * **Multiple Index Types**, HNSW, IVF, DiskANN with Int8 compression for memory efficiency
  * **[[hybrid_search|Hybrid Search]]**, Combines vector similarity with scalar filtering, BM25 full-text search, and JSON path indexing
  * **Hot/Cold Tiering**, Automatic data tiering between memory/SSD and object storage based on access patterns
  * **Hardware Acceleration**, AVX512 and ARM NEON SIMD instructions, quantization, and GPU support
  * **Multi-Tenancy**, Supports up to 100K collections per cluster
  * **LF AI & Data Graduated**, Graduated project of the LF AI & Data Foundation, used in production at enterprise scale

===== Architecture =====
Milvus employs a multi-layered, microservices-based architecture separating access, coordination, worker, and storage layers:(([[https://milvus.io/docs/architecture_overview.md|Milvus Architecture Documentation]]))

  * **Access Layer**, Handles client requests, validation, and routing to internal nodes
  * **Coordinator Service**, The brain of the system; manages load balancing, data coordination, and query planning, with cluster metadata kept in etcd
  * **Streaming Node**, Go-based component built on Woodpecker WAL for real-time ingestion, eliminating external message queues
  * **Data Node and Query Node**, C++ worker nodes handling compaction, indexing on sealed segments, and query execution
  * **Object Storage**, All data persists in S3/MinIO with a diskless architecture for simplified geo-distributed deployments

<mermaid>
graph TB
    subgraph Access["Access Layer"]
        SDK[Client SDKs]
        Proxy[Proxy / Load Balancer]
    end
    subgraph Coord["Coordinator Service"]
        Root[Root Coord]
        Data[Data Coord]
        Query[Query Coord]
        Index[Index Coord]
    end
    subgraph Workers["Worker Nodes"]
        SN[Streaming Node - Go]
        DN[Data Node - C++]
        QN[Query Node - C++]
    end
    subgraph Storage["Storage Layer"]
        WAL[Woodpecker WAL]
        ObjStore[(Object Storage - S3/MinIO)]
        Etcd[(etcd - Metadata)]
    end
    Access --> Coord
    Coord --> Workers
    SN --> WAL
    DN --> ObjStore
    QN --> ObjStore
    Coord --> Etcd
</mermaid>
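
The write path implied by these layers (log the write, buffer it, then persist sealed segments to object storage) can be pictured with a toy model. This is a minimal sketch, not Milvus code: the class ''ToyIngestPath'', the ''seal_threshold'' parameter, and the segment naming are all hypothetical, and plain Python containers stand in for the WAL and for S3/MinIO.

<code python>
from dataclasses import dataclass, field


@dataclass
class ToyIngestPath:
    """Toy model: WAL append -> growing segment -> sealed segment in object storage."""

    seal_threshold: int = 3                           # hypothetical rows per segment
    wal: list = field(default_factory=list)           # stands in for the WAL
    growing: list = field(default_factory=list)       # buffered rows, not yet sealed
    object_store: dict = field(default_factory=dict)  # stands in for S3/MinIO

    def insert(self, row: dict) -> None:
        # 1. Log the write first so it could be replayed after a crash.
        self.wal.append(row)
        # 2. Buffer the row in the growing segment.
        self.growing.append(row)
        # 3. Seal and persist the segment once it is full.
        if len(self.growing) >= self.seal_threshold:
            self._seal()

    def _seal(self) -> None:
        segment_id = f"segment-{len(self.object_store)}"
        self.object_store[segment_id] = list(self.growing)
        self.growing.clear()


path = ToyIngestPath(seal_threshold=3)
for i in range(7):
    path.insert({"id": i})

print(sorted(path.object_store))  # names of the sealed segments
print(len(path.growing))          # rows still waiting in the growing segment
</code>

In the real system these responsibilities are split across separate services; the sketch only shows the ordering of the steps.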

===== Indexing Algorithms =====
Milvus supports multiple vector indexing strategies:

  * **HNSW**, [[hnsw_graphs|Hierarchical Navigable Small World graphs]] for high-accuracy approximate nearest neighbor search; supports Int8 compression to reduce memory while preserving accuracy
  * **IVF**, Inverted File index for partitioned search with configurable accuracy/speed tradeoffs
  * **DiskANN**, Disk-optimized approximate nearest neighbor for datasets larger than memory
  * **BM25**, Full-text search index, reported by the Milvus team as up to 400% faster than Elasticsearch in their own benchmarks; the search-time parameter drop_ratio_search is configurable
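
The accuracy/speed tradeoff of an IVF index is easier to see in a small example. The following is a minimal pure-Python sketch of the general IVF idea, not Milvus's implementation: the centroids are hand-picked here rather than learned by clustering, the helper names are hypothetical, and ''nprobe'' is used as the conventional name for the number of partitions scanned per query.

<code python>
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_ivf(vectors, centroids):
    """Assign each vector id to the bucket of its nearest centroid."""
    buckets = {c: [] for c in range(len(centroids))}
    for vec_id, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        buckets[nearest].append(vec_id)
    return buckets


def ivf_search(query, vectors, centroids, buckets, nprobe, k):
    """Scan only the nprobe buckets whose centroids are closest to the query."""
    probe_order = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [i for c in probe_order[:nprobe] for i in buckets[c]]
    return sorted(candidates, key=lambda i: l2(query, vectors[i]))[:k]


# Hand-picked toy data: two partitions on a line.
centroids = [(0.0, 0.0), (10.0, 0.0)]
vectors = [(1.0, 0.0), (0.0, 1.0), (4.9, 0.0), (9.0, 0.0), (10.0, 1.0)]
buckets = build_ivf(vectors, centroids)

query = (5.1, 0.0)
fast = ivf_search(query, vectors, centroids, buckets, nprobe=1, k=1)
exact = ivf_search(query, vectors, centroids, buckets, nprobe=2, k=1)
print(fast, exact)
</code>

With ''nprobe=1'' the search scans only the partition closest to the query and misses the true nearest neighbor, which sits just across the partition boundary; with ''nprobe=2'' every partition is scanned and the exact answer is returned at the cost of more distance computations.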

===== Storage and Tiering =====
Milvus 2.6 introduced tiered hot/cold separation:(([[https://milvus.io/blog/introduce-milvus-2-6-built-for-scale-designed-to-reduce-costs.md|Milvus 2.6 Release Blog]]))

  * **Hot data**, Frequently accessed data stays in memory/SSD for low-latency queries
  * **Cold data**, Infrequently accessed data migrates to object storage (S3/MinIO/NetApp)
  * **Dynamic migration**, Automatic tiering based on access patterns, with a reported storage cost reduction of around 50%
  * **Storage v2**, Parquet/Vortex formats for Spark compatibility and reduced IOPS/memory usage
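
As a rough illustration of access-based tiering, the sketch below keeps a bounded hot tier in front of an unbounded cold tier. It is an assumption-laden toy: the article does not describe Milvus's actual migration policy, so a simple least-recently-used (LRU) rule is swapped in, and ''ToyTieredStore'' and ''hot_capacity'' are hypothetical names.

<code python>
from collections import OrderedDict


class ToyTieredStore:
    """Toy hot/cold store using LRU eviction (a stand-in policy, not Milvus's)."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity  # hypothetical limit on hot entries
        self.hot = OrderedDict()          # stands in for memory/SSD
        self.cold = {}                    # stands in for object storage

    def put(self, key, value):
        # New data lands in the cold tier until something reads it.
        self.cold[key] = value

    def get(self, key):
        if key in self.hot:
            # Refresh recency for data that is already hot.
            self.hot.move_to_end(key)
            return self.hot[key]
        # Promote from cold to hot on access.
        value = self.cold.pop(key)
        self.hot[key] = value
        # Demote the least recently used entries if the hot tier overflows.
        while len(self.hot) > self.hot_capacity:
            old_key, old_value = self.hot.popitem(last=False)
            self.cold[old_key] = old_value
        return value


store = ToyTieredStore(hot_capacity=2)
for name in ("s1", "s2", "s3"):
    store.put(name, f"data-{name}")

for name in ("s1", "s2", "s1", "s3"):
    store.get(name)

print(list(store.hot))     # segments currently in the hot tier
print(sorted(store.cold))  # segments that were demoted or never read
</code>

Reading ''s3'' overflows the hot tier, so ''s2'', the least recently used segment, is demoted back to the cold tier.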

===== Code Example =====
<code python>
from pymilvus import MilvusClient

# Connect to Milvus
client = MilvusClient(uri="http://localhost:19530")

# Create a quick-setup collection (default id and vector fields; extra keys go to dynamic fields)
client.create_collection(
    collection_name="articles",
    dimension=768,
    metric_type="COSINE",
    auto_id=True
)

# Insert vectors with metadata
data = [
    {"vector": [0.1] * 768, "title": "RAG Guide", "category": "ai"},
    {"vector": [0.2] * 768, "title": "Vector DBs", "category": "database"},
]
client.insert(collection_name="articles", data=data)

# Hybrid search: vector similarity + scalar filter
results = client.search(
    collection_name="articles",
    data=[[0.15] * 768],
    filter='category == "ai"',
    limit=10,
    output_fields=["title", "category"]
)
for hits in results:
    for hit in hits:
        print(f"{hit['entity']['title']}: {hit['distance']:.4f}")
</code>
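
For intuition about what a filtered search like this returns, the same operation can be written as a brute-force reference in plain Python with no server required. This is only a sketch of the semantics (apply the scalar filter, then rank by cosine similarity), not how Milvus executes the query; ''filtered_search'' is a hypothetical helper and the rows are made-up sample data.

<code python>
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def filtered_search(rows, query, predicate, limit):
    """Keep rows that pass the scalar predicate, then rank by cosine similarity."""
    candidates = [row for row in rows if predicate(row)]
    ranked = sorted(
        candidates,
        key=lambda row: cosine_similarity(row["vector"], query),
        reverse=True,  # higher cosine similarity means a closer match
    )
    return ranked[:limit]


# Made-up two-dimensional sample rows for illustration.
rows = [
    {"vector": [1.0, 0.0], "title": "RAG Guide", "category": "ai"},
    {"vector": [0.6, 0.8], "title": "Prompting Basics", "category": "ai"},
    {"vector": [1.0, 0.1], "title": "Vector DBs", "category": "database"},
]

hits = filtered_search(
    rows,
    query=[1.0, 0.0],
    predicate=lambda row: row["category"] == "ai",
    limit=10,
)
for row in hits:
    print(row["title"])
</code>

The "Vector DBs" row is close to the query but is excluded by the filter before ranking, which is the behavior the ''filter'' argument expresses in the Milvus example.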

===== See Also =====
  * [[weaviate|Weaviate]]
  * [[qdrant|Qdrant]]
  * [[supabase_vector|Supabase Vector]]
  * [[lateon|LateOn]]
  * [[vector_db_comparison|Vector Database Comparison]]

===== References =====