Milvus

Milvus is a cloud-native, open-source vector database built primarily in Go and C++ and optimized for high-performance similarity search over very large embedding datasets. With over 43,000 GitHub stars and graduated-project status at the LF AI & Data Foundation, it is one of the more mature and widely deployed open-source vector databases.

Repository	github.com/milvus-io/milvus
License	Apache 2.0
Language	Go, C++
Stars	43K+
Category	Vector Database

Key Features

Billion-Scale Search – Handles billion-scale vector datasets with millisecond latency
Disaggregated Architecture – Independent scaling of storage and compute with microservices design
Multiple Index Types – HNSW, IVF, DiskANN with Int8 compression for memory efficiency
Hybrid Search – Combines vector similarity with scalar filtering, BM25 full-text search, and JSON path indexing
Hot/Cold Tiering – Automatic data tiering between memory/SSD and object storage based on access patterns
Hardware Acceleration – AVX512, Neon SIMD, quantization, and GPU support
Multi-Tenancy – Supports up to 100K collections per cluster
LF AI & Data Graduated – A graduated project of the LF AI & Data Foundation with a long record of production deployments
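
The Int8 compression mentioned in the feature list can be pictured with a small, self-contained sketch. This is not Milvus code: it assumes plain symmetric scalar quantization, and the helper names (quantize_int8, dequantize_int8) are made up for illustration. It only shows why storing one signed byte per dimension instead of a four-byte float32 costs little accuracy.

```python
import math

def quantize_int8(vec):
    """Symmetric scalar quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(x) for x in vec)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [round(x / scale) for x in vec]
    return codes, scale

def dequantize_int8(codes, scale):
    """Approximate reconstruction of the original floats."""
    return [c * scale for c in codes]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 6-dimensional embedding, used only for the demonstration.
original = [0.12, -0.53, 0.98, 0.04, -0.77, 0.31]
codes, scale = quantize_int8(original)
restored = dequantize_int8(codes, scale)

print("int8 codes:", codes)
print(f"cosine(original, restored) = {cosine(original, restored):.6f}")
```

Each component is off by at most half a quantization step, so the direction of the vector, which is what cosine similarity measures, is almost unchanged.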

Architecture

Milvus employs a multi-layered, microservices-based architecture separating access, coordination, worker, and storage layers:

Access Layer – Handles client requests, validation, and routing to internal nodes
Coordinator Service – The brain of the system; manages load balancing, data coordination, and query planning, with cluster metadata stored in etcd
Streaming Node – Go-based component built on Woodpecker WAL for real-time ingestion, removing the need for an external message queue such as Kafka or Pulsar
Data Node and Query Node – C++ worker nodes handling compaction, indexing on sealed segments, and query execution
Object Storage – All data persists in S3/MinIO with a diskless architecture for simplified geo-distributed deployments

graph TB
    subgraph Access["Access Layer"]
        SDK[Client SDKs]
        Proxy[Proxy / Load Balancer]
    end
    subgraph Coord["Coordinator Service"]
        Root[Root Coord]
        Data[Data Coord]
        Query[Query Coord]
        Index[Index Coord]
    end
    subgraph Workers["Worker Nodes"]
        SN[Streaming Node - Go]
        DN[Data Node - C++]
        QN[Query Node - C++]
    end
    subgraph Storage["Storage Layer"]
        WAL[Woodpecker WAL]
        ObjStore[(Object Storage - S3/MinIO)]
        Etcd[(etcd - Metadata)]
    end
    Access --> Coord
    Coord --> Workers
    SN --> WAL
    DN --> ObjStore
    QN --> ObjStore
    Coord --> Etcd

Indexing Algorithms

Milvus supports multiple vector indexing strategies:

HNSW – Hierarchical Navigable Small World graphs for high-accuracy approximate nearest neighbor search; supports Int8 compression to reduce memory while preserving accuracy
IVF – Inverted File index for partitioned search with configurable accuracy/speed tradeoffs
DiskANN – Disk-optimized approximate nearest neighbor for datasets larger than memory
BM25 – Sparse inverted index for full-text search, which the Milvus project reports as up to 4x faster than Elasticsearch in its own benchmarks; supports a configurable drop_ratio_search parameter
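
To make the IVF accuracy/speed tradeoff concrete, here is a toy inverted-file index in plain Python. It is a sketch under simplifying assumptions (a few rounds of k-means, L2 distance, random data) and is not how Milvus implements IVF; the names build_ivf, ivf_search, n_lists, and nprobe are illustrative choices, although nprobe mirrors the usual name for "how many partitions to scan".

```python
import math
import random

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda i: l2(point, centroids[i]))

def assign(vectors, centroids):
    """Bucket every vector id under its nearest centroid."""
    buckets = [[] for _ in centroids]
    for idx, v in enumerate(vectors):
        buckets[nearest(v, centroids)].append(idx)
    return buckets

def build_ivf(vectors, n_lists, iterations=5, seed=0):
    """Run a few k-means rounds, then return centroids and inverted lists."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, n_lists)]
    dim = len(vectors[0])
    for _ in range(iterations):
        for i, members in enumerate(assign(vectors, centroids)):
            if members:  # keep the old centroid if a list ends up empty
                centroids[i] = [
                    sum(vectors[m][d] for m in members) / len(members)
                    for d in range(dim)
                ]
    return centroids, assign(vectors, centroids)

def ivf_search(query, vectors, centroids, buckets, k, nprobe):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda i: l2(query, centroids[i]))
    candidates = [idx for i in order[:nprobe] for idx in buckets[i]]
    candidates.sort(key=lambda idx: l2(query, vectors[idx]))
    return candidates[:k], len(candidates)

rng = random.Random(42)
vectors = [[rng.random() for _ in range(8)] for _ in range(500)]
centroids, buckets = build_ivf(vectors, n_lists=10)
query = [rng.random() for _ in range(8)]

# Exact top-5 by brute force, used as the ground truth for recall.
exact = sorted(range(len(vectors)), key=lambda i: l2(query, vectors[i]))[:5]
for nprobe in (1, 3, 10):
    approx, scanned = ivf_search(query, vectors, centroids, buckets, k=5, nprobe=nprobe)
    recall = len(set(approx) & set(exact)) / len(exact)
    print(f"nprobe={nprobe:2d} scanned={scanned:3d} recall@5={recall:.2f}")
```

Raising nprobe scans more vectors and moves recall toward 1.0; with nprobe equal to the number of lists the search degenerates to an exact scan.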

Storage and Tiering

Milvus 2.6 introduced tiered hot/cold separation:

Hot data – Frequently accessed data stays in memory/SSD for low-latency queries
Cold data – Infrequently accessed data migrates to object storage (S3/MinIO/NetApp)
Dynamic migration – Automatic tiering based on access patterns, reducing storage costs by a reported 50%
Storage v2 – Parquet/Vortex formats for Spark compatibility and reduced IOPS/memory usage
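
The hot/cold idea can be sketched as a bounded cache in front of a durable store. The class below is a toy model, not Milvus's tiering policy: it assumes least-recently-used eviction, treats the cold tier as the source of truth, and uses invented names (TieredStore, hot_capacity, segment ids such as "seg-1").

```python
from collections import OrderedDict

class TieredStore:
    """Toy two-tier store: a bounded hot tier in front of a durable cold tier."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()  # stands in for memory/SSD
        self.cold = {}            # stands in for object storage

    def put(self, segment_id, payload):
        # New data is written to the cold tier and promoted only when read.
        self.cold[segment_id] = payload

    def get(self, segment_id):
        if segment_id in self.hot:
            self.hot.move_to_end(segment_id)  # mark as most recently used
            return self.hot[segment_id], "hot"
        payload = self.cold[segment_id]       # raises KeyError if unknown
        self._promote(segment_id, payload)
        return payload, "cold"

    def _promote(self, segment_id, payload):
        self.hot[segment_id] = payload
        if len(self.hot) > self.hot_capacity:
            # Evict the least recently used segment; the cold copy remains,
            # so nothing has to be written back.
            self.hot.popitem(last=False)

store = TieredStore(hot_capacity=2)
for seg in ("seg-1", "seg-2", "seg-3"):
    store.put(seg, f"data for {seg}")

for seg in ("seg-1", "seg-2", "seg-1", "seg-3", "seg-2"):
    _, tier = store.get(seg)
    print(f"{seg}: served from {tier}, hot tier now {list(store.hot)}")
```

Only the third read is served from the hot tier; the later reads of seg-3 and seg-2 each evict the least recently used segment, which is the access-pattern-driven migration described above in miniature.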

Code Example

from pymilvus import MilvusClient

# Connect to a Milvus server on the default port
client = MilvusClient(uri="http://localhost:19530")

# Quick-setup collection: creates an "id" primary key and a "vector" field;
# extra keys such as "title" are stored through the dynamic field
client.create_collection(
    collection_name="articles",
    dimension=768,
    metric_type="COSINE",
    auto_id=True
)

# Insert vectors with metadata (constant placeholder vectors; real
# embeddings would come from an embedding model)
data = [
    {"vector": [0.1] * 768, "title": "RAG Guide", "category": "ai"},
    {"vector": [0.2] * 768, "title": "Vector DBs", "category": "database"},
]
client.insert(collection_name="articles", data=data)

# Filtered search: vector similarity restricted by a scalar filter
results = client.search(
    collection_name="articles",
    data=[[0.15] * 768],
    filter='category == "ai"',
    limit=10,
    output_fields=["title", "category"]
)
# With COSINE, hit["distance"] is a similarity score: higher means closer
for hits in results:
    for hit in hits:
        print(f"{hit['entity']['title']}: {hit['distance']:.4f}")
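
What a filtered search computes can be shown without a running server. The sketch below is a brute-force stand-in, not the Milvus query path: it applies the scalar predicate first and then ranks the survivors by cosine similarity. The function names and the third row ("Agent Memory") are hypothetical, and the three-dimensional vectors are chosen by hand so the scores are easy to check.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def filtered_search(rows, query, predicate, limit):
    """Keep rows matching the scalar predicate, then rank by cosine similarity."""
    candidates = [row for row in rows if predicate(row)]
    scored = [(cosine_similarity(row["vector"], query), row) for row in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # higher score first
    return scored[:limit]

rows = [
    {"vector": [1.0, 0.0, 0.0], "title": "RAG Guide", "category": "ai"},
    {"vector": [0.0, 1.0, 0.0], "title": "Vector DBs", "category": "database"},
    {"vector": [0.6, 0.8, 0.0], "title": "Agent Memory", "category": "ai"},
]

hits = filtered_search(
    rows,
    query=[0.0, 1.0, 0.0],
    predicate=lambda row: row["category"] == "ai",
    limit=10,
)
for score, row in hits:
    print(f"{row['title']}: {score:.4f}")
# prints:
# Agent Memory: 0.8000
# RAG Guide: 0.0000
```

"Vector DBs" is the closest vector to the query but never appears, because the category filter removes it before ranking.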
