Milvus
Milvus is a cloud-native, open-source vector database built primarily in Go and C++ and optimized for high-performance similarity search on massive-scale embedding datasets. With over 43,000 GitHub stars and graduated status in the LF AI & Data Foundation, it is one of the most mature and battle-tested vector databases available.1)
| Repository | github.com/milvus-io/milvus | |
| License | Apache 2.0 | |
| Language | Go, C++ | |
| Stars | 43K+ | |
| Category | Vector Database | |
Key Features
Billion-Scale Search, Handles billion-scale vector datasets with millisecond latency
Disaggregated Architecture, Independent scaling of storage and compute with microservices design
Multiple Index Types, HNSW, IVF, DiskANN with Int8 compression for memory efficiency
Hybrid Search, Combines vector similarity with scalar filtering, BM25 full-text search, and JSON path indexing
Hot/Cold Tiering, Automatic data tiering between memory/SSD and object storage based on access patterns
Hardware Acceleration, AVX512, Neon SIMD, quantization, and GPU support
Multi-Tenancy, Supports up to 100K collections per cluster
LF AI & Data Graduated, Production-proven with enterprise-grade reliability
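The hybrid search feature listed above can be sketched in a few lines of plain Python: filter on a scalar field first, then rank the survivors by cosine similarity. This is a toy illustration of the idea only, not how Milvus implements it; the documents, fields, and 2-dimensional vectors are invented for the example.

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical documents with a scalar "category" field
docs = [
    {"vector": [1.0, 0.0], "title": "RAG Guide", "category": "ai"},
    {"vector": [0.0, 1.0], "title": "Vector DBs", "category": "database"},
    {"vector": [0.9, 0.1], "title": "LLM Agents", "category": "ai"},
]

def hybrid_search(query, category, limit=2):
    # Scalar filter first, then rank the survivors by vector similarity
    candidates = [d for d in docs if d["category"] == category]
    candidates.sort(key=lambda d: cosine(query, d["vector"]), reverse=True)
    return candidates[:limit]

hits = hybrid_search([1.0, 0.1], "ai")
```

In production the filter and the vector scan are pushed down into the same query plan rather than run as two passes, but the contract is the same: only entities passing the scalar predicate are ranked.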
Architecture
Milvus employs a multi-layered, microservices-based architecture separating access, coordination, worker, and storage layers:2)
Access Layer, Handles client requests, validation, and routing to internal nodes
Coordinator Service, The brain of the system; manages load balancing, data coordination, and query planning via etcd
Streaming Node, Go-based component built on Woodpecker WAL for real-time ingestion, eliminating external message queues
Data Node and Query Node, C++ worker nodes handling compaction, indexing on sealed segments, and query execution
Object Storage, All data persists in S3/MinIO with a diskless architecture for simplified geo-distributed deployments
graph TB
subgraph Access["Access Layer"]
SDK[Client SDKs]
Proxy[Proxy / Load Balancer]
end
subgraph Coord["Coordinator Service"]
Root[Root Coord]
Data[Data Coord]
Query[Query Coord]
Index[Index Coord]
end
subgraph Workers["Worker Nodes"]
SN[Streaming Node - Go]
DN[Data Node - C++]
QN[Query Node - C++]
end
subgraph Storage["Storage Layer"]
WAL[Woodpecker WAL]
ObjStore[(Object Storage - S3/MinIO)]
Etcd[(etcd - Metadata)]
end
Access --> Coord
Coord --> Workers
SN --> WAL
DN --> ObjStore
QN --> ObjStore
Coord --> Etcd
Indexing Algorithms
Milvus supports multiple vector indexing strategies:
HNSW, Graph-based index offering high recall with fast in-memory search
IVF, Inverted File index for partitioned search with configurable accuracy/speed tradeoffs
DiskANN, Disk-optimized approximate nearest neighbor for datasets larger than memory
BM25, Full-text search index with configurable drop_ratio_search; the Milvus team reports up to 4x faster full-text query performance than Elasticsearch
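The accuracy/speed tradeoff behind IVF can be illustrated with a minimal inverted-file sketch: vectors are bucketed under their nearest centroid at build time, and a query probes only the nprobe closest buckets. The hand-picked centroids below stand in for the usual k-means training step; this is a toy illustration, not Milvus's implementation.

```python
import math

def l2(a, b):
    # Euclidean distance between two vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hand-picked centroids standing in for the k-means training step
centroids = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]

vectors = [[0.5, 0.2], [9.5, 0.3], [0.1, 9.8], [10.2, 0.5], [0.3, 0.4]]

# Build inverted lists: each vector goes into its nearest centroid's bucket
buckets = {i: [] for i in range(len(centroids))}
for v in vectors:
    nearest = min(range(len(centroids)), key=lambda i: l2(v, centroids[i]))
    buckets[nearest].append(v)

def ivf_search(query, nprobe=1, k=2):
    # Probe only the nprobe closest buckets: the accuracy/speed knob.
    # Small nprobe = fast but may miss neighbors in unprobed buckets;
    # large nprobe approaches exact (brute-force) search.
    order = sorted(range(len(centroids)), key=lambda i: l2(query, centroids[i]))
    candidates = [v for i in order[:nprobe] for v in buckets[i]]
    return sorted(candidates, key=lambda v: l2(query, v))[:k]

top = ivf_search([9.0, 0.0], nprobe=1)
```

Raising nprobe scans more inverted lists, trading latency for recall, which is exactly the knob Milvus exposes on its IVF index family.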
Storage and Tiering
Milvus 2.6 introduced tiered hot/cold separation:3)
Hot data, Frequently accessed data stays in memory/SSD for low-latency queries
Cold data, Infrequently accessed data migrates to object storage (S3/MinIO/NetApp)
Dynamic migration, Automatic tiering based on access patterns, reported to cut storage costs by up to 50%
Storage v2, Parquet/Vortex formats for Spark compatibility and reduced IOPS/memory usage
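The hot/cold mechanism above can be sketched as a capacity-bounded LRU: entries evicted from the hot tier migrate to a simulated object-storage dict, and a cold read promotes data back into memory. This is a toy sketch of access-pattern-based tiering, not Milvus's actual migration logic; the segment names and capacity are invented.

```python
from collections import OrderedDict

HOT_CAPACITY = 2      # hypothetical memory/SSD budget, in segments
hot = OrderedDict()   # hot tier: memory/SSD (most recently used last)
cold = {}             # cold tier: simulated object storage (S3/MinIO)

def put(key, value):
    # Writes land in the hot tier; overflow demotes the least
    # recently used segment to object storage
    hot[key] = value
    hot.move_to_end(key)
    while len(hot) > HOT_CAPACITY:
        demoted, v = hot.popitem(last=False)  # evict coldest entry
        cold[demoted] = v                     # migrate to object storage

def get(key):
    # Hot hit: refresh recency. Cold hit: fetch from object storage
    # and promote back to the hot tier (raises KeyError if absent).
    if key in hot:
        hot.move_to_end(key)
        return hot[key]
    value = cold.pop(key)
    put(key, value)
    return value

put("seg1", "vectors-1")
put("seg2", "vectors-2")
put("seg3", "vectors-3")   # seg1 is now the coldest and gets demoted
```

After the three writes, seg1 sits in the cold tier; reading it again pulls it back into memory and demotes seg2 in its place, mirroring the access-pattern-driven migration described above.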
Code Example
from pymilvus import MilvusClient
# Connect to Milvus
client = MilvusClient(uri="http://localhost:19530")
# Create collection with schema
client.create_collection(
collection_name="articles",
dimension=768,
metric_type="COSINE",
auto_id=True
)
# Insert vectors with metadata
data = [
{"vector": [0.1] * 768, "title": "RAG Guide", "category": "ai"},
{"vector": [0.2] * 768, "title": "Vector DBs", "category": "database"},
]
client.insert(collection_name="articles", data=data)
# Hybrid search: vector similarity + scalar filter
results = client.search(
collection_name="articles",
data=[[0.15] * 768],
filter='category == "ai"',
limit=10,
output_fields=["title", "category"]
)
for hits in results:
for hit in hits:
print(f"{hit['entity']['title']}: {hit['distance']:.4f}")
See Also
References