====== Milvus ======
**Milvus** is a cloud-native, open-source vector database, written primarily in Go and C++, that is optimized for high-performance similarity search over very large embedding datasets. With over **43,000 GitHub stars** and graduated status in the LF AI & Data Foundation, it is one of the most mature and widely deployed vector databases available.
| **Repository** | [[https://github.com/milvus-io/milvus|github.com/milvus-io/milvus]] |
| **License** | Apache 2.0 |
| **Language** | Go, C++ |
| **Stars** | 43K+ |
| **Category** | Vector Database |
===== Key Features =====
* **Billion-Scale Search** -- Handles billion-scale vector datasets with millisecond latency
* **Disaggregated Architecture** -- Independent scaling of storage and compute with microservices design
* **Multiple Index Types** -- HNSW, IVF, DiskANN with Int8 compression for memory efficiency
* **Hybrid Search** -- Combines vector similarity with scalar filtering, BM25 full-text search, and JSON path indexing
* **Hot/Cold Tiering** -- Automatic data tiering between memory/SSD and object storage based on access patterns
* **Hardware Acceleration** -- AVX512, Neon SIMD, quantization, and GPU support
* **Multi-Tenancy** -- Supports up to 100K collections per cluster
* **LF AI & Data Graduated** -- A graduated project of the LF AI & Data Foundation, proven in production deployments
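The memory saving behind Int8 compression can be illustrated with a minimal sketch of symmetric scalar quantization. This is not Milvus's internal implementation; the function names are hypothetical and the scheme is the simplest one that shows the idea: each 4-byte float is replaced by a 1-byte integer code plus one shared scale factor, at the cost of a small rounding error.

```python
def quantize_int8(vector):
    """Symmetric scalar quantization: map floats to integer codes in [-127, 127]."""
    max_abs = max(abs(x) for x in vector)
    # One scale per vector; fall back to 1.0 for an all-zero vector.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [round(x / scale) for x in vector]
    return codes, scale


def dequantize_int8(codes, scale):
    """Approximate reconstruction of the original floats."""
    return [c * scale for c in codes]


vector = [0.12, -0.5, 0.33, 0.9]
codes, scale = quantize_int8(vector)
restored = dequantize_int8(codes, scale)

# The per-component error is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(vector, restored))
print(codes, f"max error = {max_error:.5f}")
```

Storing one byte per dimension instead of four cuts raw vector memory to roughly a quarter, which is why quantization is usually paired with in-memory indexes.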
===== Architecture =====
Milvus employs a multi-layered, microservices-based architecture separating access, coordination, worker, and storage layers:
* **Access Layer** -- Handles client requests, validation, and routing to internal nodes
  * **Coordinator Service** -- The brain of the system; manages load balancing, data coordination, and query planning, keeping its metadata in etcd
  * **Streaming Node** -- Go-based component built on the Woodpecker WAL for real-time ingestion, removing the need for an external message queue
* **Data Node and Query Node** -- C++ worker nodes handling compaction, indexing on sealed segments, and query execution
* **Object Storage** -- All data persists in S3/MinIO with a diskless architecture for simplified geo-distributed deployments
graph TB
    subgraph Access["Access Layer"]
        SDK[Client SDKs]
        Proxy[Proxy / Load Balancer]
    end
    subgraph Coord["Coordinator Service"]
        Root[Root Coord]
        Data[Data Coord]
        Query[Query Coord]
        Index[Index Coord]
    end
    subgraph Workers["Worker Nodes"]
        SN[Streaming Node - Go]
        DN[Data Node - C++]
        QN[Query Node - C++]
    end
    subgraph Storage["Storage Layer"]
        WAL[Woodpecker WAL]
        ObjStore[(Object Storage - S3/MinIO)]
        Etcd[(etcd - Metadata)]
    end
    Access --> Coord
    Coord --> Workers
    SN --> WAL
    DN --> ObjStore
    QN --> ObjStore
    Coord --> Etcd
===== Indexing Algorithms =====
Milvus supports multiple vector indexing strategies:
* **HNSW** -- Hierarchical Navigable Small World graphs for high-accuracy approximate nearest neighbor search; supports Int8 compression to reduce memory while preserving accuracy
* **IVF** -- Inverted File index for partitioned search with configurable accuracy/speed tradeoffs
* **DiskANN** -- Disk-optimized approximate nearest neighbor for datasets larger than memory
  * **BM25** -- Full-text search index with a configurable drop_ratio_search parameter; the Milvus team reports it as up to 400% faster than Elasticsearch
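The accuracy/speed tradeoff of an IVF index can be sketched in plain Python. This is a toy illustration of the general technique, not Milvus code: the helper names are hypothetical, and randomly sampled vectors stand in for the k-means centroids a real index would train. Vectors are grouped into partitions by nearest centroid, and a query scans only the nprobe partitions closest to it.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_ivf(vectors, centroids):
    """Assign every vector id to the inverted list of its nearest centroid."""
    lists = {i: [] for i in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        lists[nearest].append(vid)
    return lists


def ivf_search(query, vectors, centroids, lists, nprobe, k):
    """Scan only the nprobe partitions whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))[:nprobe]
    candidates = [vid for c in probe for vid in lists[c]]
    return sorted(candidates, key=lambda vid: l2(query, vectors[vid]))[:k]


random.seed(0)
vectors = [[random.random(), random.random()] for _ in range(200)]
centroids = random.sample(vectors, 8)  # assumption: stand-in for k-means training
lists = build_ivf(vectors, centroids)

query = [0.5, 0.5]
exact = sorted(range(len(vectors)), key=lambda vid: l2(query, vectors[vid]))[:5]
fast = ivf_search(query, vectors, centroids, lists, nprobe=2, k=5)   # fewer vectors scanned
full = ivf_search(query, vectors, centroids, lists, nprobe=8, k=5)   # scans every partition
print("exact:", exact, "nprobe=2:", fast)
```

Probing every partition reproduces the exact result, while a small nprobe scans far fewer vectors and may miss some true neighbors. Milvus exposes the same idea through the nlist (number of partitions) and nprobe (partitions searched) parameters of its IVF index types.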
===== Storage and Tiering =====
Milvus 2.6 introduced tiered hot/cold separation:
* **Hot data** -- Frequently accessed data stays in memory/SSD for low-latency queries
* **Cold data** -- Infrequently accessed data migrates to object storage (S3/MinIO/NetApp)
  * **Dynamic migration** -- Automatic tiering based on access patterns, reported to reduce storage costs by up to 50%
* **Storage v2** -- Parquet/Vortex formats for Spark compatibility and reduced IOPS/memory usage
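Access-based tiering can be pictured with a small least-recently-used model. The sketch below is an assumption-laden toy, not how Milvus implements tiering: the TieredStore class and segment names are hypothetical, an in-memory dictionary stands in for object storage, and recency is the only access signal considered.

```python
from collections import OrderedDict


class TieredStore:
    """Toy two-tier store: a bounded hot tier (LRU order) in front of a cold tier."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()  # stands in for memory/SSD
        self.cold = {}            # stands in for object storage

    def put(self, key, value):
        self.cold.pop(key, None)  # a rewritten key lives only in the hot tier
        self.hot[key] = value
        self.hot.move_to_end(key)
        self._evict()

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)  # refresh recency on a hot hit
            return self.hot[key]
        value = self.cold.pop(key)     # raises KeyError for unknown keys
        self.hot[key] = value          # promote cold data on access
        self._evict()
        return value

    def _evict(self):
        # Demote the least recently used entries once the hot tier is full.
        while len(self.hot) > self.hot_capacity:
            old_key, old_value = self.hot.popitem(last=False)
            self.cold[old_key] = old_value


store = TieredStore(hot_capacity=2)
for segment in ("seg-a", "seg-b", "seg-c"):
    store.put(segment, f"data for {segment}")

store.get("seg-a")  # promotes seg-a and demotes the least recently used segment
print("hot:", list(store.hot), "cold:", list(store.cold))
```

A real system would typically keep object storage as the durable copy and treat the hot tier as a cache; the sketch moves data between tiers only to keep the bookkeeping visible.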
===== Code Example =====
from pymilvus import MilvusClient

# Connect to Milvus (assumes a server is listening on localhost:19530)
client = MilvusClient(uri="http://localhost:19530")

# Quick-setup collection with default "id" and "vector" fields; extra keys such
# as "title" and "category" are stored via the dynamic field, enabled by default
client.create_collection(
    collection_name="articles",
    dimension=768,
    metric_type="COSINE",
    auto_id=True,
)

# Insert vectors with metadata
data = [
    {"vector": [0.1] * 768, "title": "RAG Guide", "category": "ai"},
    {"vector": [0.2] * 768, "title": "Vector DBs", "category": "database"},
]
client.insert(collection_name="articles", data=data)

# Filtered search: vector similarity + scalar filter
results = client.search(
    collection_name="articles",
    data=[[0.15] * 768],
    filter='category == "ai"',
    limit=10,
    output_fields=["title", "category"],
)

# One list of hits is returned per query vector
for hits in results:
    for hit in hits:
        print(f"{hit['entity']['title']}: {hit['distance']:.4f}")
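What the filtered search computes can be sketched without a server. The brute-force version below is only a conceptual model with hypothetical helper names, not how Milvus executes queries: it applies the scalar filter, then ranks the surviving rows by cosine similarity. With the COSINE metric a larger score means a closer match. Note also that the placeholder vectors in the example above are all scalar multiples of one another, so under COSINE they would score identically; the sketch uses vectors with different directions instead.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def filtered_search(rows, query, category, limit):
    """Apply the scalar filter first, then rank survivors by cosine similarity."""
    candidates = [r for r in rows if r["category"] == category]
    scored = [(cosine_similarity(r["vector"], query), r["title"]) for r in candidates]
    return sorted(scored, reverse=True)[:limit]


rows = [
    {"vector": [1.0, 0.0], "title": "RAG Guide", "category": "ai"},
    {"vector": [0.6, 0.8], "title": "Prompting", "category": "ai"},
    {"vector": [1.0, 0.1], "title": "Vector DBs", "category": "database"},
]

hits = filtered_search(rows, query=[1.0, 0.0], category="ai", limit=10)
for score, title in hits:
    print(f"{title}: {score:.4f}")
```

The "Vector DBs" row is close to the query but is excluded by the filter, which is the behavior the filter expression in the pymilvus call asks for.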
===== References =====
* [[https://github.com/milvus-io/milvus|Milvus GitHub Repository]]
* [[https://milvus.io|Milvus Official Website]]
* [[https://milvus.io/docs/architecture_overview.md|Milvus Architecture Documentation]]
* [[https://milvus.io/blog/introduce-milvus-2-6-built-for-scale-designed-to-reduce-costs.md|Milvus 2.6 Release Blog]]
===== See Also =====
* [[qdrant|Qdrant]] -- Rust-based vector database
* [[chromadb|ChromaDB]] -- AI-native embedding database
* [[mem0|Mem0]] -- Memory layer using vector databases
* [[ragflow|RAGFlow]] -- RAG engine for document understanding
* [[lightrag|LightRAG]] -- Knowledge graph RAG framework