====== Milvus ======

**Milvus** is a cloud-native, open-source vector database written primarily in Go and C++ and optimized for high-performance similarity search over very large embedding datasets. With over **43,000 GitHub stars** and graduated-project status in the LF AI & Data Foundation, it is one of the most mature vector databases available.

| **Repository** | [[https://github.com/milvus-io/milvus|github.com/milvus-io/milvus]] |
| **License** | Apache 2.0 |
| **Language** | Go, C++ |
| **Stars** | 43K+ |
| **Category** | Vector Database |

===== Key Features =====

  * **Billion-Scale Search** -- Handles billion-scale vector datasets with millisecond latency
  * **Disaggregated Architecture** -- Independent scaling of storage and compute with a microservices design
  * **Multiple Index Types** -- HNSW, IVF, and DiskANN, with Int8 compression for memory efficiency
  * **Hybrid Search** -- Combines vector similarity with scalar filtering, BM25 full-text search, and JSON path indexing
  * **Hot/Cold Tiering** -- Automatic data tiering between memory/SSD and object storage based on access patterns
  * **Hardware Acceleration** -- AVX512, Neon SIMD, quantization, and GPU support
  * **Multi-Tenancy** -- Supports up to 100K collections per cluster
  * **LF AI & Data Graduated** -- Graduated project of the LF AI & Data Foundation with a long record of production use

===== Architecture =====

Milvus uses a layered, microservices-based architecture that separates the access, coordination, worker, and storage layers:

  * **Access Layer** -- Handles client requests, validation, and routing to internal nodes
  * **Coordinator Service** -- The brain of the system; manages load balancing, data coordination, and query planning, with metadata kept in etcd
  * **Streaming Node** -- Go-based component built on the Woodpecker WAL for real-time ingestion, removing the need for an external message queue
  * **Data Node and Query Node** -- C++ worker nodes handling compaction, indexing of sealed segments, and query execution
  * **Object Storage** -- All data persists in S3/MinIO, giving a diskless architecture that simplifies geo-distributed deployments

<mermaid>
graph TB
    subgraph Access["Access Layer"]
        SDK["Client SDKs"]
        Proxy["Proxy / Load Balancer"]
    end
    subgraph Coord["Coordinator Service"]
        Root["Root Coord"]
        Data["Data Coord"]
        Query["Query Coord"]
        Index["Index Coord"]
    end
    subgraph Workers["Worker Nodes"]
        SN["Streaming Node - Go"]
        DN["Data Node - C++"]
        QN["Query Node - C++"]
    end
    subgraph Storage["Storage Layer"]
        WAL["Woodpecker WAL"]
        ObjStore[("Object Storage - S3/MinIO")]
        Etcd[("etcd - Metadata")]
    end
    Access --> Coord
    Coord --> Workers
    SN --> WAL
    DN --> ObjStore
    QN --> ObjStore
    Coord --> Etcd
</mermaid>

===== Indexing Algorithms =====

Milvus supports several vector indexing strategies, plus a sparse index for full-text search:

  * **HNSW** -- Hierarchical Navigable Small World graphs for high-accuracy approximate nearest neighbor search; supports Int8 compression to reduce memory use with little loss of accuracy
  * **IVF** -- Inverted File index that partitions vectors into clusters and searches only the nearest ones, with a configurable accuracy/speed tradeoff
  * **DiskANN** -- Disk-optimized approximate nearest neighbor index for datasets larger than memory
  * **BM25** -- Full-text search index, reported by the Milvus team as roughly 4x faster than Elasticsearch, with a configurable ''drop_ratio_search'' parameter

===== Storage and Tiering =====

Milvus 2.6 introduced tiered hot/cold separation:

  * **Hot data** -- Frequently accessed data stays in memory/SSD for low-latency queries
  * **Cold data** -- Infrequently accessed data migrates to object storage (S3/MinIO/NetApp)
  * **Dynamic migration** -- Automatic tiering based on access patterns, with a reported storage cost reduction of about 50%
  * **Storage v2** -- Parquet/Vortex formats for Spark compatibility and reduced IOPS and memory usage

===== Code Example =====

<code python>
from pymilvus import MilvusClient

# Connect to a Milvus server (assumes one is listening on localhost:19530)
client = MilvusClient(uri="http://localhost:19530")

# Quick-setup collection: an auto-generated primary key plus a 768-dim vector field
client.create_collection(
    collection_name="articles",
    dimension=768,
    metric_type="COSINE",
    auto_id=True,
)

# Insert vectors with metadata; "title" and "category" are not declared in the
# schema, so they are stored as dynamic fields (enabled by default in quick setup)
data = [
    {"vector": [0.1] * 768, "title": "RAG Guide", "category": "ai"},
    {"vector": [0.2] * 768, "title": "Vector DBs", "category": "database"},
]
client.insert(collection_name="articles", data=data)

# Filtered search: vector similarity restricted by a scalar filter
results = client.search(
    collection_name="articles",
    data=[[0.15] * 768],
    filter='category == "ai"',
    limit=10,
    output_fields=["title", "category"],
)

# One list of hits is returned per query vector
for hits in results:
    for hit in hits:
        print(f"{hit['entity']['title']}: {hit['distance']:.4f}")
</code>

===== References =====

  * [[https://github.com/milvus-io/milvus|Milvus GitHub Repository]]
  * [[https://milvus.io|Milvus Official Website]]
  * [[https://milvus.io/docs/architecture_overview.md|Milvus Architecture Documentation]]
  * [[https://milvus.io/blog/introduce-milvus-2-6-built-for-scale-designed-to-reduce-costs.md|Milvus 2.6 Release Blog]]

===== See Also =====

  * [[qdrant|Qdrant]] -- Rust-based vector database
  * [[chromadb|ChromaDB]] -- AI-native embedding database
  * [[mem0|Mem0]] -- Memory layer using vector databases
  * [[ragflow|RAGFlow]] -- RAG engine for document understanding
  * [[lightrag|LightRAG]] -- Knowledge graph RAG framework
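
===== Illustrative Sketch: IVF Partitioned Search =====

The IVF entry under Indexing Algorithms describes partitioned search with an accuracy/speed tradeoff. The sketch below illustrates that general idea in plain Python; it is not Milvus code. The helper names (''build_ivf'', ''ivf_search''), the ''nprobe'' parameter name as used here, and the fixed centroids are all assumptions made for illustration; a real IVF index would train its centroids with k-means.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_ivf(vectors, centroids):
    """Assign each vector id to the inverted list of its nearest centroid."""
    lists = {i: [] for i in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        lists[nearest].append(vid)
    return lists


def ivf_search(query, vectors, centroids, lists, nprobe, k):
    """Scan only the nprobe partitions whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [vid for c in order[:nprobe] for vid in lists[c]]
    return sorted(candidates, key=lambda vid: l2(query, vectors[vid]))[:k]


random.seed(0)
vectors = [[random.random(), random.random()] for _ in range(200)]
# Hypothetical fixed centroids, one per quadrant (real IVF learns these)
centroids = [[0.25, 0.25], [0.25, 0.75], [0.75, 0.25], [0.75, 0.75]]
lists = build_ivf(vectors, centroids)

query = [0.5, 0.5]
# nprobe=1 scans a single partition: fastest, but may miss true neighbors
fast = ivf_search(query, vectors, centroids, lists, nprobe=1, k=5)
# nprobe=len(centroids) scans every partition: same result as brute force
exact = ivf_search(query, vectors, centroids, lists, nprobe=len(centroids), k=5)
print(fast)
print(exact)
```

Raising ''nprobe'' widens the scan and improves recall at the cost of more distance computations, which is the accuracy/speed tradeoff the IVF entry refers to.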