Milvus

Milvus is a cloud-native, open-source vector database built primarily in Go and C++ and optimized for high-performance similarity search over very large embedding datasets. With over 43,000 GitHub stars and graduated-project status in the LF AI & Data Foundation, it is one of the most mature and battle-tested vector databases available.

Repository: github.com/milvus-io/milvus
License: Apache 2.0
Language: Go, C++
Stars: 43K+
Category: Vector Database

Key Features

Architecture

Milvus employs a multi-layered, microservices-based architecture separating access, coordination, worker, and storage layers:

graph TB
    subgraph Access["Access Layer"]
        SDK[Client SDKs]
        Proxy[Proxy / Load Balancer]
    end
    subgraph Coord["Coordinator Service"]
        Root[Root Coord]
        Data[Data Coord]
        Query[Query Coord]
        Index[Index Coord]
    end
    subgraph Workers["Worker Nodes"]
        SN[Streaming Node - Go]
        DN[Data Node - C++]
        QN[Query Node - C++]
    end
    subgraph Storage["Storage Layer"]
        WAL[Woodpecker WAL]
        ObjStore[(Object Storage - S3/MinIO)]
        Etcd[(etcd - Metadata)]
    end
    Access --> Coord
    Coord --> Workers
    SN --> WAL
    DN --> ObjStore
    QN --> ObjStore
    Coord --> Etcd

Indexing Algorithms

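Milvus supports multiple vector indexing strategies, including FLAT (exact brute-force search), inverted-file indexes such as IVF_FLAT and IVF_PQ, the graph-based HNSW, and the disk-based DiskANN.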
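To make the trade-off between exact and approximate indexing concrete, here is a minimal pure-Python sketch that compares a brute-force scan with a toy inverted-file (IVF-style) index. It is an illustration only, not Milvus's implementation: the function names, the random data, and the choice of randomly sampled centroids (a real IVF index would train them with k-means) are all assumptions made for the example.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def flat_search(vectors, query, k):
    """Exact search: score every stored vector against the query."""
    ranked = sorted(range(len(vectors)),
                    key=lambda i: cosine(vectors[i], query), reverse=True)
    return ranked[:k]


def build_ivf(vectors, centroids):
    """Assign each vector to its most similar centroid (one list per centroid)."""
    lists = {c: [] for c in range(len(centroids))}
    for i, vec in enumerate(vectors):
        best = max(range(len(centroids)), key=lambda c: cosine(vec, centroids[c]))
        lists[best].append(i)
    return lists


def ivf_search(vectors, centroids, lists, query, k, nprobe):
    """Approximate search: scan only the nprobe lists closest to the query."""
    order = sorted(range(len(centroids)),
                   key=lambda c: cosine(query, centroids[c]), reverse=True)
    candidates = [i for c in order[:nprobe] for i in lists[c]]
    candidates.sort(key=lambda i: cosine(vectors[i], query), reverse=True)
    return candidates[:k]


random.seed(0)
vectors = [[random.gauss(0, 1) for _ in range(8)] for _ in range(200)]
centroids = random.sample(vectors, 4)  # assumption: sampled, not k-means trained
lists = build_ivf(vectors, centroids)
query = [random.gauss(0, 1) for _ in range(8)]

exact = flat_search(vectors, query, k=5)
approx = ivf_search(vectors, centroids, lists, query, k=5, nprobe=2)
full = ivf_search(vectors, centroids, lists, query, k=5, nprobe=4)
print("exact:", exact)
print("approx (nprobe=2):", approx)
print("full probe (nprobe=4):", full)
```

Probing every list reproduces the exact result, while a smaller `nprobe` scans fewer vectors and may miss some true neighbors. That recall-versus-speed dial is the kind of trade-off an approximate index exposes.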

Storage and Tiering

Milvus 2.6 introduced tiered storage that separates hot data from cold data.
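The general idea behind hot/cold separation can be sketched with a small two-tier store: a bounded fast tier caches recently used segments, and everything lives durably in a slower tier. This is a generic least-recently-used illustration under assumed names (`TieredStore`, `hot_capacity`, segment ids such as `s1`), not a description of how Milvus 2.6 actually decides what is hot or cold.

```python
from collections import OrderedDict


class TieredStore:
    """Toy two-tier store: a bounded hot tier in front of an unbounded cold tier."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()  # stands in for memory or local disk
        self.cold = {}            # stands in for object storage
        self.cold_reads = 0       # counts slow-path fetches

    def put(self, segment_id, payload):
        # New data lands in the cold tier, which holds the durable copy.
        self.cold[segment_id] = payload

    def get(self, segment_id):
        if segment_id in self.hot:
            self.hot.move_to_end(segment_id)  # mark as most recently used
            return self.hot[segment_id]
        payload = self.cold[segment_id]       # slow path: fetch from cold tier
        self.cold_reads += 1
        self.hot[segment_id] = payload
        if len(self.hot) > self.hot_capacity:
            self.hot.popitem(last=False)      # evict the least recently used
        return payload


store = TieredStore(hot_capacity=2)
for seg in ("s1", "s2", "s3"):
    store.put(seg, f"data-{seg}")

store.get("s1")  # cold read; s1 becomes hot
store.get("s2")  # cold read; s2 becomes hot
store.get("s1")  # served from the hot tier
store.get("s3")  # cold read; evicts s2, the least recently used
print(list(store.hot), store.cold_reads)
```

Frequently queried segments stay in the fast tier, while rarely touched ones cost a fetch from the slower tier only when they are actually needed.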

Code Example

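from pymilvus import MilvusClient

# Connect to a Milvus server listening on the default port
client = MilvusClient(uri="http://localhost:19530")

# Create a collection in quick-setup mode: default "id" and "vector" fields,
# with extra keys such as "title" stored in the dynamic field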
client.create_collection(
    collection_name="articles",
    dimension=768,
    metric_type="COSINE",
    auto_id=True
)
 
# Insert vectors with metadata
data = [
    {"vector": [0.1] * 768, "title": "RAG Guide", "category": "ai"},
    {"vector": [0.2] * 768, "title": "Vector DBs", "category": "database"},
]
client.insert(collection_name="articles", data=data)
 
# Filtered search: vector similarity restricted by a scalar filter
results = client.search(
    collection_name="articles",
    data=[[0.15] * 768],
    filter='category == "ai"',
    limit=10,
    output_fields=["title", "category"]
)
for hits in results:
    for hit in hits:
        print(f"{hit['entity']['title']}: {hit['distance']:.4f}")
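To interpret the score printed above, it helps to compute the COSINE metric by hand. The sketch below is plain Python and does not call Milvus; `cosine_similarity` and the `mixed` vector are hypothetical additions for illustration, and it assumes the reported value is the cosine similarity, where a larger value means a closer match.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; vector magnitude does not matter."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


query = [0.15] * 768
stored = {
    "RAG Guide": [0.1] * 768,
    "Vector DBs": [0.2] * 768,
}

# The placeholder vectors all point in the same direction as the query,
# so their cosine similarity is (up to rounding) the same for every row.
scores = {title: cosine_similarity(query, vec) for title, vec in stored.items()}
for title, score in scores.items():
    print(f"{title}: {score:.4f}")

# A vector pointing in a different direction scores lower.
mixed = [0.1, -0.1] * 384  # hypothetical vector orthogonal to the query
print(f"mixed: {cosine_similarity(query, mixed):.4f}")
```

Because constant vectors like `[0.1] * 768` differ only in magnitude, they cannot be told apart under COSINE. Real embeddings from a model would give distinct scores.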
