Embedding Models Comparison

Embedding models convert text (and sometimes images) into dense numerical vectors that capture semantic meaning, enabling similarity search, retrieval-augmented generation (RAG), and clustering. The choice of embedding model significantly impacts retrieval quality, cost, and latency. 1)
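Similarity search over these vectors is typically done with cosine similarity. A minimal sketch with toy 4-dimensional vectors (real models emit hundreds to thousands of dimensions; values here are illustrative):

```python
from math import sqrt

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" -- a real model would produce these from text.
query = [0.1, 0.3, 0.5, 0.1]
doc_a = [0.1, 0.3, 0.5, 0.1]   # same direction as the query
doc_b = [0.5, 0.1, 0.1, 0.3]   # different direction

print(cosine_similarity(query, doc_a))  # 1.0 (within float error)
print(cosine_similarity(query, doc_b))  # lower score
```

In RAG, the documents with the highest cosine scores against the query vector are retrieved and passed to the language model as context.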

OpenAI Embeddings

text-embedding-3-small: 1,536 dimensions, priced around $0.02 per million tokens; output can be shortened via the API's dimensions parameter.

text-embedding-3-large: 3,072 dimensions, around $0.13 per million tokens, with stronger retrieval quality than -small.

Both models are English-focused with limited multilingual optimization. 2)

Cohere Embeddings

embed-v3: 1,024 dimensions, with distinct input types for queries and documents; offered in English and multilingual variants.

embed-v4: adds multimodal (text and image) inputs and selectable output dimensions; positioned as Cohere's multilingual flagship.

BGE Models (BAAI)

BGE-M3 is the leading open-source embedding model: it supports dense, sparse (lexical), and multi-vector (ColBERT-style) retrieval from a single checkpoint, covers 100+ languages, and accepts inputs up to 8,192 tokens.

E5 Models (Microsoft)

E5 models range from multilingual-e5-small (384 dimensions, 118M parameters) up to LLM-based variants such as E5-Mistral-7B (4,096 dimensions).

Jina Embeddings

Jina v2: 768-1,024 dimensions, ~$0.05/M tokens via API, self-hostable

Jina v3: 1,024 dimensions with Matryoshka truncation and task-specific LoRA adapters. Jina v4: multimodal, embedding text and images (including document screenshots) in a shared space.

Voyage AI Embeddings

voyage-3-large supports Matryoshka truncation and quantized outputs for storage savings; Voyage Multimodal 3.5 embeds interleaved text and images, suiting PDF and screenshot retrieval.

MTEB Benchmark Rankings

The Massive Text Embedding Benchmark (MTEB) evaluates embedding models across retrieval, classification, clustering, and other tasks. As of 2025-2026: 8)

Top proprietary: OpenAI text-embedding-3-large, Voyage-3-large, Gemini Embedding 2

Top open-source: BGE-M3, Jina v4, llama-embed-nemotron-8b

Multilingual leaders: Cohere v4, BGE-M3 (cross-lingual R@1 > 0.98)

Efficiency leaders: E5-small (14x faster, 100% Top-5 accuracy)

No single model dominates all categories; the best choice depends on language requirements, budget, latency constraints, and whether self-hosting is needed.

Matryoshka Representations

Matryoshka Representation Learning (MRL) trains embeddings so that truncating dimensions preserves most of the original performance. This allows trading storage and compute for quality: truncating a 3,072-dimension vector to 512 dimensions, for example, cuts storage by 6x while retaining most retrieval accuracy. 9)
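With an MRL-trained model, truncation plus re-normalization is all that is needed at query time. A minimal sketch (the vector values are illustrative, not real model output):

```python
from math import sqrt

def truncate_embedding(vec: list[float], dims: int) -> list[float]:
    """Keep the first `dims` dimensions of an MRL-trained embedding,
    then re-normalize to unit length so cosine similarity still works."""
    head = vec[:dims]
    norm = sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

full = [0.5, 0.5, 0.5, 0.5, 0.01, 0.02]  # stand-in for a full-size vector
short = truncate_embedding(full, 4)

print(len(short))                  # 4
print(sum(x * x for x in short))   # ~1.0 (unit length restored)
```

Note that this only works well for models trained with MRL; naively truncating an ordinary embedding degrades quality much faster.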

Choosing an Embedding Model

Recommended models by use case:

General RAG (English): OpenAI text-embedding-3-large or -small
Multilingual RAG: Cohere embed-v4 or BGE-M3
Cost-sensitive / self-hosting: BGE-M3 or E5-small
Storage-constrained: Voyage (MRL at 512 dims) or Jina v4
Multimodal (image/PDF): Jina v4 or Voyage Multimodal 3.5
Enterprise two-stage: Cohere v4 + Cohere Reranker

Always pair embeddings with a reranker for 5-10% additional retrieval quality gains. 10)
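A two-stage pipeline first retrieves a broad candidate set by cheap vector similarity, then re-scores only those candidates with a more expensive reranker (usually a cross-encoder). A self-contained sketch: `embed` and `rerank_score` here are crude keyword-based stand-ins for real model calls, so only the pipeline shape is meaningful.

```python
def embed(text: str) -> list[float]:
    """Stand-in for a real embedding model: counts a few keyword hits."""
    keywords = ["vector", "embedding", "rerank", "search"]
    return [float(text.lower().count(k)) for k in keywords]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def rerank_score(query: str, doc: str) -> float:
    """Stand-in for a cross-encoder reranker: query-word overlap ratio."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q)

def two_stage_search(query: str, docs: list[str], k_retrieve: int = 3,
                     k_final: int = 1) -> list[str]:
    qv = embed(query)
    # Stage 1: cheap vector retrieval over the whole corpus.
    candidates = sorted(docs, key=lambda d: cosine(qv, embed(d)),
                        reverse=True)[:k_retrieve]
    # Stage 2: expensive reranking over the small candidate set only.
    return sorted(candidates, key=lambda d: rerank_score(query, d),
                  reverse=True)[:k_final]

docs = [
    "vector search with embedding models",
    "how to rerank search results",
    "a history of search engines",
]
print(two_stage_search("embedding vector search", docs))
```

Because the reranker sees only the top-k candidates, its per-pair cost stays bounded regardless of corpus size, which is what makes the two-stage design practical.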

References

4) Milvus: Choosing Embedding Models for RAG 2026 (https://milvus.io/blog/choose-embedding-model-rag-2026.md)
5) AI Multiple: Open Source Embedding Models (https://aimultiple.com/open-source-embedding-models)
8) Milvus: Choosing Embedding Models 2026 (https://milvus.io/blog/choose-embedding-model-rag-2026.md)