====== Pinecone ======

**Pinecone** is a fully managed, cloud-native vector database designed for high-performance similarity search and retrieval-augmented generation (RAG) applications. It offers both serverless and pod-based deployment options with automated indexing, metadata filtering, and deep AI ecosystem integration.((source [[https://www.pinecone.io|Pinecone]]))

===== Architecture =====

Pinecone separates storage from compute, using blob storage as the source of truth for all indexes.((source [[https://shshell.com/blog/vector-db-module-6-lesson-2-pinecone-architecture|Pinecone Architecture - ShShell]]))

==== Serverless ====

Serverless indexes scale automatically without capacity planning:

  * Fully metered, pay-per-use pricing
  * Dynamic compute-unit allocation per query
  * Vectors stored in blob storage (S3/GCS)
  * Baseline latency of 50-100 ms under normal conditions
  * Recommended for 99% of new AI applications

==== Pod-Based ====

Pod-based indexes provide consistent, predictable performance:

  * **s1 (Storage-Optimized)** -- SSD-backed for large datasets
  * **p1 (Performance-Optimized)** -- full in-RAM index for low latency
  * **p2 (Highest Throughput)** -- graph-based index for maximum QPS
  * Manual scaling via pod count and replicas
  * ~30 ms latency at the 99th percentile on p2 pods

===== Indexing and Search =====

Pinecone uses proprietary, automated indexing that combines an inverted file index (IVF) with product quantization (PQ):((source [[https://www.velodb.io/glossary/pinecone-vector-database|Pinecone Overview - VeloDB]]))

  * Supports cosine similarity, Euclidean distance, and dot product metrics
  * Automatic index optimization without manual parameter tuning
  * Sharding across multiple nodes with query parallelization
  * Early-termination strategies for efficient top-k retrieval

===== Hybrid Search =====

Pinecone supports hybrid search, blending dense vector search with sparse keyword matching:((source [[https://docs.pinecone.io/guides/search/hybrid-search|Pinecone Hybrid Search Documentation]]))

  * Two approaches: a single hybrid index (recommended) or separate dense and sparse indexes
  * Combines semantic similarity with lexical relevance
  * Built-in sparse embedding models via Pinecone Inference
  * Improves retrieval quality for domain-specific terminology

===== Metadata Filtering =====

Metadata filtering combines vector similarity with attribute-based constraints:((source [[https://www.velodb.io/glossary/pinecone-vector-database|Pinecone Overview - VeloDB]]))

  * Filter by any stored metadata field (dates, categories, tags, user IDs)
  * Constraints are applied at query time for high performance
  * Essential for multi-tenant and access-controlled applications
  * Supports equality, range, and set-membership operators

===== Namespaces and Collections =====

  * **Namespaces** -- logical separation of data within a single index for multi-tenant architectures((source [[https://cyclr.com/resources/ai/understanding-vector-databases-a-deep-dive-with-pinecone|Understanding Vector DBs - Cyclr]]))
  * **Collections** -- static snapshots of indexes for backup, versioning, or migration
  * Both features simplify managing data across multiple applications or users

===== Integrations =====

Pinecone integrates with major AI development frameworks:((source [[https://www.velodb.io/glossary/pinecone-vector-database|Pinecone Overview - VeloDB]]))

  * **LangChain** -- vector store integration for RAG pipelines
  * **LlamaIndex** -- data framework connector
  * **Haystack** -- search pipeline integration
  * **Pinecone Inference** -- hosted embedding generation (Cohere Embed, sparse models)
  * Direct REST and Python/Node.js SDK access

===== See Also =====

  * [[pgvector|pgvector]]
  * [[supabase_vector|Supabase Vector]]
  * [[hugging_face|Hugging Face]]

===== References =====
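The three metrics listed under Indexing and Search can be sketched in plain Python. This is a minimal illustration of the formulas themselves, not Pinecone's internal implementation:

```python
import math

def dot(a, b):
    # Dot product: higher means more similar for magnitude-aware vectors.
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of the norms,
    # so only the angle between the vectors matters, not their length.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean(a, b):
    # Euclidean distance: lower means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a, b = [1.0, 0.0], [1.0, 1.0]
print(round(cosine(a, b), 4))  # 0.7071
```

For unit-length vectors, dot product and cosine similarity rank results identically, which is why normalizing embeddings before upsert is a common convention.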
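When dense and sparse results are blended, a common convention is to weight both sides of the query by a single alpha parameter before searching. The sketch below assumes that convention; the ''hybrid_scale'' helper is illustrative (not a Pinecone API call), though the ''{"indices": ..., "values": ...}'' shape mirrors the usual sparse-vector representation:

```python
def hybrid_scale(dense, sparse, alpha):
    """Weight a dense query vector and a sparse query vector by alpha.

    alpha = 1.0 is pure semantic (dense) search; alpha = 0.0 is pure
    lexical (sparse) search; values in between blend the two.
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": sparse["indices"],
        "values": [v * (1 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse

d, s = hybrid_scale([0.5, 1.0], {"indices": [7], "values": [0.8]}, alpha=0.5)
print(d, s["values"])  # [0.25, 0.5] [0.4]
```

Tuning alpha is an empirical exercise: domain-specific terminology usually benefits from a lower alpha (more lexical weight), while paraphrase-heavy queries favor a higher one.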
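The equality, range, and set-membership operators used in metadata filtering can be pictured with a toy evaluator. The ''matches'' helper below is hypothetical and covers only a small subset of the documented operator set (''$eq'', ''$gte'', ''$lte'', ''$in''):

```python
def matches(meta, flt):
    # Toy evaluator for a few Pinecone-style filter operators.
    # A bare value (e.g. {"tag": "blog"}) is shorthand for $eq.
    for field, cond in flt.items():
        value = meta.get(field)
        if isinstance(cond, dict):
            for op, target in cond.items():
                if op == "$eq" and value != target:
                    return False
                if op == "$gte" and not (value is not None and value >= target):
                    return False
                if op == "$lte" and not (value is not None and value <= target):
                    return False
                if op == "$in" and value not in target:
                    return False
        elif value != cond:
            return False
    return True

records = [
    {"id": "a", "metadata": {"year": 2023, "tag": "blog"}},
    {"id": "b", "metadata": {"year": 2021, "tag": "blog"}},
    {"id": "c", "metadata": {"year": 2024, "tag": "docs"}},
]
flt = {"year": {"$gte": 2022}, "tag": {"$in": ["blog", "news"]}}
print([r["id"] for r in records if matches(r["metadata"], flt)])  # ['a']
```

In the real system the filter is applied during the similarity search itself rather than as a post-processing pass, which is what keeps filtered queries fast.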
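Namespace isolation amounts to a logical partitioning of one physical index: every upsert and query targets exactly one namespace, so tenants never see each other's vectors. The ''NamespacedIndex'' class below is a toy sketch of that idea, not the Pinecone client:

```python
class NamespacedIndex:
    """Toy model of namespaces: one index, partitioned per tenant."""

    def __init__(self):
        self._spaces = {}  # namespace -> {vector_id: vector}

    def upsert(self, namespace, vec_id, vector):
        # Writes always land in a single namespace.
        self._spaces.setdefault(namespace, {})[vec_id] = vector

    def query(self, namespace, predicate):
        # Reads only ever scan the requested namespace's vectors.
        return {i: v for i, v in self._spaces.get(namespace, {}).items()
                if predicate(v)}

idx = NamespacedIndex()
idx.upsert("tenant-a", "v1", [0.1, 0.9])
idx.upsert("tenant-b", "v1", [0.8, 0.2])
print(idx.query("tenant-a", lambda v: True))  # {'v1': [0.1, 0.9]}
```

Note that the same vector ID can exist independently in two namespaces, which is exactly the property multi-tenant applications rely on.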