====== Cohere ======

**Cohere** is a Toronto-based enterprise AI platform founded in 2019 by **Aidan Gomez**, a co-author of the seminal "Attention Is All You Need" paper that introduced the Transformer architecture. Cohere specializes in secure, customizable language models and tools for business applications, including retrieval-augmented generation (RAG), semantic search, and AI agents. By early 2026, the company had reached $240 million ARR with 50%+ quarter-over-quarter growth and approximately 70% gross margins.((source [[https://cohere.com|Cohere official site]]))

===== Core Models =====

==== Command Family ====

The Command family of generative models is optimized for enterprise workloads:

  * **Command R**: Optimized for RAG and multilingual tasks (10+ languages), with efficient GPU utilization and built-in tool integration
  * **Command R+**: 104 billion parameters with a 128K context window, excelling at RAG and tool use across 10 major languages. Released April 2024((source [[https://intuitionlabs.ai/articles/cohere-enterprise-ai-llm-profile|Cohere Enterprise AI Profile]]))
  * **Command A**: Includes specialized variants such as Command A Translate (111 billion parameters, 23 languages, state-of-the-art machine translation)

==== Embed ====

The Embed model generates dense vector embeddings for semantic search and retrieval tasks. It is deployable via APIs or through Cohere's Model Vault for secure on-premises inference.

==== Rerank ====

**Rerank 4** (released December 2025) offers a 32K context window for enterprise search and RAG pipelines. It uses a cross-encoder architecture that performs cross-attention between queries and documents for high-precision relevance scoring.((source [[https://futurumgroup.com/insights/coheres-multilingual-sovereign-ai-moat-ahead-of-a-2026-ipo/|Cohere Multilingual Sovereign AI]]))

===== RAG Capabilities =====

Command R and R+ are specifically optimized for retrieval-augmented generation, integrating with external data sources, tools, and APIs.
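The retrieve-then-generate loop behind RAG can be sketched in plain Python. This is a toy illustration with hypothetical documents and a bag-of-words stand-in for a real embedding model (it does not use Cohere's actual SDK): retrieve the most relevant snippets by vector similarity, then build a prompt that grounds the answer in numbered sources for citation anchoring.

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real pipeline would call an
    # embedding model such as Cohere's Embed instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical enterprise knowledge base
docs = [
    "Command R+ has a 128K context window",
    "Rerank uses a cross-encoder architecture",
    "North is an AI agent platform",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str) -> str:
    # Ground generation in retrieved snippets, with numbered markers
    # so the model can anchor citations to specific sources.
    snippets = retrieve(query)
    context = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"

print(build_prompt("What context window does Command R+ have?"))
```

In a production system the generation step would be a call to a Command-family model with the grounded prompt; here the sketch stops at prompt construction.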
The models support grounding responses in enterprise data with citation anchoring, reducing hallucination in knowledge-intensive applications.

===== Enterprise Focus =====

Cohere prioritizes regulated industries with multiple deployment options:

  * **Model Vault** (September 2025): VPC-isolated or on-premises deployment of Command, Embed, and Rerank models, keeping data within customer networks
  * **North**: AI agent platform (launched January 2025) combining LLMs, search, and agents for workflows in HR, finance, support, and IT((source [[https://cohere.com|Cohere]]))
  * Dedicated enterprise clusters with customizable security configurations
  * Support for 23 languages with sovereign AI deployments

===== Additional Models =====

  * **Tiny Aya** (2026): Open-weight multilingual small language model (3.35 billion parameters) supporting 70+ languages, designed for edge devices and laptops
  * **Cohere Transcribe**: Audio model (2 billion parameters) for speech-to-text in 12+ languages, built on a Conformer + Transformer architecture((source [[https://siliconangle.com/2026/03/26/google-cohere-launch-new-audio-models/|Google and Cohere Launch Audio Models]]))

===== Funding and Growth =====

Cohere is backed by investors including NVIDIA, AMD, and Salesforce. The company closed a $100 million funding round in September 2025 and has been positioning for a potential 2026 IPO. Revenue grew from under $100 million to $240 million ARR, with strong unit economics driven by efficient model architectures.((source [[https://cohere.com/blog/september-2025-funding-round|Cohere September 2025 Funding]]))

===== See Also =====

  * [[anthropic|Anthropic]]
  * [[mistral_ai|Mistral AI]]
  * [[deepseek|DeepSeek]]

===== References =====