====== Cohere ======

**Cohere** is a Toronto-based enterprise AI platform founded in 2019 by **Aidan Gomez**, a co-author of the seminal "Attention Is All You Need" paper that introduced the Transformer architecture. Cohere specializes in secure, customizable language models and tools for business applications, including retrieval-augmented generation (RAG), semantic search, and AI agents. By early 2026, the company had reached $240 million ARR with 50%+ quarter-over-quarter growth and approximately 70% gross margins.((source [[https://cohere.com|Cohere official site]]))

===== Core Models =====

==== Command Family ====

The Command family of generative models is optimized for enterprise workloads:

  * **Command R**: Optimized for RAG and multilingual tasks (10+ languages), with efficient GPU utilization and built-in tool integration
  * **Command R+**: 104 billion parameters with a 128K context window, excelling at RAG and tool use across 10 major languages. Released April 2024((source [[https://intuitionlabs.ai/articles/cohere-enterprise-ai-llm-profile|Cohere Enterprise AI Profile]]))
  * **Command A**: Includes specialized variants such as Command A Translate (111 billion parameters, 23 languages, state-of-the-art machine translation)

==== Embed ====

The Embed model generates dense vector embeddings for semantic search and retrieval tasks. It is deployable via APIs or through Cohere's Model Vault for secure on-premises inference.

==== Rerank ====

**Rerank 4** (released December 2025) offers a 32K context window for enterprise search and RAG pipelines. It uses a cross-encoder architecture that performs cross-attention between queries and documents for high-precision relevance scoring.((source [[https://futurumgroup.com/insights/coheres-multilingual-sovereign-ai-moat-ahead-of-a-2026-ipo/|Cohere Multilingual Sovereign AI]]))

===== RAG Capabilities =====

Command R and R+ are specifically optimized for retrieval-augmented generation, integrating with external data sources, tools, and APIs.
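The retrieve-then-generate loop behind RAG can be sketched in plain Python. This is a toy illustration with hypothetical documents and a bag-of-words stand-in for a real embedding model (it does not use Cohere's actual SDK): retrieve the most relevant snippets by vector similarity, then build a prompt that grounds the answer in numbered sources for citation anchoring.

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real pipeline would call an
    # embedding model such as Cohere's Embed instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical enterprise knowledge base
docs = [
    "Command R+ has a 128K context window",
    "Rerank uses a cross-encoder architecture",
    "North is an AI agent platform",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str) -> str:
    # Ground generation in retrieved snippets, with numbered markers
    # so the model can anchor citations to specific sources.
    snippets = retrieve(query)
    context = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"

print(build_prompt("What context window does Command R+ have?"))
```

In a production system the generation step would be a call to a Command-family model with the grounded prompt; here the sketch stops at prompt construction.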
The models support grounding responses in enterprise data with citation anchoring, reducing hallucination in knowledge-intensive applications.

===== Enterprise Focus =====

Cohere prioritizes regulated industries with multiple deployment options:

  * **Model Vault** (September 2025): VPC-isolated or on-premises deployment of Command, Embed, and Rerank models, keeping data within customer networks
  * **North**: AI agent platform (launched January 2025) combining LLMs, search, and agents for workflows in HR, finance, support, and IT((source [[https://cohere.com|Cohere]]))
  * Dedicated enterprise clusters with customizable security configurations
  * Support for 23 languages with sovereign AI deployments

===== Additional Models =====

  * **Tiny Aya** (2026): Open-weight multilingual small language model (3.35 billion parameters) supporting 70+ languages, designed for edge devices and laptops
  * **Cohere Transcribe**: Audio model (2 billion parameters) for speech-to-text in 12+ languages, built on a Conformer + Transformer architecture((source [[https://siliconangle.com/2026/03/26/google-cohere-launch-new-audio-models/|Google and Cohere Launch Audio Models]]))

===== Funding and Growth =====

Cohere is backed by investors including NVIDIA, AMD, and Salesforce. The company closed a $100 million funding round in September 2025 and has been positioning for a potential 2026 IPO. Revenue grew from under $100 million to $240 million ARR, with strong unit economics driven by efficient model architectures.((source [[https://cohere.com/blog/september-2025-funding-round|Cohere September 2025 Funding]]))

===== See Also =====

  * [[anthropic|Anthropic]]
  * [[mistral_ai|Mistral AI]]
  * [[deepseek|DeepSeek]]

===== References =====