Moonshot AI's Kimi K2 is a trillion-parameter open-source large language model built on a Mixture-of-Experts (MoE) Transformer architecture. First released in mid-2025, Kimi K2 represents one of China's most ambitious contributions to frontier AI, activating only 32 billion of its 1.04 trillion total parameters per token for efficient inference. 1) An upgraded version, Kimi K2.5, followed in January 2026 with native multimodal capabilities and an expanded 256K context window. 2)
Kimi K2 employs a dense-sparse hybrid design: a Transformer backbone with Multi-head Latent Attention (MLA), whose feed-forward layers are replaced by sparse MoE blocks so that only a small fraction of experts runs for each token.
The MLA mechanism compresses key-value projections into a lower-dimensional latent space before computing attention scores, reducing key-value-cache memory bandwidth by a reported 40-50%. 3) The model was trained using MuonClip, a variant of the Muon optimizer adapted for stable training of trillion-parameter MoE models.
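To make the "activate a subset of parameters per token" idea concrete, here is a minimal, hedged sketch of top-k MoE routing in numpy. The sizes and the single-matrix "experts" are toy assumptions for illustration, not K2's actual configuration (K2 uses far more experts and full feed-forward networks per expert).

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 16, 2  # toy sizes (assumed, not K2's)

# Router: a linear layer scoring every expert for each token.
W_router = rng.standard_normal((d_model, n_experts)) / np.sqrt(d_model)
# Each "expert" here is a single matrix for brevity; real experts are FFNs.
experts = rng.standard_normal((n_experts, d_model, d_model)) / np.sqrt(d_model)

def moe_layer(x):
    """Route each token to its top_k experts; only those experts execute."""
    logits = x @ W_router                          # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]  # chosen expert indices
    # Softmax over only the selected scores -> per-token mixing weights.
    sel = np.take_along_axis(logits, top, axis=-1)
    w = np.exp(sel - sel.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):                    # per token
        for j, e in enumerate(top[t]):
            out[t] += w[t, j] * (x[t] @ experts[e])
    return out

tokens = rng.standard_normal((4, d_model))
y = moe_layer(tokens)
print(y.shape)  # (4, 64) -- each token touched only 2 of the 16 experts
```

Compute per token scales with `top_k`, not `n_experts`, which is how a 1.04T-parameter model can run with only 32B parameters active per token.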
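The cache-compression idea behind MLA can be sketched as follows. This is a simplified illustration under assumed toy dimensions: keys and values are reconstructed per head from one shared low-dimensional latent, so the KV cache stores only `d_latent` floats per token instead of full per-head keys and values. (Details such as decoupled RoPE dimensions are omitted.)

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, d_latent, n_heads, d_head = 128, 32, 4, 32  # toy sizes (assumed)

# Shared down-projection: the KV cache holds only this latent per token.
W_dkv = rng.standard_normal((d_model, d_latent)) / np.sqrt(d_model)
# Per-head up-projections recover keys/values from the latent at attention time.
W_uk = rng.standard_normal((n_heads, d_latent, d_head)) / np.sqrt(d_latent)
W_uv = rng.standard_normal((n_heads, d_latent, d_head)) / np.sqrt(d_latent)
W_q = rng.standard_normal((n_heads, d_model, d_head)) / np.sqrt(d_model)

def mla(x):
    """Multi-head Latent Attention: cache a compressed KV, expand per head."""
    c_kv = x @ W_dkv                       # (seq, d_latent) -- the only KV cache
    out = []
    for h in range(n_heads):
        q = x @ W_q[h]                     # (seq, d_head)
        k = c_kv @ W_uk[h]                 # keys reconstructed from the latent
        v = c_kv @ W_uv[h]
        scores = q @ k.T / np.sqrt(d_head)
        scores[np.triu(np.ones(scores.shape, bool), 1)] = -np.inf  # causal mask
        p = np.exp(scores - scores.max(-1, keepdims=True))
        p /= p.sum(-1, keepdims=True)
        out.append(p @ v)
    return np.concatenate(out, axis=-1)    # (seq, n_heads * d_head)

x = rng.standard_normal((6, d_model))
print(mla(x).shape)                        # (6, 128)
# Cache per token: d_latent floats vs 2 * n_heads * d_head for standard MHA.
```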
Kimi K2.5 was trained on 15 trillion mixed visual and textual tokens in a unified pipeline, allowing vision and language capabilities to develop together rather than as separate modules. 4) The vision component uses MoonViT, a 400-million-parameter vision encoder that processes images through the same transformer architecture as text.
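The "same transformer architecture as text" point can be illustrated with a hedged sketch: a vision encoder (standing in for MoonViT, reduced here to a single patch projection) maps image patches into the same embedding space as text tokens, and both are concatenated into one sequence for the shared backbone to attend over. All sizes and the interleaving scheme are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
d_model, patch, channels = 64, 16, 3   # toy sizes (assumed, not MoonViT's)

# Vision-encoder stand-in: one linear projection from flattened patches
# into the same embedding space the text tokens live in.
W_patch = rng.standard_normal((patch * patch * channels, d_model)) * 0.02
text_embed = rng.standard_normal((1000, d_model)) * 0.02  # toy vocab table

def to_sequence(image, token_ids):
    """Tile an image into patches and splice them into the text sequence."""
    h, w, c = image.shape
    patches = image.reshape(h // patch, patch, w // patch, patch, c)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)
    img_tokens = patches @ W_patch             # (n_patches, d_model)
    txt_tokens = text_embed[token_ids]         # (n_text, d_model)
    # One mixed sequence: the same transformer attends over both modalities.
    return np.concatenate([img_tokens, txt_tokens], axis=0)

seq = to_sequence(rng.standard_normal((32, 32, 3)), np.array([5, 17, 42]))
print(seq.shape)   # (7, 64): 4 image patches + 3 text tokens
```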
Kimi K2.5 achieved 50.2% on Humanity's Last Exam at significantly lower cost than comparable closed models. 6) Both versions achieve state-of-the-art open-model performance across code, reasoning, and multi-step tasks.
Both models are fully open-source, distributed via Hugging Face in base and instruction-tuned variants. K2 runs at approximately 15 tokens/s on two Apple M3 Ultras. Using Unsloth Dynamic 1.8-bit quantization, K2.5's disk requirements drop from 600GB to 240GB, enabling operation on a single 24GB GPU with system RAM offloading. 7) A notable trade-off: K2 consumes 2-2.5x more tokens than comparable models on the same tasks.
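A back-of-envelope check makes the storage figures plausible: treating the model as roughly 1.04e12 parameters at an average of N bits each gives sizes close to the quoted numbers. The average-bits values below are inferred from those numbers, not published figures, and real on-disk sizes also include quantization metadata and a per-layer precision mix.

```python
PARAMS = 1.04e12  # ~1.04 trillion parameters

def size_gb(bits_per_param: float) -> float:
    """Raw weight storage in GB (1e9 bytes) at an average bits-per-parameter."""
    return PARAMS * bits_per_param / 8 / 1e9

print(f"8.0 bits/param: {size_gb(8.0):6.0f} GB")  # ~1040 GB
print(f"4.6 bits/param: {size_gb(4.6):6.0f} GB")  # ~598 GB, near the 600GB quoted
print(f"1.8 bits/param: {size_gb(1.8):6.0f} GB")  # ~234 GB, near the 240GB quoted
```

The quoted 600GB thus corresponds to roughly 4.6 bits per parameter on average, and the 1.8-bit figure lines up with the ~240GB after quantization.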
Kimi K2 is part of a broader wave of Chinese MoE model development. Moonshot AI, founded in 2023 by former Tsinghua University researchers, has positioned itself alongside DeepSeek and other Chinese labs pushing the boundaries of efficient large-scale AI. The MoE architecture allows these models to compete with Western frontier models while requiring significantly less compute per inference request.