DeepSeek is a Chinese AI research company headquartered in Hangzhou, China. It was founded in 2023 by Liang Wenfeng, who also founded the quantitative hedge fund High-Flyer Capital. DeepSeek disrupted the global AI industry by demonstrating that frontier-class models could be trained for a fraction of the cost reported by Western competitors, using efficient Mixture of Experts (MoE) architectures. The company releases its models as open-source weights that are available for self-hosting.1)
DeepSeek-V3 is the flagship general-purpose model, released in December 2024, with V3.2 following in December 2025. It handles writing, analysis, coding, and data tasks, reportedly matching the quality of GPT-5 and Claude Sonnet 4.6 at significantly lower cost. Its context window can hold approximately 2,000 pages of text.2)
DeepSeek-R1 is a dedicated reasoning model released in January 2025. It excels at step-by-step logical reasoning, outperforms OpenAI's o1-mini, and displays its full reasoning trace to users. Its release triggered a major market shock when investors realized it had reportedly been trained for approximately $5.6 million on NVIDIA H800 chips, compared to over $100 million for comparable Western models.3)
DeepSeek pioneered aggressive use of Mixture of Experts (MoE) starting with DeepSeek-MoE in January 2024. Instead of running the whole network, the approach routes each token to only a few specialist “expert” sub-networks. For example, R1 activates approximately 37 billion of its 671 billion total parameters (about 5.5%) for each token, requiring 2-4x fewer computational resources than equivalent dense models.4)
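The routing idea described above can be sketched in a few lines. This is a generic top-k gating illustration under stated assumptions, not DeepSeek's actual router: the eight toy experts, the random gate scores, and the helper names `top_k_route` and `moe_forward` are all hypothetical. Only the 37 billion and 671 billion parameter figures come from the text.

```python
import math
import random


def top_k_route(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax over only the selected experts so their weights sum to 1.
    peak = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, gate_logits, k):
    """Evaluate and combine only the k selected experts; the rest are skipped."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_logits, k))


# Toy setup (hypothetical): 8 experts, each a simple scaling function.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
rng = random.Random(0)
gate_logits = [rng.gauss(0.0, 1.0) for _ in experts]

routes = top_k_route(gate_logits, k=2)
output = moe_forward(3.0, experts, gate_logits, k=2)

# Share of parameters active per token, using the figures quoted in the text.
active_fraction = 37e9 / 671e9

print("experts used:", [i for i, _ in routes])
print(f"combined output: {output:.3f}")
print(f"active parameter fraction: {active_fraction:.1%}")
```

Because only two of the eight toy experts are evaluated for each input, the cost of a forward pass scales with the number of selected experts rather than the total expert count. The same principle lets a large MoE model run with a small share of its parameters active.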
The reported $5.6 million training cost for DeepSeek-R1 fundamentally repriced expectations for AI development:
| Metric | DeepSeek | Western Competitors (e.g., GPT-4/5) |
|---|---|---|
| Training Cost | ~$5.6M | $100M+ |
| API Input (per 1M tokens) | $0.14-0.55 | $1.25-15 |
| API Output (per 1M tokens) | $0.28-2.19 | $10-75 |
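To make the per-token prices in the table concrete, the following sketch works out the bill for one workload at each end of the listed ranges. The workload size (10 million input tokens and 2 million output tokens) and the helper name `api_cost` are assumptions chosen for illustration. The prices are the range endpoints from the table, not quotes for any specific model.

```python
def api_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Cost in USD, given prices per one million tokens."""
    return ((input_tokens / 1_000_000) * input_price_per_m
            + (output_tokens / 1_000_000) * output_price_per_m)


# Hypothetical workload, chosen only for illustration.
INPUT_TOKENS = 10_000_000
OUTPUT_TOKENS = 2_000_000

# Range endpoints taken from the table above (USD per 1M tokens).
deepseek_low = api_cost(INPUT_TOKENS, OUTPUT_TOKENS, 0.14, 0.28)
deepseek_high = api_cost(INPUT_TOKENS, OUTPUT_TOKENS, 0.55, 2.19)
western_low = api_cost(INPUT_TOKENS, OUTPUT_TOKENS, 1.25, 10.0)
western_high = api_cost(INPUT_TOKENS, OUTPUT_TOKENS, 15.0, 75.0)

print(f"DeepSeek: ${deepseek_low:.2f} to ${deepseek_high:.2f}")
print(f"Western:  ${western_low:.2f} to ${western_high:.2f}")
print(f"ratio at low end:  {western_low / deepseek_low:.1f}x")
print(f"ratio at high end: {western_high / deepseek_high:.1f}x")
```

For this assumed workload the gap is roughly 17x at the low end of both ranges and roughly 30x at the high end. The exact multiple depends on the mix of input and output tokens, since output tokens carry the larger price difference.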
The cost efficiency was achieved through MoE architecture, optimized training infrastructure, and efficient use of NVIDIA H800 chips (the export-compliant variant available in China).
DeepSeek-R1's January 2025 launch caused NVIDIA's market capitalization to drop by approximately $600 billion as investors reassessed assumptions about the compute requirements for frontier AI.
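All DeepSeek models are released as open-source weights that can be downloaded for self-hosting. The smaller distilled variants run on consumer hardware (including NVIDIA RTX 4090/5090 GPUs), while the full-size models need far more memory than a single consumer GPU provides. There are no subscription fees for self-hosted use, and the API is priced 10-50x cheaper than Western alternatives.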
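A rough way to judge what self-hosting requires is to estimate the memory needed just to hold the weights: parameter count times bytes per parameter. The sketch below applies that arithmetic to the 671 billion parameter figure quoted earlier. The quantization levels, the 7 billion parameter small model, and the helper name `weight_memory_gb` are assumptions for illustration. The estimate ignores the KV cache, activations, and runtime overhead, so real requirements are higher.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


FULL_MODEL_PARAMS = 671e9   # total parameter count quoted in the text
SMALL_MODEL_PARAMS = 7e9    # hypothetical small variant, for comparison only

# VRAM of the consumer GPUs named in the text.
GPU_VRAM_GB = {"RTX 4090": 24, "RTX 5090": 32}

full_4bit = weight_memory_gb(FULL_MODEL_PARAMS, 4)
small_4bit = weight_memory_gb(SMALL_MODEL_PARAMS, 4)

for bits in (16, 8, 4):
    print(f"full model at {bits}-bit: ~{weight_memory_gb(FULL_MODEL_PARAMS, bits):.0f} GB")

for gpu, vram in GPU_VRAM_GB.items():
    print(f"{gpu} ({vram} GB): "
          f"full 4-bit fits = {full_4bit <= vram}, "
          f"small 4-bit fits = {small_4bit <= vram}")
```

Under these assumptions, even a 4-bit copy of the full model needs hundreds of gigabytes for weights alone, so single-GPU consumer setups are realistic only for much smaller variants.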