DeepSeek

DeepSeek is a Chinese AI research company headquartered in Hangzhou, founded in 2023 by Liang Wenfeng, co-founder of the quantitative hedge fund High-Flyer. DeepSeek disrupted the global AI industry by demonstrating that frontier-class models could be trained for a fraction of the cost of Western competitors, using efficient Mixture of Experts (MoE) architectures. The company's model weights are openly released and available for self-hosting.1)

Key Models

DeepSeek-V3

The flagship general-purpose model, released in December 2024, with V3.2 following in December 2025. DeepSeek-V3 handles writing, analysis, coding, and data tasks, matching the quality of GPT-5 and Claude Sonnet 4.6 at significantly lower cost. The model supports a 128K-token context window, described as handling approximately 2,000 pages of text.2)

DeepSeek-R1

A dedicated reasoning model released in January 2025, excelling at step-by-step logical reasoning and outperforming OpenAI's o1-mini. R1 displays its full reasoning trace to users. Its release triggered a major market shock when investors realized that its base model, DeepSeek-V3, had been trained for approximately $5.6 million using optimized NVIDIA H800 chips, compared to over $100 million for comparable Western models.3)

Other Models

MoE Architecture

DeepSeek pioneered aggressive use of Mixture of Experts (MoE) starting with DeepSeek-MoE in January 2024. The approach activates only a small subset of specialist “expert” sub-networks for each token: in DeepSeek-V3 and R1, 8 routed experts out of 256, plus one shared expert. As a result, R1 activates approximately 37 billion of its 671 billion total parameters for any given input, requiring 2-4x fewer computational resources than an equivalent dense model.4)
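The routing idea can be sketched in a few lines of Python. This toy example (random weights, per-dimension scalar “experts”, top-2 gating) is purely illustrative and not DeepSeek's actual architecture; it only shows why compute scales with the experts selected rather than the total parameter count.

```python
import math
import random

random.seed(0)

n_experts, d_model, top_k = 8, 4, 2

# Toy "experts": each is just a per-dimension scale vector.
experts = [[random.uniform(-1, 1) for _ in range(d_model)] for _ in range(n_experts)]
# Gating matrix: maps a token vector to one score per expert.
gate = [[random.uniform(-1, 1) for _ in range(n_experts)] for _ in range(d_model)]

def moe_forward(x):
    """Route token vector x to its top_k experts, mix by softmax weights."""
    logits = [sum(x[d] * gate[d][e] for d in range(d_model)) for e in range(n_experts)]
    top = sorted(range(n_experts), key=lambda e: logits[e])[-top_k:]
    w = [math.exp(logits[e]) for e in top]
    s = sum(w)
    w = [wi / s for wi in w]
    out = [0.0] * d_model
    # Only the top_k experts do any work; the rest stay idle,
    # which is where the compute savings come from.
    for wi, e in zip(w, top):
        for d in range(d_model):
            out[d] += wi * x[d] * experts[e][d]
    return out, top

out, chosen = moe_forward([0.5, -0.2, 0.1, 0.9])
```

Only `top_k` of the `n_experts` weight sets touch the input, so the active-parameter count per token is a small fraction of the total, mirroring R1's 37B-of-671B ratio.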

Training Cost Breakthrough

The approximately $5.6 million training cost reported for DeepSeek-V3, the base model underlying R1, fundamentally repriced expectations for AI development:

Metric                     | DeepSeek    | Western Competitors (e.g., GPT-4/5)
Training Cost              | ~$5.6M      | $100M+
API Input (per 1M tokens)  | $0.14-0.55  | $1.25-15
API Output (per 1M tokens) | $0.28-2.19  | $10-75

The cost efficiency was achieved through MoE architecture, optimized training infrastructure, and efficient use of NVIDIA H800 chips (the export-compliant variant available in China).
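The pricing gap in the table compounds quickly at realistic volumes. A minimal sketch, using the low end of each price range above and a hypothetical monthly workload of 10M input and 2M output tokens:

```python
def api_cost(in_tokens, out_tokens, in_price, out_price):
    """Total USD cost, given per-1M-token input and output rates."""
    return in_tokens / 1e6 * in_price + out_tokens / 1e6 * out_price

# Hypothetical workload: 10M input + 2M output tokens per month.
deepseek = api_cost(10e6, 2e6, 0.14, 0.28)   # low end of DeepSeek's range
western = api_cost(10e6, 2e6, 1.25, 10.00)   # low end of the Western range

print(f"DeepSeek: ${deepseek:.2f}, Western: ${western:.2f}, "
      f"ratio: {western / deepseek:.1f}x")
```

Even at the cheapest Western rates in the table, the same workload costs over an order of magnitude more than on DeepSeek's API.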

Industry Impact

DeepSeek-R1's January 2025 launch caused NVIDIA's market capitalization to drop by approximately $600 billion in a single day as investors reassessed assumptions about the compute requirements for frontier AI.

Open-Source Approach

All DeepSeek model weights are openly released and downloadable for self-hosting; smaller distilled variants can run on consumer hardware (including NVIDIA RTX 4090/5090 GPUs), while the full 671B-parameter models require multi-GPU servers. There are no subscription fees for the open weights, and the hosted API is priced 10-50x cheaper than Western alternatives.
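For those who prefer the hosted API over self-hosting, DeepSeek exposes an OpenAI-compatible chat-completions interface. A minimal sketch of building a request body with the standard library (the endpoint URL and `deepseek-chat` model name are taken from DeepSeek's public API documentation; actually sending the request requires an API key):

```python
import json

def build_chat_request(prompt, model="deepseek-chat"):
    """Assemble an OpenAI-style chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

payload = json.dumps(build_chat_request("Summarize MoE routing in one sentence."))
# POST this payload to https://api.deepseek.com/chat/completions with an
# "Authorization: Bearer <your API key>" header to receive a completion.
```

Because the wire format matches OpenAI's, existing OpenAI client libraries can typically be pointed at DeepSeek by swapping the base URL and API key.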

See Also

References