DeepSeek

DeepSeek is a Chinese AI research company headquartered in Hangzhou, founded in 2023 by Liang Wenfeng, co-founder of the quantitative hedge fund High-Flyer. DeepSeek disrupted the global AI industry by demonstrating that frontier-class models could be trained for a fraction of the cost of Western competitors, using efficient Mixture of Experts (MoE) architectures. The company's model weights are openly released and available for self-hosting.1)

Key Models

DeepSeek-V3

The flagship general-purpose model, released in December 2024, with V3.2 following in December 2025. DeepSeek-V3 handles writing, analysis, coding, and data tasks, matching the quality of GPT-5 and Claude Sonnet 4.6 at significantly lower cost. The model supports a 128K-token context window, described as handling approximately 2,000 pages of text.2)

DeepSeek-R1

A dedicated reasoning model released in January 2025, excelling at step-by-step logical reasoning and outperforming OpenAI's o1-mini. R1 displays its full reasoning trace to users. Its release triggered a major market shock when investors realized that its base model, DeepSeek-V3, had been trained for approximately $5.6 million using optimized NVIDIA H800 chips, compared to over $100 million for comparable Western models.3)

Other Models

MoE Architecture

DeepSeek pioneered aggressive use of Mixture of Experts (MoE) starting with DeepSeek-MoE in January 2024. The approach activates only a small subset of specialist “expert” sub-networks for each token: in DeepSeek-V3 and R1, 8 routed experts out of 256, plus one shared expert. As a result, R1 activates approximately 37 billion of its 671 billion total parameters for any given input, requiring 2-4x fewer computational resources than an equivalent dense model.4)
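The routing idea can be sketched in a few lines of Python. This toy example (random weights, per-dimension scalar “experts”, top-2 gating) is purely illustrative and not DeepSeek's actual architecture; it only shows why compute scales with the experts selected rather than the total parameter count.

```python
import math
import random

random.seed(0)

n_experts, d_model, top_k = 8, 4, 2

# Toy "experts": each is just a per-dimension scale vector.
experts = [[random.uniform(-1, 1) for _ in range(d_model)] for _ in range(n_experts)]
# Gating matrix: maps a token vector to one score per expert.
gate = [[random.uniform(-1, 1) for _ in range(n_experts)] for _ in range(d_model)]

def moe_forward(x):
    """Route token vector x to its top_k experts, mix by softmax weights."""
    logits = [sum(x[d] * gate[d][e] for d in range(d_model)) for e in range(n_experts)]
    top = sorted(range(n_experts), key=lambda e: logits[e])[-top_k:]
    w = [math.exp(logits[e]) for e in top]
    s = sum(w)
    w = [wi / s for wi in w]
    out = [0.0] * d_model
    # Only the top_k experts do any work; the rest stay idle,
    # which is where the compute savings come from.
    for wi, e in zip(w, top):
        for d in range(d_model):
            out[d] += wi * x[d] * experts[e][d]
    return out, top

out, chosen = moe_forward([0.5, -0.2, 0.1, 0.9])
```

Only `top_k` of the `n_experts` weight sets touch the input, so the active-parameter count per token is a small fraction of the total, mirroring R1's 37B-of-671B ratio.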

Training Cost Breakthrough

The approximately $5.6 million training cost reported for DeepSeek-V3, the base model underlying R1, fundamentally repriced expectations for AI development:

Metric                     | DeepSeek    | Western Competitors (e.g., GPT-4/5)
Training Cost              | ~$5.6M      | $100M+
API Input (per 1M tokens)  | $0.14-0.55  | $1.25-15
API Output (per 1M tokens) | $0.28-2.19  | $10-75

The cost efficiency was achieved through MoE architecture, optimized training infrastructure, and efficient use of NVIDIA H800 chips (the export-compliant variant available in China).
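The pricing gap in the table compounds quickly at realistic volumes. A minimal sketch, using the low end of each price range above and a hypothetical monthly workload of 10M input and 2M output tokens:

```python
def api_cost(in_tokens, out_tokens, in_price, out_price):
    """Total USD cost, given per-1M-token input and output rates."""
    return in_tokens / 1e6 * in_price + out_tokens / 1e6 * out_price

# Hypothetical workload: 10M input + 2M output tokens per month.
deepseek = api_cost(10e6, 2e6, 0.14, 0.28)   # low end of DeepSeek's range
western = api_cost(10e6, 2e6, 1.25, 10.00)   # low end of the Western range

print(f"DeepSeek: ${deepseek:.2f}, Western: ${western:.2f}, "
      f"ratio: {western / deepseek:.1f}x")
```

Even at the cheapest Western rates in the table, the same workload costs over an order of magnitude more than on DeepSeek's API.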

Industry Impact

DeepSeek-R1's January 2025 launch caused NVIDIA's market capitalization to drop by approximately $600 billion in a single day as investors reassessed assumptions about the compute requirements for frontier AI.

Open-Source Approach

All DeepSeek model weights are openly released and downloadable for self-hosting; smaller distilled variants can run on consumer hardware (including NVIDIA RTX 4090/5090 GPUs), while the full 671B-parameter models require multi-GPU servers. There are no subscription fees for the open weights, and the hosted API is priced 10-50x cheaper than Western alternatives.
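For those who prefer the hosted API over self-hosting, DeepSeek exposes an OpenAI-compatible chat-completions interface. A minimal sketch of building a request body with the standard library (the endpoint URL and `deepseek-chat` model name are taken from DeepSeek's public API documentation; actually sending the request requires an API key):

```python
import json

def build_chat_request(prompt, model="deepseek-chat"):
    """Assemble an OpenAI-style chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

payload = json.dumps(build_chat_request("Summarize MoE routing in one sentence."))
# POST this payload to https://api.deepseek.com/chat/completions with an
# "Authorization: Bearer <your API key>" header to receive a completion.
```

Because the wire format matches OpenAI's, existing OpenAI client libraries can typically be pointed at DeepSeek by swapping the base URL and API key.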

See Also

References