Current pricing and specifications for major LLMs. Use this to pick the right model for your use case and budget. Prices are in USD per 1M tokens, as listed for each provider's API (or a third-party host for open-weight models).
| Model | Provider | Context Window | Input $/1M | Output $/1M | Strengths | API Endpoint |
|---|---|---|---|---|---|---|
| GPT-4o | OpenAI | 128K | $2.50 | $10.00 | Fast multimodal frontier model, vision + audio | api.openai.com |
| GPT-4.1 | OpenAI | 1M | $2.00 | $8.00 | Long-context coding, instruction following | api.openai.com |
| o3 | OpenAI | 200K | $10.00 | $40.00 | Deep reasoning with thinking tokens, math/science | api.openai.com |
| o3-mini | OpenAI | 200K | $1.10 | $4.40 | Budget reasoning model | api.openai.com |
| Claude Opus 4 | Anthropic | 200K | $15.00 | $75.00 | Most capable reasoning, complex analysis, agentic coding | api.anthropic.com |
| Claude Sonnet 4 | Anthropic | 200K | $3.00 | $15.00 | Best price-performance, balanced speed + quality | api.anthropic.com |
| Claude Haiku 3.5 | Anthropic | 200K | $0.80 | $4.00 | Fast + cheap, classification, extraction | api.anthropic.com |
| Gemini 2.5 Pro | Google | 1M | $1.25 / $2.50 | $10.00 / $15.00 | Massive context, tiered pricing (<200K / >200K) | generativelanguage.googleapis.com |
| Gemini 2.5 Flash | Google | 1M | $0.15 / $0.30 | $0.60 / $3.50 | Ultra-fast, cheapest reasoning model | generativelanguage.googleapis.com |
| Llama 4 Maverick | Meta (via hosts) | 128K | $0.88 | $0.88 | Open-weight, strong multilingual, self-hostable | together.ai / fireworks.ai |
| Llama 4 Scout | Meta (via hosts) | 128K | $0.11 | $0.22 | Budget open model, lightweight tasks | together.ai / fireworks.ai |
| Mistral Large | Mistral | 128K | $2.00 | $6.00 | GDPR-compliant, strong European data handling | api.mistral.ai |
| DeepSeek V3 | DeepSeek | 128K | $0.27 | $1.10 | Extreme value, cache hits at $0.07/M input | api.deepseek.com |
| Qwen 3 | Alibaba | 128K-1M | $0.16 | $0.70 | Budget multilingual, scalable context variants | dashscope.aliyuncs.com |
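
To turn the per-1M-token prices into a per-request figure, multiply each token count by its rate and divide by one million. The sketch below is illustrative only: the function names are hypothetical, the prices are a small subset copied from the pricing table, and the Gemini 2.5 Pro helper assumes the tier is selected by input token count, with exactly 200K tokens falling in the lower tier (the table does not say which side the boundary belongs to).

```python
# Prices in USD per 1M tokens as (input, output), copied from the pricing table.
PRICES = {
    "GPT-4o": (2.50, 10.00),
    "Claude Sonnet 4": (3.00, 15.00),
    "DeepSeek V3": (0.27, 1.10),
}

# Gemini 2.5 Pro tiered rates from the table: (input, output) per tier.
GEMINI_PRO_TIERS = {"low": (1.25, 10.00), "high": (2.50, 15.00)}

# Assumption: the tier is chosen by input token count, and a prompt of
# exactly 200K tokens is billed at the lower tier.
TIER_THRESHOLD = 200_000


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at flat per-token rates."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def estimate_gemini_pro_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one Gemini 2.5 Pro request with tiered rates."""
    tier = "low" if input_tokens <= TIER_THRESHOLD else "high"
    in_price, out_price = GEMINI_PRO_TIERS[tier]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # 10K input tokens and 2K output tokens on Claude Sonnet 4.
    print(f"${estimate_cost('Claude Sonnet 4', 10_000, 2_000):.4f}")
    # A 300K-token prompt lands in the higher Gemini 2.5 Pro tier.
    print(f"${estimate_gemini_pro_cost(300_000, 1_000):.4f}")
```

Cache discounts (such as the DeepSeek V3 cache-hit rate listed in the table) and reasoning-token overhead are not modeled here, so treat the result as a rough upper-level estimate rather than a bill.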
| Use Case | Recommended Model | Why |
|---|---|---|
| Complex reasoning & analysis | Claude Opus 4 | Highest capability, best for multi-step reasoning |
| Daily coding assistant | Claude Sonnet 4 or GPT-4.1 | Strong code quality at reasonable cost |
| Long document processing | Gemini 2.5 Pro or GPT-4.1 | 1M context windows |
| High-volume classification | Gemini 2.5 Flash | Cheapest per token with reasoning |
| Budget-conscious production | DeepSeek V3 | $0.27/M input with caching at $0.07 |
| Self-hosted / open-weight | Llama 4 Maverick | Strong open model, no API costs at scale |
| Math / science reasoning | o3 | Purpose-built for deep reasoning tasks |
| European data compliance | Mistral Large | GDPR-compliant, EU-hosted option |
| Multilingual applications | Qwen 3 or Llama 4 | Strong multilingual benchmarks |
| Fastest response time | Gemini 2.5 Flash | Sub-second latency, streaming |
| Model | Context Window | Notes |
|---|---|---|
| Gemini 2.5 Pro | 1,000,000 | Largest production context |
| Gemini 2.5 Flash | 1,000,000 | Same window, lower cost |
| GPT-4.1 | 1,000,000 | Newest OpenAI long-context |
| Claude Opus 4 | 200,000 | Extended thinking available |
| Claude Sonnet 4 | 200,000 | Same window as Opus |
| o3 | 200,000 | Thinking tokens use context |
| GPT-4o | 128,000 | Standard frontier context |
| Llama 4 Maverick | 128,000 | Open-weight |
| DeepSeek V3 | 128,000 | Budget option |
| Mistral Large | 128,000 | EU-compliant |
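
A quick way to use the context window figures is to filter out models that cannot hold a given prompt. The following is a minimal sketch: `models_that_fit` is a hypothetical helper, the window sizes are copied from the context window table, and it assumes that any tokens reserved for output (including thinking tokens) count against the same window as the prompt, which may not hold for every provider.

```python
# Context windows in tokens, copied from the context window table.
CONTEXT_WINDOWS = {
    "Gemini 2.5 Pro": 1_000_000,
    "Gemini 2.5 Flash": 1_000_000,
    "GPT-4.1": 1_000_000,
    "Claude Opus 4": 200_000,
    "Claude Sonnet 4": 200_000,
    "o3": 200_000,
    "GPT-4o": 128_000,
    "Llama 4 Maverick": 128_000,
    "DeepSeek V3": 128_000,
    "Mistral Large": 128_000,
}


def models_that_fit(prompt_tokens: int, reserved_output_tokens: int = 0) -> list[str]:
    """Return models whose window holds the prompt plus reserved output.

    Assumption: output and thinking tokens share the window with the prompt.
    """
    needed = prompt_tokens + reserved_output_tokens
    return [name for name, window in CONTEXT_WINDOWS.items() if window >= needed]


if __name__ == "__main__":
    # A 150K-token document with 60K tokens reserved for the response.
    print(models_that_fit(150_000, reserved_output_tokens=60_000))
```

Token counts differ between tokenizers, so a prompt that measures 120K tokens for one model may be larger or smaller for another; leaving some headroom below the listed window is a reasonable precaution.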
Last updated: March 2026. Prices change frequently – verify with provider.