Comprehensive reference for every major LLM API provider. Copy-paste ready endpoints, pricing, and code snippets.
Last updated: March 2026
| Provider | Model | Endpoint URL | Auth Method | Context | Input $/1M | Output $/1M |
|---|---|---|---|---|---|---|
| OpenAI | GPT-4.1 | https://api.openai.com/v1/chat/completions | Bearer token | 1M | $2.00 | $8.00 |
| OpenAI | o3 | https://api.openai.com/v1/chat/completions | Bearer token | 200K | $2.00 | $8.00 |
| OpenAI | o4-mini | https://api.openai.com/v1/chat/completions | Bearer token | 200K | $1.10 | $4.40 |
| OpenAI | GPT-4o | https://api.openai.com/v1/chat/completions | Bearer token | 128K | $2.50 | $10.00 |
| Anthropic | Claude Opus 4.6 | https://api.anthropic.com/v1/messages | x-api-key header | 200K | $5.00 | $25.00 |
| Anthropic | Claude Sonnet 4.5 | https://api.anthropic.com/v1/messages | x-api-key header | 200K | $3.00 | $15.00 |
| Anthropic | Claude Haiku 3.5 | https://api.anthropic.com/v1/messages | x-api-key header | 200K | $0.80 | $4.00 |
| Google | Gemini 2.5 Pro | https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro:generateContent | API key or Bearer | 1M | $1.25 | $10.00 |
| Google | Gemini 2.5 Flash | https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent | API key or Bearer | 1M | $0.15 | $0.60 |
| Mistral | Mistral Large | https://api.mistral.ai/v1/chat/completions | Bearer token | 128K | $2.00 | $6.00 |
| Mistral | Mistral Small | https://api.mistral.ai/v1/chat/completions | Bearer token | 128K | $0.40 | $2.00 |
| DeepSeek | DeepSeek-V3 | https://api.deepseek.com/v1/chat/completions | Bearer token | 164K | $0.14 | $0.28 |
| DeepSeek | DeepSeek-R1 | https://api.deepseek.com/v1/chat/completions | Bearer token | 164K | $0.55 | $2.19 |
| Groq | (Hosted models) | https://api.groq.com/openai/v1/chat/completions | Bearer token | Varies | ~$0.10 | ~$0.25 |
| Together | (Hosted models) | https://api.together.xyz/v1/chat/completions | Bearer token | Varies | ~$0.20 | ~$0.88 |
| Fireworks | (Hosted models) | https://api.fireworks.ai/inference/v1/chat/completions | Bearer token | Varies | ~$0.10 | ~$1.00 |
| Ollama | (Local models) | http://localhost:11434/api/chat | None (local) | Varies | Free | Free |
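The Auth Method column reduces to three header patterns. A minimal sketch, assuming you are building raw HTTP requests yourself (the `build_headers` helper is illustrative, not part of any SDK):

```python
# Build the auth headers each provider family expects, per the table above.
# `build_headers` is an illustrative helper, not part of any official SDK.

def build_headers(provider: str, api_key: str) -> dict:
    if provider == "anthropic":
        # Anthropic uses a custom key header plus a required version header
        return {
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        }
    if provider == "google":
        # Gemini also accepts the key as a ?key= query parameter instead
        return {
            "x-goog-api-key": api_key,
            "content-type": "application/json",
        }
    # OpenAI, Mistral, DeepSeek, Groq, Together, Fireworks: standard Bearer auth
    return {
        "authorization": f"Bearer {api_key}",
        "content-type": "application/json",
    }
```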
OpenAI (Python SDK):

```python
import openai

client = openai.OpenAI(api_key="sk-...")
response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(response.choices[0].message.content)
```

OpenAI (curl):

```bash
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4.1","messages":[{"role":"user","content":"Hello"}]}'
```
Anthropic (Python SDK):

```python
import anthropic

client = anthropic.Anthropic(api_key="sk-ant-...")
message = client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"},
    ],
)
print(message.content[0].text)
```

Anthropic (curl):

```bash
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-sonnet-4-5-20250929","max_tokens":1024,"messages":[{"role":"user","content":"Hello"}]}'
```
Google Gemini (Python SDK):

```python
from google import genai

client = genai.Client(api_key="AIza...")
response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Hello!",
)
print(response.text)
```

Google Gemini (curl):

```bash
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro:generateContent?key=$GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"contents":[{"parts":[{"text":"Hello"}]}]}'
```

Mistral (Python SDK):

```python
from mistralai import Mistral

client = Mistral(api_key="...")
response = client.chat.complete(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

DeepSeek (OpenAI SDK with custom base_url):

```python
import openai

client = openai.OpenAI(
    api_key="sk-...",
    base_url="https://api.deepseek.com/v1",
)
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

Groq (OpenAI SDK with custom base_url):

```python
import openai

client = openai.OpenAI(
    api_key="gsk_...",
    base_url="https://api.groq.com/openai/v1",
)
response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

Together (OpenAI SDK with custom base_url):

```python
import openai

client = openai.OpenAI(
    api_key="...",
    base_url="https://api.together.xyz/v1",
)
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

Fireworks (OpenAI SDK with custom base_url):

```python
import openai

client = openai.OpenAI(
    api_key="fw_...",
    base_url="https://api.fireworks.ai/inference/v1",
)
response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p1-70b-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

Ollama (OpenAI SDK, local):

```python
import openai

client = openai.OpenAI(
    api_key="ollama",  # placeholder; Ollama ignores the key
    base_url="http://localhost:11434/v1",
)
response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

Ollama (native API, curl):

```bash
curl http://localhost:11434/api/chat \
  -d '{"model":"llama3.2","messages":[{"role":"user","content":"Hello"}]}'
```
| Feature | OpenAI | Anthropic | Google | Others |
|---|---|---|---|---|
| Auth Header | Authorization: Bearer | x-api-key | API key in URL or Bearer | Authorization: Bearer |
| SDK Pattern | client.chat.completions.create() | client.messages.create() | client.models.generate_content() | OpenAI-compatible |
| Streaming | stream=True | stream=True (uses SSE) | stream=True | stream=True |
| Tool Calling | tools=[] param | tools=[] param | tools=[] param | Varies |
| Response Path | choices[0].message.content | content[0].text | response.text | OpenAI-compatible |
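Every SDK in the table streams with `stream=True` and delivers incremental text deltas that the client concatenates. A sketch of the accumulation step in the OpenAI chunk shape, run here against stubbed chunks rather than a live API (the `collect_stream` helper is illustrative):

```python
# Accumulate streamed text in the OpenAI chunk shape:
# each chunk carries choices[0].delta.content (possibly absent).
# Real usage: for chunk in client.chat.completions.create(..., stream=True): ...

def collect_stream(chunks) -> str:
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:  # role-only and final chunks carry no text
            parts.append(delta)
    return "".join(parts)

# Stubbed chunks mimicking a streamed "Hello!" response
stub = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo!"}}]},
    {"choices": [{"delta": {}}]},
]
print(collect_stream(stub))  # -> Hello!
```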
- Switch between the OpenAI-compatible providers (Mistral, DeepSeek, Groq, Together, Fireworks, Ollama) by changing only `base_url` and `api_key`; the rest of the OpenAI SDK code stays identical
- Set `max_tokens` for Anthropic (required) and consider it for cost control elsewhere
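Since the per-provider snippets above differ only in `base_url` and `api_key`, the choice can be collapsed into a small config map. A sketch, assuming conventional env-var names (the `client_kwargs` helper is illustrative, not part of any SDK):

```python
import os

# base_url per OpenAI-compatible provider, from the endpoint table above;
# env-var names are conventional but illustrative.
PROVIDERS = {
    "openai":    (None, "OPENAI_API_KEY"),  # None = SDK default base_url
    "deepseek":  ("https://api.deepseek.com/v1", "DEEPSEEK_API_KEY"),
    "groq":      ("https://api.groq.com/openai/v1", "GROQ_API_KEY"),
    "together":  ("https://api.together.xyz/v1", "TOGETHER_API_KEY"),
    "fireworks": ("https://api.fireworks.ai/inference/v1", "FIREWORKS_API_KEY"),
    "ollama":    ("http://localhost:11434/v1", None),  # no key needed locally
}

def client_kwargs(provider: str) -> dict:
    base_url, key_var = PROVIDERS[provider]
    # Ollama ignores the key; the OpenAI SDK just wants a non-empty string
    kwargs = {"api_key": os.environ.get(key_var) if key_var else "ollama"}
    if base_url:
        kwargs["base_url"] = base_url
    return kwargs
```

Usage: `client = openai.OpenAI(**client_kwargs("groq"))`, then the same `client.chat.completions.create()` call works for any entry in the map.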