Jun 21, 2026 · 12 min read · Pricing Guide

AI API Pricing 2026: Every Model Ranked by Cost

42 models. 10 providers. From $0.075 input to $180 output per million tokens. Here's the definitive ranking of every AI API you can buy right now.

42 models ranked · 10 providers · 2,400× price range · $0.28 cheapest output/M

AI API pricing has become a maze. OpenAI alone has 9 models. Google has 7. Every provider has multiple tiers, and the "obvious" choice is rarely the cheapest. We track every model daily at APIpulse; here's what the data says.

The Complete Ranking: Cheapest to Most Expensive

All prices are per million tokens. Active (non-deprecated) models only. Data verified Jun 21, 2026.

| # | Model | Provider | Tier | Input/M | Output/M | Context |
|---|---|---|---|---|---|---|
| 1 | Mistral Small 4 | Mistral | Budget | $0.10 | $0.30 | 128K |
| 2 | Gemini 2.5 Flash-Lite | Google | Budget | $0.10 | $0.40 | 1M |
| 3 | DeepSeek V4 Flash | DeepSeek | Budget | $0.14 | $0.28 | 1M |
| 4 | GPT-oss 20B | OpenAI | Budget | $0.08 | $0.35 | 128K |
| 5 | Llama 4 Scout | Meta (Together.ai) | Budget | $0.18 | $0.59 | 1M |
| 6 | GPT-oss 120B | OpenAI | Budget | $0.15 | $0.60 | 128K |
| 7 | GPT-4o mini | OpenAI | Budget | $0.15 | $0.60 | 128K |
| 8 | DeepSeek V3.2 | DeepSeek | Budget | $0.23 | $0.34 | 128K |
| 9 | Llama 4 Maverick | Meta (Together.ai) | Budget | $0.27 | $0.85 | 1M |
| 10 | DeepSeek V4 Pro | DeepSeek | Budget | $0.435 | $0.87 | 1M |
| 11 | GPT-5 mini | OpenAI | Budget | $0.25 | $2.00 | 272K |
| 12 | Gemini 3.1 Flash-Lite | Google | Budget | $0.25 | $1.50 | 1M |
| 13 | Gemini 3 Flash | Google | Budget | $0.50 | $3.00 | 1M |
| 14 | Mistral Large 3 | Mistral | Budget | $0.50 | $1.50 | 262K |
| 15 | Command R | Cohere | Budget | $0.50 | $1.50 | 128K |
| 16 | Grok Build 0.1 | xAI | Budget | $0.30 | $0.50 | 256K |
| 17 | Kimi K2.6 | Moonshot | Budget | $0.95 | $4.00 | 256K |
| 18 | Claude Haiku 4.5 | Anthropic | Mid | $1.00 | $5.00 | 200K |
| 19 | Gemini 2.5 Pro | Google | Mid | $1.25 | $10.00 | 1M |
| 20 | GPT-5 | OpenAI | Premium | $1.25 | $10.00 | 272K |
| 21 | Grok 4.3 | xAI | Mid | $1.25 | $2.50 | 1M |
| 22 | Mistral Medium 3.5 | Mistral | Mid | $1.50 | $7.50 | 128K |
| 23 | Gemini 3.5 Flash | Google | Mid | $1.50 | $9.00 | 1M |
| 24 | GPT-5.3 Codex | OpenAI | Mid | $1.75 | $14.00 | 400K |
| 25 | Gemini 3.1 Pro | Google | Mid | $2.00 | $12.00 | 1M |
| 26 | Jamba 1.7 Large | AI21 | Mid | $2.00 | $8.00 | 256K |
| 27 | GPT-4o | OpenAI | Mid | $2.50 | $10.00 | 128K |
| 28 | Command A | Cohere | Mid | $2.50 | $10.00 | 128K |
| 29 | Command R+ | Cohere | Mid | $2.50 | $10.00 | 128K |
| 30 | Claude Sonnet 4.6 | Anthropic | Mid | $3.00 | $15.00 | 1M |
| 31 | GPT-5.5 | OpenAI | Premium | $5.00 | $30.00 | 1.05M |
| 32 | Claude Opus 4.7 | Anthropic | Premium | $5.00 | $25.00 | 1M |
| 33 | Claude Opus 4.8 | Anthropic | Premium | $5.00 | $25.00 | 1M |
| 34 | GPT-5.5 Pro | OpenAI | Premium | $30.00 | $180.00 | 1.05M |

34 active models. The 8 deprecated models (Claude 4 Opus, Sonnet 4, DeepSeek V3, Gemini 2.0 Flash/Lite, Jamba 1.5, Llama 3.1 variants) are excluded.
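
Because every price in the table is quoted per million tokens, turning it into a per-request or monthly figure is a single multiplication. A minimal sketch in Python, assuming a hypothetical `PRICES` dictionary hand-copied from a few rows of the table; the names `PRICES` and `request_cost` and the workload numbers are illustrative and not part of any provider SDK:

```python
# Prices in USD per million tokens, copied from a few rows of the table above.
# PRICES and request_cost are hypothetical names used only for illustration.
PRICES = {
    "Mistral Small 4": (0.10, 0.30),
    "DeepSeek V4 Pro": (0.435, 0.87),
    "GPT-5": (1.25, 10.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-million rates."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Assumed workload: 2,000 input and 500 output tokens, 100,000 requests a month.
    for model in PRICES:
        monthly = request_cost(model, 2_000, 500) * 100_000
        print(f"{model}: ${monthly:,.2f} per month")
```

Swap in your own token counts and request volume before comparing models; the ranking can shift depending on how output-heavy the workload is.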

Key Takeaways

1. Output pricing varies more than 600× across models

The gap between the cheapest output in the table (DeepSeek V4 Flash at $0.28/M) and the most expensive (GPT-5.5 Pro at $180/M) is about 640×. Input pricing varies less: 375×, from $0.08 to $30.00. If your workload is output-heavy (chatbots, content generation, code completion), model choice matters enormously.

2. DeepSeek is the value king

DeepSeek V4 Pro ($0.435/$0.87) delivers 1M context with competitive quality at 11.5× cheaper output than GPT-5. Even DeepSeek V4 Flash ($0.14/$0.28) handles many tasks well. For startups and high-volume applications, DeepSeek is the default budget choice.

3. Google's Flash models are underrated

Gemini 3 Flash ($0.50/$3.00) with 1M context is an excellent mid-range option. Google also has one of the cheapest options overall: Gemini 2.5 Flash-Lite at $0.10/$0.40, the lowest input price among the 1M-context models in the table. For long-document processing, where input tokens dominate the bill, Google's pricing is hard to beat.

4. Premium doesn't mean 10× better

Claude Opus 4.8 ($5/$25) and GPT-5.5 ($5/$30) are the premium reasoning models. But for most production workloads, Claude Sonnet 4.6 ($3/$15) or Gemini 2.5 Pro ($1.25/$10) deliver 90% of the quality at 40-60% of the cost. Reserve premium models for tasks that genuinely need them.
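
The cost side of that comparison can be checked directly from the output prices in the table. A small sketch, with hypothetical dictionary and function names; it compares price only and does not measure quality:

```python
# Output prices in USD per million tokens, copied from the table above.
premium = {"Claude Opus 4.8": 25.00, "GPT-5.5": 30.00}
mid_tier = {"Claude Sonnet 4.6": 15.00, "Gemini 2.5 Pro": 10.00}


def relative_cost(candidate_price: float, reference_price: float) -> float:
    """Return the candidate's price as a fraction of the reference price."""
    return candidate_price / reference_price


for mid_name, mid_price in mid_tier.items():
    for prem_name, prem_price in premium.items():
        share = relative_cost(mid_price, prem_price)
        print(f"{mid_name} vs {prem_name}: {share:.0%} of the output price")
```

The same function works for input prices or for a blended rate if your workload has a known input-to-output ratio.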

5. Context window is a hidden cost factor

A 1M context window (DeepSeek V4, Gemini, Claude Sonnet and Opus) means you can process entire codebases or documents in one API call. Models with 128K context (GPT-4o, Mistral Medium) may require chunking, which raises costs: every chunk re-sends the system prompt and instructions, and overlapping chunks bill the same tokens more than once.
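
How much chunking adds depends on the per-call prompt and on how much consecutive chunks overlap. The sketch below estimates the input tokens billed when a long document is split to fit a smaller window. The function name `chunked_input_tokens`, the fixed prompt overhead, and the fixed overlap are all assumptions made for illustration; real pipelines may also pay for per-chunk output and a final merge step.

```python
import math


def chunked_input_tokens(doc_tokens: int, chunk_size: int,
                         prompt_overhead: int, overlap: int = 0) -> int:
    """Estimate total input tokens billed when a document is processed in chunks.

    Each call re-sends `prompt_overhead` tokens (system prompt, instructions),
    and consecutive chunks share `overlap` tokens.
    """
    if overlap < 0 or overlap >= chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    if doc_tokens <= chunk_size:
        # The whole document fits in one call.
        return doc_tokens + prompt_overhead
    step = chunk_size - overlap  # new document tokens covered by each extra chunk
    n_chunks = 1 + math.ceil((doc_tokens - chunk_size) / step)
    # Document tokens + overlap re-sent between chunks + prompt re-sent per call.
    return doc_tokens + (n_chunks - 1) * overlap + n_chunks * prompt_overhead
```

For example, `chunked_input_tokens(800_000, 100_000, prompt_overhead=2_000, overlap=5_000)` returns 858,000 tokens, versus 802,000 when the same document and prompt fit in a single call.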

Best Model by Use Case

| Use Case | Best Model | Output/M | Why |
|---|---|---|---|
| High-volume chatbot | DeepSeek V4 Flash | $0.28 | Cheapest 1M context model. Great for customer support, FAQ bots |
| Code generation | Claude Sonnet 4.6 | $15.00 | Best code quality/price ratio. 1M context for full codebase analysis |
| Long document analysis | Gemini 2.5 Flash-Lite | $0.40 | 1M context at $0.10 input. Process entire books or legal docs cheaply |
| Complex reasoning | Claude Opus 4.8 | $25.00 | Top-tier reasoning. Worth the premium for research, analysis, planning |
| Content generation at scale | Mistral Large 3 | $1.50 | Good quality at budget price. Great for marketing copy, product descriptions |
| Startups / prototyping | GPT-5 mini | $2.00 | Good enough quality, fast, OpenAI ecosystem compatibility |
| Enterprise RAG pipelines | Gemini 3 Flash | $3.00 | 1M context + budget pricing. Process large document stores efficiently |

Provider Comparison

How do the big providers stack up on pricing?

| Provider | Models | Cheapest output/M | Most expensive output/M | Max context |
|---|---|---|---|---|
| OpenAI | 9 | $0.35 | $180.00 | 1.05M |
| Anthropic | 5 | $5.00 | $25.00 | 1M |
| Google | 7 | $0.30 | $12.00 | 1M |
| DeepSeek | 4 | $0.28 | $0.87 | 1M |
| Mistral | 3 | $0.30 | $7.50 | 262K |
| xAI | 2 | $0.50 | $2.50 | 1M |
| Cohere | 3 | $1.50 | $10.00 | 128K |
| Meta (Together.ai) | 4 | $0.10 | $0.88 | 1M |
| Moonshot | 1 | $4.00 | $4.00 | 256K |
| AI21 | 1 | $8.00 | $8.00 | 256K |

💡 Want to calculate your exact costs? Use our free API cost calculator: enter your token usage and see monthly costs across all 42 models. Or check the live pricing dashboard for real-time data.

How to Save 50-90% on Your AI API Bill

  1. Audit your model usage. Many teams use GPT-5 or Claude Sonnet for tasks where a budget model would work fine. Run a cost audit to see where the money goes.
  2. Route by complexity. Use cheap models (DeepSeek V4 Flash, Mistral Small) for simple tasks. Reserve premium models (Opus, GPT-5.5) for complex reasoning only.
  3. Batch non-urgent work. Process documents, generate reports, and run analysis during off-peak hours with budget models.
  4. Monitor output token usage. Output costs roughly 1.5× to 8× more than input for the models in the table above. Ask for concise responses and set max_tokens limits.
  5. Compare before committing. Use our 232 comparison pages to find the cheapest model that meets your quality needs.
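
Steps 2 and 4 above can be combined into a small routing table: each tier names a model and caps how many tokens it may generate. The sketch below is illustrative only. The `Route` class, the complexity score between 0 and 1, the thresholds, and the token caps are all hypothetical; how you score complexity (a classifier, heuristics, task type) is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str
    output_price: float  # USD per million output tokens, from the table above
    max_tokens: int      # cap on generated tokens for this tier


# Hypothetical routing table: thresholds and caps are illustrative only.
ROUTES = [
    (0.3, Route("DeepSeek V4 Flash", 0.28, 256)),
    (0.7, Route("Claude Sonnet 4.6", 15.00, 1024)),
    (1.0, Route("Claude Opus 4.8", 25.00, 4096)),
]


def pick_route(complexity: float) -> Route:
    """Return the first (cheapest) route whose threshold covers the score."""
    if not 0.0 <= complexity <= 1.0:
        raise ValueError("complexity must be between 0 and 1")
    for threshold, route in ROUTES:
        if complexity <= threshold:
            return route
    return ROUTES[-1][1]


def worst_case_output_cost(route: Route) -> float:
    """Upper bound on output cost (USD) for one request under the token cap."""
    return route.max_tokens * route.output_price / 1_000_000
```

Because the cap bounds the output bill per request, `worst_case_output_cost` gives a simple ceiling to budget against for each tier.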

FAQ

What is the cheapest AI API in 2026?

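Mistral Small 4 ($0.10/$0.30 per million tokens) and Gemini 2.5 Flash-Lite ($0.10/$0.40) are among the cheapest active models; DeepSeek V4 Flash ($0.14/$0.28) has the lowest output price and GPT-oss 20B ($0.08/$0.35) the lowest input price. For the absolute cheapest input, Gemini 2.0 Flash Lite ($0.075/$0.30) still exists but is deprecated and will be shut down soon.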

How much does GPT-5 cost per million tokens?

GPT-5 costs $1.25/M input and $10.00/M output. This puts it in the premium tier for output pricing. DeepSeek V4 Pro ($0.87/M output) offers similar capability at 11× lower cost for many tasks.

Which AI API offers the best value for money?

For budget workloads, DeepSeek V4 Pro ($0.435/$0.87) is hard to beat: 1M context, competitive quality, and about 11× cheaper than GPT-5 on output. For quality-sensitive work, Claude Sonnet 4.6 ($3/$15) offers the best quality-to-price ratio among mid-tier models.

How much can I save by switching from GPT-5?

Switching to DeepSeek V4 Pro saves 91% on output costs. Switching to Gemini 3 Flash saves 70%, and switching to GPT-5 mini saves 80% on output for simpler tasks. Claude Sonnet 4.6 is not a saving: at $15.00/M output it costs 50% more than GPT-5's $10.00/M.
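
These percentages follow from the output prices in the ranking table. A minimal sketch of the arithmetic, with a hypothetical helper named `output_savings`; it compares list prices only and assumes the replacement model produces the same number of output tokens:

```python
def output_savings(baseline_price: float, candidate_price: float) -> float:
    """Percent saved on output cost; a negative value means the candidate costs more."""
    return (baseline_price - candidate_price) / baseline_price * 100


GPT5_OUTPUT = 10.00  # USD per million output tokens, from the table above

# Output prices (USD per million tokens) for a few alternatives in the table.
candidates = {
    "DeepSeek V4 Pro": 0.87,
    "Gemini 3 Flash": 3.00,
    "GPT-5 mini": 2.00,
}

for model, price in candidates.items():
    print(f"{model}: {output_savings(GPT5_OUTPUT, price):.0f}% saved on output")
```

Input prices differ too, so the saving on a full bill depends on your input-to-output token ratio.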

Are expensive models worth the premium?

For complex reasoning, research, and critical decision-making, yes: premium models (Opus 4.8, GPT-5.5) are measurably better. For 80% of production tasks (summarization, extraction, simple Q&A, content generation), mid-tier and budget models are sufficient.

Last updated: Jun 21, 2026. Prices verified against provider documentation.