Cheapest AI API in June 2026: All 34 Models Ranked by Cost

Updated June 2026 · 8 min read · 34 models compared

Looking for the cheapest AI API? We track pricing across all major providers and update monthly. Here's the definitive ranking of every AI model by cost, with real-world scenarios so you can pick the right model for your budget.

TL;DR: The cheapest AI API in June 2026 is Gemini 2.0 Flash Lite at $0.075/$0.30 per 1M tokens. For quality-sensitive tasks, DeepSeek V4 Pro ($0.44/$0.87) offers the best price-to-performance ratio.

All 34 AI Models Ranked by Input Cost

Try It Live — Instant Cost Calculator

See exactly what this model costs for your workload. No signup needed.

Prices are per 1M tokens. All data verified May 29, 2026. Full pricing table →

# Model Provider Input Output Context
1 Gemini 2.0 Flash Lite Google $0.075 $0.30 1M
2 GPT-oss 20B OpenAI $0.08 $0.35 128K
3 Gemini 2.0 Flash Google $0.10 $0.40 1M
4 Llama 3.1 8B Meta (Together.ai) $0.10 $0.10 128K
5 Llama 4 Scout Meta (Together.ai) $0.11 $0.34 10M
6 DeepSeek V4 Flash DeepSeek $0.14 $0.28 1M
7 GPT-4o mini OpenAI $0.15 $0.60 128K
8 GPT-oss 120B OpenAI $0.15 $0.60 128K
9 Mistral Small 4 Mistral $0.15 $0.60 128K
10 Llama 4 Maverick Meta (Together.ai) $0.20 $0.60 10M
11 GPT-5 mini OpenAI $0.25 $2.00 272K
12 DeepSeek V3 DeepSeek $0.27 $1.10 128K
13 Mistral Large 3 Mistral $0.50 $1.50 128K
14 Command R Cohere $0.50 $1.50 128K
15 DeepSeek V4 Pro DeepSeek $0.44 $0.87 1M
16 Kimi K2.6 Moonshot $0.90 $3.75 256K
17 Claude Haiku 4.5 Anthropic $1.00 $5.00 200K
18 Gemini 2.5 Pro Google $1.25 $10.00 1M
19 GPT-5 OpenAI $1.25 $10.00 272K
20 GPT-5.3 Codex OpenAI $1.75 $14.00 400K
21 Gemini 3.1 Pro Google $2.00 $12.00 1M
22 Jamba 1.5 Large AI21 $2.00 $8.00 256K
23 GPT-4o OpenAI $2.50 $10.00 128K
24 Command R+ Cohere $2.50 $10.00 128K
25 Claude Sonnet 4 Anthropic $3.00 $15.00 200K
26 Claude Sonnet 4.6 Anthropic $3.00 $15.00 1M
27 Grok Build 0.1 xAI $0.30 $0.50 256K
28 GPT-5.5 OpenAI $5.00 $30.00 1M
29 Claude Opus 4.7 Anthropic $5.00 $25.00 1M
30 Claude Opus 4.8 Anthropic $5.00 $25.00 1M
31 Llama 3.1 70B Meta (Together.ai) $0.88 $0.88 128K
32 Claude 4 Opus Anthropic $15.00 $75.00 200K
33 Grok 4.3 xAI $1.25 $2.50 1M
34 GPT-5.5 Pro OpenAI $30.00 $180.00 1M

Real Cost Scenarios

Let's see what these prices actually mean for real workloads. We'll model three common use cases.

Scenario 1: Simple Chatbot (10K requests/day)

A customer support chatbot with ~500 input tokens and ~300 output tokens per request.

Monthly Cost Comparison

Gemini 2.0 Flash Lite$3.78/mo
DeepSeek V4 Flash$6.72/mo
GPT-4o mini$9.90/mo
DeepSeek V4 Pro$14.49/mo
Claude Haiku 4.5$60.00/mo
GPT-5$127.50/mo
Claude Sonnet 4.6$180.00/mo
Cheapest vs Most Expensive48x difference

Scenario 2: Code Assistant (5K requests/day)

A coding assistant with ~2,000 input tokens and ~4,000 output tokens per request (longer outputs).

Monthly Cost Comparison

Gemini 2.0 Flash Lite$22.50/mo
DeepSeek V4 Flash$21.00/mo
GPT-4o mini$54.00/mo
DeepSeek V4 Pro$68.10/mo
Claude Haiku 4.5$360.00/mo
GPT-5$675.00/mo
Claude Sonnet 4.6$945.00/mo
Cheapest vs Most Expensive45x difference

Scenario 3: RAG Pipeline (1K requests/day)

A RAG system with ~4,000 input tokens (context + query) and ~1,000 output tokens.

Monthly Cost Comparison

Gemini 2.0 Flash Lite$5.40/mo
DeepSeek V4 Flash$5.46/mo
GPT-4o mini$12.60/mo
DeepSeek V4 Pro$17.46/mo
Claude Haiku 4.5$135.00/mo
GPT-5$180.00/mo
Cheapest vs Most Expensive33x difference

Best Budget Models by Use Case

Best for Simple Tasks (Classification, Extraction, Formatting)

Gemini 2.0 Flash Lite ($0.075/$0.30) or Llama 3.1 8B ($0.10/$0.10). These models handle straightforward tasks with minimal quality loss at 90%+ savings vs premium models.

Best for Code Generation

DeepSeek V4 Pro ($0.44/$0.87). Best price-to-quality ratio for coding. 1M context window handles large codebases. For simpler completions, Mistral Small 4 ($0.15/$0.60) is a strong budget pick.

Best for Long Context

Llama 4 Scout ($0.11/$0.34) with 10M context or Gemini 2.0 Flash ($0.10/$0.40) with 1M context. Both are extremely affordable for long-document processing.

Best for RAG Pipelines

DeepSeek V4 Flash ($0.14/$0.28). 1M context window, very low output cost. For RAG with shorter contexts, GPT-4o mini ($0.15/$0.60) is also competitive.

Best Quality-per-Dollar (Mid-Tier)

DeepSeek V4 Pro ($0.44/$0.87) or Mistral Large 3 ($0.50/$1.50). Both offer strong reasoning at a fraction of GPT-5/Claude Sonnet pricing.

How to Choose the Right Model

The cheapest model isn't always the best choice. Here's a decision framework:

  1. Start with the task complexity. Simple tasks (classification, extraction, formatting) → budget models. Complex reasoning, code generation, creative writing → mid-tier or premium.
  2. Consider output length. If your outputs are long (code, analysis), output pricing matters more. DeepSeek and Gemini have the lowest output costs.
  3. Check context window needs. If you're processing long documents, you need 1M+ context. Gemini Flash and Llama 4 Scout are the cheapest with large contexts.
  4. Test before committing. Use our Prompt Cost Calculator to model your exact workload across all 34 models.

Calculate Your Exact Costs

Paste your actual prompt and see costs across all 34 models. Find the cheapest option for your specific workload.

Open Calculator — Free

Key Takeaways

Methodology

All prices sourced directly from provider pricing pages, verified May 29, 2026. Prices are per 1M tokens. We track 34 models across 10 providers: OpenAI, Anthropic, Google, DeepSeek, Mistral, Cohere, Meta (Together.ai), Moonshot, xAI, and AI21. Data is updated monthly. See pricing changelog →

Share on X Share on LinkedIn Share on Reddit

Compare models: APIpulse Model Comparison — side-by-side pricing for 34 models across 10 providers. Free tool.

Try it free: APIpulse Cost Calculator — estimate your monthly spend across 34 models and 10 providers in 30 seconds.