Cheapest AI Models in 2026

The lowest-cost production LLMs, compared by input and output price per million tokens.

Cost per million tokens varies enormously across 2026 LLMs: Gemini 2.5 Flash-Lite costs $0.10 input / $0.40 output, while GPT-5.5 Pro costs $30 input / $180 output, a 300× gap on input and a 450× gap on output. For high-volume workloads (classification, tagging, routing, bulk extraction), choosing the right budget model can cut bills by 100×, often without measurable quality loss.

The calculator below lets you compare your real usage across the cheapest models from each provider. Most teams find that 70–90% of their requests can run on a budget tier with no downstream impact.
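To make the effect of moving part of your traffic to a budget tier concrete, here is a minimal Python sketch. It uses the two price points quoted above (Gemini 2.5 Flash-Lite and GPT-5.5 Pro); the function names, the monthly token volumes, and the assumption that budget and premium requests have the same average token counts are all illustrative, not taken from any provider's tooling.

```python
# USD per 1M tokens as (input, output), taken from the figures quoted above.
BUDGET = (0.10, 0.40)       # Gemini 2.5 Flash-Lite
PREMIUM = (30.00, 180.00)   # GPT-5.5 Pro


def cost(prices, input_tokens, output_tokens):
    """List-price cost in USD (ignores caching, batch, or volume discounts)."""
    input_price, output_price = prices
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def blended_cost(input_tokens, output_tokens, budget_share):
    """Cost when `budget_share` of all tokens runs on BUDGET, the rest on PREMIUM.

    Assumption: budget and premium requests have the same average size,
    so the token split matches the request split.
    """
    if not 0.0 <= budget_share <= 1.0:
        raise ValueError("budget_share must be between 0 and 1")
    budget_part = cost(BUDGET, input_tokens, output_tokens) * budget_share
    premium_part = cost(PREMIUM, input_tokens, output_tokens) * (1.0 - budget_share)
    return budget_part + premium_part


if __name__ == "__main__":
    # Hypothetical monthly volume: 100M input tokens, 20M output tokens.
    monthly_in, monthly_out = 100_000_000, 20_000_000
    baseline = blended_cost(monthly_in, monthly_out, 0.0)
    for share in (0.7, 0.8, 0.9):
        total = blended_cost(monthly_in, monthly_out, share)
        saved = 1 - total / baseline
        print(f"{share:.0%} on budget tier: ${total:,.2f} ({saved:.1%} saved)")
```

Under these assumptions the traffic that stays on the premium model dominates the bill, so the saving tracks the share of requests you move almost one for one.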

Models included

GPT-5.4 nano (OpenAI) — $0.20 input / $1.25 output per 1M tokens · 270K context window
GPT-4.1 nano (OpenAI) — $0.10 input / $0.40 output per 1M tokens · 1M context window
Gemini 3.1 Flash-Lite (Google) — $0.25 input / $1.50 output per 1M tokens · 1M context window
Gemini 2.5 Flash-Lite (Google) — $0.10 input / $0.40 output per 1M tokens · 1M context window
DeepSeek V4 Flash (DeepSeek) — $0.14 input / $0.28 output per 1M tokens · 1M context window
Mistral Small 3.2 (Mistral) — $0.10 input / $0.30 output per 1M tokens · 128K context window
Grok 4.1 Fast (xAI) — $0.20 input / $0.50 output per 1M tokens · 2M context window
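Which of these models is cheapest depends on your mix of input and output tokens. The following Python sketch ranks the listed models for a given workload using the list prices above. The `workload_cost` and `rank_models` helpers and the sample volume are hypothetical, and the calculation ignores caching, batch discounts, and free tiers.

```python
# USD per 1M tokens as (input, output), copied from the list above.
PRICES = {
    "GPT-5.4 nano": (0.20, 1.25),
    "GPT-4.1 nano": (0.10, 0.40),
    "Gemini 3.1 Flash-Lite": (0.25, 1.50),
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
    "DeepSeek V4 Flash": (0.14, 0.28),
    "Mistral Small 3.2": (0.10, 0.30),
    "Grok 4.1 Fast": (0.20, 0.50),
}


def workload_cost(model, input_tokens, output_tokens):
    """Return the list-price cost in USD for one model and one workload."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_models(input_tokens, output_tokens):
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {m: workload_cost(m, input_tokens, output_tokens) for m in PRICES}
    return sorted(costs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical workload: 500M input tokens and 50M output tokens per month.
    for model, cost in rank_models(500_000_000, 50_000_000):
        print(f"{model:<24} ${cost:,.2f}")
```

For an input-heavy workload like this one, the input price drives the ranking. An output-heavy workload would favor the models with the lowest output price instead.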

Frequently asked questions

What is the cheapest AI model in 2026?

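Gemini 2.5 Flash-Lite, at $0.10 per million input tokens and $0.40 per million output tokens, is among the lowest-priced production LLMs in 2026. GPT-4.1 nano matches that price, while Mistral Small 3.2 ($0.10 / $0.30) and DeepSeek V4 Flash ($0.14 / $0.28) undercut it on output.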

What is the cheapest OpenAI model?

GPT-4.1 nano at $0.10 input / $0.40 output per million tokens. The newer GPT-5.4 nano costs $0.20 input / $1.25 output. Both are best for classification, routing, tagging, and simple extraction.

What is the cheapest Anthropic model?

Claude Haiku 4.5 at $1 input / $5 output per million tokens. Anthropic does not currently publish a sub-dollar tier.

What is the cheapest Google model?

Gemini 2.5 Flash-Lite at $0.10 input / $0.40 output per million tokens. The newer Gemini 3.1 Flash-Lite costs $0.25 input / $1.50 output per million tokens, with a free tier still available.