Cheapest AI Models in 2026

The lowest-cost production LLMs from each provider, compared by input and output price per million tokens.

Cost per million tokens varies by roughly 800× on input and 1,200× on output across 2026 LLMs: from Cohere Command R7B at $0.0375 input / $0.15 output to GPT-5.5 Pro at $30 input / $180 output. (The gap reaches 4,800× only when the cheapest input price is set against the most expensive output price.) For high-volume workloads (classification, tagging, routing, bulk extraction), choosing the right budget model can cut bills by as much as 100× without measurable quality loss.

The calculator below lets you price your real usage against the cheapest models from each provider. Most teams find that 70–90% of their requests can run on a budget tier with no downstream impact.
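As a rough illustration of what routing most requests to a budget tier does to a bill, the sketch below blends two price points quoted on this page: GPT-5 nano as the budget tier and GPT-5.5 Pro as the premium tier. The monthly token volumes and the 80% budget share are made-up assumptions, and the function names are hypothetical, not part of any provider SDK.

```python
def usage_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in USD; prices are USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def blended_cost(input_tokens, output_tokens, budget_share, budget, premium):
    """Cost when `budget_share` of the traffic (0.0 to 1.0) runs on the budget
    tier and the rest on the premium tier. Assumes token volume splits in the
    same proportion as requests, which is a simplification."""
    budget_part = usage_cost(input_tokens, output_tokens, *budget) * budget_share
    premium_part = usage_cost(input_tokens, output_tokens, *premium) * (1 - budget_share)
    return budget_part + premium_part


# (input price, output price) per 1M tokens, as quoted on this page.
GPT_5_NANO = (0.05, 0.40)
GPT_5_5_PRO = (30.00, 180.00)

# Hypothetical monthly workload: 1B input tokens, 100M output tokens.
MONTHLY_IN, MONTHLY_OUT = 1_000_000_000, 100_000_000

all_premium = blended_cost(MONTHLY_IN, MONTHLY_OUT, 0.0, GPT_5_NANO, GPT_5_5_PRO)
mostly_budget = blended_cost(MONTHLY_IN, MONTHLY_OUT, 0.8, GPT_5_NANO, GPT_5_5_PRO)
print(f"all premium: ${all_premium:,.2f}  80% budget: ${mostly_budget:,.2f}")
```

In a blend like this the remaining premium traffic dominates the total, so the share you can safely move to the budget tier matters more than which budget model you pick.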

Models included

  • GPT-5 mini (OpenAI) — $0.25 input / $2.00 output per 1M tokens · 400K context window
  • GPT-5 nano (OpenAI) — $0.05 input / $0.40 output per 1M tokens · 128K context window
  • Claude Haiku 4.5 (Anthropic) — $1.00 input / $5.00 output per 1M tokens · 200K context window
  • Gemini 3 Flash (Google) — $0.50 input / $3.00 output per 1M tokens · 1M context window
  • Gemini 3.1 Flash-Lite (Google) — $0.25 input / $1.50 output per 1M tokens · 1M context window
  • Command R7B (Cohere) — $0.0375 input / $0.15 output per 1M tokens · 128K context window
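To compare these models on your own numbers without the calculator, a minimal sketch like the one below works. The prices are copied from the list above; the workload (500M input and 50M output tokens per month) is an invented example, and `PRICES`, `monthly_cost`, and `rank_models` are illustrative names rather than any provider's API.

```python
# (input price, output price) in USD per 1M tokens, copied from the list above.
PRICES = {
    "GPT-5 mini": (0.25, 2.00),
    "GPT-5 nano": (0.05, 0.40),
    "Claude Haiku 4.5": (1.00, 5.00),
    "Gemini 3 Flash": (0.50, 3.00),
    "Gemini 3.1 Flash-Lite": (0.25, 1.50),
    "Command R7B": (0.0375, 0.15),
}


def monthly_cost(model, input_tokens, output_tokens):
    """USD cost of a given token volume on one model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_models(input_tokens, output_tokens):
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = [(m, monthly_cost(m, input_tokens, output_tokens)) for m in PRICES]
    return sorted(costs, key=lambda pair: pair[1])


# Hypothetical workload: 500M input tokens and 50M output tokens per month.
for model, cost in rank_models(500_000_000, 50_000_000):
    print(f"{model:<24} ${cost:>8,.2f}")
```

Because output tokens cost several times more than input tokens on every model listed, the ranking can shift for output-heavy workloads, so it is worth running this with your real input/output ratio.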

Frequently asked questions

What is the cheapest AI model in 2026?

Cohere Command R7B at $0.0375 per million input tokens and $0.15 per million output tokens — the lowest-priced production-grade LLM in 2026.

What is the cheapest OpenAI model?

GPT-5 nano at $0.05 input / $0.40 output per million tokens. Best used for classification, routing, tagging, and simple extraction.

What is the cheapest Anthropic model?

Claude Haiku 4.5 at $1 input / $5 output per million tokens. Anthropic does not currently publish a sub-dollar tier.

What is the cheapest Google model?

Gemini 3.1 Flash-Lite at $0.25 input / $1.50 output per million tokens, with a free tier still available.
