The lowest-cost production LLMs ranked by per-million-token input + output pricing.
Cost per million tokens varies by roughly 800× on input and 1,200× on output across 2026 LLMs, from Cohere Command R7B at $0.0375 input / $0.15 output to GPT-5.5 Pro at $30 input / $180 output. The gap reaches nearly 4,800× only when the cheapest input rate is set against the most expensive output rate. For high-volume workloads (classification, tagging, routing, bulk extraction), choosing the right budget model can cut bills by as much as 100×, often without measurable quality loss.
The calculator below lets you put your real usage against the cheapest models from each provider. Most teams find that 70–90% of their requests can run on a budget tier with no downstream impact.
Cohere Command R7B, at $0.0375 per million input tokens and $0.15 per million output tokens, is the lowest-priced production-grade LLM in 2026.
GPT-5 nano at $0.05 input / $0.40 output per million tokens. Best used for classification, routing, tagging, and simple extraction.
Claude Haiku 4.5 at $1 input / $5 output per million tokens. Anthropic does not currently publish a sub-dollar tier.
Gemini 3.1 Flash-Lite at $0.25 input / $1.50 output per million tokens, with a free tier still available.
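To make the comparison concrete, here is a minimal Python sketch of the arithmetic behind a per-model cost estimate. The prices are the ones quoted on this page. The function names (`monthly_cost`, `rank_by_cost`) and the example workload of 500M input and 50M output tokens per month are hypothetical, chosen only for illustration. The sketch ignores free tiers, caching discounts, and batch pricing, so treat the results as rough upper bounds rather than a quote.

```python
# Prices in USD per million tokens (input, output), as quoted on this page.
PRICES = {
    "Cohere Command R7B": (0.0375, 0.15),
    "GPT-5 nano": (0.05, 0.40),
    "Gemini 3.1 Flash-Lite": (0.25, 1.50),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-5.5 Pro": (30.00, 180.00),
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token volumes."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def rank_by_cost(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {m: monthly_cost(m, input_tokens, output_tokens) for m in PRICES}
    return sorted(costs.items(), key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical workload: 500M input tokens and 50M output tokens per month.
    for model, cost in rank_by_cost(500_000_000, 50_000_000):
        print(f"{model:<24} ${cost:>12,.2f}")
```

For this assumed workload the estimate comes to about $26.25 per month on Command R7B versus $24,000 on GPT-5.5 Pro. Because output tokens cost several times more than input tokens on every model listed, the ranking can shift for output-heavy workloads, so it is worth running the numbers with your own input/output split.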