LLM Cost Calculator
Runs in browser
Compare LLM API pricing for GPT-4, Claude, Gemini, and more.
Estimate monthly API spend across GPT-4o, Claude, Gemini, and more. Enter requests and average token usage to compare input, output, and total cost with March 2026 pricing.
| Model | Input cost | Output cost | Total/month | Context |
|---|---|---|---|---|
| Llama 3.3 (local) | — | — | Free — self-hosted | Self-hosted |
| Gemini Flash | $0.037 | $0.060 | $0.098 | 1M |
| GPT-4o mini | $0.075 | $0.120 | $0.195 | 128K |
| GPT-3.5 turbo | $0.250 | $0.300 | $0.550 | 16K |
| Claude Haiku 4.5 | $0.400 | $0.800 | $1.20 | 200K |
| GPT-4o | $1.25 | $2.00 | $3.25 | 128K |
| Gemini 3.1 Pro | $1.00 | $2.40 | $3.40 | 1M |
| Claude Sonnet 4.6 | $1.50 | $3.00 | $4.50 | 200K |
| Claude Opus 4.6 | $7.50 | $15.00 | $22.50 | 200K |
Prices last updated: March 2026. Always verify current pricing at each provider's website.
🔒 Runs in your browser · No uploads · Your data never leaves your device
How to use
Enter usage
Set monthly requests plus average input and output tokens per request.
Compare models
Review the table for input cost, output cost, total per month, and context window.
Sort
Sort by total cost, context size, or relative speed to match your priorities.
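As a rough illustration of the sort step, the sketch below orders a few rows copied from the table by total cost or by context size. The `Row` type, the `sort_rows` helper, and the conversion of "128K", "200K", and "1M" into round token counts are assumptions made for this example, not the calculator's actual code. The self-hosted row is left out because the table gives it no numeric context window.

```python
from typing import NamedTuple


class Row(NamedTuple):
    model: str
    total_usd: float      # "Total/month" column from the table
    context_tokens: int   # "Context" column, approximated as a round number


# A subset of table rows; values copied from the table above.
ROWS = [
    Row("GPT-4o", 3.25, 128_000),
    Row("Claude Opus 4.6", 22.50, 200_000),
    Row("Gemini Flash", 0.098, 1_000_000),
    Row("Claude Haiku 4.5", 1.20, 200_000),
    Row("GPT-4o mini", 0.195, 128_000),
]


def sort_rows(rows, by="total"):
    """Return rows cheapest-first ("total") or largest-context-first ("context")."""
    if by == "total":
        return sorted(rows, key=lambda r: r.total_usd)
    if by == "context":
        return sorted(rows, key=lambda r: r.context_tokens, reverse=True)
    raise ValueError(f"unknown sort key: {by}")


for row in sort_rows(ROWS, by="total"):
    print(f"{row.model:<18} ${row.total_usd:>6.3f}  {row.context_tokens:>9,}")
```

Sorting by relative speed would follow the same pattern, but the table shown here has no speed column to sort on.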
Common use cases
- Budgeting AI features before launch — Estimate monthly API spend at your expected request volume before committing to a production rollout.
- Comparing models for cost vs capability — Find the cheapest model that meets your token requirements by sorting the comparison table by total cost.
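Both use cases can be sketched in a few lines. The example below assumes the table totals correspond to 1,000 requests per month, that cost scales linearly with request volume (same tokens per request, no volume discounts or caching), and that context sizes can be treated as round token counts. `projected_total` and `cheapest_fitting` are hypothetical helper names, not part of the tool.

```python
# (total USD at 1,000 requests/month, approximate context window in tokens),
# copied from the table above.
TABLE = {
    "Gemini Flash": (0.098, 1_000_000),
    "GPT-4o mini": (0.195, 128_000),
    "Claude Haiku 4.5": (1.20, 200_000),
    "Claude Sonnet 4.6": (4.50, 200_000),
}
BASELINE_REQUESTS = 1_000  # assumed request volume behind the table totals


def projected_total(total_at_baseline, monthly_requests):
    """Scale a table total linearly to a different monthly request volume."""
    return total_at_baseline * monthly_requests / BASELINE_REQUESTS


def cheapest_fitting(min_context, monthly_requests):
    """Return (model, projected USD) for the cheapest model whose context
    window is at least min_context tokens, or None if no model qualifies."""
    candidates = {
        model: projected_total(total, monthly_requests)
        for model, (total, context) in TABLE.items()
        if context >= min_context
    }
    if not candidates:
        return None
    best = min(candidates, key=candidates.get)
    return best, candidates[best]


# Budget check: 50,000 requests/month with prompts needing a 150K-token window.
print(cheapest_fitting(min_context=150_000, monthly_requests=50_000))
```

Treat the projected figure as a planning estimate only; real bills depend on current provider pricing and actual token usage.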
Examples
Default scenario
1,000 requests per month, with 500 input and 200 output tokens per request.
Output: see the highlighted cheapest row (often self-hosted or Flash-class models).
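The arithmetic behind the default scenario can be sketched as follows. The per-million-token rates used here ($0.15 input, $0.60 output) are assumptions back-calculated from the GPT-4o mini row of the table, not rates quoted by the tool or the provider, and `monthly_cost` is a hypothetical helper.

```python
def monthly_cost(requests, in_tokens, out_tokens, in_rate_per_m, out_rate_per_m):
    """Return (input_cost, output_cost, total) in USD for one month.

    Rates are USD per one million tokens; token counts are per request.
    """
    input_cost = requests * in_tokens / 1_000_000 * in_rate_per_m
    output_cost = requests * out_tokens / 1_000_000 * out_rate_per_m
    return input_cost, output_cost, input_cost + output_cost


# Default scenario: 1,000 requests, 500 input / 200 output tokens each.
# Rates are assumed values inferred from the GPT-4o mini table row.
inp, out, total = monthly_cost(1_000, 500, 200, 0.15, 0.60)
print(f"input ${inp:.3f}  output ${out:.3f}  total ${total:.3f}")
```

With these assumed rates the result matches the GPT-4o mini row: $0.075 input, $0.120 output, $0.195 total.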
Frequently asked questions
- Are these prices exact?
- No. Rates reflect March 2026 reference pricing, and providers change prices, so verify on their sites before budgeting.
- Why is local Llama $0?
- Self-hosted inference has no per-token API fee in this calculator; you still run your own hardware.
Key concepts
- Input tokens
- Tokens in the prompt sent to the model — priced separately from output tokens in most LLM APIs.
- Output tokens
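- Tokens in the model's response, typically priced several times higher per token than input tokens (about 3–6× for the rates implied by the table above at 500 input / 200 output tokens per request).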
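To see the input/output price gap in the table's own numbers, the sketch below backs out per-million-token rates from three rows and compares them. It assumes the table was computed for 1,000 requests at 500 input and 200 output tokens each; `output_to_input_ratio` is a hypothetical helper.

```python
# Monthly token volumes in millions, assuming the default scenario.
INPUT_TOKENS_M = 1_000 * 500 / 1_000_000    # 0.5 million input tokens
OUTPUT_TOKENS_M = 1_000 * 200 / 1_000_000   # 0.2 million output tokens

# (input cost, output cost) in USD, copied from the table above.
ROWS = {
    "GPT-3.5 turbo": (0.250, 0.300),
    "GPT-4o": (1.25, 2.00),
    "Claude Opus 4.6": (7.50, 15.00),
}


def output_to_input_ratio(input_cost, output_cost):
    """Ratio of the implied per-token output rate to the per-token input rate."""
    in_rate = input_cost / INPUT_TOKENS_M     # USD per million input tokens
    out_rate = output_cost / OUTPUT_TOKENS_M  # USD per million output tokens
    return out_rate / in_rate


for model, (in_cost, out_cost) in ROWS.items():
    print(f"{model}: output tokens cost "
          f"{output_to_input_ratio(in_cost, out_cost):.1f}x input tokens")
```

For these three rows the implied ratios come out between 3 and 5, so output-heavy workloads cost noticeably more than the raw token count suggests.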