LLM Cost Calculator

Runs in browser

Compare LLM API pricing for GPT-4o, Claude, Gemini, and more.

Estimate monthly API spend across GPT-4o, Claude, Gemini, and more. Enter requests and average token usage to compare input, output, and total cost with March 2026 pricing.

Model               Input cost   Output cost   Total/month   Context
Llama 3.3 (local)   Free (self-hosted)                       Self-hosted
Gemini Flash        $0.037       $0.060        $0.098        1M
GPT-4o mini         $0.075       $0.120        $0.195        128K
GPT-3.5 turbo       $0.250       $0.300        $0.550        16K
Claude Haiku 4.5    $0.400       $0.800        $1.20         200K
GPT-4o              $1.25        $2.00         $3.25         128K
Gemini 3.1 Pro      $1.00        $2.40         $3.40         1M
Claude Sonnet 4.6   $1.50        $3.00         $4.50         200K
Claude Opus 4.6     $7.50        $15.00        $22.50        200K

Prices last updated: March 2026. Always verify current pricing at each provider's website.

🔒 Runs in your browser · No uploads · Your data never leaves your device

How to use

  1. Enter usage

    Set monthly requests plus average input and output tokens per request.

  2. Compare models

    Review the table for input cost, output cost, total per month, and context window.

  3. Sort

    Sort by total cost, context size, or relative speed to match your priorities.
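
The calculation behind these steps can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the function name `monthly_cost`, the per-million-token rate convention, and the workload and rates in the usage example are all assumptions made for the example.

```python
def monthly_cost(requests, input_tokens, output_tokens,
                 input_rate_per_m, output_rate_per_m):
    """Return (input_cost, output_cost, total_cost) in USD for one month.

    Rates are in USD per million tokens (an assumed convention).
    """
    input_cost = requests * input_tokens / 1_000_000 * input_rate_per_m
    output_cost = requests * output_tokens / 1_000_000 * output_rate_per_m
    return input_cost, output_cost, input_cost + output_cost


# Hypothetical workload and rates, chosen only for illustration.
input_cost, output_cost, total_cost = monthly_cost(
    requests=10_000,
    input_tokens=1_000,
    output_tokens=300,
    input_rate_per_m=1.00,
    output_rate_per_m=4.00,
)
print(f"input ${input_cost:.2f}, output ${output_cost:.2f}, total ${total_cost:.2f}")
# prints: input $10.00, output $12.00, total $22.00
```

Keeping input and output cost separate mirrors the table's columns, so it is easy to see which side of the request dominates the bill.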

Common use cases

  • Budgeting AI features before launch: Estimate monthly API spend at your expected request volume before committing to a production rollout.
  • Comparing models for cost vs. capability: Find the cheapest model that meets your token requirements by sorting the comparison table by total cost.
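
The second use case amounts to a filter followed by a sort. The sketch below uses the Total/month and Context figures from the comparison table; the helper name `cheapest_with_context`, the 150,000-token requirement, and the reading of "128K" as 128,000 tokens are assumptions for illustration. The local Llama row is left out because the table lists no numeric context window for it.

```python
# (model, total_per_month_usd, nominal_context_tokens), copied from the table.
# Context sizes are nominal: "1M" is taken as 1,000,000 and "128K" as 128,000.
MODELS = [
    ("Gemini Flash", 0.098, 1_000_000),
    ("GPT-4o mini", 0.195, 128_000),
    ("GPT-3.5 turbo", 0.550, 16_000),
    ("Claude Haiku 4.5", 1.20, 200_000),
    ("GPT-4o", 3.25, 128_000),
    ("Gemini 3.1 Pro", 3.40, 1_000_000),
    ("Claude Sonnet 4.6", 4.50, 200_000),
    ("Claude Opus 4.6", 22.50, 200_000),
]


def cheapest_with_context(models, min_context):
    """Keep models whose context window is large enough, cheapest first."""
    candidates = [m for m in models if m[2] >= min_context]
    return sorted(candidates, key=lambda m: m[1])


# Hypothetical requirement: prompts of up to 150,000 tokens.
ranked = cheapest_with_context(MODELS, 150_000)
for name, total, context in ranked:
    print(f"{name}: ${total:.2f}/month, {context:,} tokens")
```

With this requirement the 128K and 16K models drop out, and the remaining rows keep their cost order.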

Examples

  • Default scenario

    1,000 requests per month, with 500 input tokens and 200 output tokens per request.

    Output
    See highlighted cheapest row (often self-hosted or Flash-class models).
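
    As a worked check of the default scenario, the sketch below back-computes the per-million-token rates implied by one table row. It assumes the comparison table was generated with these default inputs, which the page does not state explicitly; the helper name `implied_rate_per_million` is hypothetical.

```python
REQUESTS = 1_000
INPUT_TOKENS = 500
OUTPUT_TOKENS = 200

total_input_tokens = REQUESTS * INPUT_TOKENS    # 500,000 tokens per month
total_output_tokens = REQUESTS * OUTPUT_TOKENS  # 200,000 tokens per month


def implied_rate_per_million(monthly_cost_usd, monthly_tokens):
    """USD per million tokens implied by a monthly cost and token volume."""
    return monthly_cost_usd / monthly_tokens * 1_000_000


# GPT-4o row of the table: $1.25 input cost, $2.00 output cost.
gpt4o_input_rate = implied_rate_per_million(1.25, total_input_tokens)
gpt4o_output_rate = implied_rate_per_million(2.00, total_output_tokens)
print(f"implied input rate:  ${gpt4o_input_rate:.2f} per million tokens")
print(f"implied output rate: ${gpt4o_output_rate:.2f} per million tokens")
```

    If the assumption does not hold, the implied rates will not match the provider's published prices, so treat them as a consistency check only.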

Frequently asked questions

Are these prices exact?
Not necessarily. Rates reflect March 2026 reference pricing, and providers change prices, so verify on their sites before budgeting.
Why is local Llama $0?
Self-hosted inference has no per-token API fee in this calculator; you still run your own hardware.

Key concepts

Input tokens
Tokens in the prompt sent to the model — priced separately from output tokens in most LLM APIs.
Output tokens
Tokens in the model's response, typically priced several times higher than input tokens (often 3–5×).
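
Because the two token types are priced separately, the split between them matters as much as the total. The sketch below compares two requests with the same total token count; the rates ($1.00 and $4.00 per million tokens) and the function name `request_cost` are hypothetical.

```python
# Hypothetical rates in USD per million tokens; output is 4x input here.
INPUT_RATE_PER_M = 1.00
OUTPUT_RATE_PER_M = 4.00


def request_cost(input_tokens, output_tokens):
    """Cost in USD of a single request under the hypothetical rates above."""
    return (input_tokens * INPUT_RATE_PER_M
            + output_tokens * OUTPUT_RATE_PER_M) / 1_000_000


# Both requests use 1,000 tokens in total, split differently.
input_heavy = request_cost(input_tokens=900, output_tokens=100)
output_heavy = request_cost(input_tokens=100, output_tokens=900)
print(f"input-heavy:  ${input_heavy:.4f} per request")
print(f"output-heavy: ${output_heavy:.4f} per request")
```

Under these assumed rates the output-heavy request costs almost three times as much, which is why the calculator asks for input and output tokens separately.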
