Anthropic Claude API Pricing: Complete Cost Guide 2026
Exact pricing for every Claude model (Haiku, Sonnet, and Opus), with monthly cost estimates, a comparison to GPT-5 and Gemini, and guidance on which tier to choose.
Claude model tiers and their prices
Anthropic structures Claude into three tiers, each with a different price and capability level. Claude Haiku 4.5 is the entry tier at $1.00 per million input tokens and $5.00 per million output tokens. It is fast and cost-effective for classification, extraction, short answers, and high-volume routing tasks.
Claude Sonnet 4.6 is the mid-tier at $3.00 per million input tokens and $15.00 per million output tokens. Sonnet supports a one-million token context window and offers a strong balance of reasoning quality and response speed. It is a common choice for coding assistants, RAG pipelines, and document workflows. Claude Opus 4.8 is the flagship at $5.00 input and $25.00 output per million tokens, suited for complex multi-step reasoning and agentic tasks.
Monthly cost estimates for common usage levels
To estimate your monthly Claude bill, multiply your expected monthly input and output token counts (in millions) by the model's input and output rates, then add the two results. A support chatbot with 1,000 active users and moderate message lengths might process around 10 million input tokens and 5 million output tokens per month. At Haiku 4.5 rates, that works out to roughly $35 per month ($10 for input plus $25 for output). At Sonnet 4.6 rates, the same workload costs about $105 per month ($30 for input plus $75 for output).
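The estimate above can be reproduced with a few lines of Python. This is a minimal sketch: the rates are copied from this guide, while the dictionary keys and the `monthly_cost` helper are illustrative names chosen for this example, not official API model identifiers.

```python
# USD per million tokens as (input_rate, output_rate), copied from this guide.
# The keys are illustrative labels, not official API model IDs.
PRICES = {
    "haiku-4.5": (1.00, 5.00),
    "sonnet-4.6": (3.00, 15.00),
    "opus-4.8": (5.00, 25.00),
}


def monthly_cost(model, input_tokens, output_tokens):
    """Return the estimated monthly bill in USD for raw token counts."""
    input_rate, output_rate = PRICES[model]
    input_cost = input_tokens / 1_000_000 * input_rate
    output_cost = output_tokens / 1_000_000 * output_rate
    return input_cost + output_cost


# The support chatbot example: 10M input and 5M output tokens per month.
for model in PRICES:
    print(f"{model}: ${monthly_cost(model, 10_000_000, 5_000_000):,.2f}")
```

For the chatbot workload this prints $35.00 for Haiku 4.5, $105.00 for Sonnet 4.6, and $175.00 for Opus 4.8.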
Agentic workflows and long-document tasks produce much larger token counts. A coding agent that reads and writes full files across a 10-turn session might consume 500,000 tokens per user session. At Sonnet 4.6 rates, 1,000 such sessions per month (500 million tokens) cost about $1,500 if nearly all tokens are input and roughly $2,000 once output reaches about 8% of the total; sessions with a larger output share cost more. Use the LLM Cost Calculator to model your specific token volumes before committing to a model tier.
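Because output tokens cost five times as much as input tokens at Sonnet 4.6 rates, the bill for an agentic workload is sensitive to the output share. The sketch below makes that explicit. The `output_share` parameter and the sample shares are assumptions for illustration, not measurements of any real agent.

```python
# Sonnet 4.6 rates from this guide, in USD per million tokens.
SONNET_INPUT_RATE = 3.00
SONNET_OUTPUT_RATE = 15.00


def agent_monthly_cost(sessions, tokens_per_session, output_share):
    """Estimate monthly USD cost when output_share of all tokens is output."""
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    total_millions = sessions * tokens_per_session / 1_000_000
    output_millions = total_millions * output_share
    input_millions = total_millions - output_millions
    return (input_millions * SONNET_INPUT_RATE
            + output_millions * SONNET_OUTPUT_RATE)


# 1,000 sessions of 500,000 tokens each, at hypothetical output shares.
for share in (0.0, 0.05, 0.10, 0.20):
    cost = agent_monthly_cost(1_000, 500_000, share)
    print(f"output share {share:.0%}: ${cost:,.0f}")
```

With these inputs the estimate runs from about $1,500 at 0% output to about $2,700 at 20% output, so measuring the real input-to-output ratio matters before budgeting.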
How Claude compares to GPT-5 and Gemini on price
At the mid-tier, Claude Sonnet 4.6 ($3.00/$15.00 per million tokens) closely matches GPT-5.4 ($2.50/$15.00): the output price is identical, with GPT-5.4 slightly cheaper on input. For output-heavy workloads such as code generation, report drafting, or long-form content, the two are nearly cost-equivalent because output dominates the bill. If budget is a priority, GPT-5.4 Mini ($0.75/$4.50) or GPT-5.4 Nano ($0.20/$1.25) undercut both by a wide margin for simpler tasks.
Google Gemini 3.5 Flash ($1.50/$9.00) is cheaper than Claude Sonnet 4.6 on both input and output, making it a strong alternative for teams with heavy token volumes. Gemini 3.1 Flash-Lite ($0.25/$1.50) is among the most affordable capable models from a major Western provider. The right choice depends on quality per task — a model that succeeds in one pass can cost less than a cheaper model that requires multiple retries.
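The retry point can be made concrete. If each attempt succeeds independently with the same probability, the expected cost per solved task is the cost of one attempt divided by that probability. The sketch below uses the two price points quoted above ($3.00/$15.00 and $1.50/$9.00). The token counts and success rates are made-up inputs for illustration only, not benchmark results for any model.

```python
def cost_per_attempt(input_tokens, output_tokens, input_rate, output_rate):
    """USD for one attempt, with rates quoted per million tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def expected_cost_per_success(attempt_cost, success_rate):
    """Expected USD spent until the first success, assuming independent retries."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return attempt_cost / success_rate


# Hypothetical task: 2,000 input and 1,000 output tokens per attempt.
pricier = cost_per_attempt(2_000, 1_000, 3.00, 15.00)
cheaper = cost_per_attempt(2_000, 1_000, 1.50, 9.00)

# Placeholder success rates; replace them with rates measured on your own tasks.
print(f"pricier model: ${expected_cost_per_success(pricier, 0.95):.4f} per solved task")
print(f"cheaper model: ${expected_cost_per_success(cheaper, 0.50):.4f} per solved task")
```

Under these assumed numbers the cheaper model ends up costing more per solved task. At these price points the cheaper model comes out ahead only when its success rate is at least about 57% of the pricier model's.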
Choosing the right Claude tier for your product
Start with Haiku 4.5 for any workload that processes short, structured inputs or routes requests before a heavier model handles them. Classification, summarization of short documents, intent detection, and keyword extraction are all good fits. The cost advantage over Sonnet compounds quickly at scale.
Use Sonnet 4.6 as the default for most product features. Its one-million token context window handles entire codebases, long conversations, and large document sets. Opus 4.8 is best reserved for tasks where reasoning quality directly changes the user outcome and where cost is secondary to accuracy. For most teams, a tiered strategy — Haiku for simple tasks, Sonnet for standard tasks, Opus only for critical tasks — achieves the best cost-to-quality ratio.
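A rough way to see what a tiered strategy saves is to compute a blended bill from the share of traffic sent to each tier. The sketch below assumes token volume splits in proportion to the traffic share. The 70/25/5 mix and the model labels are hypothetical, so substitute your own routing data.

```python
# USD per million tokens as (input_rate, output_rate), copied from this guide.
# The keys are illustrative labels, not official API model IDs.
PRICES = {
    "haiku-4.5": (1.00, 5.00),
    "sonnet-4.6": (3.00, 15.00),
    "opus-4.8": (5.00, 25.00),
}


def monthly_cost(model, input_tokens, output_tokens):
    """Monthly USD cost if one model handled all of the given tokens."""
    input_rate, output_rate = PRICES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def blended_cost(mix, input_tokens, output_tokens):
    """Monthly USD cost when each model handles its share of the tokens."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    return sum(share * monthly_cost(model, input_tokens, output_tokens)
               for model, share in mix.items())


# Hypothetical routing mix: 70% simple, 25% standard, 5% critical tasks.
mix = {"haiku-4.5": 0.70, "sonnet-4.6": 0.25, "opus-4.8": 0.05}
tiered = blended_cost(mix, 10_000_000, 5_000_000)
all_sonnet = monthly_cost("sonnet-4.6", 10_000_000, 5_000_000)
print(f"tiered: ${tiered:,.2f} vs all-Sonnet: ${all_sonnet:,.2f}")
```

For a workload of 10 million input and 5 million output tokens, this assumed mix comes to $59.50 per month, compared with $105.00 when everything goes to Sonnet 4.6.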
Estimate your own workload
Use the calculator to compare your expected API bill with a purchased or rented GPU setup.