LLM Cost Calculator

Agentic AI Workflow LLM Cost Estimator

AI agents call LLMs multiple times per task — plan, execute, reflect, retry. Estimate monthly costs for multi-step agentic pipelines with tool use and long context.

Recommended Setup

Model
Claude Sonnet 4.6
Anthropic
Monthly tokens
520M
400M in / 120M out
Estimated monthly cost
$3,000

Cost Comparison: All Cloud Models

Based on 400M input + 120M output tokens/month

ModelProviderMonthly cost
Gemma 3 4BGoogle$25.60
Llama 3.1 8B (Groq)Groq$29.60
Gemma 3 12BGoogle$31.60
gpt-oss-120bOpenAI$37.60
Gemma 3n 4BGoogle$38.40
Doubao Seed 2.0 MiniDoubao (ByteDance)$46.80
Gemma 3 27BGoogle$51.20
Gemma 4 26B A4B Google$63.60
DeepSeek V4 FlashDeepSeek$64.00
GPT-5 NanoOpenAI$68.00
Qwen3.5-FlashQwen (Alibaba)$68.00
Qwen3 14BQwen (Alibaba)$68.80
GLM-4.7 FlashZhipu AI (GLM)$72.00
Doubao Pro 32KDoubao (ByteDance)$77.60
Hunyuan TurboSHunyuan (Tencent)$77.60
Llama 4 Scout (Groq)Groq$84.80
GPT-4.1 NanoOpenAI$88.00
Gemini 2.5 Flash-LiteGoogle$88.00
Qwen3 30B A3BQwen (Alibaba)$90.00
Qwen3 VL 32B InstructQwen (Alibaba)$90.40
Gemma 4 31BGoogle$91.20
Hunyuan T1Hunyuan (Tencent)$123
GPT-4o-miniOpenAI$132
GPT-OSS 120B (Groq)Groq$132
DeepSeek V3.2DeepSeek$133
Qwen3 Coder NextQwen (Alibaba)$140
DeepSeek V3DeepSeek$176
DeepSeek V3.1DeepSeek$179
Qwen3 VL 235B A22B InstructQwen (Alibaba)$186
Qwen3 32B (Groq)Groq$187
Qwen2.5 VL 72B InstructQwen (Alibaba)$190
Qwen2.5 72B InstructQwen (Alibaba)$192
Qwen3.5-PlusQwen (Alibaba)$198
Qwen3.6 FlashQwen (Alibaba)$212
GPT-5.4 NanoOpenAI$230
DeepSeek V3 (Mar 2025)DeepSeek$240
Gemini 3.1 Flash-LiteGoogle$280
DeepSeek V4 ProDeepSeek$280
Qwen3 Coder 480B A35BQwen (Alibaba)$304
Llama 3.3 70B (Groq)Groq$331
GPT-5 MiniOpenAI$340
GPT-4.1 MiniOpenAI$352
Qwen3.7 PlusQwen (Alibaba)$352
Qwen3.6 PlusQwen (Alibaba)$366
Qwen2.5 Coder 32B InstructQwen (Alibaba)$384
Kimi K2.5Kimi (Moonshot AI)$388
Qwen3-MaxQwen (Alibaba)$398
Gemini 2.5 FlashGoogle$420
Llama 3.3 70B (Together)Together AI$458
R1 0528DeepSeek$458
Doubao Seed 2.0 ProDoubao (ByteDance)$472
Kimi K2Kimi (Moonshot AI)$504
Kimi K2.5 (Together)Together AI$536
Kimi K2 ThinkingKimi (Moonshot AI)$540
DeepSeek R1DeepSeek$580
Qwen3 Coder PlusQwen (Alibaba)$650
Qwen3.5 397B (Together)Together AI$672
Qwen3 Max ThinkingQwen (Alibaba)$780
GLM-5Zhipu AI (GLM)$784
Claude 3.5 HaikuAnthropic$800
GPT-5.4 MiniOpenAI$840
Kimi K2.6Kimi (Moonshot AI)$880
Qwen3.7 MaxQwen (Alibaba)$950
GLM-5-TurboZhipu AI (GLM)$960
o4-miniOpenAI$968
o3 MiniOpenAI$968
Claude Haiku 4.5Anthropic$1,000
GLM-5.1Zhipu AI (GLM)$1,088
DeepSeek V4 Pro (Together)Together AI$1,368
Moonshot V1 (128K)Kimi (Moonshot AI)$1,400
Gemini 3.5 FlashGoogle$1,680
GPT-5OpenAI$1,700
GPT-5 CodexOpenAI$1,700
Gemini 2.5 ProGoogle$1,700
GPT-4.1OpenAI$1,760
o3OpenAI$1,760
o4 Mini Deep ResearchOpenAI$1,760
GPT-4oOpenAI$2,200
Gemini 3.1 Pro PreviewGoogle$2,240
GPT-5.4OpenAI$2,800
Claude Sonnet 4.6★ recommendedAnthropic$3,000
Claude Sonnet 4.5Anthropic$3,000
Claude Sonnet 4Anthropic$3,000
Claude Opus 4.7Anthropic$5,000
Claude Opus 4.6Anthropic$5,000
Claude Opus 4.8Anthropic$5,000
Claude Opus 4.5Anthropic$5,000
GPT-5.5OpenAI$5,600
o3 Deep ResearchOpenAI$8,800
Claude Opus 4.8 (Fast)Anthropic$10,000
o1OpenAI$13,200
Claude Opus 4.1Anthropic$15,000
Claude Opus 4Anthropic$15,000
o3 ProOpenAI$17,600
GPT-5 ProOpenAI$20,400
Claude Opus 4.7 (Fast)Anthropic$30,000
Claude Opus 4.6 (Fast)Anthropic$30,000
GPT-5.5 ProOpenAI$33,600
GPT-5.4 ProOpenAI$33,600
o1-proOpenAI$132,000

Frequently Asked Questions

Why are agentic workflows so much more expensive than single-shot LLM calls?

Each agent step re-sends the full conversation history plus tool outputs. A 5-step agent task with a 10k-token context uses 50k input tokens total — 5× more than a direct request. Multi-agent pipelines multiply this further. Budget 3–10× the tokens you'd expect from a single-turn interaction.

Which model is best for running AI agents?

Claude Sonnet 4.6 ($3/1M input) is the leading choice for agentic tasks — it reliably follows tool-use schemas and handles complex multi-step reasoning. Claude Opus 4.8 ($5/1M) is better for high-stakes agents where accuracy matters most. For cost-optimized agents with simpler tasks, GPT-5.4 Mini ($0.75/1M) is a strong mid-tier option.

How can I reduce costs for agentic AI workflows?

Key strategies: (1) limit context accumulation — prune tool outputs after each step; (2) use a small model for planning/routing and a large model only for execution; (3) cap retry loops with max_iterations; (4) cache repeated tool results with a key-value store. These cuts can reduce token use by 40–70%.