What is the cheapest Groq model?

The most affordable Groq model is Llama 3.1 8B (Groq), at $0.05 per 1M input tokens and $0.08 per 1M output tokens.

Groq API Pricing 2026

Q: How much does the Groq LLM API cost?

Groq offers 5 models, with input pricing starting at $0.05/1M tokens. Inference runs on dedicated LPU hardware at up to 1,000 tokens/sec, which suits latency-sensitive production apps. Pricing is linear per token, so costs are predictable.

Pricing verified 2026-06-06. Sourced from console.groq.com.

Groq Model Pricing

Prices in USD per 1M tokens

Model	Description	Input / 1M	Output / 1M	Context (tokens)
GPT-OSS 120B (Groq)	OpenAI open-source model on Groq; 500 tokens/sec	$0.15	$0.60	128,000
Llama 4 Scout (Groq)	Fastest LLM inference on Groq LPU; 594 tokens/sec	$0.11	$0.34	128,000
Qwen3 32B (Groq)	Strong mid-size model with extended context; 662 tokens/sec	$0.29	$0.59	131,072
Llama 3.3 70B (Groq)	Reliable 70B on Groq LPU; 394 tokens/sec	$0.59	$0.79	128,000
Llama 3.1 8B (Groq)	Ultra-low cost; 840 tokens/sec on Groq LPU	$0.05	$0.08	128,000

Estimated Monthly Cost (70% input / 30% output split)

Model	1M tokens/mo	10M tokens/mo	100M tokens/mo	1B tokens/mo
GPT-OSS 120B (Groq)	$0.285	$2.85	$28.50	$285
Llama 4 Scout (Groq)	$0.179	$1.79	$17.90	$179
Qwen3 32B (Groq)	$0.380	$3.80	$38.00	$380
Llama 3.3 70B (Groq)	$0.650	$6.50	$65.00	$650
Llama 3.1 8B (Groq)	$0.059	$0.59	$5.90	$59
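
The estimates in this table follow from a blended per-token price. Here is a minimal Python sketch of that arithmetic, assuming the 70% input / 30% output split stated above and the prices listed under Groq Model Pricing. The names `PRICES` and `monthly_cost` are illustrative and not part of any Groq SDK.

```python
# Prices in USD per 1M tokens (input, output), copied from the pricing table.
PRICES = {
    "GPT-OSS 120B (Groq)": (0.15, 0.60),
    "Llama 4 Scout (Groq)": (0.11, 0.34),
    "Qwen3 32B (Groq)": (0.29, 0.59),
    "Llama 3.3 70B (Groq)": (0.59, 0.79),
    "Llama 3.1 8B (Groq)": (0.05, 0.08),
}


def monthly_cost(model: str, total_tokens: int, input_share: float = 0.70) -> float:
    """Estimate monthly spend in USD for a total token volume.

    input_share is the fraction of tokens billed as input; the rest
    is billed as output. The 0.70 default mirrors the table's assumption.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    input_price, output_price = PRICES[model]
    # Blended price per 1M tokens for the given input/output mix.
    blended = input_share * input_price + (1.0 - input_share) * output_price
    return blended * total_tokens / 1_000_000


cost = monthly_cost("Llama 3.3 70B (Groq)", 100_000_000)
print(f"${cost:.2f}")  # $65.00
```

If your traffic is more output-heavy than 70/30, pass a different `input_share`; because output tokens cost more than input tokens for every model listed, a lower input share raises the estimate.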

Frequently Asked Questions

How much does the Groq LLM API cost?

Groq offers 5 models ranging from $0.05/1M to $0.59/1M input tokens. Inference runs on dedicated LPU hardware at up to 1,000 tokens/sec, which suits latency-sensitive production apps. Pricing is linear per token, so costs are predictable.

Is Groq cheaper than self-hosting?

For low-volume workloads (under 100M tokens/month), cloud APIs like Groq are almost always cheaper than purchasing and maintaining GPU hardware. Use our calculator to find the exact break-even point for your usage.
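
The break-even idea can be sketched in a few lines of Python. This is a simplified model and not the calculator's actual method: it assumes self-hosting is a flat monthly cost (hardware amortization, power, maintenance) with no per-token component. The function name `break_even_tokens` and the $1,500/month figure are hypothetical placeholders, so substitute your own numbers.

```python
def break_even_tokens(fixed_monthly_usd: float, api_price_per_million_usd: float) -> float:
    """Monthly token volume at which API spend equals a flat self-hosting cost.

    Below this volume the API is cheaper under this simplified model;
    above it, the flat self-hosting cost is cheaper.
    """
    if api_price_per_million_usd <= 0:
        raise ValueError("api_price_per_million_usd must be positive")
    return fixed_monthly_usd / api_price_per_million_usd * 1_000_000


# Hypothetical self-hosting cost of $1,500/month, compared against
# Llama 3.3 70B (Groq) at its blended 70/30 price of $0.65 per 1M tokens.
tokens = break_even_tokens(1500.0, 0.65)
print(f"{tokens / 1e9:.2f}B tokens/month")
```

With these placeholder inputs the break-even lands at about 2.3 billion tokens per month, well above the 100M tokens/month threshold mentioned above.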

Compare Groq vs self-hosting →