LLM Cost Calculator
Back to guides
2026-06-06 · 6 min read

Google Gemini API Pricing: Complete Cost Guide 2026

Exact pricing for every current Gemini model — Gemini 3.5 Flash, Gemini 3.1 Pro, and Flash-Lite — with monthly cost estimates and a comparison to Claude and GPT-5.

Gemini model tiers and pricing

Google's Gemini family in 2026 spans three active generations. Gemini 3.5 Flash is the current flagship at $1.50 per million input tokens and $9.00 per million output tokens with a one-million token context window. Gemini 3.1 Pro Preview ($2.00/$12.00) offers advanced multimodal and agentic capabilities. Gemini 3.1 Flash-Lite ($0.25/$1.50) is the most cost-efficient current-generation Gemini model.

The Gemini 2.5 series remains in production: Gemini 2.5 Pro ($1.25/$10.00) and Gemini 2.5 Flash ($0.30/$2.50) are still widely used and well-supported. Gemini 2.5 Flash-Lite ($0.10/$0.40) is one of the cheapest capable models from any major Western provider. Open-source Gemma models (Gemma 4 26B at $0.06/$0.33, Gemma 4 31B at $0.12/$0.36) are available via Google AI Studio at very low cost.

Monthly cost estimates at common usage levels

At 10M tokens per month with a 70/30 input-output split, Gemini 2.5 Flash-Lite costs roughly $1.40/month — among the cheapest options for any production workload. Gemini 3.1 Flash-Lite comes in at about $2.20/month for the same volume. Gemini 3.5 Flash runs about $13.20/month and Gemini 3.1 Pro about $17.60/month.

For heavy usage at 500M tokens per month, Gemini 2.5 Flash-Lite stays under $70/month while Gemini 3.5 Flash reaches about $660/month. Long-context workloads — documents over 100K tokens — are a particular strength of Gemini, which offers a one-million token context window across multiple tiers without a surcharge.

How Gemini compares to Claude and GPT-5

Gemini 3.5 Flash ($1.50/$9.00) is cheaper than both Claude Sonnet 4.6 ($3.00/$15.00) and GPT-5.4 ($2.50/$15.00) on both input and output. For output-heavy workloads at scale, Gemini 3.5 Flash can reduce costs by 40 percent versus GPT-5.4 and 40 percent versus Claude Sonnet. The tradeoff is ecosystem and tooling maturity — Claude and OpenAI have broader third-party integrations.

Gemini 2.5 Flash-Lite ($0.10/$0.40) undercuts Claude Haiku 4.5 ($1.00/$5.00) by 90 percent on input and 92 percent on output. For structured tasks where Gemini's quality is sufficient, this is one of the strongest cost-reduction opportunities available without switching to smaller or less-supported models.

Best use cases for Gemini

Gemini's one-million token context window makes it particularly strong for full-document processing, large codebase review, long conversation history, and retrieval-augmented generation with large knowledge bases. The Flash variants handle these use cases at prices that make long-context workloads economically viable for production scale.

For multimodal tasks — image analysis, video understanding, and combined text-image processing — Gemini is a top choice among major providers. Google AI Studio provides a straightforward entry point for testing, and the Vertex AI integration supports enterprise deployments with compliance and SLA requirements.

Estimate your own workload

Use the calculator to compare your expected API bill with a purchased or rented GPU setup.

Open calculator