Google Gemini API Pricing in 2026: Every Model, Free Tier, Vertex AI Costs, and How to Save
Google Gemini API pricing starts at $0: Google AI Studio allows 1,500 requests per day at no cost. Paid pricing ranges from $0.01/$0.02 per million tokens (Gemini Flash Lite) to $2.50/$20 per million tokens (Gemini 2.5 Pro over 200K context). This guide covers every Gemini model's pricing, the free tier limits, Google AI Studio vs Vertex AI cost differences, embedding pricing, and real-world cost comparisons against OpenAI and Anthropic. Whether you are a beginner exploring the API for the first time or an enterprise evaluating Google Cloud costs, this guide gives you the numbers. All pricing verified against Google's official pricing page and tracked by TokenMix.ai as of April 2026.
Table of Contents
[Gemini API Pricing Quick Reference]
[Google Gemini Free Tier: What You Get for $0]
[Gemini 2.5 Pro Pricing]
[Gemini 2.5 Flash Pricing]
[Gemini 2.0 Flash and Flash Lite Pricing]
[The 200K Context Surcharge Explained]
[Google Gemini Embedding Pricing]
[Google AI Studio vs Vertex AI Pricing]
[Context Caching: How to Cut Gemini API Costs]
[Gemini API Pricing vs OpenAI vs Anthropic]
[Real-World Cost Scenarios]
[How to Get Started with the Gemini API]
[Decision Guide: Choosing the Right Gemini Model]
[Conclusion]
[FAQ]
Gemini API Pricing Quick Reference
All prices per 1 million tokens, Google AI Studio, April 2026:
| Model | Input | Cached Input | Output | Context | Free Tier |
|---|---|---|---|---|---|
| Gemini 2.5 Pro | $1.25 | $0.315 | $10.00 | 1M | 1,500 req/day |
| Gemini 2.5 Flash | $0.15 | $0.0375 | $0.60 | 1M | 1,500 req/day |
| Gemini 2.0 Flash | $0.10 | $0.025 | $0.40 | 1M | 1,500 req/day |
| Gemini 2.0 Flash Lite | $0.01 | -- | $0.02 | 1M | 1,500 req/day |
All models incur a 2x surcharge on requests exceeding 200K input tokens (except Flash Lite).
Google Gemini Free Tier: What You Get for $0
Google offers the most generous free tier among major AI API providers. This is often the first reason developers choose Gemini.
Free Tier Specs
| Feature | Free Tier Limit |
|---|---|
| Requests per day | 1,500 |
| Requests per minute | 15 |
| Tokens per minute | 32,000 |
| Models available | All Gemini models |
| Quality restrictions | None (same model as paid) |
| Credit card required | No |
What 1,500 Requests Per Day Actually Means
At 1,500 requests per day, you can:
- Run a small chatbot serving ~50 active users daily
- Process ~45,000 API calls per month
- Build and test an entire application before spending any money
- Run A/B tests across multiple Gemini model variants
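The capacity figures above are easy to verify. A back-of-the-envelope sketch (the ~30 requests per active user per day is an illustrative assumption, not a Google number):

```python
# Back-of-the-envelope capacity for the Gemini free tier (1,500 requests/day).
FREE_REQUESTS_PER_DAY = 1_500
REQUESTS_PER_USER_PER_DAY = 30  # assumption: a light chat user sends ~30 requests/day

daily_active_users = FREE_REQUESTS_PER_DAY // REQUESTS_PER_USER_PER_DAY
monthly_calls = FREE_REQUESTS_PER_DAY * 30  # approximating a 30-day month

print(daily_active_users, monthly_calls)  # 50 users, 45000 calls
```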
For comparison: OpenAI has no ongoing free tier. Anthropic offers limited free access through claude.ai but not API credits. DeepSeek offers a limited free tier with lower quotas.
Free Tier Limitations
The free tier uses lower rate limits (15 RPM vs higher paid limits). For development and testing, this is rarely a constraint. For production traffic, you will need to upgrade.
Important: free tier requests may be used by Google for product improvement. If data privacy is a concern, use the paid tier or Vertex AI, which explicitly does not use your data for training.
This is the fastest path from zero to working AI API in the industry.
Gemini 2.5 Pro Pricing
Gemini 2.5 Pro is Google's flagship model — the direct competitor to GPT-5.4 and Claude Sonnet 4.6.
Base Pricing
| Component | Up to 200K tokens | Over 200K tokens |
|---|---|---|
| Input | $1.25/M | $2.50/M |
| Cached Input | $0.315/M | $0.63/M |
| Output | $10.00/M | $20.00/M |
| Thinking Tokens | Billed as output | Billed as output |
Key Pricing Details
Thinking mode tokens are billed as output. When you enable Gemini 2.5 Pro's thinking mode, the model generates internal reasoning tokens. These are billed at the output rate ($10/M, or $20/M past 200K). A request with 2K visible output might generate 5-10K thinking tokens, increasing the effective output cost 3-5x.
Context caching storage costs apply. Cached tokens cost $4.50/M tokens per hour to store. This means caching only saves money at moderate-to-high request volumes.
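The thinking-token effect on cost can be sketched in a few lines, using the rates from the table above (the token counts are illustrative):

```python
# Gemini 2.5 Pro rates per 1M tokens (requests <= 200K input), from the table above.
PRO_INPUT_RATE = 1.25
PRO_OUTPUT_RATE = 10.00  # thinking tokens bill at this same output rate

def pro_request_cost(input_tokens, visible_output_tokens, thinking_tokens=0):
    """Estimate one request's cost; thinking tokens are billed as output."""
    billed_output = visible_output_tokens + thinking_tokens
    return (input_tokens * PRO_INPUT_RATE
            + billed_output * PRO_OUTPUT_RATE) / 1_000_000

# 2K visible output plus 8K hidden thinking tokens: the output side bills
# 10K tokens, i.e. 5x the visible-only output cost.
cost_no_thinking = pro_request_cost(3_000, 2_000)            # $0.02375
cost_with_thinking = pro_request_cost(3_000, 2_000, 8_000)   # $0.10375
```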
Gemini 2.5 Pro vs GPT-5.4 Pricing
| Dimension | Gemini 2.5 Pro | GPT-5.4 |
|---|---|---|
| Input | $1.25/M | $2.50/M |
| Output | $10.00/M | $15.00/M |
| Cached Input | $0.315/M | $0.25/M |
| Total (3K in, 1K out) | $0.01375 | $0.0225 |
Gemini 2.5 Pro is 39% cheaper than GPT-5.4 on a typical request. The input advantage (50% cheaper) and output advantage (33% cheaper) both contribute.
GPT-5.4 has slightly better benchmark scores (SWE-bench 80% vs 78%, MMLU 91% vs 90%). Whether the quality gap justifies the price gap depends on your workload.
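The per-request totals in the comparison can be reproduced directly from the listed rates. A minimal sketch (no caching, no thinking tokens):

```python
def request_cost(input_tokens, output_tokens, input_rate, output_rate):
    """Cost of one request; rates are USD per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

gemini_pro = request_cost(3_000, 1_000, 1.25, 10.00)  # $0.01375
gpt_54 = request_cost(3_000, 1_000, 2.50, 15.00)      # $0.0225
savings_pct = round((1 - gemini_pro / gpt_54) * 100)  # 39
```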
Gemini 2.5 Flash Pricing
Gemini 2.5 Flash is the balanced mid-tier: better than Flash 2.0, much cheaper than Pro.
Base Pricing
| Component | Up to 200K tokens | Over 200K tokens |
|---|---|---|
| Input | $0.15/M | $0.30/M |
| Cached Input | $0.0375/M | $0.075/M |
| Output | $0.60/M | $1.20/M |
| Thinking Output | $3.50/M | $7.00/M |
Notable Detail: Thinking Tokens Cost More
Unlike Gemini 2.5 Pro where thinking tokens are billed at the standard output rate, Gemini 2.5 Flash charges $3.50/M for thinking tokens — nearly 6x its standard output rate of $0.60/M. This makes thinking mode significantly more expensive per token on Flash than on Pro relative to base output pricing.
For simple tasks without reasoning, Flash at $0.15/$0.60 is extremely cost-effective. For reasoning-heavy tasks, the thinking token premium changes the economics.
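To see how much the $3.50/M thinking rate dominates a reasoning-heavy request, here is a small sketch with illustrative token counts:

```python
# Gemini 2.5 Flash rates per 1M tokens (<= 200K context), from the table above.
FLASH_INPUT, FLASH_OUTPUT, FLASH_THINKING = 0.15, 0.60, 3.50

def flash_cost(input_tokens, output_tokens, thinking_tokens=0):
    return (input_tokens * FLASH_INPUT
            + output_tokens * FLASH_OUTPUT
            + thinking_tokens * FLASH_THINKING) / 1_000_000

simple = flash_cost(2_000, 500)               # $0.0006, no reasoning
reasoning = flash_cost(5_000, 2_000, 10_000)  # ~$0.037, thinking dominates
thinking_share = (10_000 * FLASH_THINKING / 1e6) / reasoning  # ~95% of the bill
```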
When to Use Flash vs Pro
| Task | Flash Cost | Pro Cost | Recommendation |
|---|---|---|---|
| Simple Q&A (2K in, 500 out) | $0.0006 | $0.0075 | Flash (12x cheaper) |
| Summarization (10K in, 2K out) | $0.0027 | $0.0325 | Flash (12x cheaper) |
| Complex reasoning (5K in, 10K thinking, 2K out) | $0.0370 | $0.1263 | Flash (3.4x cheaper) |
| Code generation (8K in, 3K out) | $0.003 | $0.04 | Flash for budget, Pro for quality |
Flash delivers strong quality for most production tasks at 10-12x lower cost than Pro. Reserve Pro for tasks where the quality difference is business-critical.
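The table rows above all follow from two rate cards. A sketch for reproducing any row (Pro thinking is billed at its output rate, per the earlier section):

```python
# USD per 1M tokens, <= 200K context, from the pricing tables above.
RATES = {
    "flash": {"in": 0.15, "out": 0.60, "think": 3.50},
    "pro":   {"in": 1.25, "out": 10.00, "think": 10.00},  # Pro: thinking = output rate
}

def cost(model, input_tokens, output_tokens, thinking_tokens=0):
    r = RATES[model]
    return (input_tokens * r["in"] + output_tokens * r["out"]
            + thinking_tokens * r["think"]) / 1_000_000

# Simple Q&A row: 2K in, 500 out -> Pro costs 12.5x the Flash price.
qa_ratio = cost("pro", 2_000, 500) / cost("flash", 2_000, 500)
```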
Gemini 2.0 Flash and Flash Lite Pricing
Gemini 2.0 Flash
| Component | Up to 200K tokens | Over 200K tokens |
|---|---|---|
| Input | $0.10/M | $0.20/M |
| Cached Input | $0.025/M | $0.05/M |
| Output | $0.40/M | $0.80/M |
Gemini 2.0 Flash is the previous generation Flash model. It remains available and is slightly cheaper than 2.5 Flash ($0.10/$0.40 vs $0.15/$0.60). Quality is lower, but for simple tasks the difference is minimal.
Gemini 2.0 Flash Lite
| Component | Price |
|---|---|
| Input | $0.01/M |
| Output | $0.02/M |
| Caching | Not supported |
Flash Lite is the cheapest model in Google's lineup, and among the cheapest in the industry. At $0.01/$0.02 per million tokens, it costs 15x less than Gemini 2.5 Flash on input and 30x less on output.
Quality is limited. Flash Lite is designed for the simplest tasks: classification, routing, basic extraction, and keyword detection. Do not expect it to handle complex reasoning or nuanced generation.
The 200K Context Surcharge Explained
All Gemini models (except Flash Lite) apply a 2x pricing surcharge on requests with more than 200K input tokens.
How It Works
| Model | Input (up to 200K) | Input (over 200K) | Output (up to 200K) | Output (over 200K) |
|---|---|---|---|---|
| Gemini 2.5 Pro | $1.25/M | $2.50/M | $10.00/M | $20.00/M |
| Gemini 2.5 Flash | $0.15/M | $0.30/M | $0.60/M | $1.20/M |
| Gemini 2.0 Flash | $0.10/M | $0.20/M | $0.40/M | $0.80/M |
The surcharge applies to the entire request, not just the tokens above 200K. A request with 250K input tokens pays the higher rate on all 250K tokens.
Cost Impact Example
Processing a 400K token document with Gemini 2.5 Pro:
- Under-200K rate (hypothetical): 400K x $1.25/M = $0.50
- Over-200K rate (actual): 400K x $2.50/M = $1.00
- Difference: 2x higher
For long-context workloads, this doubles the effective cost. Compare against DeepSeek V4, which charges a flat $0.30/M at any context length.
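Because the surcharge is all-or-nothing, a cost function needs only a threshold check, not tiered math. A sketch using Gemini 2.5 Pro's $1.25/M base input rate:

```python
# Gemini's long-context surcharge: once input exceeds 200K tokens, the WHOLE
# request bills at 2x the base rate (not just the tokens past the threshold).
SURCHARGE_THRESHOLD = 200_000

def gemini_input_cost(input_tokens, base_rate_per_m):
    multiplier = 2 if input_tokens > SURCHARGE_THRESHOLD else 1
    return input_tokens * base_rate_per_m * multiplier / 1_000_000

at_200k = gemini_input_cost(200_000, 1.25)  # $0.25 -- base rate still applies
at_400k = gemini_input_cost(400_000, 1.25)  # $1.00 -- all 400K billed at $2.50/M
```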
Surcharge Comparison
| Provider | Threshold | Post-Surcharge Input |
|---|---|---|
| Google Gemini 2.5 Pro | 200K | $2.50/M |
| OpenAI GPT-5.4 | 272K | $5.00/M |
| Anthropic Claude Sonnet 4.6 | 200K | $6.00/M |
| DeepSeek V4 | None | $0.30/M (flat) |
Even with the surcharge, Gemini 2.5 Pro ($2.50/M) is the cheapest Western provider for long-context work — half the cost of GPT-5.4 ($5.00/M) and less than half of Claude ($6.00/M).
Google Gemini Embedding Pricing
| Model | Price/M tokens | Dimensions |
|---|---|---|
| text-embedding-005 | Free (up to limits) | 768 |
| text-embedding-005 (paid) | $0.00 (included) | 768 |
Google's text-embedding-005 is effectively free. Within the standard API rate limits, embedding generation incurs no per-token charge. This is a significant advantage over OpenAI's text-embedding-3 models ($0.02-$0.13/M tokens).
Embedding Cost Comparison
Embedding 1 million documents (500 tokens each, 500M total tokens):
| Provider | Model | Cost |
|---|---|---|
| Google | text-embedding-005 | $0 (within limits) |
| OpenAI | text-embedding-3-small | $10 |
| OpenAI | text-embedding-3-large | $65 |
| Voyage AI | voyage-3 | $30 |
For RAG (Retrieval-Augmented Generation) pipelines, free embeddings from Google significantly reduce the total cost of the system. Pair free Google embeddings with Gemini 2.5 Flash for generation, and the entire RAG pipeline costs a fraction of an equivalent OpenAI setup.
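The embedding comparison above is straightforward to recompute. A sketch (the voyage-3 per-token rate of $0.06/M is inferred from the $30 figure in the table, not taken from a Voyage price list):

```python
# 1M documents x 500 tokens each = 500M tokens to embed.
TOTAL_TOKENS = 1_000_000 * 500

EMBED_RATES_PER_M = {                 # USD per 1M tokens
    "text-embedding-005": 0.00,       # Google: free within rate limits
    "text-embedding-3-small": 0.02,   # OpenAI
    "text-embedding-3-large": 0.13,   # OpenAI
    "voyage-3": 0.06,                 # Voyage AI (rate inferred from the table)
}

costs = {model: TOTAL_TOKENS * rate / 1_000_000
         for model, rate in EMBED_RATES_PER_M.items()}
```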
Google AI Studio vs Vertex AI Pricing
Google offers two access points for Gemini models. The pricing for tokens is identical, but the surrounding costs differ.
Google AI Studio
- Token pricing: Same as listed above
- Free tier: 1,500 requests/day
- Credit card: Not required for free tier
- SLA: No formal SLA
- Data usage: Free tier data may be used for improvement
- Best for: Developers, prototyping, small-scale production
Vertex AI (Google Cloud)
- Token pricing: Same per-token rates as AI Studio
- Free tier: Limited free credits for new Google Cloud accounts
- Credit card: Required (Google Cloud billing)
- SLA: Formal SLA with uptime guarantees
- Data usage: Not used for training
- Additional costs: Google Cloud infrastructure overhead
Vertex AI Additional Costs
While token pricing matches AI Studio, Vertex AI users may incur:
| Cost Component | Detail |
|---|---|
| Google Cloud egress | Network egress charges apply |
| Provisioned throughput | Reserved capacity at premium rates |
| Cloud logging | Storage for request/response logs |
| VPC networking | Private service connect costs |
| Support plan | Google Cloud support tiers ($0-$150K+/year) |
For most teams under $10K/month in API spend, Google AI Studio is the better starting point. Vertex AI becomes worthwhile when you need enterprise SLAs, data governance, or Google Cloud integration.
Enterprise Commitments
Google offers committed use discounts through Vertex AI for large-volume customers. These are negotiated directly and can reduce per-token pricing by 15-30% depending on commitment size and term length.
Context Caching: How to Cut Gemini API Costs
Context caching stores input tokens for reuse across multiple requests. Cached tokens are billed at 75% off the standard input rate.
Caching Pricing
| Model | Standard Input | Cached Input | Discount | Storage Cost |
|---|---|---|---|---|
| Gemini 2.5 Pro | $1.25/M | $0.315/M | 75% | $4.50/M/hour |
| Gemini 2.5 Flash | $0.15/M | $0.0375/M | 75% | $1.00/M/hour |
| Gemini 2.0 Flash | $0.10/M | $0.025/M | 75% | $0.25/M/hour |
When Caching Saves Money
Caching adds a per-hour storage cost. It only saves money when the cache hit volume is high enough to offset storage.
Break-even calculation for Gemini 2.5 Pro:
- Cache size: 50K tokens
- Storage: 50K x $4.50/M/hour = $0.225/hour
- Savings per cache hit: 50K x ($1.25 - $0.315)/M = $0.047 per request
- Break-even: $0.225 / $0.047 = ~5 requests per hour
If you send more than 5 requests per hour against the same cached content, caching saves money. Below 5 requests per hour, the storage cost exceeds the per-token savings.
For Gemini 2.5 Flash (lower storage cost):
- Storage: 50K x $1.00/M/hour = $0.05/hour
- Savings per hit: 50K x ($0.15 - $0.0375)/M = $0.0056 per request
- Break-even: ~9 requests per hour
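Both break-even figures fall out of one formula. Notably, the cache size cancels out of the ratio, so the break-even depends only on the rates, not on how many tokens you cache. A sketch:

```python
def cache_break_even_hits_per_hour(input_rate, cached_rate, storage_rate_per_m_hour):
    """Requests/hour at which caching pays for itself.
    Rates are USD per 1M tokens; the cache token count cancels out,
    so it does not appear in the formula."""
    savings_per_m_tokens = input_rate - cached_rate
    return storage_rate_per_m_hour / savings_per_m_tokens

pro_hits = cache_break_even_hits_per_hour(1.25, 0.315, 4.50)     # ~4.8 -> ~5/hour
flash_hits = cache_break_even_hits_per_hour(0.15, 0.0375, 1.00)  # ~8.9 -> ~9/hour
```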
Caching Best Practices
- Cache system prompts and reference documents that are shared across many requests
- Monitor cache hit rates, and remove caches with low utilization
- Keep caches as small as practical, since storage is billed by token count
- Use Gemini 2.0 Flash for caching-heavy workloads (lowest storage cost at $0.25/M/hour)
Gemini API Pricing vs OpenAI vs Anthropic
Flagship Model Comparison
| Dimension | Gemini 2.5 Pro | GPT-5.4 | Claude Sonnet 4.6 |
|---|---|---|---|
| Input/M | $1.25 | $2.50 | $3.00 |
| Output/M | $10.00 | $15.00 | $15.00 |
| Cached Input/M | $0.315 | $0.25 | $0.30 |
| Batch Discount | No | 50% | 50% |
| Context | 1M | 1.1M | 1M |
| Surcharge Start | 200K | 272K | 200K |
| Free Tier | 1,500 req/day | No | Limited |
| SWE-bench | ~78% | ~80% | ~73% |
| MMLU | ~90% | ~91% | ~88% |
Budget Model Comparison
| Dimension | Gemini 2.5 Flash | GPT-5.4 Nano | GPT-4o-mini |
|---|---|---|---|
| Input/M | $0.15 | $0.20 | $0.15 |
| Output/M | $0.60 | $1.25 | $0.60 |
| Context | 1M | 400K | 128K |
| Free Tier | 1,500 req/day | No | No |
Where Google Gemini Wins on Price
Cheapest Western flagship. Gemini 2.5 Pro is 50% cheaper on input than GPT-5.4 and 58% cheaper than Claude.
Free tier. 1,500 requests/day at no cost. No other major provider matches this.
Free embeddings. Zero-cost embeddings vs $0.02-$0.13/M at OpenAI.
Cheapest ultra-budget option. Gemini 2.0 Flash Lite at $0.01/$0.02 undercuts everything.
Long-context pricing. $2.50/M past 200K vs $5.00/M (GPT) and $6.00/M (Claude).
Where Google Gemini Loses on Price
No batch discount. OpenAI and Anthropic offer 50% batch discounts. Google does not. For batch-heavy workloads, GPT-5.4 batch ($1.25/$7.50) is cheaper than Gemini 2.5 Pro standard ($1.25/$10).
Cache storage costs. Google charges per-hour storage for cached tokens. OpenAI's caching is automatic with no separate storage fee.
Thinking token economics on Flash. Gemini 2.5 Flash thinking tokens cost $3.50/M — nearly 6x the standard output rate. This makes reasoning tasks disproportionately expensive on Flash.
TokenMix.ai provides unified access to Google Gemini, OpenAI, Anthropic, and DeepSeek through a single API endpoint with automatic cost optimization — see tokenmix.ai for real-time pricing comparisons.
Gemini 2.5 Pro saves 26% vs GPT-5.4 and 27% vs Claude on this workload. Adding free Google embeddings for the retrieval layer increases the savings further.
Flash Lite is 17x cheaper than GPT-4o-mini for simple classification. If accuracy requirements are modest, this is the most economical option in the market.
Google Gemini API pricing in 2026 is designed to win developers at every scale. The free tier eliminates the barrier to entry. Flash models provide competitive quality at budget prices. Gemini 2.5 Pro undercuts GPT-5.4 and Claude Sonnet 4.6 on both input and output while delivering comparable benchmark performance.
The gaps: no batch discount, cache storage costs, and slightly lower coding benchmarks than GPT-5.4. For most workloads — especially those starting from zero or operating at moderate scale — Gemini offers the best value among Western AI API providers.
For teams using multiple providers, TokenMix.ai provides a single API that routes requests to Gemini, OpenAI, Anthropic, or DeepSeek based on cost and quality requirements — with unified billing and automatic failover across all providers.
FAQ
How much does the Google Gemini API cost?
Gemini API pricing ranges from free (1,500 requests/day) to $1.25/$10 per million tokens (Gemini 2.5 Pro). The budget Gemini 2.5 Flash model costs $0.15/$0.60 per million tokens. Gemini 2.0 Flash Lite is the cheapest at $0.01/$0.02 per million tokens. All models are available on the free tier.
Is Google Gemini API free?
Yes, partially. Google AI Studio offers 1,500 free requests per day with no credit card required. All Gemini models are available on the free tier at the same quality as paid access. Rate limits are lower on the free tier (15 requests per minute). For production workloads exceeding these limits, paid access is required.
How does Gemini API pricing compare to OpenAI?
Gemini 2.5 Pro ($1.25/$10) is 50% cheaper on input and 33% cheaper on output than GPT-5.4 ($2.50/$15). Gemini 2.5 Flash ($0.15/$0.60) is comparable to GPT-5.4 Nano ($0.20/$1.25) with a much larger context window (1M vs 400K). Google also offers a free tier that OpenAI does not. However, OpenAI offers batch discounts (50%) and automatic caching that Google lacks.
What is the cheapest Gemini model?
Gemini 2.0 Flash Lite at $0.01 per million input tokens and $0.02 per million output tokens. This is among the cheapest model pricing from any major provider. Flash Lite is designed for simple tasks like classification, routing, and basic extraction. For more capable budget options, Gemini 2.5 Flash at $0.15/$0.60 offers significantly better quality.
Is Vertex AI more expensive than Google AI Studio for Gemini?
Per-token pricing is identical between Vertex AI and Google AI Studio. However, Vertex AI may incur additional costs for Google Cloud infrastructure: network egress, logging, VPC networking, and support plans. For API costs alone, both platforms charge the same. Vertex AI adds value through SLAs, data governance, and enterprise features.
Can I use Gemini for free in production?
The free tier allows 1,500 requests per day, which supports small-scale production applications. However, free tier data may be used by Google for product improvement. For production use requiring data privacy, use the paid tier or Vertex AI. Rate limits on the free tier (15 RPM) may also be insufficient for production traffic. Evaluate whether your application fits within these constraints before relying on the free tier at TokenMix.ai, which tracks real-time availability across all providers.