TokenMix Research Lab · 2026-05-19

Gemini 3.5 Flash Released at I/O 2026: $1.50/$9 API Pricing

Last Updated: 2026-05-20 Data checked: 2026-05-20 (post Google I/O 2026 keynote) Author: TokenMix Research Lab

Google launched Gemini 3.5 Flash at I/O 2026 (May 19-20), not Gemini 3.5 Pro. The API model ID is gemini-3.5-flash, status stable, priced at $1.50 input / $9 output per million tokens with native Google Search grounding built in. Gemini 3.5 Pro did not ship — Gemini 3.1 Pro remains the Pro tier in preview.

Post-release update — May 20, 2026: We originally predicted Gemini 3.5 Pro with 70% odds for I/O 2026 (our original prediction). Google shipped Gemini 3.5 Flash instead — same generation jump, different tier. The prediction half-landed: I/O 2026 did launch a Gemini 3.5 model, but Flash (the mid-tier) shipped first while Pro stayed in 3.1 preview. Per Google's official Gemini API pricing page, Gemini 3.5 Flash debuted with four pricing tiers (Standard, Batch, Flex, Priority) plus Google Search and Maps grounding integrated. Below is what actually shipped, the full pricing breakdown, an honest prediction audit, and the migration plan teams should run now that the model is live.

Skip ahead: What Actually Shipped · Pricing · Prediction Audit · Migration.

Quick Verdict: Gemini 3.5 Pro Launch Snapshot
What Google Has Actually Confirmed
Why Google I/O 2026 Is the Main Release Window
Gemini 3.1 Pro Pricing: The Baseline 3.5 Must Beat
Gemini 3.1 Pro Benchmarks: The Current Bar
Cost Per Task: Gemini 3.1 Pro Today vs Possible 3.5 Pricing
How to Prepare API Access Before Gemini 3.5 Pro Appears
Migration Checklist: Gemini 3.1 Pro to Gemini 3.5 Pro
FAQ

What Actually Shipped at Google I/O 2026

Google's I/O 2026 keynote (May 19-20) included a Gemini 3.5 launch — but the model that shipped was Gemini 3.5 Flash, not the Pro tier we predicted. Per Google's official model docs at ai.google.dev/gemini-api/docs/models, the live lineup as of May 20, 2026 is:

Model	Status	Tagline
Gemini 3.5 Flash	Stable ✅	"Most intelligent model for sustained frontier performance on agentic and coding tasks"
Gemini 3.1 Pro	Preview	"Advanced intelligence, complex problem-solving skills, and powerful agentic and vibe coding capabilities"
Gemini 3 Flash	Preview	"Frontier-class performance rivaling larger models at a fraction of the cost"
Gemini 3.1 Flash-Lite	Stable	"Most cost-efficient, optimized for high-volume agentic tasks"
Gemini Omni	Released (new flagship multimodal)	"Create anything from anything"
Nano Banana	Released	Image generation specialist

Key takeaway: Gemini 3.5 Flash is the new agentic + coding workhorse — and it shipped stable on day one, not preview. The Pro tier upgrade Google teased remains in the 3.1 Preview state. If you were waiting for "Gemini 3.5 Pro" to migrate critical workloads, the practical answer right now is: migrate to 3.5 Flash for agentic tasks, keep using 3.1 Pro Preview for the deepest reasoning tasks.

Gemini 3.5 Flash API Pricing (Actual, Day One)

Per Google's Gemini API pricing page as of May 20, 2026:

Tier	Input	Output (incl. thinking)	Context cache write	Notes
Standard	$1.50 / 1M	$9.00 / 1M	$0.15 / 1M (write) + $1/M·h (storage)	Default, real-time
Batch	$0.75 / 1M	$4.50 / 1M	$0.075 / 1M	50% off, async (24h SLA)
Flex	$0.75 / 1M	$4.50 / 1M	$0.08 / 1M	50% off, lower priority
Priority	$2.70 / 1M	$16.20 / 1M	$0.27 / 1M	80% premium, top SLA

Free tier: All Standard tier output free of charge during preview window. Grounding with Google Search & Maps: 5,000 free prompts/month (shared across Gemini 3 family), then $14 per 1,000 search queries.

For an 8-second agentic task (~100K input + 30K output tokens):

Tier	Cost per task
Standard	~$0.42
Batch / Flex	~$0.21
Priority	~$0.76

vs Gemini 3.1 Pro (Preview) baseline $2/$12 per MTok — Gemini 3.5 Flash is 25% cheaper on input, 25% cheaper on output, while being marked Stable. For most agentic and coding workloads, this is a clear migration path.

Prediction Audit: Where We Were Right, Where We Were Wrong

We published the Gemini 3.5 Pro prediction on May 19, 2026 with these claims. Marking the score honestly:

Our claim	Actual outcome	Score
Gemini 3.5 ships at I/O 2026 (70% odds for I/O window)	✅ Yes, shipped on day one of I/O 2026	Right
Specifically Gemini 3.5 Pro	❌ Google shipped Gemini 3.5 Flash instead; Pro stayed in 3.1 Preview	Wrong
API model ID likely `gemini-3.5-pro-preview`	❌ Actual ID is `gemini-3.5-flash` (Stable, no preview suffix)	Wrong
Pricing might be premium tier vs 3.1 Pro	❌ Actually cheaper than 3.1 Pro ($1.50 vs $2 input)	Wrong
Migration prep work would apply on launch day	✅ Same migration plan still works — swap model ID, re-run evals	Right
Veo 4 likely also at I/O	❌ Veo stayed at 3.1; no Veo 4 announcement	Wrong

Net: Cadence prediction held (I/O 2026 was the right window). SKU prediction missed (Flash not Pro; cheaper not pricier). Honesty about this is the point of marking Confirmed / Likely / Speculation tiers — it's how trust gets built across release cycles.

Migration: From Gemini 3.1 Pro Preview to Gemini 3.5 Flash

If you have agentic or coding workloads currently on gemini-3.1-pro-preview, the migration window is now. Suggested checklist:

Day 1 (today): Swap config-driven model ID from gemini-3.1-pro-preview → gemini-3.5-flash. If you followed our config pattern, this is one environment variable.
Day 1-3: Re-run your eval suite (50-prompt baseline) on Gemini 3.5 Flash. Measure cost per accepted output, not per generation.
Day 3-7: Compare Flash vs current 3.1 Pro Preview on your specific task shapes. Flash claims "sustained frontier performance on agentic and coding tasks" — verify on your workload, especially long-context (>200K) cases.
Week 2: If Flash wins on cost-per-usable-output, promote to production. Keep Gemini 3.1 Pro Preview in fallback chain for deepest-reasoning escalations until Gemini 3.5 Pro ships.
Ongoing: Watch for gemini-3.5-pro announcement — Google's pattern suggests Pro will follow Flash within 1-3 months. Have a second migration plan ready.

For teams routing across providers, this is also a good moment to verify your AI gateway picks up the new model ID. The TokenMix.ai model intelligence tracker already surfaces gemini-3.5-flash and tracks day-to-day pricing.

The sections below are preserved from our original prediction piece, with the new post-release data taking precedence wherever they conflict. Use them as historical context on how we approached this launch window.

Quick Verdict: Gemini 3.5 Pro Launch Snapshot

Claim	Status	Source / Confidence
Gemini 3.5 Pro is released	Not confirmed	No official Google, DeepMind, Gemini API, or Vertex AI page found
Current official Pro baseline	Confirmed: Gemini 3.1 Pro Preview	Google DeepMind Gemini Pro page
Current API model ID	Confirmed: `gemini-3.1-pro-preview`	Gemini API pricing
Google I/O 2026 dates	Confirmed: May 19-20, 2026	Google I/O save-the-date
Gemini 3.5 Pro at I/O	Speculation: 35% odds	TokenMix.ai estimate based on timing, search heat, and lack of official docs
Any Gemini 3.x model update at I/O	Likely: 60% odds	Google says I/O covers Gemini updates; exact model name unknown
Gemini 3.5 Pro API pricing	Unknown	Use Gemini 3.1 Pro as the planning baseline
Safe production model ID today	`gemini-3.1-pro-preview`	Do not hardcode a guessed `gemini-3.5-pro` string

Bottom line: Gemini 3.5 Pro is a watchlist model, not a deployable model. If it launches at I/O, the first useful question will not be the name. It will be whether it beats Gemini 3.1 Pro's cost per accepted task.

What Google Has Actually Confirmed

Google's confirmed story has three layers.

First, Gemini 3 is the active flagship family. In the official Gemini 3 launch post, Google called Gemini 3 its most intelligent model and said it was available across the Gemini app, AI Studio, Vertex AI, Google Antigravity, and API surfaces. That post also said more Gemini 3 series models would follow.

Second, the current Pro page is Gemini 3.1 Pro. DeepMind lists Gemini 3.1 Pro as Preview, with text, image, video, audio, and PDF input, text output, 1M input tokens, 64K output tokens, and availability across Gemini App, Google Cloud / Vertex AI, Google AI Studio, Gemini API, Google AI Mode, and Google Antigravity.

Third, the Gemini API pricing page lists gemini-3.1-pro-preview and gemini-3.1-pro-preview-customtools. It does not list gemini-3.5-pro, gemini-3.5-pro-preview, or any Gemini 3.5 API model ID as of this data check.

Confirmed Item	Current State	Practical Meaning
Latest public Pro page	Gemini 3.1 Pro	Use 3.1 as the comparison baseline
Model status	Preview	Expect naming, limits, or rate tiers to change
Input modalities	Text, image, video, audio, PDF	Strong multimodal default for agent and document workflows
Output modality	Text	No confirmed Gemini 3.5 multimodal output story yet
Input context	1M tokens	Long-context API planning can continue on 3.1
Output context	64K tokens	Enough for large reports and code patches
API model ID	`gemini-3.1-pro-preview`	This is the live ID to test today
Gemini 3.5 model ID	None confirmed	Any guessed ID is unsafe

The important distinction: "new Gemini model expected" is not the same claim as "Gemini 3.5 Pro released." One is a launch-window read. The other requires an official model card, API docs, pricing row, or product announcement.

Why Google I/O 2026 Is the Main Release Window

Google I/O 2026 is the obvious window because the event starts today, May 19, and Google's save-the-date page explicitly names Gemini in the event scope. That does not prove Gemini 3.5 Pro. It does justify monitoring the keyword.

The current signal stack looks like this:

Signal	Strength	What It Supports	What It Does Not Support
Google I/O dates confirmed	High	A near-term Google AI announcement window	Specific Gemini 3.5 Pro naming
Official I/O page mentions Gemini	High	Gemini will be part of the event story	A Pro model release
Gemini 3 post promised more models	Medium	Additional Gemini 3 series models are plausible	Exact version or timing
DeepMind page now centers 3.1 Pro	High	3.1 Pro is the current baseline	3.5 has shipped
Prediction-market and community chatter	Medium	Search demand is rising	Official release facts
Official docs show no 3.5 string	High	3.5 is not live in the API docs yet	Future absence

Our current probability estimate:

Event	Probability	Reasoning
Gemini 3.5 Pro announced at Google I/O 2026	35%	Plausible timing, but no official docs or stable leak trail
Any Gemini 3.x model update at I/O	60%	I/O is a Gemini-heavy event and the 3 series still has room for variants
Gemini 3.5 Pro by July 31, 2026	70%	Naming could arrive after I/O if Google stages API rollout
No Gemini Pro model update before August	30%	Google may focus on Android, agents, Flash, video, or Workspace instead

This is a forecast, not a confirmation. If Google publishes a model card or pricing row during I/O, this article should be updated in the same slug within 30 minutes.

Gemini 3.1 Pro Pricing: The Baseline 3.5 Must Beat

For developers, the useful baseline is not rumor. It is the live Gemini 3.1 Pro Preview pricing row.

Per Google's Gemini API pricing page, gemini-3.1-pro-preview uses a two-tier prompt price:

Mode	Input, <=200K Prompt	Input, >200K Prompt	Output, <=200K Prompt	Output, >200K Prompt	Notes
Standard	$2.00 / MTok	$4.00 / MTok	$12.00 / MTok	$18.00 / MTok	Default production path
Batch	$1.00 / MTok	$2.00 / MTok	$6.00 / MTok	$9.00 / MTok	50% cheaper for non-realtime jobs
Flex	$1.00 / MTok	$2.00 / MTok	$6.00 / MTok	$9.00 / MTok	Latency-tolerant production traffic
Priority	$3.60 / MTok	$7.20 / MTok	$21.60 / MTok	$32.40 / MTok	Throughput-sensitive workloads

Three notes matter more than the headline price:

Pricing Detail	Why It Matters
The 200K threshold doubles input price	Long-context RAG can move from cheap to merely reasonable very quickly
Batch and Flex halve token price	Offline eval, extraction, and document jobs should rarely use Standard
Context caching is priced separately	Repeated long prompts need storage math, not just token math
Grounding can add per-query cost	Search-heavy agents need request-level accounting

If Gemini 3.5 Pro launches above this price, it has to reduce retries or unlock new tasks. "Newer" is not a cost argument.

Gemini 3.1 Pro Benchmarks: The Current Bar

DeepMind's Gemini 3.1 Pro page publishes benchmark deltas against Gemini 3 Pro and other frontier models. These numbers are the bar a Gemini 3.5 Pro launch would need to clear.

Benchmark	Gemini 3.1 Pro	Gemini 3 Pro	What Changed
Humanity's Last Exam, no tools	44.4%	37.5%	Strong reasoning lift
Humanity's Last Exam, search + code	51.4%	45.8%	Better tool-assisted reasoning
ARC-AGI-2	77.1%	31.1%	Major abstract reasoning jump
GPQA Diamond	94.3%	91.9%	Small but meaningful science gain
Terminal-Bench 2.0	68.5%	56.9%	Better terminal-agent behavior
SWE-Bench Verified	80.6%	76.2%	Incremental coding-agent gain
MCP Atlas	69.2%	54.1%	Stronger multi-step MCP workflows
BrowseComp	85.9%	59.2%	Large agentic search improvement
MMMU-Pro	80.5%	81.0%	Slightly behind Gemini 3 Pro
MRCR v2, 1M pointwise	26.3%	26.3%	No visible long-context retrieval gain

The read: Gemini 3.1 Pro is already a serious agent model. A Gemini 3.5 Pro release that only adds small benchmark gains will not be enough. The needed improvement is reliability under long-horizon tool use, not just higher chart numbers.

Where 3.5 would need to move:

Area	Gemini 3.1 Pro Baseline	Useful Gemini 3.5 Pro Target	Why It Matters
Terminal-Bench 2.0	68.5%	75%+	Coding agents need fewer broken shell loops
SWE-Bench Verified	80.6%	83%+	Above this line, fewer human review cycles
MCP Atlas	69.2%	75%+	MCP tool routing is becoming production-critical
BrowseComp	85.9%	90%+	Agentic search is a real workflow, not a demo
MRCR 1M	26.3%	40%+	Long-context retrieval remains the weak point
Price retention	$2/$12 to $4/$18	Same or <=25% premium	Higher price must be paid back by fewer retries

This is why TokenMix.ai treats Gemini 3.5 Pro as a cost-per-workflow question. If it ships, do not compare only leaderboard rank. Compare accepted task rate, tool-call retries, latency, and the prompt tier you actually hit.

Cost Per Task: Gemini 3.1 Pro Today vs Possible 3.5 Pricing

Pricing guesses are dangerous, so this section uses scenario math. The confirmed numbers are Gemini 3.1 Pro prices from Google's pricing page. The Gemini 3.5 columns are hypothetical premiums to show what teams should test on launch day.

Workload	Tokens In / Out	Gemini 3.1 Pro Standard	If 3.5 Is +25%	If 3.5 Is +50%
Coding agent pass	30K / 8K	$0.16	$0.20	$0.23
Research summary	100K / 5K	$0.26	$0.33	$0.39
180K document analysis	180K / 6K	$0.43	$0.54	$0.65
800K codebase review	800K / 12K	$3.42	$4.27	$5.12
1M-token legal RAG	1M / 20K	$4.36	$5.45	$6.54

Now the real production question:

Scenario	Standard Cost	Accepted Output Rate	Cost Per Accepted Result
Gemini 3.1 Pro, 180K doc task	$0.43	70%	$0.61
Gemini 3.5 Pro, +25% premium	$0.54	80%	$0.68
Gemini 3.5 Pro, +25% premium	$0.54	90%	$0.60
Gemini 3.5 Pro, +50% premium	$0.65	90%	$0.72

Translation: a 25% price premium is acceptable only if Gemini 3.5 Pro meaningfully reduces failed generations, tool retries, or human edits. A 50% premium needs a bigger quality jump.

Batch and Flex change the economics:

Workload	Gemini 3.1 Standard	Gemini 3.1 Batch/Flex	Savings
100K / 5K research summary	$0.26	$0.13	50%
180K / 6K document analysis	$0.43	$0.22	50%
800K / 12K codebase review	$3.42	$1.71	50%
10,000 x 30K / 8K coding passes	$1,560	$780	50%

If your Gemini workload is asynchronous, start with Batch or Flex before waiting for a new model. That saves real money today.

How to Prepare API Access Before Gemini 3.5 Pro Appears

There is no public gemini-3.5-pro API ID today. The wrong implementation is to hardcode a guessed model name and wait.

The right implementation is a config-driven model switch:

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.tokenmix.ai/v1",
    api_key=os.environ["TOKENMIX_API_KEY"],
)

MODEL_ID = os.getenv("MODEL_ID", "gemini-3.1-pro-preview")

response = client.chat.completions.create(
    model=MODEL_ID,
    messages=[
        {"role": "system", "content": "You are a precise API migration reviewer."},
        {"role": "user", "content": "Review this tool-calling workflow for failure risk."},
    ],
)

Through TokenMix.ai, the value is not pretending 3.5 is already live. The value is one OpenAI-compatible integration that can route between Gemini, GPT, Claude, DeepSeek, and other models when your eval says the switch is worth it.

Access Route	Use It For	Risk
Direct Gemini API	Native Google features, fastest access to new Gemini IDs	Provider-specific SDK and pricing logic
Vertex AI	Enterprise IAM, audit logs, regional controls	Heavier setup
Gemini app / AI Studio	Manual testing and prompt exploration	Not production automation
TokenMix.ai unified API	Multi-provider routing and quick fallback	Wait for upstream model availability

Internal links worth keeping near this article:

Use the Google Gemini API pricing guide for live cost planning.
Compare current model families in GPT vs Claude vs Gemini.
For OpenAI-side context, see the GPT-5.5 release and pricing analysis.
If you want provider portability, read How to Switch AI Providers Without Refactoring.

Migration Checklist: Gemini 3.1 Pro to Gemini 3.5 Pro

Do this before launch, not after the model name appears.

Action	When	Effort
Save 50-100 production prompts with expected outputs	Now	2-4 hours
Label failure types: wrong answer, bad tool call, bad format, timeout	Now	Half day
Track token tier: <=200K vs >200K prompt	Now	1 hour
Put model ID behind config	Now	30 minutes
Add accepted-output rate to your eval	Now	2 hours
Compare 3.5 vs 3.1 on the same prompts	Launch day	2-6 hours
Measure cost per accepted result, not cost per token	Launch day	1 hour
Keep 3.1 Pro as fallback for 7-14 days	After launch	Ongoing

The decision rule:

If Gemini 3.5 Pro...	Migration Decision
Keeps 3.1 pricing and improves accepted-output rate	Migrate high-value workflows quickly
Costs 25% more and reduces retries by 25%+	Test production traffic gradually
Costs 50% more with small benchmark gains	Keep 3.1 Pro for most workloads
Improves coding but not long context	Route agent tasks only
Improves long context but not latency	Use it for offline RAG, not interactive apps

Do not migrate because the version number is bigger. Migrate when Gemini 3.5 Pro lowers your total workflow cost or unlocks a task Gemini 3.1 Pro cannot handle.

FAQ

Is Gemini 3.5 Pro released?

No. As of 2026-05-19 10:00 UTC+08, Google has not published an official Gemini 3.5 Pro model card, API pricing row, or public model ID. The current official Pro baseline is Gemini 3.1 Pro Preview.

When is the Gemini 3.5 Pro release date?

There is no confirmed release date. Google I/O 2026 on May 19-20 is the main watch window, but our estimate is only 35% for a specific Gemini 3.5 Pro announcement at I/O. A broader Gemini 3.x update is more likely than that exact name.

What is the latest official Gemini Pro model?

The latest official Pro page points to Gemini 3.1 Pro. DeepMind lists it as Preview, with 1M input tokens, 64K output tokens, multimodal input, and availability through Gemini App, Vertex AI, AI Studio, Gemini API, Google AI Mode, and Google Antigravity.

What is the Gemini 3.5 Pro API model ID?

There is no confirmed API model ID. Do not hardcode gemini-3.5-pro or gemini-3.5-pro-preview until Google lists the model in official Gemini API or Vertex AI documentation.

How much will Gemini 3.5 Pro cost?

Unknown. The current planning baseline is Gemini 3.1 Pro Preview: $2 input / $12 output per million tokens for prompts up to 200K, and $4 / $18 for prompts above 200K. Batch and Flex cut those token prices by 50%.

Will Gemini 3.5 Pro be better than GPT-5.5?

That cannot be answered until Google publishes the model and independent evals run. The important categories to watch are Terminal-Bench, SWE-Bench Verified, BrowseComp, MCP Atlas, long-context retrieval, latency, and accepted-output rate.

Should I wait for Gemini 3.5 Pro before building?

No. Build on Gemini 3.1 Pro or your current production model now, but keep the model ID configurable. Waiting for a rumored name is slower than building a clean evaluation and routing layer.

Can I access Gemini models through TokenMix.ai?

Yes, TokenMix.ai is designed for unified AI API access and provider switching. The safe pattern is to call the current supported Gemini model through an OpenAI-compatible endpoint, then switch the configured model ID only after Gemini 3.5 Pro is officially available and passes your evals.

Sources

Google I/O 2026 save-the-date: https://blog.google/innovation-and-ai/technology/developers-tools/io-2026-save-the-date/
Google Gemini 3 launch post: https://blog.google/products-and-platforms/products/gemini/gemini-3/
Google DeepMind Gemini 3.1 Pro page: https://deepmind.google/models/gemini/pro/
Gemini API pricing: https://ai.google.dev/gemini-api/docs/pricing
Gemini 3.5 prediction-market tracker, treated as non-official signal only: https://polymarket.com/event/gemini-3pt5-released-by-june-30

By TokenMix Research Lab · Published 2026-05-19 · Last Updated 2026-05-20 · Data Checked 2026-05-20 · Updated post Google I/O 2026 keynote with actual Gemini 3.5 Flash launch data.