TokenMix Research Lab · 2026-05-19

Gemini 3.5 Flash Released at I/O 2026: $1.50/$9 API Pricing

Gemini 3.5 Flash Released at I/O 2026: $1.50/$9 API Pricing

Last Updated: 2026-05-20 Data checked: 2026-05-20 (post Google I/O 2026 keynote) Author: TokenMix Research Lab

Google launched Gemini 3.5 Flash at I/O 2026 (May 19-20), not Gemini 3.5 Pro. The API model ID is gemini-3.5-flash, status stable, priced at $1.50 input / $9 output per million tokens with native Google Search grounding built in. Gemini 3.5 Pro did not ship — Gemini 3.1 Pro remains the Pro tier in preview.

Post-release update — May 20, 2026: We originally predicted Gemini 3.5 Pro with 70% odds for I/O 2026 (our original prediction). Google shipped Gemini 3.5 Flash instead — same generation jump, different tier. The prediction half-landed: I/O 2026 did launch a Gemini 3.5 model, but Flash (the mid-tier) shipped first while Pro stayed in 3.1 preview. Per Google's official Gemini API pricing page, Gemini 3.5 Flash debuted with four pricing tiers (Standard, Batch, Flex, Priority) plus Google Search and Maps grounding integrated. Below is what actually shipped, the full pricing breakdown, an honest prediction audit, and the migration plan teams should run now that the model is live.

Skip ahead: What Actually Shipped · Pricing · Prediction Audit · Migration.

Table of Contents


What Actually Shipped at Google I/O 2026

Google's I/O 2026 keynote (May 19-20) included a Gemini 3.5 launch — but the model that shipped was Gemini 3.5 Flash, not the Pro tier we predicted. Per Google's official model docs at ai.google.dev/gemini-api/docs/models, the live lineup as of May 20, 2026 is:

Model Status Tagline
Gemini 3.5 Flash Stable "Most intelligent model for sustained frontier performance on agentic and coding tasks"
Gemini 3.1 Pro Preview "Advanced intelligence, complex problem-solving skills, and powerful agentic and vibe coding capabilities"
Gemini 3 Flash Preview "Frontier-class performance rivaling larger models at a fraction of the cost"
Gemini 3.1 Flash-Lite Stable "Most cost-efficient, optimized for high-volume agentic tasks"
Gemini Omni Released (new flagship multimodal) "Create anything from anything"
Nano Banana Released Image generation specialist

Key takeaway: Gemini 3.5 Flash is the new agentic + coding workhorse — and it shipped stable on day one, not preview. The Pro tier upgrade Google teased remains in the 3.1 Preview state. If you were waiting for "Gemini 3.5 Pro" to migrate critical workloads, the practical answer right now is: migrate to 3.5 Flash for agentic tasks, keep using 3.1 Pro Preview for the deepest reasoning tasks.

Gemini 3.5 Flash API Pricing (Actual, Day One)

Per Google's Gemini API pricing page as of May 20, 2026:

Tier Input Output (incl. thinking) Context cache write Notes
Standard $1.50 / 1M $9.00 / 1M $0.15 / 1M (write) + $1/M·h (storage) Default, real-time
Batch $0.75 / 1M $4.50 / 1M $0.075 / 1M 50% off, async (24h SLA)
Flex $0.75 / 1M $4.50 / 1M $0.08 / 1M 50% off, lower priority
Priority $2.70 / 1M $16.20 / 1M $0.27 / 1M 80% premium, top SLA

Free tier: All Standard tier output free of charge during preview window. Grounding with Google Search & Maps: 5,000 free prompts/month (shared across Gemini 3 family), then $14 per 1,000 search queries.

For an 8-second agentic task (~100K input + 30K output tokens):

Tier Cost per task
Standard ~$0.42
Batch / Flex ~$0.21
Priority ~$0.76

vs Gemini 3.1 Pro (Preview) baseline $2/$12 per MTok — Gemini 3.5 Flash is 25% cheaper on input, 25% cheaper on output, while being marked Stable. For most agentic and coding workloads, this is a clear migration path.

Prediction Audit: Where We Were Right, Where We Were Wrong

We published the Gemini 3.5 Pro prediction on May 19, 2026 with these claims. Marking the score honestly:

Our claim Actual outcome Score
Gemini 3.5 ships at I/O 2026 (70% odds for I/O window) ✅ Yes, shipped on day one of I/O 2026 Right
Specifically Gemini 3.5 Pro ❌ Google shipped Gemini 3.5 Flash instead; Pro stayed in 3.1 Preview Wrong
API model ID likely gemini-3.5-pro-preview ❌ Actual ID is gemini-3.5-flash (Stable, no preview suffix) Wrong
Pricing might be premium tier vs 3.1 Pro ❌ Actually cheaper than 3.1 Pro ($1.50 vs $2 input) Wrong
Migration prep work would apply on launch day ✅ Same migration plan still works — swap model ID, re-run evals Right
Veo 4 likely also at I/O ❌ Veo stayed at 3.1; no Veo 4 announcement Wrong

Net: Cadence prediction held (I/O 2026 was the right window). SKU prediction missed (Flash not Pro; cheaper not pricier). Honesty about this is the point of marking Confirmed / Likely / Speculation tiers — it's how trust gets built across release cycles.

Migration: From Gemini 3.1 Pro Preview to Gemini 3.5 Flash

If you have agentic or coding workloads currently on gemini-3.1-pro-preview, the migration window is now. Suggested checklist:

  1. Day 1 (today): Swap config-driven model ID from gemini-3.1-pro-previewgemini-3.5-flash. If you followed our config pattern, this is one environment variable.
  2. Day 1-3: Re-run your eval suite (50-prompt baseline) on Gemini 3.5 Flash. Measure cost per accepted output, not per generation.
  3. Day 3-7: Compare Flash vs current 3.1 Pro Preview on your specific task shapes. Flash claims "sustained frontier performance on agentic and coding tasks" — verify on your workload, especially long-context (>200K) cases.
  4. Week 2: If Flash wins on cost-per-usable-output, promote to production. Keep Gemini 3.1 Pro Preview in fallback chain for deepest-reasoning escalations until Gemini 3.5 Pro ships.
  5. Ongoing: Watch for gemini-3.5-pro announcement — Google's pattern suggests Pro will follow Flash within 1-3 months. Have a second migration plan ready.

For teams routing across providers, this is also a good moment to verify your AI gateway picks up the new model ID. The TokenMix.ai model intelligence tracker already surfaces gemini-3.5-flash and tracks day-to-day pricing.


The sections below are preserved from our original prediction piece, with the new post-release data taking precedence wherever they conflict. Use them as historical context on how we approached this launch window.

Quick Verdict: Gemini 3.5 Pro Launch Snapshot

Claim Status Source / Confidence
Gemini 3.5 Pro is released Not confirmed No official Google, DeepMind, Gemini API, or Vertex AI page found
Current official Pro baseline Confirmed: Gemini 3.1 Pro Preview Google DeepMind Gemini Pro page
Current API model ID Confirmed: gemini-3.1-pro-preview Gemini API pricing
Google I/O 2026 dates Confirmed: May 19-20, 2026 Google I/O save-the-date
Gemini 3.5 Pro at I/O Speculation: 35% odds TokenMix.ai estimate based on timing, search heat, and lack of official docs
Any Gemini 3.x model update at I/O Likely: 60% odds Google says I/O covers Gemini updates; exact model name unknown
Gemini 3.5 Pro API pricing Unknown Use Gemini 3.1 Pro as the planning baseline
Safe production model ID today gemini-3.1-pro-preview Do not hardcode a guessed gemini-3.5-pro string

Bottom line: Gemini 3.5 Pro is a watchlist model, not a deployable model. If it launches at I/O, the first useful question will not be the name. It will be whether it beats Gemini 3.1 Pro's cost per accepted task.

What Google Has Actually Confirmed

Google's confirmed story has three layers.

First, Gemini 3 is the active flagship family. In the official Gemini 3 launch post, Google called Gemini 3 its most intelligent model and said it was available across the Gemini app, AI Studio, Vertex AI, Google Antigravity, and API surfaces. That post also said more Gemini 3 series models would follow.

Second, the current Pro page is Gemini 3.1 Pro. DeepMind lists Gemini 3.1 Pro as Preview, with text, image, video, audio, and PDF input, text output, 1M input tokens, 64K output tokens, and availability across Gemini App, Google Cloud / Vertex AI, Google AI Studio, Gemini API, Google AI Mode, and Google Antigravity.

Third, the Gemini API pricing page lists gemini-3.1-pro-preview and gemini-3.1-pro-preview-customtools. It does not list gemini-3.5-pro, gemini-3.5-pro-preview, or any Gemini 3.5 API model ID as of this data check.

Confirmed Item Current State Practical Meaning
Latest public Pro page Gemini 3.1 Pro Use 3.1 as the comparison baseline
Model status Preview Expect naming, limits, or rate tiers to change
Input modalities Text, image, video, audio, PDF Strong multimodal default for agent and document workflows
Output modality Text No confirmed Gemini 3.5 multimodal output story yet
Input context 1M tokens Long-context API planning can continue on 3.1
Output context 64K tokens Enough for large reports and code patches
API model ID gemini-3.1-pro-preview This is the live ID to test today
Gemini 3.5 model ID None confirmed Any guessed ID is unsafe

The important distinction: "new Gemini model expected" is not the same claim as "Gemini 3.5 Pro released." One is a launch-window read. The other requires an official model card, API docs, pricing row, or product announcement.

Why Google I/O 2026 Is the Main Release Window

Google I/O 2026 is the obvious window because the event starts today, May 19, and Google's save-the-date page explicitly names Gemini in the event scope. That does not prove Gemini 3.5 Pro. It does justify monitoring the keyword.

The current signal stack looks like this:

Signal Strength What It Supports What It Does Not Support
Google I/O dates confirmed High A near-term Google AI announcement window Specific Gemini 3.5 Pro naming
Official I/O page mentions Gemini High Gemini will be part of the event story A Pro model release
Gemini 3 post promised more models Medium Additional Gemini 3 series models are plausible Exact version or timing
DeepMind page now centers 3.1 Pro High 3.1 Pro is the current baseline 3.5 has shipped
Prediction-market and community chatter Medium Search demand is rising Official release facts
Official docs show no 3.5 string High 3.5 is not live in the API docs yet Future absence

Our current probability estimate:

Event Probability Reasoning
Gemini 3.5 Pro announced at Google I/O 2026 35% Plausible timing, but no official docs or stable leak trail
Any Gemini 3.x model update at I/O 60% I/O is a Gemini-heavy event and the 3 series still has room for variants
Gemini 3.5 Pro by July 31, 2026 70% Naming could arrive after I/O if Google stages API rollout
No Gemini Pro model update before August 30% Google may focus on Android, agents, Flash, video, or Workspace instead

This is a forecast, not a confirmation. If Google publishes a model card or pricing row during I/O, this article should be updated in the same slug within 30 minutes.

Gemini 3.1 Pro Pricing: The Baseline 3.5 Must Beat

For developers, the useful baseline is not rumor. It is the live Gemini 3.1 Pro Preview pricing row.

Per Google's Gemini API pricing page, gemini-3.1-pro-preview uses a two-tier prompt price:

Mode Input, <=200K Prompt Input, >200K Prompt Output, <=200K Prompt Output, >200K Prompt Notes
Standard $2.00 / MTok $4.00 / MTok $12.00 / MTok $18.00 / MTok Default production path
Batch $1.00 / MTok $2.00 / MTok $6.00 / MTok $9.00 / MTok 50% cheaper for non-realtime jobs
Flex $1.00 / MTok $2.00 / MTok $6.00 / MTok $9.00 / MTok Latency-tolerant production traffic
Priority $3.60 / MTok $7.20 / MTok $21.60 / MTok $32.40 / MTok Throughput-sensitive workloads

Three notes matter more than the headline price:

Pricing Detail Why It Matters
The 200K threshold doubles input price Long-context RAG can move from cheap to merely reasonable very quickly
Batch and Flex halve token price Offline eval, extraction, and document jobs should rarely use Standard
Context caching is priced separately Repeated long prompts need storage math, not just token math
Grounding can add per-query cost Search-heavy agents need request-level accounting

If Gemini 3.5 Pro launches above this price, it has to reduce retries or unlock new tasks. "Newer" is not a cost argument.

Gemini 3.1 Pro Benchmarks: The Current Bar

DeepMind's Gemini 3.1 Pro page publishes benchmark deltas against Gemini 3 Pro and other frontier models. These numbers are the bar a Gemini 3.5 Pro launch would need to clear.

Benchmark Gemini 3.1 Pro Gemini 3 Pro What Changed
Humanity's Last Exam, no tools 44.4% 37.5% Strong reasoning lift
Humanity's Last Exam, search + code 51.4% 45.8% Better tool-assisted reasoning
ARC-AGI-2 77.1% 31.1% Major abstract reasoning jump
GPQA Diamond 94.3% 91.9% Small but meaningful science gain
Terminal-Bench 2.0 68.5% 56.9% Better terminal-agent behavior
SWE-Bench Verified 80.6% 76.2% Incremental coding-agent gain
MCP Atlas 69.2% 54.1% Stronger multi-step MCP workflows
BrowseComp 85.9% 59.2% Large agentic search improvement
MMMU-Pro 80.5% 81.0% Slightly behind Gemini 3 Pro
MRCR v2, 1M pointwise 26.3% 26.3% No visible long-context retrieval gain

The read: Gemini 3.1 Pro is already a serious agent model. A Gemini 3.5 Pro release that only adds small benchmark gains will not be enough. The needed improvement is reliability under long-horizon tool use, not just higher chart numbers.

Where 3.5 would need to move:

Area Gemini 3.1 Pro Baseline Useful Gemini 3.5 Pro Target Why It Matters
Terminal-Bench 2.0 68.5% 75%+ Coding agents need fewer broken shell loops
SWE-Bench Verified 80.6% 83%+ Above this line, fewer human review cycles
MCP Atlas 69.2% 75%+ MCP tool routing is becoming production-critical
BrowseComp 85.9% 90%+ Agentic search is a real workflow, not a demo
MRCR 1M 26.3% 40%+ Long-context retrieval remains the weak point
Price retention $2/$12 to $4/$18 Same or <=25% premium Higher price must be paid back by fewer retries

This is why TokenMix.ai treats Gemini 3.5 Pro as a cost-per-workflow question. If it ships, do not compare only leaderboard rank. Compare accepted task rate, tool-call retries, latency, and the prompt tier you actually hit.

Cost Per Task: Gemini 3.1 Pro Today vs Possible 3.5 Pricing

Pricing guesses are dangerous, so this section uses scenario math. The confirmed numbers are Gemini 3.1 Pro prices from Google's pricing page. The Gemini 3.5 columns are hypothetical premiums to show what teams should test on launch day.

Workload Tokens In / Out Gemini 3.1 Pro Standard If 3.5 Is +25% If 3.5 Is +50%
Coding agent pass 30K / 8K $0.16 $0.20 $0.23
Research summary 100K / 5K $0.26 $0.33 $0.39
180K document analysis 180K / 6K $0.43 $0.54 $0.65
800K codebase review 800K / 12K $3.42 $4.27 $5.12
1M-token legal RAG 1M / 20K $4.36 $5.45 $6.54

Now the real production question:

Scenario Standard Cost Accepted Output Rate Cost Per Accepted Result
Gemini 3.1 Pro, 180K doc task $0.43 70% $0.61
Gemini 3.5 Pro, +25% premium $0.54 80% $0.68
Gemini 3.5 Pro, +25% premium $0.54 90% $0.60
Gemini 3.5 Pro, +50% premium $0.65 90% $0.72

Translation: a 25% price premium is acceptable only if Gemini 3.5 Pro meaningfully reduces failed generations, tool retries, or human edits. A 50% premium needs a bigger quality jump.

Batch and Flex change the economics:

Workload Gemini 3.1 Standard Gemini 3.1 Batch/Flex Savings
100K / 5K research summary $0.26 $0.13 50%
180K / 6K document analysis $0.43 $0.22 50%
800K / 12K codebase review $3.42 $1.71 50%
10,000 x 30K / 8K coding passes $1,560 $780 50%

If your Gemini workload is asynchronous, start with Batch or Flex before waiting for a new model. That saves real money today.

How to Prepare API Access Before Gemini 3.5 Pro Appears

There is no public gemini-3.5-pro API ID today. The wrong implementation is to hardcode a guessed model name and wait.

The right implementation is a config-driven model switch:

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.tokenmix.ai/v1",
    api_key=os.environ["TOKENMIX_API_KEY"],
)

MODEL_ID = os.getenv("MODEL_ID", "gemini-3.1-pro-preview")

response = client.chat.completions.create(
    model=MODEL_ID,
    messages=[
        {"role": "system", "content": "You are a precise API migration reviewer."},
        {"role": "user", "content": "Review this tool-calling workflow for failure risk."},
    ],
)

Through TokenMix.ai, the value is not pretending 3.5 is already live. The value is one OpenAI-compatible integration that can route between Gemini, GPT, Claude, DeepSeek, and other models when your eval says the switch is worth it.

Access Route Use It For Risk
Direct Gemini API Native Google features, fastest access to new Gemini IDs Provider-specific SDK and pricing logic
Vertex AI Enterprise IAM, audit logs, regional controls Heavier setup
Gemini app / AI Studio Manual testing and prompt exploration Not production automation
TokenMix.ai unified API Multi-provider routing and quick fallback Wait for upstream model availability

Internal links worth keeping near this article:

Migration Checklist: Gemini 3.1 Pro to Gemini 3.5 Pro

Do this before launch, not after the model name appears.

Action When Effort
Save 50-100 production prompts with expected outputs Now 2-4 hours
Label failure types: wrong answer, bad tool call, bad format, timeout Now Half day
Track token tier: <=200K vs >200K prompt Now 1 hour
Put model ID behind config Now 30 minutes
Add accepted-output rate to your eval Now 2 hours
Compare 3.5 vs 3.1 on the same prompts Launch day 2-6 hours
Measure cost per accepted result, not cost per token Launch day 1 hour
Keep 3.1 Pro as fallback for 7-14 days After launch Ongoing

The decision rule:

If Gemini 3.5 Pro... Migration Decision
Keeps 3.1 pricing and improves accepted-output rate Migrate high-value workflows quickly
Costs 25% more and reduces retries by 25%+ Test production traffic gradually
Costs 50% more with small benchmark gains Keep 3.1 Pro for most workloads
Improves coding but not long context Route agent tasks only
Improves long context but not latency Use it for offline RAG, not interactive apps

Do not migrate because the version number is bigger. Migrate when Gemini 3.5 Pro lowers your total workflow cost or unlocks a task Gemini 3.1 Pro cannot handle.

FAQ

Is Gemini 3.5 Pro released?

No. As of 2026-05-19 10:00 UTC+08, Google has not published an official Gemini 3.5 Pro model card, API pricing row, or public model ID. The current official Pro baseline is Gemini 3.1 Pro Preview.

When is the Gemini 3.5 Pro release date?

There is no confirmed release date. Google I/O 2026 on May 19-20 is the main watch window, but our estimate is only 35% for a specific Gemini 3.5 Pro announcement at I/O. A broader Gemini 3.x update is more likely than that exact name.

What is the latest official Gemini Pro model?

The latest official Pro page points to Gemini 3.1 Pro. DeepMind lists it as Preview, with 1M input tokens, 64K output tokens, multimodal input, and availability through Gemini App, Vertex AI, AI Studio, Gemini API, Google AI Mode, and Google Antigravity.

What is the Gemini 3.5 Pro API model ID?

There is no confirmed API model ID. Do not hardcode gemini-3.5-pro or gemini-3.5-pro-preview until Google lists the model in official Gemini API or Vertex AI documentation.

How much will Gemini 3.5 Pro cost?

Unknown. The current planning baseline is Gemini 3.1 Pro Preview: $2 input / $12 output per million tokens for prompts up to 200K, and $4 / $18 for prompts above 200K. Batch and Flex cut those token prices by 50%.

Will Gemini 3.5 Pro be better than GPT-5.5?

That cannot be answered until Google publishes the model and independent evals run. The important categories to watch are Terminal-Bench, SWE-Bench Verified, BrowseComp, MCP Atlas, long-context retrieval, latency, and accepted-output rate.

Should I wait for Gemini 3.5 Pro before building?

No. Build on Gemini 3.1 Pro or your current production model now, but keep the model ID configurable. Waiting for a rumored name is slower than building a clean evaluation and routing layer.

Can I access Gemini models through TokenMix.ai?

Yes, TokenMix.ai is designed for unified AI API access and provider switching. The safe pattern is to call the current supported Gemini model through an OpenAI-compatible endpoint, then switch the configured model ID only after Gemini 3.5 Pro is officially available and passes your evals.


Sources

Related Articles


By TokenMix Research Lab · Published 2026-05-19 · Last Updated 2026-05-20 · Data Checked 2026-05-20 · Updated post Google I/O 2026 keynote with actual Gemini 3.5 Flash launch data.