TokenMix Research Lab · 2026-05-19

Gemini 3.5 Pro Release Date: 35% I/O 2026 Odds, API Plan
Last Updated: 2026-05-19 · Data checked: 2026-05-19 10:00 UTC+08 · Author: TokenMix Research Lab
Gemini 3.5 Pro has not been officially released. As of this update, Google and DeepMind still point developers to Gemini 3.1 Pro Preview, not a public gemini-3.5-pro model.
The timing is still worth watching. Google I/O 2026 runs May 19-20, and Google's own event page says I/O will cover "latest AI breakthroughs" across Gemini, Android, and more. Google also said the Gemini 3 era would get additional models, while the current Gemini API pricing page lists gemini-3.1-pro-preview at $2 input / $12 output per million tokens for prompts up to 200K, and $4 / $18 past 200K. The right move is simple: treat Gemini 3.5 Pro as a high-interest launch candidate, not a live API target.
Below is the confirmed state, the search and market signal, the current Gemini 3.1 Pro pricing baseline, the benchmark bar a Gemini 3.5 Pro release would need to clear, and the API migration plan teams should prepare now.
Table of Contents
- Quick Verdict: Gemini 3.5 Pro Launch Snapshot
- What Google Has Actually Confirmed
- Why Google I/O 2026 Is the Main Release Window
- Gemini 3.1 Pro Pricing: The Baseline 3.5 Must Beat
- Gemini 3.1 Pro Benchmarks: The Current Bar
- Cost Per Task: Gemini 3.1 Pro Today vs Possible 3.5 Pricing
- How to Prepare API Access Before Gemini 3.5 Pro Appears
- Migration Checklist: Gemini 3.1 Pro to Gemini 3.5 Pro
- FAQ
Quick Verdict: Gemini 3.5 Pro Launch Snapshot
| Claim | Status | Source / Confidence |
|---|---|---|
| Gemini 3.5 Pro is released | Not confirmed | No official Google, DeepMind, Gemini API, or Vertex AI page found |
| Current official Pro baseline | Confirmed: Gemini 3.1 Pro Preview | Google DeepMind Gemini Pro page |
| Current API model ID | Confirmed: gemini-3.1-pro-preview | Gemini API pricing |
| Google I/O 2026 dates | Confirmed: May 19-20, 2026 | Google I/O save-the-date |
| Gemini 3.5 Pro at I/O | Speculation: 35% odds | TokenMix.ai estimate based on timing, search heat, and lack of official docs |
| Any Gemini 3.x model update at I/O | Likely: 60% odds | Google says I/O covers Gemini updates; exact model name unknown |
| Gemini 3.5 Pro API pricing | Unknown | Use Gemini 3.1 Pro as the planning baseline |
| Safe production model ID today | gemini-3.1-pro-preview | Do not hardcode a guessed gemini-3.5-pro string |
Bottom line: Gemini 3.5 Pro is a watchlist model, not a deployable model. If it launches at I/O, the first useful question will not be the name. It will be whether it beats Gemini 3.1 Pro's cost per accepted task.
What Google Has Actually Confirmed
Google's confirmed story has three layers.
First, Gemini 3 is the active flagship family. In the official Gemini 3 launch post, Google called Gemini 3 its most intelligent model and said it was available across the Gemini app, AI Studio, Vertex AI, Google Antigravity, and API surfaces. That post also said more Gemini 3 series models would follow.
Second, the current Pro page is Gemini 3.1 Pro. DeepMind lists Gemini 3.1 Pro as Preview, with text, image, video, audio, and PDF input, text output, 1M input tokens, 64K output tokens, and availability across Gemini App, Google Cloud / Vertex AI, Google AI Studio, Gemini API, Google AI Mode, and Google Antigravity.
Third, the Gemini API pricing page lists gemini-3.1-pro-preview and gemini-3.1-pro-preview-customtools. It does not list gemini-3.5-pro, gemini-3.5-pro-preview, or any Gemini 3.5 API model ID as of this data check.
| Confirmed Item | Current State | Practical Meaning |
|---|---|---|
| Latest public Pro page | Gemini 3.1 Pro | Use 3.1 as the comparison baseline |
| Model status | Preview | Expect naming, limits, or rate tiers to change |
| Input modalities | Text, image, video, audio, PDF | Strong multimodal default for agent and document workflows |
| Output modality | Text | No confirmed Gemini 3.5 multimodal output story yet |
| Input context | 1M tokens | Long-context API planning can continue on 3.1 |
| Output context | 64K tokens | Enough for large reports and code patches |
| API model ID | gemini-3.1-pro-preview | This is the live ID to test today |
| Gemini 3.5 model ID | None confirmed | Any guessed ID is unsafe |
The important distinction: "new Gemini model expected" is not the same claim as "Gemini 3.5 Pro released." One is a launch-window read. The other requires an official model card, API docs, pricing row, or product announcement.
Why Google I/O 2026 Is the Main Release Window
Google I/O 2026 is the obvious window because the event starts today, May 19, and Google's save-the-date page explicitly names Gemini in the event scope. That does not prove Gemini 3.5 Pro. It does justify monitoring the keyword.
The current signal stack looks like this:
| Signal | Strength | What It Supports | What It Does Not Support |
|---|---|---|---|
| Google I/O dates confirmed | High | A near-term Google AI announcement window | Specific Gemini 3.5 Pro naming |
| Official I/O page mentions Gemini | High | Gemini will be part of the event story | A Pro model release |
| Gemini 3 post promised more models | Medium | Additional Gemini 3 series models are plausible | Exact version or timing |
| DeepMind page now centers 3.1 Pro | High | 3.1 Pro is the current baseline | 3.5 has shipped |
| Prediction-market and community chatter | Medium | Search demand is rising | Official release facts |
| Official docs show no 3.5 string | High | 3.5 is not live in the API docs yet | Future absence |
Our current probability estimate:
| Event | Probability | Reasoning |
|---|---|---|
| Gemini 3.5 Pro announced at Google I/O 2026 | 35% | Plausible timing, but no official docs or stable leak trail |
| Any Gemini 3.x model update at I/O | 60% | I/O is a Gemini-heavy event and the 3 series still has room for variants |
| Gemini 3.5 Pro by July 31, 2026 | 70% | Naming could arrive after I/O if Google stages API rollout |
| No Gemini Pro model update before August | 30% | Google may focus on Android, agents, Flash, video, or Workspace instead |
This is a forecast, not a confirmation. If Google publishes a model card or pricing row during I/O, this article will be updated at the same URL within 30 minutes.
Gemini 3.1 Pro Pricing: The Baseline 3.5 Must Beat
For developers, the useful baseline is not rumor. It is the live Gemini 3.1 Pro Preview pricing row.
Per Google's Gemini API pricing page, gemini-3.1-pro-preview uses a two-tier prompt price:
| Mode | Input, <=200K Prompt | Input, >200K Prompt | Output, <=200K Prompt | Output, >200K Prompt | Notes |
|---|---|---|---|---|---|
| Standard | $2.00 / MTok | $4.00 / MTok | $12.00 / MTok | $18.00 / MTok | Default production path |
| Batch | $1.00 / MTok | $2.00 / MTok | $6.00 / MTok | $9.00 / MTok | 50% cheaper for non-realtime jobs |
| Flex | $1.00 / MTok | $2.00 / MTok | $6.00 / MTok | $9.00 / MTok | Latency-tolerant production traffic |
| Priority | $3.60 / MTok | $7.20 / MTok | $21.60 / MTok | $32.40 / MTok | Throughput-sensitive workloads |
Four notes matter more than the headline price:
| Pricing Detail | Why It Matters |
|---|---|
| The 200K threshold doubles input price | Long-context RAG can move from cheap to merely reasonable very quickly |
| Batch and Flex halve token price | Offline eval, extraction, and document jobs should rarely use Standard |
| Context caching is priced separately | Repeated long prompts need storage math, not just token math |
| Grounding can add per-query cost | Search-heavy agents need request-level accounting |
If Gemini 3.5 Pro launches above this price, it has to reduce retries or unlock new tasks. "Newer" is not a cost argument.
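The two-tier table above can be turned into a small cost function. The sketch below is a minimal illustration, assuming the rates in that table and that the tier is selected by prompt (input) size alone, as the column headers suggest. The names `RATES`, `TIER_THRESHOLD`, and `request_cost` are hypothetical, and context caching and grounding fees are deliberately left out.

```python
# Rates in USD per million tokens, copied from the pricing table above.
# Each mode maps to (input <=200K, input >200K, output <=200K, output >200K).
RATES = {
    "standard": (2.00, 4.00, 12.00, 18.00),
    "batch": (1.00, 2.00, 6.00, 9.00),
    "flex": (1.00, 2.00, 6.00, 9.00),
    "priority": (3.60, 7.20, 21.60, 32.40),
}

# Prompt-token boundary between the two tiers.
# Assumption: both input and output rates depend on prompt size only.
TIER_THRESHOLD = 200_000


def request_cost(input_tokens: int, output_tokens: int, mode: str = "standard") -> float:
    """Estimate the token cost of one request in USD (no caching or grounding fees)."""
    in_low, in_high, out_low, out_high = RATES[mode]
    long_prompt = input_tokens > TIER_THRESHOLD
    in_rate = in_high if long_prompt else in_low
    out_rate = out_high if long_prompt else out_low
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    # A 180K prompt stays in the lower tier; an 800K prompt does not.
    print(f"{request_cost(180_000, 6_000):.3f}")
    print(f"{request_cost(800_000, 12_000):.3f}")
    print(f"{request_cost(800_000, 12_000, mode='batch'):.3f}")
```

Keeping the rates in one dictionary means a new pricing row, whenever Google publishes one, is a data change rather than a code change.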
Gemini 3.1 Pro Benchmarks: The Current Bar
DeepMind's Gemini 3.1 Pro page publishes benchmark deltas against Gemini 3 Pro and other frontier models. These numbers are the bar a Gemini 3.5 Pro launch would need to clear.
| Benchmark | Gemini 3.1 Pro | Gemini 3 Pro | What Changed |
|---|---|---|---|
| Humanity's Last Exam, no tools | 44.4% | 37.5% | Strong reasoning lift |
| Humanity's Last Exam, search + code | 51.4% | 45.8% | Better tool-assisted reasoning |
| ARC-AGI-2 | 77.1% | 31.1% | Major abstract reasoning jump |
| GPQA Diamond | 94.3% | 91.9% | Small but meaningful science gain |
| Terminal-Bench 2.0 | 68.5% | 56.9% | Better terminal-agent behavior |
| SWE-Bench Verified | 80.6% | 76.2% | Incremental coding-agent gain |
| MCP Atlas | 69.2% | 54.1% | Stronger multi-step MCP workflows |
| BrowseComp | 85.9% | 59.2% | Large agentic search improvement |
| MMMU-Pro | 80.5% | 81.0% | Slightly behind Gemini 3 Pro |
| MRCR v2, 1M pointwise | 26.3% | 26.3% | No visible long-context retrieval gain |
The read: Gemini 3.1 Pro is already a serious agent model. A Gemini 3.5 Pro release that only adds small benchmark gains will not be enough. The needed improvement is reliability under long-horizon tool use, not just higher chart numbers.
Where 3.5 would need to move:
| Area | Gemini 3.1 Pro Baseline | Useful Gemini 3.5 Pro Target | Why It Matters |
|---|---|---|---|
| Terminal-Bench 2.0 | 68.5% | 75%+ | Coding agents need fewer broken shell loops |
| SWE-Bench Verified | 80.6% | 83%+ | Above this line, fewer human review cycles |
| MCP Atlas | 69.2% | 75%+ | MCP tool routing is becoming production-critical |
| BrowseComp | 85.9% | 90%+ | Agentic search is a real workflow, not a demo |
| MRCR 1M | 26.3% | 40%+ | Long-context retrieval remains the weak point |
| Price retention | $2/$12 to $4/$18 | Same or <=25% premium | Higher price must be paid back by fewer retries |
This is why TokenMix.ai treats Gemini 3.5 Pro as a cost-per-workflow question. If it ships, do not compare only leaderboard rank. Compare accepted task rate, tool-call retries, latency, and the prompt tier you actually hit.
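On launch day, the target table above reduces to a mechanical comparison. A minimal sketch, assuming you enter whatever scores Google publishes: `TARGETS` copies the thresholds from the table, while `cleared_targets` and the sample scores in the usage lines are hypothetical placeholders, not real Gemini 3.5 Pro results.

```python
# Useful Gemini 3.5 Pro targets (percent), copied from the table above.
TARGETS = {
    "Terminal-Bench 2.0": 75.0,
    "SWE-Bench Verified": 83.0,
    "MCP Atlas": 75.0,
    "BrowseComp": 90.0,
    "MRCR 1M": 40.0,
}


def cleared_targets(published: dict[str, float]) -> dict[str, bool]:
    """Map each target benchmark to True if the published score meets it.

    Benchmarks missing from `published` count as not cleared.
    """
    return {
        name: published.get(name, float("-inf")) >= target
        for name, target in TARGETS.items()
    }


if __name__ == "__main__":
    # Hypothetical scores for illustration only; replace with official numbers.
    sample = {"Terminal-Bench 2.0": 76.0, "SWE-Bench Verified": 81.0}
    for name, ok in cleared_targets(sample).items():
        print(f"{name}: {'cleared' if ok else 'not cleared'}")
```

Benchmark scores are only one input. The accepted-task rate on your own prompts should still decide the migration.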
Cost Per Task: Gemini 3.1 Pro Today vs Possible 3.5 Pricing
Pricing guesses are dangerous, so this section uses scenario math. The confirmed numbers are Gemini 3.1 Pro prices from Google's pricing page. The Gemini 3.5 columns are hypothetical premiums to show what teams should test on launch day.
| Workload | Tokens In / Out | Gemini 3.1 Pro Standard | If 3.5 Is +25% | If 3.5 Is +50% |
|---|---|---|---|---|
| Coding agent pass | 30K / 8K | $0.16 | $0.20 | $0.23 |
| Research summary | 100K / 5K | $0.26 | $0.33 | $0.39 |
| 180K document analysis | 180K / 6K | $0.43 | $0.54 | $0.65 |
| 800K codebase review | 800K / 12K | $3.42 | $4.27 | $5.12 |
| 1M-token legal RAG | 1M / 20K | $4.36 | $5.45 | $6.54 |
Now the real production question:
| Scenario | Standard Cost | Accepted Output Rate | Cost Per Accepted Result |
|---|---|---|---|
| Gemini 3.1 Pro, 180K doc task | $0.43 | 70% | $0.61 |
| Gemini 3.5 Pro, +25% premium | $0.54 | 80% | $0.68 |
| Gemini 3.5 Pro, +25% premium | $0.54 | 90% | $0.60 |
| Gemini 3.5 Pro, +50% premium | $0.65 | 90% | $0.72 |
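Translation: a 25% price premium pays off only if Gemini 3.5 Pro meaningfully reduces failed generations, tool retries, or human edits. In this example, acceptance has to rise from 70% to about 88% just to break even. A 50% premium cannot break even on acceptance rate alone here, because $0.65 per request already exceeds the $0.61 baseline even at 100% acceptance. It would have to cut human edit time or unlock tasks Gemini 3.1 Pro cannot do.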
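The table divides request cost by acceptance rate. That is the expected spend per accepted result if every rejected output is simply retried at the same cost and acceptance is independent across attempts, which is an assumption of this sketch rather than anything Google states. The same arithmetic gives a break-even acceptance rate for any premium. The function names below are hypothetical.

```python
def cost_per_accepted(request_cost: float, accept_rate: float) -> float:
    """Expected cost per accepted result, assuming independent retries at equal cost."""
    if not 0 < accept_rate <= 1:
        raise ValueError("accept_rate must be in (0, 1]")
    return request_cost / accept_rate


def breakeven_accept_rate(baseline_rate: float, premium: float) -> float:
    """Acceptance rate a pricier model needs to match the baseline's cost per accepted result.

    premium is fractional: 0.25 means the new model costs 25% more per request.
    A result above 1.0 means no acceptance rate can offset the premium.
    """
    return baseline_rate * (1 + premium)


if __name__ == "__main__":
    # Rows from the table above: 3.1 Pro at 70%, then a +25% premium at 90%.
    print(round(cost_per_accepted(0.43, 0.70), 2))
    print(round(cost_per_accepted(0.54, 0.90), 2))
    # Break-even acceptance rates for +25% and +50% against a 70% baseline.
    print(breakeven_accept_rate(0.70, 0.25))
    print(breakeven_accept_rate(0.70, 0.50))
```

Under these assumptions the break-even rate is the baseline rate multiplied by one plus the premium. A value above 1.0 signals that the premium can only be justified by savings outside the token bill, such as less human review.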
Batch and Flex change the economics:
| Workload | Gemini 3.1 Standard | Gemini 3.1 Batch/Flex | Savings |
|---|---|---|---|
| 100K / 5K research summary | $0.26 | $0.13 | 50% |
| 180K / 6K document analysis | $0.43 | $0.22 | 50% |
| 800K / 12K codebase review | $3.42 | $1.71 | 50% |
| 10,000 x 30K / 8K coding passes | $1,560 | $780 | 50% |
If your Gemini workload is asynchronous, start with Batch or Flex before waiting for a new model. That saves real money today.
How to Prepare API Access Before Gemini 3.5 Pro Appears
There is no public gemini-3.5-pro API ID today. The wrong implementation is to hardcode a guessed model name and wait.
The right implementation is a config-driven model switch:
```python
import os

from openai import OpenAI

# OpenAI-compatible client pointed at the TokenMix.ai endpoint.
client = OpenAI(
    base_url="https://api.tokenmix.ai/v1",
    api_key=os.environ["TOKENMIX_API_KEY"],
)

# Read the model ID from config so a future switch needs no code change.
MODEL_ID = os.getenv("MODEL_ID", "gemini-3.1-pro-preview")

response = client.chat.completions.create(
    model=MODEL_ID,
    messages=[
        {"role": "system", "content": "You are a precise API migration reviewer."},
        {"role": "user", "content": "Review this tool-calling workflow for failure risk."},
    ],
)
print(response.choices[0].message.content)
```
The value of routing through TokenMix.ai is not pretending 3.5 is already live. It is a single OpenAI-compatible integration that can route between Gemini, GPT, Claude, DeepSeek, and other models when your eval says the switch is worth it.
| Access Route | Use It For | Risk |
|---|---|---|
| Direct Gemini API | Native Google features, fastest access to new Gemini IDs | Provider-specific SDK and pricing logic |
| Vertex AI | Enterprise IAM, audit logs, regional controls | Heavier setup |
| Gemini app / AI Studio | Manual testing and prompt exploration | Not production automation |
| TokenMix.ai unified API | Multi-provider routing and quick fallback | New models appear only after upstream availability |
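One way to enforce the "no guessed model name" rule on any of these routes is to validate the configured ID against an allow-list before a request is sent. A minimal sketch, assuming your team maintains the list by hand from official documentation: `CONFIRMED_MODEL_IDS`, `SAFE_DEFAULT`, and `resolve_model_id` are hypothetical names, and the only IDs included are the two this article found on Google's pricing page.

```python
import os
from typing import Mapping, Optional

# IDs confirmed on Google's pricing page as of this article's data check.
# Assumption: this set is maintained by hand and extended only from official docs.
CONFIRMED_MODEL_IDS = {
    "gemini-3.1-pro-preview",
    "gemini-3.1-pro-preview-customtools",
}
SAFE_DEFAULT = "gemini-3.1-pro-preview"


def resolve_model_id(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured MODEL_ID if it is confirmed, else the safe default."""
    env = os.environ if env is None else env
    requested = env.get("MODEL_ID", SAFE_DEFAULT)
    if requested not in CONFIRMED_MODEL_IDS:
        # A guessed ID such as "gemini-3.5-pro" lands here and is never sent upstream.
        print(f"MODEL_ID {requested!r} is not confirmed; using {SAFE_DEFAULT}")
        return SAFE_DEFAULT
    return requested


if __name__ == "__main__":
    print(resolve_model_id({"MODEL_ID": "gemini-3.5-pro"}))
    print(resolve_model_id({}))
```

When Google lists a new model ID, adding it to the set is a one-line reviewed change, and the environment variable can then be switched without touching request code.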
Related TokenMix.ai guides:
- Use the Google Gemini API pricing guide for live cost planning.
- Compare current model families in GPT vs Claude vs Gemini.
- For OpenAI-side context, see the GPT-5.5 release and pricing analysis.
- If you want provider portability, read How to Switch AI Providers Without Refactoring.
Migration Checklist: Gemini 3.1 Pro to Gemini 3.5 Pro
Do this before launch, not after the model name appears.
| Action | When | Effort |
|---|---|---|
| Save 50-100 production prompts with expected outputs | Now | 2-4 hours |
| Label failure types: wrong answer, bad tool call, bad format, timeout | Now | Half day |
| Track token tier: <=200K vs >200K prompt | Now | 1 hour |
| Put model ID behind config | Now | 30 minutes |
| Add accepted-output rate to your eval | Now | 2 hours |
| Compare 3.5 vs 3.1 on the same prompts | Launch day | 2-6 hours |
| Measure cost per accepted result, not cost per token | Launch day | 1 hour |
| Keep 3.1 Pro as fallback for 7-14 days | After launch | Ongoing |
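The "Now" rows of the checklist amount to one record per prompt plus a tally. A minimal sketch, assuming each eval run is labeled by a human or an automated grader with one of the failure types listed above: `EvalResult`, `summarize`, and the field names are hypothetical, and the 200K boundary comes from the Gemini 3.1 Pro pricing tiers.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

FAILURE_TYPES = {"wrong_answer", "bad_tool_call", "bad_format", "timeout"}
TIER_THRESHOLD = 200_000  # prompt tokens, per the Gemini 3.1 Pro pricing tiers


@dataclass
class EvalResult:
    prompt_id: str
    prompt_tokens: int
    failure: Optional[str] = None  # None means the output was accepted

    def __post_init__(self) -> None:
        if self.failure is not None and self.failure not in FAILURE_TYPES:
            raise ValueError(f"unknown failure type: {self.failure}")


def summarize(results: list[EvalResult]) -> dict:
    """Return accepted-output rate, failure counts, and prompt-tier counts."""
    if not results:
        raise ValueError("no results to summarize")
    accepted = sum(1 for r in results if r.failure is None)
    failures = Counter(r.failure for r in results if r.failure is not None)
    tiers = Counter(
        ">200K" if r.prompt_tokens > TIER_THRESHOLD else "<=200K" for r in results
    )
    return {
        "accepted_rate": accepted / len(results),
        "failures": dict(failures),
        "tiers": dict(tiers),
    }


if __name__ == "__main__":
    runs = [
        EvalResult("p1", 30_000),
        EvalResult("p2", 180_000, "bad_format"),
        EvalResult("p3", 800_000),
        EvalResult("p4", 30_000, "timeout"),
    ]
    print(summarize(runs))
```

Running the same `summarize` over Gemini 3.1 Pro results today and over any new model later gives a like-for-like accepted-output rate on identical prompts.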
The decision rule:
| If Gemini 3.5 Pro... | Migration Decision |
|---|---|
| Keeps 3.1 pricing and improves accepted-output rate | Migrate high-value workflows quickly |
| Costs 25% more and reduces retries by 25%+ | Test production traffic gradually |
| Costs 50% more with small benchmark gains | Keep 3.1 Pro for most workloads |
| Improves coding but not long context | Route agent tasks only |
| Improves long context but not latency | Use it for offline RAG, not interactive apps |
Do not migrate because the version number is bigger. Migrate when Gemini 3.5 Pro lowers your total workflow cost or unlocks a task Gemini 3.1 Pro cannot handle.
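Keeping Gemini 3.1 Pro as a fallback, as the checklist suggests, can be as simple as trying model IDs in order. A minimal sketch, assuming a caller-supplied `call_model(model_id, prompt)` function that raises on failure: that function, `complete_with_fallback`, and the placeholder candidate ID are hypothetical and do not name a real Gemini 3.5 model.

```python
from typing import Callable, Sequence


def complete_with_fallback(
    call_model: Callable[[str, str], str],
    prompt: str,
    model_ids: Sequence[str],
) -> tuple[str, str]:
    """Try each model ID in order; return (model_id_used, output) from the first success."""
    if not model_ids:
        raise ValueError("model_ids must not be empty")
    last_error = None
    for model_id in model_ids:
        try:
            return model_id, call_model(model_id, prompt)
        except Exception as exc:  # broad on purpose: any upstream failure triggers fallback
            last_error = exc
    raise RuntimeError("all models failed") from last_error


if __name__ == "__main__":
    def fake_call(model_id: str, prompt: str) -> str:
        # Stand-in for a real API call; the candidate ID is a placeholder, not a real model.
        if model_id == "new-candidate-model":
            raise TimeoutError("simulated upstream failure")
        return f"[{model_id}] ok"

    used, output = complete_with_fallback(
        fake_call, "ping", ["new-candidate-model", "gemini-3.1-pro-preview"]
    )
    print(used, output)
```

Returning the model ID alongside the output lets you log how often the fallback fires, which is itself a signal for the 7-14 day observation window.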
FAQ
Is Gemini 3.5 Pro released?
No. As of 2026-05-19 10:00 UTC+08, Google has not published an official Gemini 3.5 Pro model card, API pricing row, or public model ID. The current official Pro baseline is Gemini 3.1 Pro Preview.
When is the Gemini 3.5 Pro release date?
There is no confirmed release date. Google I/O 2026 on May 19-20 is the main watch window, but our estimate is only 35% for a specific Gemini 3.5 Pro announcement at I/O. A broader Gemini 3.x update is more likely than that exact name.
What is the latest official Gemini Pro model?
The latest official Pro page points to Gemini 3.1 Pro. DeepMind lists it as Preview, with 1M input tokens, 64K output tokens, multimodal input, and availability through Gemini App, Vertex AI, AI Studio, Gemini API, Google AI Mode, and Google Antigravity.
What is the Gemini 3.5 Pro API model ID?
There is no confirmed API model ID. Do not hardcode gemini-3.5-pro or gemini-3.5-pro-preview until Google lists the model in official Gemini API or Vertex AI documentation.
How much will Gemini 3.5 Pro cost?
Unknown. The current planning baseline is Gemini 3.1 Pro Preview: $2 input / $12 output per million tokens for prompts up to 200K, and $4 / $18 for prompts above 200K. Batch and Flex cut those token prices by 50%.
Will Gemini 3.5 Pro be better than GPT-5.5?
That cannot be answered until Google publishes the model and independent evals run. The important categories to watch are Terminal-Bench, SWE-Bench Verified, BrowseComp, MCP Atlas, long-context retrieval, latency, and accepted-output rate.
Should I wait for Gemini 3.5 Pro before building?
No. Build on Gemini 3.1 Pro or your current production model now, but keep the model ID configurable. Waiting for a rumored name is slower than building a clean evaluation and routing layer.
Can I access Gemini models through TokenMix.ai?
Yes, TokenMix.ai is designed for unified AI API access and provider switching. The safe pattern is to call the current supported Gemini model through an OpenAI-compatible endpoint, then switch the configured model ID only after Gemini 3.5 Pro is officially available and passes your evals.
Sources
- Google I/O 2026 save-the-date: https://blog.google/innovation-and-ai/technology/developers-tools/io-2026-save-the-date/
- Google Gemini 3 launch post: https://blog.google/products-and-platforms/products/gemini/gemini-3/
- Google DeepMind Gemini 3.1 Pro page: https://deepmind.google/models/gemini/pro/
- Gemini API pricing: https://ai.google.dev/gemini-api/docs/pricing
- Gemini 3.5 prediction-market tracker, treated as non-official signal only: https://polymarket.com/event/gemini-3pt5-released-by-june-30