TokenMix Research Lab · 2026-04-03

GPT-5 API Pricing 2026: 5.5, 5.4, Mini Costs, Batch Math

Last Updated: 2026-04-30
Author: TokenMix Research Lab
Data checked: 2026-04-30

GPT-5 API pricing now has three public routes: GPT-5.5 at $5/$30, GPT-5.4 at $2.50/$15, and GPT-5.4 mini at $0.75/$4.50 per 1M input/output tokens. Cached input is 90% cheaper than base input, and the Batch API cuts input and output prices by 50%.

According to OpenAI's official API pricing page, GPT-5.5 costs $5 input, $0.50 cached input, and $30 output per 1M tokens. GPT-5.4 costs $2.50 input, $0.25 cached input, and $15 output. GPT-5.4 mini costs $0.75 input, $0.075 cached input, and $4.50 output. OpenAI also states that standard rates shown apply to context lengths under 270K, Batch API saves 50% on inputs and outputs, and data residency adds 10%.

Quick Answer
Confirmed GPT-5 Pricing Facts
GPT-5 Price Table
Standard vs Cached vs Batch
GPT-5.5 vs GPT-5.4 vs GPT-5.4 Mini
Cost per Task
Monthly Cost Scenarios
When To Escalate
GPT-5 vs Claude, Gemini, DeepSeek
When TokenMix.ai Fits
Final Recommendation
FAQ
Related Articles
Sources

Quick Answer

Question	Answer
Cheapest GPT-5 API route	GPT-5.4 mini at $0.75 input and $4.50 output per 1M tokens.
Default GPT-5 route	GPT-5.4 at $2.50 input and $15 output.
Premium GPT-5 route	GPT-5.5 at $5 input and $30 output.
Cached input discount	90% off base input for all three listed GPT-5 routes.
Batch discount	50% off input and output for async Batch API jobs.
Key context caveat	The public pricing page says listed standard rates apply under 270K context length.

Confirmed GPT-5 Pricing Facts

Claim	Status	Practical meaning	Source
GPT-5.5 is $5/$30 per 1M tokens	Confirmed	It is the premium OpenAI route.	OpenAI pricing
GPT-5.4 is $2.50/$15 per 1M tokens	Confirmed	It is the balanced OpenAI default.	OpenAI pricing
GPT-5.4 mini is $0.75/$4.50 per 1M tokens	Confirmed	It is the budget GPT-5 route.	OpenAI pricing
Cached input is 10% of base input	Confirmed by listed cached-input prices	Repeated context can change the cost curve.	OpenAI pricing
Batch API saves 50% on inputs and outputs	Confirmed	Best for offline evaluation, summarization, and enrichment.	OpenAI pricing
Data residency adds 10%	Confirmed pricing toggle	Regulated deployments need this in cost forecasts.	OpenAI pricing
Older four-tier GPT price tables are still current	Not supported by the public page checked on 2026-04-30	Do not model current GPT-5 spend from stale page copies.	TokenMix analysis

GPT-5 Price Table

All prices are per 1M tokens in USD.

Model	Input	Cached input	Output	Best use
GPT-5.4 mini	$0.75	$0.075	$4.50	Cheap OpenAI route, agents, bulk simple tasks
GPT-5.4	$2.50	$0.25	$15.00	Default production route
GPT-5.5	$5.00	$0.50	$30.00	Hard coding, professional work, escalation

These three rows are the full lineup on the public OpenAI pricing page checked on 2026-04-30. Older pages that mention a four-tier GPT-5.4 lineup do not match it.

Standard vs Cached vs Batch

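OpenAI exposes two cost levers on these routes: cached input, priced at 10% of base input, and the Batch API, which halves both input and output prices.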

Model	Standard input	Cached input	Batch input	Standard output	Batch output
GPT-5.4 mini	$0.75	$0.075	$0.375	$4.50	$2.25
GPT-5.4	$2.50	$0.25	$1.25	$15.00	$7.50
GPT-5.5	$5.00	$0.50	$2.50	$30.00	$15.00

Effective input with repeated context:

Model	Base input	70% cached input	90% cached input
GPT-5.4 mini	$0.75	$0.2775	$0.1425
GPT-5.4	$2.50	$0.9250	$0.4750
GPT-5.5	$5.00	$1.8500	$0.9500

Formula:

effective input = uncached_share * base_input + cached_share * cached_input

The catch: cached input only helps when the prefix actually repeats. It does not reduce output tokens.
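
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not an official calculator: the prices are copied from the tables in this article, and the model keys (`gpt-5.4-mini` and so on) are labels chosen here for readability, not confirmed API model IDs.

```python
# Per-1M-token input prices in USD, copied from the price table above.
# The dictionary keys are illustrative labels, not confirmed API model IDs.
PRICES = {
    "gpt-5.4-mini": {"input": 0.75, "cached_input": 0.075},
    "gpt-5.4": {"input": 2.50, "cached_input": 0.25},
    "gpt-5.5": {"input": 5.00, "cached_input": 0.50},
}


def effective_input_price(model: str, cached_share: float) -> float:
    """Blended input price per 1M tokens for a cached share between 0 and 1."""
    if not 0.0 <= cached_share <= 1.0:
        raise ValueError("cached_share must be between 0 and 1")
    price = PRICES[model]
    uncached_share = 1.0 - cached_share
    return uncached_share * price["input"] + cached_share * price["cached_input"]


if __name__ == "__main__":
    for model in PRICES:
        at_70 = effective_input_price(model, 0.70)
        at_90 = effective_input_price(model, 0.90)
        print(f"{model}: 70% cached = ${at_70:.4f}, 90% cached = ${at_90:.4f}")
```

The cached share is the fraction of input tokens actually served from cache, so measure it from real traffic rather than assuming the 70% or 90% used in the table.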

GPT-5.5 vs GPT-5.4 vs GPT-5.4 Mini

Decision factor	GPT-5.4 mini	GPT-5.4	GPT-5.5
Price tier	Budget	Balanced	Premium
Input vs GPT-5.4	70% cheaper	Baseline	2x the price
Output vs GPT-5.4	70% cheaper	Baseline	2x the price
Best default role	First-pass work	Main production route	Escalation route
Avoid when	Quality failures cost more than token savings	You need the highest quality	Request is simple or high-volume

The practical routing policy:

Request type	First route	Escalate if
Classification	GPT-5.4 mini	The label is high-risk or ambiguous.
Extraction	GPT-5.4 mini	The document is messy or legally sensitive.
User-facing answer	GPT-5.4	The answer must be near-perfect.
Coding task	GPT-5.4	Multi-file reasoning fails.
Deep professional analysis	GPT-5.5	Start here only when failure cost is high.
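
One way to encode the routing table above is a plain lookup with an optional escalation step. This is a sketch under assumptions: the request-type keys and model labels are names invented here, and the table does not say which model to escalate to, so the code assumes "one tier up". How to decide that a request needs escalation is left to the caller.

```python
# First route per request type, following the routing table above.
# Keys and model labels are illustrative, not confirmed API identifiers.
FIRST_ROUTE = {
    "classification": "gpt-5.4-mini",
    "extraction": "gpt-5.4-mini",
    "user_facing_answer": "gpt-5.4",
    "coding_task": "gpt-5.4",
    "deep_professional_analysis": "gpt-5.5",
}

# Assumption: escalation moves exactly one tier up; the top tier stays put.
NEXT_TIER = {
    "gpt-5.4-mini": "gpt-5.4",
    "gpt-5.4": "gpt-5.5",
}


def pick_route(request_type: str, escalate: bool = False) -> str:
    """Return the model label for a request, optionally one tier higher."""
    model = FIRST_ROUTE[request_type]
    if escalate:
        return NEXT_TIER.get(model, model)
    return model


if __name__ == "__main__":
    print(pick_route("classification"))
    print(pick_route("coding_task", escalate=True))
```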

Cost per Task

These examples use standard non-batch pricing and no cache unless stated.

Task	Token shape	GPT-5.4 mini	GPT-5.4	GPT-5.5
Simple chat reply	500 input / 200 output	$0.001275	$0.004250	$0.008500
Support answer	2K input / 800 output	$0.005100	$0.017000	$0.034000
RAG answer	8K input / 500 output	$0.008250	$0.027500	$0.055000
Code review	20K input / 3K output	$0.028500	$0.095000	$0.190000
Long cached agent turn	100K input, 70% cached / 2K output	$0.036750	$0.122500	$0.245000

Batch version of the same code review:

Model	Standard cost	Batch cost	Savings
GPT-5.4 mini	$0.028500	$0.014250	50%
GPT-5.4	$0.095000	$0.047500	50%
GPT-5.5	$0.190000	$0.095000	50%
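
The per-task figures above come from straightforward arithmetic, sketched below. Prices are copied from this article; the model keys are illustrative labels rather than confirmed API model IDs. Applying the batch discount on top of a cached share is an assumption here, since the examples above only combine the two as an estimate.

```python
# Per-1M-token prices in USD, copied from the price table above.
# Keys are illustrative labels, not confirmed API model IDs.
PRICES = {
    "gpt-5.4-mini": {"input": 0.75, "cached_input": 0.075, "output": 4.50},
    "gpt-5.4": {"input": 2.50, "cached_input": 0.25, "output": 15.00},
    "gpt-5.5": {"input": 5.00, "cached_input": 0.50, "output": 30.00},
}

BATCH_DISCOUNT = 0.5  # Batch API: 50% off input and output


def request_cost(model, input_tokens, output_tokens, cached_share=0.0, batch=False):
    """Cost in USD of one request, given token counts and an optional cached share."""
    price = PRICES[model]
    cached_tokens = input_tokens * cached_share
    uncached_tokens = input_tokens - cached_tokens
    cost = (
        uncached_tokens * price["input"]
        + cached_tokens * price["cached_input"]
        + output_tokens * price["output"]
    ) / 1_000_000
    # Assumption: the batch discount applies to the whole request, cached part included.
    return cost * BATCH_DISCOUNT if batch else cost


if __name__ == "__main__":
    # Code review row: 20K input / 3K output, standard vs batch.
    for model in PRICES:
        standard = request_cost(model, 20_000, 3_000)
        batched = request_cost(model, 20_000, 3_000, batch=True)
        print(f"{model}: standard ${standard:.6f}, batch ${batched:.6f}")
```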

Monthly Cost Scenarios

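Assume 100M input tokens and 30M output tokens per month. The last column halves the 70%-cache figure, which assumes the Batch and cached-input discounts stack; treat it as an estimate.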

Model	No cache	70% input cache	Batch only	Batch plus 70% cache estimate
GPT-5.4 mini	$210.00	$162.75	$105.00	$81.38
GPT-5.4	$700.00	$542.50	$350.00	$271.25
GPT-5.5	$1,400.00	$1,085.00	$700.00	$542.50

GPT-5.5 costs about 6.7x GPT-5.4 mini in every column, because both its input and output prices are about 6.7x higher. Use it where quality changes the outcome, not as a default bulk model.
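
The monthly table can be reproduced with a short script, shown here as a sketch. Volumes are expressed in millions of tokens so they multiply directly against per-1M prices. The model keys are illustrative labels, and the combined batch-plus-cache figure assumes the two discounts stack, which matches how the estimate column in the table is computed.

```python
# (input, cached input, output) prices per 1M tokens in USD, from the price table above.
# Keys are illustrative labels, not confirmed API model IDs.
PRICES = {
    "gpt-5.4-mini": (0.75, 0.075, 4.50),
    "gpt-5.4": (2.50, 0.25, 15.00),
    "gpt-5.5": (5.00, 0.50, 30.00),
}


def monthly_scenarios(model, input_millions=100, output_millions=30, cached_share=0.70):
    """Return the four monthly totals in USD, keyed by scenario."""
    base_in, cached_in, out = PRICES[model]
    blended_in = (1 - cached_share) * base_in + cached_share * cached_in
    no_cache = input_millions * base_in + output_millions * out
    with_cache = input_millions * blended_in + output_millions * out
    return {
        "no_cache": no_cache,
        "cache": with_cache,
        "batch": no_cache * 0.5,
        # Assumption: batch and cache discounts stack; treat as an estimate.
        "batch_plus_cache": with_cache * 0.5,
    }


if __name__ == "__main__":
    for model in PRICES:
        row = monthly_scenarios(model)
        print(model, {name: f"${value:,.2f}" for name, value in row.items()})
    ratio = monthly_scenarios("gpt-5.5")["no_cache"] / monthly_scenarios("gpt-5.4-mini")["no_cache"]
    print(f"GPT-5.5 vs mini: {ratio:.1f}x")
```

Swap in your own token volumes and measured cache hit rate; the 100M/30M split and the 70% share are only the example values used in the table.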

When To Escalate

Situation	Stay on mini	Use GPT-5.4	Escalate to GPT-5.5
Bulk labels	Yes	Rarely	No
Support drafts	Often	Yes for final answer	Rarely
Code generation	For simple edits	Default	Hard multi-step tasks
Legal/finance summaries	No	Usually	High-stakes or ambiguous
Agent planning	Cheap first pass	Main route	Failed or high-value turns

The key metric is cost per successful workflow, not cost per token.
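
To make "cost per successful workflow" concrete, here is a deliberately simplified sketch. It assumes a cheap first attempt followed by one premium retry on failure, that failures are detected at no extra cost, and that the premium attempt succeeds. The success rate is a hypothetical input you would have to measure yourself; nothing in this article supplies one.

```python
def expected_cost_with_escalation(cheap_cost, premium_cost, cheap_success_rate):
    """Expected spend per workflow: always pay the cheap attempt,
    pay the premium attempt only when the cheap one fails."""
    if not 0.0 <= cheap_success_rate <= 1.0:
        raise ValueError("cheap_success_rate must be between 0 and 1")
    return cheap_cost + (1.0 - cheap_success_rate) * premium_cost


def break_even_success_rate(cheap_cost, premium_cost):
    """Cheap-first is cheaper than premium-only when the cheap route's
    success rate is above this value (under the assumptions stated above)."""
    return cheap_cost / premium_cost


if __name__ == "__main__":
    # Per-request costs for the code review row in the cost-per-task table.
    mini_cost, premium_cost = 0.0285, 0.19
    hypothetical_rate = 0.80  # placeholder, not a measured figure
    blended = expected_cost_with_escalation(mini_cost, premium_cost, hypothetical_rate)
    print(f"expected cost at {hypothetical_rate:.0%} mini success: ${blended:.4f}")
    print(f"break-even success rate: {break_even_success_rate(mini_cost, premium_cost):.0%}")
```

With the code review costs above, the break-even rate is 15%, so under these assumptions a mini-first policy saves money whenever mini succeeds more often than that. Real workflows also pay for failure detection, added latency, and bad answers that slip through, so the true threshold is higher.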

GPT-5 vs Claude, Gemini, DeepSeek

Model	Input	Cached input	Output	Main role
GPT-5.4 mini	$0.75	$0.075	$4.50	Budget OpenAI route
GPT-5.4	$2.50	$0.25	$15.00	OpenAI default
GPT-5.5	$5.00	$0.50	$30.00	Premium OpenAI route
Claude Sonnet 4.6	$3.00	$0.30	$15.00	Balanced Claude route
Claude Opus 4.7	$5.00	$0.50	$25.00	Premium Claude route
Gemini 3.1 Pro	$2.00	$0.20	$12.00	Premium Gemini route under 200K
DeepSeek V4 Flash	$0.14 miss	$0.0028 hit	$0.28	Lowest-cost text route

GPT-5.4 is competitive on quality and ecosystem fit. It is not the lowest-cost route. DeepSeek dominates raw token price, Gemini is strong under 200K, and Claude remains a strong quality benchmark.

When TokenMix.ai Fits

TokenMix.ai fits when GPT-5 is one route in a multi-model system. Instead of wiring separate clients for OpenAI, Claude, Gemini, DeepSeek, and other providers, you can use one OpenAI-compatible access layer and route by task.

Need	Direct OpenAI	TokenMix.ai unified API
Native OpenAI feature coverage	Best path	Use when compatible
GPT plus Claude/Gemini/DeepSeek	Multiple integrations	One routing layer
Cost-aware fallback	Build yourself	Centralized policy
Payment flexibility	OpenAI billing path	Useful when direct billing is hard
Price comparison	Manual	Compare model families

Use the AI API pricing hub for cross-provider price decisions.

Final Recommendation

Use GPT-5.4 mini for cheap first-pass work, GPT-5.4 as the default OpenAI production route, and GPT-5.5 only for hard tasks where a better answer is worth 2x the GPT-5.4 token price.

FAQ

How much does GPT-5.5 API cost?

GPT-5.5 costs $5 input, $0.50 cached input, and $30 output per 1M tokens on OpenAI's public pricing page checked on 2026-04-30.

How much does GPT-5.4 API cost?

GPT-5.4 costs $2.50 input, $0.25 cached input, and $15 output per 1M tokens. It is the balanced default in the current GPT-5 API lineup.

How much does GPT-5.4 mini cost?

GPT-5.4 mini costs $0.75 input, $0.075 cached input, and $4.50 output per 1M tokens. It is the cheapest GPT-5 route listed on the public pricing page.

Does GPT-5 API cached input save 90%?

Yes. The cached input prices shown by OpenAI are 10% of base input prices for GPT-5.5, GPT-5.4, and GPT-5.4 mini.

Does Batch API reduce GPT-5 costs?

Yes. OpenAI states that Batch API saves 50% on inputs and outputs. Use it for async jobs where a user is not waiting.

Is GPT-5.5 worth the extra cost?

Use GPT-5.5 when failure is expensive: hard coding, professional analysis, and multi-step problems. For routine tasks, GPT-5.4 or GPT-5.4 mini is usually more cost-efficient.

Is GPT-5 cheaper than Claude?

GPT-5.4 mini is cheaper than current Claude 4.5/4.6 routes. GPT-5.4 is cheaper than Claude Sonnet 4.6 on input ($2.50 vs $3.00) but matches it on output ($15). GPT-5.5 matches Claude Opus 4.7 on input ($5) but is more expensive on output ($30 vs $25).

Should I use GPT-5 directly or through a gateway?

Use OpenAI directly for native feature coverage and direct billing. Use a gateway such as TokenMix.ai when your app needs GPT plus Claude, Gemini, DeepSeek, routing, fallback, and centralized cost tracking.