TokenMix Research Lab · 2026-04-05

OpenAI API Pricing 2026: GPT-5.5, Realtime, Image Costs
Last Updated: 2026-04-30
Author: TokenMix Research Lab
Data checked: 2026-04-30
OpenAI API pricing in 2026 is anchored by GPT-5.5 at $5/$30, GPT-5.4 at $2.50/$15, and GPT-5.4 mini at $0.75/$4.50 per 1M input/output tokens. Realtime, image generation, web search, containers, Batch API, and data residency can change the final bill.
OpenAI's official API pricing page lists GPT-5.5, GPT-5.4, GPT-5.4 mini, GPT-realtime-1.5, GPT-image-2, web search, containers, Batch API, priority processing, flex processing, and data residency pricing modifiers. The short version: cache reads are 90% cheaper for the GPT-5.5/5.4 text routes, Batch API saves 50% on inputs and outputs, and data residency adds 10%.
Table of Contents
- Quick Answer
- Confirmed OpenAI Pricing Facts
- GPT Text Model Pricing
- Realtime Pricing
- GPT-image-2 Pricing
- Web Search and Container Costs
- Batch, Flex, Priority, and Data Residency
- Cost per Task
- Monthly Cost Scenarios
- OpenAI vs Claude, Gemini, DeepSeek
- When TokenMix.ai Fits
- Final Recommendation
- FAQ
- Related Articles
- Sources
Quick Answer
| Question | Answer |
|---|---|
| Cheapest listed GPT text model | GPT-5.4 mini at $0.75 input and $4.50 output per 1M tokens. |
| Default OpenAI production model | GPT-5.4 at $2.50 input and $15 output. |
| Premium OpenAI model | GPT-5.5 at $5 input and $30 output. |
| Realtime text price | GPT-realtime-1.5 text is $4 input, $0.40 cached input, and $16 output per 1M tokens. |
| Image generation price | GPT-image-2 image tokens are $8 input, $2 cached input, and $30 output per 1M tokens. |
| Biggest discount | Batch API saves 50%; cached input is 90% cheaper for GPT-5 text routes. |
Confirmed OpenAI Pricing Facts
| Claim | Status | Practical meaning | Source |
|---|---|---|---|
| GPT-5.5 is $5/$30 per 1M tokens | Confirmed | Premium text route. | OpenAI pricing |
| GPT-5.4 is $2.50/$15 per 1M tokens | Confirmed | Default flagship route. | OpenAI pricing |
| GPT-5.4 mini is $0.75/$4.50 per 1M tokens | Confirmed | Budget GPT route. | OpenAI pricing |
| GPT-realtime-1.5 audio is $32 input and $64 output per 1M tokens | Confirmed | Realtime voice is much more expensive than text. | OpenAI pricing |
| GPT-image-2 image tokens are $8 input and $30 output per 1M tokens | Confirmed | Image generation needs separate cost math. | OpenAI pricing |
| Web search is $10 per 1K calls | Confirmed | Search-heavy apps need per-call tracking. | OpenAI pricing |
| Batch API saves 50% | Confirmed | Strongest lever for async jobs. | OpenAI pricing |
| Data residency adds 10% | Confirmed | Enterprise compliance can add a visible premium. | OpenAI pricing |
GPT Text Model Pricing
All prices are per 1M tokens in USD.
| Model | Input | Cached input | Output | Best use |
|---|---|---|---|---|
| GPT-5.4 mini | $0.75 | $0.075 | $4.50 | Budget production, agents, simple coding |
| GPT-5.4 | $2.50 | $0.25 | $15.00 | Default OpenAI production route |
| GPT-5.5 | $5.00 | $0.50 | $30.00 | Hard coding and professional work |
Batch-adjusted text pricing:
| Model | Batch input | Batch output | Savings |
|---|---|---|---|
| GPT-5.4 mini | $0.375 | $2.25 | 50% |
| GPT-5.4 | $1.25 | $7.50 | 50% |
| GPT-5.5 | $2.50 | $15.00 | 50% |
Data residency adjusted text pricing:
| Model | Input with +10% | Cached input with +10% | Output with +10% |
|---|---|---|---|
| GPT-5.4 mini | $0.825 | $0.0825 | $4.95 |
| GPT-5.4 | $2.75 | $0.275 | $16.50 |
| GPT-5.5 | $5.50 | $0.55 | $33.00 |
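As a rough sketch of how the two adjusted tables follow from the standard rates, the Python below scales each listed rate by a single multiplier. The dictionary keys are illustrative labels, not confirmed API model IDs, and scaling cached input by the Batch multiplier is an assumption, since the batch table above lists only input and output.

```python
# Standard per-1M-token rates (USD) from the text pricing table above.
# Keys are illustrative labels, not confirmed API model IDs.
STANDARD = {
    "gpt-5.4-mini": {"input": 0.75, "cached_input": 0.075, "output": 4.50},
    "gpt-5.4": {"input": 2.50, "cached_input": 0.25, "output": 15.00},
    "gpt-5.5": {"input": 5.00, "cached_input": 0.50, "output": 30.00},
}

BATCH_MULTIPLIER = 0.50      # Batch API: 50% off inputs and outputs
RESIDENCY_MULTIPLIER = 1.10  # data residency: +10%


def adjusted_rates(model: str, multiplier: float) -> dict[str, float]:
    """Scale every listed rate for one model by a single multiplier."""
    # Assumption: the multiplier applies uniformly, including to cached input.
    return {kind: round(rate * multiplier, 4) for kind, rate in STANDARD[model].items()}


print(adjusted_rates("gpt-5.4", BATCH_MULTIPLIER))
# {'input': 1.25, 'cached_input': 0.125, 'output': 7.5}
print(adjusted_rates("gpt-5.5", RESIDENCY_MULTIPLIER))
# {'input': 5.5, 'cached_input': 0.55, 'output': 33.0}
```

Keeping the modifiers as named multipliers makes it easy to re-run the tables when OpenAI changes a discount or premium.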
Realtime Pricing
GPT-realtime-1.5 prices audio, text, and image tokens separately. All prices are per 1M tokens in USD.
| Modality | Input | Cached input | Output |
|---|---|---|---|
| Audio | $32.00 | $0.40 | $64.00 |
| Text | $4.00 | $0.40 | $16.00 |
| Image | $5.00 | $0.50 | Not listed |
Cost example:
| Realtime workload | Token shape | Estimated cost |
|---|---|---|
| Text turn | 1K text input / 500 text output | $0.0120 |
| Audio-heavy turn | 1K audio input / 500 audio output | $0.0640 |
| Cached text session | 70% of 10K text input cached / 2K output | $0.0468 |
Realtime pricing is not just "GPT text plus streaming." Audio input and output are a different cost class.
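The realtime arithmetic can be sketched in a few lines of Python. This is a minimal estimate, not an official calculator: it assumes input and output use the same modality within a turn, and `realtime_cost` and `cached_fraction` are hypothetical names introduced here for illustration.

```python
# GPT-realtime-1.5 rates per 1M tokens (USD), from the modality table above.
REALTIME_RATES = {
    "audio": {"input": 32.00, "cached_input": 0.40, "output": 64.00},
    "text": {"input": 4.00, "cached_input": 0.40, "output": 16.00},
}


def realtime_cost(modality, input_tokens, output_tokens, cached_fraction=0.0):
    """Estimate one turn's cost when input and output share a modality."""
    rates = REALTIME_RATES[modality]
    cached = input_tokens * cached_fraction   # tokens billed at the cached rate
    uncached = input_tokens - cached          # tokens billed at the full input rate
    total = (
        uncached * rates["input"]
        + cached * rates["cached_input"]
        + output_tokens * rates["output"]
    ) / 1_000_000
    return round(total, 6)


print(realtime_cost("text", 1_000, 500))    # 0.012
print(realtime_cost("audio", 1_000, 500))   # 0.064
# 7K cached ($0.0028) + 3K uncached ($0.012) + 2K output ($0.032)
print(realtime_cost("text", 10_000, 2_000, cached_fraction=0.7))  # 0.0468
```

A real voice session usually mixes modalities (audio in, text and audio out), so a production estimate would sum one call per modality.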
GPT-image-2 Pricing
OpenAI lists GPT-image-2 as the state-of-the-art image generation model. Prices below are per 1M tokens in USD.
| Token type | Input | Cached input | Output |
|---|---|---|---|
| Image tokens | $8.00 | $2.00 | $30.00 |
| Text tokens | $5.00 | $1.25 | Not listed |
Cost notes:
| Scenario | Cost driver | Practical meaning |
|---|---|---|
| Prompt-heavy image generation | Text input | Long prompts are not free. |
| Reference-image workflows | Image input | Input images are priced separately from text. |
| Iterative image edits | Cached input can help | Repeated context may reduce input cost. |
| High-volume generation | Output tokens dominate | Track output token volume carefully. |
For a focused image model breakdown, see the GPT-image-2 pricing article listed under Related Articles.
Web Search and Container Costs
OpenAI tool pricing can dominate the bill when apps use tools heavily.
| Tool | Public price | Cost risk |
|---|---|---|
| Web search | $10 per 1K calls | Search-heavy agents can add large per-call spend. |
| Search content tokens | Free on pricing page | Still track call count. |
| Containers | $0.03 (1GB) to $1.92 (64GB) per 20-minute session | Code/tool sessions can become a separate infrastructure bill. |
Tool-heavy agent cost example:
| Workflow | Model tokens | Tool calls | Total driver |
|---|---|---|---|
| Search answer | 3K GPT-5.4 tokens | 1 web search | The $0.01 search call can exceed token cost. |
| Code execution | 10K GPT-5.4 tokens | 1 container session | Container session adds fixed runtime cost. |
| Research agent | 20K GPT-5.5 tokens | 5 web searches | Both premium tokens and search calls matter. |
Do not forecast OpenAI spend from model tokens alone if you use tools.
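One way to keep tool spend visible is to split each workflow's estimate into token and tool components. The sketch below is illustrative only: `workflow_cost` is a hypothetical helper, the model keys are labels rather than confirmed API IDs, the container price assumes the 1GB tier, and the input/output split is assumed because the table above gives only total tokens.

```python
# Per-1M-token rates (USD) and tool prices taken from the tables above.
TOKEN_RATES = {
    "gpt-5.4": {"input": 2.50, "output": 15.00},
    "gpt-5.5": {"input": 5.00, "output": 30.00},
}
WEB_SEARCH_PER_CALL = 10.00 / 1_000   # $10 per 1K calls
CONTAINER_1GB_SESSION = 0.03          # assumes the 1GB tier, per 20-minute session


def workflow_cost(model, input_tokens, output_tokens, searches=0, container_sessions=0):
    """Split one workflow's estimated cost into token and tool components."""
    rates = TOKEN_RATES[model]
    tokens = (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000
    tools = searches * WEB_SEARCH_PER_CALL + container_sessions * CONTAINER_1GB_SESSION
    return {
        "tokens": round(tokens, 6),
        "tools": round(tools, 6),
        "total": round(tokens + tools, 6),
    }


# Research agent row: 20K GPT-5.5 tokens and 5 searches.
# The 16K input / 4K output split is a hypothetical assumption.
print(workflow_cost("gpt-5.5", 16_000, 4_000, searches=5))
# {'tokens': 0.2, 'tools': 0.05, 'total': 0.25}
```

Under these assumptions, search calls account for a fifth of the research agent's cost, which token-only forecasts would miss entirely.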
Batch, Flex, Priority, and Data Residency
| Mode or modifier | Price effect | Best for |
|---|---|---|
| Standard | Listed price | Normal live requests |
| Batch API | 50% off inputs and outputs | Offline jobs, evaluation, enrichment |
| Flex processing | Lower cost for slower, lower-priority work | Non-production or non-urgent requests |
| Priority processing | Priced above Standard for higher performance | Latency-sensitive production |
| Data residency | +10% | Enterprise or regional processing requirements |
The practical order:
| Constraint | First lever |
|---|---|
| User waits for answer | Cache repeated context. |
| User does not wait | Batch. |
| Job is low priority | Flex. |
| Compliance requires region | Add data residency premium. |
| Quality failure is expensive | Escalate model. |
Cost per Task
These examples use Standard pricing, no cache, no tools.
| Task | Token shape | GPT-5.4 mini | GPT-5.4 | GPT-5.5 |
|---|---|---|---|---|
| Simple chat reply | 500 input / 200 output | $0.001275 | $0.004250 | $0.008500 |
| Support answer | 2K input / 800 output | $0.005100 | $0.017000 | $0.034000 |
| RAG answer | 8K input / 500 output | $0.008250 | $0.027500 | $0.055000 |
| Code review | 20K input / 3K output | $0.028500 | $0.095000 | $0.190000 |
| Long analysis | 100K input / 10K output | $0.120000 | $0.400000 | $0.800000 |
Same tasks with Batch:
| Task | GPT-5.4 mini batch | GPT-5.4 batch | GPT-5.5 batch |
|---|---|---|---|
| Simple chat reply | $0.000638 | $0.002125 | $0.004250 |
| Support answer | $0.002550 | $0.008500 | $0.017000 |
| RAG answer | $0.004125 | $0.013750 | $0.027500 |
| Code review | $0.014250 | $0.047500 | $0.095000 |
| Long analysis | $0.060000 | $0.200000 | $0.400000 |
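The per-task figures in both tables come from one formula: tokens times the per-1M rate, divided by one million, then halved for Batch. A minimal sketch, assuming Standard pricing with no cache or tools; `task_cost` and the model keys are hypothetical names for illustration.

```python
RATES = {  # USD per 1M tokens: (input, output)
    "gpt-5.4-mini": (0.75, 4.50),
    "gpt-5.4": (2.50, 15.00),
    "gpt-5.5": (5.00, 30.00),
}


def task_cost(model, input_tokens, output_tokens, batch=False):
    """Cost of one request in USD; batch=True applies the 50% Batch API discount."""
    in_rate, out_rate = RATES[model]
    cost = (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
    return cost * 0.5 if batch else cost


for name, tokens_in, tokens_out in [("Support answer", 2_000, 800), ("Code review", 20_000, 3_000)]:
    standard = task_cost("gpt-5.4", tokens_in, tokens_out)
    batched = task_cost("gpt-5.4", tokens_in, tokens_out, batch=True)
    print(f"{name}: ${standard:.6f} standard, ${batched:.6f} batch")
# Support answer: $0.017000 standard, $0.008500 batch
# Code review: $0.095000 standard, $0.047500 batch
```

Swapping in your own token shapes is usually more informative than the generic rows, since output length drives most of the cost at these rates.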
Monthly Cost Scenarios
Assume 100M input tokens and 30M output tokens per month.
| Model | No cache | 70% input cache | Batch only | Batch plus 70% cache estimate |
|---|---|---|---|---|
| GPT-5.4 mini | $210.00 | $162.75 | $105.00 | $81.38 |
| GPT-5.4 | $700.00 | $542.50 | $350.00 | $271.25 |
| GPT-5.5 | $1,400.00 | $1,085.00 | $700.00 | $542.50 |
Add data residency:
| Model | No cache with +10% data residency | 70% cache with +10% |
|---|---|---|
| GPT-5.4 mini | $231.00 | $179.03 |
| GPT-5.4 | $770.00 | $596.75 |
| GPT-5.5 | $1,540.00 | $1,193.50 |
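The monthly scenarios can be reproduced with a small function. This is a sketch under stated assumptions, not OpenAI's billing logic: it assumes the Batch discount also applies to cached input (the basis of the "estimate" column above) and that the data residency premium applies to the whole token bill. `monthly_cost` and its parameters are hypothetical names.

```python
def monthly_cost(in_rate, cached_rate, out_rate, input_m, output_m,
                 cache_share=0.0, batch=False, data_residency=False):
    """Monthly USD cost; token volumes are in millions, rates are per 1M tokens."""
    cost = (
        input_m * (1 - cache_share) * in_rate    # uncached input
        + input_m * cache_share * cached_rate    # cached input
        + output_m * out_rate                    # output
    )
    if batch:
        cost *= 0.50   # assumption: the 50% discount also covers cached input
    if data_residency:
        cost *= 1.10   # assumption: +10% applies to the whole token bill
    return round(cost, 2)


# GPT-5.4: $2.50 input, $0.25 cached input, $15 output; 100M in / 30M out
gpt54 = (2.50, 0.25, 15.00, 100, 30)
print(monthly_cost(*gpt54))                                        # 700.0
print(monthly_cost(*gpt54, cache_share=0.7))                       # 542.5
print(monthly_cost(*gpt54, cache_share=0.7, batch=True))           # 271.25
print(monthly_cost(*gpt54, cache_share=0.7, data_residency=True))  # 596.75
```

Because output tokens are never cached, a 70% input cache trims the GPT-5.4 bill by only about 22%, while Batch halves it.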
OpenAI vs Claude, Gemini, DeepSeek
| Model | Input | Cached input | Output | Main role |
|---|---|---|---|---|
| GPT-5.4 mini | $0.75 | $0.075 | $4.50 | Budget OpenAI route |
| GPT-5.4 | $2.50 | $0.25 | $15.00 | OpenAI default |
| GPT-5.5 | $5.00 | $0.50 | $30.00 | Premium OpenAI route |
| Claude Sonnet 4.6 | $3.00 | $0.30 | $15.00 | Balanced Claude route |
| Claude Opus 4.7 | $5.00 | $0.50 | $25.00 | Premium Claude route |
| Gemini 3.1 Pro | $2.00 | $0.20 | $12.00 | Premium Gemini route (prompts under 200K tokens) |
| DeepSeek V4 Flash | $0.14 (cache miss) | $0.0028 (cache hit) | $0.28 | Lowest-cost text route |
OpenAI is strongest when ecosystem compatibility, GPT quality, realtime, image, and tool integrations matter. It is not the lowest-cost text route in 2026.
When TokenMix.ai Fits
TokenMix.ai fits when OpenAI is one route in a larger AI API policy.
| Need | Direct OpenAI | TokenMix.ai unified API |
|---|---|---|
| Native OpenAI tools | Best path | Use when compatible |
| OpenAI plus Claude/Gemini/DeepSeek | Multiple integrations | One OpenAI-compatible access layer |
| Cost-aware routing | Build yourself | Centralized model policy |
| Fallback | Build yourself | Route across providers |
| Payment flexibility | OpenAI billing path | Useful when direct billing is hard |
See AI API Pricing 2026 for cross-provider decisions and the OpenAI-Compatible API Gateway guide for integration strategy; both are listed under Related Articles.
Final Recommendation
Use GPT-5.4 mini for cheap OpenAI work, GPT-5.4 as the default production route, GPT-5.5 only for hard tasks, Batch for async jobs, and cache for repeated context. Add tool and data-residency costs before forecasting monthly spend.
FAQ
How much does the OpenAI GPT-5.5 API cost?
GPT-5.5 costs $5 input, $0.50 cached input, and $30 output per 1M tokens on OpenAI's public pricing page checked on 2026-04-30.
How much does GPT-5.4 cost?
GPT-5.4 costs $2.50 input, $0.25 cached input, and $15 output per 1M tokens. It is the current default OpenAI flagship route.
How much does GPT-5.4 mini cost?
GPT-5.4 mini costs $0.75 input, $0.075 cached input, and $4.50 output per 1M tokens. It is the cheapest GPT text route listed on the current public pricing page.
Does OpenAI Batch API save money?
Yes. OpenAI states that Batch API saves 50% on inputs and outputs. It is best for async work such as evaluation, summarization, labeling, and enrichment.
How much does OpenAI realtime audio cost?
GPT-realtime-1.5 audio costs $32 input, $0.40 cached input, and $64 output per 1M tokens. Realtime audio is a separate cost class from text.
How much does GPT-image-2 cost?
GPT-image-2 image tokens cost $8 input, $2 cached input, and $30 output per 1M tokens. Text input to GPT-image-2 costs $5 input and $1.25 cached input.
Does OpenAI web search charge per token?
OpenAI's pricing page lists web search at $10 per 1,000 calls and says search content tokens are free. Track search calls separately from model tokens.
Is OpenAI cheaper than Claude or Gemini?
It depends on model and workload. GPT-5.4 mini is cheaper than Claude Sonnet and Gemini Pro, but GPT-5.5 output is more expensive than Claude Opus and Gemini Pro. DeepSeek V4 Flash is far cheaper on raw text tokens.
Related Articles
- AI API Pricing 2026: 16 Models, Cache, Batch, Routing Hub
- GPT-5 API Pricing 2026: 5.5, 5.4, Mini Costs, Batch Math
- GPT-image-2 Pricing 2026: Image Tokens, Output Cost, Routing
- Claude API Pricing 2026: Opus, Sonnet, Haiku Costs Compared
- Google Gemini API Pricing 2026: 3.1 Pro, Flash, Batch Costs
- DeepSeek API Pricing 2026: V4 Costs, Cache Hits, R1 Changes
- OpenAI-Compatible API Gateway: 9 Providers, One SDK Guide
- OpenAI API No Credit Card 2026: 5 Legal Ways To Get Access
- OpenAI API With Alipay 2026: 4 Legal Payment Routes Guide