TokenMix Research Lab · 2026-04-24
GPT-4o API: Access, Pricing, Code Examples 2026
Last Updated: 2026-04-24
Author: TokenMix Research Lab
GPT-4o is OpenAI's legacy flagship multimodal model. It is still production-available in April 2026 despite being superseded by GPT-5 and GPT-5.4. Pricing holds at $2.50 input / $10 output per MTok, with a 128K context window and support for text, image, and audio. This guide covers setup (Python, TypeScript, curl), pricing tiers, code examples for vision input and image generation (now handled separately by gpt-image-2), and the audio/realtime variants. It also covers when GPT-4o is still the right pick versus migrating to GPT-5.4 ($2.50/$15, better quality) or GPT-4.1 ($2/$8, 1M context). TokenMix.ai routes GPT-4o through an OpenAI-compatible endpoint.
Table of Contents
- Confirmed vs Speculation
- GPT-4o API Setup
- Pricing Tiers
- Code Examples: Python + TypeScript + curl
- Vision Input
- Image Generation: Via gpt-image-2
- When to Migrate to GPT-5.4
- FAQ
Confirmed vs Speculation
| Claim | Status | Source |
|---|---|---|
| GPT-4o still production-available | Confirmed | OpenAI models |
| Pricing $2.50/$10 per MTok | Confirmed | Pricing page |
| 128K context window | Confirmed | |
| Image generation built into GPT-4o | No (separate gpt-image-2 model) | |
| Vision (image input) native | Yes | |
| Audio realtime variant available | Yes (gpt-4o-realtime-preview) | |
| Deprecation announced | No (still supported) | |
Snapshot note (2026-04-24): GPT-4o pricing ($2.50/$10) and model availability are current as of this snapshot. OpenAI typically keeps legacy flagships available for about 18 months after they are superseded, but it has not committed to specific dates. For new projects, start on GPT-5.4 (same input price, better quality) or evaluate GPT-5.5, which launched April 23, 2026. GPT-4o is mainly relevant for existing production systems that want to avoid a migration.
GPT-4o API Setup
1. Get OpenAI API key:
- Sign up at platform.openai.com
- Add billing + credit card
- API keys → Create new secret key
- Save immediately (shown once)
2. Set env var:
export OPENAI_API_KEY="sk-proj-..."
3. First call:
from openai import OpenAI

# OpenAI() reads OPENAI_API_KEY from the environment.
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
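Because the client picks the key up implicitly from the environment variable set in step 2, a missing variable only surfaces when the client is constructed. A minimal sketch of an explicit check follows; the helper name `require_api_key` and the placeholder key are illustrative assumptions, not part of the OpenAI SDK.

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key from `env`, or raise a clear error if it is missing."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating the client")
    return key


# Demo with a fake environment mapping so the sketch runs without a real key.
demo_key = require_api_key({"OPENAI_API_KEY": "sk-proj-example"})
print("key loaded:", demo_key[:8] + "...")
```

In a real script, calling `require_api_key()` with no arguments before `OpenAI()` would give a readable failure at startup instead of an SDK exception later.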
Pricing Tiers
| Model | Input | Output | Best for |
|---|---|---|---|
| gpt-4o | $2.50/MTok | $10.00/MTok | Legacy production |
| gpt-4o-mini | $0.15/MTok | $0.60/MTok | Budget general chat |
| gpt-4o-realtime-preview | $0.06/min audio (+$5/MTok text) | $0.24/min audio (+$20/MTok text) | Voice agents |
| gpt-4o-audio-preview | $0.06/min audio | $0.24/min audio | Async audio tasks |
| gpt-4o-transcribe | $0.006/min audio | | Async speech-to-text |
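gpt-4o-mini at $0.15/$0.60 is roughly 17× cheaper than gpt-4o on both input ($2.50 vs $0.15) and output ($10 vs $0.60). For many chat workloads its quality is close enough that teams start on mini and move to full gpt-4o only when a task needs it.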
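To make the per-MTok prices concrete, here is a rough sketch that turns them into monthly bills at three scales. The prices come from the table above; the three token volumes are hypothetical and only there to illustrate the arithmetic.

```python
# USD per million tokens (MTok), copied from the pricing table above.
PRICES = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}

# Hypothetical monthly volumes: (input tokens, output tokens).
SCALES = {
    "prototype": (10_000_000, 2_000_000),
    "mid-size": (500_000_000, 100_000_000),
    "high-volume": (10_000_000_000, 2_000_000_000),
}


def monthly_cost(model, input_tokens, output_tokens):
    """Cost in USD for the given token counts at the listed per-MTok prices."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


for name, (tokens_in, tokens_out) in SCALES.items():
    full = monthly_cost("gpt-4o", tokens_in, tokens_out)
    mini = monthly_cost("gpt-4o-mini", tokens_in, tokens_out)
    print(f"{name}: gpt-4o ${full:,.2f} vs gpt-4o-mini ${mini:,.2f}")
```

At the assumed prototype volume this works out to about $45 per month on gpt-4o versus about $2.70 on gpt-4o-mini; the ratio stays the same at every scale because both prices differ by the same factor.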
Code Examples: Python + TypeScript + curl
Python:
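from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        # The system message sets behavior; the user message carries the request.
        {"role": "system", "content": "You are concise."},
        {"role": "user", "content": "Explain quantum entanglement."}
    ],
    temperature=0.7,
    max_tokens=500  # cap on generated tokens
)
print(response.choices[0].message.content)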
TypeScript:
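import OpenAI from "openai";

// Reads OPENAI_API_KEY from the environment by default.
const client = new OpenAI();

// Top-level await requires an ES module context.
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);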
curl:
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role":"user","content":"Hello"}]
  }'
Vision Input
Send images via URL or base64:
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {
                "url": "https://example.com/image.jpg"
            }}
        ]
    }]
)
For base64, use a data URL in the same content part:
{"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_string}"}}
Max image size: 20MB. Images are resized automatically to fit the model's visual processing.
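The base64 variant needs a data URL built from the raw file bytes. The following is a minimal sketch using only the standard library; the helper names `image_to_data_url` and `image_part` are illustrative, and treating the 20MB limit as 20 × 1024 × 1024 bytes is an assumption.

```python
import base64
import mimetypes
from pathlib import Path

# Assumption: the 20MB limit quoted above is interpreted as 20 * 1024 * 1024 bytes.
MAX_IMAGE_BYTES = 20 * 1024 * 1024


def image_to_data_url(path):
    """Read a local image file and return it as a base64 data URL."""
    data = Path(path).read_bytes()
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError(f"{path} is larger than {MAX_IMAGE_BYTES} bytes")
    mime, _ = mimetypes.guess_type(str(path))
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"{path} does not look like an image file")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def image_part(path):
    """Build the content part that goes next to the text part in a user message."""
    return {"type": "image_url", "image_url": {"url": image_to_data_url(path)}}
```

The dictionary returned by `image_part("photo.jpg")` can replace the URL-based `image_url` entry in the request shown earlier.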
Image Generation: Via gpt-image-2
GPT-4o itself doesn't generate images. Use the separate gpt-image-2 model:
response = client.images.generate(
    model="gpt-image-2",
    prompt="A serene mountain landscape at sunset",
    size="1024x1024",
    quality="hd"
)
# Assumes the response carries a hosted URL; some image models return base64 in data[0].b64_json instead.
image_url = response.data[0].url
Pricing: ~$0.04/image standard, $0.08/image HD. See GPT Image 2 developer guide.
When to Migrate to GPT-5.4
| Your situation | Stay on 4o or migrate? |
|---|---|
| Existing production on gpt-4o, quality acceptable | Stay |
| Want better quality at the same input price | Migrate to gpt-5.4 ($2.50/$15, output price 50% higher than 4o) |
| Need 1M context | Migrate to gpt-4.1 ($2/$8) |
| Real-time voice agent | Keep gpt-4o-realtime-preview (still best for voice) |
| Coding agent | Migrate to gpt-5.1-codex or Claude Opus 4.7 |
| High-volume chat | Stay on gpt-4o-mini (dirt cheap at $0.15/MTok input) |
| Classification / batch | Migrate to gpt-5.4-nano ($0.05) |
See All ChatGPT Models Compared for complete family overview.
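How much a migration changes the bill depends on the share of output tokens in your traffic. The sketch below estimates the relative cost change versus gpt-4o, using the prices quoted in this guide; the function names and the output shares in the loop are illustrative assumptions.

```python
# USD per MTok as (input, output), as quoted in this guide.
PRICES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-5.4": (2.50, 15.00),
    "gpt-4.1": (2.00, 8.00),
}


def blended_price(model, output_share):
    """Average USD per MTok when `output_share` of all tokens are output tokens."""
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    input_price, output_price = PRICES[model]
    return input_price * (1 - output_share) + output_price * output_share


def change_vs_gpt4o(model, output_share):
    """Relative cost change against gpt-4o (0.25 means 25% more expensive)."""
    base = blended_price("gpt-4o", output_share)
    return (blended_price(model, output_share) - base) / base


# Hypothetical traffic mixes: 10%, 25%, and 50% of tokens are output.
for share in (0.10, 0.25, 0.50):
    for model in ("gpt-5.4", "gpt-4.1"):
        print(f"output share {share:.0%}: {model} {change_vs_gpt4o(model, share):+.1%}")
```

With these prices, gpt-4.1 comes out 20% cheaper than gpt-4o at any mix, while the gpt-5.4 premium grows from 0% for input-only traffic toward 50% as output dominates.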
FAQ
Is GPT-4o deprecated?
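Not yet, and no deprecation date has been announced. Extrapolating from OpenAI's historical pattern of keeping legacy flagships available for roughly 18 months after they are superseded, support through about Q2 2027 is a reasonable estimate, not a commitment. For new work, start with GPT-5.4: quality is meaningfully better at the same input price.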
What's the max token limit for GPT-4o?
The context window is 128K tokens (~100K words), shared between input and output. Output is capped at 16K tokens per response. For larger contexts, use GPT-4.1 (1M) or Claude (200K native, 1M extended). See GPT-4.1 vs 4o comparison.
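A quick way to reason about these limits is to reserve room for the reply and see what is left for the prompt. This sketch assumes the 128K window is shared between prompt and completion, and reads "128K" as 128,000 tokens and "16K" as 16,384 tokens; both constants and the function names are assumptions for illustration.

```python
CONTEXT_WINDOW = 128_000    # assumption: "128K" read as 128,000 tokens
MAX_OUTPUT_TOKENS = 16_384  # assumption: "16K" read as 16,384 tokens


def max_prompt_tokens(reserved_output=MAX_OUTPUT_TOKENS):
    """Tokens left for the prompt after reserving space for the reply."""
    if not 0 < reserved_output <= MAX_OUTPUT_TOKENS:
        raise ValueError("reserved_output must be between 1 and the output cap")
    return CONTEXT_WINDOW - reserved_output


def fits(prompt_tokens, reserved_output=MAX_OUTPUT_TOKENS):
    """True if a prompt of this size leaves room for the reserved reply."""
    return prompt_tokens <= max_prompt_tokens(reserved_output)


print(max_prompt_tokens())     # prompt budget when reserving the full output cap
print(max_prompt_tokens(500))  # prompt budget when only a short reply is needed
```

Reserving fewer output tokens (matching the `max_tokens` you actually send) frees more of the window for the prompt.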
Can I stream GPT-4o responses?
Yes:
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    stream=True
)
for chunk in stream:
    # Skip chunks without choices or without a text delta.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
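If you need the full reply as a string in addition to printing it live, the deltas can be accumulated as they arrive. The sketch below is testable offline: `fake_chunk` is a hypothetical stand-in that only mimics the attribute shape of a streamed chunk (`choices[0].delta.content`), so no API call is made.

```python
from types import SimpleNamespace


def collect_stream(stream):
    """Join the text deltas of a chat-completions stream into one string."""
    parts = []
    for chunk in stream:
        if not chunk.choices:  # some chunks may carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # role-only and final chunks have no text
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in with the same attribute shape as a streamed chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo_stream = [
    fake_chunk("Hel"),
    fake_chunk(None),
    SimpleNamespace(choices=[]),
    fake_chunk("lo"),
]
print(collect_stream(demo_stream))
```

Passing the real `stream` object from the call above to `collect_stream` should work the same way, since it iterates over chunks with that shape.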
Does GPT-4o support function calling?
Yes, native tool use. Same OpenAI tool schema as GPT-5.x. Compatible with all major agent frameworks.
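As a rough illustration of the tool schema, the sketch below defines one tool in the Chat Completions `tools` format and dispatches a call locally. The `get_weather` tool, its canned result, and the `run_tool_call` helper are hypothetical examples, not part of any library.

```python
import json

# One tool definition in the Chat Completions "tools" format.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]


def get_weather(city):
    """Hypothetical local implementation; a real one would query a weather service."""
    return {"city": city, "forecast": "sunny"}


HANDLERS = {"get_weather": get_weather}


def run_tool_call(name, arguments_json):
    """Run one tool call: the model supplies the name and a JSON string of arguments."""
    if name not in HANDLERS:
        raise KeyError(f"unknown tool: {name}")
    arguments = json.loads(arguments_json)
    return json.dumps(HANDLERS[name](**arguments))


# With the SDK, pass tools=tools to client.chat.completions.create(...). Each entry in
# response.choices[0].message.tool_calls exposes .function.name and .function.arguments;
# the string returned here goes back to the model in a {"role": "tool", ...} message.
print(run_tool_call("get_weather", '{"city": "Paris"}'))
```

Keeping the handlers in a plain dictionary makes it easy to reject tool names the model invents instead of executing them.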
What is the gpt-4o-2024-08-06 version?
It is a dated snapshot of GPT-4o. Pinning to it is useful for production reproducibility: the model behind a snapshot name stays fixed, whereas the floating gpt-4o alias can be repointed to a newer snapshot. The trade-off is that a pinned snapshot does not pick up later improvements automatically. Use it for compliance-sensitive workloads.
Is GPT-4o the same as ChatGPT?
GPT-4o is the API model. ChatGPT is the consumer product (uses GPT-4o or GPT-5 family depending on user tier). API and consumer product are billed separately.
Can I fine-tune GPT-4o?
Yes. Fine-tuning is supported via OpenAI's fine-tuning API. Training costs $3 per MTok of training data. Inference on fine-tuned versions costs $3 input / $12 output per MTok, a 20% premium over the base $2.50/$10.
Sources
- OpenAI GPT-4o Documentation
- OpenAI API Pricing
- GPT-4.1 vs 4o — TokenMix
- All ChatGPT Models — TokenMix
- GPT-4o Realtime Audio — TokenMix
- GPT-4o Transcribe — TokenMix