TokenMix Research Lab · 2026-04-24

GPT-4o API: Access, Pricing, Code Examples 2026

GPT-4o is OpenAI's legacy flagship multimodal model, still production-available in April 2026 despite being superseded by GPT-5 and GPT-5.4. Pricing holds at $2.50 input / $10 output per MTok, with a 128K context window and text + image + audio input/output support. This guide covers complete setup (Python, TypeScript, curl), pricing at three scales, code examples for image generation (now handled separately via gpt-image-2), vision input, and the audio/realtime variants. Plus: when GPT-4o is still the right pick versus migrating to GPT-5.4 ($2.50/$15, better quality) or GPT-4.1 ($2/$8, 1M context). TokenMix.ai routes GPT-4o through an OpenAI-compatible endpoint.


Confirmed vs Speculation

| Claim | Status | Source |
|---|---|---|
| GPT-4o still production-available | Confirmed | OpenAI models |
| Pricing $2.50/$10 per MTok | Confirmed | Pricing page |
| 128K context window | Confirmed | |
| Image generation built into GPT-4o | No (separate gpt-image-2 model) | |
| Vision (image input) native | Yes | |
| Audio realtime variant available | Yes (gpt-4o-realtime-preview) | |
| Deprecation announced | No, still supported | |

Snapshot note (2026-04-24): GPT-4o pricing ($2.50/$10) and model availability are current as of this snapshot. OpenAI typically keeps legacy flagships available ~18 months after they are superseded, but specific dates aren't committed. For new projects starting today, start on GPT-5.4 (same input price, better quality) or evaluate GPT-5.5, which launched April 23, 2026; GPT-4o is primarily relevant for avoiding a migration on existing production workloads.

GPT-4o API Setup

1. Get an OpenAI API key at platform.openai.com:

2. Set env var:

export OPENAI_API_KEY="sk-proj-..."

3. First call:

from openai import OpenAI
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

Pricing Tiers

| Model | Input $/MTok | Output $/MTok | Best for |
|---|---|---|---|
| gpt-4o | $2.50 | $10.00 | Legacy production |
| gpt-4o-mini | $0.15 | $0.60 | Budget general chat |
| gpt-4o-realtime-preview | $0.06/min audio in | $0.24/min audio out (+$5/$20 per MTok text) | Voice agents |
| gpt-4o-audio-preview | $0.06/min audio in | $0.24/min audio out | Async audio tasks |
| gpt-4o-transcribe | $0.006/min audio | n/a | STT async |

GPT-4o-mini at $0.15/$0.60 is 17× cheaper than gpt-4o for comparable chat quality — most production migrations go mini → full only when needed.
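The per-MTok numbers above translate directly into monthly spend. A minimal sketch using GPT-4o's $2.50/$10 list prices (the three traffic scales are made-up examples):

```python
# $ per 1M tokens (input, output), from the pricing table above
PRICES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
}

def monthly_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """Dollar cost for input_mtok / output_mtok million tokens per month."""
    price_in, price_out = PRICES[model]
    return input_mtok * price_in + output_mtok * price_out

# Three example scales (millions of tokens per month): small, mid, large
for mtok_in, mtok_out in [(10, 3), (100, 30), (1000, 300)]:
    full = monthly_cost("gpt-4o", mtok_in, mtok_out)
    mini = monthly_cost("gpt-4o-mini", mtok_in, mtok_out)
    print(f"{mtok_in}M in / {mtok_out}M out: gpt-4o ${full:,.2f} vs mini ${mini:,.2f}")
```

At every scale the ratio holds near 17×, which is why the usual pattern is to start on mini and upgrade only the requests that need full gpt-4o quality.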

Code Examples: Python + TypeScript + curl

Python:

from openai import OpenAI
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are concise."},
        {"role": "user", "content": "Explain quantum entanglement."}
    ],
    temperature=0.7,
    max_tokens=500
)
print(response.choices[0].message.content)

TypeScript:

import OpenAI from "openai";
const client = new OpenAI();

const response = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: "Hello!" }]
});
console.log(response.choices[0].message.content);

curl:

curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role":"user","content":"Hello"}]
  }'

Vision Input

Send images via URL or base64:

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {
                "url": "https://example.com/image.jpg"
            }}
        ]
    }]
)

For base64:

{"image_url": {"url": f"data:image/jpeg;base64,{base64_string}"}}

Max image size: 20 MB. Images are automatically resized to fit the model's visual processing limits.

Image Generation: Via gpt-image-2

GPT-4o itself doesn't generate images — use separate gpt-image-2 model:

response = client.images.generate(
    model="gpt-image-2",
    prompt="A serene mountain landscape at sunset",
    size="1024x1024",
    quality="hd"
)
image_url = response.data[0].url

Pricing: ~$0.04/image standard, $0.08/image HD. See GPT Image 2 developer guide.

When to Migrate to GPT-5.4

| Your situation | Stay on 4o or migrate? |
|---|---|
| Existing production on gpt-4o, quality acceptable | Stay |
| Want better quality at similar cost | Migrate to gpt-5.4 ($2.50/$15) |
| Need 1M context | Migrate to gpt-4.1 ($2/$8) |
| Real-time voice agent | Keep gpt-4o-realtime-preview (still best for voice) |
| Coding agent | Migrate to gpt-5.1-codex or Claude Opus 4.7 |
| High-volume chat | Stay on gpt-4o-mini ($0.15, dirt cheap) |
| Classification / batch | Migrate to gpt-5.4-nano ($0.05) |

See All ChatGPT Models Compared for complete family overview.

FAQ

Is GPT-4o deprecated?

Not yet. Still supported through at least Q2 2027 based on OpenAI's historical deprecation timelines. For new work, start with GPT-5.4 — quality is meaningfully better at the same input price.

What's the max token limit for GPT-4o?

Input: 128K tokens (~100K words). Output: 16K tokens per response. For larger contexts, use GPT-4.1 (1M) or Claude (200K native, 1M extended). See GPT-4.1 vs 4o comparison.
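To avoid hitting that window at request time, you can count tokens client-side with OpenAI's tiktoken library. A sketch; the chars/4 fallback is our own rough approximation, not an official rule:

```python
def count_tokens(text: str) -> int:
    """Token count for gpt-4o; falls back to a crude chars/4 heuristic
    if tiktoken isn't installed."""
    try:
        import tiktoken
        enc = tiktoken.encoding_for_model("gpt-4o")  # o200k_base tokenizer
        return len(enc.encode(text))
    except Exception:
        return max(1, len(text) // 4)  # rough approximation only

prompt = "Explain quantum entanglement. " * 1000
if count_tokens(prompt) > 120_000:  # leave headroom under the 128K window
    raise ValueError("Prompt too large for gpt-4o; consider GPT-4.1 (1M context)")
```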

Can I stream GPT-4o responses?

Yes:

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Does GPT-4o support function calling?

Yes, native tool use. Same OpenAI tool schema as GPT-5.x. Compatible with all major agent frameworks.
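A minimal sketch of that tool schema in practice. `get_weather` and `ask_with_tools` are hypothetical names of ours; the schema shape and the `tools=` parameter are the standard Chat Completions interface:

```python
import json

# Hypothetical local function the model may ask us to call
def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stand-in for a real weather lookup

# Standard OpenAI tool schema describing get_weather
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def ask_with_tools(prompt: str) -> str:
    from openai import OpenAI  # deferred import: schema code above works without the SDK
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        tools=[WEATHER_TOOL],
    )
    msg = resp.choices[0].message
    if msg.tool_calls:  # model chose to call our function
        args = json.loads(msg.tool_calls[0].function.arguments)
        return get_weather(**args)
    return msg.content
```

In a full agent loop you would append the tool result as a `role: "tool"` message and call the model again; this sketch stops at the first tool call for brevity.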

What's the specific gpt-4o-2024-08-06 version?

A versioned snapshot for pinning. Useful for production reproducibility: it guarantees the exact model weights don't change underneath you. Slightly more restrictive than the floating gpt-4o alias; use it for compliance-sensitive workloads.

Is GPT-4o the same as ChatGPT?

GPT-4o is the API model. ChatGPT is the consumer product (uses GPT-4o or GPT-5 family depending on user tier). API and consumer product are billed separately.

Can I fine-tune GPT-4o?

Yes, fine-tuning is supported via OpenAI's FT API. Training cost: $3 per 1M training tokens. Inference on fine-tuned versions runs $3 per 1M input + $12 per 1M output (a 20% premium over base).
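At $3 per 1M training tokens, budgeting a run is simple arithmetic. A sketch assuming the common billing model of dataset tokens × epochs (the dataset size is an example input):

```python
TRAIN_PRICE_PER_MTOK = 3.00  # $ per 1M training tokens, per the FAQ above

def ft_training_cost(dataset_tokens: int, epochs: int = 3) -> float:
    """Estimated training cost: each epoch re-bills the full dataset."""
    return dataset_tokens * epochs * TRAIN_PRICE_PER_MTOK / 1_000_000

# e.g. a 2M-token dataset trained for the default 3 epochs
print(f"${ft_training_cost(2_000_000):,.2f}")  # → $18.00
```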

