Anthropic API Pricing in 2026: Claude Opus, Sonnet, Haiku — Every Model, Every Discount, Real Costs
TokenMix Research Lab · 2026-04-04

Anthropic's Claude API runs three pricing tiers in 2026: Haiku 4.5 at $1/$5, Sonnet 4.6 at $3/$15, and Opus 4.6 at $5/$25 per million tokens. But the list price is just the starting point. Prompt caching cuts input by 90%. Batch processing halves everything. Stack both and Sonnet input drops to $0.15/M — cheaper than [DeepSeek V4](https://tokenmix.ai/blog/deepseek-api-pricing). This is the complete Anthropic API pricing reference: every model, every discount, every hidden surcharge, with real production cost scenarios. All data from [Anthropic's official pricing page](https://platform.claude.com/docs/en/about-claude/pricing) and cross-referenced by [TokenMix.ai](https://tokenmix.ai), April 2026.
Table of Contents
- [Complete Anthropic API Pricing Table]
- [Anthropic Prompt Caching Pricing: 90% Off Input]
- [Anthropic Batch API Pricing: 50% Off Everything]
- [Stacking Anthropic Discounts: Up to 95% Savings]
- [Long-Context and Data Residency Surcharges]
- [Anthropic Fast Mode Pricing: Opus at 6x]
- [Third-Party Anthropic API Access: Bedrock, Vertex, TokenMix.ai]
- [Anthropic API Pricing vs OpenAI vs DeepSeek vs Gemini]
- [Real-World Anthropic API Cost Scenarios]
- [How to Choose the Right Claude Model]
- [Conclusion]
- [FAQ]
---
Complete Anthropic API Pricing Table
All prices per 1M tokens, from [Anthropic's official pricing page](https://platform.claude.com/docs/en/about-claude/pricing), April 2026:
| Model | Input | Cache Hit (0.1x) | Output | Batch Input | Batch Output | Context |
| --------------------- | ----- | ---------------- | ------ | ----------- | ------------ | ------- |
| **Claude Opus 4.6** | $5.00 | $0.50 | $25.00 | $2.50 | $12.50 | 1M |
| **Claude Sonnet 4.6** | $3.00 | $0.30 | $15.00 | $1.50 | $7.50 | 1M |
| **Claude Haiku 4.5** | $1.00 | $0.10 | $5.00 | $0.50 | $2.50 | 200K |
| Claude Opus 4.5 | $5.00 | $0.50 | $25.00 | $2.50 | $12.50 | 1M |
| Claude Sonnet 4.5 | $3.00 | $0.30 | $15.00 | $1.50 | $7.50 | 200K |
| Claude Haiku 3.5 | $0.80 | $0.08 | $4.00 | $0.40 | $2.00 | 200K |
| Claude Haiku 3 | $0.25 | $0.025 | $1.25 | $0.125 | $0.625 | 200K |
**Legacy models:** Opus 4.1/4.0 remain at $15/$75, three times the price of Opus 4.6 with lower quality. Opus 3 (deprecated) costs the same $15/$75. No reason to use any of them.
**Key pricing move:** Anthropic cut Opus pricing by 67% from $15/$75 (Opus 4.1) to $5/$25 (Opus 4.6). This makes Opus viable for production workloads that previously defaulted to Sonnet on cost grounds.
---
Anthropic Prompt Caching Pricing: 90% Off Input
Anthropic's cache system has two duration tiers, each with different write and read costs:
| Cache Operation | Multiplier vs Base Input | Sonnet 4.6 Price | Opus 4.6 Price |
| -------------------- | ------------------------ | ---------------- | -------------- |
| Standard input | 1.0x | $3.00/M | $5.00/M |
| 5-minute cache write | 1.25x | $3.75/M | $6.25/M |
| 1-hour cache write | 2.0x | $6.00/M | $10.00/M |
| **Cache hit (read)** | **0.1x** | **$0.30/M** | **$0.50/M** |
**Break-even:** 5-minute cache pays back after 1 cache hit. 1-hour cache pays back after 2 hits. Any production workload with repeated system prompts will easily exceed this.
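The break-even arithmetic can be checked in a few lines; a quick sketch using the Sonnet 4.6 prices from the table above:

```python
import math

# Break-even check for Sonnet 4.6: extra cache-write cost vs. savings per hit.
BASE = 3.00                       # $/M, standard input
WRITE_5MIN, WRITE_1H = 1.25, 2.0  # cache-write multipliers
HIT = 0.1                         # cache-read multiplier

savings_per_hit = BASE * (1 - HIT)    # each hit saves $2.70/M vs. standard input
extra_5min = BASE * (WRITE_5MIN - 1)  # 5-minute write costs $0.75/M extra
extra_1h = BASE * (WRITE_1H - 1)      # 1-hour write costs $3.00/M extra

print(math.ceil(extra_5min / savings_per_hit))  # 1 hit to break even
print(math.ceil(extra_1h / savings_per_hit))    # 2 hits to break even
```

The same multipliers apply to every model, so the hit counts hold for Opus and Haiku too.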
**What to cache:** System prompts (500-2K tokens), few-shot examples (2K-10K), document context for [RAG](https://tokenmix.ai/blog/rag-tutorial-2026) (10K-100K), conversation history (grows per turn). Implementation requires adding one `cache_control` field — Anthropic handles the rest.
Source: [Anthropic prompt caching docs](https://platform.claude.com/docs/en/build-with-claude/prompt-caching)
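A minimal request sketch showing that one `cache_control` field, using the official `anthropic` Python SDK's message format (the model ID string here is a placeholder; check the docs for the current one):

```python
# Kwargs sketch for anthropic.Anthropic().messages.create(**kwargs).
# The only change vs. a normal call is the cache_control field on the
# system block; Anthropic caches everything up to and including it.
def build_cached_request(system_prompt: str, user_msg: str) -> dict:
    return {
        "model": "claude-sonnet-4-6",  # placeholder model ID
        "max_tokens": 1024,
        "system": [{
            "type": "text",
            "text": system_prompt,
            # "ephemeral" is the 5-minute cache (writes at 1.25x);
            # subsequent hits read this prefix at 0.1x the base input price.
            "cache_control": {"type": "ephemeral"},
        }],
        "messages": [{"role": "user", "content": user_msg}],
    }

req = build_cached_request("...large policy document...", "Where is my order?")
print(req["system"][0]["cache_control"])  # {'type': 'ephemeral'}
```

The response's `usage` object reports `cache_creation_input_tokens` and `cache_read_input_tokens`, which is the easiest way to confirm the cache is actually being hit.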
---
Anthropic Batch API Pricing: 50% Off Everything
The [Batch API](https://tokenmix.ai/blog/openai-batch-api-pricing) processes requests asynchronously with up to 24-hour turnaround at 50% off:
| Model | Standard Input/Output | Batch Input/Output |
| ----------------- | --------------------- | ------------------ |
| Claude Opus 4.6 | $5.00 / $25.00 | $2.50 / $12.50 |
| Claude Sonnet 4.6 | $3.00 / $15.00 | $1.50 / $7.50 |
| Claude Haiku 4.5 | $1.00 / $5.00 | $0.50 / $2.50 |
**Best Anthropic batch use cases:** Nightly content pipelines, bulk classification, embedding generation, test suite evaluation, data enrichment — any workload where users aren't waiting.
Source: [Anthropic batch processing docs](https://platform.claude.com/docs/en/build-with-claude/batch-processing)
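Batch requests are just normal message params wrapped with a `custom_id` for matching results later. A payload sketch, assuming the official `anthropic` SDK's `messages.batches.create(**kwargs)` shape (the model ID is a placeholder):

```python
# Kwargs sketch for anthropic.Anthropic().messages.batches.create(**kwargs).
# Each entry pairs a caller-chosen custom_id with the same params a
# synchronous messages.create call would take.
def build_batch(prompts: dict[str, str]) -> dict:
    return {
        "requests": [
            {
                "custom_id": cid,  # your key for matching results when the batch completes
                "params": {
                    "model": "claude-haiku-4-5",  # placeholder model ID
                    "max_tokens": 256,
                    "messages": [{"role": "user", "content": text}],
                },
            }
            for cid, text in prompts.items()
        ]
    }

batch = build_batch({"doc-1": "Classify: ...", "doc-2": "Classify: ..."})
print(len(batch["requests"]))  # 2
```

Results arrive asynchronously (up to 24 hours), so the pattern is: submit, poll the batch status, then fetch results keyed by `custom_id`.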
---
Stacking Anthropic Discounts: Up to 95% Savings
Cache (90% off input) + Batch (50% off everything) stack multiplicatively:
| Optimization | Sonnet Input/M | Sonnet Output/M | Total Discount |
| ----------------- | -------------- | --------------- | ------------------------- |
| Standard | $3.00 | $15.00 | 0% |
| Cache only | $0.30 | $15.00 | 90% on input |
| Batch only | $1.50 | $7.50 | 50% on everything |
| **Cache + Batch** | **$0.15** | **$7.50** | **95% input, 50% output** |
**Sonnet 4.6 at $0.15/M input is cheaper than DeepSeek V4's standard $0.30/M.** Anthropic's premium model at DeepSeek's budget price — if your workload qualifies for both optimizations.
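Because the discounts are plain multipliers, the stacked prices in the table reduce to two factors; a quick sketch:

```python
# Effective per-million input price after stacking the multiplicative discounts.
def effective_input_price(base: float, cache_hit: bool, batch: bool) -> float:
    price = base
    if cache_hit:
        price *= 0.1  # cache reads bill at 0.1x base input
    if batch:
        price *= 0.5  # Batch API halves everything
    return round(price, 4)

SONNET_INPUT = 3.00  # $/M list price from the table above
print(effective_input_price(SONNET_INPUT, cache_hit=True, batch=False))  # 0.3
print(effective_input_price(SONNET_INPUT, cache_hit=False, batch=True))  # 1.5
print(effective_input_price(SONNET_INPUT, cache_hit=True, batch=True))   # 0.15
```

Swap in $1.00 or $5.00 for the base to get the Haiku 4.5 and Opus 4.6 stacked floors ($0.05/M and $0.25/M).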
---
Long-Context and Data Residency Surcharges
Two hidden surcharges most Anthropic API pricing guides miss:
**Sonnet 4.6 long-context surcharge:** Requests with >200K input tokens pay 2x on input and 1.5x on output:
| Sonnet 4.6 | ≤200K Input | >200K Input |
| ---------- | ----------- | ----------- |
| Input | $3.00/M | $6.00/M |
| Output | $15.00/M | $22.50/M |
**Important:** Opus 4.6 has NO long-context surcharge — flat $5/$25 across the full 1M window. For requests over 200K tokens, Opus is actually cheaper than Sonnet.
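The crossover is easy to verify; a sketch comparing per-request cost at the prices above:

```python
# Per-request cost for Sonnet 4.6, applying the >200K surcharge from the
# table above (2x input, 1.5x output). Opus 4.6 bills flat across 1M.
def sonnet_request_cost(input_tokens: int, output_tokens: int) -> float:
    long_ctx = input_tokens > 200_000
    in_rate = 6.00 if long_ctx else 3.00     # $/M input
    out_rate = 22.50 if long_ctx else 15.00  # $/M output
    return round((input_tokens * in_rate + output_tokens * out_rate) / 1e6, 4)

def opus_request_cost(input_tokens: int, output_tokens: int) -> float:
    return round((input_tokens * 5.00 + output_tokens * 25.00) / 1e6, 4)

# At 300K input / 2K output, Opus undercuts the surcharged Sonnet:
print(sonnet_request_cost(300_000, 2_000))  # 1.845
print(opus_request_cost(300_000, 2_000))    # 1.55
```

Below the 200K threshold the ordering flips back: Sonnet at $3/$15 stays cheaper than Opus at $5/$25.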
**US data residency surcharge:** Using `inference_geo` for US-only routing on Opus 4.6+ adds a 1.1x multiplier to all pricing (input, output, cache). Only applies to Claude API direct — [AWS Bedrock](https://tokenmix.ai/blog/aws-bedrock-pricing) and Google Vertex have their own regional pricing.
Source: [Anthropic long context docs](https://platform.claude.com/docs/en/build-with-claude/context-windows), [Anthropic data residency docs](https://platform.claude.com/docs/en/build-with-claude/data-residency)
---
Anthropic Fast Mode Pricing: Opus at 6x
Fast mode for Opus 4.6 provides significantly faster output at premium pricing:
| Mode | Input/M | Output/M |
| -------- | ------- | -------- |
| Standard | $5.00 | $25.00 |
| **Fast** | $30.00 | $150.00 |
6x standard rates. Not available with Batch API. Stacks with cache and data residency multipliers.
**When fast mode makes sense:** Time-critical agent loops where each reasoning step blocks the next. If a 5-step agent workflow takes 30 seconds in standard mode and 5 seconds in fast mode, the time savings may justify the 6x cost for user-facing applications.
Source: [Anthropic fast mode docs](https://platform.claude.com/docs/en/build-with-claude/fast-mode)
---
Third-Party Anthropic API Access: Bedrock, Vertex, TokenMix.ai
| Provider | Sonnet 4.6 Input/M | Key Difference |
| ------------------------ | ------------------ | -------------------------------------- |
| Anthropic Direct | $3.00 | Full features, fastest model access |
| AWS Bedrock (Global) | $3.00 | AWS integration, IAM, compliance |
| AWS Bedrock (Regional) | $3.30 | +10% for data residency guarantee |
| Google Vertex (Global) | $3.00 | GCP integration |
| Google Vertex (Regional) | $3.30 | +10% premium |
| **TokenMix.ai** | **$2.85** | 155+ models, auto-failover, below-list |
**Anthropic-specific features only on direct API:** Fast mode, Claude Managed Agents ($0.08/session-hour), latest model releases first.
Through [TokenMix.ai](https://tokenmix.ai), Claude models are available alongside GPT, DeepSeek, Gemini, and 155+ others through a single OpenAI-compatible API at 3-5% below official rates.
---
Anthropic API Pricing vs OpenAI vs DeepSeek vs Gemini
| Model | Input/M | Output/M | Cache Hit/M | Context | Batch Output/M |
| ----------------- | ------- | -------- | ----------- | ------- | -------------- |
| Claude Opus 4.6 | $5.00 | $25.00 | $0.50 | 1M | $12.50 |
| Claude Sonnet 4.6 | $3.00 | $15.00 | $0.30 | 1M | $7.50 |
| Claude Haiku 4.5 | $1.00 | $5.00 | $0.10 | 200K | $2.50 |
| GPT-5.4 | $2.50 | $15.00 | $0.25 | 1.1M | $7.50 |
| GPT-5.4 Mini | $0.75 | $4.50 | $0.075 | 400K | $2.25 |
| DeepSeek V4 | $0.30 | $0.50 | $0.03 | 1M | N/A |
| Grok 4.20 | $2.00 | $6.00 | $0.20 | 2M | $3.00 |
| Gemini 3.1 Pro | $2.00 | $12.00 | $0.50 | 1M | N/A |
**Key Anthropic pricing positions from [TokenMix.ai](https://tokenmix.ai/pricing) data:**
1. **Sonnet vs [GPT-5.4](https://tokenmix.ai/blog/gpt-5-api-pricing):** Anthropic is 20% more expensive on input ($3 vs $2.50), identical on output ($15), and slightly behind on cache hits ($0.30 vs GPT's $0.25). At scale with both using cache + batch, costs are nearly identical.
2. **Opus 4.6 at $5/$25 is now competitive.** The old $15/$75 Opus made it a luxury product. At $5/$25, it's only 2x Sonnet's input and 1.67x output — for significantly better quality.
3. **Haiku 4.5 vs GPT-5.4 Mini:** Haiku ($1/$5) is 33% more expensive than GPT Mini ($0.75/$4.50) on input, 11% more on output. GPT Mini wins on raw price at this tier.
4. **Everyone is expensive vs DeepSeek.** Sonnet at $3/$15 vs DeepSeek V4 at $0.30/$0.50 — 10x on input, 30x on output. The quality gap is small (SWE-bench: Sonnet ~79% vs DeepSeek 81%).
---
Real-World Anthropic API Cost Scenarios
Scenario 1: Startup chatbot — 500 conversations/day
- Monthly: ~12M input, ~6M output tokens, 70% cache hit rate
| Model | Monthly (Cached) |
| ----------------- | ---------------- |
| Claude Haiku 4.5 | $15.60 |
| Claude Sonnet 4.6 | $46.80 |
| GPT-5.4 Mini | $29.70 |
| DeepSeek V4 | $4.10 |
Scenario 2: SaaS — 5,000 calls/day
- Monthly: ~450M input, ~225M output, 75% cache hit rate
| Model | Standard | Cached + Batch |
| ----------------- | -------- | -------------- |
| Claude Sonnet 4.6 | $4,725 | $923 |
| Claude Opus 4.6 | $7,875 | $1,294 |
| GPT-5.4 | $4,500 | $881 |
Scenario 3: Enterprise — 50,000 calls/day
- Monthly: ~15B input, ~4.5B output, 85% cache hit rate
| Model | Cached + Batch |
| ----------------- | -------------- |
| Claude Sonnet 4.6 | $13,875 |
| GPT-5.4 | $23,438 |
**At enterprise scale, Anthropic's stacking discounts beat OpenAI** — Sonnet cached+batch ($13,875) vs GPT-5.4 cached+batch ($23,438). Anthropic saves $9,563/month.
---
How to Choose the Right Claude Model
| Your Situation | Recommended Model | Why |
| ------------------------------- | ------------------------------------ | ----------------------------------------------- |
| Budget production, simple tasks | Haiku 4.5 ($1/$5) | Cheapest Claude, 5x less than Sonnet |
| General production workload | Sonnet 4.6 ($3/$15) | Best quality/price for most use cases |
| Long documents >200K tokens | **Opus 4.6 ($5/$25)** | Flat pricing beats Sonnet's 2x surcharge |
| Maximum quality | Opus 4.6 ($5/$25) | Best Claude, 67% cheaper than previous gen |
| Speed-critical agent loops | Opus 4.6 Fast ($30/$150) | 6x price for significantly faster output |
| Async batch processing | Sonnet + Batch ($1.50/$7.50) | 50% off for 24h latency tolerance |
| Maximum Anthropic savings | Sonnet + Cache + Batch ($0.15/$7.50) | 95% off input — cheapest premium model possible |
| Multi-provider with failover | Any Claude via TokenMix.ai | Unified API, auto-failover, 3-5% savings |
---
**Related:** [Compare all model pricing in our complete LLM API pricing comparison](https://tokenmix.ai/blog/llm-api-pricing-comparison)
Conclusion
Anthropic API pricing in 2026 is defined by three moves: Opus dropped 67% to $5/$25 (making it viable for production), [prompt caching](https://tokenmix.ai/blog/prompt-caching-guide) delivers 90% input savings, and stacking cache + batch gets Sonnet input to $0.15/M. For teams with cacheable, async-friendly workloads, Anthropic can be cheaper than GPT-5.4 at scale despite higher list prices.
The caveat: Sonnet's 200K long-context surcharge (2x input, 1.5x output) catches teams processing large documents. For that use case, switch to Opus: flat pricing across the full 1M window.
Real-time Anthropic pricing compared against 155+ models at [tokenmix.ai/pricing](https://tokenmix.ai/pricing).
---
FAQ
How much does the Anthropic Claude API cost?
Three tiers: Haiku 4.5 at $1/$5, Sonnet 4.6 at $3/$15, Opus 4.6 at $5/$25 per million tokens (input/output). Prompt caching reduces input by 90%. Batch API halves all costs. Stacking both gives up to 95% savings on input.
Is Anthropic API cheaper than OpenAI?
At list price, no — Sonnet ($3/$15) is 20% more expensive on input than GPT-5.4 ($2.50/$15). But at enterprise scale with cache + batch, Anthropic's stacking discounts can make it cheaper overall. Depends on your workload's cache hit rate and batch eligibility.
What is Anthropic's cheapest model?
Claude Haiku 3 at $0.25/$1.25 per million tokens. Among current-generation models, Haiku 4.5 at $1/$5. Maximum savings: Haiku 4.5 + cache + batch = $0.05/M input, $2.50/M output.
Does Anthropic charge extra for long context?
Yes. Sonnet 4.6 charges 2x on input and 1.5x on output for requests over 200K input tokens ($6/$22.50 instead of $3/$15). Opus 4.6 has no surcharge: flat $5/$25 across the full 1M [context window](https://tokenmix.ai/blog/llm-context-window-explained).
Can I use Claude through AWS Bedrock?
Yes. Same base pricing as direct API for global endpoints. Regional endpoints add 10% premium for data residency. Bedrock provides IAM integration and AWS compliance certifications.
What is Anthropic fast mode?
Fast mode for Opus 4.6 charges 6x standard rates ($30/$150 per million tokens) for significantly faster output. Not available with Batch API. Useful for time-critical agent workflows.
How does Anthropic compare to DeepSeek?
DeepSeek V4 is 10x cheaper on input ($0.30 vs $3) and 30x on output ($0.50 vs $15) at list price. With Anthropic's stacked cache + batch discounts, Sonnet input ($0.15/M) actually drops below DeepSeek's list price, though DeepSeek's own cache hits ($0.03/M) stay 5x cheaper. Quality is comparable (SWE-bench: ~79% vs 81%). Choose Anthropic for reliability and ecosystem, DeepSeek for pure cost.
Does Anthropic offer a free tier?
Small free credits for new accounts. No ongoing free tier. Contact sales for enterprise trial credits. [TokenMix.ai](https://tokenmix.ai) offers pay-as-you-go Claude access with no minimums.
---
*Author: TokenMix Research Lab | Last Updated: April 2026 | Data Source: [Anthropic Official Pricing](https://platform.claude.com/docs/en/about-claude/pricing), [TokenMix.ai](https://tokenmix.ai), and [Artificial Analysis](https://artificialanalysis.ai)*