Anthropic API Pricing in 2026: Claude Opus, Sonnet, Haiku — Every Model, Every Discount, Real Costs
TokenMix Research Lab · 2026-04-04

Anthropic's Claude API runs three pricing tiers in 2026: Haiku 4.5 at $1/$5, Sonnet 4.6 at $3/$15, and Opus 4.6 at $5/$25 per million tokens. But the list price is just the starting point. Prompt caching cuts input by 90%. Batch processing halves everything. Stack both and Sonnet input drops to $0.15/M — cheaper than [DeepSeek V4](https://tokenmix.ai/blog/deepseek-api-pricing). This is the complete Anthropic API pricing reference: every model, every discount, every hidden surcharge, with real production cost scenarios. All data from [Anthropic's official pricing page](https://platform.claude.com/docs/en/about-claude/pricing) and cross-referenced by [TokenMix.ai](https://tokenmix.ai), April 2026.
Table of Contents
- [Complete Anthropic API Pricing Table]
- [Anthropic Prompt Caching Pricing: 90% Off Input]
- [Anthropic Batch API Pricing: 50% Off Everything]
- [Stacking Anthropic Discounts: Up to 95% Savings]
- [Long-Context and Data Residency Surcharges]
- [Anthropic Fast Mode Pricing: Opus at 6x]
- [Third-Party Anthropic API Access: Bedrock, Vertex, TokenMix.ai]
- [Anthropic API Pricing vs OpenAI vs DeepSeek vs Gemini]
- [Real-World Anthropic API Cost Scenarios]
- [How to Choose the Right Claude Model]
- [Conclusion]
- [FAQ]
---
Complete Anthropic API Pricing Table
All prices per 1M tokens, from [Anthropic's official pricing page](https://platform.claude.com/docs/en/about-claude/pricing), April 2026:
| Model | Input | Cache Hit (0.1x) | Output | Batch Input | Batch Output | Context |
| --------------------- | ----- | ---------------- | ------ | ----------- | ------------ | ------- |
| **Claude Opus 4.6** | $5.00 | $0.50 | $25.00 | $2.50 | $12.50 | 1M |
| **Claude Sonnet 4.6** | $3.00 | $0.30 | $15.00 | $1.50 | $7.50 | 1M |
| **Claude Haiku 4.5** | $1.00 | $0.10 | $5.00 | $0.50 | $2.50 | 200K |
| Claude Opus 4.5 | $5.00 | $0.50 | $25.00 | $2.50 | $12.50 | 1M |
| Claude Sonnet 4.5 | $3.00 | $0.30 | $15.00 | $1.50 | $7.50 | 200K |
| Claude Haiku 3.5 | $0.80 | $0.08 | $4.00 | $0.40 | $2.00 | 200K |
| Claude Haiku 3 | $0.25 | $0.025 | $1.25 | $0.125 | $0.625 | 200K |
**Legacy models:** Opus 4.1/4.0 remain at $15/$75, three times the price of Opus 4.6 with lower quality. Opus 3 (deprecated) costs the same $15/$75. No reason to use any of them.
**Key pricing move:** Anthropic cut Opus pricing by 67% from $15/$75 (Opus 4.1) to $5/$25 (Opus 4.6). This makes Opus viable for production workloads that previously defaulted to Sonnet on cost grounds.
---
Anthropic Prompt Caching Pricing: 90% Off Input
Anthropic's cache system has two duration tiers, each with different write and read costs:
| Cache Operation | Multiplier vs Base Input | Sonnet 4.6 Price | Opus 4.6 Price |
| -------------------- | ------------------------ | ---------------- | -------------- |
| Standard input | 1.0x | $3.00/M | $5.00/M |
| 5-minute cache write | 1.25x | $3.75/M | $6.25/M |
| 1-hour cache write | 2.0x | $6.00/M | $10.00/M |
| **Cache hit (read)** | **0.1x** | **$0.30/M** | **$0.50/M** |
**Break-even:** 5-minute cache pays back after 1 cache hit. 1-hour cache pays back after 2 hits. Any production workload with repeated system prompts will easily exceed this.
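The break-even arithmetic can be checked in a few lines; a quick sketch using the Sonnet 4.6 prices from the table above:

```python
import math

# Break-even check for Sonnet 4.6: extra cache-write cost vs. savings per hit.
BASE = 3.00                       # $/M, standard input
WRITE_5MIN, WRITE_1H = 1.25, 2.0  # cache-write multipliers
HIT = 0.1                         # cache-read multiplier

savings_per_hit = BASE * (1 - HIT)    # each hit saves $2.70/M vs. standard input
extra_5min = BASE * (WRITE_5MIN - 1)  # 5-minute write costs $0.75/M extra
extra_1h = BASE * (WRITE_1H - 1)      # 1-hour write costs $3.00/M extra

print(math.ceil(extra_5min / savings_per_hit))  # 1 hit to break even
print(math.ceil(extra_1h / savings_per_hit))    # 2 hits to break even
```

The same multipliers apply to every model, so the hit counts hold for Opus and Haiku too.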
**What to cache:** System prompts (500-2K tokens), few-shot examples (2K-10K), document context for [RAG](https://tokenmix.ai/blog/rag-tutorial-2026) (10K-100K), conversation history (grows per turn). Implementation requires adding one `cache_control` field — Anthropic handles the rest.
Source: [Anthropic prompt caching docs](https://platform.claude.com/docs/en/build-with-claude/prompt-caching)
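A minimal request sketch showing that one `cache_control` field, using the official `anthropic` Python SDK's message format (the model ID string here is a placeholder; check the docs for the current one):

```python
# Kwargs sketch for anthropic.Anthropic().messages.create(**kwargs).
# The only change vs. a normal call is the cache_control field on the
# system block; Anthropic caches everything up to and including it.
def build_cached_request(system_prompt: str, user_msg: str) -> dict:
    return {
        "model": "claude-sonnet-4-6",  # placeholder model ID
        "max_tokens": 1024,
        "system": [{
            "type": "text",
            "text": system_prompt,
            # "ephemeral" is the 5-minute cache (writes at 1.25x);
            # subsequent hits read this prefix at 0.1x the base input price.
            "cache_control": {"type": "ephemeral"},
        }],
        "messages": [{"role": "user", "content": user_msg}],
    }

req = build_cached_request("...large policy document...", "Where is my order?")
print(req["system"][0]["cache_control"])  # {'type': 'ephemeral'}
```

The response's `usage` object reports `cache_creation_input_tokens` and `cache_read_input_tokens`, which is the easiest way to confirm the cache is actually being hit.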
---
Anthropic Batch API Pricing: 50% Off Everything
The [Batch API](https://tokenmix.ai/blog/openai-batch-api-pricing) processes requests asynchronously with up to 24-hour turnaround at 50% off:
| Model | Standard Input/Output | Batch Input/Output |
| ----------------- | --------------------- | ------------------ |
| Claude Opus 4.6 | $5.00 / $25.00 | $2.50 / $12.50 |
| Claude Sonnet 4.6 | $3.00 / $15.00 | $1.50 / $7.50 |
| Claude Haiku 4.5 | $1.00 / $5.00 | $0.50 / $2.50 |
**Best Anthropic batch use cases:** Nightly content pipelines, bulk classification, embedding generation, test suite evaluation, data enrichment — any workload where users aren't waiting.
Source: [Anthropic batch processing docs](https://platform.claude.com/docs/en/build-with-claude/batch-processing)
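Batch requests are just normal message params wrapped with a `custom_id` for matching results later. A payload sketch, assuming the official `anthropic` SDK's `messages.batches.create(**kwargs)` shape (the model ID is a placeholder):

```python
# Kwargs sketch for anthropic.Anthropic().messages.batches.create(**kwargs).
# Each entry pairs a caller-chosen custom_id with the same params a
# synchronous messages.create call would take.
def build_batch(prompts: dict[str, str]) -> dict:
    return {
        "requests": [
            {
                "custom_id": cid,  # your key for matching results when the batch completes
                "params": {
                    "model": "claude-haiku-4-5",  # placeholder model ID
                    "max_tokens": 256,
                    "messages": [{"role": "user", "content": text}],
                },
            }
            for cid, text in prompts.items()
        ]
    }

batch = build_batch({"doc-1": "Classify: ...", "doc-2": "Classify: ..."})
print(len(batch["requests"]))  # 2
```

Results arrive asynchronously (up to 24 hours), so the pattern is: submit, poll the batch status, then fetch results keyed by `custom_id`.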
---
Stacking Anthropic Discounts: Up to 95% Savings
Cache (90% off input) + Batch (50% off everything) stack multiplicatively:
| Optimization | Sonnet Input/M | Sonnet Output/M | Total Discount |
| ----------------- | -------------- | --------------- | ------------------------- |
| Standard | $3.00 | $15.00 | 0% |
| Cache only | $0.30 | $15.00 | 90% on input |
| Batch only | $1.50 | $7.50 | 50% on everything |
| **Cache + Batch** | **$0.15** | **$7.50** | **95% input, 50% output** |
**Sonnet 4.6 at $0.15/M input is cheaper than DeepSeek V4's standard $0.30/M.** Anthropic's premium model at DeepSeek's budget price — if your workload qualifies for both optimizations.
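Because the discounts are plain multipliers, the stacked prices in the table reduce to two factors; a quick sketch:

```python
# Effective per-million input price after stacking the multiplicative discounts.
def effective_input_price(base: float, cache_hit: bool, batch: bool) -> float:
    price = base
    if cache_hit:
        price *= 0.1  # cache reads bill at 0.1x base input
    if batch:
        price *= 0.5  # Batch API halves everything
    return round(price, 4)

SONNET_INPUT = 3.00  # $/M list price from the table above
print(effective_input_price(SONNET_INPUT, cache_hit=True, batch=False))  # 0.3
print(effective_input_price(SONNET_INPUT, cache_hit=False, batch=True))  # 1.5
print(effective_input_price(SONNET_INPUT, cache_hit=True, batch=True))   # 0.15
```

Swap in $1.00 or $5.00 for the base to get the Haiku 4.5 and Opus 4.6 stacked floors ($0.05/M and $0.25/M).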
---
Long-Context and Data Residency Surcharges
Two hidden surcharges most Anthropic API pricing guides miss:
**Sonnet 4.6 long-context surcharge:** Requests with >200K input tokens pay 2x on input and 1.5x on output:
| Sonnet 4.6 | ≤200K Input | >200K Input |
| ---------- | ----------- | ----------- |
| Input | $3.00/M | $6.00/M |
| Output | $15.00/M | $22.50/M |
**Important:** Opus 4.6 has NO long-context surcharge — flat $5/$25 across the full 1M window. For requests over 200K tokens, Opus is actually cheaper than Sonnet.
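The crossover is easy to verify; a sketch comparing per-request cost at the prices above:

```python
# Per-request cost for Sonnet 4.6, applying the >200K surcharge from the
# table above (2x input, 1.5x output). Opus 4.6 bills flat across 1M.
def sonnet_request_cost(input_tokens: int, output_tokens: int) -> float:
    long_ctx = input_tokens > 200_000
    in_rate = 6.00 if long_ctx else 3.00     # $/M input
    out_rate = 22.50 if long_ctx else 15.00  # $/M output
    return round((input_tokens * in_rate + output_tokens * out_rate) / 1e6, 4)

def opus_request_cost(input_tokens: int, output_tokens: int) -> float:
    return round((input_tokens * 5.00 + output_tokens * 25.00) / 1e6, 4)

# At 300K input / 2K output, Opus undercuts the surcharged Sonnet:
print(sonnet_request_cost(300_000, 2_000))  # 1.845
print(opus_request_cost(300_000, 2_000))    # 1.55
```

Below the 200K threshold the ordering flips back: Sonnet at $3/$15 stays cheaper than Opus at $5/$25.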
**US data residency surcharge:** Using `inference_geo` for US-only routing on Opus 4.6+ adds a 1.1x multiplier to all pricing (input, output, cache). Only applies to Claude API direct — [AWS Bedrock](https://tokenmix.ai/blog/aws-bedrock-pricing) and Google Vertex have their own regional pricing.
Source: [Anthropic long context docs](https://platform.claude.com/docs/en/build-with-claude/context-windows), [Anthropic data residency docs](https://platform.claude.com/docs/en/build-with-claude/data-residency)
---
Anthropic Fast Mode Pricing: Opus at 6x
Fast mode for Opus 4.6 provides significantly faster output at premium pricing:
| Mode | Input/M | Output/M |
| -------- | ------- | -------- |
| Standard | $5.00 | $25.00 |
| **Fast** | $30.00 | $150.00 |
6x standard rates. Not available with Batch API. Stacks with cache and data residency multipliers.
**When fast mode makes sense:** Time-critical agent loops where each reasoning step blocks the next. If a 5-step agent workflow takes 30 seconds in standard mode and 5 seconds in fast mode, the time savings may justify the 6x cost for user-facing applications.
Source: [Anthropic fast mode docs](https://platform.claude.com/docs/en/build-with-claude/fast-mode)
---
Third-Party Anthropic API Access: Bedrock, Vertex, TokenMix.ai
| Provider | Sonnet 4.6 Input/M | Key Difference |
| ------------------------ | ------------------ | -------------------------------------- |
| Anthropic Direct | $3.00 | Full features, fastest model access |
| AWS Bedrock (Global) | $3.00 | AWS integration, IAM, compliance |
| AWS Bedrock (Regional) | $3.30 | +10% for data residency guarantee |
| Google Vertex (Global) | $3.00 | GCP integration |
| Google Vertex (Regional) | $3.30 | +10% premium |
| **TokenMix.ai** | **$2.85** | 155+ models, auto-failover, below-list |
**Anthropic-specific features only on direct API:** Fast mode, Claude Managed Agents ($0.08/session-hour), latest model releases first.
Through [TokenMix.ai](https://tokenmix.ai), Claude models are available alongside GPT, DeepSeek, Gemini, and 155+ others through a single OpenAI-compatible API at 3-5% below official rates.
---
Anthropic API Pricing vs OpenAI vs DeepSeek vs Gemini
| Model | Input/M | Output/M | Cache Hit/M | Context | Batch Output/M |
| ----------------- | ------- | -------- | ----------- | ------- | -------------- |
| Claude Opus 4.6 | $5.00 | $25.00 | $0.50 | 1M | $12.50 |
| Claude Sonnet 4.6 | $3.00 | $15.00 | $0.30 | 1M | $7.50 |
| Claude Haiku 4.5 | $1.00 | $5.00 | $0.10 | 200K | $2.50 |
| GPT-5.4 | $2.50 | $15.00 | $0.25 | 1.1M | $7.50 |
| GPT-5.4 Mini | $0.75 | $4.50 | $0.075 | 400K | $2.25 |
| DeepSeek V4 | $0.30 | $0.50 | $0.03 | 1M | N/A |
| Grok 4.20 | $2.00 | $6.00 | $0.20 | 2M | $3.00 |
| Gemini 3.1 Pro | $2.00 | $12.00 | $0.50 | 1M | N/A |
**Key Anthropic pricing positions from [TokenMix.ai](https://tokenmix.ai/pricing) data:**
1. **Sonnet vs [GPT-5.4](https://tokenmix.ai/blog/gpt-5-api-pricing):** Anthropic is 20% more expensive on input ($3 vs $2.50), identical on output ($15), and slightly behind on cache hits ($0.30 vs GPT's $0.25). At scale with both using cache + batch, costs are nearly identical.
2. **Opus 4.6 at $5/$25 is now competitive.** The old $15/$75 Opus made it a luxury product. At $5/$25, it's only 2x Sonnet's input and 1.67x output — for significantly better quality.
3. **Haiku 4.5 vs GPT-5.4 Mini:** Haiku ($1/$5) is 33% more expensive than GPT Mini ($0.75/$4.50) on input, 11% more on output. GPT Mini wins on raw price at this tier.
4. **Everyone is expensive vs DeepSeek.** Sonnet at $3/$15 vs DeepSeek V4 at $0.30/$0.50 — 10x on input, 30x on output. The quality gap is small (SWE-bench: Sonnet ~79% vs DeepSeek 81%).
---
Real-World Anthropic API Cost Scenarios
Scenario 1: Startup chatbot — 500 conversations/day
- Monthly: ~12M input, ~6M output tokens, 70% cache hit rate
| Model | Monthly (Cached) |
| ----------------- | ---------------- |
| Claude Haiku 4.5 | $15.60 |
| Claude Sonnet 4.6 | $46.80 |
| GPT-5.4 Mini | $29.70 |
| DeepSeek V4 | $4.10 |
Scenario 2: SaaS — 5,000 calls/day
- Monthly: ~450M input, ~225M output, 75% cache hit rate
| Model | Standard | Cached + Batch |
| ----------------- | -------- | -------------- |
| Claude Sonnet 4.6 | $4,725 | $923 |
| Claude Opus 4.6 | $7,875 | $1,294 |
| GPT-5.4 | $4,500 | $881 |
Scenario 3: Enterprise — 50,000 calls/day
- Monthly: ~15B input, ~4.5B output, 85% cache hit rate
| Model | Cached + Batch |
| ----------------- | -------------- |
| Claude Sonnet 4.6 | $13,875 |
| GPT-5.4 | $23,438 |
**At enterprise scale, Anthropic's stacking discounts beat OpenAI** — Sonnet cached+batch ($13,875) vs GPT-5.4 cached+batch ($23,438). Anthropic saves $9,563/month.
---
How to Choose the Right Claude Model
| Your Situation | Recommended Model | Why |
| ------------------------------- | ------------------------------------ | ----------------------------------------------- |
| Budget production, simple tasks | Haiku 4.5 ($1/$5) | Cheapest Claude, 5x less than Sonnet |
| General production workload | Sonnet 4.6 ($3/$15) | Best quality/price for most use cases |
| Long documents >200K tokens | **Opus 4.6 ($5/$25)** | Flat pricing beats Sonnet's 2x surcharge |
| Maximum quality | Opus 4.6 ($5/$25) | Best Claude, 67% cheaper than previous gen |
| Speed-critical agent loops | Opus 4.6 Fast ($30/$150) | 6x price for significantly faster output |
| Async batch processing | Sonnet + Batch ($1.50/$7.50) | 50% off for 24h latency tolerance |
| Maximum Anthropic savings | Sonnet + Cache + Batch ($0.15/$7.50) | 95% off input — cheapest premium model possible |
| Multi-provider with failover | Any Claude via TokenMix.ai | Unified API, auto-failover, 3-5% savings |
---
**Related:** [Compare all model pricing in our complete LLM API pricing comparison](https://tokenmix.ai/blog/llm-api-pricing-comparison)
Conclusion
Anthropic API pricing in 2026 is defined by three moves: Opus dropped 67% to $5/$25 (making it viable for production), [prompt caching](https://tokenmix.ai/blog/prompt-caching-guide) delivers 90% input savings, and stacking cache + batch gets Sonnet input to $0.15/M. For teams with cacheable, async-friendly workloads, Anthropic can be cheaper than GPT-5.4 at scale despite higher list prices.
The caveat: Sonnet's 200K long-context surcharge (2x input, 1.5x output) catches teams processing large documents. For that use case, switch to Opus: flat pricing across the full 1M window.
Real-time Anthropic pricing compared against 155+ models at [tokenmix.ai/pricing](https://tokenmix.ai/pricing).
---
FAQ
How much does the Anthropic Claude API cost?
Three tiers: Haiku 4.5 at $1/$5, Sonnet 4.6 at $3/$15, Opus 4.6 at $5/$25 per million tokens (input/output). Prompt caching reduces input by 90%. Batch API halves all costs. Stacking both gives up to 95% savings on input.
Is Anthropic API cheaper than OpenAI?
At list price, no — Sonnet ($3/$15) is 20% more expensive on input than GPT-5.4 ($2.50/$15). But at enterprise scale with cache + batch, Anthropic's stacking discounts can make it cheaper overall. Depends on your workload's cache hit rate and batch eligibility.
What is Anthropic's cheapest model?
Claude Haiku 3 at $0.25/$1.25 per million tokens. Among current-generation models, Haiku 4.5 at $1/$5. Maximum savings: Haiku 4.5 + cache + batch = $0.05/M input, $2.50/M output.
Does Anthropic charge extra for long context?
Yes. Sonnet 4.6 charges 2x on input and 1.5x on output for requests over 200K input tokens ($6/$22.50 instead of $3/$15). Opus 4.6 has no surcharge: flat $5/$25 across the full 1M [context window](https://tokenmix.ai/blog/llm-context-window-explained).
Can I use Claude through AWS Bedrock?
Yes. Same base pricing as direct API for global endpoints. Regional endpoints add 10% premium for data residency. Bedrock provides IAM integration and AWS compliance certifications.
What is Anthropic fast mode?
Fast mode for Opus 4.6 charges 6x standard rates ($30/$150 per million tokens) for significantly faster output. Not available with Batch API. Useful for time-critical agent workflows.
How does Anthropic compare to DeepSeek?
DeepSeek V4 is 10x cheaper on input ($0.30 vs $3) and 30x on output ($0.50 vs $15) at list price. With Anthropic's stacked cache + batch discounts, Sonnet input ($0.15/M) actually drops below DeepSeek's list price, though DeepSeek's own cache hits ($0.03/M) stay 5x cheaper. Quality is comparable (SWE-bench: ~79% vs 81%). Choose Anthropic for reliability and ecosystem, DeepSeek for pure cost.
Does Anthropic offer a free tier?
Small free credits for new accounts. No ongoing free tier. Contact sales for enterprise trial credits. [TokenMix.ai](https://tokenmix.ai) offers pay-as-you-go Claude access with no minimums.
---
*Author: TokenMix Research Lab | Last Updated: April 2026 | Data Source: [Anthropic Official Pricing](https://platform.claude.com/docs/en/about-claude/pricing), [TokenMix.ai](https://tokenmix.ai), and [Artificial Analysis](https://artificialanalysis.ai)*