TokenMix Research Lab · 2026-04-04

OpenAI Embedding Pricing 2026: $0.02/M vs $0.13/M (6.5x Premium)

OpenAI Embedding Models Pricing in 2026: text-embedding-3-small vs 3-large vs ada-002 Cost Comparison

Last Updated: 2026-04-29
Author: TokenMix Research Lab

text-embedding-3-small at $0.02/M is the right default for 80% of RAG apps; 3-large at $0.13/M (6.5× more) only worth it for legal/medical accuracy; ada-002 is obsolete (3-small is 5× cheaper and better).

OpenAI's embedding models are the cheapest part of their API — text-embedding-3-small costs just $0.02 per million tokens, and batch processing halves it to $0.01/M. But choosing wrong between the three models can cost you 6.5x more than necessary. text-embedding-3-large at $0.13/M delivers better retrieval quality for RAG — whether the 6.5x premium is worth it depends on your use case. This guide breaks down every OpenAI embedding model's real cost, compares them against Voyage, Cohere, and free alternatives, and shows exactly when to use which. All pricing verified against OpenAI's official docs and tracked by TokenMix.ai as of April 2026.

Quick OpenAI Embedding Pricing Overview
text-embedding-3-small vs 3-large: When 6.5x More is Worth It
text-embedding-ada-002: Should You Migrate?
Batch API: Cut Embedding Costs by 50%
OpenAI Embedding Pricing vs Voyage vs Cohere vs Free Options
Real-World Embedding Cost Scenarios
How to Choose the Right OpenAI Embedding Model
Conclusion
FAQ

Quick OpenAI Embedding Pricing Overview

Three OpenAI embedding tiers: 3-small at $0.02/M (1536 dim), 3-large at $0.13/M (3072 dim), legacy ada-002 at $0.10/M — all input-only, capped at 8,191 input tokens per request.

All prices per 1M tokens, OpenAI direct API, April 2026:

Model	Standard	Batch (50% off)	Dimensions	Max Input
text-embedding-3-small	$0.02	$0.01	1536	8,191
text-embedding-3-large	$0.13	$0.065	3072	8,191
text-embedding-ada-002	$0.10	$0.05	1536	8,191

Key insight: Embeddings are input-only — you pay per token embedded, no output cost. This makes them fundamentally cheaper than chat models per token processed.

text-embedding-3-small at $0.02/M is 97% cheaper than GPT-5.4 Nano input ($0.20/M). If your workload is primarily search/retrieval rather than generation, embeddings save massive costs.

text-embedding-3-small vs 3-large: When 6.5x More is Worth It

3-large costs 6.5× more for a 4-point MTEB lift (62% → 66%) and 2× the dimensions/storage — only worth it for legal/medical/technical retrieval where wrong answers have high consequences. The small model costs $0.02/M. The large model costs $0.13/M. That's a 6.5x difference. What do you get for it?

Metric	3-small	3-large	Improvement
Price/M tokens	$0.02	$0.13	6.5x more expensive
Dimensions	1536	3072	2x
MTEB retrieval score	~62%	~66%	+4 points
Storage per vector	~6KB	~12KB	2x
Query latency	Faster	Slower	~1.5x slower

Use 3-small when:

Cost is primary concern
Document corpus is large (millions of docs — storage cost matters)
Use case is simple semantic search or classification
Retrieval quality "good enough" is acceptable

Use 3-large when:

Retrieval quality is critical (legal, medical, technical RAG)
Corpus is small-to-medium (storage cost difference is negligible)
You need maximum accuracy for reranking or multi-step retrieval
The 4-point MTEB improvement translates to meaningful answer quality

The 80/20 rule: For 80% of production RAG applications, 3-small is sufficient. The 4-point quality gap only matters at the margins — if your users notice wrong search results, upgrade.

text-embedding-ada-002: Should You Migrate?

Yes, immediately — text-embedding-3-small is 5× cheaper ($0.02 vs $0.10/M) and slightly higher quality (~62% vs ~61% MTEB). There is no scenario where staying on ada-002 makes sense. ada-002 was OpenAI's standard embedding model before the v3 series launched. It's still available but there's little reason to stay on it.

Comparison	ada-002	3-small	3-large
Price/M	$0.10	$0.02	$0.13
Quality (MTEB)	~61%	~62%	~66%
Dimensions	1536	1536	3072

text-embedding-3-small is 5x cheaper AND slightly better than ada-002. There is no scenario where ada-002 is the right choice for new projects. If you're still on ada-002, migrate to 3-small — you get better quality at 20% of the cost.

Migration is straightforward: same API endpoint, change the model parameter. But you'll need to re-embed your entire corpus since the vector spaces aren't compatible.

Batch API: Cut Embedding Costs by 50%

Batch API halves embedding costs (3-small drops to $0.01/M, 3-large to $0.065/M) — always use it for corpus building, index migrations, and re-embedding; standard pricing only for live user queries. Embedding workloads are the ideal batch API candidate — they're inherently async (you're building/updating an index, not serving real-time queries).

Model	Standard	Batch	Monthly cost for 1B tokens
text-embedding-3-small	$0.02	$0.01	$10 → $5
text-embedding-3-large	$0.13	$0.065	$65 → $32.50

Always use batch for initial corpus embedding. Building a new search index? Migrating to a new model? Re-embedding after dimension changes? Use batch. There's no reason to pay standard rates for non-real-time embedding work.

For real-time embedding (live user queries), standard pricing applies — but individual queries are so cheap ($0.02/M) that the cost is negligible.

OpenAI Embedding Pricing vs Voyage vs Cohere vs Free Options

Cross-provider: Google text-embedding-005 cheapest hosted at $0.00625/M (3× under OpenAI), Voyage-3-large highest quality at 67% MTEB ($0.18/M, 9× more), nomic-embed-text free if you self-host.

Provider	Model	Price/M tokens	Dimensions	Quality (MTEB)
OpenAI	text-embedding-3-small	$0.02	1536	~62%
OpenAI	text-embedding-3-large	$0.13	3072	~66%
Voyage AI	voyage-3-large	$0.18	1024	~67%
Cohere	embed-v4.0	$0.10	1024	~65%
Google	text-embedding-005	$0.00625	768	~64%
Free (local)	nomic-embed-text	$0 (self-host)	768	~61%

Key insights from TokenMix.ai cross-provider data:

Google text-embedding-005 is the cheapest hosted option at $0.00625/M — 3x cheaper than OpenAI 3-small. Quality is competitive at 64% MTEB.
Voyage-3-large leads on quality (67% MTEB) but costs 9x more than OpenAI 3-small. Only worth it for quality-critical applications.
Self-hosted nomic-embed-text is free if you have GPU capacity. Quality matches ada-002. Best for teams with existing ML infrastructure.
OpenAI 3-small offers the best balance of quality, price, ecosystem integration, and reliability. It's not the cheapest or the best, but it's "good enough and easy" for most teams.

Through TokenMix.ai, you can access OpenAI embedding models alongside alternatives from multiple providers through a single API.

Real-World Embedding Cost Scenarios

Two real workloads: 100K-doc startup RAG → $2.72 for entire Year 1 on 3-small (effectively free); 10M-doc enterprise search → $336 Year 1 on 3-small vs $199 on Google embed (saves $137/yr).

Scenario 1: Startup RAG app — 100K documents

Average document: 2,000 tokens
Total corpus: 200M tokens (one-time embedding)
Daily queries: 1,000 (avg 100 tokens each)

Model	Initial Embed	Monthly Query Cost	Total Year 1
3-small (batch)	$2.00	$0.06	$2.72
3-large (batch)	$13.00	$0.39	$17.68
Voyage-3-large	$36.00	$0.54	$42.48

text-embedding-3-small costs $2.72 for an entire year of a 100K-document RAG application. Embedding costs are essentially free at this scale.

Scenario 2: Enterprise search — 10M documents

Average document: 3,000 tokens
Total corpus: 30B tokens
Daily queries: 50,000

Model	Initial Embed	Monthly Query Cost	Total Year 1
3-small (batch)	$300	$3.00	$336
3-large (batch)	$1,950	$19.50	$2,184
Google embed	$187.50	$0.94	$199

At enterprise scale, Google's embedding model saves $137/year vs OpenAI 3-small. But if you're already using OpenAI for chat, the integration simplicity of staying with one provider often outweighs the $11/month savings.

Which OpenAI Embedding Model Should You Pick?

Default to 3-small for 80% of RAG apps; upgrade to 3-large only when wrong answers have high consequences (legal/medical). Always pair with Batch API for corpus building. ada-002 has no remaining use case.

Your Situation	Recommended Model	Why
Most RAG applications	text-embedding-3-small	$0.02/M, quality sufficient for 80% of use cases
Quality-critical retrieval	text-embedding-3-large	+4 MTEB points worth 6.5x cost for legal/medical
Currently on ada-002	Migrate to 3-small	5x cheaper, slightly better quality
Maximum quality at any cost	Voyage-3-large	Leads MTEB benchmarks
Maximum cost savings	Google text-embedding-005	$0.00625/M — cheapest hosted option
Bulk re-embedding, index building	Any model + Batch API	50% off, no reason not to
Privacy-first, own infrastructure	nomic-embed-text (self-host)	Free, decent quality, full control

What's the Bottom Line on OpenAI Embedding Pricing?

Embeddings are too cheap to over-think — text-embedding-3-small at $0.02/M means a 100K-doc RAG app costs $3/year. Default to 3-small + Batch API. Spend 6.5× more on 3-large only when retrieval accuracy directly impacts user outcomes. OpenAI's embedding models are so cheap that cost is rarely the deciding factor. text-embedding-3-small at $0.02/M means a 100K-document RAG app costs under $3/year for embeddings. The real decision is quality: spend 6.5x more for text-embedding-3-large if retrieval accuracy directly impacts user experience, stick with 3-small for everything else.

If you're still on ada-002, migrate today — 3-small is 5x cheaper and slightly better. And always use the Batch API for corpus embedding — there's no reason to pay standard rates for non-real-time work.

Real-time embedding model pricing across all providers at tokenmix.ai/models.

FAQ

How much does OpenAI text-embedding-3-small cost?

$0.02 per million tokens at standard rates. With the Batch API (50% off), it drops to $0.01/M. Embedding 1 million tokens costs about 2 cents — making it one of the cheapest API operations available.

Should I use text-embedding-3-small or 3-large?

Use 3-small for most applications — it's 6.5x cheaper ($0.02 vs $0.13/M) with only a 4-point quality gap on MTEB benchmarks. Use 3-large only when retrieval accuracy is critical: legal search, medical RAG, or applications where wrong results have high consequences.

Is text-embedding-ada-002 still worth using?

No. text-embedding-3-small is 5x cheaper ($0.02 vs $0.10/M) and slightly higher quality. Migrate to 3-small for immediate cost savings with no quality loss. You'll need to re-embed your corpus since vector spaces differ.

What are the cheapest embedding models available?

Google text-embedding-005 at $0.00625/M is the cheapest hosted option. OpenAI 3-small at $0.02/M is second. Self-hosted nomic-embed-text is free if you have GPU infrastructure.

How do I reduce OpenAI embedding costs?

Three strategies: (1) Use the Batch API for all non-real-time embedding — saves 50%. (2) Use 3-small instead of 3-large unless quality demands it. (3) Consider Google text-embedding-005 at $0.00625/M for pure cost optimization.

What is the max input size for OpenAI embeddings?

8,191 tokens per embedding request for all three models (3-small, 3-large, ada-002). Documents longer than 8,191 tokens must be chunked before embedding.

Author: TokenMix Research Lab | Last Updated: April 2026 | Data Source: OpenAI Official Pricing, TokenMix.ai, and MTEB Leaderboard