OpenAI Embedding Models Pricing in 2026: text-embedding-3-small vs 3-large vs ada-002 Cost Comparison
tokenmix · 2026-04-04

OpenAI's embedding models are the cheapest part of their API — text-embedding-3-small costs just $0.02 per million tokens, and batch processing halves it to $0.01/M. But choosing wrong between the three models can cost you 6.5x more than necessary. text-embedding-3-large at $0.13/M delivers better retrieval quality for [RAG](https://tokenmix.ai/blog/rag-tutorial-2026) — whether the 6.5x premium is worth it depends on your use case. This guide breaks down every OpenAI [embedding model](https://tokenmix.ai/blog/text-embedding-models-comparison)'s real cost, compares them against Voyage, Cohere, and free alternatives, and shows exactly when to use which. All pricing verified against [OpenAI's official docs](https://platform.openai.com/docs/pricing/) and tracked by [TokenMix.ai](https://tokenmix.ai) as of April 2026.
Table of Contents
- [Quick OpenAI Embedding Pricing Overview]
- [text-embedding-3-small vs 3-large: When 6.5x More is Worth It]
- [text-embedding-ada-002: Should You Migrate?]
- [Batch API: Cut Embedding Costs by 50%]
- [OpenAI Embedding Pricing vs Voyage vs Cohere vs Free Options]
- [Real-World Embedding Cost Scenarios]
- [How to Choose the Right OpenAI Embedding Model]
- [Conclusion]
- [FAQ]
---
Quick OpenAI Embedding Pricing Overview
All prices per 1M tokens, OpenAI direct API, April 2026:
| Model | Standard | Batch (50% off) | Dimensions | Max Input |
| -------------------------- | -------- | --------------- | ---------- | --------- |
| **text-embedding-3-small** | $0.02 | $0.01 | 1536 | 8,191 |
| **text-embedding-3-large** | $0.13 | $0.065 | 3072 | 8,191 |
| text-embedding-ada-002 | $0.10 | $0.05 | 1536 | 8,191 |
**Key insight:** Embeddings are input-only — you pay per token embedded, no output cost. This makes them fundamentally cheaper than chat models per token processed.
text-embedding-3-small at $0.02/M is **90% cheaper than [GPT-5.4](https://tokenmix.ai/blog/gpt-5-api-pricing) Nano input** ($0.20/M). If your workload is primarily search/retrieval rather than generation, embeddings save massive costs.
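To make the table concrete, a few lines of Python can estimate embedding cost for any workload. A minimal sketch, with prices hard-coded from the April 2026 table above (adjust if pricing changes):

```python
# Per-1M-token prices copied from the pricing table above (April 2026).
PRICES = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "text-embedding-ada-002": 0.10,
}

def embedding_cost(tokens: int, model: str, batch: bool = False) -> float:
    """Return estimated USD cost; batch=True applies the 50% Batch API discount."""
    price = PRICES[model]
    if batch:
        price /= 2
    return tokens / 1_000_000 * price

# Example: embedding a 200M-token corpus with 3-small via the Batch API
print(embedding_cost(200_000_000, "text-embedding-3-small", batch=True))  # 2.0
```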
---
text-embedding-3-small vs 3-large: When 6.5x More is Worth It
The small model costs $0.02/M. The large model costs $0.13/M. That's a 6.5x difference. What do you get for it?
| Metric | 3-small | 3-large | Improvement |
| -------------------- | ------- | ------- | ------------------- |
| Price/M tokens | $0.02 | $0.13 | 6.5x more expensive |
| Dimensions | 1536 | 3072 | 2x |
| MTEB retrieval score | ~62% | ~66% | +4 points |
| Storage per vector | ~6KB | ~12KB | 2x |
| Query latency | Faster | Slower | ~1.5x slower |
**Use 3-small when:**
- Cost is primary concern
- Document corpus is large (millions of docs — storage cost matters)
- Use case is simple semantic search or classification
- Retrieval quality "good enough" is acceptable
**Use 3-large when:**
- Retrieval quality is critical (legal, medical, technical RAG)
- Corpus is small-to-medium (storage cost difference is negligible)
- You need maximum accuracy for reranking or multi-step retrieval
- The 4-point MTEB improvement translates to meaningful answer quality
**The 80/20 rule:** For 80% of production RAG applications, 3-small is sufficient. The 4-point quality gap only matters at the margins — if your users notice wrong search results, upgrade.
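The storage side of the tradeoff is easy to quantify. A rough sketch, assuming float32 vectors (4 bytes per dimension) and ignoring index overhead and compression:

```python
def index_size_gb(num_docs: int, dimensions: int, bytes_per_dim: int = 4) -> float:
    """Approximate raw vector storage in GB (float32, no index overhead)."""
    return num_docs * dimensions * bytes_per_dim / 1e9

# 10M documents: 3-small (1536 dims) vs 3-large (3072 dims)
print(index_size_gb(10_000_000, 1536))  # 61.44 GB
print(index_size_gb(10_000_000, 3072))  # 122.88 GB
```

At 100K documents the difference is under a gigabyte either way; at 10M documents, 3-large roughly doubles your vector database footprint.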
---
text-embedding-ada-002: Should You Migrate?
ada-002 was OpenAI's standard embedding model before the v3 series launched. It's still available but there's little reason to stay on it.
| Comparison | ada-002 | 3-small | 3-large |
| -------------- | ------- | ------- | ------- |
| Price/M | $0.10 | $0.02 | $0.13 |
| Quality (MTEB) | ~61% | ~62% | ~66% |
| Dimensions | 1536 | 1536 | 3072 |
**text-embedding-3-small is 5x cheaper AND slightly better than ada-002.** There is no scenario where ada-002 is the right choice for new projects. If you're still on ada-002, migrate to 3-small — you get better quality at 20% of the cost.
Migration is straightforward: same API endpoint, change the model parameter. But you'll need to re-embed your entire corpus since the vector spaces aren't compatible.
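A minimal migration sketch, assuming the official `openai` Python SDK. `reembed_corpus` is a hypothetical helper name; in practice you would pass in an `openai.OpenAI()` client and write the returned vectors to your vector store:

```python
def reembed_corpus(client, texts, model="text-embedding-3-small", batch_size=100):
    """Re-embed `texts` with the new model. `client` is an openai.OpenAI instance.

    Vectors from different models live in incompatible spaces, so the whole
    corpus must be re-embedded and the old index swapped out, not mixed.
    """
    vectors = []
    for i in range(0, len(texts), batch_size):
        resp = client.embeddings.create(model=model, input=texts[i:i + batch_size])
        vectors.extend(d.embedding for d in resp.data)
    return vectors
```

For a large corpus, the Batch API (next section) is the cheaper way to do this one-time re-embed.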
---
Batch API: Cut Embedding Costs by 50%
Embedding workloads are the ideal batch API candidate — they're inherently async (you're building/updating an index, not serving real-time queries).
| Model | Standard | Batch | Monthly cost for 1B tokens | | ---------------------- | -------- | ------ | -------------------------- | | text-embedding-3-small | $0.02 | $0.01 | $20 → **$10** | | text-embedding-3-large | $0.13 | $0.065 | $130 → **$65** |

| Model | Standard | Batch | Monthly cost for 1B tokens |
| ---------------------- | -------- | ------ | -------------------------- |
| text-embedding-3-small | $0.02 | $0.01 | $20 → **$10** |
| text-embedding-3-large | $0.13 | $0.065 | $130 → **$65** |
**Always use batch for initial corpus embedding.** Building a new search index? Migrating to a new model? Re-embedding after dimension changes? Use batch. There's no reason to pay standard rates for non-real-time embedding work.
For real-time embedding (live user queries), standard pricing applies — but individual queries are so cheap ($0.02/M) that the cost is negligible.
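A batch job takes a JSONL file with one request per line. A sketch of generating that file, assuming the documented Batch API request shape (`custom_id`, `method`, `url`, `body`); you would then upload it with `client.files.create(purpose="batch")` and submit it with `client.batches.create(endpoint="/v1/embeddings", completion_window="24h")`:

```python
import json

def batch_lines(docs, model="text-embedding-3-small"):
    """Yield one Batch API request line (JSONL) per document."""
    for i, text in enumerate(docs):
        yield json.dumps({
            "custom_id": f"doc-{i}",   # used to match results back to documents
            "method": "POST",
            "url": "/v1/embeddings",
            "body": {"model": model, "input": text},
        })

with open("embed_batch.jsonl", "w") as f:
    for line in batch_lines(["first document", "second document"]):
        f.write(line + "\n")
```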
---
OpenAI Embedding Pricing vs Voyage vs Cohere vs Free Options
| Provider | Model | Price/M tokens | Dimensions | Quality (MTEB) |
| ---------------- | ---------------------- | -------------- | ---------- | -------------- |
| **OpenAI** | text-embedding-3-small | $0.02 | 1536 | ~62% |
| **OpenAI** | text-embedding-3-large | $0.13 | 3072 | ~66% |
| Voyage AI | voyage-3-large | $0.18 | 1024 | ~67% |
| Cohere | embed-v4.0 | $0.10 | 1024 | ~65% |
| Google | text-embedding-005 | $0.00625 | 768 | ~64% |
| **Free (local)** | nomic-embed-text | $0 (self-host) | 768 | ~61% |
**Key insights from [TokenMix.ai](https://tokenmix.ai) cross-provider data:**
1. **Google text-embedding-005 is the cheapest hosted option** at $0.00625/M — 3x cheaper than OpenAI 3-small. Quality is competitive at 64% MTEB.
2. **Voyage-3-large leads on quality** (67% MTEB) but costs 9x more than OpenAI 3-small. Only worth it for quality-critical applications.
3. **Self-hosted nomic-embed-text is free** if you have GPU capacity. Quality matches ada-002. Best for teams with existing ML infrastructure.
4. **OpenAI 3-small offers the best balance** of quality, price, ecosystem integration, and reliability. It's not the cheapest or the best, but it's "good enough and easy" for most teams.
Through [TokenMix.ai](https://tokenmix.ai), you can access OpenAI embedding models alongside alternatives from multiple providers through a single API.
---
Real-World Embedding Cost Scenarios
Scenario 1: Startup RAG app — 100K documents
- Average document: 2,000 tokens
- Total corpus: 200M tokens (one-time embedding)
- Daily queries: 1,000 (avg 100 tokens each)
| Model | Initial Embed | Monthly Query Cost | Total Year 1 |
| --------------- | ------------- | ------------------ | ------------ |
| 3-small (batch) | $2.00 | $0.06 | $2.72 |
| 3-large (batch) | $13.00 | $0.39 | $17.68 |
| Voyage-3-large | $36.00 | $0.54 | $42.48 |
**text-embedding-3-small costs $2.72 for an entire year** of a 100K-document RAG application. Embedding costs are essentially free at this scale.
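These numbers fall straight out of the prices above: the batch rate for the one-time embed, the standard rate for live queries, assuming a 30-day month:

```python
# Scenario 1: 100K docs x 2,000 tokens; 1,000 queries/day x 100 tokens each
corpus_tokens = 100_000 * 2_000                 # 200M tokens, one-time
query_tokens_per_month = 1_000 * 100 * 30       # 3M tokens

initial = corpus_tokens / 1e6 * 0.01            # 3-small Batch API: $0.01/M
monthly = query_tokens_per_month / 1e6 * 0.02   # standard rate: $0.02/M
year_one = initial + 12 * monthly

print(round(initial, 2), round(monthly, 2), round(year_one, 2))  # 2.0 0.06 2.72
```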
Scenario 2: Enterprise search — 10M documents
- Average document: 3,000 tokens
- Total corpus: 30B tokens
- Daily queries: 50,000
| Model | Initial Embed | Monthly Query Cost | Total Year 1 |
| --------------- | ------------- | ------------------ | ------------ |
| 3-small (batch) | $300 | $3.00 | $336 |
| 3-large (batch) | $1,950 | $19.50 | $2,184 |
| Google embed | $187.50 | $0.94 | $199 |
At enterprise scale, Google's embedding model saves $137/year vs OpenAI 3-small. But if you're already using OpenAI for chat, the integration simplicity of staying with one provider often outweighs the $11/month savings.
---
How to Choose the Right OpenAI Embedding Model
| Your Situation | Recommended Model | Why |
| --------------------------------- | ---------------------------- | ------------------------------------------------ |
| Most RAG applications | text-embedding-3-small | $0.02/M, quality sufficient for 80% of use cases |
| Quality-critical retrieval | text-embedding-3-large | +4 MTEB points worth 6.5x cost for legal/medical |
| Currently on ada-002 | Migrate to 3-small | 5x cheaper, slightly better quality |
| Maximum quality at any cost | Voyage-3-large | Leads MTEB benchmarks |
| Maximum cost savings | Google text-embedding-005 | $0.00625/M — cheapest hosted option |
| Bulk re-embedding, index building | Any model + Batch API | 50% off, no reason not to |
| Privacy-first, own infrastructure | nomic-embed-text (self-host) | Free, decent quality, full control |
---
**Related:** [Compare all model pricing in our complete LLM API pricing comparison](https://tokenmix.ai/blog/llm-api-pricing-comparison)
Conclusion
OpenAI's embedding models are so cheap that cost is rarely the deciding factor. text-embedding-3-small at $0.02/M means a 100K-document RAG app costs under $3/year for embeddings. The real decision is quality: spend 6.5x more for text-embedding-3-large if retrieval accuracy directly impacts user experience, stick with 3-small for everything else.
If you're still on ada-002, migrate today — 3-small is 5x cheaper and slightly better. And always use the [Batch API](https://tokenmix.ai/blog/openai-batch-api-pricing) for corpus embedding — there's no reason to pay standard rates for non-real-time work.
Real-time embedding model pricing across all providers at [tokenmix.ai/models](https://tokenmix.ai/models).
---
FAQ
How much does OpenAI text-embedding-3-small cost?
$0.02 per million tokens at standard rates. With the Batch API (50% off), it drops to $0.01/M. Embedding 1 million tokens costs about 2 cents — making it one of the cheapest API operations available.
Should I use text-embedding-3-small or 3-large?
Use 3-small for most applications — it's 6.5x cheaper ($0.02 vs $0.13/M) with only a 4-point quality gap on MTEB benchmarks. Use 3-large only when retrieval accuracy is critical: legal search, medical RAG, or applications where wrong results have high consequences.
Is text-embedding-ada-002 still worth using?
No. text-embedding-3-small is 5x cheaper ($0.02 vs $0.10/M) and slightly higher quality. Migrate to 3-small for immediate cost savings with no quality loss. You'll need to re-embed your corpus since vector spaces differ.
What are the cheapest embedding models available?
Google text-embedding-005 at $0.00625/M is the cheapest hosted option. OpenAI 3-small at $0.02/M is second. Self-hosted nomic-embed-text is free if you have GPU infrastructure.
How do I reduce OpenAI embedding costs?
Three strategies: (1) Use the Batch API for all non-real-time embedding — saves 50%. (2) Use 3-small instead of 3-large unless quality demands it. (3) Consider Google text-embedding-005 at $0.00625/M for pure cost optimization.
What is the max input size for OpenAI embeddings?
8,191 tokens per embedding request for all three models (3-small, 3-large, ada-002). Documents longer than 8,191 tokens must be chunked before embedding.
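A minimal chunking sketch. For exact token counts you would use a tokenizer such as `tiktoken`; here tokens are approximated as ~4 characters each (a rough heuristic for English text), with a margin that keeps chunks safely under the 8,191-token limit:

```python
def chunk_text(text: str, max_tokens: int = 8000, chars_per_token: int = 4):
    """Split text into chunks under the embedding input limit.

    Approximates tokens as ~4 characters; use tiktoken for exact counts.
    """
    max_chars = max_tokens * chars_per_token
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

chunks = chunk_text("x" * 100_000, max_tokens=8000)
print(len(chunks))  # 4  (100,000 chars / 32,000 chars per chunk, rounded up)
```

Production pipelines usually chunk on semantic boundaries (paragraphs, sections) rather than fixed character windows, but the size constraint is the same.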
---
*Author: TokenMix Research Lab | Last Updated: April 2026 | Data Source: [OpenAI Official Pricing](https://platform.openai.com/docs/pricing/), [TokenMix.ai](https://tokenmix.ai), and [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard)*