TokenMix Research Lab · 2026-06-05

Text Embedding Ada 002 Dimension 2026: 1536-D Legacy Guide
Last Updated: 2026-06-05 · Author: TokenMix Research Lab · Data verified: 2026-06-05 against the OpenAI embeddings guide, embeddings API reference, text-embedding-ada-002 model page, pricing page, model catalog, rate limits, Batch API, and data controls
text-embedding-ada-002 returns 1536-dimensional vectors. In 2026 it is a legacy-safe model, not the cheapest OpenAI embedding choice.
OpenAI's embeddings API reference shows text-embedding-ada-002 returning "1536 floats total" in the embedding object (OpenAI embeddings API). The same API reference says the optional dimensions parameter is only supported in text-embedding-3 and later models, so ada-002 dimensions are not adjustable through that field (Embeddings API). OpenAI's current pricing page lists text-embedding-ada-002 at $0.10 per 1M tokens and $0.05 per 1M tokens through Batch, while text-embedding-3-small is $0.02 standard and $0.01 Batch (OpenAI pricing). The embeddings guide lists ada-002 at 12,500 pages per dollar, 61.0% MTEB, and 8192 max input tokens (OpenAI embeddings guide).
Table of Contents
- Quick Verdict
- Dimension Facts
- Pricing and Storage Math
- Rate Limits
- Migration Matrix
- Code Examples
- Cost and Storage Scenarios
- Risks and Caveats
- Final Recommendation
- FAQ
- Sources
- Related Articles
Quick Verdict
| Claim | Status | Source |
|---|---|---|
| text-embedding-ada-002 returns 1536 floats | Confirmed | Embeddings API |
| The ada-002 vector dimension can be changed with the dimensions parameter | False | OpenAI says dimensions is only supported in text-embedding-3 and later |
| ada-002 max input is 8192 tokens | Confirmed | Embeddings guide, model page |
| ada-002 costs $0.10 per 1M tokens | Confirmed | OpenAI pricing |
| ada-002 Batch cost is $0.05 per 1M tokens | Confirmed | OpenAI pricing |
| text-embedding-3-small is cheaper than ada-002 | Confirmed | OpenAI pricing |
| OpenAI lists text-embedding-3-small at 1536 dimensions by default | Confirmed | Embeddings guide |
| ada-002 is still listed as an available older embedding model | Confirmed | ada-002 model page |
| ada-002 is the best OpenAI embedding model in 2026 | False | OpenAI calls text-embedding-3-small and 3-large newer and more performant |
| Existing ada-002 indexes should be migrated blindly | False | Re-embedding changes vector space and requires recall validation |
| Most new OpenAI embedding projects should start on text-embedding-3-small | Likely | It is cheaper, newer, and has the same default dimension |
| More teams will keep ada-002 only for legacy index compatibility | Speculation | No OpenAI migration mandate found |
Dimension Facts
| Model | Default dimension | Adjustable dimensions? | Max input | Price / 1M tokens | Status |
|---|---|---|---|---|---|
| text-embedding-ada-002 | 1536 | No documented dimensions support | 8192 | $0.10 | Confirmed |
| text-embedding-3-small | 1536 | Yes | 8192 | $0.02 | Confirmed |
| text-embedding-3-large | 3072 | Yes | 8192 | $0.13 | Confirmed |
The important detail is not just "1536." It is vector-space compatibility. A 1536-dimensional ada-002 vector and a 1536-dimensional text-embedding-3-small vector are not interchangeable. You cannot mix them in one index and expect distance scores to mean the same thing.
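One way to enforce this in application code is to store the model name next to every vector and refuse cross-model comparisons. The sketch below is a hypothetical guard, not part of the OpenAI SDK or any vector database; `TaggedVector` and `cosine_similarity` are illustrative names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedVector:
    """An embedding plus the name of the model that produced it (hypothetical helper)."""
    model: str
    values: tuple


def cosine_similarity(a: TaggedVector, b: TaggedVector) -> float:
    """Cosine similarity that rejects vectors produced by different models."""
    if a.model != b.model:
        raise ValueError(f"cannot compare {a.model} with {b.model}: different vector spaces")
    if len(a.values) != len(b.values):
        raise ValueError("dimension mismatch")
    dot = sum(x * y for x, y in zip(a.values, b.values))
    norm_a = math.sqrt(sum(x * x for x in a.values))
    norm_b = math.sqrt(sum(y * y for y in b.values))
    return dot / (norm_a * norm_b)
```

The length check alone would not catch the ada-002 versus 3-small case, because both default to 1536 values. The model tag is what carries the compatibility information.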
Pricing and Storage Math
| Model | Standard cost / 1M tokens | Batch cost / 1M tokens | $10 buys standard tokens | $10 buys Batch tokens | Status |
|---|---|---|---|---|---|
| text-embedding-3-small | $0.02 | $0.01 | 500M | 1B | Confirmed |
| text-embedding-ada-002 | $0.10 | $0.05 | 100M | 200M | Confirmed |
| text-embedding-3-large | $0.13 | $0.065 | 76.9M | 153.8M | Confirmed |
Cost calculation 1: embedding 100M tokens costs $10 on ada-002 standard, $5 on ada-002 Batch, $2 on text-embedding-3-small standard, and $1 on text-embedding-3-small Batch. For new projects, ada-002 needs a compatibility reason to justify the 5x standard price gap against 3-small.
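The arithmetic behind that comparison can be sketched as a small helper. The prices are copied from the pricing table in this article and should be re-checked against OpenAI's pricing page before budgeting; `embedding_cost_usd` is an illustrative name, not an SDK function.

```python
from decimal import Decimal

# USD per 1M tokens, taken from this article's pricing table (may change over time).
PRICE_PER_1M = {
    ("text-embedding-ada-002", "standard"): Decimal("0.10"),
    ("text-embedding-ada-002", "batch"): Decimal("0.05"),
    ("text-embedding-3-small", "standard"): Decimal("0.02"),
    ("text-embedding-3-small", "batch"): Decimal("0.01"),
    ("text-embedding-3-large", "standard"): Decimal("0.13"),
    ("text-embedding-3-large", "batch"): Decimal("0.065"),
}


def embedding_cost_usd(tokens: int, model: str, mode: str = "standard") -> Decimal:
    """Return the embedding cost in USD for a given token count."""
    return PRICE_PER_1M[(model, mode)] * Decimal(tokens) / Decimal(1_000_000)


# Print the cost of 100M tokens for every model and mode in the table.
for (model, mode) in PRICE_PER_1M:
    print(model, mode, embedding_cost_usd(100_000_000, model, mode))
```

Decimal is used instead of float so that per-token prices such as 0.065 do not pick up binary rounding noise in the totals.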
Storage calculation: 1M ada-002 vectors at 1536 dimensions stored as float32 use 1,000,000 x 1536 x 4 = 6.144 GB before vector database metadata, indexes, replicas, and compression. If the vector DB stores two replicas plus index overhead, real provisioned storage can be several times higher. That storage math is independent of OpenAI token price.
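The same storage math as a minimal sketch. The `replicas` and `overhead_factor` parameters are assumptions added for illustration; real multipliers depend on the vector database and index type, so measure them rather than trusting a default.

```python
def raw_vector_bytes(num_vectors: int, dimensions: int = 1536, bytes_per_value: int = 4) -> int:
    """Raw bytes for dense vectors (float32 uses 4 bytes per value)."""
    return num_vectors * dimensions * bytes_per_value


def provisioned_gb(num_vectors: int, dimensions: int = 1536, bytes_per_value: int = 4,
                   replicas: int = 1, overhead_factor: float = 1.0) -> float:
    """Rough provisioned size in decimal GB; overhead_factor is a hypothetical multiplier."""
    raw = raw_vector_bytes(num_vectors, dimensions, bytes_per_value)
    return raw * replicas * overhead_factor / 1e9


print(raw_vector_bytes(1_000_000) / 1e9)  # raw GB for 1M float32 vectors at 1536 dimensions
print(provisioned_gb(1_000_000, replicas=2, overhead_factor=1.5))
```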
Rate Limits
| Usage tier | RPM | RPD | TPM | Batch queue limit | Status |
|---|---|---|---|---|---|
| Free | 100 | 2,000 | 40,000 | Not listed | Confirmed |
| Tier 1 | 3,000 | Not listed | 1,000,000 | 3,000,000 | Confirmed |
| Tier 2 | 5,000 | Not listed | 1,000,000 | 20,000,000 | Confirmed |
| Tier 3 | 5,000 | Not listed | 5,000,000 | 100,000,000 | Confirmed |
| Tier 4 | 10,000 | Not listed | 5,000,000 | 500,000,000 | Confirmed |
| Tier 5 | 10,000 | Not listed | 10,000,000 | 4,000,000,000 | Confirmed |
OpenAI rate limits vary by account and tier, so use the dashboard as runtime truth. The model page gives a public baseline, but launch planning should still read your live account limits.
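For launch planning, the TPM column gives a lower bound on how long a bulk job takes through the standard endpoint. The sketch below assumes TPM is the only constraint and ignores RPM, retries, and per-request overhead, so real runs will take longer; `minutes_to_embed` is an illustrative name.

```python
def minutes_to_embed(total_tokens: int, tokens_per_minute: int) -> int:
    """Lower bound on wall-clock minutes when TPM is the only limit."""
    if tokens_per_minute <= 0:
        raise ValueError("tokens_per_minute must be positive")
    # Integer ceiling division avoids float rounding on large token counts.
    return -(-total_tokens // tokens_per_minute)


# TPM values copied from the table above; substitute your live account limits.
for tier, tpm in [("Free", 40_000), ("Tier 1", 1_000_000), ("Tier 5", 10_000_000)]:
    print(tier, minutes_to_embed(100_000_000, tpm), "minutes for 100M tokens")
```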
Migration Matrix
| Situation | Stay on ada-002? | Move to 3-small? | Move to 3-large? | Status |
|---|---|---|---|---|
| Existing production index built on ada-002 | Yes until migration test passes | Yes after side-by-side recall eval | Maybe if quality gain pays | Likely |
| New semantic search project | No strong reason | Best default | Use if recall needs it | Likely |
| Storage-sensitive mobile/edge index | Maybe if legacy | Use adjustable dimensions | Use only if quality matters more than storage | Confirmed for dimensions support |
| Cost-sensitive batch embedding | No unless compatibility | Strongest cost pick | More expensive but higher MTEB | Confirmed |
| Mixed old and new chunks in same index | Risky | Re-embed all chunks together | Re-embed all chunks together | Likely |
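
The side-by-side recall evaluation mentioned in the matrix can be sketched as follows. It assumes you already have query and document embeddings from each model plus a labeled set of relevant document IDs per query; the function names and data layout are hypothetical, and a production evaluation would use a vector library rather than pure Python.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def top_k(query_vec, doc_vecs: dict, k: int) -> list:
    """Return the IDs of the k documents most similar to the query."""
    ranked = sorted(doc_vecs, key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]), reverse=True)
    return ranked[:k]


def recall_at_k(query_vecs: dict, doc_vecs: dict, relevant: dict, k: int) -> float:
    """Mean fraction of labeled relevant documents found in the top k per query."""
    scores = []
    for query_id, query_vec in query_vecs.items():
        hits = set(top_k(query_vec, doc_vecs, k)) & relevant[query_id]
        scores.append(len(hits) / len(relevant[query_id]))
    return sum(scores) / len(scores)
```

Run `recall_at_k` once with ada-002 embeddings of both queries and documents, then once with the candidate model's embeddings of the same texts, and compare the two numbers. Queries and documents must always come from the same model within a single run.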
If the broader question is not ada-002 but OpenAI model cost, read OpenAI API Cost 2026. If you want a low-cost model route across providers, start with Cheapest AI API Providers 2026.
Code Examples
Python:
from openai import OpenAI

# The client reads OPENAI_API_KEY from the environment by default.
client = OpenAI()

response = client.embeddings.create(
    model="text-embedding-ada-002",
    input="TokenMix compares model cost, latency, and API access.",
    encoding_format="float",
)

vector = response.data[0].embedding
print(len(vector))  # 1536
cURL:
curl https://api.openai.com/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-ada-002",
    "input": "TokenMix compares model cost, latency, and API access.",
    "encoding_format": "float"
  }'
Do not send dimensions with ada-002. OpenAI documents that parameter for text-embedding-3 and later models.
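A small guard can keep that rule out of individual call sites. This is a sketch, not SDK behavior: `build_embedding_request` is a hypothetical helper, and the prefix check on `text-embedding-3` is an assumption that would need updating for later model families.

```python
from typing import Optional


def build_embedding_request(model: str, text: str, dimensions: Optional[int] = None) -> dict:
    """Build keyword arguments for client.embeddings.create.

    The dimensions field is only added for text-embedding-3 models (assumed prefix check).
    """
    kwargs = {"model": model, "input": text, "encoding_format": "float"}
    if dimensions is not None:
        if not model.startswith("text-embedding-3"):
            raise ValueError(f"{model} has no documented dimensions support")
        kwargs["dimensions"] = dimensions
    return kwargs


# ada-002 request: no dimensions key is ever included.
print(build_embedding_request("text-embedding-ada-002", "hello"))
# 3-small request with a reduced dimension.
print(build_embedding_request("text-embedding-3-small", "hello", dimensions=512))
```

The result can be passed as `client.embeddings.create(**kwargs)`. Failing fast locally is a design choice here; it surfaces the mistake before a request is sent.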
Cost and Storage Scenarios
| Scenario | Token volume | ada-002 standard | ada-002 Batch | 3-small standard | Best action |
|---|---|---|---|---|---|
| Small docs corpus | 10M | $1.00 | $0.50 | $0.20 | Use 3-small unless legacy |
| Medium SaaS help center | 100M | $10.00 | $5.00 | $2.00 | Re-embed with 3-small for new index |
| Large RAG corpus | 1B | $100.00 | $50.00 | $20.00 | Batch all embeddings |
| 1M-vector storage | 1536 float32 vectors | 6.144 GB raw | Same | Same for 3-small default | Budget DB overhead |
| 10M-vector storage | 1536 float32 vectors | 61.44 GB raw | Same | Same for 3-small default | Consider compression |
Cost calculation 2: a 10M-document corpus averaging 500 tokens per document is 5B tokens. That costs $500 on ada-002 standard, $250 on ada-002 Batch, $100 on 3-small standard, and $50 on 3-small Batch. At this volume, the choice of model and endpoint moves the bill by hundreds of dollars, which matters more than how the one-time embedding script is written.
Risks and Caveats
| Risk | What breaks | Mitigation | Status |
|---|---|---|---|
| Mixing embedding models | Distance scores become inconsistent | Re-embed the whole index per model | Likely |
| Assuming dimension equals quality | 1536-D models can perform differently | Measure retrieval recall | Confirmed |
| Sending dimensions to ada-002 | Unsupported parameter risk | Use text-embedding-3 for dimension control | Confirmed |
| Ignoring storage overhead | Vector DB bill exceeds raw math | Include replicas, metadata, index overhead | Likely |
| Embedding over 8192 tokens | Request fails or must be chunked | Chunk before embedding | Confirmed |
| Migrating without eval | Search quality regresses silently | Side-by-side recall and click tests | Likely |
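
The chunking mitigation for the 8192-token limit can be sketched as a fixed-window split over token IDs. This assumes the text has already been tokenized, for example with the tiktoken library, which is not shown here. `chunk_token_ids` is an illustrative name, and real pipelines usually split on sentence or paragraph boundaries instead of hard token offsets.

```python
def chunk_token_ids(token_ids: list, max_tokens: int = 8192) -> list:
    """Split a list of token IDs into consecutive windows of at most max_tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return [token_ids[i:i + max_tokens] for i in range(0, len(token_ids), max_tokens)]


# Stand-in token IDs for a 20,000-token document (real IDs would come from a tokenizer).
fake_ids = list(range(20_000))
print([len(chunk) for chunk in chunk_token_ids(fake_ids)])
```

Each chunk becomes its own embedding, so downstream retrieval has to map chunk vectors back to the parent document.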
Final Recommendation
Keep ada-002 only when you need legacy vector-space compatibility. For new OpenAI embedding work in 2026, start with text-embedding-3-small, use Batch for bulk jobs, and migrate old indexes only after recall tests prove the new vectors work.
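For the bulk-job part of that recommendation, the Batch API takes a JSONL file with one request per line. The sketch below only builds those lines. The field layout follows the Batch input format as documented by OpenAI at the time of writing, so verify it against the current Batch API docs. The `doc-{i}` custom ID scheme is a hypothetical choice, and uploading the file and creating the batch are omitted.

```python
import json


def batch_lines(texts: list, model: str = "text-embedding-3-small") -> list:
    """Return one JSON string per embedding request, in Batch input layout."""
    lines = []
    for i, text in enumerate(texts):
        request = {
            "custom_id": f"doc-{i}",  # hypothetical ID scheme for matching results to inputs
            "method": "POST",
            "url": "/v1/embeddings",
            "body": {"model": model, "input": text},
        }
        lines.append(json.dumps(request))
    return lines


# Write the requests to a local JSONL file for later upload.
with open("embedding_batch.jsonl", "w", encoding="utf-8") as handle:
    handle.write("\n".join(batch_lines(["first chunk", "second chunk"])) + "\n")
```

Batch results are not guaranteed to come back in input order, which is why each request carries its own `custom_id`.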
FAQ
What is the dimension of text-embedding-ada-002?
text-embedding-ada-002 returns 1536-dimensional vectors. OpenAI's API reference shows the ada-002 response as 1536 floats.
Can I reduce ada-002 dimensions?
No documented OpenAI parameter reduces ada-002 dimensions. The dimensions parameter is only supported in text-embedding-3 and later models.
Is ada-002 still available in 2026?
Yes, OpenAI still lists text-embedding-ada-002 as an older embedding model. That does not make it the best default for new projects.
How much does ada-002 cost?
OpenAI lists ada-002 at $0.10 per 1M tokens and $0.05 per 1M tokens through Batch. text-embedding-3-small is cheaper at $0.02 standard and $0.01 Batch.
Can I mix ada-002 and text-embedding-3-small vectors?
Do not mix them in one index. They can have the same 1536 length, but they are different vector spaces.
What is ada-002 max input length?
OpenAI lists 8192 max input tokens for ada-002 in the embeddings guide and model page. Longer documents should be chunked before embedding.
Should I migrate existing ada-002 indexes?
Migrate only after side-by-side retrieval tests. Re-embedding can improve cost and performance, but it can also change ranking behavior.
What is the storage size of 1M ada-002 vectors?
At 1536 dimensions and float32 storage, raw vectors use about 6.144 GB for 1M rows. Real vector database storage can be higher because of metadata, indexes, replicas, and compression settings.
Sources
- OpenAI Embeddings API Reference - official ada-002 1536-float response, input limits, and dimensions support scope
- OpenAI Embeddings Guide - official model comparison, max input, use cases, MTEB, and pages-per-dollar data
- text-embedding-ada-002 Model Page - official model status, pricing, endpoint, and rate-limit table
- OpenAI Pricing - official embedding standard and Batch prices
- OpenAI Batch API - official async Batch API cost and limit framing
- OpenAI Rate Limits - official RPM, TPM, and usage-tier guidance
- OpenAI Models - official model catalog and current embedding family positioning
- OpenAI Data Controls - official data processing and regional processing context