TokenMix Research Lab · 2026-06-05

Text Embedding Ada 002 Dimension 2026: 1536-D Legacy Guide

Last Updated: 2026-06-05
Author: TokenMix Research Lab
Data verified: 2026-06-05 (OpenAI embeddings guide, embeddings API reference, text-embedding-ada-002 model page, pricing page, model catalog, rate limits, Batch API, and data controls)

text-embedding-ada-002 returns 1536-dimensional vectors. In 2026 it is still available as a legacy model, but it is not the cheapest OpenAI embedding choice.

OpenAI's embeddings API reference shows text-embedding-ada-002 returning "1536 floats total" in the embedding object (OpenAI Embeddings API Reference). The same reference says the optional dimensions parameter is only supported in text-embedding-3 and later models, so ada-002 dimensions are not adjustable through that field (OpenAI Embeddings API Reference). OpenAI's current pricing page lists text-embedding-ada-002 at $0.10 per 1M tokens standard and $0.05 per 1M tokens through Batch, while text-embedding-3-small is $0.02 standard and $0.01 Batch per 1M tokens (OpenAI Pricing). The embeddings guide lists ada-002 at 12,500 pages per dollar, 61.0% on MTEB, and 8192 max input tokens (OpenAI Embeddings Guide).

Quick Verdict
Dimension Facts
Pricing and Storage Math
Rate Limits
Migration Matrix
Code Examples
Cost and Storage Scenarios
Risks and Caveats
Final Recommendation
FAQ
Sources
Related Articles

Quick Verdict

Claim	Status	Source
`text-embedding-ada-002` returns 1536 floats	Confirmed	Embeddings API
The ada-002 vector dimension can be changed with the `dimensions` parameter	False	OpenAI says `dimensions` is only supported in `text-embedding-3` and later
ada-002 max input is 8192 tokens	Confirmed	Embeddings guide, model page
ada-002 costs $0.10 per 1M tokens	Confirmed	OpenAI pricing
ada-002 Batch cost is $0.05 per 1M tokens	Confirmed	OpenAI pricing
`text-embedding-3-small` is cheaper than ada-002	Confirmed	OpenAI pricing
OpenAI lists `text-embedding-3-small` at 1536 dimensions by default	Confirmed	Embeddings guide
ada-002 is still listed as an available older embedding model	Confirmed	ada-002 model page
ada-002 is the best OpenAI embedding model in 2026	False	OpenAI calls `text-embedding-3-small` and `3-large` newer and more performant
Existing ada-002 indexes should be migrated blindly	False	Re-embedding changes vector space and requires recall validation
Most new OpenAI embedding projects should start on `text-embedding-3-small`	Likely	It is cheaper, newer, and has the same default dimension
More teams will keep ada-002 only for legacy index compatibility	Speculation	No OpenAI migration mandate found

Dimension Facts

Model	Default dimension	Adjustable dimensions?	Max input (tokens)	Standard price / 1M tokens	Status
`text-embedding-ada-002`	1536	No documented `dimensions` support	8192	$0.10	Confirmed
`text-embedding-3-small`	1536	Yes	8192	$0.02	Confirmed
`text-embedding-3-large`	3072	Yes	8192	$0.13	Confirmed

The important detail is not just "1536." It is vector-space compatibility. A 1536-dimensional ada-002 vector and a 1536-dimensional text-embedding-3-small vector are not interchangeable. You cannot mix them in one index and expect distance scores to mean the same thing.
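
One way to enforce this in application code is to record the producing model next to the index and reject anything else. The sketch below is a toy in-memory illustration, not a real vector database API; the class name `SingleModelIndex` and its fields are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SingleModelIndex:
    """Toy in-memory index that accepts vectors from exactly one embedding model."""

    model: str
    dimension: int
    vectors: dict = field(default_factory=dict)

    def add(self, doc_id: str, vector: list, model: str) -> None:
        # Matching length is not enough: the producing model must match too.
        if model != self.model:
            raise ValueError(
                f"index holds {self.model} vectors, got a vector from {model}"
            )
        if len(vector) != self.dimension:
            raise ValueError(
                f"expected {self.dimension} dimensions, got {len(vector)}"
            )
        self.vectors[doc_id] = vector


index = SingleModelIndex(model="text-embedding-ada-002", dimension=1536)
index.add("doc-1", [0.0] * 1536, model="text-embedding-ada-002")

try:
    # Correct length, wrong model: rejected.
    index.add("doc-2", [0.0] * 1536, model="text-embedding-3-small")
except ValueError as err:
    print(err)
```

A dimension check alone would have accepted the second vector, which is exactly the failure mode to avoid.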

Pricing and Storage Math

Model	Standard cost / 1M tokens	Batch cost / 1M tokens	$10 buys standard tokens	$10 buys Batch tokens	Status
`text-embedding-3-small`	$0.02	$0.01	500M	1B	Confirmed
`text-embedding-ada-002`	$0.10	$0.05	100M	200M	Confirmed
`text-embedding-3-large`	$0.13	$0.065	76.9M	153.8M	Confirmed

Cost calculation 1: embedding 100M tokens costs $10 on ada-002 standard, $5 on ada-002 Batch, $2 on text-embedding-3-small standard, and $1 on text-embedding-3-small Batch. For new projects, ada-002 needs a compatibility reason to justify the 5x standard price gap against 3-small.
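
The arithmetic behind these figures is a single multiplication. A minimal sketch, assuming the per-1M-token prices from the table above stay current; the function name `embedding_cost` and the price dictionary are illustrative, not part of any OpenAI SDK.

```python
# USD per 1M tokens, copied from the pricing table above.
PRICE_PER_MILLION = {
    ("text-embedding-ada-002", "standard"): 0.10,
    ("text-embedding-ada-002", "batch"): 0.05,
    ("text-embedding-3-small", "standard"): 0.02,
    ("text-embedding-3-small", "batch"): 0.01,
    ("text-embedding-3-large", "standard"): 0.13,
    ("text-embedding-3-large", "batch"): 0.065,
}


def embedding_cost(tokens: int, model: str, mode: str = "standard") -> float:
    """Return the USD cost of embedding `tokens` tokens."""
    return tokens / 1_000_000 * PRICE_PER_MILLION[(model, mode)]


for model in ("text-embedding-ada-002", "text-embedding-3-small"):
    for mode in ("standard", "batch"):
        cost = embedding_cost(100_000_000, model, mode)
        print(f"{model} {mode}: ${cost:.2f}")
```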

Storage calculation: 1M ada-002 vectors at 1536 dimensions stored as float32 use 1,000,000 x 1536 x 4 = 6.144 GB before vector database metadata, indexes, replicas, and compression. If the vector DB stores two replicas plus index overhead, real provisioned storage can be several times higher. That storage math is independent of OpenAI token price.
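
The same storage math as a small helper. The raw-size formula comes straight from the calculation above; the `copies` and `index_overhead` parameters, and the example values of 3 copies and 50% overhead, are assumptions for illustration only and should be replaced with figures from your vector database.

```python
def raw_vector_bytes(num_vectors: int, dimension: int = 1536, bytes_per_value: int = 4) -> int:
    """Raw size of the vectors alone (float32 by default), in bytes."""
    return num_vectors * dimension * bytes_per_value


def provisioned_gb(raw_bytes: int, copies: int = 1, index_overhead: float = 0.0) -> float:
    """Rough provisioned size in GB (1 GB = 1e9 bytes).

    copies: total stored copies (primary plus replicas).
    index_overhead: extra fraction added by the index structure (assumed value).
    """
    return raw_bytes * copies * (1 + index_overhead) / 1e9


raw = raw_vector_bytes(1_000_000)
print(f"raw: {raw / 1e9:.3f} GB")
# Hypothetical deployment: one primary plus two replicas, 50% index overhead.
print(f"provisioned: {provisioned_gb(raw, copies=3, index_overhead=0.5):.3f} GB")
```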

Rate Limits

Usage tier	RPM (requests/min)	RPD (requests/day)	TPM (tokens/min)	Batch queue limit	Status
Free	100	2,000	40,000	Not listed	Confirmed
Tier 1	3,000	Not listed	1,000,000	3,000,000	Confirmed
Tier 2	5,000	Not listed	1,000,000	20,000,000	Confirmed
Tier 3	5,000	Not listed	5,000,000	100,000,000	Confirmed
Tier 4	10,000	Not listed	5,000,000	500,000,000	Confirmed
Tier 5	10,000	Not listed	10,000,000	4,000,000,000	Confirmed

OpenAI rate limits vary by account and tier, so treat the limits shown in your account dashboard as authoritative. The model page gives a public baseline, but launch planning should be based on your live account limits.
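
For planning, the table translates into a lower bound on how long a synchronous embedding job takes. The sketch below uses the Tier 1 and Tier 5 figures from the table; the 500-token request size is an assumption, and the estimate ignores latency, retries, and concurrency limits.

```python
import math


def min_minutes(total_tokens: int, tokens_per_request: int, tpm: int, rpm: int) -> int:
    """Lower bound on wall-clock minutes under both TPM and RPM limits."""
    requests_needed = math.ceil(total_tokens / tokens_per_request)
    minutes_by_tokens = math.ceil(total_tokens / tpm)
    minutes_by_requests = math.ceil(requests_needed / rpm)
    # Whichever limit is tighter sets the floor.
    return max(minutes_by_tokens, minutes_by_requests)


# 100M tokens sent as 500-token requests (request size is an assumption).
tier1 = min_minutes(100_000_000, 500, tpm=1_000_000, rpm=3_000)
tier5 = min_minutes(100_000_000, 500, tpm=10_000_000, rpm=10_000)
print(tier1, tier5)
```

With these inputs the token limit is the bottleneck at Tier 1, while the request limit becomes the bottleneck at Tier 5, so larger requests matter more as the token limit rises.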

Migration Matrix

Situation	Stay on ada-002?	Move to `3-small`?	Move to `3-large`?	Status
Existing production index built on ada-002	Yes until migration test passes	Yes after side-by-side recall eval	Maybe if quality gain pays	Likely
New semantic search project	No strong reason	Best default	Use if recall needs it	Likely
Storage-sensitive mobile/edge index	Maybe if legacy	Use adjustable dimensions	Use only if quality matters more than storage	Confirmed for dimensions support
Cost-sensitive batch embedding	No unless compatibility	Strongest cost pick	More expensive but higher MTEB	Confirmed
Mixed old and new chunks in same index	Risky	Re-embed all chunks together	Re-embed all chunks together	Likely
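
A side-by-side recall eval can be as simple as comparing recall@k for the old and new index over a labeled query set. This is a minimal sketch: the function names, the toy document IDs, and the choice of recall@k as the metric are assumptions, and a real evaluation would use your own queries and relevance labels.

```python
def recall_at_k(retrieved: list, relevant: set, k: int) -> float:
    """Fraction of relevant documents that appear in the top-k results."""
    if not relevant:
        raise ValueError("each query needs at least one relevant document")
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def mean_recall_at_k(results: dict, labels: dict, k: int) -> float:
    """Average recall@k over every labeled query."""
    scores = [recall_at_k(results[query], labels[query], k) for query in labels]
    return sum(scores) / len(scores)


# Toy data: relevance labels plus ranked results from each index.
labels = {"q1": {"a", "b"}, "q2": {"c"}}
ada_results = {"q1": ["a", "x", "b"], "q2": ["y", "c", "z"]}
small_results = {"q1": ["a", "b", "x"], "q2": ["y", "z", "c"]}

ada_score = mean_recall_at_k(ada_results, labels, k=2)
small_score = mean_recall_at_k(small_results, labels, k=2)
print(f"ada-002: {ada_score:.2f}  3-small: {small_score:.2f}")
```

In this toy data the new index scores lower at k=2, which is the kind of regression the test is meant to catch before cutover.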

If the broader question is not ada-002 but OpenAI model cost, read OpenAI API Cost 2026. If you want a low-cost model route across providers, start with Cheapest AI API Providers 2026.

Code Examples

Python:

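```python
from openai import OpenAI

# The client reads OPENAI_API_KEY from the environment.
client = OpenAI()

# No `dimensions` argument: ada-002 returns a fixed 1536-float vector.
response = client.embeddings.create(
    model="text-embedding-ada-002",
    input="TokenMix compares model cost, latency, and API access.",
    encoding_format="float",
)

vector = response.data[0].embedding
print(len(vector))  # 1536
```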

cURL:

```shell
curl https://api.openai.com/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-ada-002",
    "input": "TokenMix compares model cost, latency, and API access.",
    "encoding_format": "float"
  }'
```

Do not send dimensions with ada-002. OpenAI documents that parameter for text-embedding-3 and later models.
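
If one code path serves several models, a small guard can keep `dimensions` out of ada-002 requests. This is a sketch under assumptions: `build_embedding_kwargs` is a hypothetical helper, the `text-embedding-3` prefix check is a simplification of "text-embedding-3 and later", and 512 is only an example value.

```python
from typing import Optional


def build_embedding_kwargs(model: str, text: str, dimensions: Optional[int] = None) -> dict:
    """Build keyword arguments for an embeddings request.

    Raises ValueError if `dimensions` is requested for a model outside the
    text-embedding-3 family (simplified prefix check, an assumption).
    """
    kwargs = {"model": model, "input": text, "encoding_format": "float"}
    if dimensions is not None:
        if not model.startswith("text-embedding-3"):
            raise ValueError(f"{model} does not document a dimensions parameter")
        kwargs["dimensions"] = dimensions
    return kwargs


# Shortened vectors are requested only from a text-embedding-3 model.
small_kwargs = build_embedding_kwargs("text-embedding-3-small", "hello", dimensions=512)
ada_kwargs = build_embedding_kwargs("text-embedding-ada-002", "hello")
print(small_kwargs)
print(ada_kwargs)
```

The resulting dictionary can be passed to the client call shown above as `client.embeddings.create(**small_kwargs)`.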

Cost and Storage Scenarios

Scenario	Token volume	ada-002 standard	ada-002 Batch	`3-small` standard	Best action
Small docs corpus	10M	$1.00	$0.50	$0.20	Use `3-small` unless legacy
Medium SaaS help center	100M	$10.00	$5.00	$2.00	Re-embed with `3-small` for new index
Large RAG corpus	1B	$100.00	$50.00	$20.00	Batch all embeddings

Scenario	Vector format	ada-002 raw size	`3-small` default raw size	Best action
1M-vector storage	1536-D float32	6.144 GB	6.144 GB	Budget DB overhead
10M-vector storage	1536-D float32	61.44 GB	61.44 GB	Consider compression

Cost calculation 2: a 10M-document corpus averaging 500 tokens per document is 5B tokens. That costs $500 on ada-002 standard, $250 on ada-002 Batch, $100 on 3-small standard, and $50 on 3-small Batch. At this scale the model and Batch choice drive the bill: the same job ranges from $50 to $500.

Risks and Caveats

Risk	What breaks	Mitigation	Status
Mixing embedding models	Distance scores become inconsistent	Keep one model per index; re-embed the whole index when switching	Likely
Assuming dimension equals quality	1536-D models can perform differently	Measure retrieval recall	Confirmed
Sending `dimensions` to ada-002	Unsupported parameter risk	Use `text-embedding-3` for dimension control	Confirmed
Ignoring storage overhead	Vector DB bill exceeds raw math	Include replicas, metadata, index overhead	Likely
Embedding over 8192 tokens	Request fails or must be chunked	Chunk before embedding	Confirmed
Migrating without eval	Search quality regresses silently	Side-by-side recall and click tests	Likely
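
For the 8192-token risk, the simplest mitigation is a fixed-size split over token IDs. The sketch assumes tokenization has already happened elsewhere (for example with a tokenizer library) and that plain windowing is acceptable; production pipelines often split on paragraph or sentence boundaries instead. The helper name `chunk_token_ids` is hypothetical.

```python
def chunk_token_ids(token_ids: list, max_tokens: int = 8192) -> list:
    """Split a token-ID list into consecutive chunks of at most `max_tokens`."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return [
        token_ids[start:start + max_tokens]
        for start in range(0, len(token_ids), max_tokens)
    ]


# Stand-in for a tokenized 20,000-token document.
document_tokens = list(range(20_000))
chunks = chunk_token_ids(document_tokens)
print([len(chunk) for chunk in chunks])
```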

Final Recommendation

Keep ada-002 only when you need legacy vector-space compatibility. For new OpenAI embedding work in 2026, start with text-embedding-3-small, use Batch for bulk jobs, and migrate old indexes only after recall tests prove the new vectors work.
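
For bulk jobs, Batch input is a JSONL file with one request per line. The sketch below builds those lines for the embeddings endpoint; the field layout follows the Batch API request format as understood at the time of writing and should be checked against the current Batch API documentation. The helper name `batch_lines` and the sample IDs are hypothetical.

```python
import json


def batch_lines(texts: dict, model: str = "text-embedding-3-small") -> list:
    """Return one JSON string per document, ready to be written as JSONL."""
    lines = []
    for custom_id, text in texts.items():
        request = {
            "custom_id": custom_id,  # used to match results back to documents
            "method": "POST",
            "url": "/v1/embeddings",
            "body": {"model": model, "input": text},
        }
        lines.append(json.dumps(request))
    return lines


docs = {"doc-1": "First chunk of text.", "doc-2": "Second chunk of text."}
lines = batch_lines(docs)
print("\n".join(lines))
```

Writing these lines to a file gives the input for a Batch job; uploading the file and creating the job are separate API calls not shown here.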

FAQ

What is the dimension of `text-embedding-ada-002`?

text-embedding-ada-002 returns 1536-dimensional vectors. OpenAI's API reference shows the ada-002 response as 1536 floats.

Can I reduce ada-002 dimensions?

No documented OpenAI parameter reduces ada-002 dimensions. The dimensions parameter is only supported in text-embedding-3 and later models.

Is ada-002 still available in 2026?

Yes, OpenAI still lists text-embedding-ada-002 as an older embedding model. That does not make it the best default for new projects.

How much does ada-002 cost?

OpenAI lists ada-002 at $0.10 per 1M tokens and $0.05 per 1M tokens through Batch. text-embedding-3-small is cheaper at $0.02 standard and $0.01 Batch.

Can I mix ada-002 and text-embedding-3-small vectors?

Do not mix them in one index. They can have the same 1536 length, but they are different vector spaces.

What is ada-002 max input length?

OpenAI lists 8192 max input tokens for ada-002 in the embeddings guide and model page. Longer documents should be chunked before embedding.

Should I migrate existing ada-002 indexes?

Migrate only after side-by-side retrieval tests. Re-embedding can improve cost and performance, but it can also change ranking behavior.

What is the storage size of 1M ada-002 vectors?

At 1536 dimensions and float32 storage, raw vectors use about 6.144 GB for 1M rows. Real vector database storage can be higher because of metadata, indexes, replicas, and compression settings.

Sources

OpenAI Embeddings API Reference - official ada-002 1536-float response, input limits, and dimensions support scope
OpenAI Embeddings Guide - official model comparison, max input, use cases, MTEB, and pages-per-dollar data
text-embedding-ada-002 Model Page - official model status, pricing, endpoint, and rate-limit table
OpenAI Pricing - official embedding standard and Batch prices
OpenAI Batch API - official async Batch API cost and limit framing
OpenAI Rate Limits - official RPM, TPM, and usage-tier guidance
OpenAI Models - official model catalog and current embedding family positioning
OpenAI Data Controls - official data processing and regional processing context