TokenMix Research Lab · 2026-06-05

Text Embedding Ada 002 Dimension 2026: 1536-D Legacy Guide

Text Embedding Ada 002 Dimension 2026: 1536-D Legacy Guide

Last Updated: 2026-06-05 Author: TokenMix Research Lab Data verified: 2026-06-05 - OpenAI embeddings guide, embeddings API reference, text-embedding-ada-002 model page, pricing page, model catalog, rate limits, Batch API, and data controls

text-embedding-ada-002 returns 1536-dimensional vectors. In 2026 it is a legacy-safe model, not the cheapest OpenAI embedding choice.

OpenAI's embeddings API reference shows text-embedding-ada-002 returning "1536 floats total" in the embedding object (OpenAI embeddings API). The same API reference says the optional dimensions parameter is only supported in text-embedding-3 and later models, so ada-002 dimensions are not adjustable through that field (Embeddings API). OpenAI's current pricing page lists text-embedding-ada-002 at $0.10 per 1M tokens and $0.05 per 1M tokens through Batch, while text-embedding-3-small is $0.02 standard and $0.01 Batch (OpenAI pricing). The embeddings guide lists ada-002 at 12,500 pages per dollar, 61.0% MTEB, and 8192 max input tokens (OpenAI embeddings guide).

Table of Contents

Quick Verdict

Claim Status Source
text-embedding-ada-002 returns 1536 floats Confirmed Embeddings API
The ada-002 vector dimension can be changed with the dimensions parameter False OpenAI says dimensions is only supported in text-embedding-3 and later
ada-002 max input is 8192 tokens Confirmed Embeddings guide, model page
ada-002 costs $0.10 per 1M tokens Confirmed OpenAI pricing
ada-002 Batch cost is $0.05 per 1M tokens Confirmed OpenAI pricing
text-embedding-3-small is cheaper than ada-002 Confirmed OpenAI pricing
OpenAI lists text-embedding-3-small at 1536 dimensions by default Confirmed Embeddings guide
ada-002 is still listed as an available older embedding model Confirmed ada-002 model page
ada-002 is the best OpenAI embedding model in 2026 False OpenAI calls text-embedding-3-small and 3-large newer and more performant
Existing ada-002 indexes should be migrated blindly False Re-embedding changes vector space and requires recall validation
Most new OpenAI embedding projects should start on text-embedding-3-small Likely It is cheaper, newer, and has the same default dimension
More teams will keep ada-002 only for legacy index compatibility Speculation No OpenAI migration mandate found

Dimension Facts

Model Default dimension Adjustable dimensions? Max input Price / 1M tokens Status
text-embedding-ada-002 1536 No documented dimensions support 8192 $0.10 Confirmed
text-embedding-3-small 1536 Yes 8192 $0.02 Confirmed
text-embedding-3-large 3072 Yes 8192 $0.13 Confirmed

The important detail is not just "1536." It is vector-space compatibility. A 1536-dimensional ada-002 vector and a 1536-dimensional text-embedding-3-small vector are not interchangeable. You cannot mix them in one index and expect distance scores to mean the same thing.

Pricing and Storage Math

Model Standard cost / 1M tokens Batch cost / 1M tokens $10 buys standard tokens $10 buys Batch tokens Status
text-embedding-3-small $0.02 $0.01 500M 1B Confirmed
text-embedding-ada-002 $0.10 $0.05 100M 200M Confirmed
text-embedding-3-large $0.13 $0.065 76.9M 153.8M Confirmed

Cost calculation 1: embedding 100M tokens costs $10 on ada-002 standard, $5 on ada-002 Batch, $2 on text-embedding-3-small standard, and $1 on text-embedding-3-small Batch. For new projects, ada-002 needs a compatibility reason to justify the 5x standard price gap against 3-small.

Storage calculation: 1M ada-002 vectors at 1536 dimensions stored as float32 use 1,000,000 x 1536 x 4 = 6.144 GB before vector database metadata, indexes, replicas, and compression. If the vector DB stores two replicas plus index overhead, real provisioned storage can be several times higher. That storage math is independent of OpenAI token price.

Rate Limits

Usage tier RPM RPD TPM Batch queue limit Status
Free 100 2,000 40,000 Not listed Confirmed
Tier 1 3,000 Not listed 1,000,000 3,000,000 Confirmed
Tier 2 5,000 Not listed 1,000,000 20,000,000 Confirmed
Tier 3 5,000 Not listed 5,000,000 100,000,000 Confirmed
Tier 4 10,000 Not listed 5,000,000 500,000,000 Confirmed
Tier 5 10,000 Not listed 10,000,000 4,000,000,000 Confirmed

OpenAI rate limits vary by account and tier, so use the dashboard as runtime truth. The model page gives a public baseline, but launch planning should still read your live account limits.

Migration Matrix

Situation Stay on ada-002? Move to 3-small? Move to 3-large? Status
Existing production index built on ada-002 Yes until migration test passes Yes after side-by-side recall eval Maybe if quality gain pays Likely
New semantic search project No strong reason Best default Use if recall needs it Likely
Storage-sensitive mobile/edge index Maybe if legacy Use adjustable dimensions Use only if quality matters more than storage Confirmed for dimensions support
Cost-sensitive batch embedding No unless compatibility Strongest cost pick More expensive but higher MTEB Confirmed
Mixed old and new chunks in same index Risky Re-embed all chunks together Re-embed all chunks together Likely

If the broader question is not ada-002 but OpenAI model cost, read OpenAI API Cost 2026. If you want a low-cost model route across providers, start with Cheapest AI API Providers 2026.

Code Examples

Python:

from openai import OpenAI

client = OpenAI()

response = client.embeddings.create(
    model="text-embedding-ada-002",
    input="TokenMix compares model cost, latency, and API access.",
    encoding_format="float",
)

vector = response.data[0].embedding
print(len(vector))  # 1536

cURL:

curl https://api.openai.com/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-ada-002",
    "input": "TokenMix compares model cost, latency, and API access.",
    "encoding_format": "float"
  }'

Do not send dimensions with ada-002. OpenAI documents that parameter for text-embedding-3 and later models.

Cost and Storage Scenarios

Scenario Token volume ada-002 standard ada-002 Batch 3-small standard Best action
Small docs corpus 10M $1.00 $0.50 $0.20 Use 3-small unless legacy
Medium SaaS help center 100M $10.00 $5.00 $2.00 Re-embed with 3-small for new index
Large RAG corpus 1B $100.00 $50.00 $20.00 Batch all embeddings
1M-vector storage 1536 float32 vectors 6.144 GB raw Same Same for 3-small default Budget DB overhead
10M-vector storage 1536 float32 vectors 61.44 GB raw Same Same for 3-small default Consider compression

Cost calculation 2: a 10M-document corpus averaging 500 tokens per document is 5B tokens. That costs $500 on ada-002 standard, $250 on ada-002 Batch, $100 on 3-small standard, and $50 on 3-small Batch. The model choice matters more than the one-time script.

Risks and Caveats

Risk What breaks Mitigation Status
Mixing embedding models Distance scores become inconsistent Re-embed the whole index per model Likely
Assuming dimension equals quality 1536-D models can perform differently Measure retrieval recall Confirmed
Sending dimensions to ada-002 Unsupported parameter risk Use text-embedding-3 for dimension control Confirmed
Ignoring storage overhead Vector DB bill exceeds raw math Include replicas, metadata, index overhead Likely
Embedding over 8192 tokens Request fails or must be chunked Chunk before embedding Confirmed
Migrating without eval Search quality regresses silently Side-by-side recall and click tests Likely

Final Recommendation

Keep ada-002 only when you need legacy vector-space compatibility. For new OpenAI embedding work in 2026, start with text-embedding-3-small, use Batch for bulk jobs, and migrate old indexes only after recall tests prove the new vectors work.

FAQ

What is the dimension of text-embedding-ada-002?

text-embedding-ada-002 returns 1536-dimensional vectors. OpenAI's API reference shows the ada-002 response as 1536 floats.

Can I reduce ada-002 dimensions?

No documented OpenAI parameter reduces ada-002 dimensions. The dimensions parameter is only supported in text-embedding-3 and later models.

Is ada-002 still available in 2026?

Yes, OpenAI still lists text-embedding-ada-002 as an older embedding model. That does not make it the best default for new projects.

How much does ada-002 cost?

OpenAI lists ada-002 at $0.10 per 1M tokens and $0.05 per 1M tokens through Batch. text-embedding-3-small is cheaper at $0.02 standard and $0.01 Batch.

Can I mix ada-002 and text-embedding-3-small vectors?

Do not mix them in one index. They can have the same 1536 length, but they are different vector spaces.

What is ada-002 max input length?

OpenAI lists 8192 max input tokens for ada-002 in the embeddings guide and model page. Longer documents should be chunked before embedding.

Should I migrate existing ada-002 indexes?

Migrate only after side-by-side retrieval tests. Re-embedding can improve cost and performance, but it can also change ranking behavior.

What is the storage size of 1M ada-002 vectors?

At 1536 dimensions and float32 storage, raw vectors use about 6.144 GB for 1M rows. Real vector database storage can be higher because of metadata, indexes, replicas, and compression settings.

Sources

Related Articles