TokenMix Research Lab · 2026-04-13

DeepSeek API Key 2026: Setup, V4 Pricing, Cache Checks Guide
Last Updated: 2026-04-30
Author: TokenMix Research Lab
Data checked: 2026-04-30
Getting a DeepSeek API key is quick. What changed is pricing and model naming: new code should target deepseek-v4-flash or deepseek-v4-pro rather than rely on R1-era or V3.2-era names and prices.
According to DeepSeek's official Models & Pricing page, V4 Flash costs $0.14 per 1M cache-miss input tokens, $0.0028 per 1M cache-hit input tokens, and $0.28 per 1M output tokens. V4 Pro is discounted to $0.435 per 1M cache-miss input tokens and $0.87 per 1M output tokens until 2026-05-31 15:59 UTC. DeepSeek also says deepseek-chat and deepseek-reasoner map to V4 Flash compatibility modes and will be deprecated.
Table of Contents
- Quick Answer
- Create A DeepSeek API Key
- Current Pricing
- First API Call
- Cache-Hit Checks
- Model Name Checks
- Billing And Spend Guards
- Direct DeepSeek vs TokenMix.ai
- FAQ
- Related Articles
- Sources
Quick Answer
| Step | What to do |
|---|---|
| 1 | Create or sign in to the DeepSeek Platform |
| 2 | Generate an API key |
| 3 | Store the key server-side as DEEPSEEK_API_KEY |
| 4 | Use https://api.deepseek.com as the OpenAI-compatible base URL |
| 5 | Start with deepseek-v4-flash |
| 6 | Track cache-hit and cache-miss tokens |
| 7 | Set daily spend and output-token limits |
If you need Alipay, WeChat Pay, multi-model fallback, or one OpenAI-compatible endpoint across providers, use TokenMix.ai instead of maintaining separate direct accounts.
Create A DeepSeek API Key
The standard direct path is simple.
| Step | Check |
|---|---|
| Create account | Use the official DeepSeek Platform |
| Add billing balance | Confirm balance and receipt |
| Generate key | Treat it as a production secret |
| Store key | Environment variable or secret manager |
| Test key | One small V4 Flash request |
| Rotate key | Rotate after leak or team changes |
Do not put the key in a browser, mobile app, GitHub repo, or client-side bundle. Route calls through your backend.
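As a minimal sketch of the "store key" and "treat it as a production secret" steps, the backend can read DEEPSEEK_API_KEY from the environment, fail fast when it is missing, and log only a shortened form. The helper names load_deepseek_key and redact are illustrative, not part of any DeepSeek SDK.

```python
import os


def load_deepseek_key(env=None):
    """Return the API key from the environment, or raise if it is missing.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("DEEPSEEK_API_KEY", "").strip()
    if not key:
        raise RuntimeError("DEEPSEEK_API_KEY is not set; configure it server-side.")
    return key


def redact(key):
    """Shorten a key for logs so the full secret never reaches log storage."""
    return key[:4] + "..." if len(key) > 8 else "***"
```

Calling load_deepseek_key() once at startup surfaces a missing key before the first request, and redact() keeps the full secret out of logs.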
Current Pricing
All prices are per 1M tokens, checked on 2026-04-30.
| Model | Cache-hit input | Cache-miss input | Output | Note |
|---|---|---|---|---|
| deepseek-v4-flash | $0.0028 | $0.14 | $0.28 | Default economical model |
| deepseek-v4-pro | $0.003625 | $0.435 | $0.87 | 75% discount until 2026-05-31 15:59 UTC |
| V4 Pro full listed price | $0.0145 | $1.74 | $3.48 | Use for post-discount planning |
| deepseek-chat | Alias to V4 Flash non-thinking | Alias | Alias | Legacy compatibility |
| deepseek-reasoner | Alias to V4 Flash thinking | Alias | Alias | Legacy compatibility |
The high-signal takeaway: cache-hit V4 Flash input is 50x cheaper than cache-miss V4 Flash input.
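The pricing table can be turned into a small estimator. This is a sketch, not an official calculator: the prices are copied from the table above as checked on 2026-04-30, the V4 Pro entry uses the discounted rates, and the names PRICES and estimate_cost are illustrative. Decimal is used to avoid floating-point rounding in money values.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing table (checked 2026-04-30).
# The deepseek-v4-pro row uses the discounted prices, which are time-limited.
PRICES = {
    "deepseek-v4-flash": {
        "hit": Decimal("0.0028"),
        "miss": Decimal("0.14"),
        "output": Decimal("0.28"),
    },
    "deepseek-v4-pro": {
        "hit": Decimal("0.003625"),
        "miss": Decimal("0.435"),
        "output": Decimal("0.87"),
    },
}
PER_MILLION = Decimal(1_000_000)


def estimate_cost(model, hit_tokens=0, miss_tokens=0, output_tokens=0):
    """Estimate the USD cost of a request from its token counts."""
    price = PRICES[model]
    total = (
        hit_tokens * price["hit"]
        + miss_tokens * price["miss"]
        + output_tokens * price["output"]
    )
    return total / PER_MILLION


# 1M cache-miss input tokens plus 100k output tokens on V4 Flash.
print(estimate_cost("deepseek-v4-flash", miss_tokens=1_000_000, output_tokens=100_000))
```

Keeping the prices in one dictionary makes it easy to swap in the full V4 Pro prices once the discount ends.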
First API Call
Python:
import os

from openai import OpenAI

# Read the key from the environment instead of hard-coding it.
client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

# Send one small request to confirm the key and model name work.
response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[
        {"role": "user", "content": "Write a one-paragraph summary of API key safety."}
    ],
)
print(response.choices[0].message.content)
cURL:
curl https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      {"role": "user", "content": "Return a JSON health check."}
    ]
  }'
Cache-Hit Checks
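DeepSeek context caching is enabled by default, but it is best-effort, so a repeated prefix is not guaranteed to hit. The usage data returned with each response reports how many prompt tokens hit or missed the cache.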
| Usage field | Meaning |
|---|---|
| prompt_cache_hit_tokens | Tokens billed at cache-hit price |
| prompt_cache_miss_tokens | Tokens billed at cache-miss price |
| Cache hit | Reused prefix was found |
| Cache miss | Full input was processed normally |
Cost example for V4 Flash:
| Input tokens | Cache state | Cost |
|---|---|---|
| 1,000,000 | Miss | $0.14 |
| 1,000,000 | Hit | $0.0028 |
| 10,000,000 | Miss | $1.40 |
| 10,000,000 | Hit | $0.028 |
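To track these fields over time, a minimal sketch can compute the cache-hit ratio from a usage object. It assumes the usage data carries prompt_cache_hit_tokens and prompt_cache_miss_tokens either as attributes or as dictionary keys; the helper names are illustrative, and missing fields are treated as zero.

```python
from types import SimpleNamespace


def _field(usage, name):
    """Read a usage field from either a dict or an attribute-style object."""
    if isinstance(usage, dict):
        value = usage.get(name)
    else:
        value = getattr(usage, name, None)
    return value or 0


def cache_hit_ratio(usage):
    """Return the share of prompt tokens served from cache (0.0 when empty)."""
    hit = _field(usage, "prompt_cache_hit_tokens")
    miss = _field(usage, "prompt_cache_miss_tokens")
    total = hit + miss
    return hit / total if total else 0.0


# Stand-in for the usage data of a real response (values are made up).
sample = SimpleNamespace(prompt_cache_hit_tokens=900, prompt_cache_miss_tokens=100)
print(cache_hit_ratio(sample))
```

A ratio that stays near zero on prompts with a long shared prefix is a signal to check whether the prefix really is identical between requests.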
Model Name Checks
| If your code uses... | Change to... | Reason |
|---|---|---|
| deepseek-chat | deepseek-v4-flash | Old alias maps to V4 Flash non-thinking |
| deepseek-reasoner | Explicit V4 thinking route | Old alias maps to V4 Flash thinking |
| V3.2 pricing constants | V4 Flash/Pro pricing constants | Official table is now V4-first |
| R1 pricing constants | Current V4 pricing constants | R1 name is legacy for API planning |
DeepSeek says the old aliases will be fully retired and inaccessible after 2026-07-24 15:59 UTC.
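One way to find and migrate old names is a small lookup applied wherever a model name enters your code. This is a sketch under assumptions: the mapping follows the table above, wants_thinking is a hypothetical internal flag rather than a DeepSeek request parameter, and how thinking mode is actually switched on should be taken from the official docs.

```python
from datetime import datetime, timezone

# Retirement time for the legacy aliases, as stated by DeepSeek.
ALIAS_RETIREMENT = datetime(2026, 7, 24, 15, 59, tzinfo=timezone.utc)

# Legacy alias -> (explicit model name, wants_thinking).
# wants_thinking is an internal hint only, not an API field.
LEGACY_ALIASES = {
    "deepseek-chat": ("deepseek-v4-flash", False),
    "deepseek-reasoner": ("deepseek-v4-flash", True),
}


def resolve_model(name):
    """Return (model, wants_thinking, was_legacy) for a configured name."""
    if name in LEGACY_ALIASES:
        model, wants_thinking = LEGACY_ALIASES[name]
        return model, wants_thinking, True
    # Non-legacy names pass through; the thinking preference is unknown.
    return name, None, False


def aliases_retired(now=None):
    """True once the legacy aliases are past their retirement time."""
    now = now or datetime.now(timezone.utc)
    return now > ALIAS_RETIREMENT


print(resolve_model("deepseek-reasoner"))
```

Logging every call where was_legacy is true gives a list of code paths to update before the retirement date.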
Billing And Spend Guards
| Guard | Suggested default |
|---|---|
| Daily spend alert | 50%, 80%, 100% of daily budget |
| Max output tokens | Set per endpoint |
| Model allowlist | Flash by default, Pro by approval |
| Retry cap | 1-2 retries with exponential backoff |
| Cache monitoring | Store hit/miss token counts |
| Balance check | Stop jobs before balance hits zero |
The cheapest key is not the safest key. The safe key is server-side, monitored, budgeted, and easy to rotate.
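Two of the guards above, spend alerts and the retry cap, can be sketched in a few lines. The thresholds and retry count mirror the suggested defaults in the table; the function names, and the idea of passing the request as a callable, are illustrative choices rather than part of any SDK.

```python
import time

# Alert at 50%, 80%, and 100% of the daily budget.
ALERT_THRESHOLDS = (0.5, 0.8, 1.0)


def crossed_alerts(spent_usd, daily_budget_usd):
    """Return the alert thresholds that current spend has reached."""
    if daily_budget_usd <= 0:
        raise ValueError("daily_budget_usd must be positive")
    fraction = spent_usd / daily_budget_usd
    return [t for t in ALERT_THRESHOLDS if fraction >= t]


def call_with_retries(fn, max_retries=2, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying up to max_retries times with exponential backoff.

    Delays are base_delay, 2 * base_delay, 4 * base_delay, and so on.
    The last exception is re-raised once the retry cap is reached.
    """
    attempt = 0
    while True:
        try:
            return fn()
        except Exception:
            if attempt >= max_retries:
                raise
            sleep(base_delay * (2 ** attempt))
            attempt += 1


print(crossed_alerts(8, 10))
```

In production you would likely retry only on transient errors such as timeouts or rate limits, since retrying every exception also retries requests that can never succeed.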
Direct DeepSeek vs TokenMix.ai
| Route | Best for | Caveat |
|---|---|---|
| Direct DeepSeek API | DeepSeek-only apps and official account control | Separate billing and no cross-provider fallback |
| TokenMix.ai | Multi-model production, local payments, fallback routing | Less direct provider control |
| OpenRouter | Model discovery across many providers | Check marketplace pricing and fees |
| Self-hosted open model | Full infra control | You own GPU, ops, scaling, and reliability |
TokenMix.ai fits when a DeepSeek API key is not enough: one endpoint, OpenAI-compatible calls, DeepSeek plus GPT/Claude/Gemini/Qwen, and payment options such as Alipay and WeChat Pay.
FAQ
How do I get a DeepSeek API key?
Create an account on the DeepSeek Platform, add balance if required, and generate an API key. Store the key in a server-side environment variable or secret manager.
What model should I use first?
Use deepseek-v4-flash first. It is the default economical model and supports 1M context, tool calls, JSON output, and thinking or non-thinking modes.
How much does DeepSeek V4 Flash cost?
V4 Flash costs $0.0028 per 1M cache-hit input tokens, $0.14 per 1M cache-miss input tokens, and $0.28 per 1M output tokens.
Is DeepSeek V4 Pro more expensive?
Yes. During the current 75% discount, V4 Pro costs $0.435 cache-miss input and $0.87 output per 1M tokens. Use it for hard tasks, not every request.
Is deepseek-reasoner deprecated?
DeepSeek says deepseek-reasoner maps to V4 Flash thinking mode for compatibility and will be deprecated. New code should use explicit V4 model names and mode controls.
Does DeepSeek support the OpenAI SDK?
Yes. Use the OpenAI SDK with base_url="https://api.deepseek.com" and a DeepSeek API key.
How do I know if cache is working?
Read the usage fields prompt_cache_hit_tokens and prompt_cache_miss_tokens. Cache hits are billed at the lower cache-hit rate.
Should I use TokenMix.ai for DeepSeek?
Use TokenMix.ai when you want DeepSeek plus other providers, fallback routing, unified billing, and local payment options. Use direct DeepSeek when you only need DeepSeek.