TokenMix Research Lab · 2026-04-13

DeepSeek API Key 2026: Setup, V4 Pricing, Cache Checks Guide

Last Updated: 2026-04-30
Author: TokenMix Research Lab
Data checked: 2026-04-30

Getting a DeepSeek API key is quick. What changed is pricing and model naming: new code should target deepseek-v4-flash or deepseek-v4-pro rather than carry over model names and price constants from R1 or V3.2.

According to DeepSeek's official Models & Pricing page, V4 Flash costs $0.14 per 1M cache-miss input tokens, $0.0028 per 1M cache-hit input tokens, and $0.28 per 1M output tokens. V4 Pro is discounted to $0.435 per 1M cache-miss input tokens and $0.87 per 1M output tokens until 2026-05-31 15:59 UTC. DeepSeek also says deepseek-chat and deepseek-reasoner now map to V4 Flash compatibility modes and will be deprecated.

Quick Answer

| Step | What to do |
| --- | --- |
| 1 | Create or sign in to the DeepSeek Platform |
| 2 | Generate an API key |
| 3 | Store the key server-side as DEEPSEEK_API_KEY |
| 4 | Use https://api.deepseek.com as the OpenAI-compatible base URL |
| 5 | Start with deepseek-v4-flash |
| 6 | Track cache-hit and cache-miss tokens |
| 7 | Set daily spend and output-token limits |

If you need Alipay, WeChat Pay, multi-model fallback, or one OpenAI-compatible endpoint across providers, use TokenMix.ai instead of maintaining separate direct accounts.

Create A DeepSeek API Key

The standard direct path is simple.

| Step | Check |
| --- | --- |
| Create account | Use the official DeepSeek Platform |
| Add billing balance | Confirm balance and receipt |
| Generate key | Treat it as a production secret |
| Store key | Environment variable or secret manager |
| Test key | One small V4 Flash request |
| Rotate key | Rotate after leak or team changes |

Do not put the key in a browser, mobile app, GitHub repo, or client-side bundle. Route calls through your backend.
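A minimal sketch of the server-side storage step, assuming the key lives in an environment variable named DEEPSEEK_API_KEY as in the Quick Answer. The helper names load_deepseek_key and mask_key are hypothetical and not part of any DeepSeek SDK.

```python
import os


def load_deepseek_key(env_var: str = "DEEPSEEK_API_KEY") -> str:
    """Read the API key from the environment and fail fast if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; configure it on the server or in a secret manager"
        )
    return key


def mask_key(key: str) -> str:
    """Hide all but the last four characters so the key can appear in logs."""
    return "*" * max(len(key) - 4, 0) + key[-4:]
```

Failing at startup rather than on the first request makes a missing or rotated key visible immediately, and masking keeps the full secret out of log files.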

Current Pricing

All prices are per 1M tokens, checked on 2026-04-30.

| Model | Cache-hit input | Cache-miss input | Output | Note |
| --- | --- | --- | --- | --- |
| deepseek-v4-flash | $0.0028 | $0.14 | $0.28 | Default economical model |
| deepseek-v4-pro | $0.003625 | $0.435 | $0.87 | 75% discount until 2026-05-31 15:59 UTC |
| V4 Pro full listed price | $0.0145 | $1.74 | $3.48 | Use for post-discount planning |
| deepseek-chat | Alias to V4 Flash non-thinking | Alias | Alias | Legacy compatibility |
| deepseek-reasoner | Alias to V4 Flash thinking | Alias | Alias | Legacy compatibility |

The high-signal takeaway: cache-hit V4 Flash input is 50x cheaper than cache-miss V4 Flash input.
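The per-request arithmetic behind these prices can be sketched as follows. The price constants are copied from the pricing table (checked 2026-04-30, with V4 Pro at its discounted rate); the PRICES dictionary and the estimate_cost function are illustrative names, not an official API.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing table (checked 2026-04-30).
# The V4 Pro row uses the discounted rate valid until 2026-05-31 15:59 UTC.
PRICES = {
    "deepseek-v4-flash": {
        "hit": Decimal("0.0028"),
        "miss": Decimal("0.14"),
        "output": Decimal("0.28"),
    },
    "deepseek-v4-pro": {
        "hit": Decimal("0.003625"),
        "miss": Decimal("0.435"),
        "output": Decimal("0.87"),
    },
}


def estimate_cost(model: str, hit_tokens: int, miss_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request."""
    price = PRICES[model]
    total = (
        price["hit"] * hit_tokens
        + price["miss"] * miss_tokens
        + price["output"] * output_tokens
    )
    return total / Decimal(1_000_000)
```

Decimal is used instead of float so that very small per-token amounts do not pick up rounding noise when summed across many requests.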

First API Call

Python:

import os

from openai import OpenAI

# Read the key from the environment; never hard-code it in source.
client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com"
)

response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[
        {"role": "user", "content": "Write a one-paragraph summary of API key safety."}
    ]
)

print(response.choices[0].message.content)

cURL:

curl https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      {"role": "user", "content": "Return a JSON health check."}
    ]
  }'

Cache-Hit Checks

DeepSeek context caching is enabled by default, but it is best-effort, so a repeated prefix is not guaranteed to hit. The response's usage fields report how many prompt tokens hit or missed the cache.

| Usage field or term | Meaning |
| --- | --- |
| prompt_cache_hit_tokens | Tokens billed at cache-hit price |
| prompt_cache_miss_tokens | Tokens billed at cache-miss price |
| Cache hit | Reused prefix was found |
| Cache miss | Full input was processed normally |
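One way to turn these fields into a metric is a hit ratio per request. The sketch below assumes the usage object has already been converted to a plain dict (with the OpenAI Python SDK, for example, via response.usage.model_dump(), assuming the SDK passes these provider-specific fields through). The function name cache_hit_ratio is hypothetical.

```python
def cache_hit_ratio(usage: dict) -> float:
    """Return the share of prompt tokens served from cache, between 0.0 and 1.0."""
    # Missing or null fields are treated as zero tokens.
    hit = usage.get("prompt_cache_hit_tokens") or 0
    miss = usage.get("prompt_cache_miss_tokens") or 0
    total = hit + miss
    return hit / total if total else 0.0
```

Logging this ratio alongside each request makes it easy to spot prompt changes that break prefix reuse, since the ratio drops while total prompt tokens stay flat.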

Cost example for V4 Flash:

| Input tokens | Cache state | Cost |
| --- | --- | --- |
| 1,000,000 | Miss | $0.14 |
| 1,000,000 | Hit | $0.0028 |
| 10,000,000 | Miss | $1.40 |
| 10,000,000 | Hit | $0.028 |
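Real traffic usually lands between the all-hit and all-miss rows. A rough way to plan for that is a blended input price at an assumed hit rate; the function below is an illustrative sketch using the V4 Flash prices from this article as defaults, and the hit rate itself is something you would measure, not a figure DeepSeek publishes.

```python
def effective_input_price(hit_rate: float, hit_price: float = 0.0028, miss_price: float = 0.14) -> float:
    """Return the blended USD price per 1M input tokens at a given cache-hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0.0 and 1.0")
    # Weighted average of the cache-hit and cache-miss prices.
    return hit_rate * hit_price + (1.0 - hit_rate) * miss_price
```

At a 50% hit rate, for instance, the blended price is the midpoint of the two rates, which is roughly half the cache-miss price because the cache-hit price is so small.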

Model Name Checks

| If your code uses... | Change to... | Reason |
| --- | --- | --- |
| deepseek-chat | deepseek-v4-flash | Old alias maps to V4 Flash non-thinking |
| deepseek-reasoner | Explicit V4 thinking route | Old alias maps to V4 Flash thinking |
| V3.2 pricing constants | V4 Flash/Pro pricing constants | Official table is now V4-first |
| R1 pricing constants | Current V4 pricing constants | R1 name is legacy for API planning |

DeepSeek says the old aliases will be fully retired and inaccessible after 2026-07-24 15:59 UTC.
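A small startup check can flag legacy aliases before the retirement date arrives. This is a sketch under the dates and mappings stated in this article; check_model_name and the constants are hypothetical names, and the check only warns because the exact thinking-mode control for V4 is not specified here.

```python
from datetime import datetime, timezone

# Retirement time for the legacy aliases, as stated in this article.
ALIAS_RETIREMENT = datetime(2026, 7, 24, 15, 59, tzinfo=timezone.utc)

# Legacy alias -> the V4 Flash mode it currently maps to.
LEGACY_ALIASES = {
    "deepseek-chat": "V4 Flash non-thinking",
    "deepseek-reasoner": "V4 Flash thinking",
}


def check_model_name(model, now=None):
    """Return a warning string for a legacy alias, or None for other names."""
    if model not in LEGACY_ALIASES:
        return None
    now = now or datetime.now(timezone.utc)
    if now > ALIAS_RETIREMENT:
        return f"{model} is retired; switch to an explicit V4 model name"
    days_left = (ALIAS_RETIREMENT - now).days
    return f"{model} maps to {LEGACY_ALIASES[model]} and retires in {days_left} day(s)"
```

Running this against every configured model name at deploy time surfaces stale aliases in logs well before requests start failing.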

Billing And Spend Guards

| Guard | Suggested default |
| --- | --- |
| Daily spend alert | 50%, 80%, 100% of daily budget |
| Max output tokens | Set per endpoint |
| Model allowlist | Flash by default, Pro by approval |
| Retry cap | 1-2 retries with exponential backoff |
| Cache monitoring | Store hit/miss token counts |
| Balance check | Stop jobs before balance hits zero |

The cheapest key is not the safest key. The safe key is server-side, monitored, budgeted, and easy to rotate.
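Two of these guards, the spend alert thresholds and the retry cap, can be sketched in a few lines. The function names, the injected sleep parameter, and the one-second base delay are assumptions for illustration; how you measure daily spend is left to your own billing data.

```python
import time


def crossed_alerts(spent: float, daily_budget: float, levels=(0.5, 0.8, 1.0)) -> list:
    """Return the alert levels (as budget fractions) that today's spend has reached."""
    return [level for level in levels if spent >= level * daily_budget]


def call_with_retries(fn, max_retries: int = 2, base_delay: float = 1.0, sleep=time.sleep):
    """Call fn, retrying up to max_retries times with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            # Sketch only: production code should catch just transient errors
            # such as timeouts or rate limits, not every exception.
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Capping retries matters for cost as much as for reliability, because every retried request is billed again for its input tokens.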

Direct DeepSeek vs TokenMix.ai

| Route | Best for | Caveat |
| --- | --- | --- |
| Direct DeepSeek API | DeepSeek-only apps and official account control | Separate billing and no cross-provider fallback |
| TokenMix.ai | Multi-model production, local payments, fallback routing | Less direct provider control |
| OpenRouter | Model discovery across many providers | Check marketplace pricing and fees |
| Self-hosted open model | Full infra control | You own GPU, ops, scaling, and reliability |

TokenMix.ai fits when a DeepSeek API key is not enough: one endpoint, OpenAI-compatible calls, DeepSeek plus GPT/Claude/Gemini/Qwen, and payment options such as Alipay and WeChat Pay.

FAQ

How do I get a DeepSeek API key?

Create an account on the DeepSeek Platform, add balance if required, and generate an API key. Store the key in a server-side environment variable or secret manager.

What model should I use first?

Use deepseek-v4-flash first. It is the default economical model and supports 1M context, tool calls, JSON output, and thinking or non-thinking modes.

How much does DeepSeek V4 Flash cost?

V4 Flash costs $0.0028 per 1M cache-hit input tokens, $0.14 per 1M cache-miss input tokens, and $0.28 per 1M output tokens.

Is DeepSeek V4 Pro more expensive?

Yes. During the current 75% discount, V4 Pro costs $0.435 per 1M cache-miss input tokens and $0.87 per 1M output tokens, roughly three times the V4 Flash rates. Use it for hard tasks, not every request.

Is deepseek-reasoner deprecated?

DeepSeek says deepseek-reasoner maps to V4 Flash thinking mode for compatibility and will be deprecated. New code should use explicit V4 model names and mode controls.

Does DeepSeek support the OpenAI SDK?

Yes. Use the OpenAI SDK with base_url="https://api.deepseek.com" and a DeepSeek API key.

How do I know if cache is working?

Read the usage fields prompt_cache_hit_tokens and prompt_cache_miss_tokens. Cache hits are billed at the lower cache-hit rate.

Should I use TokenMix.ai for DeepSeek?

Use TokenMix.ai when you want DeepSeek plus other providers, fallback routing, unified billing, and local payment options. Use direct DeepSeek when you only need DeepSeek.
