DeepSeek V4 Flash API via TokenMix
Use DeepSeek V4 Flash from DeepSeek as a chat model through the TokenMix AI API relay and multi-model gateway.
DeepSeek V4 Flash is the efficient DeepSeek V4 model for general chat, coding, analysis, and high-throughput workloads. It supports thinking and non-thinking modes, 1M context, up to 384K output, JSON output, tool calls, chat prefix completion, and FIM completion in non-thinking mode.
API access
- Base URL:
https://api.tokenmix.ai/v1 - Model ID:
deepseek-v4-flash - OpenAI SDK compatible. Change the base URL and use your TokenMix API key.
Pricing
Input $0.1372/M tokens, output $0.2744/M tokens
Capabilities
Function calling, JSON mode, Streaming, Reasoning
Model specs
- Context: 1000K tokens
- Max output: 384K tokens
Availability
1/1 available API endpoints are healthy right now.
Recent performance
TTFT 5769ms, latency 11463ms, throughput 130.6 tok/s.
Start using this model
Create an API key, top up from $1 when needed, and call this model through the TokenMix OpenAI-compatible endpoint.