DeepSeek V4 Flash API via TokenMix

Use DeepSeek V4 Flash from DeepSeek as a chat model through the TokenMix AI API relay and multi-model gateway.

DeepSeek V4 Flash is the efficient DeepSeek V4 model for general chat, coding, analysis, and high-throughput workloads. It supports thinking and non-thinking modes, 1M context, up to 384K output, JSON output, tool calls, chat prefix completion, and FIM completion in non-thinking mode.

API access

Base URL: https://api.tokenmix.ai/v1
Model ID: deepseek-v4-flash
OpenAI SDK compatible. Change the base URL and use your TokenMix API key.

Pricing

Input $0.132353/M tokens, output $0.264706/M tokens

Capabilities

Function calling, JSON mode, Streaming, Reasoning

Model specs

Context: 1000K tokens
Max output: 384K tokens

Availability

3/3 available API endpoints are healthy right now.

Recent performance

TTFT 1363ms, latency 5938ms, throughput 97.3 tok/s.

Start using this model

Create an API key, top up from $1 when needed, and call this model through the TokenMix OpenAI-compatible endpoint.

Create API key · View pricing · Quickstart