DeepSeek V3.1

by DeepSeek · chat

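First hybrid reasoning model, supporting both thinking and non-thinking modes, with a 671B MoE architecture and strengthened agent and engineering capabilities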

Pricing

Input: $0.2511/M tokens · Output: .023/M tokens
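As a rough illustration of how a per-million-token rate turns into a request cost, here is a minimal sketch. The helper name `token_cost` is hypothetical, and the example assumes "128K" means 128,000 tokens. Only the listed input rate is hard-coded; any other rate can be passed in as a parameter.

```python
INPUT_PRICE_PER_M = 0.2511  # USD per million input tokens, as listed above


def token_cost(tokens: int, price_per_million: float) -> float:
    """Return the USD cost of `tokens` at a per-million-token rate."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return tokens / 1_000_000 * price_per_million


# Input cost of a prompt that fills the context window,
# assuming 128K is 128,000 tokens (an assumption, not stated in the listing).
full_context_cost = token_cost(128_000, INPUT_PRICE_PER_M)
print(f"${full_context_cost:.4f}")
```

At the listed input rate, even a prompt that fills the context window costs only a few cents of input.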

Capabilities

Streaming

Context: 128K tokens

Max output: 8K tokens

Routes: 3/3 healthy

Performance

TTFT: 745ms · Latency: 5142ms · Throughput: 8.7 tok/s
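The figures above can be combined into a back-of-the-envelope estimate of how long a streamed response might take. This sketch assumes the throughput is a steady decode rate that applies after the first token arrives; the listing does not say how the metrics were measured, so treat the result as indicative only. The function name `estimated_response_seconds` is hypothetical.

```python
TTFT_S = 0.745          # time to first token, in seconds (745 ms as listed)
THROUGHPUT_TOK_S = 8.7  # tokens per second, as listed


def estimated_response_seconds(
    output_tokens: int,
    ttft_s: float = TTFT_S,
    tok_per_s: float = THROUGHPUT_TOK_S,
) -> float:
    """Estimate total response time as TTFT plus decode time.

    Assumes a constant decode rate after the first token, which is a
    simplification rather than a documented property of the service.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft_s + output_tokens / tok_per_s


# Example: a 500-token reply under the stated assumptions.
print(f"{estimated_response_seconds(500):.1f} s")
```

Under this model, response time grows linearly with output length, so long generations are dominated by throughput rather than TTFT.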