DeepSeek V3.1

by DeepSeek · chat

DeepSeek's first hybrid reasoning model, supporting both thinking and non-thinking modes. It uses a 671B-parameter MoE architecture with a 128K-token context window and is optimized for agent and software engineering tasks, with faster reasoning than R1.

Pricing

Input: $0.2511/M tokens · Output: .023/M tokens
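As a rough illustration of how per-million-token pricing turns into a per-request cost, here is a minimal sketch. The function name `estimate_cost` is hypothetical and not part of any DeepSeek API. Only the input price is hard-coded from the listing; the output price is left as a parameter for the caller to supply.

```python
# Input price in USD per million tokens, taken from the listing above.
INPUT_PRICE_PER_M = 0.2511


def estimate_cost(input_tokens: int, output_tokens: int,
                  output_price_per_m: float) -> float:
    """Estimate the USD cost of one request.

    output_price_per_m is supplied by the caller (USD per million
    output tokens); it is an assumption, not a value from the listing.
    """
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost
```

For example, `estimate_cost(100_000, 0, 0.0)` isolates the input side: 100K prompt tokens cost about $0.0251 at the listed input rate.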

Capabilities

Streaming

Context: 128K tokens

Max output: 8K tokens

Routes: 3/3 healthy

Performance

TTFT: 745 ms · Latency: 5142 ms · Throughput: 8.7 tok/s
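These figures can be combined into a back-of-the-envelope response-time estimate. The sketch below assumes a simple model in which total time is the time to first token plus the output length divided by the throughput. The listing does not state how its metrics are measured, so treat this as an approximation; `estimate_response_time` is a hypothetical helper name.

```python
# Figures taken from the listing; the timing model itself is an assumption.
TTFT_S = 0.745            # time to first token, in seconds
THROUGHPUT_TOK_S = 8.7    # output tokens per second


def estimate_response_time(output_tokens: int) -> float:
    """Approximate seconds to receive a full response of the given length."""
    return TTFT_S + output_tokens / THROUGHPUT_TOK_S
```

Under this model a 500-token reply would take about 58 seconds, and an output near the 8K-token maximum would take roughly 15 minutes, which suggests streaming is worth enabling for long generations.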