GPT-OSS 120B
by OpenAI · chat
OpenAI's open-weight mixture-of-experts model with roughly 117B total parameters (5.1B active per token), released under the Apache 2.0 license. It supports configurable reasoning effort (low, medium, or high) and fits on a single 80 GB GPU thanks to MXFP4 quantization of its MoE weights.
Pricing
Input: $0.1425/M tokens · Output: $0.57/M tokens
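As a rough illustration of how the per-million-token rates above translate into a per-request cost, here is a minimal sketch. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes billing is linear in token count with no minimums, caching discounts, or separate charge for reasoning tokens, none of which this listing specifies.

```python
# Rates copied from the listing above, in USD per one million tokens.
INPUT_USD_PER_M = 0.1425
OUTPUT_USD_PER_M = 0.57


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming purely linear billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 2,000-token prompt with a 500-token reply.
    print(f"${estimate_cost_usd(2_000, 500):.6f}")
```

Since output tokens cost four times as much as input tokens here, long completions (including any reasoning tokens that are billed as output) tend to dominate the total.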
Capabilities
Function Calling, JSON Mode, Streaming, Reasoning
Context: 131K tokens
Max output: 131K tokens
Routes: 2/2 healthy
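The context and max-output figures above are the same number, which suggests (though the listing does not say so) that the prompt and the completion share one window. The sketch below computes how many completion tokens remain for a given prompt under that assumption. It also assumes "131K" means 131,072 tokens; the helper name `max_completion_tokens` is hypothetical.

```python
# Assumption: "131K" in the listing means 131,072 tokens (128 * 1024).
CONTEXT_WINDOW = 131_072
MAX_OUTPUT = 131_072


def max_completion_tokens(prompt_tokens: int) -> int:
    """Return the completion budget left after the prompt.

    Assumes prompt and completion tokens share a single context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # Never exceed the advertised output cap, and never go below zero.
    return max(0, min(remaining, MAX_OUTPUT))


if __name__ == "__main__":
    print(max_completion_tokens(100_000))
```

In practice the full output cap would only be reachable with a near-empty prompt, so a request's output limit is better derived from the prompt length than set to the advertised maximum.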
Performance
TTFT: 680ms · Latency: 2335ms · Throughput: 436.3 tok/s
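To give a feel for what these figures imply, the following sketch estimates wall-clock time for a streamed response as time to first token plus generation time at the listed throughput. This is a back-of-the-envelope model only: it assumes throughput stays constant for the whole response and that the listed numbers are typical values. The listing does not define how its "Latency" figure is measured, so the sketch does not use it. The name `estimate_response_seconds` is hypothetical.

```python
# Figures copied from the listing above.
TTFT_SECONDS = 0.680          # time to first token
THROUGHPUT_TOK_PER_S = 436.3  # output tokens per second


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate total response time, assuming constant throughput."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / THROUGHPUT_TOK_PER_S


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(n, round(estimate_response_seconds(n), 2))
```

For short replies the time to first token dominates, while for long completions the throughput term takes over.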