Grok 4.1 Fast Non-Reasoning API via TokenMix
Use Grok 4.1 Fast Non-Reasoning from xAI as a chat model through the TokenMix AI API relay and multi-model gateway.
Low-latency, non-reasoning variant of Grok 4.1 Fast with 2M context window. Delivers fast responses without extended thinking while maintaining frontier-level tool-calling and agentic capabilities.
API access
- Base URL:
https://api.tokenmix.ai/v1 - Model ID:
grok-4.1-fast-non-reasoning - OpenAI SDK compatible. Change the base URL and use your TokenMix API key.
Pricing
Input $0.19/M tokens, output $0.475/M tokens
Capabilities
Vision, Function calling, JSON mode, Streaming
Model specs
- Context: 2000K tokens
- Max output: 30K tokens
Availability
1/1 available API endpoints are healthy right now.
Recent performance
TTFT 1419ms, latency 4366ms, throughput 47.7 tok/s.
Start using this model
Create an API key, top up from $1 when needed, and call this model through the TokenMix OpenAI-compatible endpoint.