TokenMix Research Lab · 2026-03-17

Getting Started with TokenMix API in 5 Minutes
Last Updated: 2026-04-29
TokenMix exposes a drop-in, OpenAI-compatible endpoint at https://api.tokenmix.ai/v1: change one base_url, keep your existing OpenAI SDK code, and gain access to GPT-4o, Claude Sonnet 4, Gemini 2.0 Flash, DeepSeek R1, Llama 4, and 300+ more models.
Every model is served through that single API. If you have used the OpenAI SDK before, you already know how to use TokenMix. If you have not, this guide will get you making API calls in under 5 minutes.
Table of Contents
- Step 1: Get Your API Key
- Step 2: Install the SDK
- Step 3: Make Your First API Call
- Step 4: Streaming Responses
- Step 5: Switch Between Models
- Common Patterns
- Where to Go Next?
Step 1: Get Your API Key
Sign up, create a key in Dashboard > API Keys, save it once — keys are shown only at creation.
- Sign up at tokenmix.ai
- Go to Dashboard > API Keys
- Click "Create New Key"
- Copy and save your key somewhere secure. You will not be able to see it again.
Step 2: Install the SDK
Use the standard OpenAI SDK — TokenMix is fully OpenAI-compatible, no proprietary client required.
Python:
pip install openai
Node.js:
npm install openai
Step 3: Make Your First API Call
Point base_url at TokenMix and call chat.completions.create exactly as you would with OpenAI — same auth, same payload, same response shape.
Python
import openai
import sys
client = openai.OpenAI(
    base_url="https://api.tokenmix.ai/v1",
    api_key="your-tokenmix-api-key"  # Replace with your actual key
)
try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain what an API gateway is in two sentences."}
        ],
        max_tokens=200,
        temperature=0.7
    )
    print(response.choices[0].message.content)
except openai.AuthenticationError:
    print("Invalid API key. Check your key at tokenmix.ai/dashboard/keys")
    sys.exit(1)
except openai.RateLimitError:
    print("Rate limit reached. Wait a moment and try again.")
    sys.exit(1)
except openai.APIError as e:
    print(f"API error: {e.message}")
    sys.exit(1)
Node.js
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://api.tokenmix.ai/v1",
  apiKey: "your-tokenmix-api-key", // Replace with your actual key
});
async function main() {
  try {
    const response = await client.chat.completions.create({
      model: "gpt-4o",
      messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Explain what an API gateway is in two sentences." },
      ],
      max_tokens: 200,
      temperature: 0.7,
    });
    console.log(response.choices[0].message.content);
  } catch (error) {
    if (error instanceof OpenAI.AuthenticationError) {
      console.error("Invalid API key. Check your key at tokenmix.ai/dashboard/keys");
    } else if (error instanceof OpenAI.RateLimitError) {
      console.error("Rate limit reached. Wait a moment and try again.");
    } else {
      console.error("API error:", error.message);
    }
    process.exit(1);
  }
}
main();
cURL
curl https://api.tokenmix.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-tokenmix-api-key" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain what an API gateway is in two sentences."}
    ],
    "max_tokens": 200
  }'
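Since the response is described as having the same shape as an OpenAI chat completion, it can be useful to look past the reply text at the finish reason and token counts. The sketch below is illustrative only: describe_response is a hypothetical helper, and it assumes the response exposes choices[0].finish_reason and usage.total_tokens the way the OpenAI SDK objects do. A stand-in object is used so the snippet runs without a network call.

```python
from types import SimpleNamespace


def describe_response(response):
    """Pull the reply text, finish reason, and total token count from a chat completion.

    Hypothetical helper; assumes an OpenAI-style response object.
    """
    choice = response.choices[0]
    usage = getattr(response, "usage", None)  # usage may be absent on some responses
    return {
        "text": choice.message.content,
        "finish_reason": choice.finish_reason,
        "total_tokens": usage.total_tokens if usage is not None else None,
    }


# Stand-in with the same attribute layout as an SDK response (assumption, no API call).
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            message=SimpleNamespace(content="An API gateway routes requests."),
            finish_reason="stop",
        )
    ],
    usage=SimpleNamespace(prompt_tokens=20, completion_tokens=6, total_tokens=26),
)

summary = describe_response(fake_response)
print(summary["finish_reason"], summary["total_tokens"])
```

A finish_reason of "length" usually means the reply was cut off by max_tokens, which is worth checking before raising the limit.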
Step 4: Streaming Responses
Pass stream=True (stream: true in Node.js) to receive the response in chunks as it is generated instead of waiting for the full completion. Use it for chat applications or any UI that displays text as it arrives, where the time to the first visible character matters:
Python Streaming
import openai
client = openai.OpenAI(
    base_url="https://api.tokenmix.ai/v1",
    api_key="your-tokenmix-api-key"
)
try:
    stream = client.chat.completions.create(
        model="claude-sonnet-4",
        messages=[
            {"role": "user", "content": "Write a short guide on Python type hints."}
        ],
        stream=True
    )
    for chunk in stream:
        # Skip chunks that carry no choices or no text (for example, role-only chunks)
        if chunk.choices and chunk.choices[0].delta.content is not None:
            print(chunk.choices[0].delta.content, end="", flush=True)
    print()  # Final newline
except openai.APIError as e:
    print(f"\nStream error: {e.message}")
Node.js Streaming
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://api.tokenmix.ai/v1",
  apiKey: "your-tokenmix-api-key",
});
async function main() {
  const stream = await client.chat.completions.create({
    model: "claude-sonnet-4",
    messages: [
      { role: "user", content: "Write a short guide on Python type hints." },
    ],
    stream: true,
  });
  for await (const chunk of stream) {
    const content = chunk.choices[0]?.delta?.content;
    if (content) {
      process.stdout.write(content);
    }
  }
  console.log();
}
main().catch(console.error);
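Streaming code often needs the complete text afterwards as well, for logging or for appending to the conversation history. A minimal sketch of that accumulation step follows; collect_stream is a hypothetical helper, and the fake chunks only mimic the choices[0].delta.content layout used in the examples above so the snippet runs offline.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text pieces of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices at all
            continue
        piece = chunk.choices[0].delta.content
        if piece:  # skip None and empty strings
            parts.append(piece)
    return "".join(parts)


def _fake_chunk(text):
    # Stand-in for an SDK chunk object (assumption: same attribute layout).
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


demo_chunks = [
    _fake_chunk("Type "),
    _fake_chunk(None),
    SimpleNamespace(choices=[]),
    _fake_chunk("hints"),
]
full_text = collect_stream(demo_chunks)
print(full_text)  # prints: Type hints
```

In real use, the same function could be passed the stream object returned by chat.completions.create with stream=True, printing each piece inside the loop if live output is also wanted.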
Step 5: Switch Between Models
Switching models is a one-line change: every model uses the same endpoint, the same SDK, and the same API key, so you only edit the model parameter:
# Just change the model parameter
response = client.chat.completions.create(
    model="claude-sonnet-4",  # Or: gpt-4o, gemini-2.0-flash, deepseek-r1, llama-4
    messages=[{"role": "user", "content": "Hello!"}]
)
No new SDK, no new API key, no new billing account. This makes it trivial to benchmark models against each other on your own data.
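One way to run such a comparison is to send the same prompt to each model and record the reply and elapsed time. The sketch below is a rough illustration, not a benchmarking method from TokenMix: compare_models is a hypothetical helper, timings are plain wall-clock measurements, and a fake client stands in for the real one so the snippet runs without an API key.

```python
import time
from types import SimpleNamespace


def compare_models(client, models, prompt):
    """Send one prompt to several models and collect reply text and latency."""
    results = {}
    for model in models:
        start = time.perf_counter()
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        elapsed = time.perf_counter() - start
        results[model] = {
            "text": response.choices[0].message.content,
            "seconds": elapsed,
        }
    return results


class _EchoCompletions:
    """Fake completions endpoint that echoes the prompt (assumption for the demo)."""

    def create(self, model, messages):
        reply = f"[{model}] {messages[-1]['content']}"
        return SimpleNamespace(
            choices=[SimpleNamespace(message=SimpleNamespace(content=reply))]
        )


fake_client = SimpleNamespace(chat=SimpleNamespace(completions=_EchoCompletions()))
results = compare_models(fake_client, ["gpt-4o", "claude-sonnet-4"], "Hello!")
for name, result in results.items():
    print(name, "->", result["text"])
```

With a real client, pass the openai.OpenAI instance configured for https://api.tokenmix.ai/v1 in place of fake_client.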
Common Patterns
The three patterns every production codebase needs: timeouts, retry-with-backoff, and env-var-loaded keys.
Setting a Timeout
import openai
client = openai.OpenAI(
    base_url="https://api.tokenmix.ai/v1",
    api_key="your-tokenmix-api-key",
    timeout=30.0  # Default timeout in seconds for requests made with this client
)
Retry with Exponential Backoff
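import time
import openai
def call_with_retry(client, max_retries=3, **kwargs):
    """Call chat.completions.create, making at most max_retries attempts."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(**kwargs)
        except openai.RateLimitError:
            if attempt == max_retries - 1:
                raise  # Out of attempts: surface the error to the caller
            wait = 2 ** attempt  # 1s, then 2s (no sleep after the final attempt)
            time.sleep(wait)
        except (openai.APIConnectionError, openai.InternalServerError):
            # Retry only transient network/server failures, not auth or bad-request errors
            if attempt == max_retries - 1:
                raise
            time.sleep(1)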
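The retry loop sleeps only between attempts, so three attempts produce two waits. To make the schedule explicit, here is a small sketch that computes the delays up front; backoff_delays is a hypothetical helper, and the cap and jitter parameters are additions for illustration that the loop above does not use.

```python
import random


def backoff_delays(max_retries=3, base=1.0, cap=30.0, jitter=0.0, rng=random):
    """Return the delays slept between attempts.

    max_retries attempts means max_retries - 1 sleeps. Each delay doubles,
    is capped at `cap`, and gets up to `jitter` (a fraction) of random extra time.
    """
    delays = []
    for attempt in range(max_retries - 1):
        delay = min(cap, base * 2 ** attempt)
        delay += rng.uniform(0, jitter * delay)  # jitter=0.0 adds nothing
        delays.append(delay)
    return delays


print(backoff_delays(3))            # [1.0, 2.0]
print(backoff_delays(6, cap=8.0))   # [1.0, 2.0, 4.0, 8.0, 8.0]
```

A small jitter (for example jitter=0.25) spreads retries from many clients over time so they do not all hit the rate limit again at the same instant.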
Using Environment Variables (Recommended)
import os
import openai
client = openai.OpenAI(
    base_url="https://api.tokenmix.ai/v1",
    api_key=os.environ["TOKENMIX_API_KEY"]  # Set in your environment
)
# In your .env or shell profile
export TOKENMIX_API_KEY=sk-your-key-here
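Reading os.environ["TOKENMIX_API_KEY"] directly raises a bare KeyError when the variable is missing, which can be confusing in logs. A minimal sketch of a friendlier lookup follows; load_api_key is a hypothetical helper, and accepting a plain dict as the environment is only there to make it easy to try out.

```python
import os


def load_api_key(env=None, name="TOKENMIX_API_KEY"):
    """Return the API key from the environment, or fail with a clear message."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()  # tolerate stray whitespace around the value
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running this script.")
    return key


# Demo with an explicit mapping instead of the real environment.
demo_key = load_api_key({"TOKENMIX_API_KEY": " sk-your-key-here "})
print(demo_key)  # prints: sk-your-key-here
```

The returned value can then be passed as api_key when constructing the client.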
Where to Go Next?
After your first call works: explore the Models page for full pricing, monitor usage in Dashboard, and read function-calling/embeddings docs for advanced features.
- Explore available models: Visit the Models page to see all supported models with capabilities and pricing
- Read the full API docs: Check the Documentation for advanced features like function calling, embeddings, and image generation
- Monitor your usage: The Dashboard shows real-time token usage and cost breakdowns
- Add credits: Top up your account at Dashboard > Credits using Alipay, WeChat Pay, or Stripe
- Get help: If you run into issues, reach out through the support channel listed on the website
You now have everything you need to start building with any major AI model through a single API. The entire setup — from sign-up to working code — should take less than 5 minutes.