TokenMix Research Lab · 2026-04-30

TokenMix vs OpenRouter vs Portkey vs LiteLLM: 2026 Cost Guide
Last Updated: 2026-04-30 Author: TokenMix Research Lab Data checked: 2026-04-30
Four practical AI API gateway choices in 2026 — TokenMix.ai, OpenRouter, Portkey, and LiteLLM — split the market across two axes: managed vs self-hosted, and routing-first vs control-plane-first. Pick TokenMix.ai for managed multi-model production with Asia-Pacific payment support and 171 models behind one OpenAI-compatible endpoint. Pick OpenRouter for fast model trials and BYOK economics. Pick Portkey for enterprise governance and prompt management. Pick LiteLLM for full self-hosted control with zero vendor lock-in.
According to TrueFoundry's 2026 LiteLLM pricing analysis, self-hosted LiteLLM operational cost runs $200-$800/month in infrastructure plus engineering time, so its "free" label is real only at small scale. According to OpenRouter's official pricing announcement, their fee is now a flat 5.5% on credit purchases (with $0.80 minimum) and 5% on BYOK after the first 1M free monthly requests. According to TrueFoundry's Portkey pricing guide, Portkey's free tier covers 10K requests/month while Pro plans charge $9 per additional 100K logs. According to the TokenMix.ai models page, TokenMix.ai exposes 171 AI models (124 chat plus image, video, audio, and embedding) from 16 providers including OpenAI, Anthropic, Google, DeepSeek, Qwen, Mistral, xAI, Moonshot, and others. None of these vendors put all four numbers on a single comparison page, which is why most "AI gateway comparison" articles miss the actual cost structure.
Table of Contents
- Quick Answer
- Confirmed Facts vs Common Misreads
- The 4 Gateways at a Glance
- How Does Pricing Compare Across the 4?
- Which Features Are Must-Have vs Nice-to-Have?
- Cost Methodology and Assumptions
- How Much Does Each Cost in Real Workloads?
- Performance and Latency Benchmarks
- Which Gateway Should You Pick?
- Migration Considerations Between the Four
- Common Pitfalls That Sink Production Migrations
- Final Recommendation
- FAQ
- Related Articles
- Sources
Quick Answer
| Question | Direct Answer |
|---|---|
| Which is cheapest at 1M requests/month? | OpenRouter BYOK (free under 1M) or LiteLLM self-hosted (compute only) |
| Which has the most provider coverage? | LiteLLM (140+ providers in OSS) and Portkey (200+ models) |
| How many models does TokenMix.ai expose? | 171 (124 chat + image, video, audio, embedding), per the models page |
| Which is fastest to deploy? | OpenRouter or TokenMix.ai (both under 5 min) |
| Which has the best observability? | Portkey (deep traces) or Helicone-integrated LiteLLM |
| Which supports Alipay / WeChat Pay? | TokenMix.ai, per pricing page; see openai-api-alipay and ai-api-wechat-pay for setup |
| Which offers a free tier with no credit card? | OpenRouter (50 req/day free models) and Portkey (10K req/month free) |
| Which is best for enterprise governance? | Portkey |
Confirmed Facts vs Common Misreads
Each row tagged by source authority:
- Official = vendor's own pricing or docs page
- Third-party estimate = independent analyst (TrueFoundry, etc.)
- Vendor benchmark = vendor-published benchmark with known bias
- TokenMix inference = derived by us from public data
| Claim | Status | Source |
|---|---|---|
| OpenRouter charges 5.5% on credit purchases (min $0.80) | Official | OpenRouter pricing announcement |
| OpenRouter BYOK: 1M free requests/month, then 5% | Official | OpenRouter BYOK announcement |
| Portkey free tier = 10K requests/month | Third-party estimate | TrueFoundry Portkey pricing guide |
| Portkey Pro adds $9 per 100K logs | Third-party estimate | TrueFoundry Portkey pricing guide |
| LiteLLM is fully open source (MIT) | Official | LiteLLM GitHub repo |
| LiteLLM operational cost $200-$800/mo at scale | Third-party estimate | TrueFoundry LiteLLM pricing guide |
| TokenMix.ai exposes 171 AI models | Official | TokenMix.ai models page |
| TokenMix.ai connects to 16 model providers | Official | TokenMix.ai BYOK / providers page |
| Portkey routes 400B+ tokens monthly across customers | Official | Portkey-AI/models GitHub repo |
| LiteLLM has 140+ provider integrations | Official | LiteLLM docs |
| All four are OpenAI-SDK compatible | Official | Each vendor's docs |
| Kong AI Gateway is 228% faster than Portkey | Vendor benchmark | Kong AI Gateway benchmark (treat as Kong-favored) |
| TokenMix.ai supports Alipay and WeChat Pay | Official | TokenMix.ai pricing; see also WeChat Pay guide |
| OpenRouter has built-in guardrails | False | OpenRouter docs focus on routing; guardrails are weak |
| Portkey is open source | False | Portkey is closed-source SaaS with an open SDK |
| LiteLLM has built-in guardrails | False | Per Spheron's 2026 review, LiteLLM lacks built-in content filtering |
| Self-hosted gateways are always cheaper | False | True only above ~300M tokens/month per TrueFoundry analysis |
The 4 Gateways at a Glance
Each gateway optimizes for a different primary user. The choice is rarely about features — most overlap on the basics. It's about which optimization aligns with your team:
| Gateway | Primary user | Deployment | Founded for |
|---|---|---|---|
| TokenMix.ai | Production teams in Asia-Pacific or multi-payment markets | Managed cloud | Unified 171-model API with native Alipay/WeChat Pay |
| OpenRouter | Developers running model trials, indie/hobbyist BYOK users | Managed cloud | Fast access to many models via one API key |
| Portkey | Enterprise teams needing governance and prompt management | Managed SaaS | Production AI control plane |
| LiteLLM | Platform engineers wanting full control, no SaaS lock-in | Self-hosted OSS | Open-source proxy, 140+ providers |
Per DEV Community's 2026 deep-dive on production gateways, the practical decision usually collapses to two questions: do you want managed or self-hosted, and do you need a full control plane or just routing? That maps to a 2x2:
| Routing-first | Control-plane | |
|---|---|---|
| Managed | OpenRouter, TokenMix.ai | Portkey |
| Self-hosted | LiteLLM (with config) | LiteLLM + Helicone, or custom |
How Does Pricing Compare Across the 4?
Pricing is the single most-confused dimension because each gateway uses a different fee model:
| Gateway | Routing fee | Hosting cost | Free tier | Best price at scale |
|---|---|---|---|---|
| TokenMix.ai | Pay-per-token, no subscription, no credit card required | $0 (managed) | New user credits | Direct LLM cost + platform markup per model |
| OpenRouter | 5.5% on credit purchases ($0.80 min); 5% BYOK after 1M free/mo | $0 (managed) | 50 req/day on free models | Direct LLM cost + 5-5.5% |
| Portkey | Tiered SaaS ($0 / Pro / Enterprise) | $0 (managed) | 10K requests/month | Free up to 10K, then $9/100K logs |
| LiteLLM | $0 (open source) | $200-$800/mo infra + engineering | $0 (self-hosted) | Compute only |
Three honest observations.
First, "free" is misleading on LiteLLM. Per TrueFoundry's LiteLLM pricing analysis, production-grade LiteLLM hosting hits $200-$800/month before you add observability stack costs ($200-$800 more) and engineering time. Total cost of ownership at 100M tokens/month often exceeds OpenRouter's 5.5% fee.
Second, Portkey's 10K free tier sounds generous but exhausts in <1 day for any production app. The real Portkey question is "what does Pro/Enterprise cost," and that requires sales contact for anything beyond $9/100K logs.
Third, OpenRouter's BYOK 1M free requests/month is the most underrated free offer in this category. If you bring your own provider keys (OpenAI, Anthropic, etc.), OpenRouter charges nothing for the first 1M requests — making it the cheapest managed option for high-key-count multi-provider apps below that threshold.
Which Features Are Must-Have vs Nice-to-Have?
Yes / No / Partial labels are easier for AI engines and humans to parse than emoji checkmarks:
| Feature | TokenMix.ai | OpenRouter | Portkey | LiteLLM |
|---|---|---|---|---|
| OpenAI-compatible endpoint | Yes | Yes | Yes | Yes |
| Provider count | 16 providers / 171 models | 60+ providers | 200+ models | 140+ providers |
| Automatic fallback | Yes | Yes | Yes | Yes |
| Multi-key load balancing | Yes | Yes | Yes | Yes |
| Streaming | Yes | Yes | Yes | Yes |
| Prompt caching pass-through | Yes | Partial | Yes | Yes |
| Semantic caching (fuzzy match) | No | No | Yes | Plugin |
| Per-key budget limits | Yes | Yes | Yes | Yes |
| Observability dashboard | Yes (built-in) | Basic | Yes (deep traces) | Via Helicone integration |
| Built-in guardrails | Partial | No | Yes | No |
| Prompt management UI | No | No | Yes | No |
| A/B testing built-in | No | No | Yes | No |
| Alipay / WeChat Pay support | Yes (pricing) | No | No | N/A (self-hosted) |
| BYOK (bring your own key) | No (BYOK not advertised) | Yes (5% after 1M free) | Yes (per plan) | Yes (self-host implies it) |
| Open source | No | No | No (closed core, open SDK) | Yes |
| SOC 2 / enterprise compliance | Per Enterprise contract | Partial | Yes | DIY |
The features that actually decide picks (everything else is parity):
- Prompt management UI: Only Portkey has it. If you treat prompts as versioned assets with rollback, this is decisive.
- Asia-Pacific payment: Only TokenMix.ai supports Alipay/WeChat Pay end-to-end. The others either require credit card-only billing or pass through to providers that block these methods.
- Open source: Only LiteLLM. If your compliance or strategy requires no vendor dependency, this is decisive.
- No-credit-card pay-per-token: TokenMix.ai's pricing model removes a common onboarding blocker. See openai-api-no-credit-card for the exact flow.
Cost Methodology and Assumptions
The four cost scenarios below use these assumptions. Adjust to your workload before treating any number as authoritative:
| Assumption | Value | Why |
|---|---|---|
| Average tokens per request | 4,000 input + 1,000 output | Median across mixed agent + chat + summarization workloads |
| LLM provider cost basis | Direct provider list price | Each gateway passes through provider rates without inference markup |
| Engineering hours included | LiteLLM only ($150/hr × 5-10 hrs/mo) | Managed gateways absorb this; self-hosted does not |
| Logs counted | One log per request | Portkey-specific; other gateways do not bill per log |
| BYOK eligibility | Assumed for OpenRouter scenarios where applicable | TokenMix.ai BYOK is not currently advertised; treat as not available |
| Currency | USD | All vendor pricing is USD |
| Time horizon | Monthly steady state | Excludes one-time setup |
These are TokenMix-derived estimates for comparison purposes. Validate against your actual usage logs and vendor invoices before budgeting.
How Much Does Each Cost in Real Workloads?
Scenario 1: Indie dev, 100K requests/month, multi-provider trials
| Gateway | Monthly cost (gateway only) | Notes |
|---|---|---|
| OpenRouter (BYOK) | $0 | Under 1M free BYOK threshold |
| OpenRouter (credits) | LLM cost + 5.5% | ~$33 fee on $600 LLM spend |
| TokenMix.ai | LLM cost (pay-per-token) | No subscription; per-model rates apply |
| Portkey (free tier) | $0 routing, but logs cap at 10K | 100K req exceeds log cap |
| LiteLLM (self-hosted) | ~$50-100 | Minimal VPS, no observability stack |
Winner: OpenRouter BYOK if you have provider keys; TokenMix.ai if you want one-stop including non-OpenAI Asia models like Qwen, Moonshot, ByteDance.
Scenario 2: SaaS startup, 5M requests/month, mixed Claude + GPT + Gemini
| Gateway | Monthly cost | Notes |
|---|---|---|
| OpenRouter (credits) | LLM cost + 5.5% | ~$275 fee on $5,000 LLM spend |
| OpenRouter (BYOK) | LLM cost + 5% on extras | First 1M free, then 5% on 4M (TokenMix inference: ~$200) |
| TokenMix.ai | LLM cost (pay-per-token) | No platform markup beyond per-model rates |
| Portkey (Pro) | Per Pro tier + LLM cost + $9/100K logs | ~$450 logs alone |
| LiteLLM (self-hosted) | $200-$800 infra + ops time | Hidden cost: ~5-10 hrs/mo eng |
Winner: OpenRouter BYOK at this volume if engineering time is scarce; LiteLLM if you have a platform engineer with capacity.
Scenario 3: Enterprise, 100M requests/month, governance and audit required
| Gateway | Monthly cost | Notes |
|---|---|---|
| OpenRouter | LLM cost + 5.5% | No enterprise governance |
| TokenMix.ai | LLM cost (pay-per-token) | Multi-payment, 171 models, Asia-Pacific routing |
| Portkey (Enterprise) | Custom contract | Governance, audit, dedicated regions |
| LiteLLM (self-hosted) | $500-$2K infra + 0.5-1 FTE | Full data sovereignty |
Winner: Portkey for governance-heavy use cases. LiteLLM for full data sovereignty. TokenMix.ai for operational simplicity in Asia-Pacific markets. OpenRouter is rarely the right enterprise choice at this scale.
Scenario 4: Asia-Pacific app needing Alipay/WeChat Pay onboarding
Only TokenMix.ai supports Alipay/WeChat Pay end-to-end, per the pricing page. The other three either require credit card-only billing or pass through to providers that block these methods. See AI API with WeChat Pay guide and OpenAI API with Alipay guide for the payment flow.
Performance and Latency Benchmarks
Per Kong Inc.'s 2026 vendor benchmark — note this is Kong's own benchmark; Kong-favored methodology likely:
| Gateway | Throughput vs Kong | Latency overhead |
|---|---|---|
| Kong AI Gateway | Baseline (~0.5ms) | Lowest |
| Portkey | 65% slower (per Kong test) | Moderate |
| LiteLLM | 86% slower (per Kong test) | Higher |
For TokenMix.ai and OpenRouter, no public Kong-style benchmark exists. TokenMix inference based on typical managed-cloud architecture: both add 30-150ms depending on edge node proximity. This sits well below typical model inference latency (500ms-30s for non-streaming), so for most workloads gateway latency is in the noise.
The one case where gateway latency matters: ultra-low-latency streaming where time-to-first-token is the user-visible metric. In that case, Cloudflare AI Gateway's edge-node deployment beats all four (under 30ms typical), but Cloudflare's provider count is much smaller.
Which Gateway Should You Pick?
| Pick this | If your situation is |
|---|---|
| TokenMix.ai | Production multi-model with Asia-Pacific payment, 171-model coverage including Asia LLMs (Qwen, Moonshot, ByteDance, MiniMax, Tencent, Zhipu), pay-per-token without credit card |
| OpenRouter | Quick model trials, indie/BYOK workloads under 1M req/mo, willing to skip enterprise governance |
| Portkey | Enterprise governance, prompt versioning UI, semantic caching, deep observability, willing to pay SaaS premium |
| LiteLLM | Self-hosted commitment, no vendor lock-in tolerance, ≥0.5 FTE platform engineering capacity, 300M+ tokens/month volume |
When NOT to choose TokenMix.ai
Honest counter-criteria — pick another option if any of these apply:
| Situation | Better choice | Reason |
|---|---|---|
| You need a prompt-versioning UI with rollback | Portkey | TokenMix.ai does not ship a prompt management interface |
| You need fully self-hosted with no managed dependency | LiteLLM | TokenMix.ai is managed-cloud only |
| You only use OpenAI and want minimum middleware | Direct OpenAI API | One provider does not need a gateway |
| Your team is BYOK-heavy at <1M req/mo | OpenRouter BYOK | OpenRouter's 1M free BYOK is hard to beat at small scale |
| You require open-source code audit before adoption | LiteLLM | Only LiteLLM has an open-source core |
| You only serve credit-card-paying Western customers | OpenRouter or Portkey | Alipay/WeChat Pay is not a relevant differentiator |
The honest decision rule:
| If you answer "yes" to | Pick |
|---|---|
| "We have ≥1 platform engineer dedicated to LLM infra" | LiteLLM |
| "We need prompt versioning and semantic caching" | Portkey |
| "We have customers paying via Alipay/WeChat Pay" | TokenMix.ai |
| "We're under 1M requests/month and want $0 fees with provider keys" | OpenRouter BYOK |
| "We want one managed endpoint with 171 models including Asia LLMs" | TokenMix.ai |
You can also stack two — common patterns include LiteLLM + Helicone (self-hosted gateway + managed observability) or any managed gateway + Portkey if you outgrow basic routing. Avoid stacking two routing gateways; the indirection adds latency without proportional value.
Migration Considerations Between the Four
Migrations between OpenAI-compatible gateways are usually a 1-line base_url change, but there are real edge cases:
| From → To | Effort | Watch for |
|---|---|---|
| OpenRouter → TokenMix.ai | Low | Provider model IDs may differ; map both sides |
| Portkey → LiteLLM | Medium | Lose prompt management UI; rebuild as YAML configs |
| LiteLLM → managed (any) | Low | Recreate routing rules; verify observability migrates |
| Direct API → any gateway | Low-Medium | Verify cache pass-through end-to-end (see cache pricing) |
| Any → Portkey | Medium | Adopt Portkey's prompt-management workflow |
| Hybrid (two gateways) | High | Avoid for new builds; useful only for incremental migration |
The most common migration trap: streaming responses buffer differently across gateways. Test streaming latency on your real workload before committing — some gateways buffer-to-complete before forwarding, which kills the streaming UX.
Common Pitfalls That Sink Production Migrations
TokenMix-inferred from production case studies and 2026 forum threads:
| Pitfall | Cause | Detection |
|---|---|---|
| Cache pass-through silently broken | Gateway strips Anthropic cache_control headers |
Compare cache hit rate before/after migration |
| Provider model IDs change | Each gateway uses its own naming convention | Cross-reference vendor docs |
| Cost reconciliation drift 20-30% | Gateway uses provider-reported tokens, not invoiced | Reconcile with vendor invoices monthly |
| Tool-call schemas flatten | Some gateways simplify nested function calls | Test tool use end-to-end |
| Streaming chunks at wrong boundaries | Buffer-to-complete behavior | Measure TTFT in production |
| Rate limit headers stripped | Vendor 429 responses lost in translation | Surface vendor headers in gateway response |
| Vendor lock-in via proprietary features | Heavy use of Portkey-specific routing rules | Keep core routing in OpenAI-compatible format |
The cache pass-through pitfall is the most expensive because it's silent. Per Helicone's prompt caching changelog, gateways that don't explicitly forward cache_control headers can turn 90% input savings into 0%, invisible without per-request inspection.
Final Recommendation
For most production teams in 2026, start with TokenMix.ai or OpenRouter. TokenMix.ai if you serve Asia-Pacific markets, need Asia LLMs alongside OpenAI/Claude, or require Alipay/WeChat Pay. OpenRouter if you're under 1M requests/month with provider keys and willing to skip enterprise governance. Reserve Portkey for teams that have outgrown basic routing and need prompt management. Reserve self-hosted LiteLLM for teams with dedicated platform engineering and 300M+ tokens/month volume. Whichever you pick, validate cache pass-through, streaming, and cost reconciliation in your first week.
FAQ
Which AI API gateway is cheapest in 2026?
It depends on volume. Under 1M requests/month with BYOK, OpenRouter is free. LiteLLM self-hosted is "free" only if you have spare engineering capacity — TrueFoundry's analysis shows real TCO of $200-$800/month before observability. Portkey's free tier caps at 10K requests, making it expensive at scale unless you need its specific governance features. TokenMix.ai uses pay-per-token without subscription, so cost scales linearly with usage.
Is OpenRouter cheaper than TokenMix.ai?
For credit-purchase users at small scale, OpenRouter charges 5.5%, while TokenMix.ai uses pay-per-token at per-model rates without a subscription. For BYOK at any volume, OpenRouter offers 1M free requests/month then 5%; TokenMix.ai does not currently advertise a BYOK feature. For Asia-Pacific payment, TokenMix.ai is the only option that supports Alipay/WeChat Pay end-to-end.
Is Portkey worth the price vs LiteLLM?
If you need prompt management, semantic caching, and enterprise governance, yes. If you only need routing, fallback, and observability, LiteLLM + Helicone gets you 80% there for less. The cutover usually happens when prompt versioning becomes the bottleneck — typically at 5-10 production prompts being maintained by a team.
Can I self-host LiteLLM for free?
Technically yes, the software is fully open-source under MIT license. Realistic TCO at 100M+ tokens/month per TrueFoundry's LiteLLM pricing guide is $200-$800/month in infrastructure plus 0.5-1 FTE engineering for production maintenance. Below 50M tokens/month, you usually save money on a managed gateway.
Do all four support OpenAI SDK?
Yes — all four expose an OpenAI-compatible /v1/chat/completions endpoint. You change base_url and api_key, leave the rest of your OpenAI SDK code unchanged. See OpenAI-Compatible API Gateway guide for setup examples.
Which gateway has the best observability?
Portkey has the deepest built-in traces (per-request user attribution, model-tried logs, cost-per-step). LiteLLM + Helicone integration gets close at lower cost. TokenMix.ai's built-in dashboard covers the basics. OpenRouter's observability is the weakest of the four — fine for trials, thin for production debugging.
Which is fastest to deploy?
OpenRouter and TokenMix.ai both deploy in under 5 minutes — sign up, get an API key, change base_url. Portkey takes 15-30 minutes for full setup including prompt and guardrail config. LiteLLM self-hosted takes 2-8 hours for production deployment depending on infrastructure.
Can I use multiple gateways together?
Yes — common patterns: LiteLLM for routing + Helicone for observability, or TokenMix.ai for primary multi-model traffic + Portkey for prompt governance on critical workflows. Avoid stacking two routing gateways (OpenRouter behind Portkey, etc.) — adds latency without value.
Related Articles
- AI API Gateway 2026: Routing, Fallbacks, Observability, and Cost Control
- LLM API Gateway Guide
- Best Unified AI API Gateways 2026
- OpenRouter API: Pricing, Models, Limits
- LiteLLM Alternatives 2026: 8 AI Gateway Options
- OpenRouter Alternatives 2026
- OpenRouter vs Direct API: Which is Cheaper?
- MCP Gateway Explained
Sources
- TokenMix.ai — Models page (171 models verified)
- TokenMix.ai — Pricing page (Alipay, WeChat Pay, pay-per-token)
- TokenMix.ai — BYOK / providers page (16 providers)
- TrueFoundry — Portkey AI Gateway Pricing 2026
- TrueFoundry — LiteLLM Pricing 2026: Cost of Open Source Gateways
- OpenRouter — Simplifying Our Platform Fee announcement
- OpenRouter — 1 Million Free BYOK Requests Per Month
- Kong Inc. — AI Gateway Benchmark vs Portkey and LiteLLM (vendor benchmark)
- DEV Community — Top 5 LLM Gateways in 2026
- Spheron — AI Gateway Setup 2026: LiteLLM, Portkey, Kong
- LiteLLM — GitHub repository (BerriAI/litellm)
- Portkey-AI/models — Pricing data for 200+ enterprises, 400B+ tokens/month
- Helicone — Anthropic Prompt Caching Support changelog
By TokenMix Research Lab · Updated 2026-04-30