TokenMix Research Lab · 2026-04-30

TokenMix vs OpenRouter vs Portkey vs LiteLLM: 2026 Cost Guide

Last Updated: 2026-04-30 · Author: TokenMix Research Lab · Data checked: 2026-04-30

Four practical AI API gateway choices in 2026 — TokenMix.ai, OpenRouter, Portkey, and LiteLLM — split the market across two axes: managed vs self-hosted, and routing-first vs control-plane-first. Pick TokenMix.ai for managed multi-model production with Asia-Pacific payment support and 171 models behind one OpenAI-compatible endpoint. Pick OpenRouter for fast model trials and BYOK economics. Pick Portkey for enterprise governance and prompt management. Pick LiteLLM for full self-hosted control with zero vendor lock-in.

According to TrueFoundry's 2026 LiteLLM pricing analysis, self-hosted LiteLLM operational cost runs $200-$800/month in infrastructure plus engineering time, so its "free" label is real only at small scale. According to OpenRouter's official pricing announcement, their fee is now a flat 5.5% on credit purchases (with $0.80 minimum) and 5% on BYOK after the first 1M free monthly requests. According to TrueFoundry's Portkey pricing guide, Portkey's free tier covers 10K requests/month while Pro plans charge $9 per additional 100K logs. According to the TokenMix.ai models page, TokenMix.ai exposes 171 AI models (124 chat plus image, video, audio, and embedding) from 16 providers including OpenAI, Anthropic, Google, DeepSeek, Qwen, Mistral, xAI, Moonshot, and others. None of these vendors put all four numbers on a single comparison page, which is why most "AI gateway comparison" articles miss the actual cost structure.

Quick Answer

| Question | Direct answer |
|---|---|
| Which is cheapest at 1M requests/month? | OpenRouter BYOK (free under 1M) or LiteLLM self-hosted (compute only) |
| Which has the most provider coverage? | LiteLLM (140+ providers in OSS) and Portkey (200+ models) |
| How many models does TokenMix.ai expose? | 171 (124 chat + image, video, audio, embedding), per the models page |
| Which is fastest to deploy? | OpenRouter or TokenMix.ai (both under 5 min) |
| Which has the best observability? | Portkey (deep traces) or Helicone-integrated LiteLLM |
| Which supports Alipay / WeChat Pay? | TokenMix.ai, per its pricing page; see the openai-api-alipay and ai-api-wechat-pay guides for setup |
| Which offers a free tier with no credit card? | OpenRouter (50 req/day on free models) and Portkey (10K req/month free) |
| Which is best for enterprise governance? | Portkey |
Which is best for enterprise governance? Portkey

Confirmed Facts vs Common Misreads

Each row is tagged by its source authority:

| Claim | Status | Source |
|---|---|---|
| OpenRouter charges 5.5% on credit purchases (min $0.80) | Official | OpenRouter pricing announcement |
| OpenRouter BYOK: 1M free requests/month, then 5% | Official | OpenRouter BYOK announcement |
| Portkey free tier = 10K requests/month | Third-party estimate | TrueFoundry Portkey pricing guide |
| Portkey Pro adds $9 per 100K logs | Third-party estimate | TrueFoundry Portkey pricing guide |
| LiteLLM is fully open source (MIT) | Official | LiteLLM GitHub repo |
| LiteLLM operational cost $200-$800/mo at scale | Third-party estimate | TrueFoundry LiteLLM pricing guide |
| TokenMix.ai exposes 171 AI models | Official | TokenMix.ai models page |
| TokenMix.ai connects to 16 model providers | Official | TokenMix.ai BYOK / providers page |
| Portkey routes 400B+ tokens monthly across customers | Official | Portkey-AI/models GitHub repo |
| LiteLLM has 140+ provider integrations | Official | LiteLLM docs |
| All four are OpenAI-SDK compatible | Official | Each vendor's docs |
| Kong AI Gateway is 228% faster than Portkey | Vendor benchmark | Kong AI Gateway benchmark (treat as Kong-favored) |
| TokenMix.ai supports Alipay and WeChat Pay | Official | TokenMix.ai pricing; see also the WeChat Pay guide |
| OpenRouter has built-in guardrails | False | OpenRouter docs focus on routing; guardrails are weak |
| Portkey is open source | False | Portkey is closed-source SaaS with an open SDK |
| LiteLLM has built-in guardrails | False | Spheron's 2026 review: LiteLLM lacks built-in content filtering |
| Self-hosted gateways are always cheaper | False | True only above ~300M tokens/month, per TrueFoundry analysis |

The 4 Gateways at a Glance

Each gateway optimizes for a different primary user. The choice is rarely about features — most overlap on the basics. It's about which optimization aligns with your team:

| Gateway | Primary user | Deployment | Built for |
|---|---|---|---|
| TokenMix.ai | Production teams in Asia-Pacific or multi-payment markets | Managed cloud | Unified 171-model API with native Alipay/WeChat Pay |
| OpenRouter | Developers running model trials, indie/hobbyist BYOK users | Managed cloud | Fast access to many models via one API key |
| Portkey | Enterprise teams needing governance and prompt management | Managed SaaS | Production AI control plane |
| LiteLLM | Platform engineers wanting full control, no SaaS lock-in | Self-hosted OSS | Open-source proxy, 140+ providers |

Per DEV Community's 2026 deep-dive on production gateways, the practical decision usually collapses to two questions: do you want managed or self-hosted, and do you need a full control plane or just routing? That maps to a 2x2:

| | Routing-first | Control-plane-first |
|---|---|---|
| Managed | OpenRouter, TokenMix.ai | Portkey |
| Self-hosted | LiteLLM (with config) | LiteLLM + Helicone, or custom |

How Does Pricing Compare Across the 4?

Pricing is the single most-confused dimension because each gateway uses a different fee model:

| Gateway | Routing fee | Hosting cost | Free tier | Best price at scale |
|---|---|---|---|---|
| TokenMix.ai | Pay-per-token, no subscription, no credit card required | $0 (managed) | New-user credits | Direct LLM cost + per-model platform markup |
| OpenRouter | 5.5% on credit purchases ($0.80 min); 5% BYOK after 1M free/mo | $0 (managed) | 50 req/day on free models | Direct LLM cost + 5-5.5% |
| Portkey | Tiered SaaS ($0 / Pro / Enterprise) | $0 (managed) | 10K requests/month | Free up to 10K, then $9/100K logs |
| LiteLLM | $0 (open source) | $200-$800/mo infra + engineering | $0 (self-hosted) | Compute only |

Three honest observations.

First, "free" is misleading on LiteLLM. Per TrueFoundry's LiteLLM pricing analysis, production-grade LiteLLM hosting hits $200-$800/month before you add observability stack costs ($200-$800 more) and engineering time. Total cost of ownership at 100M tokens/month often exceeds OpenRouter's 5.5% fee.

Second, Portkey's 10K free tier sounds generous but exhausts in <1 day for any production app. The real Portkey question is "what does Pro/Enterprise cost," and that requires sales contact for anything beyond $9/100K logs.

Third, OpenRouter's BYOK allowance of 1M free requests/month is the most underrated free offer in this category. If you bring your own provider keys (OpenAI, Anthropic, etc.), OpenRouter charges nothing for the first 1M requests, making it the cheapest managed option for multi-provider apps below that threshold.
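The breakeven arithmetic behind these observations is simple. A hedged sketch in Python, using the $200-$800/month infrastructure range cited above and deliberately ignoring engineering time:

```python
# Rough breakeven: monthly LLM spend at which a flat self-hosting bill
# undercuts OpenRouter's 5.5% credit fee. Engineering time is excluded,
# so the true breakeven sits higher than these figures.
def breakeven_spend(monthly_infra_usd: float, fee_rate: float = 0.055) -> float:
    """LLM spend above which a flat infra cost beats a percentage fee."""
    return monthly_infra_usd / fee_rate

lean = breakeven_spend(200)  # minimal single-node LiteLLM deployment
full = breakeven_spend(800)  # HA deployment with an observability stack
print(f"Self-hosting wins above ~${lean:,.0f}-${full:,.0f}/month in LLM spend")
```

That works out to roughly $3,600-$14,500 of monthly LLM spend before self-hosting pays off on infrastructure alone; below that, the percentage fee is cheaper even before counting engineering hours.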

Which Features Are Must-Have vs Nice-to-Have?

Yes / No / Partial labels are easier for AI engines and humans to parse than emoji checkmarks:

| Feature | TokenMix.ai | OpenRouter | Portkey | LiteLLM |
|---|---|---|---|---|
| OpenAI-compatible endpoint | Yes | Yes | Yes | Yes |
| Provider count | 16 providers / 171 models | 60+ providers | 200+ models | 140+ providers |
| Automatic fallback | Yes | Yes | Yes | Yes |
| Multi-key load balancing | Yes | Yes | Yes | Yes |
| Streaming | Yes | Yes | Yes | Yes |
| Prompt caching pass-through | Yes | Partial | Yes | Yes |
| Semantic caching (fuzzy match) | No | No | Yes | Plugin |
| Per-key budget limits | Yes | Yes | Yes | Yes |
| Observability dashboard | Yes (built-in) | Basic | Yes (deep traces) | Via Helicone integration |
| Built-in guardrails | Partial | No | Yes | No |
| Prompt management UI | No | No | Yes | No |
| A/B testing built-in | No | No | Yes | No |
| Alipay / WeChat Pay support | Yes (pricing) | No | No | N/A (self-hosted) |
| BYOK (bring your own key) | No (not advertised) | Yes (5% after 1M free) | Yes (per plan) | Yes (self-host implies it) |
| Open source | No | No | No (closed core, open SDK) | Yes |
| SOC 2 / enterprise compliance | Per Enterprise contract | Partial | Yes | DIY |

The features that actually decide picks (everything else is parity): semantic caching, built-in guardrails, prompt management UI, Alipay/WeChat Pay support, BYOK terms, and open-source availability.

Cost Methodology and Assumptions

The four cost scenarios below use these assumptions. Adjust to your workload before treating any number as authoritative:

| Assumption | Value | Why |
|---|---|---|
| Average tokens per request | 4,000 input + 1,000 output | Median across mixed agent + chat + summarization workloads |
| LLM provider cost basis | Direct provider list price | Each gateway passes through provider rates without inference markup |
| Engineering hours included | LiteLLM only ($50/hr × 5-10 hrs/mo) | Managed gateways absorb this; self-hosted does not |
| Logs counted | One log per request | Portkey-specific; other gateways do not bill per log |
| BYOK eligibility | Assumed for OpenRouter scenarios where applicable | TokenMix.ai BYOK is not currently advertised; treat as not available |
| Currency | USD | All vendor pricing is USD |
| Time horizon | Monthly steady state | Excludes one-time setup |

These are TokenMix-derived estimates for comparison purposes. Validate against your actual usage logs and vendor invoices before budgeting.
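The fee formulas behind these assumptions can be written down directly. A minimal sketch in Python, using only the rates cited in this article; treat the outputs as estimates, not vendor quotes:

```python
# Hedged fee model for the scenarios that follow, using only the rates
# cited in this article (OpenRouter 5.5% credits / 5% BYOK after 1M free,
# Portkey $9 per 100K logs past a 10K free tier). Estimates, not quotes.
def openrouter_credit_fee(llm_spend_usd: float) -> float:
    """5.5% routing fee on credit purchases."""
    return llm_spend_usd * 0.055

def openrouter_byok_fee(requests: int, llm_spend_usd: float) -> float:
    """First 1M requests/month free, then 5% on the remaining share of spend."""
    if requests <= 1_000_000:
        return 0.0
    billable_fraction = (requests - 1_000_000) / requests
    return llm_spend_usd * billable_fraction * 0.05

def portkey_log_cost(requests: int, free_tier: int = 10_000) -> float:
    """$9 per 100K logs beyond the free tier, one log per request."""
    extra_logs = max(0, requests - free_tier)
    return (extra_logs / 100_000) * 9.0

# Scenario 2 figures: 5M requests/month on ~$5,000 of LLM spend
print(openrouter_credit_fee(5_000))           # ~$275
print(openrouter_byok_fee(5_000_000, 5_000))  # ~$200
print(portkey_log_cost(5_000_000))            # ~$449 in log fees
```

Swap in your own request count and spend before drawing conclusions; the crossover points move quickly with token-per-request averages.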

How Much Does Each Cost in Real Workloads?

Scenario 1: Indie dev, 100K requests/month, multi-provider trials

| Gateway | Monthly cost (gateway only) | Notes |
|---|---|---|
| OpenRouter (BYOK) | $0 | Under the 1M free BYOK threshold |
| OpenRouter (credits) | LLM cost + 5.5% | ~$33 fee on $600 LLM spend |
| TokenMix.ai | LLM cost (pay-per-token) | No subscription; per-model rates apply |
| Portkey (free tier) | $0 routing, but logs cap at 10K | 100K req exceeds the log cap |
| LiteLLM (self-hosted) | ~$50-100 | Minimal VPS, no observability stack |

Winner: OpenRouter BYOK if you have provider keys; TokenMix.ai if you want one-stop including non-OpenAI Asia models like Qwen, Moonshot, ByteDance.

Scenario 2: SaaS startup, 5M requests/month, mixed Claude + GPT + Gemini

| Gateway | Monthly cost | Notes |
|---|---|---|
| OpenRouter (credits) | LLM cost + 5.5% | ~$275 fee on $5,000 LLM spend |
| OpenRouter (BYOK) | LLM cost + 5% on the excess | First 1M free, then 5% on the remaining 4M (TokenMix inference: ~$200) |
| TokenMix.ai | LLM cost (pay-per-token) | No platform markup beyond per-model rates |
| Portkey (Pro) | Pro tier + LLM cost + $9/100K logs | ~$450 in log fees alone |
| LiteLLM (self-hosted) | $200-$800 infra + ops time | Hidden cost: ~5-10 hrs/mo engineering |

Winner: OpenRouter BYOK at this volume if engineering time is scarce; LiteLLM if you have a platform engineer with capacity.

Scenario 3: Enterprise, 100M requests/month, governance and audit required

| Gateway | Monthly cost | Notes |
|---|---|---|
| OpenRouter | LLM cost + 5.5% | No enterprise governance |
| TokenMix.ai | LLM cost (pay-per-token) | Multi-payment, 171 models, Asia-Pacific routing |
| Portkey (Enterprise) | Custom contract | Governance, audit, dedicated regions |
| LiteLLM (self-hosted) | $500-$2K infra + 0.5-1 FTE | Full data sovereignty |

Winner: Portkey for governance-heavy use cases. LiteLLM for full data sovereignty. TokenMix.ai for operational simplicity in Asia-Pacific markets. OpenRouter is rarely the right enterprise choice at this scale.

Scenario 4: Asia-Pacific app needing Alipay/WeChat Pay onboarding

Only TokenMix.ai supports Alipay/WeChat Pay end-to-end, per the pricing page. The other three either require credit card-only billing or pass through to providers that block these methods. See AI API with WeChat Pay guide and OpenAI API with Alipay guide for the payment flow.

Performance and Latency Benchmarks

Per Kong Inc.'s 2026 vendor benchmark (note this is Kong's own test, so assume a Kong-favored methodology):

| Gateway | Throughput vs Kong | Latency overhead |
|---|---|---|
| Kong AI Gateway | Baseline (~0.5ms) | Lowest |
| Portkey | 65% slower (per Kong's test) | Moderate |
| LiteLLM | 86% slower (per Kong's test) | Higher |

For TokenMix.ai and OpenRouter, no public Kong-style benchmark exists. TokenMix's own inference, based on typical managed-cloud architecture, is that both add 30-150ms depending on edge-node proximity. That sits well below typical model inference latency (500ms-30s for non-streaming), so for most workloads gateway latency is in the noise.

The one case where gateway latency matters: ultra-low-latency streaming where time-to-first-token is the user-visible metric. In that case, Cloudflare AI Gateway's edge-node deployment beats all four (under 30ms typical), but Cloudflare's provider count is much smaller.

Which Gateway Should You Pick?

| Pick this | If your situation is |
|---|---|
| TokenMix.ai | Production multi-model with Asia-Pacific payment needs; 171-model coverage including Asia LLMs (Qwen, Moonshot, ByteDance, MiniMax, Tencent, Zhipu); pay-per-token without a credit card |
| OpenRouter | Quick model trials; indie/BYOK workloads under 1M req/mo; willing to skip enterprise governance |
| Portkey | Enterprise governance, prompt versioning UI, semantic caching, deep observability; willing to pay a SaaS premium |
| LiteLLM | Committed to self-hosting; no tolerance for vendor lock-in; ≥0.5 FTE platform engineering capacity; 300M+ tokens/month volume |

When NOT to choose TokenMix.ai

Honest counter-criteria — pick another option if any of these apply:

| Situation | Better choice | Reason |
|---|---|---|
| You need a prompt-versioning UI with rollback | Portkey | TokenMix.ai does not ship a prompt management interface |
| You need fully self-hosted with no managed dependency | LiteLLM | TokenMix.ai is managed-cloud only |
| You only use OpenAI and want minimum middleware | Direct OpenAI API | One provider does not need a gateway |
| Your team is BYOK-heavy at <1M req/mo | OpenRouter BYOK | OpenRouter's 1M free BYOK requests are hard to beat at small scale |
| You require an open-source code audit before adoption | LiteLLM | Only LiteLLM has an open-source core |
| You only serve credit-card-paying Western customers | OpenRouter or Portkey | Alipay/WeChat Pay is not a relevant differentiator |

The honest decision rule:

| If you answer "yes" to | Pick |
|---|---|
| "We have ≥1 platform engineer dedicated to LLM infra" | LiteLLM |
| "We need prompt versioning and semantic caching" | Portkey |
| "We have customers paying via Alipay/WeChat Pay" | TokenMix.ai |
| "We're under 1M requests/month and want $0 fees with provider keys" | OpenRouter BYOK |
| "We want one managed endpoint with 171 models including Asia LLMs" | TokenMix.ai |

You can also stack two — common patterns include LiteLLM + Helicone (self-hosted gateway + managed observability) or any managed gateway + Portkey if you outgrow basic routing. Avoid stacking two routing gateways; the indirection adds latency without proportional value.

Migration Considerations Between the Four

Migrations between OpenAI-compatible gateways are usually a 1-line base_url change, but there are real edge cases:

| From → To | Effort | Watch for |
|---|---|---|
| OpenRouter → TokenMix.ai | Low | Provider model IDs may differ; map both sides |
| Portkey → LiteLLM | Medium | You lose the prompt management UI; rebuild prompts as YAML configs |
| LiteLLM → managed (any) | Low | Recreate routing rules; verify observability migrates |
| Direct API → any gateway | Low-Medium | Verify cache pass-through end-to-end (see cache pricing) |
| Any → Portkey | Medium | Adopt Portkey's prompt-management workflow |
| Hybrid (two gateways) | High | Avoid for new builds; useful only for incremental migration |

The most common migration trap: streaming responses buffer differently across gateways. Test streaming latency on your real workload before committing — some gateways buffer-to-complete before forwarding, which kills the streaming UX.
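One way to catch buffer-to-complete behavior before committing is to time the first content chunk. A hedged Python sketch: the helper is generic over any iterator of text deltas, and the commented-out SDK usage assumes an OpenAI-compatible streaming endpoint:

```python
import time
from typing import Iterable, Optional

def time_to_first_token(chunks: Iterable[str]) -> Optional[float]:
    """Seconds until the first non-empty content chunk arrives.

    `chunks` is any iterator of text deltas, e.g. the content fields
    pulled off an OpenAI-compatible streaming response.
    """
    start = time.monotonic()
    for text in chunks:
        if text:
            return time.monotonic() - start
    return None  # stream produced no content at all

# With the OpenAI SDK against any OpenAI-compatible gateway, the deltas are:
#   stream = client.chat.completions.create(model=..., messages=..., stream=True)
#   ttft = time_to_first_token(
#       c.choices[0].delta.content or "" for c in stream if c.choices
#   )
```

Run it against the old and new gateway with the same prompt; if TTFT on the new gateway jumps to roughly the full completion time, that gateway is buffering to completion before forwarding.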

Common Pitfalls That Sink Production Migrations

TokenMix-inferred from production case studies and 2026 forum threads:

| Pitfall | Cause | Detection |
|---|---|---|
| Cache pass-through silently broken | Gateway strips Anthropic cache_control headers | Compare cache hit rate before/after migration |
| Provider model IDs change | Each gateway uses its own naming convention | Cross-reference vendor docs |
| Cost reconciliation drifts 20-30% | Gateway reports provider-counted tokens, not invoiced amounts | Reconcile with vendor invoices monthly |
| Tool-call schemas flatten | Some gateways simplify nested function calls | Test tool use end-to-end |
| Streaming chunks at wrong boundaries | Buffer-to-complete behavior | Measure TTFT in production |
| Rate-limit headers stripped | Vendor 429 responses lost in translation | Surface vendor headers in the gateway response |
| Vendor lock-in via proprietary features | Heavy use of Portkey-specific routing rules | Keep core routing in OpenAI-compatible format |

The cache pass-through pitfall is the most expensive because it's silent. Per Helicone's prompt caching changelog, gateways that don't explicitly forward cache_control headers can turn 90% input savings into 0%, invisible without per-request inspection.
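A quick way to detect the silent failure is to send the same cacheable prompt twice and inspect the usage block of the second response. A hedged sketch of the request shape and the check, following Anthropic's published prompt-caching format; the model name is a placeholder and the actual POST through the gateway is left as a comment:

```python
import json

# Probe for the silent cache pitfall: build an Anthropic-style request
# with a cacheable system block, send it twice through the gateway, and
# check whether the second response reports cached tokens being read.
# The model name here is a placeholder; the POST itself is omitted.
def cached_request_body(system_text: str, user_text: str, model: str) -> dict:
    """Anthropic prompt-caching format: mark the system block ephemeral."""
    return {
        "model": model,
        "max_tokens": 256,
        "system": [
            {
                "type": "text",
                "text": system_text,
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_text}],
    }

def cache_hit(usage: dict) -> bool:
    """True when the provider reports cached input tokens being read."""
    return usage.get("cache_read_input_tokens", 0) > 0

body = cached_request_body("(large shared context)", "ping", "claude-example")
assert "cache_control" in json.dumps(body)  # marker survives serialization
# If cache_hit(second_response["usage"]) is False, the gateway stripped it.
```

If the second response never reports cache reads, the gateway is dropping the cache_control field and you are paying full input price on every call.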

Final Recommendation

For most production teams in 2026, start with TokenMix.ai or OpenRouter. TokenMix.ai if you serve Asia-Pacific markets, need Asia LLMs alongside OpenAI/Claude, or require Alipay/WeChat Pay. OpenRouter if you're under 1M requests/month with provider keys and willing to skip enterprise governance. Reserve Portkey for teams that have outgrown basic routing and need prompt management. Reserve self-hosted LiteLLM for teams with dedicated platform engineering and 300M+ tokens/month volume. Whichever you pick, validate cache pass-through, streaming, and cost reconciliation in your first week.

FAQ

Which AI API gateway is cheapest in 2026?

It depends on volume. Under 1M requests/month with BYOK, OpenRouter is free. LiteLLM self-hosted is "free" only if you have spare engineering capacity — TrueFoundry's analysis shows real TCO of $200-$800/month before observability. Portkey's free tier caps at 10K requests, making it expensive at scale unless you need its specific governance features. TokenMix.ai uses pay-per-token without subscription, so cost scales linearly with usage.

Is OpenRouter cheaper than TokenMix.ai?

For credit-purchase users at small scale, OpenRouter charges 5.5%, while TokenMix.ai uses pay-per-token at per-model rates without a subscription. For BYOK at any volume, OpenRouter offers 1M free requests/month then 5%; TokenMix.ai does not currently advertise a BYOK feature. For Asia-Pacific payment, TokenMix.ai is the only option that supports Alipay/WeChat Pay end-to-end.

Is Portkey worth the price vs LiteLLM?

If you need prompt management, semantic caching, and enterprise governance, yes. If you only need routing, fallback, and observability, LiteLLM + Helicone gets you 80% there for less. The cutover usually happens when prompt versioning becomes the bottleneck — typically at 5-10 production prompts being maintained by a team.

Can I self-host LiteLLM for free?

Technically yes: the software is fully open source under the MIT license. Realistic TCO at 100M+ tokens/month, per TrueFoundry's LiteLLM pricing guide, is $200-$800/month in infrastructure plus 0.5-1 FTE of engineering for production maintenance. Below 50M tokens/month, you usually save money on a managed gateway.

Do all four support OpenAI SDK?

Yes — all four expose an OpenAI-compatible /v1/chat/completions endpoint. You change base_url and api_key, leave the rest of your OpenAI SDK code unchanged. See OpenAI-Compatible API Gateway guide for setup examples.
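To make that concrete, a minimal sketch: OpenRouter's URL is its documented public endpoint and LiteLLM's is the proxy's default local port, while the TokenMix host is a placeholder; check each vendor's docs for the real endpoint.

```python
# Switching gateways in practice: only base_url and api_key change.
# OpenRouter's URL is its documented public endpoint and LiteLLM's is
# the proxy's default local port; the TokenMix host is a placeholder.
GATEWAYS = {
    "openrouter": "https://openrouter.ai/api/v1",
    "litellm": "http://localhost:4000/v1",
    "tokenmix": "https://api.tokenmix.example/v1",  # placeholder host
}

def client_config(gateway: str, api_key: str) -> dict:
    """kwargs for openai.OpenAI(); everything else in your code stays put."""
    return {"base_url": GATEWAYS[gateway], "api_key": api_key}

# from openai import OpenAI
# client = OpenAI(**client_config("openrouter", "sk-or-..."))
# client.chat.completions.create(model="openai/gpt-4o", messages=[...])
```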

Which gateway has the best observability?

Portkey has the deepest built-in traces (per-request user attribution, model-tried logs, cost-per-step). LiteLLM + Helicone integration gets close at lower cost. TokenMix.ai's built-in dashboard covers the basics. OpenRouter's observability is the weakest of the four — fine for trials, thin for production debugging.

Which is fastest to deploy?

OpenRouter and TokenMix.ai both deploy in under 5 minutes — sign up, get an API key, change base_url. Portkey takes 15-30 minutes for full setup including prompt and guardrail config. LiteLLM self-hosted takes 2-8 hours for production deployment depending on infrastructure.

Can I use multiple gateways together?

Yes — common patterns: LiteLLM for routing + Helicone for observability, or TokenMix.ai for primary multi-model traffic + Portkey for prompt governance on critical workflows. Avoid stacking two routing gateways (OpenRouter behind Portkey, etc.) — adds latency without value.
