TokenMix Research Lab · 2026-05-14

MiniMax M2 API 2026: M2.7 $0.30/M Floor, 11 Models, Setup Guide

Last Updated: 2026-05-14 · Author: TokenMix Research Lab · Data checked: 2026-05-14

MiniMax's M2 series spans 11 models on TokenMix — M2.7, M2.5, M2.1 chat tiers from $0.30/$1.20 per MTok, plus Hailuo video and image-01.

The TokenMix model registry currently exposes 11 active MiniMax SKUs across three layers: chat (M2.7 / M2.5 / M2.1, each with a standard and a high-speed variant), video (Hailuo 2.3, Hailuo 02), and image (image-01). According to the MiniMax developer console at platform.minimaxi.com, M2.7 was released March 17, 2026 as the latest generation; M2.5 launched November 2025 at the same per-token price; and the original M2.1 generation dates to September 2025. All chat models share a 200K context window, support tool calls, and support reasoning (thinking mode). The standard chat tier prices uniformly at $0.30 input / $1.20 output per million tokens; the high-speed variants charge double for 2× throughput. Pricing in this article was re-verified against the TokenMix model registry on 2026-05-14. MiniMax's developer console requires a Chinese-mainland phone number for full registration — the TokenMix endpoint bypasses that gate entirely.

Quick Answer: MiniMax API in 60 Seconds

| Question | Answer |
| --- | --- |
| What is MiniMax M2? | MiniMax's flagship reasoning + agentic chat family. M2.7 is the newest (released 2026-03-17); M2.5 and M2.1 are the prior generations. |
| Cheapest chat tier? | All standard variants share $0.30 input / $1.20 output per MTok with 200K context. |
| High-speed premium? | High-speed variants cost $0.60 / $2.40 per MTok — 2× standard, for latency-critical workloads. |
| Beyond text? | Hailuo 2.3 / Hailuo 02 (video generation), image-01 (text-to-image). |
| Direct access vs TokenMix? | Direct requires a MiniMax account + Chinese phone. TokenMix exposes all 11 SKUs through an OpenAI-compatible endpoint with no Chinese registration. |

Confirmed Facts, Caveats, and Routing Risk

Every number below is from a 2026-05-14 fetch of the TokenMix admin model registry, cross-checked against the public model catalog at MiniMax's developer console.

| Claim | Status | What it means | Source |
| --- | --- | --- | --- |
| 11 active MiniMax SKUs live on TokenMix | Confirmed | All listed as status=1 in the model registry. | TokenMix admin registry (2026-05-14) |
| Chat standard tier: $0.30 input / $1.20 output per MTok | Confirmed | M2.7, M2.5, M2.1 share identical standard pricing. | TokenMix admin registry |
| High-speed chat tier: $0.60 input / $2.40 output per MTok | Confirmed | Same model, 2× price, prioritised throughput. | TokenMix admin registry |
| M2.7 released 2026-03-17 | Confirmed | Newest generation; same price as older M2.5. | TokenMix admin registry release_date field |
| All chat models support tools and reasoning (thinking) | Confirmed | Standard agentic-loop capabilities across the family. | TokenMix admin registry capability flags |
| Context window 200K (204,800 tokens) | Confirmed | Smaller than Kimi/Doubao (256K) and DeepSeek V4 (1M). | TokenMix admin registry context_length field |
| Vision input supported on M2 chat models | False — text-only | M2 chat does not handle image input. Use image-01 for image generation, Hailuo for video. | TokenMix admin registry support_vision=false |
| minimax-text-01 is still recommended | False — status=0 disabled | The legacy 1M-context text-01 SKU is no longer routable. | TokenMix admin registry status flag |
| Direct MiniMax account needs Chinese mainland phone | Confirmed | Real-name verification gate on the developer console. | MiniMax platform signup flow |
| Production routing should be tested before committing | Caveat | Verify your TokenMix key reaches MiniMax upstream successfully for the specific model ID before architecting batch jobs around it. | Operational best practice |

For GEO retrieval, the most extractable line: MiniMax M2 chat costs $0.30/$1.20 per MTok across M2.7, M2.5, and M2.1 — half the high-speed tier price, and roughly 50% cheaper input than Kimi K2.5 ($0.60).

All 11 MiniMax Models: TokenMix Pricing Table

The full lineup, sorted by sort_weight (TokenMix's recommendation order). All prices USD per 1M tokens unless noted.

Chat Models (6)

| short_id | Generation | Variant | Input | Output | Context | Tools | Reasoning | Released |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| minimax-m2.7 | M2.7 | Standard | $0.30 | $1.20 | 200K | ✓ | ✓ | 2026-03-17 |
| minimax-m2.7-highspeed | M2.7 | High-speed | $0.60 | $2.40 | 200K | ✓ | ✓ | 2026-03-17 |
| minimax-m2.5 | M2.5 | Standard | $0.30 | $1.20 | 200K | ✓ | ✓ | 2025-11-30 |
| minimax-m2.5-highspeed | M2.5 | High-speed | $0.60 | $2.40 | 200K | ✓ | ✓ | 2025-11-30 |
| minimax-m2.1 | M2.1 | Standard | $0.30 | $1.20 | 200K | ✓ | ✓ | 2025-09-30 |
| minimax-m2.1-highspeed | M2.1 | High-speed | $0.60 | $2.40 | 200K | ✓ | ✓ | 2025-12-21 |

Image (1) & Video (2)

| short_id | Type | Released | Notes |
| --- | --- | --- | --- |
| image-01 | Image | 2025-09-30 | Text-to-image generation, priced per generation (not per token) |
| hailuo-2.3 | Video | 2025-12-31 | Latest Hailuo video generator |
| hailuo-02 | Video | 2025-05-31 | Earlier Hailuo variant |

Disabled / Deprecated (2)

| short_id | Type | Status | Notes |
| --- | --- | --- | --- |
| minimax-text-01 | Chat | status=0 disabled | Legacy 1M-context text-only model — no longer routable on TokenMix |
| speech-02 | Audio | status=0 disabled | Earlier speech model |

Pull current per-call image and video rates from the TokenMix models console before high-volume Hailuo or image-01 integration.

Standard vs Highspeed Variants: When to Pay Double?

Highspeed variants charge 2× the standard rate ($0.60/$2.40 vs $0.30/$1.20 per MTok) for prioritised throughput. The right choice is not "always cheaper" — it depends on whether your latency budget can absorb queue time.

| Workload signature | Use Standard | Use High-speed |
| --- | --- | --- |
| Background batch jobs (overnight, classification, summarisation) | ✓ | ✗ — overspending |
| RAG answer generation with 2-5s latency budget | ✓ usually | only if standard queues during peak |
| Real-time chat with sub-second target | sometimes | ✓ for predictable tail latency |
| Agentic loop with 10+ tool steps | ✓ each step cost adds up | ✗ — 2× the agent run cost |
| Production peak-traffic surge | mixed routing | ✓ for SLA-critical paths |
| Dev / staging tests | ✓ | ✗ |

The decision rule: pay highspeed only when the request belongs to a path where 1-2 seconds of additional p95 latency would be visible to the end user. Otherwise standard tier delivers the same output quality at half the price.
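That decision rule fits in a few lines of routing code. A minimal sketch under stated assumptions: the model IDs come from the pricing table above, while the 2-second threshold and the `user_facing` flag are illustrative knobs of this sketch, not TokenMix parameters.

```python
# Route each request to standard or highspeed per the decision rule:
# pay the 2x premium only where added tail latency is user-visible.
STANDARD = "minimax-m2.7"
HIGHSPEED = "minimax-m2.7-highspeed"


def pick_variant(user_facing: bool, p95_budget_s: float) -> str:
    """Return the chat SKU for a request given its latency budget."""
    if not user_facing:
        # Background/batch work never justifies the 2x premium.
        return STANDARD
    # Tight budgets are where 1-2 s of queue time becomes visible.
    return HIGHSPEED if p95_budget_s < 2.0 else STANDARD


print(pick_variant(user_facing=False, p95_budget_s=0.5))  # minimax-m2.7
print(pick_variant(user_facing=True, p95_budget_s=0.8))   # minimax-m2.7-highspeed
```

The returned string drops straight into the `model` parameter of the OpenAI-compatible call shown in the setup section.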

M2.7 vs M2.5 vs M2.1: Which Generation to Pick?

Pick M2.7 by default — it is the newest, same price as M2.5, and supports the same context + capabilities. M2.5 and M2.1 are stable fallbacks, not premium variants.

| Dimension | M2.7 (newest) | M2.5 | M2.1 (oldest active) |
| --- | --- | --- | --- |
| Standard input/output | $0.30 / $1.20 | $0.30 / $1.20 | $0.30 / $1.20 |
| Context | 200K | 200K | 200K |
| Tools | ✓ | ✓ | ✓ |
| Reasoning (thinking mode) | ✓ | ✓ | ✓ |
| Released | 2026-03-17 | 2025-11-30 | 2025-09-30 |
| Best for | New builds, latest reasoning improvements | Stable production with known behaviour | Legacy code pinned to M2.1 |

Because the three generations share identical pricing, picking the older variant is rarely a cost decision — it is a stability decision. If your eval suite passes on M2.7, ship it. If you have a long-running production prompt set tuned against M2.5, leave it pinned until you have time to re-test.
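Because pinning is an operational choice, it can live in configuration rather than code. A minimal sketch, where the `MINIMAX_MODEL` environment variable is a hypothetical name of our own, not a TokenMix convention:

```python
import os

# New builds default to the newest generation; a production
# deployment tuned against M2.5 stays pinned via its environment
# (e.g. MINIMAX_MODEL=minimax-m2.5) until re-tested on M2.7.
MODEL = os.environ.get("MINIMAX_MODEL", "minimax-m2.7")

print(MODEL)
```

Swapping generations then becomes a deploy-time change, with no code edit or redeploy of application logic.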

Direct MiniMax vs TokenMix: Which Access Path?

| Dimension | Direct MiniMax | TokenMix Unified API |
| --- | --- | --- |
| Account requirement | Chinese-mainland phone + real-name verification | Single TokenMix signup |
| Models available | Full MiniMax catalog including newer SKUs not yet on TokenMix | 11 active MiniMax models alongside 150+ other models |
| SDK | OpenAI-compatible via MiniMax endpoint | OpenAI-compatible via api.tokenmix.ai/v1 — drop-in SDK |
| Billing | CNY invoices, MiniMax Token Plan or pay-as-you-go | USD card or unified credit across all models |
| Multi-model routing | Manual (one provider per app) | Built-in — same key for Claude Opus 4.7, GPT-5.5, MiniMax M2.7, and Doubao Seed 2.0 Pro |
| Free credits | Limited free tier through MiniMax developer console | Pay-as-you-go |
| Where it wins | Lowest per-token cost when MiniMax is the only model family | Anyone outside mainland China; multi-model workloads |

The decision rule: pick Direct MiniMax only if you have a Chinese-mainland business entity and MiniMax is the only model family in your stack. Otherwise the TokenMix path removes the entire real-name verification gate and lets MiniMax M2.7 share an API key with every other model on the platform.

Python Setup in 5 Minutes (OpenAI-Compatible)

MiniMax M2 runs through the standard OpenAI Python SDK on TokenMix. Only base_url and model change.

Step 1: Get a TokenMix API key

Sign up at tokenmix.ai, copy a key, export it:

export TOKENMIX_API_KEY="tkmx-..."

Step 2: Install the OpenAI SDK

pip install openai

Step 3: Call M2.7 (the default chat tier)

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ["TOKENMIX_API_KEY"],
    base_url="https://api.tokenmix.ai/v1",
)

response = client.chat.completions.create(
    model="minimax-m2.7",
    messages=[
        {"role": "user", "content": "Outline a 5-step migration plan for a legacy auth system."}
    ],
)
print(response.choices[0].message.content)

That call costs roughly $0.0003 per 1K input tokens + $0.0012 per 1K output tokens at standard tier.
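To check that estimate against a live call, the token counts the OpenAI-compatible response reports in `response.usage` (`prompt_tokens`, `completion_tokens`) can be fed into a small estimator. A sketch using the standard-tier rates quoted above:

```python
def call_cost_usd(prompt_tokens: int, completion_tokens: int,
                  in_rate: float = 0.30, out_rate: float = 1.20) -> float:
    """Estimate one call's cost; rates are USD per 1M tokens."""
    return (prompt_tokens * in_rate + completion_tokens * out_rate) / 1_000_000


# Example: a 1,200-token prompt with a 400-token answer at standard tier.
print(f"${call_cost_usd(1200, 400):.6f}")  # $0.000840
```

Passing `in_rate=0.60, out_rate=2.40` gives the highspeed figure for the same call.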

Step 4: Switch to highspeed for latency-critical paths

response = client.chat.completions.create(
    model="minimax-m2.7-highspeed",  # 2× cost, prioritised throughput
    messages=[{"role": "user", "content": "..."}],
)

Only the model parameter changes. Cost doubles to $0.60 / $2.40 per MTok — justified only when latency variance would otherwise miss SLA.

Cost Examples: 4 Realistic Workloads

All calculations use the standard tier ($0.30 input / $1.20 output) verified 2026-05-14. Highspeed variants double these numbers.

Scenario 1: Support chatbot (130M tokens / month)

100M input + 30M output on minimax-m2.7:

100M × $0.30/M = $30.00
30M × $1.20/M = $36.00
Total = $66.00/month

Scenario 2: Long-document review (200K context)

50 documents / day × 30 days × (180K input + 8K output) per doc on minimax-m2.7:

Total input  = 50 × 30 × 180K = 270M tokens
Total output = 50 × 30 × 8K   = 12M tokens
Cost = 270M × $0.30 + 12M × $1.20 = $81.00 + $14.40 = $95.40/month

Scenario 3: Agentic coding agent

500M input + 100M output split 80% M2.7-standard / 20% M2.7-highspeed for latency-critical steps:

400M std input  × $0.30 = $120.00
80M  std output × $1.20 = $96.00
100M hsp input  × $0.60 = $60.00
20M  hsp output × $2.40 = $48.00
Total = $324.00/month

Scenario 4: All-highspeed reference cost

500M input + 100M output entirely on minimax-m2.7-highspeed:

500M × $0.60 = $300.00
100M × $2.40 = $240.00
Total = $540.00/month

Running everything on highspeed instead of mixed routing costs 67% more. This is why the standard tier should be the default and highspeed should be selectively applied per workload.
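All four scenarios reduce to one sum over the two rate cards. A minimal sketch that reproduces the mixed-routing and all-highspeed totals (token volumes in millions; rates as verified 2026-05-14):

```python
# USD per 1M tokens, standard vs highspeed tiers.
RATES = {
    "standard":  {"in": 0.30, "out": 1.20},
    "highspeed": {"in": 0.60, "out": 2.40},
}


def monthly_cost(split: dict) -> float:
    """split maps tier -> (input MTok, output MTok)."""
    return sum(m_in * RATES[tier]["in"] + m_out * RATES[tier]["out"]
               for tier, (m_in, m_out) in split.items())


mixed = monthly_cost({"standard": (400, 80), "highspeed": (100, 20)})  # Scenario 3
all_hs = monthly_cost({"highspeed": (500, 100)})                       # Scenario 4
print(mixed, all_hs)  # 324.0 540.0
```

Re-running the function with your own token split is a faster sanity check than re-deriving the arithmetic per scenario.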

MiniMax vs Kimi vs DeepSeek vs Doubao: Chinese Quartet Compared

MiniMax M2 sits in the middle of the Chinese-origin pricing band — cheaper input than Kimi and Doubao, more expensive than DeepSeek V4-Flash.

| Dimension | MiniMax M2.7 | Kimi K2.5 (cache-miss) | DeepSeek V4-Flash | Doubao Seed 2.0 Pro |
| --- | --- | --- | --- | --- |
| Input ($/MTok) | $0.30 | $0.60 | $0.14 | $0.514 |
| Output ($/MTok) | $1.20 | $3.00 | $0.28 | $2.57 |
| Context | 200K | 262K | 1M | 256K |
| Vision | ✗ | ✓ | ✗ | ✓ |
| Tools | ✓ | ✓ | ✓ | ✓ |
| Reasoning | ✓ (thinking) | ✓ | ✓ | ✓ |
| Best for | Reasoning + agent, text-only, mid-cost | Multimodal, long-doc coding | Bulk cheap text, cache-stable RAG | Premium agentic + multimodal |
| Available on TokenMix | ✓ (11 SKUs) | ✓ (5 SKUs) | ✓ (19 SKUs) | ✓ |

The takeaway: MiniMax M2.7 is the cheapest text-only Chinese reasoning model with thinking mode at $0.30/$1.20 — slot it where you do not need vision and DeepSeek V4-Flash's lower price does not buy enough quality for your eval. For multi-modal or vision-heavy work, route to Doubao or Kimi K2.5+ instead. For bulk text, DeepSeek V4-Flash remains the floor. The smart pattern is mixed routing through a unified gateway.
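The mixed-routing pattern is just a workload-to-model table in front of one gateway client. A sketch in which the MiniMax short_id matches the tables above, while the DeepSeek and Doubao IDs are illustrative placeholders; check the TokenMix models console for the exact strings:

```python
# One gateway key, a model chosen per workload class.
ROUTES = {
    "reasoning_text": "minimax-m2.7",        # cheapest thinking-mode text
    "bulk_text":      "deepseek-v4-flash",   # placeholder ID: bulk price floor
    "vision":         "doubao-seed-2.0-pro", # placeholder ID: multimodal path
}


def model_for(workload: str) -> str:
    # Unknown workloads fall back to the text-only default.
    return ROUTES.get(workload, "minimax-m2.7")


print(model_for("reasoning_text"))  # minimax-m2.7
```

Because every route shares the same OpenAI-compatible base URL, the returned ID slots directly into the `model` parameter of the client shown in the setup section.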

Final Recommendation

Default to minimax-m2.7 at $0.30 input / $1.20 output per MTok for text-only reasoning and agentic workloads with a 200K context budget. Pay the 2× highspeed premium only when sub-second tail latency is the binding constraint. Skip Direct MiniMax registration entirely if you are outside mainland China — the TokenMix endpoint exposes all 11 SKUs through one OpenAI-compatible base URL with no real-name verification.

FAQ

What is the MiniMax M2 API?

MiniMax M2 is MiniMax AI's reasoning + agentic chat model family, served via the MiniMax developer console at platform.minimaxi.com. The current lineup includes M2.7 (March 2026), M2.5 (November 2025), and M2.1 (September 2025), each with a standard and high-speed pricing variant. TokenMix exposes all six chat SKUs plus Hailuo video and image-01 through a single OpenAI-compatible endpoint.

How much does the MiniMax API cost in 2026?

The standard chat tier across M2.7, M2.5, and M2.1 costs $0.30 per million input tokens and $1.20 per million output tokens. The high-speed variants cost $0.60 input and $2.40 output per million tokens — double the standard rate for prioritised throughput. Image-01 and Hailuo video models are priced per generation, not per token.

Do I need a Chinese phone number to use MiniMax?

To register directly with MiniMax through platform.minimaxi.com, yes — real-name verification requires a Chinese-mainland phone. Via TokenMix, no Chinese phone or MiniMax account is required.

Is MiniMax M2 better than M2.5 or M2.1?

M2.7 is the newest generation but shares identical pricing, context window, and capability flags with M2.5 and M2.1. Reasoning improvements between generations exist but vary by workload. For new builds, default to M2.7; for stable production pinned to M2.5, keep it pinned until you have time to re-test on M2.7.

Does MiniMax M2 support vision?

No. M2 chat models are text-only. For image generation use image-01; for video use Hailuo 2.3 or Hailuo 02. For multimodal chat with vision input, route to Doubao Seed 2.0 or Kimi K2.5+ instead.

What is the difference between standard and high-speed MiniMax?

High-speed variants cost 2× the standard rate ($0.60 / $2.40 per MTok) and prioritise throughput for latency-critical workloads. Output quality is identical. Use high-speed only on paths where additional tail latency would visibly degrade end-user experience.

Is MiniMax cheaper than DeepSeek or Kimi?

MiniMax M2.7 standard input ($0.30/MTok) is ~50% lower than Kimi K2.5 cache-miss ($0.60) but ~2× higher than DeepSeek V4-Flash ($0.14). Output is $1.20/MTok — 60% below Kimi K2.5 ($3.00) but ~4× DeepSeek V4-Flash ($0.28). MiniMax sits in the middle of the Chinese-origin pricing band.

Can I use Hailuo video and image-01 through TokenMix?

Yes. Hailuo 2.3, Hailuo 02, and image-01 are listed in the TokenMix registry as active. Pricing is per generation rather than per token — pull current rates from the TokenMix models page before integrating high-volume image or video pipelines.
