TokenMix Research Lab · 2026-04-23

Llama 4 Behemoth Release Date: Still Training 1 Year After Meta's "In Progress" Claim (2026)

It has been one year and 18 days since Meta unveiled Llama 4 Behemoth at the Scout/Maverick launch on April 5, 2025 — a 2-trillion-parameter MoE positioned as the "teacher model" that would codistill frontier capabilities into Meta's open-weight lineup. As of April 23, 2026, Behemoth remains unreleased, still officially "in training," and the competitive landscape it was built to beat has already moved past it. Gemini 2.5 Pro, Claude Opus 4.7, GPT-5.4, and Kimi K2.6 all shipped while Behemoth stayed silent. This is the status update: what Meta has confirmed, what's still speculation, why the delay matters for open-weight AI, and three plausible scenarios for what happens next. TokenMix.ai tracks 300+ models in real time — Behemoth's row has been "Not Released" for 12 consecutive months.

Confirmed vs Speculation

| Claim | Status |
| --- | --- |
| Behemoth announced April 5, 2025 | Confirmed (Meta blog) |
| ~2 trillion total parameters | Confirmed (Meta) |
| ~288 billion active parameters per token | Confirmed |
| 16-expert MoE architecture | Confirmed |
| Trained on 30T+ tokens | Confirmed |
| Natively multimodal (text + image + video) | Confirmed |
| Beats GPT-4.5, Claude Sonnet 3.7, Gemini 2.0 Pro on MATH-500 / GPQA Diamond | Confirmed (Meta self-reported preview) |
| Outperformed by Gemini 2.5 Pro | Confirmed |
| Serves as teacher model for Scout/Maverick via codistillation | Confirmed |
| Still not released 12+ months after announcement | Confirmed |
| Public release imminent | No official confirmation |
| Will launch at a LlamaCon 2026 event | No — Meta has not announced LlamaCon 2026 |
| Delay is due to safety review | Speculation |
| Behemoth may be quietly shelved in favor of a Llama 5 successor | Speculation |

Timeline: What Meta Said and When

| Date | Event |
| --- | --- |
| 2025-02-18 | Meta announces LlamaCon for April 29, 2025 (TechCrunch) |
| 2025-04-05 | Llama 4 Scout + Maverick release; Behemoth previewed, "still training" |
| 2025-04-29 | LlamaCon 2025 — Behemoth mentioned but no release, no date |
| 2025-Q3 | Third-party benchmarks show Gemini 2.5 Pro exceeding Behemoth's preview numbers |
| 2025-Q4 | Meta internal reports (per Interconnects) suggest training hiccups |
| 2026-Q1 | Claude Opus 4.7 ships (87.6% SWE-Bench); GPT-5.4 ships — both exceed Behemoth's preview metrics |
| 2026-04-05 | 12-month mark; Behemoth still not released |
| 2026-04-20 | Kimi K2.6 ships — also exceeds Behemoth's preview metrics |
| 2026-04-23 | Today — no updated release date, no LlamaCon 2026 confirmed |

That's 383 days of "still training" — a duration that, by 2026's pace, is three or four frontier-model release cycles.

Behemoth Specs We Know

From Meta's April 2025 preview announcement:

| Spec | Value |
| --- | --- |
| Total parameters | ~2 trillion |
| Active parameters per token | ~288 billion |
| Expert count | 16 |
| Training tokens | 30+ trillion |
| Modalities | Text, image, video (native, early-fusion) |
| Intended role | Teacher model for codistillation into Scout + Maverick |
| Target strength | STEM reasoning (MATH-500, GPQA Diamond) |
| Disclosed weaknesses | Underperforms Gemini 2.5 Pro |
| License | Expected to match the Llama 4 community license (700M-MAU restriction) |
| Release form | Weights expected on HuggingFace + llama.com, per Meta's open-weight pattern |

Source: Meta Llama 4 blog / RDWorld on 2T preview

The 288B active figure is notable — it's 9× larger than Kimi K2.6's 32B active and 26× larger than Step 3.5 Flash's 11B active. Inference would require premium data-center silicon (8× H200 or B200-class). This alone makes Behemoth a very different category of release: not something most teams could self-host.
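Those hardware numbers are easy to sanity-check. A quick sketch, using only Meta's published parameter counts and standard weight precisions (the GPU memory figure is the H200's public 141 GB HBM spec; everything else is plain arithmetic):

```python
# Back-of-envelope weight-memory math for a 2T-parameter MoE. All experts
# must be resident in memory even though only ~288B parameters are active
# per token, so total (not active) params drive the hardware bill.

def weight_memory_gb(params: float, bits_per_param: float) -> float:
    """GB (1e9 bytes) needed just to hold the weights, ignoring KV cache."""
    return params * bits_per_param / 8 / 1e9

TOTAL_PARAMS = 2e12  # ~2T total, per Meta's preview

print(weight_memory_gb(TOTAL_PARAMS, 16))  # bf16 -> 4000.0 GB
print(weight_memory_gb(TOTAL_PARAMS, 8))   # fp8  -> 2000.0 GB
print(weight_memory_gb(TOTAL_PARAMS, 4))   # int4 -> 1000.0 GB

# An 8x H200 node offers 8 * 141 = 1128 GB of HBM, so even single-node
# serving needs roughly 4-bit weights; bf16 weights require multiple nodes.
print(weight_memory_gb(TOTAL_PARAMS, 4) <= 8 * 141)  # True
```

In other words, the weights alone are a multi-node problem at full precision, which is why self-hosting is out of reach for most teams.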

Why Behemoth Matters: Teacher Model Economics

Meta's pitch in April 2025 was that Behemoth is not primarily a product — it's infrastructure. By running codistillation from a 2T teacher into the 109B Scout and 400B Maverick, you get open-weight models that punch above their parameter class.

If Behemoth works as a teacher:

  - Scout, Maverick, and their successors keep receiving codistillation upgrades, so Meta's open-weight models continue to punch above their parameter class.
  - Meta keeps a differentiated open-weight story: a frontier-scale teacher it controls end to end.

If Behemoth is delayed indefinitely:

  - Scout and Maverick's quality ceiling, which is partly set by Behemoth distillation, stalls at the current teacher checkpoint.
  - Meta's open-weight roadmap loses its stated engine while faster-cycling labs ship around it.
That's why 383 days of silence is louder than "we pushed a new coding model."
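Meta has not published Behemoth's exact codistillation loss. A minimal pure-Python sketch of standard logit distillation, the generic technique codistillation builds on (the temperature, weighting, and single-token framing here are illustrative, not Meta's recipe):

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax over a single token's logits."""
    exps = [math.exp(l / T) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def distill_loss(student_logits, teacher_logits, label, T=2.0, alpha=0.5):
    """Blend of soft-target KL (teacher -> student) and hard-label CE."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    # Soft term: KL(teacher || student) at temperature T, scaled by T^2
    # so its gradient magnitude stays comparable to the hard term.
    soft = sum(pt * math.log(pt / ps)
               for pt, ps in zip(p_teacher, p_student)) * T * T
    # Hard term: cross-entropy against the ground-truth label.
    hard = -math.log(softmax(student_logits)[label])
    return alpha * soft + (1 - alpha) * hard

# A student that already matches the teacher pays no soft-target cost:
print(round(distill_loss([1.0, 2.0], [1.0, 2.0], 1, alpha=1.0), 6))  # 0.0
```

The economic point: the soft-target term transfers the teacher's full output distribution, not just its top answer, which is how a 400B student can inherit capability from a 2T teacher.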

What Caught Up While Behemoth Was Training

Models that have shipped in Behemoth's silent year, all of which match or exceed its preview benchmark numbers:

| Model | Release | Key benchmark vs Behemoth preview |
| --- | --- | --- |
| Gemini 2.5 Pro | Mid-2025 | Exceeds Behemoth on MATH-500 and GPQA Diamond |
| Claude Opus 4.6 | Late 2025 | Matches Behemoth on STEM, exceeds on code |
| GPT-5.4 (xhigh) | Early 2026 | Exceeds Behemoth on SWE-Bench Pro (57.7 vs speculative 52) |
| Kimi K2.6 | 2026-04-20 | 58.6 SWE-Bench Pro — open-weight, ships in the category Behemoth was supposed to own |
| DeepSeek V3.2 | 2026-Q1 | Cheaper and faster in many production workloads |
| Step 3.5 Flash | 2026-02-01 | Smaller (196B) but beats Behemoth-preview ratios on AIME 2025 |

Source: TokenMix blog posts on each model + vendor announcements

The read: Behemoth's 2025 preview was frontier-competitive at the time. Today, Kimi K2.6 alone has taken the "best open-weight flagship on SWE-Bench" title — exactly the slot Behemoth was designed to occupy.

Three Scenarios for the Next 90 Days

Scenario 1 — Soft release (most likely, ~45%): Behemoth weights quietly appear on HuggingFace in May-June 2026 with a blog post but no big event. Rationale: Meta wants to ship before Gemini 3 / GPT-6, but doesn't want to over-hype a model that's already been surpassed on several benchmarks.

Scenario 2 — Bigger Llama 5 announcement (~35%): Behemoth becomes a footnote inside a broader "Llama 5 family" announcement later in 2026. Rationale: Meta restructures messaging so Behemoth is "chapter 1" of a new story rather than "the delayed flagship." This is what the Interconnects essay hinted at.

Scenario 3 — Indefinite delay / quiet shelving (~20%): Behemoth never releases as-named. Meta ships a successor teacher under a different codename, or folds the work into Scout 2/Maverick 2 without a standalone Behemoth release. Rationale: By the time it's ready, it's no longer competitive and public release would invite unfavorable comparison.

None of these are good outcomes for Meta's open-weight narrative. The least bad is Scenario 1 — ship soon, accept some criticism, preserve credibility.

Behemoth vs Current Frontier Models

| Dimension | Behemoth (preview, 2025) | Kimi K2.6 (2026) | Claude Opus 4.7 (2026) | Gemini 2.5 Pro (2025) |
| --- | --- | --- | --- | --- |
| Total params | 2T | 1T | Undisclosed | Undisclosed |
| Active params | 288B | 32B | Dense (undisclosed) | Dense (undisclosed) |
| Open weights | Expected | Yes | No | No |
| Released | No | 2026-04-20 | 2026-Q1 | Mid-2025 |
| SWE-Bench Pro | Not public | 58.6 | ~55 (Opus 4.6) | 54.2 |
| MATH-500 | Beats GPT-4.5 (preview) | Strong | Strong | Stronger than Behemoth-preview |
| GPQA Diamond | Beats Claude 3.7 (preview) | ~72 | ~80+ | Stronger than Behemoth-preview |

Read: Behemoth's preview numbers were frontier in April 2025. In April 2026, those same numbers would place it behind three or four already-shipped models. Meta would likely need to retrain or fine-tune against current-generation benchmarks for a release to feel like an upgrade.

What This Means for Open-Weight AI

Three takeaways:

  1. Scaling alone is no longer a competitive moat. Step 3.5 Flash (196B total, ~11B active) and Kimi K2.6 (1T total, ~32B active) have shown that sparse MoE with modest active-parameter budgets and good training beats dense giants. Behemoth's 2T advantage has shrunk from "decisive" to "marginal."

  2. Time-to-release matters more than raw capability. The model you ship in Q2 2026 competes with Q2 2026 peers — not Q1 2025 peers. Meta's 12-month delay converted a lead into a lag.

  3. Chinese open-weight labs are operating on a faster cycle. DeepSeek, Moonshot, StepFun, Zhipu, Alibaba all released frontier-tier models in the same 12 months Meta spent "still training." For Western open-weight AI, this is a wake-up call.
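The first takeaway rests on how sparse MoE decouples total from active parameters. A toy top-k router makes the mechanism concrete (the expert counts and logits below are illustrative, not any vendor's actual router):

```python
import math

def top_k_gate(router_logits, k=2):
    """Pick the k highest-scoring experts; renormalize their softmax weights."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    exps = [math.exp(router_logits[i]) for i in top]
    z = sum(exps)
    return [(i, e / z) for i, e in zip(top, exps)]

# 16 experts, top-2 routing: per token, only 2/16 of the expert weights run,
# which is how a model can be huge in total but cheap in active parameters.
gate = top_k_gate([0.1, 2.0, -1.0, 0.5] + [0.0] * 12, k=2)
print([i for i, _ in gate])  # experts 1 and 3 win this token
```

Because only the selected experts execute, inference cost scales with active parameters while capability scales (partly) with total parameters, which is exactly the trade the mid-scale MoE labs are exploiting.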

For teams building on open-weight models today, the practical advice: don't plan roadmap around Behemoth. Plan around Kimi K2.6 + DeepSeek V3.2 + Step 3.5 Flash, which are here, priced in, and improving monthly. If Behemoth eventually ships, treat it as a bonus. TokenMix.ai lets you add any new model to your routing layer in minutes — no lock-in on whichever flagship is currently in training.
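To make the "no lock-in" point concrete: behind a routing layer, adopting a newly released model is a config edit, not an application-code change. The sketch below is hypothetical; the field names and model IDs are illustrative placeholders, not TokenMix.ai's actual API:

```python
# Hypothetical routing config; not TokenMix.ai's real schema.
routing_config = {
    "default_model": "kimi-k2.6",
    "fallbacks": ["deepseek-v3.2", "gpt-5.4"],
}

def add_model(config: dict, model_id: str, make_default: bool = False) -> dict:
    """Return a new config with an extra model; the original is untouched."""
    updated = dict(config)
    if make_default:
        # Old default is demoted to the front of the fallback chain.
        updated["fallbacks"] = [updated["default_model"]] + updated["fallbacks"]
        updated["default_model"] = model_id
    else:
        updated["fallbacks"] = updated["fallbacks"] + [model_id]
    return updated

# If Behemoth ever ships, integrating it is one call:
print(add_model(routing_config, "llama-4-behemoth")["fallbacks"][-1])
```

The design point is that the application only ever names the routing layer, so whichever flagship is "currently in training" never blocks a roadmap.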

FAQ

Q: What is Llama 4 Behemoth's release date? A: There is no confirmed release date as of April 23, 2026. Meta announced it in April 2025 as "still in training" and has not updated that status publicly in 12 months.

Q: How big is Llama 4 Behemoth? A: Approximately 2 trillion total parameters with ~288 billion active per token across 16 experts, trained on 30+ trillion tokens of multimodal data.

Q: Is Behemoth going to be open-weight like Scout and Maverick? A: Meta's stated intent is open-weight release under the same Llama 4 community license (which restricts companies with over 700M monthly active users). No release has happened yet to confirm.

Q: Why is Behemoth delayed so long? A: Meta has not publicly explained the delay. Industry speculation includes training instabilities, competitive repositioning (waiting for Llama 5 family messaging), and benchmark underperformance vs Gemini 2.5 Pro. None is officially confirmed.

Q: Is Behemoth still competitive with 2026 frontier models? A: Based on the April 2025 preview numbers, it would now trail Claude Opus 4.7, GPT-5.4, Gemini 2.5 Pro, and Kimi K2.6 on key benchmarks. Meta would likely need to re-train or fine-tune against newer benchmarks to ship a compelling release.

Q: Will there be a LlamaCon 2026 where Behemoth launches? A: As of April 23, 2026, Meta has not announced a LlamaCon 2026 event. The inaugural LlamaCon was in April 2025. No official second-edition date exists publicly.

Q: Should I build production workloads assuming Behemoth will launch? A: No. Plan around models that are actually released — Kimi K2.6, Claude Opus 4.7, GPT-5.4, DeepSeek V3.2. If Behemoth eventually releases, integrating it later via a unified API like TokenMix.ai is a few lines of config.

Q: What happens to Scout and Maverick if Behemoth never releases? A: Scout and Maverick are already released and will continue to work. Their quality ceiling is partly set by Behemoth distillation; without a released Behemoth successor, their future upgrades may come from different teacher architectures or direct fine-tuning.


By TokenMix Research Lab · Updated 2026-04-23