TokenMix Research Lab · 2026-04-23

Llama 4 Behemoth Release Date: Still Training 1 Year After Meta's "In Progress" Claim (2026)

It has been one year and 18 days since Meta unveiled Llama 4 Behemoth at the Scout/Maverick launch on April 5, 2025 — a 2-trillion-parameter MoE positioned as the "teacher model" that would codistill frontier capabilities into Meta's open-weight lineup. As of April 23, 2026, Behemoth remains unreleased, still officially "in training," and the competitive landscape it was built to beat has already moved past it. Gemini 2.5 Pro, Claude Opus 4.7, GPT-5.4, and Kimi K2.6 all shipped while Behemoth stayed silent. This is the status update: what Meta has confirmed, what's still speculation, why the delay matters for open-weight AI, and three plausible scenarios for what happens next. TokenMix.ai tracks 300+ models in real time — Behemoth's row has been "Not Released" for 12 consecutive months.

Confirmed vs Speculation

| Claim | Status |
| --- | --- |
| Behemoth announced April 5, 2025 | Confirmed (Meta blog) |
| ~2 trillion total parameters | Confirmed (Meta) |
| ~288 billion active parameters per token | Confirmed |
| 16-expert MoE architecture | Confirmed |
| Trained on 30T+ tokens | Confirmed |
| Natively multimodal (text + image + video) | Confirmed |
| Beats GPT-4.5, Claude Sonnet 3.7, Gemini 2.0 Pro on MATH-500 / GPQA Diamond | Confirmed (Meta self-reported preview) |
| Outperformed by Gemini 2.5 Pro | Confirmed |
| Serves as teacher model for Scout/Maverick via codistillation | Confirmed |
| Still not released 12+ months after announcement | Confirmed |
| Public release imminent | No official confirmation |
| Will launch at a LlamaCon 2026 event | No — Meta has not announced LlamaCon 2026 |
| Delay is due to safety review | Speculation |
| Behemoth may be quietly shelved in favor of a Llama 5 successor | Speculation |

Timeline: What Meta Said and When

| Date | Event |
| --- | --- |
| 2025-02-18 | Meta announces LlamaCon for April 29, 2025 (TechCrunch) |
| 2025-04-05 | Llama 4 Scout + Maverick release; Behemoth previewed, "still training" |
| 2025-04-29 | LlamaCon 2025 — Behemoth mentioned but no release, no date |
| 2025-Q3 | Third-party benchmarks show Gemini 2.5 Pro exceeding Behemoth's preview numbers |
| 2025-Q4 | Meta internal reports (per Interconnects) suggest training hiccups |
| 2026-Q1 | Claude Opus 4.7 ships (87.6% SWE-Bench); GPT-5.4 ships — both exceed Behemoth's preview metrics |
| 2026-04-05 | 12-month mark; Behemoth still not released |
| 2026-04-20 | Kimi K2.6 ships — also exceeds Behemoth's preview metrics |
| 2026-04-23 | Today — no updated release date, no LlamaCon 2026 confirmed |

That's 383 days of "still training" — a duration that, by 2026's pace, is three or four frontier-model release cycles.

Behemoth Specs We Know

From Meta's April 2025 preview announcement:

| Spec | Value |
| --- | --- |
| Total parameters | ~2 trillion |
| Active parameters per token | ~288 billion |
| Expert count | 16 |
| Training tokens | 30+ trillion |
| Modalities | Text, image, video (native, early-fusion) |
| Intended role | Teacher model for codistillation into Scout + Maverick |
| Target strength | STEM reasoning (MATH-500, GPQA Diamond) |
| Disclosed weaknesses | Underperforms Gemini 2.5 Pro |
| License | Expected to match the Llama 4 community license (700M-MAU restriction) |
| Release form | Weights expected on HuggingFace + llama.com, per Meta's open-weight pattern |

Source: Meta Llama 4 blog / RDWorld on 2T preview

The 288B active figure is notable — it's 9× larger than Kimi K2.6's 32B active and 26× larger than Step 3.5 Flash's 11B active. Inference would require premium data-center silicon (8× H200 or B200-class). This alone makes Behemoth a very different category of release: not something most teams could self-host.
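Those hardware numbers are easy to sanity-check. A quick sketch, using only Meta's published parameter counts and standard weight precisions (the GPU memory figure is the H200's public 141 GB HBM spec; everything else is plain arithmetic):

```python
# Back-of-envelope weight-memory math for a 2T-parameter MoE. All experts
# must be resident in memory even though only ~288B parameters are active
# per token, so total (not active) params drive the hardware bill.

def weight_memory_gb(params: float, bits_per_param: float) -> float:
    """GB (1e9 bytes) needed just to hold the weights, ignoring KV cache."""
    return params * bits_per_param / 8 / 1e9

TOTAL_PARAMS = 2e12  # ~2T total, per Meta's preview

print(weight_memory_gb(TOTAL_PARAMS, 16))  # bf16 -> 4000.0 GB
print(weight_memory_gb(TOTAL_PARAMS, 8))   # fp8  -> 2000.0 GB
print(weight_memory_gb(TOTAL_PARAMS, 4))   # int4 -> 1000.0 GB

# An 8x H200 node offers 8 * 141 = 1128 GB of HBM, so even single-node
# serving needs roughly 4-bit weights; bf16 weights require multiple nodes.
print(weight_memory_gb(TOTAL_PARAMS, 4) <= 8 * 141)  # True
```

In other words, the weights alone are a multi-node problem at full precision, which is why self-hosting is out of reach for most teams.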

Why Behemoth Matters: Teacher Model Economics

Meta's pitch in April 2025 was that Behemoth is not primarily a product — it's infrastructure. By running codistillation from a 2T teacher into the 109B Scout and 400B Maverick, you get open-weight models that punch above their parameter class.

If Behemoth works as a teacher:

  - Scout, Maverick, and their successors keep receiving codistillation upgrades, so Meta's open-weight models continue to punch above their parameter class.
  - Meta keeps a differentiated open-weight story: a frontier-scale teacher it controls end to end.

If Behemoth is delayed indefinitely:

  - Scout and Maverick's quality ceiling, which is partly set by Behemoth distillation, stalls at the current teacher checkpoint.
  - Meta's open-weight roadmap loses its stated engine while faster-cycling labs ship around it.
That's why 383 days of silence is louder than "we pushed a new coding model."
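Meta has not published Behemoth's exact codistillation loss. A minimal pure-Python sketch of standard logit distillation, the generic technique codistillation builds on (the temperature, weighting, and single-token framing here are illustrative, not Meta's recipe):

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax over a single token's logits."""
    exps = [math.exp(l / T) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def distill_loss(student_logits, teacher_logits, label, T=2.0, alpha=0.5):
    """Blend of soft-target KL (teacher -> student) and hard-label CE."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    # Soft term: KL(teacher || student) at temperature T, scaled by T^2
    # so its gradient magnitude stays comparable to the hard term.
    soft = sum(pt * math.log(pt / ps)
               for pt, ps in zip(p_teacher, p_student)) * T * T
    # Hard term: cross-entropy against the ground-truth label.
    hard = -math.log(softmax(student_logits)[label])
    return alpha * soft + (1 - alpha) * hard

# A student that already matches the teacher pays no soft-target cost:
print(round(distill_loss([1.0, 2.0], [1.0, 2.0], 1, alpha=1.0), 6))  # 0.0
```

The economic point: the soft-target term transfers the teacher's full output distribution, not just its top answer, which is how a 400B student can inherit capability from a 2T teacher.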

What Caught Up While Behemoth Was Training

Models that have shipped in Behemoth's silent year, all of which match or exceed its preview benchmark numbers:

| Model | Release | Key benchmark vs Behemoth preview |
| --- | --- | --- |
| Gemini 2.5 Pro | Mid-2025 | Exceeds Behemoth on MATH-500 and GPQA Diamond |
| Claude Opus 4.6 | Late 2025 | Matches Behemoth on STEM, exceeds on code |
| GPT-5.4 (xhigh) | Early 2026 | Exceeds Behemoth on SWE-Bench Pro (57.7 vs speculative 52) |
| Kimi K2.6 | 2026-04-20 | 58.6 SWE-Bench Pro — open-weight, ships in the category Behemoth was supposed to own |
| DeepSeek V3.2 | 2026-Q1 | Cheaper and faster in many production workloads |
| Step 3.5 Flash | 2026-02-01 | Smaller (196B) but beats Behemoth-preview ratios on AIME 2025 |

Source: TokenMix blog posts on each model + vendor announcements

The read: Behemoth's 2025 preview was frontier-competitive at the time. Today, Kimi K2.6 alone has taken the "best open-weight flagship on SWE-Bench" title — exactly the slot Behemoth was designed to occupy.

Three Scenarios for the Next 90 Days

Scenario 1 — Soft release (most likely, ~45%): Behemoth weights quietly appear on HuggingFace in May-June 2026 with a blog post but no big event. Rationale: Meta wants to ship before Gemini 3 / GPT-6, but doesn't want to over-hype a model that's already been surpassed on several benchmarks.

Scenario 2 — Bigger Llama 5 announcement (~35%): Behemoth becomes a footnote inside a broader "Llama 5 family" announcement later in 2026. Rationale: Meta restructures messaging so Behemoth is "chapter 1" of a new story rather than "the delayed flagship." This is what the Interconnects essay hinted at.

Scenario 3 — Indefinite delay / quiet shelving (~20%): Behemoth never releases as-named. Meta ships a successor teacher under a different codename, or folds the work into Scout 2/Maverick 2 without a standalone Behemoth release. Rationale: By the time it's ready, it's no longer competitive and public release would invite unfavorable comparison.

None of these are good outcomes for Meta's open-weight narrative. The least bad is Scenario 1 — ship soon, accept some criticism, preserve credibility.

Behemoth vs Current Frontier Models

| Dimension | Behemoth (preview, 2025) | Kimi K2.6 (2026) | Claude Opus 4.7 (2026) | Gemini 2.5 Pro (2025) |
| --- | --- | --- | --- | --- |
| Total params | 2T | 1T | Undisclosed | Undisclosed |
| Active params | 288B | 32B | Dense (undisclosed) | Dense (undisclosed) |
| Open weights | Expected | Yes | No | No |
| Released | No | 2026-04-20 | 2026-Q1 | Mid-2025 |
| SWE-Bench Pro | Not public | 58.6 | ~55 (Opus 4.6) | 54.2 |
| MATH-500 | Beats GPT-4.5 (preview) | Strong | Strong | Stronger than Behemoth-preview |
| GPQA Diamond | Beats Claude 3.7 (preview) | ~72 | ~80+ | Stronger than Behemoth-preview |

Read: Behemoth's preview numbers were frontier in April 2025. In April 2026, those same numbers would place it behind three or four already-shipped models. Meta would likely need to retrain or fine-tune against current-generation benchmarks for a release to feel like an upgrade.

What This Means for Open-Weight AI

Three takeaways:

  1. Scaling alone is no longer a competitive moat. Step 3.5 Flash (196B total, ~11B active) and Kimi K2.6 (1T total, ~32B active) have shown that sparse MoE with modest active-parameter budgets and good training beats dense giants. Behemoth's 2T advantage has shrunk from "decisive" to "marginal."

  2. Time-to-release matters more than raw capability. The model you ship in Q2 2026 competes with Q2 2026 peers — not Q1 2025 peers. Meta's 12-month delay converted a lead into a lag.

  3. Chinese open-weight labs are operating on a faster cycle. DeepSeek, Moonshot, StepFun, Zhipu, Alibaba all released frontier-tier models in the same 12 months Meta spent "still training." For Western open-weight AI, this is a wake-up call.
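The first takeaway rests on how sparse MoE decouples total from active parameters. A toy top-k router makes the mechanism concrete (the expert counts and logits below are illustrative, not any vendor's actual router):

```python
import math

def top_k_gate(router_logits, k=2):
    """Pick the k highest-scoring experts; renormalize their softmax weights."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    exps = [math.exp(router_logits[i]) for i in top]
    z = sum(exps)
    return [(i, e / z) for i, e in zip(top, exps)]

# 16 experts, top-2 routing: per token, only 2/16 of the expert weights run,
# which is how a model can be huge in total but cheap in active parameters.
gate = top_k_gate([0.1, 2.0, -1.0, 0.5] + [0.0] * 12, k=2)
print([i for i, _ in gate])  # experts 1 and 3 win this token
```

Because only the selected experts execute, inference cost scales with active parameters while capability scales (partly) with total parameters, which is exactly the trade the mid-scale MoE labs are exploiting.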

For teams building on open-weight models today, the practical advice: don't plan roadmap around Behemoth. Plan around Kimi K2.6 + DeepSeek V3.2 + Step 3.5 Flash, which are here, priced in, and improving monthly. If Behemoth eventually ships, treat it as a bonus. TokenMix.ai lets you add any new model to your routing layer in minutes — no lock-in on whichever flagship is currently in training.
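To make the "no lock-in" point concrete: behind a routing layer, adopting a newly released model is a config edit, not an application-code change. The sketch below is hypothetical; the field names and model IDs are illustrative placeholders, not TokenMix.ai's actual API:

```python
# Hypothetical routing config; not TokenMix.ai's real schema.
routing_config = {
    "default_model": "kimi-k2.6",
    "fallbacks": ["deepseek-v3.2", "gpt-5.4"],
}

def add_model(config: dict, model_id: str, make_default: bool = False) -> dict:
    """Return a new config with an extra model; the original is untouched."""
    updated = dict(config)
    if make_default:
        # Old default is demoted to the front of the fallback chain.
        updated["fallbacks"] = [updated["default_model"]] + updated["fallbacks"]
        updated["default_model"] = model_id
    else:
        updated["fallbacks"] = updated["fallbacks"] + [model_id]
    return updated

# If Behemoth ever ships, integrating it is one call:
print(add_model(routing_config, "llama-4-behemoth")["fallbacks"][-1])
```

The design point is that the application only ever names the routing layer, so whichever flagship is "currently in training" never blocks a roadmap.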

FAQ

Q: What is Llama 4 Behemoth's release date? A: There is no confirmed release date as of April 23, 2026. Meta announced it in April 2025 as "still in training" and has not updated that status publicly in 12 months.

Q: How big is Llama 4 Behemoth? A: Approximately 2 trillion total parameters with ~288 billion active per token across 16 experts, trained on 30+ trillion tokens of multimodal data.

Q: Is Behemoth going to be open-weight like Scout and Maverick? A: Meta's stated intent is open-weight release under the same Llama 4 community license (which restricts companies with over 700M monthly active users). No release has happened yet to confirm.

Q: Why is Behemoth delayed so long? A: Meta has not publicly explained the delay. Industry speculation includes training instabilities, competitive repositioning (waiting for Llama 5 family messaging), and benchmark underperformance vs Gemini 2.5 Pro. None is officially confirmed.

Q: Is Behemoth still competitive with 2026 frontier models? A: Based on the April 2025 preview numbers, it would now trail Claude Opus 4.7, GPT-5.4, Gemini 2.5 Pro, and Kimi K2.6 on key benchmarks. Meta would likely need to re-train or fine-tune against newer benchmarks to ship a compelling release.

Q: Will there be a LlamaCon 2026 where Behemoth launches? A: As of April 23, 2026, Meta has not announced a LlamaCon 2026 event. The inaugural LlamaCon was in April 2025. No official second-edition date exists publicly.

Q: Should I build production workloads assuming Behemoth will launch? A: No. Plan around models that are actually released — Kimi K2.6, Claude Opus 4.7, GPT-5.4, DeepSeek V3.2. If Behemoth eventually releases, integrating it later via a unified API like TokenMix.ai is a few lines of config.

Q: What happens to Scout and Maverick if Behemoth never releases? A: Scout and Maverick are already released and will continue to work. Their quality ceiling is partly set by Behemoth distillation; without a released Behemoth successor, their future upgrades may come from different teacher architectures or direct fine-tuning.


By TokenMix Research Lab · Updated 2026-04-23