TokenMix Research Lab · 2026-04-23
Llama 4 Behemoth Release Date: Still Training 1 Year After Meta's "In Progress" Claim (2026)
It has been one year and 18 days since Meta unveiled Llama 4 Behemoth at the Scout/Maverick launch on April 5, 2025 — a 2-trillion-parameter MoE positioned as the "teacher model" that would codistill frontier capabilities into Meta's open-weight lineup. As of April 23, 2026, Behemoth remains unreleased, still officially "in training," and the competitive landscape it was built to beat has already moved past it. Gemini 2.5 Pro, Claude Opus 4.7, GPT-5.4, and Kimi K2.6 all shipped while Behemoth stayed silent. This is the status update: what Meta has confirmed, what's still speculation, why the delay matters for open-weight AI, and three plausible scenarios for what happens next. TokenMix.ai tracks 300+ models in real time — Behemoth's row has been "Not Released" for 12 consecutive months.
Table of Contents
- Confirmed vs Speculation
- Timeline: What Meta Said and When
- Behemoth Specs We Know
- Why Behemoth Matters: Teacher Model Economics
- What Caught Up While Behemoth Was Training
- Three Scenarios for the Next 90 Days
- Behemoth vs Current Frontier Models
- What This Means for Open-Weight AI
- FAQ
Confirmed vs Speculation
| Claim | Status |
|---|---|
| Behemoth announced April 5, 2025 | Confirmed (Meta blog) |
| ~2 trillion total parameters | Confirmed (Meta) |
| ~288 billion active parameters per token | Confirmed |
| 16 experts MoE architecture | Confirmed |
| Trained on 30T+ tokens | Confirmed |
| Natively multimodal (text + image + video) | Confirmed |
| Beats GPT-4.5, Claude Sonnet 3.7, Gemini 2.0 Pro on MATH-500 / GPQA Diamond | Confirmed (Meta self-reported preview) |
| Outperformed by Gemini 2.5 Pro | Confirmed |
| Serves as teacher model for Scout/Maverick via codistillation | Confirmed |
| Still not released 12+ months after announcement | Confirmed |
| Public release imminent | No official confirmation |
| Will launch at a LlamaCon 2026 event | No — Meta has not announced LlamaCon 2026 |
| Delay is due to safety review | Speculation |
| Behemoth may be quietly shelved in favor of a Llama 5 successor | Speculation |
Timeline: What Meta Said and When
| Date | Event |
|---|---|
| 2025-02-18 | Meta announces LlamaCon for April 29, 2025 (TechCrunch) |
| 2025-04-05 | Llama 4 Scout + Maverick release; Behemoth previewed, "still training" |
| 2025-04-29 | LlamaCon 2025 — Behemoth mentioned but no release, no date |
| 2025-Q3 | Third-party benchmarks show Gemini 2.5 Pro exceeding Behemoth's preview numbers |
| 2025-Q4 | Meta internal reports (per Interconnects) suggest training hiccups |
| 2026-Q1 | Claude Opus 4.7 ships (87.6% SWE-Bench); GPT-5.4 ships; Kimi K2.6 ships — all exceed Behemoth's preview metrics |
| 2026-04-05 | 12-month mark; Behemoth still not released |
| 2026-04-23 | Today — no updated release date, no LlamaCon 2026 confirmed |
That's 383 days of "still training" — a duration that, at 2026's release cadence, spans three or four full frontier-model cycles.
Behemoth Specs We Know
From Meta's April 2025 preview announcement:
| Spec | Value |
|---|---|
| Total parameters | ~2 trillion |
| Active parameters per token | ~288 billion |
| Expert count | 16 |
| Training tokens | 30+ trillion |
| Modalities | Text, image, video (native, early-fusion) |
| Intended role | Teacher model for codistillation into Scout + Maverick |
| Target strength | STEM reasoning (MATH-500, GPQA Diamond) |
| Disclosed weaknesses | Underperforms Gemini 2.5 Pro |
| License | Expected to match Llama 4 community license (700M MAU restriction) |
| Release form | Weights expected on HuggingFace + llama.com, per Meta's open-weight pattern |
Source: Meta Llama 4 blog / RDWorld on 2T preview
The 288B active figure is notable — it's 9× larger than Kimi K2.6's 32B active and 26× larger than Step 3.5 Flash's 11B active. Inference would require premium data-center silicon (8× H200 or B200-class). This alone makes Behemoth a very different category of release: not something most teams could self-host.
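How different? A weights-only, back-of-envelope check makes it concrete. This is a sketch: real serving also needs KV cache and activation memory, and in an MoE all experts must stay resident in VRAM even though only ~288B parameters are active per token. HBM figures are the published per-GPU capacities (H200: 141 GB, B200: 192 GB).

```python
# Back-of-envelope VRAM for hosting Behemoth's weights (sketch only;
# excludes KV cache, activations, and framework overhead).
def weights_gb(total_params_b: float, bytes_per_param: float) -> float:
    """GB needed for weights alone: 1B params at 1 byte/param = 1 GB."""
    return total_params_b * bytes_per_param

TOTAL_B = 2000           # ~2T total parameters (all MoE experts resident)
NODE_H200 = 8 * 141      # 8x H200 @ 141 GB HBM3e -> 1128 GB
NODE_B200 = 8 * 192      # 8x B200 @ 192 GB      -> 1536 GB

for name, bpp in [("bf16", 2.0), ("fp8", 1.0), ("int4", 0.5)]:
    need = weights_gb(TOTAL_B, bpp)
    print(f"{name}: {need:.0f} GB | fits 8xH200: {need <= NODE_H200} | "
          f"fits 8xB200: {need <= NODE_B200}")
```

At bf16 (4000 GB) or fp8 (2000 GB) the weights alone exceed a single 8-GPU node; only ~4-bit quantization (1000 GB) fits. The single-node serving story therefore implies aggressive quantization or multi-node tensor parallelism.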
Why Behemoth Matters: Teacher Model Economics
Meta's pitch in April 2025 was that Behemoth is not primarily a product — it's infrastructure. By running codistillation from a 2T teacher into the 109B Scout and 400B Maverick, you get open-weight models that punch above their parameter class.
If Behemoth works as a teacher:
- Future Scout 2 / Maverick 2 inherit Behemoth-class reasoning at a fraction of the inference cost
- Open-weight community gets access to distilled frontier capabilities
- Meta's "open beats closed" thesis gets concrete proof
If Behemoth is delayed indefinitely:
- Scout/Maverick successors lose their knowledge source
- Meta falls behind in the "best open-weight flagship" race (currently Kimi K2.6 holds the title)
- The teacher-model strategy looks uncompetitive vs Claude's vertical integration
That's why 383 days of silence is louder than "we pushed a new coding model."
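Meta has not published Behemoth's codistillation recipe. The standard starting point in the distillation literature is a temperature-scaled soft-target loss blended with the ordinary hard-label loss; the NumPy sketch below illustrates that shape. The temperature, mixing weight, and logit shapes are illustrative assumptions, not Meta's actual setup.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax with max-subtraction for stability."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend soft-target cross-entropy (teacher -> student, temperature T)
    with hard-label cross-entropy. alpha weights the soft term."""
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T))
    # Soft term: CE against teacher distribution; T^2 rescales gradients.
    soft = -(p_teacher * log_p_student).sum(axis=-1).mean() * (T * T)
    # Hard term: ordinary CE against ground-truth labels at T=1.
    log_p_hard = np.log(softmax(student_logits))
    hard = -log_p_hard[np.arange(len(labels)), labels].mean()
    return alpha * soft + (1 - alpha) * hard
```

The economics follow from this structure: the expensive teacher forward pass is a training-time cost, paid once, while the student keeps its small inference footprint.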
What Caught Up While Behemoth Was Training
Models that have shipped in Behemoth's silent year, all of which match or exceed its preview benchmark numbers:
| Model | Release | Key benchmark vs Behemoth preview |
|---|---|---|
| Gemini 2.5 Pro | Mid-2025 | Exceeds Behemoth on MATH-500 and GPQA Diamond |
| Claude Opus 4.6 | Late 2025 | Matches Behemoth on STEM, exceeds on code |
| GPT-5.4 (xhigh) | Early 2026 | Exceeds Behemoth on SWE-Bench Pro (57.7 vs speculative 52) |
| Kimi K2.6 | 2026-04-20 | 58.6 SWE-Bench Pro — open-weight, ships in the category Behemoth was supposed to own |
| DeepSeek V3.2 | 2026-Q1 | Cheaper and faster in many production workloads |
| Step 3.5 Flash | 2026-02-01 | Smaller (196B) but beats Behemoth-preview ratios on AIME 2025 |
Source: TokenMix blog posts on each model + vendor announcements
The read: Behemoth's 2025 preview was frontier-competitive at the time. Today, Kimi K2.6 alone has taken the "best open-weight flagship on SWE-Bench" title — exactly the slot Behemoth was designed to occupy.
Three Scenarios for the Next 90 Days
Scenario 1 — Soft release (most likely, ~45%): Behemoth weights quietly appear on HuggingFace in May-June 2026 with a blog post but no big event. Rationale: Meta wants to ship before Gemini 3 / GPT-6, but doesn't want to over-hype a model that's already been surpassed on several benchmarks.
Scenario 2 — Bigger Llama 5 announcement (~35%): Behemoth becomes a footnote inside a broader "Llama 5 family" announcement later in 2026. Rationale: Meta restructures messaging so Behemoth is "chapter 1" of a new story rather than "the delayed flagship." This is what the Interconnects essay hinted at.
Scenario 3 — Indefinite delay / quiet shelving (~20%): Behemoth never releases as-named. Meta ships a successor teacher under a different codename, or folds the work into Scout 2/Maverick 2 without a standalone Behemoth release. Rationale: By the time it's ready, it's no longer competitive and public release would invite unfavorable comparison.
None of these are good outcomes for Meta's open-weight narrative. The least bad is Scenario 1 — ship soon, accept some criticism, preserve credibility.
Behemoth vs Current Frontier Models
| Dimension | Behemoth (preview, 2025) | Kimi K2.6 (2026) | Claude Opus 4.7 (2026) | Gemini 2.5 Pro (2025) |
|---|---|---|---|---|
| Total params | 2T | 1T | Undisclosed | Undisclosed |
| Active params | 288B | 32B | Dense (undisclosed) | Dense (undisclosed) |
| Open weights | Expected | Yes | No | No |
| Released | No | 2026-04-20 | 2026-Q1 | Mid-2025 |
| SWE-Bench Pro | Not public | 58.6 | ~55 (latest published, Opus 4.6) | 54.2 |
| MATH-500 | Beats GPT-4.5 (preview) | Strong | Strong | Stronger than Behemoth-preview |
| GPQA Diamond | Beats Claude 3.7 (preview) | ~72 | ~80+ | Stronger than Behemoth-preview |
Read: Behemoth's preview numbers were frontier in April 2025. In April 2026, those same numbers would place it behind 3-4 already-shipped models. Meta would need additional training targeted at current-generation benchmarks for a release to feel like an upgrade.
What This Means for Open-Weight AI
Three takeaways:
Scaling is no longer a competitive moat alone. Step 3.5 Flash (196B) and Kimi K2.6 (1T) have shown that well-trained sparse MoE at mid-scale can match far larger models. Behemoth's 2T advantage has shrunk from "decisive" to "marginal."
Time-to-release matters more than raw capability. The model you ship in Q2 2026 competes with Q2 2026 peers — not Q1 2025 peers. Meta's 12-month delay converted a lead into a lag.
Chinese open-weight labs are operating on a faster cycle. DeepSeek, Moonshot, StepFun, Zhipu, Alibaba all released frontier-tier models in the same 12 months Meta spent "still training." For Western open-weight AI, this is a wake-up call.
For teams building on open-weight models today, the practical advice: don't plan roadmap around Behemoth. Plan around Kimi K2.6 + DeepSeek V3.2 + Step 3.5 Flash, which are here, priced in, and improving monthly. If Behemoth eventually ships, treat it as a bonus. TokenMix.ai lets you add any new model to your routing layer in minutes — no lock-in on whichever flagship is currently in training.
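The "no lock-in" point can be made concrete with a routing shim. This is a hypothetical sketch, not TokenMix.ai's actual API; the route names and model identifiers are invented for illustration:

```python
# Hypothetical routing shim (NOT TokenMix.ai's real API) showing why
# adopting a newly released model should be a config change, not a rewrite.
ROUTES = {
    "code":      "kimi-k2.6",        # current SWE-Bench Pro leader
    "reasoning": "deepseek-v3.2",    # cheap, fast in production
    "default":   "step-3.5-flash",   # small, low-latency fallback
}

def pick_model(task: str) -> str:
    """Route a task category to whichever model currently tops it."""
    return ROUTES.get(task, ROUTES["default"])

# If Behemoth ever ships and benchmarks well, adoption is one line:
# ROUTES["reasoning"] = "llama-4-behemoth"
```

The roadmap risk stays in one dictionary instead of being baked into every call site.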
FAQ
Q: What is Llama 4 Behemoth's release date? A: There is no confirmed release date as of April 23, 2026. Meta announced it in April 2025 as "still in training" and has not updated that status publicly in 12 months.
Q: How big is Llama 4 Behemoth? A: Approximately 2 trillion total parameters with ~288 billion active per token across 16 experts, trained on 30+ trillion tokens of multimodal data.
Q: Is Behemoth going to be open-weight like Scout and Maverick? A: Meta's stated intent is open-weight release under the same Llama 4 community license (which restricts companies with over 700M monthly active users). No release has happened yet to confirm.
Q: Why is Behemoth delayed so long? A: Meta has not publicly explained the delay. Industry speculation includes training instabilities, competitive repositioning (waiting for Llama 5 family messaging), and benchmark underperformance vs Gemini 2.5 Pro. None is officially confirmed.
Q: Is Behemoth still competitive with 2026 frontier models? A: Based on the April 2025 preview numbers, it would now trail Claude Opus 4.7, GPT-5.4, Gemini 2.5 Pro, and Kimi K2.6 on key benchmarks. Meta would likely need to re-train or fine-tune against newer benchmarks to ship a compelling release.
Q: Will there be a LlamaCon 2026 where Behemoth launches? A: As of April 23, 2026, Meta has not announced a LlamaCon 2026 event. The inaugural LlamaCon was in April 2025. No official second-edition date exists publicly.
Q: Should I build production workloads assuming Behemoth will launch? A: No. Plan around models that are actually released — Kimi K2.6, Claude Opus 4.7, GPT-5.4, DeepSeek V3.2. If Behemoth eventually releases, integrating it later via a unified API like TokenMix.ai is a few lines of config.
Q: What happens to Scout and Maverick if Behemoth never releases? A: Scout and Maverick are already released and will continue to work. Their quality ceiling is partly set by Behemoth distillation; without a released Behemoth successor, their future upgrades may come from different teacher architectures or direct fine-tuning.
Sources
- Meta Llama 4 Official Blog
- Llama 4 Behemoth Specs (APXML)
- Llama 4 Behemoth Release Status (Serenities AI)
- Interconnects: Did Meta Push the Panic Button?
- RDWorld: Llama 4 Scout to 2T Behemoth
- VentureBeat: Meta's Answer to DeepSeek
- Hugging Face Llama 4 Release
By TokenMix Research Lab · Updated 2026-04-23