Today we’re releasing the weights for Laguna M.1,
our most capable model to date, with a 256K context length.
Both base and post-trained checkpoints are now available on Hugging Face under Apache 2.0.
someone asked if they could try getting Laguna M.1 running on a Mac.
we said yes.
they came back with a 3-bit MLX build running locally on Apple Silicon: ~26 tok/s, with ~100 GB peak memory on an M3 Max with 128 GB of unified memory.
absolute GOAT behavior from @eauchs
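A rough back-of-the-envelope check on that memory figure. This is only a sketch: it assumes the 225B total parameter count quoted elsewhere in this thread and a flat 3 bits per weight. Real 3-bit quantization also stores per-group scales and offsets, so the true weight footprint will be somewhat higher.

```python
def quantized_weight_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal GB, ignoring quantization metadata."""
    return total_params * bits_per_weight / 8 / 1e9


# Assumption: 225e9 total parameters, exactly 3 bits each.
weights_gb = quantized_weight_gb(225e9, 3)
print(f"{weights_gb:.1f} GB")
```

Under those assumptions the weights alone come to roughly 84 GB, which is consistent with a ~100 GB peak once quantization metadata, KV cache, and activations are added. The post does not give that breakdown.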
Open weights at the frontier! Laguna M.1 is a 225B MoE with 23B active params and a 256K context window, now Apache 2.0 on both checkpoints.
Run it on your own infrastructure, evaluate it in your own harnesses, fine-tune it, and build on it directly.
→ Available in Kilo.
🎉 Congrats to @poolsideai on Laguna M.1, a new open-weights agentic coding model. Day-0 support landed in vLLM v0.21.0.
🧠 70-layer sparse MoE: 225B total params, 23B active per token, 256K context
🔀 256 experts with top-k=16 routing, built for long-horizon agentic coding
🛠️
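For readers new to sparse MoE routing, here is a minimal sketch of what "256 experts with top-k=16" means for a single token. The gating function and the renormalization step are assumptions (softmax over the selected logits is one common convention); the post does not say which variant Laguna M.1 uses, and `route_token` is a hypothetical helper name.

```python
import math
import random


def route_token(router_logits: list[float], top_k: int) -> list[tuple[int, float]]:
    """Pick the top_k experts for one token and renormalize their weights."""
    # Indices of the top_k largest router logits.
    chosen = sorted(
        range(len(router_logits)), key=lambda i: router_logits[i], reverse=True
    )[:top_k]
    # Softmax over the selected logits only (assumed convention).
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(256)]  # stand-in router output
assignment = route_token(logits, top_k=16)
print(len(assignment), round(sum(w for _, w in assignment), 6))
```

Only the 16 selected experts run for that token, which is why the active parameter count (23B) is far smaller than the total (225B).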
ICYMI: the best way to try Laguna M.1 is to jump in the pool.
pool is our agent harness. It works as both an ACP server and client, so you can run M.1 as a coding agent and build with the same interface we use ourselves.
go build something cool ↓
🎉 Day-0 support for Laguna M.1 from @poolsideai is live in SGLang! This is a 225B MoE built for agentic coding & long-horizon work.
1️⃣ 70-layer MoE: 3 dense SwiGLU layers + 67 sparse MoE layers, 256 experts, top-k=16 with aux-loss-free load balancing
2️⃣ Global attention across
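One way "aux-loss-free load balancing" can work is a per-expert selection bias that is nudged after each batch instead of adding a balancing term to the loss. The toy simulation below sketches that idea. It is an assumption that Laguna M.1 uses this particular scheme, and `STEP`, the batch size, and the synthetic scores are all hypothetical.

```python
import random

NUM_EXPERTS, TOP_K = 256, 16
STEP = 0.05  # hypothetical bias update rate


def select_experts(scores, bias, top_k=TOP_K):
    """Top-k selection on biased scores; the bias affects selection only."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i] + bias[i], reverse=True)
    return ranked[:top_k]


def update_bias(bias, load, step=STEP):
    """Nudge bias down for overloaded experts and up for underloaded ones."""
    mean_load = sum(load) / len(load)
    for i, count in enumerate(load):
        if count > mean_load:
            bias[i] -= step
        elif count < mean_load:
            bias[i] += step


random.seed(0)
skew = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # fixed per-expert preference
bias = [0.0] * NUM_EXPERTS
spread = []  # max load minus min load, per batch
for _ in range(200):
    load = [0] * NUM_EXPERTS
    for _ in range(32):  # 32 synthetic tokens per batch
        scores = [s + random.gauss(0.0, 1.0) for s in skew]
        for expert in select_experts(scores, bias):
            load[expert] += 1
    spread.append(max(load) - min(load))
    update_bias(bias, load)

print(sum(spread[:5]) / 5, sum(spread[-5:]) / 5)
```

The gap between the busiest and idlest expert shrinks as the biases adapt. In a real model the output mixing weights would still be computed from the unbiased scores, so the bias steers which experts are picked without changing how their outputs are weighted.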
M.1 and XS.2 remain available for free on our API and through @OpenRouter
We are launching dedicated paid endpoints for both models on OpenRouter for more demanding work.
Open weights are now our default. We’ll keep building toward the frontier and releasing increasingly
Since April, M.1 has seen strong usage on @OpenRouter through coding agents including @kilocode and @NousResearch Hermes Agent.
Now researchers and builders can run it on their own infrastructure, evaluate it in their own harnesses, fine-tune it, and build on it directly!
As AI becomes more capable, the question is not only who builds the best models.
It is who gets to build them at all.
A founder’s view from @eisokant on where Poolside stands today.
another banger from @pupposandro and the @luceboxai team
Luce Spark runs Laguna XS.2 in 14.6 GiB at ~100 tok/s on an RTX 3090, versus ~119 tok/s fully resident.
you can now run Laguna below the 16 GiB line and use it for local evals, agent traces, routing analysis,
Excited to launch Luce Spark: now a 35B MoE runs on a 16GB GPU, with no offload tax.
An A3B model fires ~8 of its 256 experts per token, but to keep it resident you pay VRAM for all 256. Spark pins the experts your traffic actually hits, offloads the rest to CPU, and decodes the
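The pinning idea can be sketched as a frequency count over a routing trace: keep the most-hit experts on the GPU and measure how much traffic they cover. This is an illustrative sketch only, not Luce Spark's actual implementation; `plan_pinning`, `gpu_slots`, and the toy trace are hypothetical.

```python
from collections import Counter


def plan_pinning(trace: list[list[int]], gpu_slots: int) -> tuple[set[int], float]:
    """Pick the gpu_slots most-used experts and report the share of hits they cover."""
    hits = Counter(e for token_experts in trace for e in token_experts)
    pinned = {e for e, _ in hits.most_common(gpu_slots)}
    covered = sum(count for e, count in hits.items() if e in pinned)
    return pinned, covered / sum(hits.values())


# Toy trace: each inner list holds the expert ids one token was routed to.
trace = [[0, 1, 2], [0, 1, 3], [0, 2, 4], [0, 1, 5]]
pinned, hit_rate = plan_pinning(trace, gpu_slots=3)
print(sorted(pinned), hit_rate)  # [0, 1, 2] 0.75
```

The higher the hit rate for a given number of pinned experts, the less often decoding has to touch experts that live in CPU memory.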
Just finished reading the latest technical report released by @poolsideai for Laguna. It is so well written and information-dense, covering all stages of a large-scale training run. Each decision and assumption was clearly explained and concisely referenced.
The Model Factory