Johannes Hagemann (@johannes

Johannes Hagemann

1,704 posts

@johannes_hage

co-founder/cto @PrimeIntellect | open superintelligence infra, longevity, techno-optimism

Joined March 2017

Pinned
Johannes Hagemann
@johannes_hage
Nov 27, 2025
we've scaled RL for a 100B+ MoE model, achieving SOTA benchmark results for its size. more important than the final model checkpoint is making the frontier infra required to train models like this accessible to everyone. details on the full training recipe, our open source
Prime Intellect
@PrimeIntellect
Nov 27, 2025
Introducing INTELLECT-3: Scaling RL to a 100B+ MoE model on our end-to-end stack. Achieving state-of-the-art performance for its size across math, code and reasoning. Built using the same tools we put in your hands, from environments & evals, RL frameworks, sandboxes & more
44K
Johannes Hagemann
@johannes_hage
Aug 12, 2025
big shout out to the absolute legend Parth
Johannes Hagemann
@johannes_hage
Aug 12, 2025
uv tool install prime
621K
Johannes Hagemann
@johannes_hage
Jan 5, 2024
today I learned that there’s a literal GPU church with 4k H100s in Barcelona. h/t @tugot17
Readers added context they thought people might want to know
The picture is of MareNostrum 4, a supercomputer installed in 2017 that has Nvidia Volta & AMD Radeon Instinct MI50 GPUs, not H100 GPUs: bsc.es/marenostrum/ma… The MareNostrum 5 does use a total of 4480 Nvidia Hopper/H100 GPUs (1120 nodes with 4 GPUs each), but the datacenter is not in a chapel: bsc.es/marenostrum/ma… hpcwire.com/2023/05/23/mar…
1.3M
Johannes Hagemann
@johannes_hage
Jul 2, 2025
Zuckerberg spent $500 million on one OpenAI researcher. The US population is 327 million. He could have given each American $1 million and still have money left over.
446K
Johannes Hagemann
@johannes_hage
Nov 21, 2020
I’ve become such a big fan of the @lexfridman podcast that I created a website with all the book recommendations of his guests. You can check it out here: lexfridmanlibrary.com
Johannes Hagemann
@johannes_hage
Mar 16, 2025
criminally underrated interview of the cursor CTO with <1k views that goes into detail on the scale of their infra, including incident responses and more youtube.com/watch?v=4jDQi9…
153K
Johannes Hagemann
@johannes_hage
Sep 15, 2025
gpus.new
180K
Johannes Hagemann
@johannes_hage
Jun 1, 2025
there is now a Bloomberg ticker, SDH100RT, for h100 rental market prices
117K
Johannes Hagemann
@johannes_hage
Sep 17, 2025
if any of the 3.1k folks who bookmarked this want to join Grad’s weekly research reading group where he presents those papers, please apply.
Grad
@Grad62304977
Sep 15, 2025
Replying to @vikhyatk
For well executed reasoning RL I would say: arxiv.org/abs/2505.22312 arxiv.org/abs/2506.13284 arxiv.org/abs/2508.06471 arxiv.org/abs/2504.13914 arxiv.org/abs/2508.08221 arxiv.org/abs/2505.08311 arxiv.org/abs/2506.13585 github.com/Tencent-Hunyua… honorable-payment-890.notion.site/POLARIS-A-POst…
109K
Johannes Hagemann
@johannes_hage
Sep 25, 2025
B200 spot instances available for $0.92/hr right now b200.gpus.new
42K
Johannes Hagemann
@johannes_hage
Aug 3, 2025
🇺🇸🇺🇸🇺🇸
36K
Johannes Hagemann
@johannes_hage
Dec 6, 2023
The training infrastructure for Gemini Ultra is fascinating. They trained data parallel across multiple TPUv4 Superpods (4096 TPUs) in multiple datacenters. Insane that their network speeds are good enough to sync the gradients across multiple datacenters without significantly
109K
Johannes Hagemann
@johannes_hage
Jun 27, 2025
lesson in there
61K
Johannes Hagemann
@johannes_hage
Aug 10, 2024
Zyphra is one of the most underrated AI labs right now. Great work on Tree Attention, an exact attention approach with lower communication and memory requirements than Ring Attention, enabling more efficient scaling to million-token sequence lengths arxiv.org/abs/2408.04093
43K