alex zhang (@a1zhang) / X

1,070 posts

alex zhang

@a1zhang

phd student @mit_csail @nlp_mit, previously undergrad @princeton 🫵🏻 go participate in the @GPU_MODE kernel competitions!

USA

alexzhang13.github.io/blog/2025/rlm

Joined December 2015

Pinned
alex zhang
@a1zhang
Apr 10
Article
The "Mismanaged Geniuses" Hypothesis
tldr; AI models are already good enough for the next leap in capabilities. By: Alex Zhang (@a1zhang), Zhening (Zed) Li (@zli11010), Omar Khattab (@lateinteraction). For the last decade, scaling the...
320K
alex zhang
@a1zhang
Oct 15, 2025
What if scaling the context windows of frontier LLMs is much easier than it sounds? We’re excited to share our work on Recursive Language Models (RLMs). A new inference strategy where LLMs can decompose and recursively interact with input prompts of seemingly unbounded length,
950K
alex zhang
@a1zhang
Jun 11, 2025
“derive KL divergence” 😭😭😭 oh the irony
121K
alex zhang
@a1zhang
Sep 30, 2025
it's insane to me how little attention the llm.q repo has. it's a fully C/C++/CUDA implementation of multi-gpu (zero + fsdp), quantized LLM training with support for selective AC. it's genuinely the coolest OSS thing I've seen this year (what's crazier is 1 person wrote it!)
41K
alex zhang
@a1zhang
Oct 31, 2024
Scared of DL systems 🎃? I made a Meticulous Guide to Advances in Deep Learning Efficiency over the Years, which is a detailed story from pre-AlexNet to foundation model training centered on #efficient #deeplearning from the hardware, libraries, algorithms, compilers... 🧵(1/7)
66K
alex zhang
@a1zhang
Sep 1, 2025
All the recordings for the @GPU_MODE x @scaleml series are up as a playlist in case you missed it 😁 There's so much value in these ~8 hours of lectures, from proving quantization error bounds on a whiteboard to a deep-dive into GPU warp schedulers! Plz take advantage of it!
61K
alex zhang
@a1zhang
May 28, 2025
Can GPT, Claude, and Gemini play video games like Zelda, Civ, and Doom II? 𝗩𝗶𝗱𝗲𝗼𝗚𝗮𝗺𝗲𝗕𝗲𝗻𝗰𝗵 evaluates VLMs on Game Boy & MS-DOS games given only raw screen input, just like how a human would play. The best model (Gemini) completes just 0.48% of the benchmark! 🧵👇
134K
alex zhang
@a1zhang
Jan 10, 2024
for the past 2 weeks I’ve been reading every abstract from #neurips2023. here are my notes about what I discovered: alexzhang13.github.io/blog/2024/neur… i think everyone can learn something new from this, and I hope this resource is useful!
alexzhang13.github.io
Highlights of NeurIPS 2023 from Reading All 3584 Abstracts
Just me reading through every paper abstract...
98K
alex zhang
@a1zhang
Feb 23, 2025
🚨ATTENTION ALL GPU / CUDA PPL 🚨 We’re excited to be launching 🚀 the official @GPU_MODE 🍿kernel writing leaderboard directly on Discord, all completely OSS! Think Codeforces or Kaggle but for GPU kernels, where compute is free – the first practice round is NOW ⏱️
26K
alex zhang
@a1zhang
Oct 16, 2025
Lots of folks have been asking for a gist or simple notebook to try out RLMs. While we work on some more exciting experiments, here's a self-contained, minimal version I quickly put together for people to build on top of. Happy hacking :)
GitHub - alexzhang13/rlm: General plug-and-play inference library for Recursive Language Models...
From github.com
31K
alex zhang
@a1zhang
Jul 17, 2025
Bro actually denied OpenAI an AlphaGo moment LOL @FakePsyho is him. Huge congrats👏👏
14K
alex zhang
@a1zhang
Apr 17, 2025
Claude can play Pokemon, but can it play DOOM? With a simple agent, we let VLMs play it, and found Sonnet 3.7 to get the furthest, finding the blue room! Our VideoGameBench (twenty games from the 90s) and agent are open source so you can try it yourself now --> 🧵
76K
alex zhang
@a1zhang
Jun 16, 2025
btw a shit ton of amazing learning material + open-source code for GPU programming ($150K worth) is linked on the latest @GPU_MODE news post. a year ago when I was an undergrad I was scouring the internet for these kinds of resources, plz take advantage of it!
33K
alex zhang
@a1zhang
Sep 29, 2025
upcoming @GPU_MODE talk on how flash attention 4 was optimized for Blackwell GPUs, e.g. B200s! this is super timely given all the new DSLs and features NVIDIA has been releasing. charles will be live Wednesday, 1pm PST on the YT channel, so make sure to be there to ask any Qs!
23K