alex zhang
1,070 posts
phd student @mit_csail @nlp_mit, previously undergrad @princeton
🫵🏻 go participate in the @GPU_MODE kernel competitions!
- What if scaling the context windows of frontier LLMs is much easier than it sounds? We’re excited to share our work on Recursive Language Models (RLMs). A new inference strategy where LLMs can decompose and recursively interact with input prompts of seemingly unbounded length,
- “derive KL divergence” 😭😭😭 oh the irony
- it's insane to me how little attention the llm.q repo has. it's a fully C/C++/CUDA implementation of multi-gpu (zero + fsdp), quantized LLM training with support for selective AC. it's genuinely the coolest OSS thing I've seen this year (what's crazier is 1 person wrote it!)
- Scared of DL systems 🎃? I made a Meticulous Guide to Advances in Deep Learning Efficiency over the Years, which is a detailed story from pre-AlexNet to foundation model training centered on #efficient #deeplearning from the hardware, libraries, algorithms, compilers... 🧵(1/7)
- Can GPT, Claude, and Gemini play video games like Zelda, Civ, and Doom II? 𝗩𝗶𝗱𝗲𝗼𝗚𝗮𝗺𝗲𝗕𝗲𝗻𝗰𝗵 evaluates VLMs on Game Boy & MS-DOS games given only raw screen input, just like how a human would play. The best model (Gemini) completes just 0.48% of the benchmark! 🧵👇
- for the past 2 weeks I’ve been reading every abstract from #neurips2023. here are my notes about what I discovered: alexzhang13.github.io/blog/2024/neur… i think everyone can learn something new from this, and I hope this resource is useful!
- 🚨ATTENTION ALL GPU / CUDA PPL 🚨 We’re excited to be launching 🚀 the official @GPU_MODE 🍿kernel writing leaderboard directly on Discord, all completely OSS! Think Codeforces or Kaggle but for GPU kernels, where compute is free – the first practice round is NOW ⏱️
- Lots of folks have been asking for a gist or simple notebook to try out RLMs. While we work on some more exciting experiments, here's a self-contained, minimal version I quickly put together for people to build on top of. Happy hacking :)
- Claude can play Pokemon, but can it play DOOM? With a simple agent, we let VLMs play it, and found Sonnet 3.7 to get the furthest, finding the blue room! Our VideoGameBench (twenty games from the 90s) and agent are open source so you can try it yourself now --> 🧵
- btw a shit ton of amazing learning material + open-source code for GPU programming ($150K worth) is linked on the latest @GPU_MODE news post. a year ago when I was an undergrad I was scouring the internet for these kinds of resources, plz take advantage of it!
- upcoming @GPU_MODE talk on how flash attention 4 was optimized for Blackwell GPUs, e.g. B200s! this is super timely given all the new DSLs and features NVIDIA has been releasing. charles will be live Wednesday, 1pm PST on the YT channel, so make sure to be there to ask any Qs!