Pinned · In Data Science Collective by Ryan Pégoud · Feb 19 · Cutting LLM Memory by 84%, A Deep Dive into Fused Kernels · Why your final LLM layer is OOMing and how to fix it with a custom Triton kernel.
In Data Science Collective by Ryan Pégoud · May 3 · LatentVLA: Latent Reasoning Models for Autonomous Driving · What if natural language is not the best abstraction for driving?
In Data Science Collective by Ryan Pégoud · Feb 19 · AlpamayoR1: Large Causal Reasoning Models for Autonomous Driving · All you need to know about Chain of Causation reasoning and the current state of Autonomous Driving!
In Data Science Collective by Ryan Pégoud · Dec 27, 2025 · Learning Triton One Kernel at a Time: Softmax · All you need to know to write a fast, readable and PyTorch-ready softmax kernel!
In Data Science Collective by Ryan Pégoud · Nov 14, 2025 · Learning Triton One Kernel at a Time: Matrix Multiplication · Learn about efficient matrix multiplication, memory hierarchy in modern GPUs, coalescing and much more!
In Data Science Collective by Ryan Pégoud · Oct 29, 2025 · Learning Triton One Kernel At a Time: Vector Addition · The basics of GPU programming, optimisation, and your first Triton kernel!
In TDS Archive by Ryan Pégoud · Jul 12, 2024 · Rainbow: The Colorful Evolution of Deep Q-Networks 🌈 · Everything you need to assemble the DQN Megazord in JAX.
In TDS Archive by Ryan Pégoud · May 1, 2024 · A Practical Guide to Proximal Policy Optimization in JAX · All the tricks and details you wish you knew about PPO
In TDS Archive by Ryan Pégoud · Nov 21, 2023 · A Gentle Introduction to Deep Reinforcement Learning in JAX · Solving the CartPole environment with DQN in under a second
In TDS Archive by Ryan Pégoud · Nov 7, 2023 · Implementing a Transformer Encoder from Scratch with JAX and Haiku 🤖 · Understanding the fundamental building blocks of Transformers.