Song Han
- Welcome to my new course on TinyML and Efficient Deep Learning, starting tomorrow: efficientml.ai
- EfficientML.ai Lecture 16 - Diffusion Models (MIT 6.5940, Fall 2023). Topics: denoising diffusion models, conditional diffusion, latent diffusion, image editing, model personalization, fast sampling, DDIM, progressive distillation, guided distillation. Video: youtu.be/nFE1euQ_Wtw
- We are excited to open source TinyEngine: a memory-efficient, high-performance neural network library for microcontrollers.
- EfficientML.ai will be offered again this fall, starting tomorrow. I will teach model compression and acceleration techniques for efficient AI computing. Lectures will be streamed via live.efficientml.ai Tue/Thur 3:30-5:00pm EST. Looking forward to seeing you.
- It's time to systematically study the fundamental principles of GPU-accelerated AI computing:
- ⚡️ SANA is released: github.com/NVlabs/Sana (demo: nv-sana.mit.edu). Highlights: 20x smaller and 100x faster than FLUX; generates 4K images; deployable on a laptop GPU.
- EfficientViT is highlighted on the MIT home page today: mit.edu. EfficientViT accelerates Segment Anything from 12 images/s to 842 images/s on a GPU. Key idea: lightweight multi-scale linear attention. Full story: news.mit.edu/2023/ai-model-…
- Welcome to the live stream of EfficientML.ai lectures, every Tue/Thur 3:35-5pm ET at live.efficientml.ai. Today we'll introduce pruning, part 1.
- Proud moment of my first PhD student passing his thesis defense, "Efficient Deep Learning Computing: From TinyML to Large Language Model" youtu.be/E7cOyB20HDM. Congrats Ji Lin! @jilin_14
- Enrollment in my MIT class EfficientML.ai is projected to double again this year. Those who cannot enroll at MIT are welcome to take the online version: live.efficientml.ai
- AWQ received the Best Paper Award at MLSys. AWQ-quantized models have been downloaded more than a million times on HuggingFace.
- We release VILA-1.5, an efficient visual language model (VLM) that can understand not only images but also videos. VILA-1.5 achieves state-of-the-art accuracy among open-source VLMs on the MMMU dataset. CVPR'24 paper: arxiv.org/pdf/2312.07533 Code: github.com/Efficient-Larg…
- 🌟 New from #NVIDIAResearch, VILA is a vision language model that can reason among multiple images, learn in context, and even understand videos. 🤔 Read our technical deep dive ➡️ nvda.ws/3QyZQzJ. In the past, vision language models have struggled with in-context









