Haotian Tang (@haotiant1998) / X

109 posts

Ph.D. @MITEECS, B.Eng. @sjtu1896.

Joined September 2021

Haotian Tang
@haotiant1998
Dec 31, 2024
Personal update: I am excited to share that I will join @GoogleDeepMind next week after defending my PhD thesis @MITEECS earlier last month. I will be working on generative models that simulate the physical world. Looking forward to the new journey ahead in 2025!
Haotian Tang
@haotiant1998
Oct 15, 2024
🚀 We're thrilled to introduce HART, an efficient AR model that generates stunning 1024x1024 images! 🎨✨ HART delivers: ⚡️ 4.5-7.7x higher throughput 🔋 6.9-13.4x less compute 🔥 top-notch FID & CLIP scores, rivaling diffusion models in quality! Code: tinyurl.com/nkvpnhyk
Haotian Tang
@haotiant1998
Jun 6, 2024
Excited to share my #MLSys 2024 best paper 🏆 presentation on AWQ. AWQ democratizes edge LLM deployment 💻 and has been downloaded over 1 million times on Huggingface 🙌!
Haotian Tang
@haotiant1998
Jun 6, 2024
Replying to @haotiant1998
AWQ website: hanlab.mit.edu/projects/awq Paper: arxiv.org/abs/2306.00978 Code: github.com/mit-han-lab/ll… Joint work with @jilin_14, @jmtang42, @Shang_mit, Wei-Ming Chen, Wei-Chen Wang, @Guangxuan_Xiao, Xingyu Dang, Prof. @gan_chuang, Prof. @songhan_mit.
arxiv.org
AWQ: Activation-aware Weight Quantization for LLM Compression and...
Large language models (LLMs) have transformed numerous AI applications. On-device LLM is becoming increasingly important: running LLMs locally on edge devices can reduce the cloud computing cost...
Haotian Tang
@haotiant1998
Oct 15, 2024
Replying to @haotiant1998
📄 Paper: arxiv.org/abs/2410.10812 🌐 Project: hanlab.mit.edu/projects/hart 🖥 Demo: hart.mit.edu 💻 Code: github.com/mit-han-lab/ha…
arxiv.org
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
We introduce Hybrid Autoregressive Transformer (HART), an autoregressive (AR) visual generation model capable of directly generating 1024x1024 images, rivaling diffusion models in image generation...
Haotian Tang
@haotiant1998
May 8, 2024
🔥 Come try out QServe! TRT-LLM efficiency ⚡️ + PyTorch flexibility 😄: your turnkey LLM serving solution 🔑
Shang Yang
@Shang_mit
May 8, 2024
🔥🎉Thrilled to introduce QServe, our latest breakthrough in efficient LLM serving with W4-A8-KV4 quantization. 🚀⚡1.2-3.5x higher throughput over TensorRT-LLM. 💵 Matches TensorRT-LLM’s A100 throughput with 3x cheaper L40S GPUs. 👐 Code: github.com/mit-han-lab/qs… (1/4)
Haotian Tang
@haotiant1998
Oct 15, 2024
Replying to @haotiant1998
✨ How it works: We decompose continuous latents into two parts: 🔹 Discrete tokens for the big picture, modeled by a scalable-resolution AR transformer 🔸 Residual tokens for image details, handled by a lightweight diffusion module (37M parameters, 8 sampling steps)
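The decomposition described in the post above can be illustrated with a toy sketch. This is not HART's code: the codebook, the latent values, and the names `nearest_code`, `decompose`, and `reconstruct` are all made up for illustration, and a plain nearest-neighbor lookup stands in for the real tokenizer. It only shows the idea that a continuous latent splits into a discrete token plus a residual, and that the two parts together recover the original. In HART, per the post, the discrete part is modeled by the AR transformer and the residual part by the lightweight diffusion module; neither model appears here.

```python
# Toy illustration only; not HART's implementation.
# CODEBOOK and all function names below are hypothetical.

CODEBOOK = [
    (0.0, 0.0),
    (1.0, 0.0),
    (0.0, 1.0),
    (1.0, 1.0),
]


def nearest_code(latent):
    """Return the index of the codebook entry closest to `latent` (squared L2)."""
    def sq_dist(code):
        return sum((a - b) ** 2 for a, b in zip(latent, code))

    return min(range(len(CODEBOOK)), key=lambda i: sq_dist(CODEBOOK[i]))


def decompose(latent):
    """Split a continuous latent into (discrete token index, residual vector)."""
    idx = nearest_code(latent)
    residual = tuple(a - b for a, b in zip(latent, CODEBOOK[idx]))
    return idx, residual


def reconstruct(idx, residual):
    """Add the residual back onto the selected codebook entry."""
    return tuple(c + r for c, r in zip(CODEBOOK[idx], residual))


latent = (0.75, 0.25)              # made-up continuous latent
idx, residual = decompose(latent)  # discrete part + leftover detail
restored = reconstruct(idx, residual)
```

Here the discrete index carries the coarse content and the residual carries whatever the codebook could not represent, which is the split the post describes at a much larger scale.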
Haotian Tang
@haotiant1998
Jan 22, 2025
What an achievement! Congrats to the team!
Demis Hassabis
@demishassabis
Jan 21, 2025
Our latest update to our Gemini 2.0 Flash Thinking model (available here: goo.gle/4jsCqZC) scores 73.3% on AIME (math) & 74.2% on GPQA Diamond (science) benchmarks. Thanks for all your feedback, this represents super fast progress from our first release just this past
Haotian Tang
@haotiant1998
Jan 5, 2025
Replying to @xiuyu_l @GoogleDeepMind and @MITEECS
Thank you, Xiuyu! See you in the Bay Area!
Haotian Tang
@haotiant1998
Jan 14, 2025
Replying to @phillip_lippe @GoogleDeepMind and @m__dehghani
Excited to work together, Phillip!
Haotian Tang
@haotiant1998
Jan 5, 2025
Replying to @daoluc_ @GoogleDeepMind and @MITEECS
Thank you Luc! It’s my great pleasure to work as the TA for 6.5940!
Haotian Tang
@haotiant1998
Dec 31, 2024
Replying to @yule_gan @GoogleDeepMind and @MITEECS
Thank you, Yulu!
Haotian Tang
@haotiant1998
Jan 5, 2025
Replying to @vernons @GoogleDeepMind and @MITEECS
Thank you Vernon!
Haotian Tang
@haotiant1998
Mar 25, 2025
Replying to @TianweiY @MIT and 2 others
Congrats, Tianwei!