Hayden727

🎯

Focusing

Chenchen Hong Hayden727

Progress, not Perfection.

9 followers · 51 following

RedNote
Hangzhou, China
19:20 (UTC +08:00)
https://hayden727.github.io/
https://orcid.org/0009-0003-4986-1098

Hayden727/README.md

Hi, I'm Chenchen Hong 👋

AI Infrastructure · MLSys · Compilers

I build and optimize the systems that make large models fast — from compiler-level kernel work up to distributed inference serving.

🚀 What I work on

Multimodal & LLM Inference Infrastructure (main focus) — performance engineering for multimodal serving on SGLang-omni, alongside LLM serving stacks (SGLang, vLLM): model integration, scheduling, memory efficiency, and throughput/latency optimization.
RL Infrastructure — systems and tooling for reinforcement learning workloads: training/inference orchestration, rollout, and scaling.
Kernel Compiler Optimization — compiler-driven kernel optimization for ML workloads: codegen, graph-level transformations, and automatic kernel generation/tuning (Triton, CUDA) on NVIDIA Hopper (H100) and Blackwell (B200).

🛠️ Tech & Tools

📊 GitHub Stats

📫 Reach me

📧 Email: chongyue.cc@gmail.com
💼 LinkedIn: Chenchen Hong
🐦 X / Twitter: @HaydenCC
✍️ Blog: hayden727.github.io

Feel free to reach out to me on WeChat: hayden-gai.

Pinned

sgl-project/sglang-omni (Public)
SGLang Omni: High-Performance Multi-Stage Pipeline Framework for Omni Models
Python · 522 stars · 217 forks

sgl-project/sglang (Public)
SGLang is a high-performance serving framework for large language models and multimodal models.
Python · 29.5k stars · 6.6k forks

sgl-project/SpecForge (Public)
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
Python · 900 stars · 262 forks

NVIDIA-NeMo/Automodel (Public)
🚀 PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support
Python · 610 stars · 185 forks

ctorch (Public)
C++ · 1 star · 1 fork

Hayden727.github.io (Public)
CSS