"The best way to predict the future is to invent it."

— Alan Kay

Hong Chenchen

洪晨辰

LLM Infra @ RedNote · Hangzhou, China

LLM Infra engineer focused on inference acceleration, reinforcement learning, and diffusion LLMs. Core committer of SGLang-Omni and contributor to SGLang.

I work at the intersection of machine learning systems and large-scale model serving. My focus areas include inference for multimodal and language models, and reinforcement learning. Currently at RedNote, working on systems that make large-scale AI workloads run faster.

LLM Inference · Inference Acceleration · SGLang · Diffusion LLMs · Reinforcement Learning · CUDA · MLIR · LLVM · PyTorch · Performance Engineering · ML Systems
Tiling-Aware Vectorization Framework for Perfect Loop Nests in MLIR
ICA3PP 2025 · CCF-C
2026.06.16 FDFO: First Done, First Out — Rethinking dLLM Inference Scheduling
2026.03.10 CiteBot: Automating Academic Citations with LLM + NLP Fusion
2026.02.20 From PyTorch to MLIR: Building a TorchDynamo-Based Compiler Frontend
2026.02.05 Cost-Model-Driven Tiling in MLIR: Automating Vectorization Decisions
2026.01.20 Building a Production MLIR Compiler: Architecture and Design Decisions