Hong Chenchen

"The best way to predict the future is to invent it."

— Alan Kay

About

I work at the intersection of machine learning systems and large-scale model serving. My focus areas are inference for multimodal and language models, and reinforcement learning. I am currently at RedNote, working on systems that make large-scale AI workloads run faster.

Interests

LLM Inference, Inference Acceleration, SGLang, Diffusion LLMs, Reinforcement Learning, CUDA, MLIR, LLVM, PyTorch, Performance Engineering, ML Systems

Publications

Tiling-Aware Vectorization Framework for Perfect Loop Nests in MLIR

ICA3PP 2025 (CCF-C)

Projects

A production MLIR compiler targeting FT-Matrix with cost-model-driven optimization, PyTorch frontend, and a comprehensive benchmark framework achieving ~57x kernel speedup

An intelligent LaTeX citation assistant that automates reference discovery and BibTeX generation using LLM + NLP fusion

Writing

2026.06.16 FDFO: First Done, First Out — Rethinking dLLM Inference Scheduling

2026.03.10 CiteBot: Automating Academic Citations with LLM + NLP Fusion

2026.02.20 From PyTorch to MLIR: Building a TorchDynamo-Based Compiler Frontend

2026.02.05 Cost-Model-Driven Tiling in MLIR: Automating Vectorization Decisions

2026.01.20 Building a Production MLIR Compiler: Architecture and Design Decisions