Changbin Zhang (张长彬), Ph.D. Candidate
The University of Hong Kong
I'm currently a final-year Ph.D. candidate working with Dr. Yujie Zhong and Prof. Kai Han at The University of Hong Kong (HKU). Before that, I spent wonderful years at Nankai University (NKU), where I was supervised by Prof. Ming-Ming Cheng and received my M.Eng. degree in Computer Science.
I am currently working on research related to RL / Agentic RL, on-policy distillation (OPD), Multi-Agent Systems, and self-evolving agents. I have published 5 first-author and 2 co-first-author papers in CCF-A venues, including 1 ESI Highly Cited Paper (Top 1%) and 1 CVPR Highlight paper (Top 2.8%). My work has received over 2,300 citations on Google Scholar. I also won a 🥈 Silver Medal in the ACM-ICPC Asia Regional Contest. I am currently on the job market, seeking roles in post-training and agents.
iVGR: Internalizing Visually Grounded Reasoning for MLLMs with Reinforcement Learning
Changbin Zhang, Yujie Zhong, Qiang Zhang and Kai Han
ICML, 2026
[arXiv]
[code]
[project page]
[models]
A third paradigm of visual reasoning that internalizes visual grounding into textual CoT via RL, going beyond OpenAI's "think with images" and DeepSeek's "think with visual primitives".
Mr. DETR++: Instructive Multi-Route Training for Detection Transformers with Mixture-of-Experts
Changbin Zhang, Yujie Zhong and Kai Han
Under Review, 2025
[arXiv]
[code]
[project page]
[HuggingFace Demo]
A flexible multi-route MoE training framework that boosts detection transformers to SOTA across object detection, instance and panoptic segmentation.
Mr. DETR: Instructive Multi-Route Training for Detection Transformers
Changbin Zhang, Yujie Zhong and Kai Han
CVPR, 2025
[arXiv]
[code]
[project page]
[HuggingFace Demo]
April 2025: ranked #1 on the COCO 2017 val leaderboard.
v-CLR: View-Consistent Learning for Open-World Instance Segmentation
Changbin Zhang, Jinhong Ni, Yujie Zhong and Kai Han
CVPR, 2025, Highlight (Top 2.8% of submissions)
[arXiv]
[code]
[project page]
[HuggingFace Demo]
View-consistent learning achieves SOTA on open-world instance segmentation.
Representation Compensation Networks for Continual Semantic Segmentation
Changbin Zhang*, Jia-Wen Xiao*, Xialei Liu, Yingcong Chen and Ming-Ming Cheng
CVPR, 2022
[IEEE/CVF]
[arXiv]
[Chinese translation]
[code]
A re-parameterization approach that decouples old and new knowledge, achieving SOTA continual segmentation with zero inference overhead.
Delving Deep into Label Smoothing
Changbin Zhang*, Peng-Tao Jiang*, Qibin Hou, Yunchao Wei, Qi Han, Zhen Li and Ming-Ming Cheng
IEEE Transactions on Image Processing (IEEE TIP), 2021
[IEEE Xplore]
[arXiv]
[Chinese translation]
[code]
An online label smoothing method that mines the model's own prediction statistics to craft more reliable soft labels.
LayerCAM: Exploring Hierarchical Class Activation Maps For Localization
Peng-Tao Jiang*, Changbin Zhang*, Qibin Hou, Ming-Ming Cheng and Yunchao Wei
IEEE Transactions on Image Processing (IEEE TIP), 2021
[IEEE Xplore]
[Chinese translation]
[code]
[ESI Highly Cited Paper (Top 1%)]
The first method to extract class activation maps from ANY network layer, revealing hierarchical localization.
Deep Hough Transform for Semantic Line Detection
Kai Zhao*, Qi Han*, Changbin Zhang, Jun Xu and Ming-Ming Cheng
IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), 2021
[IEEE Xplore]
[arXiv]
[Chinese translation]
[code]
[ESI Highly Cited Paper (Top 1%)]