Sihan Yang (杨思涵)

Hi there! I am an incoming PhD student at The Chinese University of Hong Kong, where I am advised by Weiyang Liu. I am currently interning at Tongyi Lab. Previously, I spent a wonderful time at the University of Electronic Science and Technology of China (as an undergraduate) and at Shanghai AI Laboratory.

Email / CV / Scholar / GitHub

Photo credit to my homie Taoran

Research

I am deeply interested in the fundamental problems of multimodal sequence modeling (e.g., efficient attention and position encoding) and in optimizing training and inference efficiency. Recently, I have enjoyed developing simple yet effective algorithms that scale efficiently on modern ML systems. Some papers are highlighted.

^*Equal Contribution ^‡Project Lead ^†Corresponding Author

Orthogonal Model Merging
Sihan Yang, Kexuan Shi, Weiyang Liu
ICML 2026
Homepage | Code | Paper | arXiv

We introduce a geometrically principled framework that shifts the integration of expert models from Euclidean space to the Riemannian manifold of the orthogonal group, effectively maintaining model performance across diverse tasks and mitigating catastrophic forgetting.

MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence
Jingli Lin^*, Runsen Xu^*‡, Shaohao Zhu, Sihan Yang, Peizhou Cao, Yunlong Ran, Miao Hu, Chenming Zhu, Yiman Xie, Yilin Long, Wenbo Hu, Dahua Lin, Tai Wang^†, Jiangmiao Pang^†
arXiv
Homepage | Dataset | Paper | arXiv | Code

MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence
Sihan Yang^*, Runsen Xu^*‡, Yiman Xie, Sizhe Yang, Mo Li, Jingli Lin, Chenming Zhu, Xiaochen Chen, Haodong Duan, Xiangyu Yue, Dahua Lin, Tai Wang^†, Jiangmiao Pang^†
ICLR 2026
Homepage | Dataset | Paper | arXiv | Code

We introduce a challenging, diverse, and comprehensive multi-image spatial reasoning benchmark, manually annotated by six 3D vision experts, which additionally supports thorough evaluation of reasoning processes.

VFlowOpt: A Token Pruning Framework for LMMs with Visual Information Flow-Guided Optimization
Sihan Yang, Runsen Xu, Chenhang Cui, Tai Wang, Dahua Lin, Jiangmiao Pang
ICCV 2025
Paper | arXiv | Code

Improving Alignment in LVLMs with Debiased Self-Judgment
Sihan Yang^*, Chenhang Cui^*, Zihao Zhao, Yiyang Zhou, Weilong Yan, Ying Wei, Huaxiu Yao
EMNLP 2025 Findings
Paper | arXiv | Dataset | Code

Calibrated Self-rewarding Vision Language Models
Yiyang Zhou^*, Zhiyuan Fan^*, Dongjie Cheng^*, Sihan Yang, Zhaorun Chen, Chenhang Cui, Xiyao Wang, Yun Li, Linjun Zhang, Huaxiu Yao
NeurIPS 2024
Paper | arXiv | Code

Miscellanea

Honors and Awards

SenseTime Scholarship (awarded annually to 30 undergraduates in the field of AI from across China)

Tencent Scholarship (sole recipient in the School of Software Engineering, UESTC; 1/718)

The Most Outstanding Students Award of UESTC (top 10 at UESTC)

Scholarship in Honor of Modern Scientists (top 10 at UESTC)

National Scholarship for the 2023, 2024, and 2025 academic years

Academic Service

Reviewer: ICLR, ICML, CVPR, ECCV