About

I am currently an Assistant Professor at the Institute of Automation, Chinese Academy of Sciences (CASIA). I received my Ph.D. in Statistical Machine Learning from the Department of Computing Science, University of Alberta, supervised by Prof. Martin Müller. During my Ph.D., I was a member of the Alberta Machine Intelligence Institute (Amii) and the RLAI Lab. Prior to that, I obtained my M.Sc. from the School of Mathematical Sciences, Peking University, and my B.Sc. from the School of Mathematical Sciences, Beijing Normal University.

Over the years, I have built solid expertise in both algorithm design and large-scale engineering implementation. On the algorithm side, my work spans value-based methods, policy optimization, planning, and representation learning, with publications at top venues including NeurIPS, ICML, ICLR, and AAMAS. On the engineering side, I have extensive experience in distributed training and have successfully trained RL systems on clusters with hundreds of GPUs. Notably, I developed FPDou, which ranked first among 452 bots on the Botzone platform, and built an AlphaZero-based Gomoku agent with asynchronous MPI training.

Living in this exciting era, I am fascinated by the rapid pace of technological advancement and eager to contribute to it. I am keen to explore reinforcement learning and its related frontiers, and I warmly welcome collaborations of any form. Feel free to reach out!

Research Interests

My research centers on decision-making, with an emphasis on deep reinforcement learning (DRL) and planning across model-free, model-based, online, and offline settings. I develop sample-efficient algorithms by improving update rules, exploration, experience replay, and representation learning. My current work covers RL-driven applications in game AI, LLMs, and robotics.

Book

Selected Applications

Full list at Applications

Selected Publications

First-author, co-first-author (*), and corresponding-author (†) papers. Full list at Publications