[Academic Seminar] Intrinsic Benefits of Categorical Distributional Loss: Uncertainty-aware Regularized Exploration in Reinforcement Learning

Posted by: pc蛋蛋 | Date posted: 2026-07-01

[Speaker Bio]: Dr. Linglong Kong is a Professor in the Department of Mathematical and Statistical Sciences at the University of Alberta, holding a Canada Research Chair in Statistical Learning and a Canada CIFAR AI Chair. He is a Fellow of the American Statistical Association (ASA) and the Alberta Machine Intelligence Institute (Amii), with over 150 peer-reviewed publications in leading journals and conferences such as AOS, JASA, JRSSB, NeurIPS, ICML, and ICLR. Dr. Kong received the 2025 CRM-SSC Prize for outstanding research in Canada. He serves as Associate Editor for several top journals, including JASA and AOAS, and has held leadership roles within the ASA and the Statistical Society of Canada. Dr. Kong’s research interests include high-dimensional and neuroimaging data analysis, statistical machine learning, robust statistics, quantile regression, trustworthy machine learning, and artificial intelligence for smart health.

[Abstract]: The remarkable empirical performance of distributional reinforcement learning (RL) has drawn increasing attention to its theoretical advantages over classical RL. By decomposing the categorical distributional loss commonly employed in distributional RL, we find that the potential superiority of distributional RL can be attributed to a derived distribution-matching entropy regularization. This less-studied entropy regularization aims to capture additional knowledge of the return distribution beyond its expectation, contributing to an augmented reward signal in policy optimization. In contrast to the vanilla entropy regularization in MaxEnt RL, which explicitly encourages exploration by promoting diverse actions, the novel entropy regularization derived from the categorical distributional loss implicitly updates policies to align the learned policy with (estimated) environmental uncertainty. Finally, extensive experiments verify the significance of this uncertainty-aware regularization to the empirical benefits of distributional RL over classical RL. Our study offers an innovative exploration perspective to explain the intrinsic benefits of distributional learning in RL.

[Time]: Wednesday, July 1, 2026, 15:00

[Venue]: Conference Room 1801, Humanities and Social Sciences Research Building