Constrained Reinforcement Learning: From Single-Agent Safety to Multi-Agent Coordination

JIANG Hao
PhD Candidate
School of Computing and Information Systems
Singapore Management University

Research Area
Dissertation Committee
Research Advisor
Committee Members
External Member
- Arunesh SINHA, Assistant Professor, Department of Management Science & Information Systems, Rutgers Business School, Rutgers University

Date: 19 January 2026 (Monday)
Time: 10:00am - 11:00am
Venue: Meeting room 5.1, Level 5, School of Computing and Information Systems 1, Singapore Management University, 80 Stamford Road, Singapore 178902

Please register by 15 January 2026. We look forward to seeing you at this research seminar.

ABOUT THE TALK

Real-world decision-making systems such as autonomous driving and large-scale ride-pooling must operate under strict safety and resource constraints, yet traditional reinforcement learning (RL) methods often cannot guarantee that these constraints are satisfied. This dissertation studies Constrained Reinforcement Learning (CRL) from both single-agent and multi-agent perspectives. In the single-agent setting, it introduces a reward penalty framework that augments the state with cumulative cost and penalizes only constraint-violating trajectories, unifying expected, chance, and CVaR constraints and enabling safe variants of standard RL algorithms such as DQN and SAC.

In the multi-agent setting, motivated by on-demand ride-pooling, the thesis proposes Hierarchical Value Decomposition (HIVES) to model large-scale agent interactions. Building on this framework, it further develops extensions that incorporate flexible pickup and drop-off (FlexiPool) and jointly optimize matching and pricing using reinforcement learning. Together, these contributions provide a unified framework for safe and scalable reinforcement learning in complex, constrained environments, with particular relevance to real-world mobility systems.

SPEAKER BIOGRAPHY

JIANG Hao is a PhD candidate in Computer Science at Singapore Management University. His doctoral research focuses on reinforcement learning and constrained decision-making, with particular emphasis on safety and scalability in sequential decision problems. During his PhD, he has worked on both theoretical and applied aspects of constrained reinforcement learning, including algorithm design, large-scale simulation, and empirical evaluation on real-world datasets. His research spans single-agent safety as well as multi-agent coordination in urban mobility systems.