Reinforcement Learning for Sequential Decision Making with Constraints
LING Jiajing
PhD Candidate
School of Computing and Information Systems
Singapore Management University
Research Area
Dissertation Committee
Research Advisor: Akshat KUMAR, School of Computing and Information Systems, Singapore Management University
Committee Members:
External Member: Arunesh SINHA, Assistant Professor, Department of Management Science & Information Systems, Rutgers Business School, Rutgers University
Date
18 July 2023 (Tuesday)
Time
9:00am - 10:00am
Venue
Meeting room 5.1, Level 5
School of Computing and Information Systems 1,
Singapore Management University,
80 Stamford Road
Singapore 178902
Please register by 17 July 2023
We look forward to seeing you at this research seminar.

About The Talk
Reinforcement learning (RL) is a widely used approach to sequential decision-making problems in which an agent learns from rewards or penalties. However, in problems that involve safety or limited resources, the agent's exploration is restricted by constraints. To model such problems, constrained Markov decision processes (CMDPs) and constrained decentralized partially observable Markov decision processes (constrained Dec-POMDPs) have been proposed for single-agent and multi-agent settings, respectively.

A significant challenge in solving constrained Dec-POMDPs is determining each agent's contribution to the primary objective and to constraint violations. To address this issue, we propose a fictitious-play-based method that uses Lagrangian relaxation to perform credit assignment for both the primary objective and the constraints in large-scale multi-agent systems.

Another major challenge in solving both CMDPs and constrained Dec-POMDPs is sample inefficiency, which stems mainly from finding valid actions that satisfy all constraints and becomes even harder in large state and action spaces. Recent work in RL has attempted to address this issue by incorporating expert domain knowledge into the learning process through neuro-symbolic methods. Treating constraints as domain knowledge, we propose a knowledge compilation framework based on decision diagrams and introduce neuro-symbolic methods that support effective learning in constrained RL. First, we propose a zone-based multi-agent pathfinding (ZBPF) framework motivated by drone-delivery applications, together with a neuro-symbolic method that efficiently solves ZBPF under several domain constraints, such as the simple-path and landmark constraints. Second, we propose a neuro-symbolic method for action-constrained RL in which the action space is discrete and combinatorial.

Empirical results show that our proposed approaches achieve better performance than standard constrained RL algorithms in several real-world applications.
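To give a flavor of the Lagrangian-relaxation idea mentioned above, the following is a minimal, self-contained sketch on a hypothetical single-state constrained problem with two actions. All numbers and names here are invented for illustration; this is not the method presented in the talk, which handles multi-agent credit assignment and large state-action spaces.

```python
import numpy as np

# Hypothetical single-state CMDP: each action a has a reward r(a) and a
# cost c(a); a feasible policy must keep expected cost <= budget.
rewards = np.array([1.0, 0.6])
costs = np.array([1.0, 0.2])
budget = 0.4

def dual(lam):
    """Lagrangian dual: relax the cost constraint with multiplier lam >= 0.

    g(lam) = max_a [ r(a) - lam * c(a) ] + lam * budget
    Every g(lam) upper-bounds the constrained optimum; minimizing over
    lam gives the tightest bound.
    """
    return np.max(rewards - lam * costs) + lam * budget

# Minimize the dual over a grid of multipliers (fine for a toy problem;
# an RL agent would instead update lam with stochastic gradients while
# learning the policy).
grid = np.linspace(0.0, 2.0, 2001)
lam_star = grid[np.argmin([dual(l) for l in grid])]

# Primal check: mixing the two actions so that expected cost exactly
# meets the budget attains the same value, so the relaxation is tight here.
p0 = (budget - costs[1]) / (costs[0] - costs[1])
policy = np.array([p0, 1.0 - p0])
print(lam_star, dual(lam_star), policy @ rewards, policy @ costs)
```

At the optimal multiplier the penalized objective makes the agent indifferent between the cheap and the expensive action, which is exactly what lets a mixed policy spend the cost budget without exceeding it.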
Speaker Biography
Jiajing LING is a Ph.D. candidate in Computer Science at Singapore Management University. He began his Ph.D. studies at SMU in 2018 under the supervision of Prof. Akshat KUMAR. Prior to this, he obtained his Bachelor's degree in Electronics Engineering from Guangzhou University and his Master's degree in Quantitative Finance from Singapore Management University. His research focuses on reinforcement learning, multi-agent systems, and neuro-symbolic AI, in particular neuro-symbolic methods for solving constrained RL problems, with publications at conferences and workshops including ICAPS, AAMAS, ECML/PKDD, and AAAI. He was awarded the SMU Presidential Doctoral Fellowship (2021-2022) for outstanding research. In his free time, he enjoys physical exercise.