Pre-Conference Talk by LOW Siow Meng | Safe MDP Planning by Learning Temporal Patterns of Undesirable Trajectories

Safe MDP Planning by Learning Temporal Patterns of Undesirable Trajectories

Speaker(s):
LOW Siow Meng
PhD Candidate
School of Computing and Information Systems
Singapore Management University

Date:
4 April 2023, Tuesday

Time:
11:15am – 12:00pm

Venue:
Meeting Room 5.1, Level 5
School of Computing & Information Systems 1
Singapore Management University
80 Stamford Road Singapore 178902

Please register by 3 April 2023.

We look forward to seeing you at this research seminar.

About the Talk

In safe MDP planning, a cost function defined on the current state and action is often used to specify safety requirements. In real-world settings, however, the state representation may lack sufficient fidelity to express such safety constraints, and operating on an incomplete model can produce unintended negative side effects (NSEs). To address these challenges, our proposed method associates safety signals with state-action trajectories and requires only categorical safety labels for different trajectories. A supervised learning model, which learns the non-Markovian safety patterns present in the trajectory data, is then used to help the agent learn safe behaviours. We present empirical results on a variety of discrete and continuous domains, demonstrating that our proposed approach (1) is applicable to both Markovian and non-Markovian safety constraints, (2) scales to continuous domains, and (3) optimizes total return while satisfying complex safety constraints.
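
As a toy illustration of why trajectory-level labels can express constraints that a per-step cost on (state, action) cannot, consider the sketch below. It is not the method presented in the talk: the state names, the "wet then carpet" rule, and the function `label_trajectory` are all invented for illustration. Two trajectories contain exactly the same state-action pairs in a different order, so any additive per-step cost assigns them the same total, yet only one of them matches the undesirable temporal pattern.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (state, action); hypothetical toy representation


def label_trajectory(trajectory: List[Step]) -> str:
    """Hypothetical labeler: a trajectory is 'unsafe' if the agent
    visits 'wet' and only afterwards visits 'carpet'."""
    seen_wet = False
    for state, _action in trajectory:
        if state == "wet":
            seen_wet = True
        elif state == "carpet" and seen_wet:
            return "unsafe"
    return "safe"


# Same state-action pairs, different order.
traj_a = [("hall", "move"), ("wet", "move"), ("carpet", "move")]
traj_b = [("hall", "move"), ("carpet", "move"), ("wet", "move")]

# Identical multisets of steps, so any additive per-step cost ties.
same_steps = sorted(traj_a) == sorted(traj_b)

labels = [label_trajectory(traj_a), label_trajectory(traj_b)]
print(same_steps, labels)
```

In this toy setting the label depends on the order of visits, which is the kind of non-Markovian pattern a model trained on labelled trajectories would need to pick up.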

This is a Pre-Conference talk for The 33rd International Conference on Automated Planning and Scheduling (ICAPS 2023).

About the Speaker

LOW Siow Meng is a PhD candidate in Computer Science at the SMU School of Computing and Information Systems, supervised by Prof. Akshat KUMAR. His research focuses on safe model-based reinforcement learning and planning. Prior to starting his academic career, he worked as a Data Scientist and Software Consultant at a number of global technology MNCs.