PhD Dissertation Proposal by LOW Siow Meng | Safe Decision Making using Data-Driven Safety Model

Safe Decision Making using Data-Driven Safety Model

LOW Siow Meng

PhD Candidate
School of Computing and Information Systems
Singapore Management University

Research Area

Artificial Intelligence & Data Science
- Decision Making & Optimization

Dissertation Committee

Research Advisor

Associate Prof Akshat KUMAR

Committee Members

Date

14 November 2023 (Tuesday)

Time

3:30pm - 4:30pm

Venue

Meeting room 4.4, Level 4
School of Computing and Information Systems 1,
Singapore Management University,
80 Stamford Road
Singapore 178902

Please register by 13 November 2023.

We look forward to seeing you at this research seminar.

About The Talk

In decision-making problems, it is important for a smart agent to execute a safe sequence of actions while maximizing total return. Because the agent interacts directly with the environment, an unsafe sequence of actions can have adverse impacts on the environment or even on the agent's physical self. The research literature on safe decision making has mainly focused on known safety functions composed of Markovian costs that can be specified mathematically. This seminar discusses our work on generalizing to problem settings where (1) the state representation of a Markov Decision Process (MDP) lacks sufficient fidelity to model the true safety cost as Markovian, or (2) the exact mathematical form of the true safety cost function is unknown or not easily specifiable. The seminar will introduce our proposed methods for facilitating safe decision making in a sample-efficient manner. The complex and potentially non-Markovian cost is learned offline from safety-labelled data, and this pretrained safety model is then incorporated into our planning algorithms to facilitate learning of a safe yet locally optimal policy. To generalize to safe reinforcement learning (RL) problems where sample collection is expensive, we will also discuss an off-policy algorithm that balances total return maximization against non-Markovian safety cost minimization.

Speaker Biography

LOW Siow Meng is a PhD candidate in Computer Science at the SMU School of Computing and Information Systems, supervised by Prof. Akshat KUMAR. His research focuses on safe reinforcement learning and planning. Prior to starting his academic career, he worked as a Data Scientist and Software Consultant at a number of global technology MNCs.
