PhD Dissertation Defense by LOW Siow Meng | From Sparse Feedback to Sequential Decision-Making: Learning Safety Constraints with Weak Supervision


From Sparse Feedback to Sequential Decision-Making: Learning Safety Constraints with Weak Supervision

LOW Siow Meng

PhD Candidate
School of Computing and Information Systems
Singapore Management University
 

Research Area

Dissertation Committee

Research Advisor
Committee Members
External Member
  • GONG Ze, Assistant Professor, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences
 

Date

23 July 2025 (Wednesday)

Time

1:00pm - 2:00pm

Venue

Meeting Room 5.1, Level 5,
School of Computing and Information Systems 1,
Singapore Management University,
80 Stamford Road
Singapore 178902

Please register by 21 July 2025.

We look forward to seeing you at this research seminar.

 

ABOUT THE TALK

Real-world decision-making often involves safety requirements that are implicit or difficult to define explicitly. Traditional reinforcement learning (RL) methods typically assume access to detailed cost functions and predefined safety constraints, which limits their applicability in practice. This dissertation presents a unified framework for learning safety constraints directly from high-level, weak supervision—such as binary feedback on entire trajectories—rather than relying on handcrafted specifications.

The work begins with an efficient planning algorithm for continuous environments with known dynamics, forming the foundation for subsequent developments. The core contributions then focus on algorithms that infer both Markovian and non-Markovian safety constraints from sparse supervision, including settings where the environment dynamics are unknown. Notably, the proposed frameworks enable agents to learn safe behaviours even with limited, coarse-grained trajectory-level feedback. 

Together, these methods bridge the gap between data-driven constraint learning and safe reinforcement learning, paving the way for deploying intelligent agents in environments where safety requirements are dynamic, implicit, or hard to formalize.

 

SPEAKER BIOGRAPHY

Low Siow Meng is a PhD candidate in Computer Science at the SMU School of Computing and Information Systems, where he conducts research under the supervision of Associate Professor Akshat Kumar. He holds a Bachelor’s degree from the National University of Singapore and a Master’s degree from Imperial College London. Prior to joining SMU, he gained software R&D experience at technology companies including IBM and NEC.

His research interests lie at the intersection of reinforcement learning, data-driven constraint learning, and safety in sequential decision-making. His doctoral work focuses on inferring safety constraints from weak supervision and developing safe learning algorithms for complex environments. His research has been presented at leading venues such as AAAI, ICAPS, and an ICLR workshop. 

Beyond research, Siow Meng enjoys running, following strategic board games, exploring new productivity tools, and reading about the intersection of AI and society.