
PhD Dissertation Defense by LU Yuxiao | Safety-Constrained Learning for Intelligent Systems: From Risk-Aware Planning to Safety-Aligned Large Language Models


 

Safety-Constrained Learning for Intelligent Systems:
From Risk-Aware Planning to Safety-Aligned Large Language Models

LU Yuxiao

PhD Candidate
School of Computing and Information Systems
Singapore Management University
 


Research Area

Artificial intelligence, reinforcement learning, and large language model safety

Dissertation Committee

Research Advisor
  • Pradeep VARAKANTHAM, Professor, School of Computing and Information Systems, Singapore Management University
Co-Research Advisor
  • Arunesh SINHA, Assistant Professor, Department of Management Science & Information Systems, Rutgers Business School, Rutgers University
Committee Members
External Member
  • Thanh Hong NGUYEN, Associate Professor, Department of Computer Science, University of Oregon
 

Date

15 May 2026 (Friday)

Time

9:00am - 10:00am

Venue

Meeting Room 5.1, Level 5,
School of Computing and Information Systems 1,
Singapore Management University,
80 Stamford Road
Singapore 178902

Please register by 13 May 2026.

We look forward to seeing you at this research seminar.

 

ABOUT THE TALK

This dissertation studies how to build intelligent systems that are safe without sacrificing their usefulness. It addresses this challenge in two settings: long-horizon reinforcement learning and large language model alignment. For sequential decision-making, the thesis proposes a constrained hierarchical reinforcement learning framework that treats safety as an explicit, adjustable planning objective. By combining high-level constrained subgoal planning with low-level goal-conditioned control, the method improves task success, constraint satisfaction, and re-planning efficiency in safety-critical environments. For language alignment, the thesis develops two complementary methods for improving LLM safety. The first is a semantics-guided supervised fine-tuning approach that uses only harmful data and encourages models to move away from unsafe responses while preserving general capability. The second, Discernment via Contrastive Refinement (DCR), reduces over-refusal by separating the representations of truly toxic and seemingly toxic prompts before safety alignment. Together, these studies show that safety can be enforced through principled constraints and representation-level interventions, leading to AI systems that are both reliable and useful.
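The abstract describes DCR only at a high level. As a rough illustration of what "separating the representations of truly toxic and seemingly toxic prompts" could look like, the sketch below implements a simple margin-based separation term on pooled hidden states. The function name, the mean pooling, and the hinge-on-cosine loss form are all assumptions made for illustration, not the dissertation's actual method.

# Minimal sketch of a representation-separation objective, loosely inspired
# by the DCR idea in the abstract. Everything here (names, mean pooling,
# the hinge-on-cosine loss) is an illustrative assumption.
import torch
import torch.nn.functional as F

def separation_loss(truly_toxic: torch.Tensor,
                    seemingly_toxic: torch.Tensor,
                    margin: float = 1.0) -> torch.Tensor:
    """Encourage the two prompt groups to sit apart in embedding space.

    truly_toxic:     (n, d) hidden states of genuinely harmful prompts
    seemingly_toxic: (m, d) hidden states of benign prompts that merely look harmful
    """
    mu_toxic = F.normalize(truly_toxic.mean(dim=0), dim=0)
    mu_benign = F.normalize(seemingly_toxic.mean(dim=0), dim=0)
    # Hinge on cosine similarity: the loss is zero once the two cluster
    # centroids are at least `margin` apart in cosine space.
    cos = torch.dot(mu_toxic, mu_benign)
    return F.relu(cos - (1.0 - margin))

# Toy usage with random stand-in embeddings.
harmful = torch.randn(8, 16)
benign = torch.randn(8, 16)
print(separation_loss(harmful, benign).item())

Minimizing such a term alongside the usual alignment objective would, in principle, make it easier for a safety-tuned model to refuse genuinely harmful prompts without over-refusing benign ones.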

 

SPEAKER BIOGRAPHY

Yuxiao LU is a final-year PhD candidate at Singapore Management University, supervised by Prof. Pradeep Varakantham and Prof. Arunesh Sinha. His research lies at the intersection of artificial intelligence, reinforcement learning, and large language model safety, with a focus on developing intelligent systems that are both safe and useful under practical constraints. Before joining SMU, Yuxiao received his B.Sc. from Shandong University. His work has been published in leading AI and computer science venues, including AAAI, ICLR, and JCSS.