
PhD Dissertation Defense by Roman Lok-Ming BELAIRE | Sequential Robustness in Adversarial Reinforcement Learning


 
Sequential Robustness in Adversarial Reinforcement Learning

BELAIRE Roman Lok-Ming

PhD Candidate
School of Computing and Information Systems
Singapore Management University
 


Research Area

  • Artificial Intelligence & Data Science
    • Decision Making & Optimization

Dissertation Committee

Advisor:
Pradeep VARAKANTHAM, Singapore Management University

Members:
 
External Members:
Arunesh SINHA, Assistant Professor, Department of Management Science & Information Systems, Rutgers Business School, Rutgers University
 

Date

19 May 2026 (Tuesday)

Time

10:00am – 11:00am

Venue

Meeting room 5.1, Level 5
School of Computing and Information Systems 1, 
Singapore Management University, 
80 Stamford Road, 
Singapore 178902

Please register by 17 May 2026

We look forward to seeing you at this dissertation defense.

 

ABOUT THE TALK

This thesis establishes a new foundation for adversarial reinforcement learning, motivated by the comparative structure of regret. Existing approaches to robustness typically rely on adversarial training, which often fails to generalize to novel attacks, or worst-case optimization, which provides lower-bound guarantees but tends to produce overly conservative policies. To address these limitations, I propose a framework guided by three key principles. First, robustness should be evaluated at the trajectory level, ensuring stability over long sequences of decisions rather than individual actions. Second, robust agents must be designed with future adversaries in mind, rather than optimized against a fixed perturbation strategy. Finally, principled descriptions of the underlying problem structure are essential: methods that exploit the true structure of adversarial decision-making remain robust as applications evolve, while purely heuristic approaches often prove brittle.

Building on these principles, this dissertation develops scalable regret-based formulations of robustness, analyzes the structural role of partial observability in adversarial reinforcement learning, and demonstrates their effectiveness on both standard benchmarks and real-world applications.

ABOUT THE SPEAKER

Roman Belaire is a Ph.D. candidate at Singapore Management University, advised by Pradeep Varakantham and affiliated with the CARE.ai lab. His research concerns adversarial robustness in reinforcement learning, with broader interests in RL fundamentals, generalization, and AI safety. His recent work focuses on formalizing attacks on LLMs (prompting LLMs to cause harm) and developing robust defenses by applying techniques from robust RL. Outside of research, he enjoys exercising, playing games, eating, and being in the ocean.