PhD Dissertation Defense by JOE Waldy | Reinforcement Learning Approach to Coordinate Real-World Multi-Agent Dynamic Routing and Scheduling

Reinforcement Learning Approach to Coordinate Real-World Multi-Agent Dynamic Routing and Scheduling

JOE Waldy

PhD Candidate
School of Computing and Information Systems
Singapore Management University

Research Area

Artificial Intelligence & Data Science
- Intelligent Systems & Optimisation

Dissertation Committee

Research Advisor

Prof. LAU Hoong Chuin

Committee Members

External Member

Arunesh SINHA, Assistant Professor, Department of Management Science & Information Systems, Rutgers Business School, Rutgers University

Date

23 November 2022 (Wednesday)

Time

10:00am - 11:00am

Venue

Meeting room 5.1, Level 5
School of Computing and Information Systems 1,
Singapore Management University,
80 Stamford Road
Singapore 178902

We look forward to seeing you at this research seminar.

About The Talk

In this dissertation, we study new variants of routing and scheduling problems motivated by real-world problems from the urban logistics and law enforcement domains. In particular, we focus on two key aspects: the dynamic aspect and the multi-agent aspect. While routing problems such as the Vehicle Routing Problem (VRP) are well studied in the Operations Research (OR) community, in real-world route planning today, initial routes and schedules may be disrupted by dynamically occurring events. In addition, routing and scheduling plans cannot be made in silos because of the presence of other agents, which may be independent and self-interested.

This dissertation discusses and proposes new methodologies that incorporate relevant techniques from the field of AI, more precisely Reinforcement Learning (RL) and Multi-Agent Systems (MAS), to supplement and complement classical OR techniques in solving dynamic and multi-agent variants of routing and scheduling problems. This dissertation makes three main contributions. Firstly, to address the dynamic aspect of the routing and scheduling problem, we propose an RL-based approach that combines Value Function Approximation (VFA) and a planning heuristic to learn assignment/dispatch and rerouting/rescheduling policies jointly, without the need to decompose the problem or action into multiple stages. Secondly, to address the multi-agent aspect of the routing and scheduling problem, we formulate the problem as a strategic game and propose a scalable, decentralized, coordinated planning approach based on iterative best response. Lastly, to address both the dynamic and multi-agent aspects of the problem, we present a pioneering effort on a cooperative Multi-Agent RL (MARL) approach that solves the multi-agent dynamic routing and scheduling problem directly, without any decomposition step. This contribution builds upon our two earlier contributions by extending the proposed VFA method to the multi-agent setting and incorporating the iterative best response procedure as both a decentralized optimization heuristic and an explicit coordination mechanism.

Speaker Biography

Joe Waldy is a PhD candidate in Computer Science and is advised by Prof. Lau Hoong Chuin. His research focuses on the intersection of AI and OR, specifically on how both fields can work together to solve real-world multi-agent dynamic routing and scheduling problems. Prior to joining SMU, Waldy graduated with a Bachelor’s degree in Industrial and Systems Engineering from the National University of Singapore and spent 7 years working in the government sector in various roles spanning IT, Operations Research and Data Science.
