PhD Dissertation Proposal by Arambam James SINGH | Temporal Abstraction in Collective Multiagent Reinforcement Learning


Temporal Abstraction in Collective Multiagent Reinforcement Learning

Arambam James SINGH

PhD Candidate

School of Information Systems

Singapore Management University


Research Area

Intelligent Systems & Optimization

Dissertation Committee

Chairman

Assistant Prof. Akshat KUMAR

Committee Members

Date

December 6, 2019 (Friday)

Time

3.00pm - 4.00pm

Venue

Meeting Room 5.1, Level 5,

School of Information Systems,

Singapore Management University

80 Stamford Road

Singapore 178902

We look forward to seeing you at this research seminar.

About The Talk

Growth in sectors such as healthcare, finance, and transportation involves the digitisation of industrial processes. This rapid growth demands next-generation artificial intelligence (AI) systems in which not just a single agent operates in a closed system, but multiple agents operate at scale, in some instances with humans as agents in the system. Multiagent reinforcement learning (MARL) is the field of study that addresses problems in such multiagent systems.

In this dissertation, I propose algorithms for decision making in large-scale multiagent systems with temporal abstraction, i.e., actions that take a variable amount of time to execute. As a motivating domain, I address the maritime traffic management problem in busy port waters.
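As a rough illustration of what variable-duration actions imply for evaluating a trajectory, the sketch below discounts each reward by the total time elapsed before its action started, rather than by the number of decisions taken. The function name `discounted_return`, the `(reward, duration)` representation, and the discount factor are illustrative assumptions and are not taken from the dissertation.

```python
def discounted_return(steps, gamma=0.9):
    """Discounted return for actions that last a variable number of time steps.

    steps: list of (reward, duration) pairs (hypothetical representation).
    Each reward is assumed to be received when its action starts, and the
    next action starts `duration` time steps later.
    """
    total = 0.0
    elapsed = 0  # time steps elapsed before the current action starts
    for reward, duration in steps:
        total += (gamma ** elapsed) * reward
        elapsed += duration
    return total


# Two actions: the first lasts 2 time steps, the second lasts 3.
example = discounted_return([(1.0, 2), (1.0, 3)], gamma=0.5)
```

With single-step actions this reduces to the usual discounted sum; longer actions push later rewards further into the future.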

Increasing global maritime traffic, coupled with rapid digitisation and automation in shipping, mandates the development of next-generation maritime traffic management systems to mitigate congestion, increase safety of navigation, and avoid collisions in busy and geographically constrained ports. To achieve these objectives, I model maritime traffic as a large-scale multiagent system with individual vessels as agents. First, I propose a simulation model for the maritime traffic system that works at an aggregate level and incorporates realistic domain constraints such as uncertain and asynchronous movement of vessels. Then, based on this aggregate information, I propose a policy gradient method for a deterministic policy that provides navigation guidance to vessels. As the approach is decentralised, I also propose an individual vessel-based value function to tackle the multiagent credit assignment problem.

I also develop a hierarchical reinforcement learning approach in which vessels first select a high-level action based on the underlying traffic flow, and then select a low-level action that determines their future speed. I exploit the nature of collective interactions among agents to develop a policy gradient approach that can scale up to large real-world problems. I also develop an effective multiagent credit assignment scheme based on the high-level action that significantly improves the convergence of the policy gradient. Extensive empirical results on synthetic and real-world data from one of the busiest ports in the world show that the approach consistently performs significantly better than the previous best approach, providing about 30%-40% improvement in solution quality in several settings.
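The two-level decision described above can be pictured with a small sketch: a high-level choice driven by traffic conditions, followed by a low-level speed choice conditioned on it. Everything concrete here is a hypothetical stand-in chosen for illustration: the action names `yield` and `proceed`, the speed ranges, the use of a single density value in [0, 1], and the hand-written selection rule. The dissertation learns its policies with policy gradient methods, which this sketch does not attempt to reproduce.

```python
import random

# Hypothetical high-level actions and the speed range (in knots) each allows.
SPEED_RANGES = {
    "yield": (4.0, 8.0),
    "proceed": (8.0, 14.0),
}


def select_high_level(traffic_density, rng):
    """Pick a high-level action; denser traffic makes 'yield' more likely.

    traffic_density is assumed to be a value in [0, 1].
    """
    p_yield = min(1.0, max(0.0, traffic_density))
    return "yield" if rng.random() < p_yield else "proceed"


def select_speed(high_level_action, rng):
    """Pick a low-level action (a speed) within the high-level action's range."""
    low, high = SPEED_RANGES[high_level_action]
    return rng.uniform(low, high)


def act(traffic_density, rng):
    """Two-level decision: high-level action first, then the speed."""
    high_level_action = select_high_level(traffic_density, rng)
    return high_level_action, select_speed(high_level_action, rng)


rng = random.Random(0)
congested = act(1.0, rng)   # fully congested: always 'yield'
open_water = act(0.0, rng)  # no congestion: always 'proceed'
```

Conditioning the speed on the high-level action keeps the low-level choice small, and the high-level action also gives a natural unit for assigning credit to individual vessels.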

Speaker Biography

Arambam James SINGH is a PhD candidate at the School of Information Systems, Singapore Management University, advised by Assistant Prof. Akshat Kumar and Prof. Hoong Chuin Lau. His research focuses on deep reinforcement learning and optimisation.
