Pre-Conference Talk by NGUYEN Duc Thien | Policy Gradient With Value Function Approximation For Collective Multiagent Planning

Policy Gradient With Value Function Approximation For Collective Multiagent Planning

Speaker:

NGUYEN Duc Thien
PhD Candidate
School of Information Systems
Singapore Management University

Date:

November 24, 2017, Friday

Time:

2:00pm - 3:00pm

Venue:

Meeting Room 5.1, Level 5
School of Information Systems
Singapore Management University
80 Stamford Road
Singapore 178902

We look forward to seeing you at this research seminar.

About the Talk

Decentralized (PO)MDPs provide an expressive framework for sequential decision making in a multiagent system. Given their computational complexity, recent research has focused on tractable yet practical subclasses of Dec-POMDPs. We address one such subclass, called CDec-POMDP, in which the collective behavior of a population of agents affects the joint reward and the environment dynamics. Our main contribution is an actor-critic (AC) reinforcement learning method for optimizing CDec-POMDP policies. Vanilla AC converges slowly on larger problems. To address this, we show how a particular decomposition of the approximate action-value function over agents leads to effective updates, and we also derive a new way to train the critic based on local reward signals. Comparisons on a synthetic benchmark and a real-world taxi fleet optimization problem show that our new AC approach provides better quality solutions than previous best approaches.

This is a pre-conference talk for Neural Information Processing Systems (NIPS 2017).
 

About the Speaker

NGUYEN Duc Thien is a fourth-year PhD candidate in Information Systems. Since 2014, he has been working under the supervision of Professor Lau Hoong Chuin and Assistant Professor Akshat Kumar on his PhD thesis topic, "Collective Multi-agent Planning and Inference", which concerns finding agent policies in a (large) population. Before joining SMU as a PhD student, he received his Master's degree in Information Systems from SMU in 2013 and his Bachelor's degree in Mathematics from Vietnam National University in 2010.