Pre-conference talk by LI Wenjun and Sidney TIO Xi Rong
DATE: 5 July 2023, Wednesday
TIME: 3:00pm - 3:40pm
VENUE: Meeting room 4.4, Level 4,
School of Computing and Information Systems 1,
Singapore Management University,
80 Stamford Road,
Singapore 178902

Please register by 4 July 2023
This session consists of two talks of approximately 20 minutes each.
Both are pre-conference talks for Environment Generation for Generalizable Robots (EGG) at Robotics: Science and Systems (RSS) 2023.
About the Talks
Talk #1: Generalization through Diversity: Improving Unsupervised Environment Design
by LI Wenjun, PhD Candidate
Agent decision making using Reinforcement Learning (RL) heavily relies on either a model or simulator of the environment (e.g., moving in an 8x8 maze with three rooms, playing Chess on an 8x8 board). Due to this dependence, small changes in the environment (e.g., positions of obstacles in the maze, size of the board) can severely affect the effectiveness of the policy learned by the agent. To that end, existing work has proposed training RL agents on an adaptive curriculum of environments (generated automatically) to improve performance on out-of-distribution (OOD) test scenarios. Specifically, existing research has employed the potential for the agent to learn in an environment (captured using Generalized Advantage Estimation, GAE) as the key factor to select the next environment(s) to train the agent. However, such a mechanism can select similar environments (with a high potential to learn), thereby making agent training redundant on all but one of those environments. To address this, we provide a principled approach to adaptively identify diverse environments based on a novel distance measure relevant to environment design. We empirically demonstrate the versatility and effectiveness of our method in comparison to multiple leading approaches for unsupervised environment design on three distinct benchmark problems used in the literature.
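The selection idea in the abstract (score environments by learning potential, but avoid picking near-duplicates) can be illustrated with a minimal greedy sketch. Everything here is an illustrative assumption: the scoring values stand in for a GAE-based learning-potential estimate, the Euclidean feature distance stands in for the paper's novel distance measure, and the trade-off weight `alpha` is not from the work itself.

```python
import numpy as np

def select_diverse_envs(scores, features, k, alpha=0.5):
    """Greedily pick k environments, trading off learning potential
    (`scores`, a stand-in for a GAE-based estimate) against diversity
    (distance in an assumed per-environment feature space `features`).

    Illustrative sketch only; the actual distance measure and scoring
    are defined in the paper.
    """
    scores = np.asarray(scores, dtype=float)
    features = np.asarray(features, dtype=float)
    selected = [int(np.argmax(scores))]  # seed with the highest-potential env
    while len(selected) < k:
        best, best_val = None, -np.inf
        for i in range(len(scores)):
            if i in selected:
                continue
            # distance to the nearest already-selected environment
            d = min(np.linalg.norm(features[i] - features[j]) for j in selected)
            val = alpha * scores[i] + (1 - alpha) * d
            if val > best_val:
                best, best_val = i, val
        selected.append(best)
    return selected
```

With this kind of rule, an environment that is a near-copy of an already-selected one loses out to a slightly lower-scoring but distant environment, which is exactly the redundancy the abstract describes.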
Talk #2: Transferable Curricula Through Difficulty-Conditioned Generators
by Sidney TIO Xi Rong, PhD Candidate
Advancements in reinforcement learning (RL) have demonstrated superhuman performance in complex tasks such as StarCraft, Go, and Chess. However, knowledge transfer from artificial “experts” to humans remains a significant challenge. A promising avenue for such transfer is the use of curricula. Recent methods in curriculum generation focus on training RL agents efficiently, yet such methods rely on surrogate measures to track student progress and are not suited for training robots in the real world (or, more ambitiously, humans). In this paper, we introduce a method named Parameterized Environment Response Model (PERM) that shows promising results in training RL agents in parameterized environments. Inspired by Item Response Theory, PERM seeks to model the difficulty of environments and the ability of RL agents directly. Given that RL agents and humans are trained more efficiently within the “zone of proximal development”, our method generates a curriculum by matching the difficulty of an environment to the current ability of the student. In addition, PERM can be trained offline and does not employ non-stationary measures of student ability, making it suitable for transfer between students. We demonstrate PERM’s ability to represent the environment parameter space, and training RL agents with PERM produces strong performance in deterministic environments. Lastly, we show that our method is transferable between students without any sacrifice in training quality.
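As intuition for the Item Response Theory component, here is a minimal Rasch-style (one-parameter logistic IRT) sketch of matching environment difficulty to student ability. The function names, the fixed logistic form, and the target success rate are illustrative assumptions; PERM models difficulty and ability directly from data rather than with this hand-set formula.

```python
import math

def success_prob(ability, difficulty):
    """Rasch-style success probability: the chance a student of a given
    ability solves an environment of a given difficulty."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def next_env(ability, difficulties, target=0.7):
    """Pick the environment whose predicted success rate is closest to a
    target chosen to sit in the 'zone of proximal development'.
    The target value 0.7 is an assumed knob, not a value from the paper."""
    return min(range(len(difficulties)),
               key=lambda i: abs(success_prob(ability, difficulties[i]) - target))
```

Under this rule, an environment the student would almost surely solve and one they would almost surely fail are both skipped in favor of one that is challenging but attainable, which mirrors the abstract's matching of difficulty to current ability.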
About the Speakers
LI Wenjun is a Ph.D. candidate in Computer Science at the SMU School of Computing and Information Systems, supervised by Professor Pradeep Varakantham. His research aims to design and build open-ended systems which continuously propose new tasks for RL agents to solve, ultimately producing generally capable agents.
Sidney TIO is a Ph.D. candidate in Computer Science at the SMU School of Computing and Information Systems, supervised by Professor Pradeep Varakantham. His research focuses on maximizing training gains for both humans and artificial agents through Reinforcement Learning.