Improving Sample Efficiency of Online Temporal Difference Learning
Speaker:

PAN Yangchen
Ph.D. Candidate
Computer Science Department
University of Alberta
Date: 20 October 2020, Tuesday
Time: 10:00am - 11:15am
Venue: This is a virtual seminar. Please register by 14 October; the WebEx link will be sent to registrants on the following day.
We look forward to seeing you at this research seminar.

About the Talk
Reinforcement Learning (RL) has achieved several remarkable successes in recent years, such as playing Atari games at the human level, power station control, and financial portfolio management. However, RL is still far from reaching its full potential. One of the most important scientific hurdles is that RL algorithms suffer from low sample efficiency: an RL agent typically needs many physical interactions with the real world to reach a reasonably good policy, and such interactions are typically quite expensive. I will introduce my efforts to improve the sample efficiency of online RL algorithms for both policy evaluation and control problems, in the following directions: 1) bringing preconditioning acceleration techniques to policy evaluation algorithms in the linear function approximation setting; 2) investigating efficient sampling distributions for model-based control problems; 3) designing a special regularization method that leverages the intrinsic structure of a PDE control problem; 4) designing an efficient, scalable sparse-representation-learning activation function for a broad class of deep RL algorithms. All of the developed methods are supported by strong empirical evidence.
About the Speaker
Yangchen Pan is currently a Ph.D. candidate at the University of Alberta, co-supervised by Dr. Martha White of the University of Alberta and Dr. Amir-massoud Farahmand of the University of Toronto. His primary research interest is reinforcement learning, and his long-term goal is to develop RL agents that interactively learn from data to solve complex real-world tasks. During his Ph.D. program, he has worked on fundamental research in reinforcement learning, including policy evaluation problems, model-based reinforcement learning control problems, sparse representation learning methods, and extremely high-dimensional continuous control problems. He has published refereed papers at several well-known conferences, including ICML, ICLR, NeurIPS, IJCAI, AAAI, and UAI, and serves as a committee member/reviewer for those conferences and for the Journal of Machine Learning Research (JMLR).
He is a tenure-track faculty candidate for the Artificial Intelligence & Data Science, Machine Learning & Intelligence cluster.