
PhD Dissertation Defense by Jing LU | Scalable Online Kernel Learning


Scalable Online Kernel Learning

Speaker:

Jing LU

PhD Candidate

School of Information Systems

Singapore Management University

Date:

November 13, 2017, Monday

Time:

9:30am - 10:30am

Venue:

Meeting Room 4.4, Level 4
School of Information Systems
Singapore Management University
80 Stamford Road
Singapore 178902

We look forward to seeing you at this research seminar.

About the Talk

One critical deficiency of traditional online kernel learning methods is their growing, unbounded number of support vectors (SVs), which makes them inefficient and non-scalable for large-scale applications. Recent studies on budget online learning attempt to overcome this shortcoming by bounding the number of SVs. Despite being extensively studied, budget algorithms usually suffer from several drawbacks.

First, although existing algorithms attempt to bound the number of SVs at each iteration, most of them fail to bound the number of SVs in the final averaged classifier, which is commonly used for online-to-batch conversion.

To solve this problem, we propose a novel bounded online kernel learning method, Sparse Passive Aggressive learning (SPA), which yields a final kernel-based hypothesis with a bounded number of support vectors. The idea is to bound the number of SVs using an efficient stochastic sampling strategy that admits an incoming training example as a new SV with probability proportional to the loss it suffers. Since SVs are added selectively and none are removed during learning, the proposed SPA algorithm achieves a bounded final averaged classifier. We theoretically prove that SPA achieves an optimal regret bound in expectation, and empirically show that it outperforms various bounded kernel-based online learning algorithms.
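The loss-proportional sampling idea can be sketched in a few lines. This is an illustrative sketch only, not the exact SPA update from the dissertation: the hinge loss, the normalization of the sampling probability, and the importance-weighted coefficient are all assumptions made for the example.

```python
import numpy as np

def spa_update(support_vectors, alphas, x, y, kernel, C=1.0, rng=None):
    """One loss-proportional SV-sampling step (illustrative sketch,
    not the authors' exact SPA update rule)."""
    rng = rng or np.random.default_rng(0)
    # Current prediction from the kernel expansion over existing SVs
    score = sum(a * kernel(sv, x) for sv, a in zip(support_vectors, alphas))
    loss = max(0.0, 1.0 - y * score)        # hinge loss on the new example
    # Admit the example as a new SV with probability growing with its loss
    p = min(1.0, loss / (loss + 1.0))       # assumed normalization
    if loss > 0 and rng.random() < p:
        support_vectors.append(x)
        alphas.append(min(C, loss) * y / p) # importance-weighted coefficient
    return support_vectors, alphas
```

Because examples are only ever added (with probability proportional to their loss) and never removed, every intermediate classifier, and hence the final averaged one, has a bounded SV set in expectation.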

Second, existing budget learning algorithms are either too simple to achieve satisfactory approximation accuracy, or too computationally intensive to scale to large datasets. To overcome this limitation and make online kernel learning efficient and scalable, we explore two functional-approximation-based online kernel learning algorithms, Fourier Online Gradient Descent (FOGD) and Nyström Online Gradient Descent (NOGD). The main idea is to use these two methods to approximate the kernel model with a linear classifier, so that efficiency is greatly improved. The encouraging results of our experiments on large-scale datasets validate the effectiveness and efficiency of the proposed algorithms, making them potentially more practical than the family of existing budget online kernel learning approaches.
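The FOGD idea can be illustrated with random Fourier features: map each example into a randomized feature space that approximates an RBF kernel, then run ordinary linear online gradient descent there. The feature dimension `D`, bandwidth `gamma`, and learning rate `eta` below are illustrative choices, not parameters from the talk.

```python
import numpy as np

def fogd_train(stream, dim, D=100, gamma=1.0, eta=0.1, seed=0):
    """FOGD sketch: random Fourier features approximate the RBF kernel,
    then linear online gradient descent runs on the features."""
    rng = np.random.default_rng(seed)
    # Random frequencies drawn from the Fourier transform of the RBF kernel
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(dim, D))
    w = np.zeros(2 * D)                 # linear model in the feature space
    for x, y in stream:                 # y in {-1, +1}
        z = np.concatenate([np.cos(x @ W), np.sin(x @ W)]) / np.sqrt(D)
        if y * (w @ z) < 1:             # hinge-loss subgradient step
            w += eta * y * z
    return w, W
```

Each update costs O(D) regardless of how many examples have been seen, which is the source of the scalability gain over budget methods that maintain an explicit SV set.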

Third, we extend the three proposed algorithms to the online multiple kernel learning (OMKL) problem, significantly reducing the learning complexity of Online Multiple Kernel Classification (OMKC) while achieving accuracy comparable to existing unbounded OMKC algorithms. We theoretically prove that the proposed algorithms achieve an optimal regret bound, and empirically show that they outperform various bounded kernel-based online learning algorithms.
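In the multiple kernel setting, one classifier per kernel is maintained and their votes are combined with multiplicative weights, in the Hedge style used by OMKC. The sketch below shows only that combination step, with an assumed discount factor `beta`; it is not the dissertation's exact update.

```python
import numpy as np

def omkl_combine(predictions, y, weights, beta=0.9):
    """Hedge-style combination of per-kernel classifiers (illustrative).
    predictions: raw scores from each kernel's classifier; y in {-1, +1}."""
    combined = np.sign(weights @ np.sign(predictions))   # weighted vote
    # Multiplicatively discount kernels whose own prediction was wrong
    mistakes = (np.sign(predictions) != y)
    weights = weights * np.where(mistakes, beta, 1.0)
    return combined, weights / weights.sum()
```

Over time the weight mass concentrates on the best-performing kernels, while each per-kernel classifier can itself be kept bounded by SPA, FOGD, or NOGD.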

In conclusion, this work focuses on scaling up online kernel learning algorithms for large-scale applications. To facilitate other researchers, we also provide an open-source toolbox that includes implementations of all the proposed algorithms, as well as many state-of-the-art online kernel learning algorithms.

About the Speaker

Jing LU is a PhD candidate in the School of Information Systems (SIS), Singapore Management University (SMU), Singapore. Prior to joining SMU, she received her Bachelor's degree from the Honor School of Harbin Institute of Technology, China, in 2012, and worked as a Project Officer in the School of Computer Science, Nanyang Technological University, Singapore, from 2012 to 2014. During her PhD studies, she has been dedicated to research on online learning for addressing the emerging challenges of big data analytics, particularly real-time data stream analytics. She has published several research papers as first author in top-tier journals and high-impact conferences. Most of her publications address key open challenges in machine learning and big data analytics.