PhD Dissertation Proposal by CAO Rui | Using Pre-trained Models for Multimodal Understanding Tasks

Using Pre-trained Models for Multimodal Understanding Tasks


PhD Candidate
School of Computing and Information Systems
Singapore Management University

Research Area
Dissertation Committee
Research Advisor
Committee Members

11 August 2023 (Friday)


3:30pm - 4:30pm


Meeting room 5.1, Level 5
School of Computing and Information Systems 1,
Singapore Management University,
80 Stamford Road
Singapore 178902

Please register by 10 August 2023.

We look forward to seeing you at this research seminar.

About The Talk

We live in a world of multiple modalities. To facilitate people's daily lives, AI systems are expected to be capable of multimodal understanding. In this proposal, we focus on two challenging but practically important multimodal understanding tasks: multimodal hateful meme detection and visual question answering (VQA). Previously, models for these tasks were trained from scratch, which led to overfitting and poor generalization because of the limited training data. More recently, models have been pre-trained to learn universal representations. In this proposal, we leverage pre-trained models to improve both tasks. Specifically, we follow the timeline of how pre-trained models have been used and consider three strategies: tuning pre-trained models, keeping them frozen, and composing them. We design methods based on each of these strategies to improve the two multimodal understanding tasks.

Speaker Biography

Rui Cao is currently a Ph.D. candidate in the School of Computing and Information Systems, supervised by Prof. Jing Jiang. Her research focuses on multimodal understanding, specifically multimodal hateful meme detection and visual question answering.