Pre-Conference Talk by CHENG Yu Tong | A Pruning-based Question-Answering for Interactive Video Search: A Simple Baseline

A Pruning-based Question-Answering for Interactive Video Search: A Simple Baseline

Speaker:


CHENG Yu Tong
Ph.D. Candidate
School of Computing and Information Systems
Singapore Management University

 

Date: 11 June 2026, Thursday

Time: 2:00pm – 2:15pm

Venue: Meeting room 4.4, Level 4, School of Computing and Information Systems 1, Singapore Management University, 80 Stamford Road, Singapore 178902

Please register by 9 June 2026.

About the Talk

Video search systems frequently struggle with imprecise user queries, which retrieve excessively broad and noisy sets of candidate videos and reduce the effectiveness of ranking functions. This issue is compounded by the sheer volume of visually and semantically similar videos in large datasets, forcing users to browse extensively and increasing cognitive load. While interactive questioning offers a promising strategy to progressively narrow down candidate videos, dynamically generating these questions remains a significant challenge. Relying on large language models for this task often introduces high computational latency and the risk of hallucination. In this talk, I will present a simple yet highly effective alternative grounded in Shannon’s information theory. Rather than generating questions from scratch, our method mathematically selects the most discriminative concepts to compose targeted questions, allowing for rapid and reliable pruning of the search space. I will discuss the practical implementation mechanics of the system, specifically detailing dynamic search result updates and top-K rank list sampling. Furthermore, the presentation will address how our approach accounts for realistic system and user constraints, including imperfectly indexed video content and noisy or misleading user feedback. Finally, I will share our experimental findings, which demonstrate the strong retrieval performance of this method on the AVSD and TRECVid benchmarks.
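To make the idea of entropy-based concept selection concrete, here is a minimal sketch. It is not the speaker's implementation, and it makes simplifying assumptions: every candidate video carries a clean set of concept labels, each question is a yes/no question about one concept, and the user always answers correctly. The function names (`binary_entropy`, `pick_concept`, `prune`) and the toy data are hypothetical. The sketch picks the concept whose presence splits the candidate set most evenly, since that split has the highest Shannon entropy, and then keeps only the candidates that agree with the answer.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a yes/no split with probability p of 'yes'."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def pick_concept(candidates):
    """Return (concept, entropy) for the concept that splits candidates most evenly.

    candidates: dict mapping a video id to its set of concept labels
    (a hypothetical data layout chosen for this illustration).
    """
    n = len(candidates)
    concepts = set().union(*candidates.values())
    best, best_h = None, 0.0
    for concept in sorted(concepts):  # sorted only to make ties deterministic
        p = sum(concept in tags for tags in candidates.values()) / n
        h = binary_entropy(p)
        if h > best_h:
            best, best_h = concept, h
    return best, best_h


def prune(candidates, concept, answer_yes):
    """Keep only the candidates consistent with the user's yes/no answer."""
    return {
        video: tags
        for video, tags in candidates.items()
        if (concept in tags) == answer_yes
    }


# Toy candidate set (invented for illustration only).
candidates = {
    "v1": {"person", "dog"},
    "v2": {"person", "dog"},
    "v3": {"person", "car"},
    "v4": {"person"},
}

concept, entropy = pick_concept(candidates)
remaining = prune(candidates, concept, answer_yes=True)
```

In this toy set, "person" appears in every video and so carries no information, while "dog" appears in exactly half of them, so asking about "dog" halves the search space whichever way the user answers. A practical system would also have to handle the constraints mentioned above, such as imperfect indexing and noisy feedback, which this sketch ignores.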

This is a Pre-Conference talk for the 16th ACM International Conference on Multimedia Retrieval (ICMR 2026).

About the Speaker

CHENG Yu Tong is a Ph.D. candidate supervised by Prof. NGO Chong Wah at Singapore Management University. His research focuses on multimedia retrieval, specializing in the known-item search task. He also has an extensive background in system design and competitive benchmarking, having participated in numerous iterations of the Video Browser Showdown (VBS). In the most recent VBS competition, his team's retrieval system, VIREO, achieved first place in the visual known-item search category. His current research investigates smart conversational search systems that use interactive querying to streamline user interaction and improve retrieval accuracy in large-scale video datasets such as TRECVid.