PhD Dissertation Proposal by Mudiyanselage Dulanga Kaveesha WEERAKOON | Enabling and Optimizing Multi-Modal Sense-Making for Human-AI Interaction

Enabling and Optimizing Multi-Modal Sense-Making for Human-AI Interaction

Mudiyanselage Dulanga Kaveesha WEERAKOON

PhD Candidate
School of Computing and Information Systems
Singapore Management University
 

Dissertation Committee
Research Advisor
Committee Members
External Member
  • Vigneshwaran SUBBARAJU, Research Fellow, Agency for Science, Technology and Research

 
Date

19 August 2022 (Friday)

Time

9:00am - 10:00am

Venue

This is a virtual seminar. Please register by 17 August 2022; the Zoom link will be sent on the following day to those who have registered.

We look forward to seeing you at this research seminar.

 
About The Talk

In recent years, the widespread popularity of Augmented Reality (AR) has enabled a multitude of Human-AI collaborative applications. For example, interactive virtual assistants, which enable humans to communicate their instructions and queries more naturally using modalities such as voice and text, are progressively empowering exciting new ubiquitous, mixed-reality computing applications. Traditional voice-based conversational agents like Apple’s Siri and Amazon’s Alexa are evolving to be increasingly multi-modal (e.g., Nvidia’s RIVA SDKs), supporting the comprehension of human instructions via a mix of language, gesture, and visual inputs. While the success of such multi-modal machine comprehension is driven by progressively sophisticated Deep Neural Network (DNN) models, their prohibitive computational overhead and model size make it challenging to support low-latency, on-device execution of inference tasks on resource-constrained wearable and IoT devices (such as the Microsoft HoloLens or Nvidia Jetson platforms). Thus, the central focus of my research is to enable these multi-modal human-interactive tasks to execute on resource-constrained wearable and IoT devices with low power consumption and low latency, while maintaining comparable task accuracy. Natural human-human interaction generally occurs through multiple modalities, including speech, vision, and gesture. Drawing motivation from such human-human interaction scenarios, our work broadly explores how existing wearable sensors can be used to enable multi-modal sense-making for human-AI interaction tasks. In particular, we consider the object acquisition task as an exemplar of a human-AI collaborative task, and introduce a number of sense-making models along with optimization techniques for deploying these models in a pervasive setting.

 
Speaker Biography

I am a PhD student at the School of Computing and Information Systems, Singapore Management University. My research interests lie in Human-AI Collaboration, Multi-modal Sense-making, Pervasive Computing, and Referring Expression Comprehension. I have worked extensively on multi-modal sense-making for Human-AI collaboration tasks on pervasive devices. In particular, I have explored several static and dynamic optimization techniques for referring expression comprehension models to support human-AI collaborative object acquisition tasks with low energy and latency overheads while maintaining comparable task accuracy.