MIMIC: AI and AR-enhanced Multi-Modal, Immersive, Relative Instruction Comprehension

Speaker: Dhanuja Tharith Wanniarachchi WANNIARACHCHIGE
PhD Candidate
School of Computing and Information Systems
Singapore Management University
Date: 7 October 2025, Tuesday
Time: 4:30pm – 4:45pm
Venue: Meeting room 5.1, Level 5, School of Computing and Information Systems 1, Singapore Management University, 80 Stamford Road, Singapore 178902

We look forward to seeing you at this research seminar. Please register by 5 October 2025.
About the Talk
Situated AI agents are rapidly evolving to support natural human interaction by combining language, vision, and gesture cues. We present MImIC (Multi-Modal, Immersive, Relative Instruction Comprehension), a framework that enhances instruction comprehension through RGB and LiDAR sensing, spatial reasoning, and transformer-based language understanding. Unlike conventional systems that rely on fully specified commands, MImIC resolves relative references such as “a chair taller than that table,” enabling more intuitive and natural communication. We demonstrate these capabilities through AIRFurn, an AR-enhanced furniture shopping application, and evaluate its performance in both lab and real-world environments. Our findings show significantly faster task completion, higher accuracy, and greater user satisfaction compared to existing AR-based interfaces, highlighting MImIC’s potential to shape the next generation of immersive AI assistants.
This is a pre-conference talk for the ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp 2025).

About the Speaker
Dhanuja Wanniarachchi is a fifth-year Ph.D. candidate at the School of Computing and Information Systems, Singapore Management University, under the supervision of Professor Archan Misra. His research focuses on multi-view, multi-modal sensing and perception, with a particular emphasis on edge computing. He develops novel lightweight collaborative frameworks that enhance the perception capabilities of on-device visual pipelines, enabling more efficient and scalable real-world deployments.