Improving AI Safety With Constrained Generation Strategies

LU Yuxiao
PhD Candidate
School of Computing and Information Systems
Singapore Management University

Research Area
Dissertation Committee
Research Advisor
Co-Research Advisor
- Arunesh SINHA, Assistant Professor, Department of Management Science & Information Systems, Rutgers Business School, Rutgers University
Dissertation Committee Members
Date: 22 July 2024 (Monday)
Time: 9:00am – 10:00am
Venue: Meeting Room 5.1, Level 5, School of Computing and Information Systems 1, Singapore Management University, 80 Stamford Road, Singapore 178902

Please register by 21 July 2024. We look forward to seeing you at this research seminar.
ABOUT THE TALK

Recent advancements in artificial intelligence (AI) have led to remarkable achievements, demonstrating exceptional capabilities in planning, sequential decision-making, and generating human-like text. However, despite these successes, there remains a critical need to enhance the safety and reliability of AI systems. For instance, for autonomous electric vehicles to travel long distances in minimum time, AI systems need to optimize the positioning of recharge locations to ensure the vehicles are not left stranded. Additionally, even the most widely used AI systems today, such as Large Language Models (LLMs), can still produce unsafe or inappropriate responses. These issues highlight significant risks when AI is deployed in real-world scenarios. Ensuring that AI systems can generate safe and trustworthy outputs is essential before they can be fully relied upon by humans. We attempt to address this challenge by introducing constraints in the generative processes to improve the safety and reliability of AI responses.

ABOUT THE SPEAKER

LU Yuxiao is a Ph.D. candidate in Computer Science at the SMU School of Computing and Information Systems, supervised by Professor Pradeep VARAKANTHAM and Professor Arunesh SINHA (external). His research focuses on Constrained Reinforcement Learning and Trustworthy Machine Learning.