|
Towards Robust, Secure, and Privacy-aware Large Language Models of Code |  | YANG Zhou PhD Candidate School of Computing and Information Systems Singapore Management University | Research Area Dissertation Committee Research Advisor Dissertation Committee Members |
| | Date 26 July 2024 (Friday) | Time 10:30am – 11:30am | Venue Meeting Room 5.1, Level 5 School of Computing and Information Systems 1, Singapore Management University, 80 Stamford Road Singapore 178902 | Please register by 25 July 2024. We look forward to seeing you at this research seminar. 
|
|
|
| ABOUT THE TALK Artificial intelligence, specifically large language models for code (LLM4Code), has reshaped software engineering. LLM4Code demonstrate strong functional capability in generating and summarizing code, predicting vulnerabilities, and similar tasks. Yet researchers have recently found that LLM4Code fail to satisfy non-functional properties. In a recent survey, we analyze 146 papers and identify six important properties that deserve attention from researchers and practitioners: robustness, security, privacy, explainability, efficiency, and usability.
In this talk, I will highlight my research on three of these properties: robustness, security, and privacy. First, LLM4Code are not robust: we show that human-imperceptible perturbations can make models produce wrong results. Second, LLM4Code are vulnerable to backdoor attacks [3] and membership inference attacks, and it is worrisome that existing methods cannot fully address such threats. Third, we show that LLM4Code can memorize their training data, exposing vulnerable, sensitive, and privacy-revealing code to end users, which can cause security and ethical issues. I will also briefly explain our latest work on mitigating such undesired behavior effectively and in a time-efficient manner. To summarize, we provide a higher-level "ecosystem perspective" on analyzing LLM4Code, aiming to improve trustworthiness and transparency in building the next generation of AI tools for software engineering. | | ABOUT THE SPEAKER YANG Zhou is a third-year PhD candidate at Singapore Management University, mentored by Prof. David LO. Zhou's main research focus is "beyond accuracy of large language models for code (LLM4Code)": analyzing and assuring a broad list of properties of LLM4Code ecosystems, including robustness, security, privacy, efficiency, explainability, and usability. Zhou has also published on general AI testing, including evaluating the correctness of speech recognition systems, the fairness of NLP models, and security threats in reinforcement learning models. |
|