Pre-Conference Talks by CAI Xuemeng

Speaker:

CAI Xuemeng
PhD Candidate
School of Computing and Information Systems
Singapore Management University

Date: 20 June 2025, Friday

Time: 3:00pm – 4:00pm

Venue: Meeting room 4.4, Level 4,
School of Computing and Information Systems 1,
Singapore Management University,
80 Stamford Road, Singapore 178902

We look forward to seeing you at this research seminar.

Please register by 19 June 2025.

*There are two talks in this session, each approximately 30 minutes long.

About the Talks

[ Talk 1: Measuring Model Alignment for Code Clone Detection Using Causal Interpretation ]

The ACM International Conference on the Foundations of Software Engineering (FSE)

Abstract: Deep neural network-based models have shown high accuracy in semantic code clone detection, yet they often lack generalization, limiting their trustworthiness and interpretability. Their black-box nature obscures how they identify clones and what code components influence predictions. This paper introduces a causal interpretation framework grounded in the Neyman-Rubin causal model to analyze the decision-making of four state-of-the-art clone detection models. By applying expert-guided interventions, we derive causal explanations and measure how well model predictions align with expert intuition. We assess each model’s alignment with code similarity reasoning, robustness to confounding factors, and prediction consistency. Based on these metrics, we rank the models from most to least aligned, providing a foundation for evaluating and developing more reliable semantic code clone detection systems.

[ Talk 2: RustMap: Towards Project-Scale C-to-Rust Migration via Program Analysis and LLM ]

The 29th International Conference on Engineering of Complex Computer Systems (ICECCS 2025)

Abstract: Migrating C programs to Rust is increasingly desirable due to Rust’s memory safety and C-like performance. However, existing tools like C2Rust often rely on syntactic translation, producing unsafe and hard-to-maintain Rust code. This paper presents RustMap, a novel dependency-guided, LLM-based C-to-Rust translation approach built on three ideas: (1) leveraging LLMs to generate idiomatic Rust code from small C code fragments, (2) decomposing large C codebases based on usage dependencies to address LLMs' limitations with long inputs, and (3) improving translation correctness via test-driven feedback and iterative refinement using compilation and testing results. We evaluate RustMap on 126 real-world programs, including 125 from Rosetta Code and the 7000+ LOC bzip2 project, using GPT-4o. RustMap generates significantly more idiomatic, readable, and functional Rust code than existing tools, with much less unsafe code, demonstrating reusable translation strategies for future research.

The first talk (Talk #1) is for the ACM International Conference on the Foundations of Software Engineering (FSE 2025).

The second talk (Talk #2) is for the 29th International Conference on Engineering of Complex Computer Systems (ICECCS 2025).

About the Speaker

CAI Xuemeng is a third-year PhD candidate and a Research Engineer at the Centre for Research on Intelligent Software Engineering (RISE) at Singapore Management University (SMU). She received her Bachelor’s degree in Computer Science and Design from the Singapore University of Technology and Design (SUTD) in 2022. Her research interests focus on software engineering challenges such as automatic program repair, code translation, and the interpretability of Large Language Models.
