|
Pre-Conference Talk by ZHANG Ting, ZHOU Xin and LIU Jiakun
|
DATE: 4 September 2023, Monday
TIME: 2:30pm - 4:00pm
VENUE: Meeting room 4.4, Level 4
School of Computing and Information Systems 1,
Singapore Management University,
80 Stamford Road,
Singapore 178902
Please register by 3 September 2023
|

|
|
|
There are 3 talks in this session; each talk is approximately 30 minutes.
All talks are pre-conference talks for the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE 2023).
|
|
About the Talk(s)
|
Talk #1: Duplicate Bug Report Detection: How Far Are We?
by ZHANG Ting, PhD Candidate
|
|
Many Duplicate Bug Report Detection (DBRD) techniques have been proposed in the research literature, while industry uses other techniques. Unfortunately, there has been insufficient comparison among them. To fill this gap, we first investigated potential biases that affect the fair comparison of the accuracy of DBRD techniques. Our experiments suggest that data age and the choice of issue tracking system cause significant differences. Based on these findings, we prepared a new benchmark. We then used it to evaluate DBRD techniques and better estimate how far we have come. Surprisingly, a simpler technique outperforms recently proposed sophisticated techniques on most projects in our benchmark. In addition, we compared the DBRD techniques proposed in research with those used in Mozilla and VSCode. Surprisingly, a simple technique already adopted in practice can achieve results comparable to those of a recently proposed research tool.
|
|
Talk #2: The Devil is in the Tails: How Long-Tailed Code Distributions Impact Large Language Models
by ZHOU Xin, PhD Candidate
|
|
Learning-based techniques have gained considerable popularity in various software engineering (SE) tasks. Learning-based models, including popular Large Language Models (LLMs) for code, rely heavily on data, and the data's properties (e.g., data distribution) could significantly affect their behavior. We first conducted an exploratory study on the distribution of SE data and found that such data usually follows a skewed distribution (i.e., a long-tailed distribution). We then investigated three distinct SE tasks and analyzed the impact of long-tailed distributions on the performance of popular LLMs for code. Our experimental results reveal that the long-tailed distribution has a substantial impact on the effectiveness of popular LLMs for code. Our study provides a better understanding of the effects of long-tailed distributions on LLMs for code and offers insights for the future development of SE automation.
|
|
Talk #3: AutoDebloater: Automated Android App Debloating
by LIU Jiakun, Research Scientist
|
|
Android applications are growing larger as they gain more features. However, not all users need all the features. Unnecessary features can increase the attack surface and consume additional resources (e.g., storage and memory). Therefore, removing unnecessary features from applications is needed. However, end users often find it difficult to fully explore the apps and identify unnecessary features, and there is no off-the-shelf tool to assist users in debloating apps by themselves. In this work, we propose a web application, AutoDebloater, to debloat Android applications. Specifically, AutoDebloater automatically explores an app, presents the app’s Activity Transition Graph to users, and asks users to select the undesired activities. Then, AutoDebloater automatically removes the undesired activities from the app. The results of a user study show that users are satisfied with AutoDebloater in terms of the stability of the debloated apps and the ability of AutoDebloater to identify features that they had never noticed before.
|
|
|
About the Speaker(s)
 |
|
ZHANG, Ting is a Ph.D. candidate at SMU SCIS, supervised by Prof. David Lo and Prof. Lingxiao Jiang. Her research focuses on automatic software bug management, from detecting duplicate bug reports to repairing API misuse bugs. More information is available at https://happygirlzt.com/academic
|
|
 |
|
ZHOU, Xin is a Ph.D. student at SMU SCIS, under the supervision of Prof. David LO. Xin's research focuses on pre-trained code representation and automation for software maintenance and development.
|
|
 |
|
LIU, Jiakun is a research scientist at SMU, advised by Prof. David LO. He received his Ph.D. degree in March 2022 from the College of Computer Science and Technology, Zhejiang University, China, where he was advised by Prof. LI Shanping and Dr. XIA Xin. His research interests lie in intelligent software engineering, including mining software repositories (e.g., GitHub and Stack Overflow) and intelligent program reduction (e.g., debloating Android applications).
|
|
|
|