
PhD Dissertation Proposal by LE Dinh Xuan Bach | Overfitting in Automated Program Repair: Challenges and Resolutions


 
Overfitting in Automated Program Repair: Challenges and Resolutions

Speaker(s):

LE Dinh Xuan Bach
PhD Candidate
School of Information Systems
Singapore Management University

Date:

November 23, 2017, Thursday

Time:

12:00pm - 1:00pm

Venue:

Meeting Room 5.1, Level 5
School of Information Systems
Singapore Management University
80 Stamford Road
Singapore 178902

We look forward to seeing you at this research seminar.

About the Talk

Bug fixing is time-consuming and costly. Automated program repair (APR) techniques that relieve human developers of part of this burden would therefore be of tremendous value. A substantial body of recent work has proposed techniques that automatically repair a variety of bugs in large, real-world software, gradually materializing the futuristic idea of APR. Although these techniques differ in how they search for repairs, they commonly rely on test cases to guide the repair process and to validate machine-generated patches. This reliance is problematic because test cases are known to be incomplete, in the sense that they often encode the desired behavior of software only partially. As a result, APR techniques can generate patches that overfit to the test cases used for repair but do not generalize to the behavior that developers expect. To overcome this problem, often referred to as patch overfitting, APR techniques must address two challenges: (1) maintaining both scalability and tractability, that is, scaling cheaply to large, real-world programs while still navigating the large search space of candidate repairs for those programs to find correct ones; and (2) establishing methodologies to validate machine-generated patches.

This thesis tackles the above challenges posed by the overfitting problem by (1) proposing new search-based and semantics-based APR techniques that are capable of generating generalizable repairs, (2) empirically studying the overfitting issue in semantics-based APR, complementing existing studies on its search-based counterparts, and (3) empirically evaluating the reliability of patch validation methodologies, providing guidelines on how machine-generated patches should be evaluated. In particular, we proposed HDRepair, a search-based APR technique that leverages the development history of many software projects to guide and drive the repair process. We empirically studied various characteristics of different semantics-based APR techniques, showing that these techniques are indeed subject to overfitting to varying degrees. We subsequently proposed S3, a semantics-based APR technique that systematically constrains the syntactic search space for repairs and effectively ranks solutions to find correct repairs. Finally, we studied the reliability of popular existing patch validation methodologies and provided several guidelines and insights on how APR-generated patches should be evaluated.
 

About the Speaker

Bach is currently a fourth-year PhD candidate in SIS, SMU, under Associate Professor David Lo. His main research interest is in software engineering, particularly software mining, analysis, repair, synthesis, and verification. He obtained his B.S. degree from Hanoi University of Science and Technology, Vietnam, in 2012 and, before joining SMU in 2014, was a research assistant at the National University of Singapore.