PhD Dissertation Proposal by HOANG Van Duc Thong (James) | Statistical and Deep Learning Models for Software Engineering Corpora

Statistical and Deep Learning Models for Software Engineering Corpora

HOANG Van Duc Thong

PhD Candidate

School of Information Systems

Singapore Management University

Research Area

Data Science & Engineering

Dissertation Committee

Chairman

Associate Prof. David LO

Committee Members

Associate Prof. Hady W. LAUW

Associate Prof. JIANG Lingxiao

Date

August 13, 2019 (Tuesday)

Time

1.00pm - 2.00pm

Venue

Meeting Room 5.1, Level 5,

School of Information Systems, Singapore Management University

80 Stamford Road

Singapore 178902

We look forward to seeing you at this research seminar.

About The Talk

Software engineering corpora, collected from large software systems (e.g., macOS, Ubuntu, and Firefox), differ from natural language corpora. Specifically, software engineering corpora include not only natural language, used by humans, but also programming language, used by machines. Software engineering corpora have been heavily studied in the last decade and used to solve many software engineering problems, e.g., tag recommendation, duplicate bug report detection, and Android application profiling. In this dissertation, we take advantage of software engineering corpora to detect bugs in software systems. Specifically, we aim to solve three main software engineering problems: bug localization, just-in-time defect prediction, and bug-fixing patch identification.

In this dissertation, we aim to (1) propose a model that takes advantage of bug report similarity and method similarity graphs to localize bugs effectively; and (2) propose a deep learning model that automatically extracts code change features by leveraging the semantic and syntactic structure of the actual code changes to detect bugs in commits. While (1) addresses the bug localization problem, (2) addresses the just-in-time defect prediction and bug-fixing patch identification problems in the software engineering community. Our contributions in this dissertation proposal are as follows: (1) Bug localization: We propose a new approach, namely Network-clustered Multi-modal Bug Localization (NetML), which utilizes multi-modal information from both bug reports and program spectra to localize bugs. NetML facilitates effective bug localization by jointly optimizing bug localization error and the clustering of both bug reports and program elements (i.e., methods). (2) Just-in-time defect prediction: We propose an end-to-end deep learning framework, named DeepJIT, that automatically extracts features from commit messages and code changes and uses them to identify defects. (3) Bug-fixing patch identification: We propose PatchNet, a hierarchical deep learning-based approach that automatically extracts features from commit messages and commit code and uses them to identify stable patches. Unlike DeepJIT, PatchNet contains a deep hierarchical structure that mirrors the hierarchical and sequential structure of commit code, which distinguishes it from existing deep learning models on source code.

Speaker Biography

Hoang Van Duc Thong (James) is a third-year Ph.D. candidate in the School of Information Systems, Singapore Management University, advised by Associate Professor David Lo. His research focuses on machine learning and deep learning for accurate bug identification.
