PhD Dissertation Defense by XU Bowen | Learning to Interpret Knowledge from Software Question and Answer Sites

Learning to Interpret Knowledge from Software Question and Answer Sites

XU Bowen

PhD Candidate
School of Computing and Information Systems
Singapore Management University
 

Research Area Dissertation Committee
Advisor
Committee Members
External Member
  • Dr. Xin XIA, Director, Software Engineering Application Technology Lab, Huawei Technologies Co., Ltd
 
Date

23 August 2021 (Monday)

Time

1:00pm - 2:00pm

Venue

This is a virtual seminar. Please register by 19 August 2021; the Zoom link will be sent out the following day to those who have registered.

We look forward to seeing you at this research seminar.

 
About The Talk

Software question and answer (SQA) data has become a valuable resource for software engineering because it contains a huge volume of programming knowledge. That knowledge can be interpreted in many different ways to support various software activities, such as code recommendation and program repair. In this dissertation, we interpret SQA data by addressing three novel research problems.

The first research problem is linkable knowledge unit prediction. Here, a question and its answers within a Stack Overflow post are considered a knowledge unit (KU). KUs often contain semantically relevant knowledge and are thus linkable for different purposes. Being able to classify different classes of linkable knowledge units would support more targeted information needs when users search or explore the linkable knowledge. Compared with the approaches proposed in prior work, we design a simpler but more effective machine learning model to address the problem. Moreover, we identify limitations of the data set used in previous work and construct a new one that is larger and more diverse. Our experimental results show that our model significantly outperforms the state-of-the-art approaches.

The second research problem is distributed representation for Stack Overflow posts. In this dissertation, we propose Post2Vec, a specialized deep learning architecture that extracts distributed representations of Stack Overflow posts. To evaluate Post2Vec, we first investigate its end-to-end effectiveness in the tag recommendation task. We observe that Post2Vec achieves a significant improvement in terms of F1-score@5 at a lower computational cost. Moreover, to evaluate the value of the representations learned by Post2Vec, we use them for three other tasks: relatedness prediction, post classification, and API recommendation. We demonstrate that the representations can be used to boost the effectiveness of state-of-the-art solutions for the three tasks by substantial margins.

The third research problem is answer summary generation for technical questions. We formulate the task as a query-focused, multi-answer-post summarization task for a given technical question. We conduct user studies to evaluate the quality of the answer summaries generated by our approach, Answer Bot. The user study results demonstrate that the answer summaries generated by Answer Bot are relevant, useful, and diverse.

 
Speaker Biography

Bowen XU is a PhD student at the School of Information Systems, Singapore Management University, advised by Professor David Lo. He received his M.Eng. from the College of Software Technology, Zhejiang University, in 2017. His research interests are in software engineering, especially data science for software engineering.