PhD Dissertation Defense by YANG Chengran | Harnessing SE Community Knowledge for Developer-Centric Code Intelligence

Harnessing SE Community Knowledge for Developer-Centric Code Intelligence

YANG Chengran

PhD Candidate
School of Computing and Information Systems
Singapore Management University

Research Area

Information Systems & Technology
- Software Engineering

Dissertation Committee

Research Advisor

OUB Chair Prof David LO

Committee Members

External Member

John GRUNDY, Professor of Software Engineering, Department of Software Systems & Cybersecurity, Monash University

Date

10 October 2025 (Friday)

Time

1:00pm - 2:00pm

Venue

Meeting room 5.1,
Level 5
School of Computing and Information Systems 1,
Singapore Management University,
80 Stamford Road
Singapore 178902

Please register by 8 October 2025.

We look forward to seeing you at this research seminar.

ABOUT THE TALK

This dissertation, "Harnessing SE Community Knowledge for Developer-Centric Code Intelligence," addresses the limitations of current Large Language Models (LLMs) in real-world software engineering scenarios. While code LLMs enhance productivity by automating tasks like code generation and debugging, they often produce inefficient code, lack transparent reasoning, and adapt poorly to diverse developer needs. To overcome these challenges, this work proposes leveraging structured knowledge from the software engineering community, drawing insights from developer forums and Q&A platforms, to improve the capabilities of code LLMs.

The research presents a two-fold approach: (1) guiding the training of code LLMs to enhance the quality of code generation in terms of accuracy, reasoning, and runtime efficiency; and (2) adapting code LLMs to downstream software engineering tasks by aligning model behavior with practical, real-world development workflows. The dissertation makes five key contributions. The first two focus on enhancing code generation by using community-derived logic for structured code reasoning and employing reinforcement learning to improve the runtime efficiency of the generated code. The latter three contributions focus on adapting LLMs to downstream tasks. These include a framework for enriching API documentation with practical insights from Q&A forums; a structured classification system for API reviews from developer forums; and TechSumBot, a query-focused summarization tool for technical Q&A platforms that distills knowledge from multiple answers.

SPEAKER BIOGRAPHY

Chengran Yang is a PhD candidate in Computer Science at Singapore Management University. His research lies at the intersection of Software Engineering and Artificial Intelligence, with a particular interest in enhancing code intelligence through large language models and community-sourced knowledge. His work follows two primary directions: (1) enhancing the capacity of code large language models, specifically code correctness, runtime efficiency, and vulnerability detection; and (2) integrating intelligent assistants into developer workflows, enabling automated support for developer activities such as question answering, documentation generation, and structured code reasoning. He has published papers in top-tier software engineering conferences and journals, including ICSE, ASE, TOSEM, and TSE.
