
PhD Dissertation Defense by YING Jiahao | Towards Auto-Evaluation for Large Language Models

Towards Auto-Evaluation for Large Language Models

YING Jiahao

PhD Candidate
School of Computing and Information Systems
Singapore Management University
 

Research Area

  • Artificial Intelligence & Data Science
    • Machine Learning & Intelligence

Dissertation Committee

Advisor:
Co-Advisor: CAO Yixin, Professor, Fudan University
Members:
 
External Members: SUN Aixin, Associate Professor, College of Computing and Data Science, Nanyang Technological University
 

Date

29 Jun 2026 (Monday)

Time

11:00am – 12:00pm

Venue

Meeting room 5.1, Level 5
School of Computing and Information Systems 1, 
Singapore Management University, 
80 Stamford Road, 
Singapore 178902

Please register by 28 Jun 2026.

We look forward to seeing you at this research seminar.

 

ABOUT THE TALK

Large language models (LLMs) are advancing faster than traditional evaluation can keep up. Manually curated benchmarks are costly to maintain, quickly lose discriminative power, and risk leakage that inflates results. This thesis studies automatic evaluation for LLMs under a question-based paradigm spanning three stages: constructing evaluation data, judging outputs, and looking beyond performance scores.

First, for automatic data construction, the thesis presents two studies: an automated robustness evaluation that transforms existing benchmarks to probe how LLMs behave when the given context conflicts with their internal knowledge, and an automated dataset updating method that revises and expands benchmarks to mitigate leakage and control difficulty. Second, it proposes Language-Model-as-an-Examiner, in which LLMs generate questions and judge responses without reference answers, using peer examination to reduce single-model bias. Third, it introduces the Model Utilization Index, which uses internal activation signals to assess how efficiently a model engages its capacity.

Together, these contributions reduce reliance on manual annotation and sustain reliable evaluation as LLMs evolve.

ABOUT THE SPEAKER

Jiahao YING is a PhD candidate at the School of Computing and Information Systems, Singapore Management University, supervised by Prof. Qianru Sun and Prof. Yixin Cao. His research focuses on the evaluation of large language models (LLMs), including the automatic construction of evaluation data, LLM-based evaluation pipelines, and interpretability-driven analysis of model behaviour beyond performance scores. His work has been published at leading venues including ACL, NeurIPS, NAACL, and EMNLP, with over ten papers and several first-authored works on auto-evaluation for LLMs. He is a recipient of SMU's Presidential Doctoral Fellowship Award.