Pre-Conference Talk by ZOU Xiandong | Variational Speculative Decoding: Rethinking Draft Training from Token Likelihood to Sequence Acceptance


Variational Speculative Decoding: Rethinking Draft Training from Token Likelihood to Sequence Acceptance

Speaker:

ZOU Xiandong
Ph.D. Candidate
School of Computing and Information Systems
Singapore Management University

Date:

23 June 2026, Tuesday

Time:

10:30am – 11:00am

Venue:

Meeting room 4.4, Level 4
School of Computing and Information Systems 1,
Singapore Management University,
80 Stamford Road
Singapore 178902

Please register by 22 June 2026.

About the Talk

Speculative decoding accelerates inference for (M)LLMs, yet a training-decoding discrepancy persists: while existing methods optimize single greedy trajectories, decoding involves verifying and ranking multiple sampled draft paths. We propose Variational Speculative Decoding (VSD), formulating draft training as variational inference over latent proposals (draft paths). VSD maximizes the marginal probability of target-model acceptance, yielding an ELBO that promotes high-quality latent proposals while minimizing divergence from the target distribution. To enhance quality and reduce variance, we incorporate a path-level utility and optimize via an Expectation-Maximization procedure. The E-step draws Monte Carlo samples from an oracle-filtered posterior, while the M-step maximizes weighted likelihood using Adaptive Rejection Weighting (ARW) and Confidence-Aware Regularization (CAR). Theoretical analysis confirms that VSD increases expected acceptance length and speedup. Extensive experiments across LLMs and MLLMs show that VSD achieves up to a 9.6% speedup over EAGLE-3 and 7.9% over ViSpec, significantly improving decoding efficiency.
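
For attendees new to the topic, the verification step that the abstract refers to can be illustrated with a small sketch. It shows only the standard speculative-sampling acceptance rule (a drafted token is accepted with probability min(1, p/q), where p and q are the target and draft probabilities of that token) and the resulting expected acceptance length for one draft path. It is not the VSD training procedure, and the function names and inputs are hypothetical, chosen for illustration.

```python
import random


def verify_draft(draft_probs, target_probs, rng):
    """Return how many leading draft tokens are accepted.

    draft_probs[i] and target_probs[i] are the probabilities that the draft
    and target models assign to the i-th drafted token (illustrative inputs,
    not taken from the talk).
    """
    accepted = 0
    for q, p in zip(draft_probs, target_probs):
        # Standard rule: accept the drafted token with probability min(1, p/q).
        if rng.random() < min(1.0, p / q):
            accepted += 1
        else:
            break  # the first rejection ends the accepted prefix
    return accepted


def expected_acceptance_length(draft_probs, target_probs):
    """Closed-form expected accepted-prefix length for one fixed draft path."""
    expected, survive = 0.0, 1.0
    for q, p in zip(draft_probs, target_probs):
        survive *= min(1.0, p / q)  # probability that tokens 1..i are all accepted
        expected += survive
    return expected


if __name__ == "__main__":
    rng = random.Random(0)
    q = [0.5, 0.5, 0.5]
    p = [0.5, 0.25, 0.5]
    print(verify_draft(q, p, rng))
    print(expected_acceptance_length(q, p))
```

A longer expected accepted prefix means fewer target-model calls per generated token, which is why the abstract reports results in terms of acceptance length and speedup.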

This is a Pre-Conference talk for the Forty-Third International Conference on Machine Learning (ICML 2026).

About the Speaker

Xiandong ZOU is a Ph.D. candidate in Computer Science at the School of Computing and Information Systems, Singapore Management University, under the supervision of Professor Pan Zhou. He is a member of the Language and Vision Lab (LV-Lab), directed by Professor Shuicheng Yan and Professor Pan Zhou. His research interests include AIGC, generative models, and machine learning.
