Why Diffusion Models Work

This post was inspired by the Stanford course on diffusion models, CME296, by Afshine Amidi and Shervine Amidi. It gives a detailed review of the theoretical background of diffusion models: the derivation of the loss functions and the principles behind current training and inference algorithms. It assumes basic familiarity with diffusion models.

PARADIGM 1: DDPM

Basic equations

$$ x_{t+1} = \sqrt{1-\beta_t}\,x_t + \sqrt{\beta_t}\,\epsilon \qquad \text{where } \beta_t \text{ is the noise schedule} $$

which can be rewritten as: ...
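To make the one-step update $x_{t+1} = \sqrt{1-\beta_t}\,x_t + \sqrt{\beta_t}\,\epsilon$ concrete, here is a minimal sketch in plain Python. It rests on several assumptions that are not in the post: $\epsilon$ is taken to be standard Gaussian noise drawn fresh at every step, the function names `forward_step`, `forward_chain`, and `linear_beta_schedule` are hypothetical, and the linear schedule with its default endpoints is only an illustrative choice.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear noise schedule; the endpoints are illustrative defaults."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def forward_step(x_t, beta_t, rng):
    """One noising step: x_{t+1} = sqrt(1 - beta_t) * x_t + sqrt(beta_t) * eps."""
    keep = math.sqrt(1.0 - beta_t)
    noise_scale = math.sqrt(beta_t)
    # Assumption: eps is standard Gaussian noise, independent per coordinate.
    return [keep * x + noise_scale * rng.gauss(0.0, 1.0) for x in x_t]


def forward_chain(x_0, betas, rng):
    """Apply forward_step once per beta and return the final noised sample."""
    x = list(x_0)
    for beta_t in betas:
        x = forward_step(x, beta_t, rng)
    return x


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the run is reproducible
    betas = linear_beta_schedule(1000)
    x_final = forward_chain([1.0, -1.0, 0.5], betas, rng)
    print(x_final)
```

Because the squared coefficients sum to one, $(1-\beta_t) + \beta_t = 1$, each step maps a unit-variance input to a unit-variance output. This is why the update scales $x_t$ down by $\sqrt{1-\beta_t}$ instead of simply adding noise on top.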

Date: May 27, 2026 | 37 min read | Author: Ahmad Vafaeian