Denoising Diffusion and Score-Based Generative Models

This pill gives a short overview of the development of diffusion models over the last 7 years. It contains several references useful for diving into this topic.

In some of our latest paper pills we have summarised recent developments in denoising generative models (DGMs), with particular emphasis on score-based techniques. In [Son21S] we saw how DGMs can be studied with the formalism of stochastic differential equations, while [Doc22S] presents the recent state of the art in image generation.

In this pill we take a step back and briefly go through some of the key publications that, in the past 7 years, have led to the success of DGMs.

In 2015, the seminal paper [Soh15D] showed that it is possible to generate samples (e.g. images or audio) by learning a variational decoder that reverses a discrete diffusion process, i.e. a process that gradually perturbs data with noise. Models trained with this type of technique came to be known as denoising diffusion probabilistic models (DDPMs). Independently of this work, score-based generative models (SGMs) were being developed from a different motivation and with a different mathematical formalism. In 2019, [Son19G] showed that the empirical performance of SGMs could rival that of other widely acclaimed generative methods (GANs and VAEs).
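The forward (noising) half of such a diffusion process can be sketched in a few lines. This is a minimal illustration, not the code of any of the cited papers: it assumes the Gaussian parametrization with a linear variance schedule (1000 steps, variances from 1e-4 to 0.02) that is commonly associated with DDPMs, and the names `make_alpha_bar` and `q_sample` as well as the toy one-dimensional "data" are hypothetical.

```python
import numpy as np

def make_alpha_bar(n_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal retention for an assumed linear variance schedule."""
    betas = np.linspace(beta_start, beta_end, n_steps)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t given x_0 in closed form:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()

# Toy stand-in for data: 10,000 scalars drawn uniformly from [-1, 1].
x0 = rng.uniform(-1.0, 1.0, size=10_000)

x_early = q_sample(x0, 10, alpha_bar, rng)   # still close to the data
x_late = q_sample(x0, 999, alpha_bar, rng)   # close to standard Gaussian noise
```

After few steps the sample is still strongly correlated with the data, while at the last step almost all signal has been replaced by noise. The generative model is then trained to undo this corruption step by step.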

At first glance, the connection between SGMs and DDPMs seemed superficial, since the former are trained by score matching and sampled with Langevin dynamics, while the latter are trained by maximizing the evidence lower bound (ELBO) and sampled with a learned decoder. However, in 2020 the paper “Denoising Diffusion Probabilistic Models” [Ho20D] (for which both the original code and a PyTorch implementation are available) showed that the ELBO used for training diffusion probabilistic models is essentially equivalent to the weighted combination of score matching objectives used in score-based generative modeling.
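The core of this equivalence can be written down in three lines. The following is a sketch under assumed notation that does not appear in the text above: $\bar\alpha_t$ is the fraction of signal retained after $t$ noising steps, $\epsilon_\theta$ is a network that predicts the added noise, and $s_\theta$ is the corresponding score model. It covers a single noise level only; the full objectives sum such terms over $t$ with time-dependent weights.

```latex
% Forward marginal of the diffusion (assumed notation):
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar\alpha_t}\,x_0,\ (1-\bar\alpha_t)\,I\right),
\qquad
x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon,
\quad \epsilon \sim \mathcal{N}(0, I)

% Score of this perturbation kernel:
\nabla_{x_t} \log q(x_t \mid x_0)
  = -\frac{x_t - \sqrt{\bar\alpha_t}\,x_0}{1-\bar\alpha_t}
  = -\frac{\epsilon}{\sqrt{1-\bar\alpha_t}}

% With s_\theta(x_t, t) := -\epsilon_\theta(x_t, t) / \sqrt{1-\bar\alpha_t},
% the noise-prediction error is a weighted denoising score matching error:
\left\| \epsilon - \epsilon_\theta(x_t, t) \right\|^2
  = (1-\bar\alpha_t)\,
    \left\| s_\theta(x_t, t) - \nabla_{x_t} \log q(x_t \mid x_0) \right\|^2
```

In words: predicting the noise that was added and estimating the score of the noised data are the same task up to a known, time-dependent rescaling.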

Inspired by that work, the aforementioned [Son21S] further investigated the relationship between diffusion models and score-based generative models. It showed that the connection extends beyond training: the sampling method of DDPMs can also be combined with the annealed Langevin dynamics of score-based models. The result is a unified and more powerful sampler, the Predictor-Corrector sampler.
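The alternation of the two kinds of steps can be illustrated on a toy problem. The sketch below rests on several assumptions that are not taken from the papers' code: the data distribution is a one-dimensional Gaussian, so the score at every noise level is known analytically and stands in for a trained network; the noise levels follow a geometric schedule; and the corrector uses a simple fixed-fraction step size rather than the signal-to-noise rule proposed in [Son21S]. All names (`score`, `pc_sampler`, `MU`, `S0`, `corrector_frac`) are hypothetical.

```python
import numpy as np

MU, S0 = 2.0, 0.5  # mean and std of the toy "data" distribution N(MU, S0^2)

def score(x, sigma):
    """Analytic score of the data distribution blurred with noise of std sigma,
    i.e. of N(MU, S0^2 + sigma^2). Stands in for a learned score network."""
    return -(x - MU) / (S0**2 + sigma**2)

def pc_sampler(n_samples=20_000, n_steps=200, sigma_min=0.01,
               sigma_max=10.0, corrector_frac=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Decreasing geometric sequence of noise levels.
    sigmas = np.geomspace(sigma_max, sigma_min, n_steps + 1)
    # Start from the wide Gaussian prior.
    x = sigma_max * rng.standard_normal(n_samples)
    for sigma_hi, sigma_lo in zip(sigmas[:-1], sigmas[1:]):
        # Predictor: one reverse-diffusion step from sigma_hi to sigma_lo.
        d = sigma_hi**2 - sigma_lo**2
        x = x + d * score(x, sigma_hi) + np.sqrt(d) * rng.standard_normal(n_samples)
        # Corrector: one Langevin step at the new noise level
        # (step size rule is a simplification chosen for this sketch).
        eps = corrector_frac * sigma_lo**2
        x = x + eps * score(x, sigma_lo) + np.sqrt(2 * eps) * rng.standard_normal(n_samples)
    return x

samples = pc_sampler()
```

The predictor moves the samples to the next, lower noise level, and the corrector then nudges them towards the correct distribution at that level. On this toy problem the mean and standard deviation of `samples` end up close to `MU` and `S0`.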

If you are interested in learning more about the history and development of denoising generative models, I recommend the following blog posts. The first focuses on DDPMs and goes through all the essential math. The second is centered around score-based methods and is a bit more high level, but it was written by one of the key authors of the DGM revolution (Yang Song) and presents some unique insights.

In summary, diffusion models are an exciting new direction for generative modeling, based on rigorous mathematics and beautiful insights. The field has matured quickly in recent years and, by now, looks ready for deployment in real-world applications.

References

[Ho20D]

Denoising Diffusion Probabilistic Models, Jonathan Ho, Ajay Jain, Pieter Abbeel.

Dec 2020

We present high quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics. Our best results are obtained by training on a weighted variational bound designed according to a novel connection between diffusion probabilistic models and denoising score matching with Langevin dynamics, and our …

[Doc22S]

Score-Based Generative Modeling with Critically-Damped Langevin Diffusion, Tim Dockhorn, Arash Vahdat, Karsten Kreis.

2022

Score-based generative models (SGMs) have demonstrated remarkable synthesis quality. SGMs rely on a diffusion process that gradually perturbs the data towards a tractable distribution, while the …

[Son19G]

Generative Modeling by Estimating Gradients of the Data Distribution, Yang Song, Stefano Ermon.

2019

We introduce a new generative model where samples are produced via Langevin dynamics using gradients of the data distribution estimated with score matching. Because gradients can be ill-defined and hard to estimate when the data resides on low-dimensional manifolds, we perturb the data with different levels of Gaussian noise, and jointly estimate the corresponding scores, i.e., the vector fields …

[Son21S]

Score-Based Generative Modeling through Stochastic Differential Equations, Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, Ben Poole.

Jan 2021

Creating noise from data is easy; creating data from noise is generative modeling. We present a stochastic differential equation (SDE) that smoothly transforms a complex data distribution to a known prior distribution by slowly injecting noise, and a corresponding reverse-time SDE that transforms the prior distribution back into the data distribution by slowly removing the noise. Crucially, the …

[Soh15D]

Deep Unsupervised Learning using Nonequilibrium Thermodynamics, Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, Surya Ganguli.

Jun 2015

A central problem in machine learning involves modeling complex data-sets using highly flexible families of probability distributions in which learning, sampling, inference, and evaluation are still analytically or computationally tractable. Here, we develop an approach that simultaneously achieves both flexibility and tractability. The essential idea, inspired by non-equilibrium statistical …
