Other Diffusion Variants

This library contains implementations of several other notable variations and improvements on the core diffusion model concept. This page provides a brief overview of them.

Learned Gaussian Diffusion

File: denoising_diffusion_pytorch/learned_gaussian_diffusion.py

Paper: "Improved Denoising Diffusion Probabilistic Models" by Nichol and Dhariwal.

This variant, LearnedGaussianDiffusion, modifies the model to predict not just the mean of the reverse-process distribution but also its variance. The U-Net's number of output channels is doubled, with one half predicting the mean (via the noise ε or x_0) and the other half predicting an interpolation fraction between the minimum and maximum possible log variances at each timestep.
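Conceptually, the second half of the output is squashed to a fraction in [0, 1] and used to blend the two analytic log variances of the forward process, as described by Nichol and Dhariwal. The sketch below illustrates that blending with stand-in tensors; the names (frac, beta_t, posterior_variance_t) are illustrative and are not this library's internal variable names.

import torch

# stand-in values at one timestep t (illustrative only)
frac = torch.rand(1, 3, 64, 64)              # network output mapped to [0, 1]
beta_t = torch.tensor(0.02)                  # maximum variance (beta_t)
posterior_variance_t = torch.tensor(0.015)   # minimum variance (beta-tilde_t)

min_log = torch.log(posterior_variance_t)
max_log = torch.log(beta_t)

# interpolate between the two extremes in log space
model_log_variance = frac * max_log + (1 - frac) * min_log
model_variance = model_log_variance.exp()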

Training uses a hybrid objective that combines the simple mean-squared-error loss with a variational lower bound (VB) loss, the latter weighted by vb_loss_weight.

from denoising_diffusion_pytorch import Unet
from denoising_diffusion_pytorch.learned_gaussian_diffusion import LearnedGaussianDiffusion

# Note: `learned_variance` must be True in the Unet
model = Unet(dim=64, learned_variance=True)

diffusion = LearnedGaussianDiffusion(
    model,
    image_size=128,
    timesteps=1000,
    vb_loss_weight=0.001
)
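Training then follows the same pattern as the base GaussianDiffusion class, assuming the forward call returns the hybrid loss for a batch of images in [0, 1] (a minimal sketch, not taken verbatim from the library's examples):

import torch

training_images = torch.rand(8, 3, 128, 128)  # placeholder batch scaled to [0, 1]
loss = diffusion(training_images)             # simple loss + vb_loss_weight * VB term
loss.backward()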

Continuous-Time Gaussian Diffusion

File: denoising_diffusion_pytorch/continuous_time_gaussian_diffusion.py

Paper: "On Density Estimation with Diffusion Models" by Kingma et al.

Instead of a discrete set of timesteps t = 1, ..., T, this model operates in continuous time t ∈ [0, 1]. The noise schedule is defined by a continuous log signal-to-noise-ratio function log_snr(t). This formulation is more flexible, since training is not tied to a fixed discretization and the sampling discretization can be chosen independently, and it yields a particularly simple form of the variational bound.
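For illustration, the sketch below uses a continuous-time analogue of the standard linear beta schedule and derives the signal and noise scales from the log SNR; the schedule formula and helper name are assumptions for illustration and may not match the library's exact code.

import torch

def beta_linear_log_snr(t):
    # continuous-time analogue of the linear beta schedule, t in [0, 1]
    return -torch.log(torch.expm1(1e-4 + 10 * t ** 2))

t = torch.rand(8)                        # times sampled uniformly from [0, 1]
log_snr = beta_linear_log_snr(t)

# for a variance-preserving process, alpha^2 + sigma^2 = 1
alpha = torch.sqrt(torch.sigmoid(log_snr))
sigma = torch.sqrt(torch.sigmoid(-log_snr))

x_0 = torch.rand(8, 3, 128, 128)         # clean images
noise = torch.randn_like(x_0)
x_t = alpha.view(-1, 1, 1, 1) * x_0 + sigma.view(-1, 1, 1, 1) * noise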

The ContinuousTimeGaussianDiffusion class requires a U-Net with random_or_learned_sinusoidal_cond = True to handle continuous time values.
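A minimal construction sketch; image_size follows the pattern of the other variants, and num_sample_steps (the discretization used only at sampling time) is an assumed keyword that may differ from the current signature.

from denoising_diffusion_pytorch import Unet
from denoising_diffusion_pytorch.continuous_time_gaussian_diffusion import ContinuousTimeGaussianDiffusion

# random_or_learned_sinusoidal_cond enables conditioning on continuous time values
model = Unet(dim=64, random_or_learned_sinusoidal_cond=True)

diffusion = ContinuousTimeGaussianDiffusion(
    model,
    image_size=128,
    num_sample_steps=500  # assumed keyword; affects sampling only, not training
)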

V-Param Continuous-Time Diffusion

File: denoising_diffusion_pytorch/v_param_continuous_time_gaussian_diffusion.py

Paper: "Progressive Distillation for Fast Sampling of Diffusion Models" by Salimans and Ho.

This is a continuous-time model that uses the "v-parameterization," where the model predicts v = α * ε - σ * x_0. This objective was found to be particularly effective for progressive distillation and has also been noted to reduce color-shifting artifacts in upsampling models.
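In a variance-preserving process with α² + σ² = 1, the v target and the recovery of x_0 (and ε) from a v prediction are simple linear combinations. A minimal sketch with plain tensors, not this library's internals:

import torch

x_0 = torch.rand(8, 3, 128, 128)                      # clean images
noise = torch.randn_like(x_0)                          # epsilon
alpha, sigma = torch.tensor(0.8), torch.tensor(0.6)   # one noise level, alpha^2 + sigma^2 = 1

x_t = alpha * x_0 + sigma * noise                      # noisy input
v_target = alpha * noise - sigma * x_0                 # regression target for the network

v_hat = v_target                                       # stand-in for the network's prediction
x_0_recovered = alpha * x_t - sigma * v_hat            # equals x_0
eps_recovered = sigma * x_t + alpha * v_hat            # equals noise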

Weighted Objective Gaussian Diffusion

File: denoising_diffusion_pytorch/weighted_objective_gaussian_diffusion.py

This is an experimental variant in which the model is trained to predict both the noise ε and the clean image x_0. The U-Net outputs a third component: a set of weights used to form a dynamic weighted average of the two x_0 estimates (one derived from the predicted noise, one predicted directly). Because the network is conditioned on the timestep, this weighting can adapt to the noise level.
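A minimal sketch of the combination step, assuming the U-Net output has already been split into a noise prediction, a direct x_0 prediction, and an unnormalized weight map; the names here are illustrative, not the library's.

import torch

alpha, sigma = torch.tensor(0.8), torch.tensor(0.6)   # noise level, alpha^2 + sigma^2 = 1
x_t = torch.rand(8, 3, 128, 128)                       # noisy input

# stand-ins for the three components of the network output
pred_noise = torch.randn_like(x_t)
pred_x0 = torch.rand_like(x_t)
weight_logits = torch.randn(8, 1, 128, 128)

# x_0 implied by the noise prediction (inverting x_t = alpha * x_0 + sigma * noise)
x0_from_noise = (x_t - sigma * pred_noise) / alpha

# learned convex combination of the two x_0 estimates
w = weight_logits.sigmoid()
x0_combined = w * x0_from_noise + (1 - w) * pred_x0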