Other Diffusion Variants
This library contains implementations of several other notable variations and improvements on the core diffusion model concept. This page provides a brief overview of them.
Learned Gaussian Diffusion
File: denoising_diffusion_pytorch/learned_gaussian_diffusion.py
Paper: "Improved Denoising Diffusion Probabilistic Models" by Nichol and Dhariwal.
This variant, LearnedGaussianDiffusion, modifies the model to predict not just the mean of the reverse process distribution but also its variance. The U-Net's output dimension is doubled: one half predicts the mean (via the noise ε or x_0), and the other half predicts an interpolation value between the minimum and maximum possible log variances.
This is trained with a hybrid objective that combines the simple mean-squared-error loss with a variational bound (VB) loss, weighted by vb_loss_weight.
from denoising_diffusion_pytorch import Unet
from denoising_diffusion_pytorch.learned_gaussian_diffusion import LearnedGaussianDiffusion

# `learned_variance=True` doubles the Unet's output channels so one half can
# predict the variance interpolation
model = Unet(dim=64, learned_variance=True)

diffusion = LearnedGaussianDiffusion(
    model,
    image_size=128,
    timesteps=1000,
    vb_loss_weight=0.001  # weight of the variational bound term in the hybrid loss
)
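
As with the other diffusion classes in the library, calling the diffusion object on a batch of images returns the training loss. A minimal training step might look like this (the random tensor is just a stand-in for real data):

import torch

training_images = torch.randn(8, 3, 128, 128)  # stand-in for a batch of 128x128 RGB images
loss = diffusion(training_images)              # hybrid loss: simple MSE + weighted VB term
loss.backward()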
Continuous-Time Gaussian Diffusion
File: denoising_diffusion_pytorch/continuous_time_gaussian_diffusion.py
Paper: "On Density Estimation with Diffusion Models" by Kingma et al.
Instead of a discrete set of timesteps t = 1, ..., T, this model operates in continuous time t ∈ [0, 1]. The noise schedule is defined by a continuous function log_snr(t), the log signal-to-noise ratio at time t. This formulation is more flexible, since the number of sampling steps is no longer fixed at training time, and it is theoretically elegant.
The ContinuousTimeGaussianDiffusion class requires a U-Net constructed with random_or_learned_sinusoidal_cond = True, so that its time embedding can handle continuous time values.
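
A minimal sketch of wiring this up, in the same style as the example above. The num_sample_steps argument is an assumption based on the file's defaults (it should only control how finely [0, 1] is discretized at sampling time); check the source for the exact signature.

from denoising_diffusion_pytorch import Unet
from denoising_diffusion_pytorch.continuous_time_gaussian_diffusion import ContinuousTimeGaussianDiffusion

# continuous time values t ∈ [0, 1] require the random/learned sinusoidal time embedding
model = Unet(dim=64, random_or_learned_sinusoidal_cond=True)

diffusion = ContinuousTimeGaussianDiffusion(
    model,
    image_size=128,
    num_sample_steps=500  # assumed: number of discretization steps used only at sampling time
)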
V-Param Continuous-Time Diffusion
File: denoising_diffusion_pytorch/v_param_continuous_time_gaussian_diffusion.py
Paper: "Progressive Distillation for Fast Sampling of Diffusion Models" by Salimans and Ho.
This is a continuous-time model that uses the "v-parameterization," in which the model predicts v = α_t * ε - σ_t * x_0 instead of ε or x_0 directly. This objective was found to be particularly effective for progressive distillation and has also been noted to reduce color-shifting artifacts in upsampling models.
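
A useful property of this parameterization is that both x_0 and ε can be recovered linearly from a v prediction. A short illustration under the variance-preserving convention α_t² + σ_t² = 1 (the function and variable names here are illustrative, not the library's API):

import torch

def recover_from_v(z_t, v_pred, alpha_t, sigma_t):
    # given z_t = alpha_t * x_0 + sigma_t * eps and v = alpha_t * eps - sigma_t * x_0,
    # x_0 and eps follow from inverting this 2x2 rotation (using alpha_t² + sigma_t² = 1)
    x0_pred  = alpha_t * z_t - sigma_t * v_pred
    eps_pred = sigma_t * z_t + alpha_t * v_pred
    return x0_pred, eps_pred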
Weighted Objective Gaussian Diffusion
File: denoising_diffusion_pytorch/weighted_objective_gaussian_diffusion.py
This is an experimental variant in which the model is trained to predict both the noise ε and the clean image x_0. The U-Net outputs a third component: a set of weights used to form a dynamic, weighted average of the two resulting x_0 estimates (the x_0 derived from the predicted noise, and the directly predicted x_0). The model learns to adjust this weighting based on the timestep.
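
Conceptually, the combined estimate behaves like the following sketch. The helper name and the assumption that the weight is squashed into [0, 1] are illustrative; the ε-to-x_0 conversion is the standard DDPM relation x_0 = (x_t - √(1 - ᾱ_t) * ε) / √ᾱ_t.

import torch

def combined_x0(x_t, eps_pred, x0_pred, weight, alpha_bar_t):
    # x_0 implied by the predicted noise, from x_t = sqrt(ᾱ_t) x_0 + sqrt(1 - ᾱ_t) ε
    x0_from_eps = (x_t - (1. - alpha_bar_t).sqrt() * eps_pred) / alpha_bar_t.sqrt()
    # `weight` in [0, 1] comes from the U-Net's third output head (assumed sigmoid-activated)
    return weight * x0_from_eps + (1. - weight) * x0_pred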