The key math behind Diffusion

rochona
Posts: 748
Joined: Thu May 22, 2025 11:25 am


Post by rochona »

Diffusion-DPO starts by formulating what "more likely" means for a diffusion model. The conclusion (after some chunky math) turns out to be pretty simple: diffusion models are trained to denoise images, and if you give a diffusion model a noisy image to denoise, the "likelihood" of the clean image scales with how good a denoising estimate the model makes. In other words, the Diffusion-DPO objective is to tune the model to be better at denoising preferred data and relatively worse at denoising unpreferred data.

[Image: the loss surface for the Diffusion-DPO objective (lower is better). The loss can be improved by becoming better on the good data while getting worse on the bad data.]
The error increase/decrease (getting better/worse) is measured relative to a "reference" (initialization) model. In our experiments we mainly use StableDiffusion-XL-1.0, which we'll just refer to as "SDXL". We use SDXL as a starting point and train it on the Pick-a-Pic dataset, which consists of collected preferences between pairs of images generated from the same caption.
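To make the idea concrete, here is a minimal PyTorch sketch of that objective. The function names and signatures are assumptions for illustration (the real SDXL training loop works on latents with schedulers, classifier-free guidance, etc.); what it shows is the core comparison: denoising error of the trained model versus the frozen reference, on preferred versus unpreferred images.

```python
import torch
import torch.nn.functional as F

def diffusion_dpo_loss(model, ref_model, noisy_w, noisy_l, noise, t, beta=5000.0):
    """Sketch of a Diffusion-DPO-style loss on one preference pair.

    noisy_w / noisy_l: the preferred / unpreferred images, already noised
    with `noise` at timestep `t`. Both models map (noisy_image, t) to a
    noise prediction. Signatures are hypothetical, for illustration only.
    """
    # Denoising error (MSE between predicted and true noise) for the
    # model being trained, on the preferred (w) and unpreferred (l) data.
    err_w = F.mse_loss(model(noisy_w, t), noise)
    err_l = F.mse_loss(model(noisy_l, t), noise)

    # The same errors under the frozen reference model (the initialization,
    # e.g. SDXL). No gradients flow through the reference.
    with torch.no_grad():
        ref_err_w = F.mse_loss(ref_model(noisy_w, t), noise)
        ref_err_l = F.mse_loss(ref_model(noisy_l, t), noise)

    # Getting better than the reference on preferred data (err_w goes down)
    # or worse on unpreferred data (err_l goes up) both decrease the loss.
    diff = (err_w - ref_err_w) - (err_l - ref_err_l)
    return -F.logsigmoid(-beta * diff)
```

Note the reference model acts as an anchor: if the trained model equals the reference, `diff` is zero and the loss sits at log 2; it only improves by shifting denoising skill toward the preferred data relative to that anchor.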

Results
We first visually compare the generations of our DPO-tuned SDXL model (DPO-SDXL) with the original SDXL. We see that DPO-SDXL is both more faithful to the given prompt and produces high-quality imagery that is pleasing to humans; in other words, the model has become aligned to our preferences! Note that preferences are not universal, but a love for detailed, exciting imagery seems to be a preference shared across a broad swath of users.