These results demonstrate the power

Post by **rochona** » Mon May 26, 2025 11:38 am

These results demonstrate the power of preference learning in diffusion models. Despite training in an offline setup on a limited dataset, Diffusion-DPO closes the gap between state-of-the-art open-source and closed-source models.

Real-time Generation: Diffusion-DPO goes Turbo

Diffusion models can be sped up through distillation, an additional training process that enables generation of realistic images with only a few function calls. The preeminent model of this class is SDXL Turbo, which generates images in only 1-4 steps. This type of model also benefits from DPO. Using the exact same DPO loss (just modifying some of the settings to align with the original SDXL Turbo training), we’re able to substantially improve SDXL Turbo’s 4-step generations. The development of this model is still in progress, but the early version shown here wins 55% of the comparisons on PartiPrompts.
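For readers who want a concrete picture of the DPO loss mentioned above, here is a minimal sketch for a single preference pair. It follows the commonly published form of the Diffusion-DPO objective as we understand it, not the actual training code: the loss rewards the trained model for denoising the preferred image better than a frozen reference model does, relative to the rejected image. The function names, the plain-list inputs, and the single `beta_scale` factor (standing in for the combined strength and timestep weighting) are all illustrative assumptions.

```python
import math


def squared_error(target, prediction):
    """Sum of squared differences between two equal-length vectors."""
    return sum((t - p) ** 2 for t, p in zip(target, prediction))


def log_sigmoid(z):
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def diffusion_dpo_loss(noise_w, pred_w, ref_pred_w,
                       noise_l, pred_l, ref_pred_l,
                       beta_scale=1.0):
    """Sketch of a Diffusion-DPO style loss for one (preferred, rejected) pair.

    noise_*    : the true noise added to the preferred (w) / rejected (l) image
    pred_*     : noise predicted by the model being trained
    ref_pred_* : noise predicted by the frozen reference model
    beta_scale : hypothetical single factor for the preference strength
    """
    # Negative values mean the trained model denoises better than the reference.
    win_diff = squared_error(noise_w, pred_w) - squared_error(noise_w, ref_pred_w)
    lose_diff = squared_error(noise_l, pred_l) - squared_error(noise_l, ref_pred_l)
    # The loss falls as the model improves more on the preferred sample.
    logit = -beta_scale * (win_diff - lose_diff)
    return -log_sigmoid(logit)


# Toy usage: the model matches the preferred noise exactly, the reference does not.
loss = diffusion_dpo_loss(
    noise_w=[1.0, 0.0], pred_w=[1.0, 0.0], ref_pred_w=[0.0, 0.0],
    noise_l=[0.0, 1.0], pred_l=[0.0, 0.0], ref_pred_l=[0.0, 0.0],
)
```

When the trained model and the reference make identical predictions, the loss equals log 2; it drops below that as the model improves on the preferred sample relative to the rejected one. A real training loop would compute this over batches of tensors with the actual noise schedule weighting.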

Emergent areas of improvement

One of the most common complaints about AI-generated images is the appearance of people. As humans, we find that aberrations in rendered human appearance really stand out. Interestingly, these preferences are reflected in our training dataset, which results in substantial improvement in people generation, as shown below. Given that these changes are incidental to generic alignment, targeted improvement is an exciting path for future development.