Nvidia's PID: Turn Low-Res AI Art into Sharp 4K

Summary

Nvidia has released the Pixel Diffusion Decoder (PID), a tool that turns low-resolution AI-generated images into sharp 4K output in just four sampling steps. Instead of decoding with the VAE that ships with each image model, PID uses a diffusion-based decoder that works directly in pixel space. It currently supports popular open-source image generators such as FLUX.1, FLUX.2, SD3, and Z-Image. ComfyUI has already integrated PID; the setup requires a small text encoder based on a two-billion-parameter Gemma model, plus PID diffusion models offered for 1K to 4K output, all of which use a fixed four-step sampling process. In tests against other upscaling methods such as RTX Video Super Resolution and SeedVR2, PID showed better detail enhancement, particularly in faces, textures, and reflections, without introducing noticeable artifacts or over-sharpening. It adds definition to elements like armor, gears, and even watermarks, producing a noticeably sharper and more detailed final image than standard upscalers.
