Google’s new video generation AI model, Lumiere, uses a diffusion model called Space-Time-U-Net, or STUNet, that figures out where things are in a video (space) and how they simultaneously move and change (time). Ars Technica reports that this method lets Lumiere create the video in one process instead of stitching smaller still frames together.
Lumiere starts by creating a base frame from the prompt. Then, it uses the STUNet framework to begin approximating where objects within that frame…