Add InfiNet module for DiffusionOverDiffusion training to allow for extremely (minutes!) long video creation #27
Conversation
This is great @kabachuha! Thanks for this PR, and sure, we can get in touch.
@kabachuha thanks for your contribution!
@sergiobr hi, we have some sort of a text2video team on the Deforum Discord server, join it :) https://discord.gg/deforum
@ExponentialML training works, btw
Great! Let me know if you need any assistance getting things up to speed with the new repository changes.
Yeah, I'd really appreciate help carrying it over, since you know the mainline changes much better.
By all means. Just let me know when it's ready to merge. If you don't want to resolve the conflicts yourself, I'm more than willing to do it 👍.
as MP4 often fails for such short videos
bump bump
So, I'm going to write an automatic DoD captioner using OpenAI's API (or another LLM provider, maybe a local oobabooga). How it will work:
It eliminates the difficulty of forming the mid-level captions
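The idea of having an LLM fill in the mid-level captions could be sketched roughly like this. This is a hypothetical sketch, not code from the PR: the function names (`expand_captions`, `build_caption_tree`) and the pluggable `llm` callable are my own invention, and a real version would call an actual LLM API instead of the placeholder.

```python
from typing import Callable, List


def expand_captions(caption: str, n_children: int,
                    llm: Callable[[str], List[str]]) -> List[str]:
    """Ask an LLM to split one coarse scene caption into n_children
    finer, consecutive sub-captions (one per child clip)."""
    prompt = (f"Split this scene description into {n_children} "
              f"consecutive shorter scene descriptions: {caption}")
    return llm(prompt)[:n_children]


def build_caption_tree(root_caption: str, depth: int, branching: int,
                       llm: Callable[[str], List[str]]) -> list:
    """Recursively expand a top-level video caption into a tree of
    mid-level captions, one tree level per DiffusionOverDiffusion depth."""
    if depth == 0:
        return [root_caption]
    children = expand_captions(root_caption, branching, llm)
    return [root_caption,
            [build_caption_tree(c, depth - 1, branching, llm)
             for c in children]]


# Usage with a stand-in "LLM" (a real one would return finer captions):
fake_llm = lambda prompt: ["first half of the scene", "second half of the scene"]
tree = build_caption_tree("a full episode synopsis", depth=1, branching=2,
                          llm=fake_llm)
```

The leaves of the tree would then serve as the local prompts for the deepest DoD level, removing the need to hand-write captions for every intermediate clip.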
sooo any updates on this?
bump |
Hi, Exponential-ML!
As you probably know, a bit more than a week ago Microsoft published a paper describing the novel DiffusionOverDiffusion technique (https://arxiv.org/abs/2303.12346), which works by first outlining the coarse keyframes, then picking pairs of them as starting points and filling in the in-betweens (with different, more local prompts!).
Using it, they were able to fine-tune on and create whole 11-minute-long Flintstones episodes: https://www.reddit.com/r/StableDiffusion/comments/11zwaxx/microsofts_nuwaxl_creates_an_11_minute/
Seeing their impressive results, I couldn't restrain myself from trying to replicate them.
Having read the article, I noticed that the model structure is extremely similar to the ModelScope one; the only difference is the 'video conditioning' layer (in green), whose information is transferred into the preexisting U-Net3D by a set of Conv-down cells.
Because they use so-called zero-convolutions, I was able to implement that layer as a ControlNet-like network (https://github.com/kabachuha/InfiNet), which makes it possible to introduce the new layers without altering the behavior of the existing model. (See `DoDBlock` in the code.)
I have already tested inference with `diffusion_depth=0` and `diffusion_depth=1` (any `diffusion_depth>0` turns on the DoD blocks), so the model definitely works at inference time.
I'll start training experiments as soon as I figure out the dataset and the system requirements for it.
P.S. @ExponentialML, contact me on Discord. I'd really appreciate closer communication.