Forked from official repository for Flexible Diffusion Modeling of Long Videos
Tested with Python 3.10 in a conda environment. We require the Python packages:

```
mpi4py torch torchvision wandb blobfile tqdm moviepy imageio diffusers opencv-python ffmpeg lpips tensorflow==2.15 tensorflow_hub==0.15.0 transformers pandas scikit-learn
```
This repository itself should also be installed by running:

```
pip install -e .
```
This repo logs to wandb, using the wandb entity/username and project name set by:

```
export WANDB_ENTITY=<...>
export WANDB_PROJECT=<...>
```
Also create a directory for checkpoints to be saved in:

```
mkdir checkpoints
```
Bouncing ball dataset:

- Go to the project directory.
- Make the dataset:

  ```
  python datasets/ball.py --save_path=datasets/ball_stn
  ```
Bouncing ball dataset with color shift:

- Go to the project directory.
- Make the dataset:

  ```
  python datasets/ball.py --save_path=datasets/ball_stn --color_shift
  ```
Plaicraft dataset:

- Get the zipped dataset file from Jason and unzip it in the `datasets` folder.
- Go to the project directory.
- Make a symlink in the `datasets` folder that points to the VQVAE-encoded video data:

  ```
  ln -s /ubc/cs/research/plai-scratch/plaicraft/data/processed datasets/plaicraft
  ```
Sample SLURM scripts for training and evaluation are in `sample_slurm_scripts`.
TLDR

- `scripts/video_train.py` is the model training script.
- `scripts/video_sample.py` is the sampling script.
- `scripts/video_make_mp4.py` makes MP4 videos from model samples.
- `scripts/video_fvd.py` calculates FVD and saves it to the `results/...` folder.
- `scripts/collect_results.py` can group metrics from multiple runs.
Datasets used to construct continual learning data streams are in `improved_diffusion/video_datasets.py` and `improved_diffusion/plaicraft_dataset.py`. There are three types of datasets:

- `ContinuousDataset` returns a sliding window of size `T`, indexed by the first frame index (e.g. `[0,1,2,3,4]`, `[1,2,3,4,5]`, ... when `T=5`).
- `ChunkedDataset` returns windows of size `T` that are mutually exclusive (e.g. `[0,1,2,3,4]`, `[5,6,7,8,9]`, ... when `T=5`).
- `SpacedDataset` returns windows of size `T` that are evenly spaced across the data stream (e.g. `[0,1,2,3,4]`, `[10,11,12,13,14]`, ..., `[90,91,92,93,94]` when we want 10 videos of length `T=5` from a dataset with 100 total frames).
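To make the indexing concrete, here is a minimal sketch of the three windowing schemes (this is an illustration, not the repository's actual implementation; the function names and parameters `n_frames`, `T`, and `n_videos` are hypothetical):

```python
def continuous_windows(n_frames, T):
    # Sliding window: one window per possible first-frame index.
    return [list(range(i, i + T)) for i in range(n_frames - T + 1)]

def chunked_windows(n_frames, T):
    # Mutually exclusive windows: consecutive, non-overlapping chunks of T frames.
    return [list(range(i, i + T)) for i in range(0, n_frames - T + 1, T)]

def spaced_windows(n_frames, T, n_videos):
    # n_videos windows whose start frames are evenly spaced across the stream.
    stride = n_frames // n_videos
    return [list(range(i * stride, i * stride + T)) for i in range(n_videos)]

# With T=5 and a 100-frame stream:
print(continuous_windows(100, 5)[:2])  # [[0, 1, 2, 3, 4], [1, 2, 3, 4, 5]]
print(chunked_windows(100, 5)[:2])     # [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
print(spaced_windows(100, 5, 10)[-1])  # [90, 91, 92, 93, 94]
```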
- Model and optimizer checkpoint locations: `checkpoints/<WANDB RUN ID>`
- Model sample and results locations: `results/<WANDB RUN ID>/<MODEL NAME>_<SAMPLER CONFIGS>/<DATA STREAM CONFIG>`
- Run summary (that can group multiple runs) locations: `summarized/<SUMMARY NAME>`