I am training the UNet part of a latent diffusion model with a conditional encoder, and I have added one extra image-encoder module for the reference image. The training has been running for a week on almost 3,000 images and has completed 250 epochs, but the loss is not decreasing; it is still around 1.0. I am using an 80 GB A100 for training.
Please let me know if I am missing something, or whether I still need to wait for improvements.
My images are 64×64×3, so I am not using a VAE; I pass the images directly to the UNet.
Here is the loss graph:
Hey,
As a general rule, you shouldn't solely rely on the loss function to assess the training of diffusion models. Often, the loss might saturate after a few thousand iterations, yet the quality of the samples generated by the model can continue to improve post-saturation. Regarding your situation, it appears that the loss plateaued at around 1, which is quite high. If the quality of your samples hasn't improved, I recommend tweaking some hyperparameters, particularly reducing the learning rate to something like 1e-5.
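In case it helps, here is a minimal sketch of what that change might look like in a typical PyTorch training setup. The model and optimizer names here are placeholders, not your actual code; the point is just to recreate the optimizer with the smaller learning rate (and optionally attach a scheduler so it can decay further if the loss stays flat):

```python
import torch

# Placeholder stand-in for your UNet; substitute your actual model.
model = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)

# Recreate the optimizer with the reduced learning rate (1e-5 instead of
# a more typical 1e-4 default).
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Optional: cosine decay so the LR shrinks further over training,
# which often helps when the loss has plateaued.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1000)

print(optimizer.param_groups[0]["lr"])
```

Note that if you resume from a checkpoint, you also need to skip loading the old optimizer state (or overwrite its `lr` via `param_groups`), otherwise the saved learning rate will be restored.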