SD3.5 Large support #1719
Conversation
SD3.5L training seems to work now. Implemented block swap; with 30 blocks swapped, SD3.5L training may be possible with 12GB VRAM. Random dropout of the Text Encoder embeddings is not implemented yet. LoRA training is not implemented yet.
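Block swap keeps only some of the transformer blocks on the GPU at a time, staging the rest in CPU RAM and moving each block in just before it runs. A minimal sketch of the idea (illustrative only, not this PR's implementation; `BlockSwapper` and its parameters are invented names, and both devices default to CPU here so the sketch runs anywhere):

```python
import torch
import torch.nn as nn

class BlockSwapper:
    """Run a list of blocks while keeping at most `resident` of them
    on the compute device; the rest live on the offload device."""

    def __init__(self, blocks, compute_device="cpu", offload_device="cpu", resident=2):
        self.blocks = blocks
        self.compute_device = compute_device
        self.offload_device = offload_device
        self.resident = resident
        for b in blocks:
            b.to(offload_device)  # start with everything offloaded

    def forward(self, x):
        n = len(self.blocks)
        for i, block in enumerate(self.blocks):
            block.to(self.compute_device)      # swap this block in
            x = block(x)
            if n - i > self.resident:
                block.to(self.offload_device)  # swap it back out to free memory
        return x

# Toy usage with small linear layers standing in for MMDiT blocks
blocks = nn.ModuleList([nn.Linear(16, 16) for _ in range(8)])
swapper = BlockSwapper(blocks)
out = swapper.forward(torch.randn(4, 16))
```

In a real setup the transfers would overlap with compute (e.g. pinned memory and a separate CUDA stream); this sketch only shows the residency bookkeeping.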
Awesome, can't wait to test it once it matures.
Basic LoRA training (MMDiT only) may work now.
Fix latent scaling/shifting. |
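The scaling/shifting above maps raw VAE outputs into the latent range the diffusion model expects. A minimal sketch, assuming the SD3 VAE's published `scaling_factor` of 1.5305 and `shift_factor` of 0.0609 (verify these against your checkpoint's config; they differ from SDXL's):

```python
# Assumed SD3 VAE constants; check the config shipped with your checkpoint.
SCALING_FACTOR = 1.5305
SHIFT_FACTOR = 0.0609

def scale_latents(z):
    """Encoder side: shift then scale the raw VAE latent."""
    return (z - SHIFT_FACTOR) * SCALING_FACTOR

def unscale_latents(latents):
    """Decoder side: invert the transform before feeding the VAE decoder."""
    return latents / SCALING_FACTOR + SHIFT_FACTOR
```

Getting either constant wrong (or applying the shift in the wrong order) produces washed-out or color-shifted decodes, which is the class of bug this commit fixes.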
…ers LoRA not trained
Training for LoRA including the Text Encoder should now work correctly. Each of the dropout options will work for LoRA training, but the optimal value is unknown.
L and G aren't dropped out separately. You can't drop out all of CLIP either, since it's used for AdaNorm.
Thank you! Hmm, Appendix B.3 of their technical paper http://arxiv.org/pdf/2403.03206 states the following:
So I think it makes sense to drop them separately.
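For illustration, independent per-encoder dropout along the lines of Appendix B.3 could look like the following (a hypothetical helper, not this repo's code; 46.4% is the paper's pretraining rate, chosen so all three encoders are dropped simultaneously about 10% of the time, since 0.464³ ≈ 0.1):

```python
import random

def drop_text_embeddings(embeddings, p=0.464, rng=random):
    """Independently replace each text encoder's embedding with zeros
    with probability p. `embeddings` maps encoder name -> flat embedding."""
    out = {}
    for name, emb in embeddings.items():
        if rng.random() < p:
            out[name] = [0.0] * len(emb)  # dropped: zeroed conditioning
        else:
            out[name] = emb               # kept as-is
    return out

# Toy usage with tiny placeholder embeddings
random.seed(0)
dropped = drop_text_embeddings({"clip_l": [1.0, 2.0], "clip_g": [3.0], "t5": [4.0, 5.0]})
```

A real implementation would operate on batched tensors and might substitute the empty-prompt embedding rather than zeros, depending on how the model was trained.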
@kohya-ss What is the purpose of dropout? So when SAI trained, they randomly dropped each encoder out almost 50% of the time? Does training with dropout mean that only the U-Net part is being trained?
This kind of dropout is for pretraining, not for finetuning. For small datasets you will merely harm the model by dropping out the text encoders so much; it will overcondition the uncond space.
@kohya-ss I'm using FP8BASE, with learning_rate 1e-4, Dim 64, alpha 32.0, about 8000 images, running 24000 steps.
I don't know the details either, so please see the technical paper: https://arxiv.org/pdf/2403.03206. |
I'm glad it seems to have worked fine!
@Bocchi-Chan2023 |
INFO clip_l is not included in the checkpoint and clip_l_path is not provided sd3_utils.py:117
Please specify the respective weight files downloaded from HuggingFace for each option: |
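A hedged sketch of what such an invocation might look like (the script name and flag names such as `--clip_l` are assumptions inferred from the log message above; check the repo's README for the exact options):

```shell
# Hypothetical example; verify script and flag names against the README.
accelerate launch sd3_train_network.py \
  --pretrained_model_name_or_path sd3.5_large.safetensors \
  --clip_l clip_l.safetensors \
  --clip_g clip_g.safetensors \
  --t5xxl t5xxl_fp16.safetensors
```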
…into sd3_5_support
Added SD3.5M support.
0 means no random crop, 1 means always. The default is 0. |
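The value can be read as a probability of applying a random crop on any given step. A toy sketch of probability-gated cropping (a hypothetical helper, not the repo's implementation; it falls back to a center crop when the gate doesn't fire):

```python
import random

def maybe_random_crop(width, height, crop_w, crop_h, p, rng=random):
    """Return a crop origin (x, y). With probability p pick a random
    origin; otherwise center-crop. p=0 never random, p=1 always."""
    if rng.random() < p:
        x = rng.randint(0, width - crop_w)
        y = rng.randint(0, height - crop_h)
    else:
        x = (width - crop_w) // 2
        y = (height - crop_h) // 2
    return x, y
```

With `p=0.0` the gate never fires, so the default behaves exactly like the pre-existing center crop.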
Can't load the VAE with either 3.5 or 3; I'm getting a Missing key error.
Actually, I've managed to extract the VAE directly from SD3 and it works now; maybe I was using a wrong one.
SAI's SD3.5L/M checkpoints seem to have the VAE built in. Please omit the
Do you have a minimum-params config to train a LoRA on SD3.5M? I've tried some configs (ones I've used for FLUX), yet no results at all.
README.md is updated. |
Supported SD3.5M multi-resolution training. The feature has not yet been fully tested, so please let us know if you find any issues. The specifications of the latent cache have changed, so please delete the previous cache files (it works, but garbage will remain in the file). The idea and code for the positional embedding of SD3.5M were contributed by KBlueLeaf. Thank you KBlueLeaf!
Fixed a memory leak when caching latents: images were not being discarded after latent conversion. This does not affect data that has already been cached.
As the main functionality appears to be working, I'll proceed with merging this branch. Thank you for your significant contributions. Edit: this branch will be removed in the next few days.
Why is this not the default? What is the reason for not making it the default? Thank you. For example, on RunPod machines model loading during training is painfully slow; when loading FLUX, can it help there too?
SD3 Medium fine tuning works.