Replies: 3 comments
-
I don't necessarily have the answer for you, but since no one else has replied: have you tried pulling a previous commit to see if an older version works? I don't do a lot of LoRA work aside from extracting LoRAs from model checkpoints I've trained (I find that works best). The tool I use for the extraction is kohya's script in this repo. At some point it stopped working, and the extracted LoRAs did what you describe: they just didn't trigger anything. Around the time that happened, I rolled back to commit 3b83a1c (Oct 1st) and the extraction script worked again, so that could be related to your issue.

If you haven't tried that and are familiar with checking out an older commit, I'd recommend doing so. Curiously, the actual LoRA extraction script I ended up keeping in that folder is from 8/22/23, right before SDXL was merged into the main branch (I used the dev branch prior, which is probably why). I may have rolled that folder forward as well (I have two or three git clones of kohya). It should work if you pull a commit from around when SDXL training was merged into the main branch on 8/31/23 (commit 633bb8d). ChatGPT can guide you on how to check out a particular commit (they're all listed in date order here: https://github.com/bmaltais/kohya_ss/commits/master). It'd probably be best to create a totally new git clone in a new folder, so if it doesn't work you can just delete the attempt. Basically: open a terminal in the root folder and check out the commit there.

Hope that helps; it should at least help narrow down what the problem is. BTW, I have the same resources :p although my dumb mobo (ASUS ProArt Z690) won't POST with all 4 RAM sticks in, which is pretty ridiculous... 64GB works for me, though.
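The actual commands didn't survive in this copy of the thread, so here is a sketch of the rollback steps being described (the folder name `kohya_ss_rollback` is a placeholder of mine; the commit hash is the one mentioned above):

```shell
# Make a fresh clone so a failed experiment is easy to delete later.
git clone https://github.com/bmaltais/kohya_ss.git kohya_ss_rollback
cd kohya_ss_rollback

# Check out the commit from when SDXL training was merged (8/31/23).
# This leaves the repo in a detached-HEAD state, which is fine for testing.
git checkout 633bb8d
```

After that, re-run the repo's setup for the older tree (e.g. `setup.bat` on Windows) before training again, since an older commit may pin different dependency versions.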
-
It's true that text captioning is more important than I once thought, but for personal use I always used just a single unique trigger word, one not linked to any vector info in the model, something like 'MyAnim3L0R4'. You can try that in place of regular descriptive text, then reference the trigger word and see if the results differ. It always worked for me. You may also have enabled one of the settings that rescales the LoRA's strength so that using it at 1.0 doesn't blow out the image every time; I know the main one was at the bottom of the first wall of settings in the GUI.

If your concern is good captions, but you hate writing and editing each one (like me), check out TagGUI. Not the one that comes up in every Google search, but this one: https://github.com/jhc13/taggui. It is so much better than all the other captioning tools; older models like BLIP-2 can't compare to the newer ones it supports, and it's crazy everyone isn't using it. The descriptions are consistently better than anything I'd write by hand, with no extra flowery text like a ChatGPT answer and no missing details.

Also, you mention always using the same settings because they were the only ones that ran without error, even though you're getting bad results. That may be due to how you set up your install with accelerate. This happened to me as well, and what fixed it was switching to bf16 (a 16-bit float format with fp32's dynamic range, well supported on recent NVIDIA cards), so maybe try that first before trying the stuff below. It may fix it.

The bucket settings are important, and training at too high a resolution can break things; I don't use images with a side wider/longer than 1280 pixels. I don't use the GUI, since its settings are scattered all over the UI, and every time I went through issues and someone suggested a setting change, I could never find it. But here is my sd-scripts CLI command (what the GUI passes to the actual training script), which I run from CMD, if you want to try it.
It should work perfectly for your 40-series GPU.
If you are adding to an existing LoRA you already created, add this one, too:
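The command block itself didn't survive in this copy of the thread. As a stand-in, here is a representative `sdxl_train_network.py` invocation, not the author's actual one: every path and number below is a placeholder, though the flags themselves are standard sd-scripts options. `^` is CMD's line-continuation character, since the author runs this from CMD.

```shell
accelerate launch sdxl_train_network.py ^
  --pretrained_model_name_or_path="C:\models\sd_xl_base_1.0.safetensors" ^
  --train_data_dir="C:\training\img" ^
  --output_dir="C:\training\output" --output_name="MyLoRA" ^
  --save_model_as=safetensors ^
  --network_module=networks.lora --network_dim=32 --network_alpha=16 ^
  --resolution=1024,1024 ^
  --enable_bucket --min_bucket_reso=256 --max_bucket_reso=1280 ^
  --learning_rate=1e-4 --optimizer_type="AdamW" ^
  --mixed_precision=bf16 --save_precision=bf16 ^
  --max_train_epochs=6 --train_batch_size=1 ^
  --cache_latents --sdpa
```

For the "add this one, too" part about continuing from an existing LoRA, the usual sd-scripts flag is `--network_weights="C:\training\output\MyLoRA.safetensors"` (again, the path is a placeholder).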
This setup is for SDXL LoRAs, and I use it with my EVGA RTX 3080 12GB FTW3. This post could have saved me 4-8 hours, and I hope it does that for someone. Good luck!
-
Can you post the settings you are trying to train with? I have had this same issue when using LyCORIS and the like: I train something, but nothing seems to be going INTO the model. Post the training settings and let's see if we can debug this. In the meantime, I'd recommend trying just plain AdamW and a regular LoRA, with no captions, on the plain SDXL base model with all default settings, using something simple, clear, and obvious as a subject. You have beefy enough hardware that you can probably iterate quickly.
-
I'm trying to train a LoRA using kohya_ss. I'm on a Windows 11 computer with a 4090 GPU, an i9-13900KS processor (3.2 GHz), and a whopping 128GB of RAM (yes, I know, that number is absurd). My model is realistic, and I have a lot of photos available, taken from different angles and under different lighting conditions. The photos are in 4K, and I've resized them.
I've tried BLIP-style captions and Danbooru-style tags; I've tried resizing to 1024 and also to 512. I've tried many images with few repeats, and the opposite; I've tried 1 epoch as well as 6 epochs and 15 epochs. After 10 different trainings, each with a different configuration, I've always come to the same point: absolute nothingness. My LoRAs produce nothing. If I use a prompt with my trigger word and my LoRA, there's almost no difference compared to the same prompt without the LoRA. It's as if it's training on emptiness. I've obviously tried varying the learning rate, the network alpha, and the optimizer, and changing the base model (usually I try to train on "analogmadness").
Someone even sent me their own folders and captions, with a more "anime" style, for me to try on my end, and the result was similar: pure emptiness. Short of reinstalling kohya_ss, I've literally tried everything, at least everything that came to mind.
I did notice something, though: when I use the Verify LoRA utility in kohya_ss, something surprising happens, although I have no idea if it's relevant. The "number of LoRA modules" is always 528 for every LoRA I've trained, regardless of settings, whereas the LoRAs I download from Civitai all vary in module count. I don't know if that plays a role, but I have no other leads.
Also, I always use DAdaptAdam, because the 8-bit optimizers don't work on my computer for some reason.
Please, I'm begging for someone here to have the miraculous solution, because honestly, I really want to invest seriously in this, and it's starting to feel hopeless.