I have lately been testing the effects of training a LoRA with several different seeds over the course of a single training run. So far the results have been pretty good, as long as training is resumed from a saved training state after each seed change (i.e. `--save_state` / `--resume`), rather than by loading just the network weights (`--network_weights`).
However, while looking through the various options and settings, I read more about Gradient Accumulation, and it gave me an idea for my multi-seed training:
What would be the effect of a variant of Gradient Accumulation where the gradients accumulated before each optimizer step come not from multiple different images noised with the same seed, but from the same image noised with a number of different seeds?
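To make the idea concrete, here is a rough, standalone PyTorch sketch of what I mean. None of this is actual Kohya_ss code; `unet`, `noise_scheduler`, `latents`, `text_cond`, and `optimizer` are placeholders for whatever the real training loop provides, and the diffusers-style calls are just for illustration:

```python
import torch
import torch.nn.functional as F

def multi_seed_step(unet, noise_scheduler, optimizer, latents, text_cond, seeds):
    """One optimizer step whose gradient is accumulated over several noise
    seeds applied to the SAME batch of latents (instead of over several
    different batches, as in normal gradient accumulation)."""
    optimizer.zero_grad()
    for seed in seeds:
        gen = torch.Generator(device=latents.device).manual_seed(seed)
        noise = torch.randn(latents.shape, generator=gen,
                            device=latents.device, dtype=latents.dtype)
        timesteps = torch.randint(
            0, noise_scheduler.config.num_train_timesteps,
            (latents.shape[0],), generator=gen, device=latents.device,
        )
        noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)
        pred = unet(noisy_latents, timesteps, encoder_hidden_states=text_cond).sample
        # Divide by len(seeds) so the accumulated gradient is the mean over seeds.
        loss = F.mse_loss(pred.float(), noise.float()) / len(seeds)
        loss.backward()  # gradients from each seed accumulate in .grad
    optimizer.step()
```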
One complication would be regularization images, if they are used: for each seed used in this kind of training, a regularization image set generated with that seed would presumably need to be defined.
Now, this is just an idea, so I'm not asking for it to be added to the project. Rather, I'm asking for help on how I could implement it in my own copy of Kohya_ss so I can test the effect it has on the results.
I unfortunately don't have much experience with this kind of thing, so some pointers would be greatly appreciated. For example, I'd like to find the code where normal Gradient Accumulation is implemented in Kohya_ss, so I could try modifying it directly for testing purposes.
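From skimming the repository, it looks like the Kohya_ss GUI calls kohya-ss/sd-scripts under the hood, and that sd-scripts delegates gradient accumulation to HuggingFace Accelerate: `--gradient_accumulation_steps` appears to be passed when the `Accelerator` is created, and the LoRA training loop in `train_network.py` wraps each step in a `with accelerator.accumulate(...):` block. If I understand that pattern correctly, it behaves roughly like this toy example (not the actual training code):

```python
import torch
import torch.nn.functional as F
from accelerate import Accelerator

# Toy demonstration of Accelerate's accumulation pattern: optimizer.step()
# only takes effect once every `gradient_accumulation_steps` micro-steps.
accelerator = Accelerator(gradient_accumulation_steps=4)
model = torch.nn.Linear(8, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
model, optimizer = accelerator.prepare(model, optimizer)

for step in range(8):
    x, y = torch.randn(2, 8), torch.randn(2, 1)
    with accelerator.accumulate(model):
        loss = F.mse_loss(model(x), y)
        accelerator.backward(loss)
        optimizer.step()       # skipped by Accelerate until the 4th micro-step
        optimizer.zero_grad()  # likewise a no-op during accumulation steps
```

So if I'm reading this right, one crude way to test my idea might be to set `gradient_accumulation_steps` to the number of seeds and re-seed the noise generation inside each accumulation micro-step while feeding it the same batch each time. I'd appreciate confirmation (or correction) on whether that's the right place to hook in.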