Releases: bghira/SimpleTuner

v0.9.5.3b - SDXL refiner training

14 May 19:20
4e790e6

What's Changed

  • SDXL refiner training support - LoRA and full u-net. The base model's text embeds can't be reused; you must use a separate text embed cache directory.
  • validations: completely refactored workflow
  • huggingface hub: you can now use --push_checkpoints_to_hub to upload all intermediate checkpoints
  • dropout: improved implementation to bypass issues with tokenizer setup that could result in an incorrect embed, by @bghira in #388
  • lr schedules: polynomial schedule fixed; last_epoch is now set correctly for the rest
  • parquet backend will ignore missing captions
  • deepfloyd: text encoder loading fixed
  • sd2.x: tested, bugfixed. uncond text embeds excluded from zeroing
  • huggingface hub: improved model card, --push_checkpoints_to_hub will push every saved model and validation image (tested with 168 validation prompts)
  • mps: new PyTorch nightly resolves some strange issues seen previously
  • mps: use 'auto' slice width for sd 2.x instead of null
  • validations: refactored logic entirely, cleaned up and simplified to tie in with the huggingface hub uploader
  • the timestep schedule is now segmented by train_batch_size, ensuring a broader distribution of timesteps is sampled within each mini-batch (see the sketch after this list) by @bghira in #391
  • follow-up fixes from botched v0.9.5.3 build by @bghira in #397
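
The segmented sampling in #391 can be pictured roughly as follows; this is a minimal sketch of the idea, assuming a 1000-step schedule, not SimpleTuner's exact implementation:

```python
import torch

def sample_segmented_timesteps(batch_size: int, num_train_timesteps: int = 1000) -> torch.Tensor:
    """Draw one timestep per equal-width segment of the schedule.

    Splitting [0, num_train_timesteps) into batch_size segments and sampling once
    per segment guarantees each mini-batch covers the whole noise range, instead
    of occasionally clustering every sample at a similar noise level.
    """
    edges = torch.linspace(0, num_train_timesteps, batch_size + 1)
    lows, highs = edges[:-1].long(), edges[1:].long()
    offsets = (torch.rand(batch_size) * (highs - lows).float()).long()
    return lows + offsets

print(sample_segmented_timesteps(batch_size=4))  # e.g. tensor([103, 310, 614, 887])
```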

Full Changelog: v0.9.5.2...v0.9.5.3b

v0.9.5.2 - hugging face hub upload fixes/improvements, minor vae encoding fixes

09 May 17:38
9cd535a

What's Changed

  • huggingface hub model upload improvement / fixes
  • validations double-run fix
  • json backend image size microconditioning input fix (SDXL) by @bghira in #385
  • bitfit restrictions / model freezing simplification
  • updates to huggingface hub integration, automatically push model card and weights
  • webhooks: minor log level fixes and other improvements; ability to debug image cropping by sending crops to Discord (see the example after this list)
  • resize and crop fixes for json and parquet backend edge cases (VAE encode in-flight) by @bghira in #386
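
For reference, the crop-debugging webhook boils down to a plain Discord webhook POST; a generic sketch using requests with a placeholder webhook URL, illustrating the underlying mechanism rather than SimpleTuner's webhook module:

```python
import requests

WEBHOOK_URL = "https://discord.com/api/webhooks/<id>/<token>"  # placeholder

def send_debug_crop(message: str, image_path: str) -> None:
    """Post a log line plus a cropped-image attachment to a Discord webhook."""
    with open(image_path, "rb") as handle:
        response = requests.post(
            WEBHOOK_URL,
            data={"content": message},
            files={"file": (image_path, handle, "image/png")},
        )
    response.raise_for_status()

send_debug_crop("cropped sample, bucket 1.0", "debug_crop.png")
```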

Full Changelog: v0.9.5.1...v0.9.5.2

v0.9.5.1

08 May 11:34
d658924

What's Changed

  • parquet: follow-up metadata handling fixes
  • legacy trainer: memory leak fix in text embed generation for large datasets
  • vae cache: fix the check for current vs first aspect bucket by @bghira in #383

Full Changelog: v0.9.5...v0.9.5.1

v0.9.5 - now with more robust flavour

06 May 20:58
eddbfa6

[image: Finetuning Terminus XL Velocity v2]

What's Changed

  • The new cropping logic now works across the board for the parquet/json backends. Images are always cropped now, even when crop=false, where necessary to maintain 8px or 64px alignment in the resulting dataset (see the sketch after this list).
    • Resulting image sizes and aspect ratios did not change for resolution_type=area
    • Resulting image sizes and aspect ratios did change for resolution_type=pixel
    • This was necessary to avoid stretching/squeezing images when aligning to the 64px interval
  • Discord webhook support, see the TUTORIAL for information.
  • "Sensible defaults" are now set for minimum_image_size, maximum_image_size, and target_downsample_size to avoid unexpected surprises mostly when using crop=true, but also for some benefits when using crop=false as well.
  • Image upscaling restrictions have been relaxed, but the trainer will refuse to upscale an image by more than 25%, instead asking you to change the dataset configuration values.
  • Image quality when training SDXL models has substantially improved thanks to the minimisation of the microconditioning input ranges:
    [image: Finetuning a particularly poorly-performing Terminus checkpoint with reduced high frequency patterning]
  • Single subject dreambooth was benchmarked on SDXL with 30 diverse images, achieving great results in just 500 steps.
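
The 8px/64px alignment cropping described above can be sketched like this; a simplified illustration of cropping (rather than resizing) down to an aligned size, not the actual TrainingSample code:

```python
from PIL import Image

def crop_to_alignment(image: Image.Image, alignment: int = 64) -> Image.Image:
    """Centre-crop an image down to the nearest size divisible by `alignment`.

    Dropping a few edge pixels preserves the aspect ratio, avoiding the
    stretching/squeezing that resizing to an aligned size would introduce.
    """
    width, height = image.size
    target_w = (width // alignment) * alignment
    target_h = (height // alignment) * alignment
    left = (width - target_w) // 2
    top = (height - target_h) // 2
    return image.crop((left, top, left + target_w, top + target_h))

print(crop_to_alignment(Image.new("RGB", (1023, 681))).size)  # (960, 640)
```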

Commits

  • Convert image to accepted format for calculate_luminance by @Beinsezii in #376
  • vae cache fix for SDXL / legacy SD training
  • epoch / resume step fix for a corner case where the path to the training data includes the dataset name by @bghira in #377
  • when crop=false, we will crop from the intermediary size to the target size instead of squishing
  • set default minimum_image_size, maximum_image_size, and target_downsample_size values to 100%, 150%, and 150% of the value set for resolution by @bghira in #378
  • resolved bugged-out null embed when dropout is disabled
  • discord webhook support
  • cuda/rocm: bugfix for eval on final legacy (sd 1.5/2.1) training validations
  • avoid stretching/squeezing images by always cropping to maintain 8/64px alignment
  • set default values for minimum_image_size, maximum_image_size, and target_downsample_size (see the sketch below) by @bghira in #379
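
The "sensible defaults" above are simple fractions of the configured resolution; a rough sketch of that relationship, assuming resolution is given in the same units as the size options (megapixels for resolution_type=area, pixels for resolution_type=pixel):

```python
def derive_size_defaults(resolution: float) -> dict:
    """Derive default sizing values from the configured resolution: 100% for
    minimum_image_size, 150% for maximum_image_size and target_downsample_size."""
    return {
        "minimum_image_size": resolution * 1.00,
        "maximum_image_size": resolution * 1.50,
        "target_downsample_size": resolution * 1.50,
    }

print(derive_size_defaults(1.0))   # e.g. resolution_type=area (megapixels)
print(derive_size_defaults(1024))  # e.g. resolution_type=pixel (pixels)
```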

Full Changelog: v0.9.5-beta...v0.9.5-beta2

v0.9.5-beta - optimized training, 3x speed-up

02 May 22:14
528d8fe

What's Changed

This release includes an experimental rewrite of the image handling code. Please report any issues.

  • flexible pixel resize to 8 or 64 px alignment, no more rounding up where unnecessary by @bghira in #368
  • more deepfloyd stage II fixes for model evaluation by @bghira in #369
  • AMD/ROCm support by @Beinsezii in #373
  • TrainingSample: refactor and encapsulate image handling, improving performance and reliability by @bghira in #374
  • fix --aspect_bucket_rounding not being applied correctly (see the bucketing sketch after this list)
  • rebuild image sample handling to be structured object-oriented logic
  • fix early epoch exit problem
  • max epochs vs max steps ambiguity reduced by setting default to 0 for one of them
  • fixes for LoRA text encoder save/load hooks
  • optimise trainer
  • 300% performance gain by removing the torch anomaly detector
  • fix dataset race condition where a single image dataset was not being detected
  • AMD documentation for install and dependencies, thanks to Beinsezii
  • fix for wandb timestep distribution chart values racing ahead of reality by @bghira in #375
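
On --aspect_bucket_rounding: the flag name suggests it controls how many decimal places the width/height ratio is rounded to before bucketing. The sketch below is a hypothetical illustration of that bucketing step, not a guarantee of the option's exact semantics:

```python
def aspect_bucket(width: int, height: int, rounding: int = 2) -> float:
    """Map an image to an aspect-ratio bucket by rounding width/height.

    Coarser rounding (fewer decimal places) merges near-identical ratios into
    one bucket; finer rounding produces more, smaller buckets.
    """
    return round(width / height, rounding)

print(aspect_bucket(1216, 832))               # 1.46
print(aspect_bucket(1216, 832, rounding=3))   # 1.462
```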

Full Changelog: v0.9.4...v0.9.5-beta

v0.9.4 - the deepest of floyds

22 Apr 17:02
bfe0112

[image: DeepFloyd stage I LoRA trained using v0.9.4]

What's Changed

  • parquet: fix aspect bucketing
  • json: mild optimisation
  • llava: add 1.6 support
  • pillow: fix deprecations by @bghira in #350
  • (#343) fix for image backend load failure by @bghira in #352
  • sdxl: validations fix at the end
  • more example scripts for the toolkit
  • --aspect_bucket_rounding by @bghira in #359
  • DeepFloyd IF Stage I and II LoRA/full u-net training by @bghira in #361
  • Add Dockerfile by @komninoschatzipapas in #353
  • multi-res validations via --validation_resolution=1024,1024x1536,... (see the parsing sketch after this list)
  • disable torch inductor by default
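
Going by the example above, --validation_resolution takes a comma-separated list where each entry is either a single edge length or an explicit WIDTHxHEIGHT pair. A small illustrative parser, assuming a bare number means a square resolution (this is not the trainer's actual argument handling):

```python
def parse_validation_resolutions(value: str) -> list[tuple[int, int]]:
    """Turn e.g. '1024,1024x1536' into [(1024, 1024), (1024, 1536)]."""
    resolutions = []
    for entry in value.split(","):
        if "x" in entry:
            width, height = entry.split("x")
            resolutions.append((int(width), int(height)))
        else:
            resolutions.append((int(entry), int(entry)))
    return resolutions

print(parse_validation_resolutions("1024,1024x1536"))
```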

New Contributors

  • @komninoschatzipapas made their first contribution in #353

Full Changelog: v0.9.3.4...v0.9.4

v0.9.3.4 - parquet/multi-gpu improvements

11 Apr 13:41
fcecfba

What's Changed

  • bugfix: multi-gpu training would gradually erode the aspect bucket list by saving the split version
  • add --lora_init_type argument
  • multi-gpu optimisations
  • parquet backend speedup by 100x by @bghira in #349

Full Changelog: v0.9.3.3...v0.9.3.4

v0.9.3.3 - faster startup

09 Apr 12:39
42fd40e

What's Changed

  • multigpu fixes - logging, startup, resume, validation
  • regression fixes - torch tensor dtype error during CUDA validations
  • better image detection - "missing images" occur less frequently/not at all
  • tested jpg/png mixed datasets
  • face detection documentation updates
  • higher NCCL timeout (see the sketch after this list)
  • diffusers update to v0.27.2
  • mps pytorch nightly 2.4 by @bghira in #346
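
On the NCCL timeout: raising the collective-operation timeout is typically done where the process group is created. A generic sketch using Accelerate's InitProcessGroupKwargs follows; the two-hour value is illustrative, not SimpleTuner's actual setting:

```python
from datetime import timedelta

from accelerate import Accelerator
from accelerate.utils import InitProcessGroupKwargs

# A longer timeout keeps slow ranks (for example, one GPU still building its
# caches) from tripping NCCL's default watchdog during multi-GPU startup.
process_group_kwargs = InitProcessGroupKwargs(timeout=timedelta(hours=2))
accelerator = Accelerator(kwargs_handlers=[process_group_kwargs])
```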

Full Changelog: v0.9.3.2...v0.9.3.3

v0.9.3.2

07 Apr 16:57
a9b8499

What's Changed

  • Added face crop_style option, using cv2 for face cropping (see the sketch below). Experimental.
  • Fixed some typos and made documentation updates.
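
The general shape of cv2-based face cropping, using OpenCV's bundled Haar cascade, looks roughly like this; a generic sketch, not the option's actual implementation (the margin parameter is illustrative):

```python
import cv2

def crop_to_face(image_path: str, margin: float = 0.25):
    """Detect the largest face with a Haar cascade and return a padded crop."""
    image = cv2.imread(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return image  # no face found; fall back to the full image
    x, y, w, h = max(faces, key=lambda box: box[2] * box[3])
    pad_w, pad_h = int(w * margin), int(h * margin)
    top, left = max(0, y - pad_h), max(0, x - pad_w)
    return image[top : y + h + pad_h, left : x + w + pad_w]
```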

Commits

  • doc updates, fixes, face detection cropping by @bghira in #345

Full Changelog: v0.9.3.1...v0.9.3.2

v0.9.3.1 - follow-up improvements for fp16/fp32 removal

07 Apr 14:25
0f8c00d

What's Changed

  • pesky vae cache bugs by @bghira in #342
  • NaN guards for VAE (see the sketch after this list)
  • Remove fp16 fully
  • Remove fp32 training options
  • Remove upcast logic
  • Add dreambooth guide
  • Fix 'instanceprompt' caption strategy
  • Disable multiprocessing by default to save memory by @bghira in #344
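
Conceptually, the VAE NaN guard is a finiteness check on the encoded latents before they are cached; a minimal sketch, assuming the latents arrive as a torch tensor:

```python
import torch

def guard_latents(latents: torch.Tensor, path: str) -> torch.Tensor:
    """Refuse to cache latents containing NaN/Inf values.

    Removing fp16 avoids most overflow, but a corrupt or extreme input can
    still yield non-finite latents that would silently poison training.
    """
    if not torch.isfinite(latents).all():
        raise ValueError(f"VAE produced non-finite latents for {path}")
    return latents
```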

Full Changelog: v0.9.3...v0.9.3.1