
v0.9.0-alpha2 - fixes for multi-backend training

Pre-release
@bghira released this 31 Dec 02:21
3b5781a

What's Changed

This is another alpha release following up on the multi-databackend work from v0.9.0-alpha.

Because this is a prerelease, use caution and keep backups of sensitive data.

Great care has been taken to ensure correctness in this release. Even so, it might be wise to start a new training run for this release series, because the way checkpoints are saved and loaded has changed extensively.

  • allow disabling backends
  • default noise scheduler should be euler
  • fix state tracker IDs by @bghira in #254
  • CogVLM: 4bit inference by default
  • Diffusers: bump to 0.25.0
  • MultiDataBackend: better support for epoch tracking across datasets.
  • MultiDataBackend: throw error and end training when global epoch != dataset epoch.
  • Logging: major reduction in debug noise
  • SDXL: fix num update steps per epoch calculations
  • SDXL: fix the displayed number of batches
  • SDXL: Correctness fixes for global_step handling by @bghira in #255
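
Two of the items above lend themselves to a short illustration: the per-epoch update-step calculation and the epoch consistency check across datasets. The following is a minimal Python sketch, not the project's actual code. The function names, the `EpochMismatchError` class, and the ceiling-division formula (a common convention in training loops that use gradient accumulation) are all assumptions made for illustration.

```python
import math
from typing import Mapping


class EpochMismatchError(RuntimeError):
    """Hypothetical error type: a dataset's epoch diverged from the global epoch."""


def num_update_steps_per_epoch(num_batches: int, gradient_accumulation_steps: int) -> int:
    """Count optimizer updates in one epoch.

    Assumes one optimizer update per accumulation window, with a trailing
    partial window still producing an update (hence the ceiling).
    """
    if num_batches < 0:
        raise ValueError("num_batches must be non-negative")
    if gradient_accumulation_steps < 1:
        raise ValueError("gradient_accumulation_steps must be at least 1")
    return math.ceil(num_batches / gradient_accumulation_steps)


def check_dataset_epochs(global_epoch: int, dataset_epochs: Mapping[str, int]) -> None:
    """Raise if any dataset reports an epoch different from the global epoch.

    `dataset_epochs` maps a (hypothetical) backend id to the epoch that
    backend believes it is on.
    """
    mismatched = {
        name: epoch for name, epoch in dataset_epochs.items() if epoch != global_epoch
    }
    if mismatched:
        raise EpochMismatchError(
            f"global epoch is {global_epoch}, but these datasets disagree: {mismatched}"
        )
```

Under these assumptions, 10 batches with 4 accumulation steps give 3 updates per epoch, and calling `check_dataset_epochs(2, {"a": 2, "b": 3})` raises, which a training loop could treat as the signal to stop rather than continue with inconsistent state.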

Full Changelog: v0.9.0-alpha...v0.9.0-alpha2