v0.1.23
What's Changed
- Improve and update docs by @EricLBuehler in #477
- Progress bar and logging when loading repeating layers by @EricLBuehler in #479
- Update deps by @EricLBuehler in #483
- Optimize decoding by removing redundant qkv transpose by @EricLBuehler in #487
- Fix and tweak docs and logging for local loading by @EricLBuehler in #489
- Add the Gemma 2 model by @EricLBuehler in #490
- Update demo video by @EricLBuehler in #491
- Utilize new quantize_onto QTensor API by @EricLBuehler in #492
- Update deps by @EricLBuehler in #493
- Bump version to 0.1.23 by @EricLBuehler in #495
Full Changelog: v0.1.22...v0.1.23