✨ New features and improvements
- Register models using catalogue to support external models in Auto{Decoder,Encoder,CausalLM} (#351, #352).
- Add support for loading parameters in-place (#370).
- Support for ELECTRA models (#358).
- Add support for write/upload operations with HFHubRepository (#354).
- Add support for converting Curated Transformer configs to HF-compatible configs (#333).
🔴 Bug fixes
- Support PyTorch 2.2 (#360).
⚠️ Backwards incompatibilities
- Support for TorchScript tracing is removed (#361).
- The qkv_split argument is now mandatory for AttentionHeads, AttentionHeads.uniform, AttentionHeads.multi_query, and AttentionHeads.key_value_broadcast (#374).
- All FromHFHub mixins are renamed to FromHF (#374).
- FromHF.convert_hf_state_dict is removed in favor of FromHF.state_dict_from_hf (#374).
👥 Contributors
@danieldk, @honnibal, @ines, @KennethEnevoldsen, @shadeMe