Releases: explosion/curated-transformers
v2.0.1: Fix Python 3.12.3 compatibility
v1.3.2: Fix Python 3.12.3 compatibility
🔴 Bug fixes
- Fix Python 3.12.3 activation lookup error (#377).
v2.0.0 (Superposition)
✨ New features and improvements
- Register models using `catalogue` to support external models in `Auto{Decoder,Encoder,CausalLM}` (#351, #352).
- Add support for loading parameters in-place (#370).
- Support for ELECTRA models (#358).
- Add support for write/upload operations with `HfHubRepository` (#354).
- Add support for converting Curated Transformers configs to HF-compatible configs (#333).
🔴 Bug fixes
- Support PyTorch 2.2 (#360).
⚠️ Backwards incompatibilities
- Support for TorchScript tracing is removed (#361).
- The `qkv_split` argument is now mandatory for `AttentionHeads`, `AttentionHeads.uniform`, `AttentionHeads.multi_query`, and `AttentionHeads.key_value_broadcast` (#374).
- All `FromHFHub` mixins are renamed to `FromHF` (#374).
- `FromHF.convert_hf_state_dict` is removed in favor of `FromHF.state_dict_from_hf` (#374).
v1.3.1 (Venusian 2)
🔴 Bug fixes
- Ensure that parameters are leaf nodes when loading a model (#364).
- Set the Torch upper bound to <2.1.0 (#363).
Note: we set the Torch upper bound to <2.1.0 because later versions introduced incompatible changes. Newer versions of Torch will be supported by Curated Transformers 2.0.0.
v1.3.0 (Venusian 1)
✨ New features and improvements
- Add support for model repositories other than Hugging Face Hub (#331).
- Add support for `fsspec` filesystems as a repository type (#327, #331).
- Add support for NVTX ranges (#320).
- Add a `config` property to models to query their configuration (#328).
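For example, a minimal sketch of querying the new `config` property; the BERT checkpoint name is just an example:

```python
from curated_transformers.models import AutoEncoder

# Load an encoder and inspect its configuration through the new property.
encoder = AutoEncoder.from_hf_hub(name="bert-base-uncased")
print(encoder.config)
```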
🔴 Bug fixes
- Fix a potential loading issue that may arise when a model's `dtype` is not set in the Hugging Face configuration (#330).
🏛️ Feature: Model repositories
The new (experimental) repository API adds support for loading models from repositories other than Hugging Face Hub. You can also add your own repository types by implementing the `Repository` interface. Using a repository is as easy as calling the new `from_repo` method that is provided by all models and tokenizers:
```python
from curated_transformers.models import AutoDecoder

decoder = AutoDecoder.from_repo(MyRepository("mpt-7b-my-qa"))
```
Curated Transformers comes with two repository classes out of the box:

- `HfHubRepository` downloads models from Hugging Face Hub and is now used by the `from_hf_hub` methods.
- `FsspecRepository` supports the wide range of filesystems provided by the fsspec package and third-party implementations.
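As an illustration, a minimal sketch of loading a decoder through an fsspec filesystem; the local path is hypothetical and assumed to contain a compatible Hugging Face-style checkpoint:

```python
import fsspec

from curated_transformers.models import AutoDecoder
from curated_transformers.repository import FsspecRepository

# Use the local filesystem; any fsspec-compatible filesystem
# (s3, gcs, ...) can be passed in the same way.
fs = fsspec.filesystem("file")
repo = FsspecRepository(fs, "/srv/models/falcon-7b")  # hypothetical path
decoder = AutoDecoder.from_repo(repo)
```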
v1.2.0 (Hypertension)
✨ New features and improvements
- Add support for safetensors checkpoints (#310).
- Add a `from_hf_hub_to_cache` method to `FromHFHub` mixins. This method downloads a model from Hugging Face Hub to the local cache without loading it (#303).
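A minimal sketch of pre-fetching a model this way; the checkpoint name is just an example:

```python
from curated_transformers.models import AutoDecoder

# Download checkpoint files to the local Hugging Face cache without
# constructing or loading the model itself.
AutoDecoder.from_hf_hub_to_cache(name="tiiuae/falcon-7b")
```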
v1.1.0 (Tetrachromacy)
v1.0.0 (Beginner's Luck)
Three weeks on the heels of our tech preview, we are excited to announce the first stable release of Curated Transformers! 🎉 From this release onwards, we provide a stable API following semantic versioning guidelines. Of course, this release is also packed with new features.
✨ New features and improvements since version 0.9.0
- Support Llama 2 (#263, #265).
- Support the Falcon new decoder architecture, used by the 40B parameter models (#253).
- Add support for ALiBi in the self-attention layer (#252) and support Falcon with ALiBi (#260).
- Support `torch.compile` (#257) and TorchScript tracing for all models (#262, #266).
- New logit transforms: vocabulary masking (#245) and top-p filtering (#255).
- Support authentication when downloading tokenizers from Hugging Face Hub (#267).
- Many improvements to the building blocks, such as shared configuration between models (#258) and shared encoder/decoder layers (#248). As a result, most model definitions are very short.
- The APIs and documentation were polished extensively to make the semantic versioning guarantees for 1.x.y possible.
v0.9.1: Falcon compatibility fix
v0.9.0 (Robot Stop)
We are very happy to announce this major new release of Curated Transformers! 🎉
Curated Transformers started as a small transformer library for spaCy pipelines. Over the last two months, we turned it into a pure PyTorch library that is completely independent of spaCy and Thinc. We also added support for popular LLMs, generation, 8-bit/4-bit quantization, and many other features:
- Curated Transformers is now a pure PyTorch library.
- Support for popular LLMs such as Falcon, LLaMA, and Dolly v2.
- Greedy generation and generation with sampling (see the sketch after this list).
- 8-bit and 4-bit quantization of models through `bitsandbytes`.
- Flash attention and other optimizations through PyTorch Scaled Dot Product Attention.
- Efficient model loading without unneeded allocations and initialization through the Torch `meta` device.
- Support for modern `tokenizer.json` tokenizers.
- Load models from Hugging Face Hub without requiring the `transformers` package.
- Extensive API documentation and examples.
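A minimal sketch of the generation API, assuming a CUDA device and using the Falcon instruct checkpoint as an example model:

```python
import torch

from curated_transformers.generation import AutoGenerator, GreedyGeneratorConfig

# Example checkpoint and device; any supported causal LM works.
generator = AutoGenerator.from_hf_hub(
    name="tiiuae/falcon-7b-instruct",
    device=torch.device("cuda", index=0),
)
# Greedy decoding; sampling is configured analogously with a sampling config.
print(generator(["What is spaCy?"], config=GreedyGeneratorConfig()))
```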
Curated Transformers can be used in spaCy via the `spacy-curated-transformers` package.