Releases: axolotl-ai-cloud/axolotl
v0.4.0
New Features (highlights)
- Streaming multipack for continued pre-training
- Mistral & Mixtral support
- Simplified Multipack for Mistral, Falcon, Qwen2, and Phi
- DPO/IPO/KTO-pairs RL-training support via trl
- Improved BatchSampler for multipack support: allows resuming from checkpoints and shuffles data each epoch
- bf16: auto support
- add MLFlow support
- save YAML configs to WandB
- save predictions during evals to WandB
- more tests! more smoke tests for smol model training
- NEFTune support
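As an illustration, several of the highlighted features are toggled through the training config. The sketch below shows hypothetical axolotl-style option names inferred from the feature names above (the exact keys are assumptions, not verbatim from this release):

```python
# Hypothetical sketch of a config enabling some v0.4.0 highlights.
# Key names (bf16, sample_packing, neftune_noise_alpha, rl) are
# assumptions modeled on the feature names above, not a verified
# axolotl schema.
config = {
    "bf16": "auto",                # "bf16: auto support": pick bf16 when hardware allows
    "sample_packing": True,        # pack multiple short sequences into one batch row
    "eval_sample_packing": False,  # packing is configurable separately for evals
    "neftune_noise_alpha": 5,      # NEFTune: add noise to embeddings during training
    "rl": "dpo",                   # DPO/IPO/KTO-pairs RL training via trl
}

for key, value in config.items():
    print(f"{key}: {value}")
```

In practice a config like this would live in a YAML file passed to the trainer; the dict form above is just for illustration.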
What's Changed
- document that packaging needs to be installed before flash-attn by @winglian in #559
- Fix pretraining with iterable/streaming Dataset by @jphme in #556
- Add training callback to send predictions to WandB table by @Glavin001 in #521
- fix wandb so mypy doesn't complain by @winglian in #562
- check for the existence of the default accelerate config that can create headaches by @winglian in #561
- add optimization for group-by-len by @winglian in #563
- gracefully handle length feature used for group by by @winglian in #565
- improve how we setup eval/save strategies and steps by @winglian in #547
- let hf trainer handle torch compile by @winglian in #516
- Model parallel by @winglian in #538
- fix save_steps so it doesn't get duplicated by @winglian in #567
- set auto for other params that hf trainer sets for ds. include zero1 json by @winglian in #570
- remove columns after tokenizing for pretraining by @winglian in #571
- mypy wandb ignore by @winglian in #572
- Phi examples by @winglian in #569
- e2e testing by @winglian in #574
- E2e device cuda by @winglian in #575
- E2e passing tests by @winglian in #576
- refactor scripts/finetune.py into new cli modules by @winglian in #550
- update support matrix with btlm and phi by @winglian in #579
- prevent cli functions from getting fired on import by @winglian in #581
- Fix Codellama examples by @Kimiko-AI in #582
- support custom field for completion from yml by @winglian in #580
- Feat(doc): Add features to doc by @NanoCode012 in #583
- Support Sample packing for phi arch by @winglian in #586
- don't resize embeddings if it's already large enough by @winglian in #577
- Enable full (non-sharded) model saving with SHARDED_STATE_DICT by @jphme in #584
- make phi training work with Loras by @winglian in #588
- optionally configure sample packing for evals by @winglian in #589
- don't add position_ids for evals when not using eval sample packing by @winglian in #591
- gather/broadcast the max value of the packing efficiency automatically by @winglian in #463
- Feat(data): Allow loading local csv and text by @NanoCode012 in #594
- add bf16 check by @winglian in #587
- btlm and falcon monkey patches for flash attn by @winglian in #566
- minor tweaks to simplify by @winglian in #597
- Fix for check with cfg and merge_lora by @winglian in #600
- improve handling for empty text on the tokenization step by @winglian in #502
- more sane defaults for openllama 3b used for quickstarts by @winglian in #602
- update dockerfile to not build evoformer since it fails the build by @winglian in #607
- Delete duplicate lines in models.py by @bofenghuang in #606
- support to disable exllama for gptq by @winglian in #604
- Update requirements.txt - Duplicated package by @Psancs05 in #610
- Only run tests when a change to python files is made by @maximegmd in #614
- Create multi-node.md by @maximegmd in #613
- fix distributed devices by @maximegmd in #612
- ignore wandb to resolve isort headaches by @winglian in #619
- skip the gpu memory checks if the device is set to 'auto' by @winglian in #609
- let MAX_JOBS use the default since we're not resource constrained on our self-hosted runners by @winglian in #427
- run eval on the first step to get a baseline by @winglian in #617
- split completion text to sequence_len by @winglian in #616
- misc fixes to add gptq tests by @winglian in #621
- chore(callback): Remove old peft saving code by @NanoCode012 in #510
- update README w deepspeed info by @winglian in #605
- create a model card with axolotl badge by @winglian in #624
- better handling and logging of empty sharegpt turns by @winglian in #603
- tweak: improve base builder for smaller layers by @maximegmd in #500
- Feat(doc): Add eval_sample_packing to doc by @NanoCode012 in #625
- Fix: Fail bf16 check when running on cpu during merge by @NanoCode012 in #631
- default model changed by @mhenrichsen in #629
- Added quotes to the pip install -e command in the documentation to fix an incompatibility … by @Nan-Do in #632
- Feat: Add support for upstream FA2 by @NanoCode012 in #626
- eval_table isn't quite stable enough to be in default llama configs by @winglian in #637
- attention_mask not needed for training by @winglian in #642
- update for recent transformers updates by @winglian in #636
- use fastchat conversations template by @winglian in #578
- skip some flash attn patches unless explicitly enabled by @winglian in #643
- Correct typos in datasets.py by @felixonmars in #639
- Fix bug in dataset loading by @ethanhs in #284
- Warn users to login to HuggingFace by @Napuh in #645
- Mistral flash attn packing by @winglian in #646
- Fix(cfg): Add validation for save_strategy and eval_strategy by @NanoCode012 in #633
- Feat: Add example for Mistral by @NanoCode012 in #644
- Add mistral/README.md by @adarshxs in #647
- fix for flash attn w mistral w/o sample packing by @winglian in #648
- don't strip the prompt for check since we don't strip to tokenize anymore by @winglian in #650
- add support for defined train split by @winglian in #654
- Fix bug when using pretokenized datasets by @ein-ich in https...
v0.3.0
What's Changed
- Fix sharegpt type in doc by @NanoCode012 in #202
- add support for optimum bettertransformers by @winglian in #92
- Use AutoTokenizer for redpajama example by @sroecker in #209
- issue #205 bugfix by @MaciejKarasek in #206
- Fix tokenizing labels by @winglian in #214
- add float16 docs and tweak typehints by @winglian in #212
- support adamw and grad norm hyperparams by @winglian in #215
- Fixing Data Readme by @msinha251 in #235
- don't fail fast by @winglian in #218
- better py3 support w pre-commit by @winglian in #241
- optionally define whether to use_fast tokenizer by @winglian in #240
- skip the system prompt by @winglian in #243
- push intermediate model checkpoints to hub by @winglian in #244
- System prompt data by @winglian in #224
- Add cfg.push_to_hub_model_id to readme by @NanoCode012 in #252
- Fix typing list in prompt tokenizer by @NanoCode012 in #249
- add option for instruct w sys prompts by @winglian in #246
- open orca support by @winglian in #255
- update pip install command for apex by @winglian in #247
- Fix future deprecation push_to_hub_model_id by @NanoCode012 in #258
- [WIP] Support loading data files from a local directory by @utensil in #221
- Fix(readme): local path loading and custom strategy type by @NanoCode012 in #264
- don't use llama if trust_remote_code is set since that needs to use AutoModel path by @winglian in #266
- params are adam_, not adamw_ by @winglian in #268
- Quadratic warmup by @winglian in #271
- support for loading a model by git revision by @winglian in #272
- Feat(docs): Add model_revision arg by @NanoCode012 in #273
- Feat: Add save_safetensors by @NanoCode012 in #275
- Feat: Set push to hub as private by default by @NanoCode012 in #274
- Allow non-default dataset configurations by @cg123 in #277
- Feat(readme): improve docs on multi-gpu by @NanoCode012 in #279
- Update requirements.txt by @teknium1 in #280
- Logging update: added PID and formatting by @theobjectivedad in #276
- git fetch fix for docker by @winglian in #283
- misc fixes by @winglian in #286
- fix axolotl training args dataclass annotation by @winglian in #287
- fix(readme): remove accelerate config by @NanoCode012 in #288
- add hf_transfer to requirements for faster hf upload by @winglian in #289
- Fix(tokenizing): Use multi-core by @NanoCode012 in #293
- Pytorch 2.0.1 by @winglian in #300
- Fix(readme): Improve wording for push model by @NanoCode012 in #304
- add apache 2.0 license by @winglian in #308
- Flash attention 2 by @winglian in #299
- don't resize embeddings to multiples of 32x by default by @winglian in #313
- Add XGen info to README and example config by @ethanhs in #306
- better handling since xgen tokenizer breaks with convert_tokens_to_ids by @winglian in #307
- add runpod envs to .bashrc, fix bnb env by @winglian in #316
- update prompts for open orca to match the paper by @winglian in #317
- latest HEAD of accelerate causes 0 loss immediately w FSDP by @winglian in #321
- Prune cuda117 by @winglian in #327
- update README for updated docker images by @winglian in #328
- fix FSDP save of final model by @winglian in #329
- pin accelerate so it works with llama2 by @winglian in #330
- add peft install back since it doesn't get installed by setup.py by @winglian in #331
- lora/qlora w flash attention fixes by @winglian in #333
- feat/llama-2 examples by @mhenrichsen in #319
- update README by @tmm1 in #337
- Fix flash-attn + qlora not working with llama models by @tmm1 in #336
- optimize the iteration when tokenizing large datasets by @winglian in #332
- Added Orca Mini prompt strategy by @jphme in #263
- Update XFormers Attention Monkeypatch to handle Llama-2 70B (GQA) by @ssmi153 in #339
- add a basic ds zero3 config by @winglian in #347
- experimental llama 2 chat support by @jphme in #296
- ensure enable_input_require_grads is called on model before getting the peft model by @winglian in #345
- set group_by_length to false in all examples by @tmm1 in #350
- GPU memory usage logging by @tmm1 in #354
- simplify load_model signature by @tmm1 in #356
- Clarify pre-tokenize before multigpu by @NanoCode012 in #359
- Update README.md on pretraining_dataset by @NanoCode012 in #360
- bump to latest bitsandbytes release with major bug fixes by @tmm1 in #355
- feat(merge): save tokenizer on merge by @NanoCode012 in #362
- Feat: Add rope scaling by @NanoCode012 in #343
- Fix(message): Improve error message for bad format by @NanoCode012 in #365
- fix(model loading): warn when model revision is passed to gptq by @NanoCode012 in #364
- Add wandb_entity to wandb options, update example configs, update README by @morganmcg1 in #361
- fix(save): save as safetensors by @NanoCode012 in #363
- Attention mask and position id fixes for packing by @winglian in #285
- attempt to run non-base docker builds on regular cpu hosts by @winglian in #369
- revert previous change and build ax images w docker on gpu by @winglian in #371
- extract module for working with cfg by @tmm1 in #372
- quiet noise from llama tokenizer by setting pad token earlier by @tmm1 in #374
- improve GPU logging to break out pytorch cache and system mem by @tmm1 in #376
- simplify load_tokenizer by @tmm1 in #375
- fix check for flash attn branching by @w...
v0.2.1
What's Changed
- docker fixes: py310, fix cuda arg in deepspeed by @winglian in #115
- add support for gradient accumulation steps by @winglian in #123
- split up llama model loading so config can be loaded from base config and models can be loaded from a path by @winglian in #120
- copy xformers attn from ooba since we removed dep on alpaca_lora_4bit by @winglian in #124
- Fix(readme): Fix torch missing from readme by @NanoCode012 in #118
- Add accelerate dep by @winglian in #114
- Feat(inference): Swap to GenerationConfig by @NanoCode012 in #119
- add py310 support from base image by @winglian in #127
- add badge info to readme by @winglian in #129
- fix packing so that concatenated sequences reset the attention by @winglian in #131
- swap batch size for gradient accumulation steps to decouple from num gpu by @winglian in #130
- fix batch size calculation by @winglian in #134
- Fix: Update doc for grad_accu and add validation tests for batch size by @NanoCode012 in #135
- Feat: Add lambdalabs instruction by @NanoCode012 in #141
- Feat: Add custom prompt readme and add missing prompt strategies to Readme by @NanoCode012 in #142
- added docker-compose file by @FarisHijazi in #146
- Update README.md for correct image tags by @winglian in #147
- fix device map by @winglian in #148
- clone in docker by @winglian in #149
- new prompters, misc fixes for output dir missing using fsdp, and changing max seq len by @winglian in #155
- fix camel ai, add guanaco/oasst mapping for sharegpt by @winglian in #158
- Fix: Update peft and gptq instruction by @NanoCode012 in #161
- Fix: Move custom prompts out of hidden by @NanoCode012 in #162
- Fix future deprecate prepare_model_for_int8_training by @NanoCode012 in #143
- Feat: Set matmul tf32=True when tf32 passed by @NanoCode012 in #163
- Fix: Validate falcon with fsdp by @NanoCode012 in #164
- Axolotl supports falcon + qlora by @utensil in #132
- Fix: Set to use cfg.seed or 42 for seed by @NanoCode012 in #166
- Fix: Refactor out unmodified save_steps and eval_steps by @NanoCode012 in #167
- Disable Wandb if no wandb project is specified by @bratao in #168
- Feat: Improve lambda labs instruction by @NanoCode012 in #170
- Fix falcon support lora by @NanoCode012 in #171
- Feat: Add landmark attention by @NanoCode012 in #169
- Fix backward compat for peft by @NanoCode012 in #176
- Update README.md to reflect current gradient checkpointing support by @PocketDocLabs in #178
- fix for max sequence len across different model types by @winglian in #179
- Add streaming inference & fix stopping at EOS by @Glavin001 in #180
- add support to extend context with xpos rope by @winglian in #181
- fix for local variable 'LlamaForCausalLM' referenced before assignment by @winglian in #182
- pass a prompt in from stdin for inference by @winglian in #183
- Update FAQS.md by @akj2018 in #186
- various fixes by @winglian in #189
- more config pruning and migrating by @winglian in #190
- Add save_steps and eval_steps to Readme by @NanoCode012 in #191
- Fix config path after config moved by @NanoCode012 in #194
- Fix training over existing lora by @AngainorDev in #159
- config fixes by @winglian in #193
- misc fixes by @winglian in #192
- Fix landmark attention patch by @NanoCode012 in #177
- peft no longer needs device_map by @winglian in #187
- chore: Fix inference README. by @mhenrichsen in #197
- Update README.md to include a community showcase by @PocketDocLabs in #200
- chore: Refactor inf_kwargs out by @NanoCode012 in #199
- tweak config to work by @winglian in #196
New Contributors
- @FarisHijazi made their first contribution in #146
- @utensil made their first contribution in #132
- @bratao made their first contribution in #168
- @PocketDocLabs made their first contribution in #178
- @Glavin001 made their first contribution in #180
- @akj2018 made their first contribution in #186
- @AngainorDev made their first contribution in #159
- @mhenrichsen made their first contribution in #197
Full Changelog: v0.2.0...v0.2.1
v0.2.0
What's Changed
- Add pre-commit: black+flake8+pylint+mypy+isort+bandit by @NanoCode012 in #98
- Qlora openllama 3b example by @fearnworks in #106
- Viktoriussuwandi patch by @viktoriussuwandi in #105
- default to qlora support, make gptq specific image by @winglian in #108
New Contributors
- @fearnworks made their first contribution in #106
- @viktoriussuwandi made their first contribution in #105
Full Changelog: v0.1.0...v0.2.0
current "Stable"
v0.1.0 Merge pull request #111 from OpenAccess-AI-Collective/sharegpt-token-…