[WIP]Feat/refactor3 #2030

Open · wants to merge 122 commits into base: main
Commits (122)
e7b9580
first commit
tastelikefeet Sep 11, 2024
d52a58a
Merge commit '2d1aba96281c8f646881427fa857388b07fdcbef' into feat/ref…
tastelikefeet Sep 11, 2024
2906d4f
remove aigc module, and remove hub files
tastelikefeet Sep 11, 2024
f04db7d
commit missing file
tastelikefeet Sep 11, 2024
6072eb7
remove rome
tastelikefeet Sep 11, 2024
4092b71
add a hub file to load all hub related operations
tastelikefeet Sep 11, 2024
4bf3acf
model refactoring, wip, now move safe_snapshot_download to single fil…
tastelikefeet Sep 11, 2024
d813b61
wip, begin to use classes instead of register
tastelikefeet Sep 12, 2024
072e25a
give up class
tastelikefeet Sep 12, 2024
ff6ca11
wip
tastelikefeet Sep 13, 2024
0929e37
Merge commit 'e46cda27abc5122402394b50e03b3d61ec04d0dc' into feat/ref…
tastelikefeet Sep 13, 2024
56285a9
merge main
tastelikefeet Sep 13, 2024
9a0ca82
refactor model.py
tastelikefeet Sep 13, 2024
1a5d9fe
move datasets and infer to single packages
tastelikefeet Sep 13, 2024
3cb932e
refactor preprocessor, wip
tastelikefeet Sep 14, 2024
800c86a
Merge commit '3028145d966d3dac038a2f844e2d3aca022d1348' into feat/ref…
tastelikefeet Sep 15, 2024
55d2d7b
refactor preprocessor, unfinished
tastelikefeet Sep 15, 2024
ced4d0d
Merge commit 'd12f9bce106580c4719083608fb18041686258ff' into feat/ref…
tastelikefeet Sep 16, 2024
f623f42
refactor preprocessor
tastelikefeet Sep 16, 2024
d690794
split dataset loader and register into two single files
tastelikefeet Sep 16, 2024
2bd4d1e
split register and loader into two files and refactor the dataset loa…
tastelikefeet Sep 16, 2024
f6980c3
refactor datasets, turn query/response to messages, and wrap all prep…
tastelikefeet Sep 17, 2024
08bd05b
refactor args, split them into multiple classes
tastelikefeet Sep 18, 2024
75ebd2e
refactor arguments, split to difference dataclasses
tastelikefeet Sep 20, 2024
6308520
commit missing files
tastelikefeet Sep 20, 2024
d7eb06f
refactor template:
tastelikefeet Sep 21, 2024
58d39bf
1. create some plugin files
tastelikefeet Sep 22, 2024
69745aa
move optimizers to plugin
tastelikefeet Sep 22, 2024
23496e2
Support tuner plugin
tastelikefeet Sep 22, 2024
4e50bbb
Support trainer plugin
tastelikefeet Sep 22, 2024
168b433
args with no compile error
tastelikefeet Sep 23, 2024
9cfd43e
dataset compile passed
tastelikefeet Sep 23, 2024
38c6791
fix imports of most modified files
tastelikefeet Sep 23, 2024
7c88723
1. fix imports
tastelikefeet Sep 23, 2024
c48578a
pass dtype and device to template.encode
tastelikefeet Sep 23, 2024
057e035
fix imports
tastelikefeet Sep 23, 2024
53447b7
refactor inference, unfinished, need to merge pt/vllm/lmdeploy stream…
tastelikefeet Sep 23, 2024
f62447f
wip
tastelikefeet Sep 23, 2024
7ca7dbe
unfinished work for inference, wrap and split long functions into classes
tastelikefeet Sep 24, 2024
5e231c7
inference seems ok to run ...
tastelikefeet Sep 24, 2024
1f7a739
fix imports
tastelikefeet Sep 24, 2024
ece80f6
fix runtime error
tastelikefeet Sep 24, 2024
6afda57
run, wip
tastelikefeet Sep 24, 2024
25b7433
replace dataset creation function
tastelikefeet Sep 24, 2024
e3ed386
first case ok
tastelikefeet Sep 24, 2024
9df553f
wip test
tastelikefeet Sep 24, 2024
b9cc1ce
wip, mm ok for some datasets
tastelikefeet Sep 24, 2024
3f6d6c1
fix some running problems
tastelikefeet Sep 25, 2024
ca2593a
split template from other folders
tastelikefeet Sep 25, 2024
c38b426
fix
tastelikefeet Sep 25, 2024
8942e5c
move get_template to base
tastelikefeet Sep 25, 2024
645ec7f
fix template
tastelikefeet Sep 26, 2024
55d2491
fix docs
tastelikefeet Sep 26, 2024
2ed507b
fix
tastelikefeet Sep 26, 2024
6146861
fix
tastelikefeet Sep 27, 2024
592c78d
fix
tastelikefeet Sep 27, 2024
a80705c
fix
tastelikefeet Sep 27, 2024
929d022
fix
tastelikefeet Sep 28, 2024
b25d238
fix
tastelikefeet Sep 29, 2024
c43450b
fix
tastelikefeet Sep 29, 2024
93c282c
fix
tastelikefeet Sep 29, 2024
2e68a4b
lint pass
Jintao-Huang Oct 4, 2024
39aed86
refactor3 (#2191)
Jintao-Huang Oct 4, 2024
bb1eced
refactor args (#2193)
Jintao-Huang Oct 4, 2024
ae4de6d
remove model_type, use model_arch (#2195)
Jintao-Huang Oct 5, 2024
5b224ca
update llama model_type & update_template (ing) (#2197)
Jintao-Huang Oct 6, 2024
c77e670
update template (#2201)
Jintao-Huang Oct 7, 2024
70320ec
move agent (#2206)
Jintao-Huang Oct 8, 2024
b381467
refactor loss scale to fit train other parts
tastelikefeet Oct 8, 2024
8d0df03
support mergekit
tastelikefeet Oct 9, 2024
ffcb916
update template (#2215)
Jintao-Huang Oct 9, 2024
41041df
update
Jintao-Huang Oct 10, 2024
794240f
update template
Jintao-Huang Oct 10, 2024
47a33a5
update
Jintao-Huang Oct 10, 2024
55583d5
update template
Jintao-Huang Oct 10, 2024
893e3d8
update
Jintao-Huang Oct 10, 2024
e21abca
fix loss scale
tastelikefeet Oct 10, 2024
ad0abd6
update
Jintao-Huang Oct 10, 2024
0fe18b8
Merge remote-tracking branch 'refs/remotes/origin/feat/refactor3' int…
Jintao-Huang Oct 10, 2024
43edaea
update
Jintao-Huang Oct 10, 2024
2afad8e
update
Jintao-Huang Oct 10, 2024
1a9aef1
update vllm infer
Jintao-Huang Oct 11, 2024
20d8bc7
update
Jintao-Huang Oct 11, 2024
55ff83c
update
Jintao-Huang Oct 11, 2024
fb54b31
update
Jintao-Huang Oct 11, 2024
1685615
update
Jintao-Huang Oct 11, 2024
7ba7b8c
update
Jintao-Huang Oct 12, 2024
81feb78
update
Jintao-Huang Oct 12, 2024
ffd3c5d
update
Jintao-Huang Oct 12, 2024
db56eb2
fix imports
tastelikefeet Oct 12, 2024
1c48680
Merge commit 'ffd3c5d5312bbdc3adf71ee6b157eef66ad7a28c' into feat/ref…
tastelikefeet Oct 12, 2024
39ccc61
fix
tastelikefeet Oct 12, 2024
f634d3c
update torch_dtype model_type
Jintao-Huang Oct 12, 2024
6000ebf
update
Jintao-Huang Oct 12, 2024
53b5e5f
Merge remote-tracking branch 'refs/remotes/origin/feat/refactor3' int…
Jintao-Huang Oct 12, 2024
5e79c46
fix
tastelikefeet Oct 12, 2024
68ecc8e
update
Jintao-Huang Oct 14, 2024
ed06221
update
Jintao-Huang Oct 14, 2024
20029cc
support vllm infer
Jintao-Huang Oct 15, 2024
16e21b8
update
Jintao-Huang Oct 15, 2024
7931ee1
support lmdeploy infer
Jintao-Huang Oct 15, 2024
db692d2
lint pass
Jintao-Huang Oct 15, 2024
9fdb7b7
support lmdeploy
Jintao-Huang Oct 15, 2024
1efebe7
update
Jintao-Huang Oct 15, 2024
a3a75a2
fix
Jintao-Huang Oct 15, 2024
762f6b0
fix
Jintao-Huang Oct 16, 2024
02f9891
update
Jintao-Huang Oct 16, 2024
939643a
update
Jintao-Huang Oct 17, 2024
797ece3
support batched pt infer
Jintao-Huang Oct 17, 2024
72e270a
update
Jintao-Huang Oct 18, 2024
a0afc51
update
Jintao-Huang Oct 18, 2024
d31cb0c
update
Jintao-Huang Oct 18, 2024
e2a024c
update
Jintao-Huang Oct 18, 2024
53142c3
update
Jintao-Huang Oct 18, 2024
5bf4125
support stream logprobs
Jintao-Huang Oct 18, 2024
27d4e28
update
Jintao-Huang Oct 18, 2024
3bce7be
fix pt logprobs stream
Jintao-Huang Oct 18, 2024
2459dc1
update
Jintao-Huang Oct 19, 2024
d67423c
support pt lora
Jintao-Huang Oct 20, 2024
dc719ee
update infer cli
Jintao-Huang Oct 20, 2024
205b13e
update
Jintao-Huang Oct 20, 2024
e38d15d
update dataset
Jintao-Huang Oct 21, 2024
2 changes: 1 addition & 1 deletion MANIFEST.in
@@ -2,4 +2,4 @@ recursive-include swift/utils *.py
 recursive-include swift/llm/data *.*
 recursive-include swift/llm/ds_config *.json
 recursive-include requirements *.txt
-recursive-include swift/llm/agent *.json
+recursive-include swift/llm/template/agent *.json
2 changes: 1 addition & 1 deletion README.md
@@ -121,7 +121,7 @@ You can contact us and communicate with us by adding our group:
 - 2024.04.29: Supports inference and fine-tuning of the InternVL-Chat-V1.5 model. For best practice, refer to [here](https://github.com/modelscope/swift/tree/main/docs/source_en/Multi-Modal/internvl-best-practice.md).
 - 🔥2024.04.26: Support **LISA** and **unsloth** training! Specify `--lisa_activated_layers=2` to use LISA (reducing memory cost to 30 percent!), or specify `--tuner_backend unsloth` to use unsloth to train a huge model (full or LoRA) with less memory (30 percent or less) and faster speed (5x)!
 - 🔥2024.04.26: Support fine-tuning and inference of the Qwen1.5-110B and Qwen1.5-110B-Chat models; use [this script](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/qwen1half_110b_chat/lora_ddp_ds/sft.sh) to start training!
-- 2024.04.24: Support for inference and fine-tuning of Phi3 series models. Including: [phi3-4b-4k-instruct](examples/pytorch/llm/scripts/phi3_4b_4k_instruct/lora), phi3-4b-128k-instruct.
+- 2024.04.24: Support for inference and fine-tuning of Phi3 series models. Including: [phi3-4b-4k-instruct](legacy/pytorch/llm/scripts/phi3_4b_4k_instruct/lora), phi3-4b-128k-instruct.
 - 2024.04.22: Support for inference, fine-tuning, and deployment of the **chinese-llama-alpaca-2** series models. This includes: chinese-llama-2-1.3b, chinese-llama-2-7b, chinese-llama-2-13b, chinese-alpaca-2-1.3b, chinese-alpaca-2-7b and chinese-alpaca-2-13b, along with their corresponding 16k and 64k long-text versions.
 - 2024.04.22: Support for inference and fine-tuning of the Llama3 GPTQ-Int4, GPTQ-Int8, and AWQ series models. Support for inference and fine-tuning of chatglm3-6b-128k and Openbuddy-Llama3.
 - 2024.04.20: Support for inference, fine-tuning, and deployment of the **Atom** series models. This includes: Atom-7B and Atom-7B-Chat. Use [this script](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/atom_7b_chat/lora/sft.sh) to train.
2 changes: 1 addition & 1 deletion README_CN.md
@@ -121,7 +121,7 @@ SWIFT has rich and comprehensive documentation; please visit our documentation site:
 - 2024.04.29: Support inference and fine-tuning of InternVL-Chat-V1.5; for best practice see [here](https://github.com/modelscope/swift/tree/main/docs/source/Multi-Modal/internvl最佳实践.md).
 - 🔥2024.04.26: Support **LISA** and **unsloth** training! Specify `--lisa_activated_layers=2` to enable LISA (reducing memory usage to 30% of full-parameter training), or specify `--tuner_backend unsloth` to use unsloth to train a huge model with less memory (30% or less) and faster speed (5x)!
 - 🔥2024.04.26: Support inference and fine-tuning of the Qwen1.5-110B and Qwen1.5-110B-Chat models; use [this script](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/qwen1half_110b_chat/lora_ddp_ds/sft.sh) to start training!
-- 2024.04.24: Support inference and fine-tuning of the Phi3 series models, including [phi3-4b-4k-instruct](examples/pytorch/llm/scripts/phi3_4b_4k_instruct/lora) and phi3-4b-128k-instruct.
+- 2024.04.24: Support inference and fine-tuning of the Phi3 series models, including [phi3-4b-4k-instruct](legacy/pytorch/llm/scripts/phi3_4b_4k_instruct/lora) and phi3-4b-128k-instruct.
 - 2024.04.22: Support inference, fine-tuning and deployment of the **chinese-llama-alpaca-2** series models, including chinese-llama-2-1.3b, chinese-llama-2-7b, chinese-llama-2-13b, chinese-alpaca-2-1.3b, chinese-alpaca-2-7b and chinese-alpaca-2-13b, as well as the corresponding 16k and 64k long-text versions.
 - 2024.04.22: Support inference and fine-tuning of the Llama3 GPTQ-Int4, GPTQ-Int8 and AWQ series models, and of chatglm3-6b-128k and Openbuddy-llama3.
 - 2024.04.20: Support inference, fine-tuning and deployment of the **Atom** series models, including Atom-7B and Atom-7B-Chat. Use [this script](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/atom_7b_chat/lora/sft.sh) to start training!
4 changes: 4 additions & 0 deletions TEST_POINTS.md
@@ -0,0 +1,4 @@
+- All examples need to be tested
+- All model upload/download and dataset loading paths
+- The Internlm-xcomposer2 hook
+- The tagengo-gpt4, vqa-v2, gqa and grit datasets
279 changes: 0 additions & 279 deletions docs/source/AIGC/AnimateDiff微调推理文档.md

This file was deleted.

3 changes: 2 additions & 1 deletion docs/source/GetStarted/使用tuners.md
@@ -15,7 +15,8 @@ A tuner is an extra structure attached to the model, used to reduce the number of trainable parameters
 11. Vision Prompt Tuning: [Visual Prompt Tuning](https://arxiv.org/abs/2203.12119)
 12. Side: [Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks](https://arxiv.org/abs/1912.13503)
 13. Res-Tuning: [Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone](https://arxiv.org/abs/2310.19859) < [arXiv](https://arxiv.org/abs/2310.19859) | [Project Page](https://res-tuning.github.io/) | [Usage](ResTuning.md) >
-14. Tuners provided by [PEFT](https://github.com/huggingface/peft), such as IA3 and AdaLoRA
+14. ReFT: [ReFT: Representation Finetuning for Language Models](https://arxiv.org/pdf/2404.03592)
+15. Tuners provided by [PEFT](https://github.com/huggingface/peft), such as IA3 and AdaLoRA

 ## Use in training

4 changes: 3 additions & 1 deletion docs/source/Instruction/LLM微调文档.md
@@ -204,10 +204,12 @@ import os
 os.environ['CUDA_VISIBLE_DEVICES'] = '0'

 from swift.llm import (
-    get_model_tokenizer, get_template, inference, ModelType, get_default_template_type
+    get_model_tokenizer, get_template, ModelType, get_default_template_type, TransformersFramework, InferArguments
 )
 from swift.tuners import Swift

+infer_framework = TransformersFramework()
+
 ckpt_dir = 'vx-xxx/checkpoint-100'
 model_type = ModelType.qwen_7b_chat
 template_type = get_default_template_type(model_type)
1 change: 1 addition & 0 deletions docs/source/Instruction/index.md
@@ -12,3 +12,4 @@
 6. [Command-line parameters](命令行参数.md)
 7. [Supported models and datasets](支持的模型和数据集.md)
 8. [Customization and extension](自定义与拓展.md)
+9. [FAQ](LLM&VLM训练、推理、部署、评测常见问题.md)
7 changes: 1 addition & 6 deletions docs/source/Instruction/命令行参数.md
@@ -136,12 +136,7 @@
 - `--gpu_memory_fraction`: Default `None`. Runs training with the specified maximum fraction of GPU memory available, used for stress testing.
 - `--train_dataset_mix_ratio`: Default `0.`. Defines how datasets are mixed during training. When set, the training set is mixed with `train_dataset_mix_ratio` times as much of the general-knowledge datasets specified by `train_dataset_mix_ds`. Deprecated; use `--dataset` for dataset mixing.
 - `--train_dataset_mix_ds`: Default `['ms-bench']`. General-knowledge datasets used to prevent knowledge forgetting. Deprecated; use `--dataset` for dataset mixing.
-- `--use_loss_scale`: Default `False`. When enabled, strengthens the loss weight of certain Agent fields (the Action/Action Input parts) to reinforce CoT; it has no effect in plain SFT scenarios.
-- `--loss_scale_config_path`: Specifies a custom loss_scale configuration, used when `use_loss_scale` is enabled, e.g. to amplify the loss weights of Action and other key ReAct fields in Agent training.
-  - In the config file, loss_scale is set as a dictionary. Each key is a field name, and its value sets the loss-scale multipliers for that field and the content following it. For example, with `"Observation:": [2, 0]`, when the response contains `xxxx Observation:error`, the loss on the `Observation:` field is doubled and the loss on the `error` part is not counted. Besides literal matching, regex rules are supported for more flexible matching; for instance, the pattern `'<.*?>': [2.0]` doubles the loss on everything wrapped in angle brackets. The multipliers for field matches and regex matches are given as lists of length 2 and 1, respectively.
-  - Matching the query to set a loss_scale for the whole response is also supported, which is very useful for the fixed multi-turn dialogue queries described in the [Agent-FLAN](https://arxiv.org/abs/2403.12881) paper: if the query contains any of the predefined keys, the corresponding response uses the associated loss_scale value. See `swift/llm/agent/agentflan.json` for reference.
-  - By default, loss-scale values are preset for the fields Action:, Action Input:, Thought:, Final Answer:, and Observation:. Default configurations are also provided for [alpha-umi](https://arxiv.org/pdf/2401.07324) and [Agent-FLAN](https://arxiv.org/abs/2403.12881); set the option to `alpha-umi` or `agent-flan` to use them. The default config files live under `swift/llm/agent`.
-  - Matching rules apply in priority order, from highest to lowest: query fields > response-specific fields > regex rules.
+- `--loss_scale`: Default `default`. Currently supports `default` (no scaling), `agentflan` (different loss weights for agentflan-format datasets) and `react` (for the Action/Action Input/Final Answer fields, etc.); users can also define a custom loss scale through the plugin mechanism.
 - `--custom_register_path`: Default `None`. Pass a `.py` file used to register templates, models and datasets.
 - `--custom_dataset_info`: Default `None`. Pass the path of an external dataset_info.json, a JSON string or a dict, used to extend datasets. Format reference: https://github.com/modelscope/swift/blob/main/swift/llm/data/dataset_info.json
 - `--device_map_config`: Manually configures the model's device_map; default `None`. You can pass a local path (.json), a JSON string or a dict.
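The removed lines above document the dictionary format that `--loss_scale_config_path` consumed before this refactor. As a minimal sketch of that format, assuming only the length-2/length-1 list convention the removed docs describe (the field names and multipliers here are illustrative, not a shipped default config):

```python
import json

# Sketch of a loss_scale config in the dictionary format described above
# (illustrative values, not an actual shipped config).
# A length-2 list [field_scale, content_scale] applies to a literal field
# match; a length-1 list [scale] applies to a regex match.
loss_scale_map = {
    'Observation:': [2.0, 0.0],  # double the field's loss; ignore the loss of what follows it
    'Action:': [2.0, 2.0],       # double the loss of both the field and its content
    '<.*?>': [2.0],              # regex rule: double the loss of anything in angle brackets
}

# The config would be passed as a JSON file via --loss_scale_config_path.
with open('my_loss_scale.json', 'w') as f:
    json.dump(loss_scale_map, f, ensure_ascii=False, indent=2)
```

In the new scheme this per-field configuration moves behind `--loss_scale`, with custom behavior supplied through the plugin mechanism instead of a standalone config path.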