I fine-tuned a model with LoRA and it produced a lot of files — how do I use the fine-tuned model? #97
Replies: 7 comments 1 reply
-
Same question. After training with the SFT or LoRA scripts, how do we run chat or inference with the resulting model? It seems model.chat can no longer be used directly — do the inputs need some preprocessing first?
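A minimal sketch of one way to do this, assuming the adapter was trained and saved with the PEFT library; the base-model name `THUDM/chatglm-6b` and the adapter path `./lora_output` are placeholder assumptions, not taken from the thread:

```python
# Sketch: load a LoRA adapter on top of the base model for inference.
# "THUDM/chatglm-6b" and "./lora_output" are placeholder names.
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
base = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()

# Wrap the base model with the trained adapter weights.
model = PeftModel.from_pretrained(base, "./lora_output")
model.eval()

# model.chat is a ChatGLM-specific helper; PeftModel forwards unknown
# attributes to the wrapped model, so it should still be reachable here.
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
```

If `model.chat` is not forwarded in your PEFT version, calling `model.base_model.model.chat(...)` on the unwrapped model is an alternative.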
-
At 15000 steps the dataset probably hasn't even been iterated through once, has it? If batch_size is 1 and you're using a single card, how did you arrive at epoch=8? I'm also fine-tuning right now and it hasn't finished, so I don't know how the results will turn out. Once we both have results we can compare.
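For reference, the relation between steps and epochs can be checked with a small calculation (the dataset size and gradient-accumulation values below are made-up examples, not numbers from this thread):

```python
# How many optimizer steps does one epoch take?
# All concrete numbers here are illustrative.
def steps_per_epoch(dataset_size, per_device_batch_size, num_gpus=1, grad_accum=1):
    effective_batch = per_device_batch_size * num_gpus * grad_accum
    # Ceiling division: a partial final batch still costs one step.
    return -(-dataset_size // effective_batch)

# With batch_size=1 on a single card, 15000 steps cover 15000 examples,
# so a dataset larger than that has not finished one epoch yet.
print(steps_per_epoch(50000, 1))              # → 50000
print(steps_per_epoch(50000, 2, num_gpus=4))  # → 6250
```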
-
This is the model I trained — the output is also very bad, and I don't know why.
-
I'm using 4 cards with --per_device_train_batch_size 2, and the LoRA training results are very poor.
-
Has anyone tried full fine-tuning? Roughly how much GPU memory does it need in float32?
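As a rough back-of-the-envelope answer (assuming a 6B-parameter model and vanilla Adam; the 6B figure is an assumption, and real usage also includes activations, which vary with batch size and sequence length):

```python
# Rough full-fine-tuning memory estimate in float32 with Adam.
# Per parameter: 4 B weights + 4 B gradients + 8 B Adam moments = 16 B.
# Activation memory is extra and not counted here.
def full_ft_gib(n_params, bytes_per_param=16):
    return n_params * bytes_per_param / 2**30

print(f"{full_ft_gib(6e9):.0f} GiB")  # → 89 GiB before activations
```

So a single 24 GB or even 80 GB card is not enough for plain float32 full fine-tuning at this scale without techniques like ZeRO sharding or CPU offload.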
-
Feature request
I fine-tuned with LoRA and it produced the files below — is there example code for running prediction with the fine-tuned model together with the original model?
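One option (a sketch, assuming the files are a PEFT-saved adapter; the model name and paths are placeholder assumptions) is to merge the LoRA weights back into the base model, so the result can be loaded and used like an ordinary checkpoint:

```python
# Sketch: merge LoRA weights into the base model and save a standalone copy.
# "THUDM/chatglm-6b" and both paths are placeholder assumptions.
from transformers import AutoModel
from peft import PeftModel

base = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = PeftModel.from_pretrained(base, "./lora_output")

# merge_and_unload folds the low-rank updates into the base weights
# and returns a plain transformers model with no PEFT wrapper.
merged = model.merge_and_unload()
merged.save_pretrained("./merged_model")
```

The merged directory can then be loaded with `AutoModel.from_pretrained("./merged_model", trust_remote_code=True)` with no PEFT dependency at inference time.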