[Bug]: Incorrect output when running inference on "解释一下温故而知新" with the Llama-2-13b model in PaddleNLP #9587

Open
gqzhang-ai opened this issue Dec 9, 2024 · 1 comment
Labels: bug (Something isn't working)

@gqzhang-ai

Software environment

paddlenlp                 3.0.0b0
paddlenlp_ops             0.0.0
paddlepaddle-gpu          3.0.0.dev20240927

The PaddleNLP repository was at commit db270d91b7db700c5716fd610730d157bdb56891.

Hardware environment (nvidia-smi header; the per-GPU rows were not included in the report):
NVIDIA-SMI 535.183.01, Driver Version 535.183.01, CUDA Version 12.2

Duplicate check

  • I have searched the existing issues

Bug description

On a 4-GPU setup, using the provided weights for meta-llama/Llama-2-13b, running inference on "解释一下温故而知新" (roughly: "explain the saying 'review the old to learn the new'") produces the following result:

[2024-12-09 10:31:09,199] [    INFO] - We are using <class 'paddlenlp.transformers.llama.configuration.LlamaConfig'> to load 'meta-llama/Llama-2-13b'.
[2024-12-09 10:31:09,199] [    INFO] - Loading configuration file /root/.paddlenlp/models/meta-llama/Llama-2-13b/config.json
[2024-12-09 10:31:09,200] [    INFO] - Loading configuration file /root/.paddlenlp/models/meta-llama/Llama-2-13b/generation_config.json
[2024-12-09 10:31:09,323] [    INFO] - Start predict
[2024-12-09 10:31:09,360] [    INFO] - We are using <class 'paddlenlp.transformers.llama.tokenizer.LlamaTokenizer'> to load 'meta-llama/Llama-2-13b'.
[2024-12-09 10:31:09,599] [    INFO] - Start read result message
[2024-12-09 10:31:09,599] [    INFO] - Current path is /opt/PaddleNLP/llm
[2024-12-09 10:32:21,703] [    INFO] - running spend 72.10325646400452
[2024-12-09 10:32:21,743] [    INFO] - Finish read result message
[2024-12-09 10:32:21,745] [    INFO] - End predict
***********Source**********
解释一下温故而知新
***********Target**********

***********Output**********
,是怎么回事吗?
Wen guo zhi shi zen me hui
练听力,练读书,练说话,练写作,练英语,练算法,练计算机,练游戏,练篮球,练乒乓球,练游泳,练冲浪,练打啦啦队,练写听,练写读,练写说,练写写,练写算,练写计,练写游,练写泳,练写冲,练写打,练写啦,练写啦啦,练写啦啦队,练写啦啦队队,练写啦啦队队队,练写啦啦队队队队,练写啦啦队队队队队队,练写啦啦队队队队队队队队,练写啦啦队队队队队队队队队队,练写啦啦队队队队队队队队队队队队,练写啦啦队队队队队队队队队队队队队队队,练写啦啦队队队队队队队队队队队队队队队队队队,练写啦啦队队队队队队队队队队队队队队队队队队队队队,练写啦啦队队队队队队队队队队队队队队队队队队队队队队队队队,练写啦啦队队队队队队队队队队队队队队队队队队队队队队队队队队队队队,练写啦啦队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队队��
LAUNCH INFO 2024-12-09 10:32:23,585 Pod completed
LAUNCH INFO 2024-12-09 10:32:23,585 Exit code 0

Steps to reproduce & code

cd PaddleNLP/llm
FLAGS_dynamic_static_unified_comm=0 FLAGS_enable_pir_api=0 CUDA_VISIBLE_DEVICES=4,5,6,7 python -m paddle.distributed.launch ./predict/predictor.py --model_name_or_path meta-llama/Llama-2-13b --inference_model --dtype float16 --quant_type weight_only_int8 --block_attn --block_size 32
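
For readers unfamiliar with the predictor CLI, here is one reading of the flags in the command above (my interpretation, not taken verbatim from the PaddleNLP documentation):

# --model_name_or_path meta-llama/Llama-2-13b : the base (non-chat) Llama-2 13B weights
# --inference_model                           : use the fused high-performance inference graph
# --dtype float16                             : run activations in fp16
# --quant_type weight_only_int8               : apply weight-only int8 quantization
# --block_attn --block_size 32                : enable block attention with 32-token KV-cache blocks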

gqzhang-ai added the bug (Something isn't working) label on Dec 9, 2024
@wawltor
Collaborator

wawltor commented Dec 10, 2024

Are you using the base model? You could try the chat model instead.
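
A concrete example of this suggestion, assuming the chat weights are published under the name meta-llama/Llama-2-13b-chat (only the model name changes relative to the reproduction command; verify the exact identifier against the PaddleNLP model list):

cd PaddleNLP/llm
FLAGS_dynamic_static_unified_comm=0 FLAGS_enable_pir_api=0 CUDA_VISIBLE_DEVICES=4,5,6,7 python -m paddle.distributed.launch ./predict/predictor.py --model_name_or_path meta-llama/Llama-2-13b-chat --inference_model --dtype float16 --quant_type weight_only_int8 --block_attn --block_size 32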
