
Running the finetune_demo module on Windows 11 reports an error #515

Open
gyjlll opened this issue Jul 30, 2024 · 0 comments


System Info

deepspeed 0.14.0
triton 2.1.0
torch 2.2.1+cu121 (installed from torch-2.2.1+cu121-cp311-cp311-win_amd64.whl)
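
For context, whether the crash below occurs depends on which torch.distributed backends the installed PyTorch wheel was built with. A minimal check using plain PyTorch APIs (the script name is just illustrative) to confirm what a given Windows build actually ships:

```python
# check_backends.py -- illustrative helper, not part of CogVLM.
# Prints which torch.distributed backends this PyTorch build includes.
import torch
import torch.distributed as dist

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("distributed available:", dist.is_available())
print("NCCL built in:", dist.is_nccl_available())  # False in Windows wheels
print("Gloo built in:", dist.is_gloo_available())  # True in typical builds
```

On Windows wheels such as torch-2.2.1+cu121-cp311-cp311-win_amd64.whl, is_nccl_available() returns False, which matches the traceback below.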

Who can help?

finetune_demo: @1049451037

Information

  • The official example scripts
  • My own modified scripts and tasks

Reproduction

[2024-07-30 17:30:18,378] torch.distributed.elastic.multiprocessing.redirects: [WARNING] NOTE: Redirects are currently not supported in Windows or MacOs.
[2024-07-30 17:30:23,857] [WARNING] No training data specified
[2024-07-30 17:30:23,857] [WARNING] No train_iters (recommended) or epochs specified, use default 10k iters.
[2024-07-30 17:30:23,857] [INFO] using world size: 1 and model-parallel size: 1
[2024-07-30 17:30:23,857] [INFO] > padded vocab (size: 100) with 28 dummy tokens (new size: 128)
Traceback (most recent call last):
File "D:\PycharmProjects\CogVLM-main\finetune_demo\finetune_cogagent_demo.py", line 260, in
args = get_args(args_list)
^^^^^^^^^^^^^^^^^^^
File "D:\conda3\envs\cogvlm\Lib\site-packages\sat\arguments.py", line 442, in get_args
initialize_distributed(args)
File "D:\conda3\envs\cogvlm\Lib\site-packages\sat\arguments.py", line 513, in initialize_distributed
torch.distributed.init_process_group(
File "D:\conda3\envs\cogvlm\Lib\site-packages\torch\distributed\c10d_logger.py", line 86, in wrapper
func_return = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\conda3\envs\cogvlm\Lib\site-packages\torch\distributed\distributed_c10d.py", line 1184, in init_process_group
default_pg, _ = _new_process_group_helper(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\conda3\envs\cogvlm\Lib\site-packages\torch\distributed\distributed_c10d.py", line 1302, in _new_process_group_helper
raise RuntimeError("Distributed package doesn't have NCCL built in")
RuntimeError: Distributed package doesn't have NCCL built in
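
The root cause is that NCCL is Linux-only, so Windows builds of PyTorch cannot initialize a process group with the nccl backend; sat's initialize_distributed ends up requesting it and PyTorch raises. A minimal sketch of a workaround (my own illustration, not an official fix from the repo; the env-var defaults are assumptions for single-process local use) that falls back to gloo, the backend Windows wheels do ship:

```python
# Illustrative workaround: pick gloo when NCCL is not built in.
import os
import torch.distributed as dist

# Single-process rendezvous defaults (assumed values for local use).
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")

backend = "nccl" if dist.is_nccl_available() else "gloo"  # gloo on Windows
dist.init_process_group(backend=backend, rank=0, world_size=1)
print("initialized with backend:", dist.get_backend())
dist.destroy_process_group()
```

Applying the same idea here would mean making the init_process_group call in sat/arguments.py (line 513 in the traceback) select gloo on Windows, or passing a backend option on the command line if the launcher exposes one.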

Expected behavior

yes
