After I ran the following command and waited for a while:
python test/test_flores101.py --lang_pair deu-eng --retriever random --ice_num 8 --prompt_template "</E></X>=</Y>" --model_name facebook/xglm-7.5B --tokenizer_name facebook/xglm-7.5B --output_dir output --output_file test --seed 43
the run failed with CUDA out of memory. The full traceback is as follows:
Traceback (most recent call last):
File "/home/alex/Documents/MMT-LLM/test/test_flores101.py", line 122, in <module>
print(f"BLEU score = {test_flores(args)}")
File "/home/alex/Documents/MMT-LLM/test/test_flores101.py", line 84, in test_flores
infr = IclGenInferencer(
File "/home/alex/Documents/MMT-LLM/openicl/icl_inferencer/icl_gen_inferencer.py", line 63, in __init__
super().__init__(retriever[0], metric, references, model_name, tokenizer_name, max_model_token_num, model_config, batch_size, accelerator, output_json_filepath, api_name)
File "/home/alex/Documents/MMT-LLM/openicl/icl_inferencer/icl_base_inferencer.py", line 59, in __init__
self.model.to(self.device)
File "/home/alex/anaconda3/envs/mmt/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2460, in to
return super().to(*args, **kwargs)
File "/home/alex/anaconda3/envs/mmt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1160, in to
return self._apply(convert)
File "/home/alex/anaconda3/envs/mmt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
File "/home/alex/anaconda3/envs/mmt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
File "/home/alex/anaconda3/envs/mmt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
[Previous line repeated 1 more time]
File "/home/alex/anaconda3/envs/mmt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 833, in _apply
param_applied = fn(param)
File "/home/alex/anaconda3/envs/mmt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1158, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 256.00 MiB. GPU 0 has a total capacty of 23.68 GiB of which 307.62 MiB is free. Including non-PyTorch memory, this process has 23.20 GiB memory in use. Of the allocated memory 22.94 GiB is allocated by PyTorch, and 4.67 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
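For context, here is my own back-of-envelope arithmetic (an assumption on my part, not something from the repo): the 7.5B fp32 parameters alone need roughly 28 GiB, which already exceeds the 3090's 24 GB before any activations or CUDA overhead, so failing inside `model.to(self.device)` seems expected:

```python
# Rough memory estimate for a 7.5B-parameter model; parameters only,
# ignoring activations, optimizer state, and allocator overhead.
params = 7.5e9
fp32_gib = params * 4 / 2**30  # 4 bytes per fp32 parameter
fp16_gib = params * 2 / 2**30  # 2 bytes per fp16 parameter

print(f"fp32: ~{fp32_gib:.1f} GiB")  # does not fit in 24 GB
print(f"fp16: ~{fp16_gib:.1f} GiB")  # would fit, with headroom for activations
```

If this arithmetic is right, half precision (or offloading) would be necessary on a 24 GB card, not just an allocator tweak.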
Environment
Ubuntu 20.04 LTS
NVIDIA RTX 3090 GPU, 24 GB memory
Python 3.10.13
The pip environment matches requirement.txt
CUDA Toolkit and its version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Jun_13_19:16:58_PDT_2023
Cuda compilation tools, release 12.2, V12.2.91
Build cuda_12.2.r12.2/compiler.32965470_0
Questions
What hardware capability do you recommend? (What setup did you use during your experiments?)
Do you have any suggestions for resolving the CUDA out-of-memory error on a 3090 (24 GB) machine?
Could you share some concrete examples for this project, at least enough to reproduce the results reported in your paper?
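For the record, regarding question 2: I saw that the error message itself suggests `max_split_size_mb`, which can be set like this (a sketch of the environment variable only; if the fp32 weights really need ~28 GiB, fragmentation tuning alone presumably cannot make them fit in 24 GB):

```shell
# Allocator setting suggested by the OOM message; helps fragmentation in
# borderline cases, but cannot shrink the model's actual footprint.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
python test/test_flores101.py --lang_pair deu-eng --retriever random --ice_num 8 \
    --prompt_template "</E></X>=</Y>" --model_name facebook/xglm-7.5B \
    --tokenizer_name facebook/xglm-7.5B --output_dir output --output_file test --seed 43
```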
Thanks~