I used the config `recipes/configs/mistral/7B_full_low_memory.yaml`. Its default optimizer is `bitsandbytes.optim.PagedAdamW`, which raises the following error:
```
[rank6]: Traceback (most recent call last):
[rank6]:   File "/sensei-fs/users/my_name/code/finetune/finetune_distributed.py", line 776, in <module>
[rank6]:     sys.exit(recipe_main())
[rank6]:   File "/sensei-fs/users/my_name/code/finetune/torchtune/torchtune/config/_parse.py", line 99, in wrapper
[rank6]:     sys.exit(recipe_main(conf))
[rank6]:   File "/sensei-fs/users/my_name/code/finetune/finetune_distributed.py", line 771, in recipe_main
[rank6]:     recipe.train()
[rank6]:   File "/sensei-fs/users/my_name/code/finetune/finetune_distributed.py", line 680, in train
[rank6]:     self._optimizer.step()
[rank6]:   File "/opt/venv/lib/python3.10/site-packages/torch/optim/optimizer.py", line 484, in wrapper
[rank6]:     out = func(*args, **kwargs)
[rank6]:   File "/opt/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank6]:     return func(*args, **kwargs)
[rank6]:   File "/opt/venv/lib/python3.10/site-packages/bitsandbytes/optim/optimizer.py", line 292, in step
[rank6]:     torch.cuda.synchronize()
[rank6]:   File "/opt/venv/lib/python3.10/site-packages/torch/cuda/__init__.py", line 892, in synchronize
[rank6]:     return torch._C._cuda_synchronize()
[rank6]: RuntimeError: CUDA error: an illegal memory access was encountered
[rank6]: CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[rank6]: For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[rank6]: Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
```
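As a workaround while debugging, one option is to override the optimizer in the config with the stock PyTorch AdamW, which rules out the bitsandbytes paged-memory path that calls `torch.cuda.synchronize()`. This is a sketch of what the override might look like; the exact keys and values in `7B_full_low_memory.yaml` may differ, so verify against the actual file:

```yaml
# Hypothetical override: swap the paged bitsandbytes optimizer for
# torch.optim.AdamW to rule out bitsandbytes' paged-memory path.
# Field names follow torchtune's usual optimizer block; the lr value
# here is an assumption, not taken from the original config.
optimizer:
  _component_: torch.optim.AdamW
  lr: 2e-5
  fused: True
```

Setting `CUDA_LAUNCH_BLOCKING=1` when relaunching, as the error message itself suggests, should also make the traceback point at the kernel that actually faulted rather than the later synchronize call.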