For a few months I had been using CTranslate2 with CUDA 11 on a GTX 960M.
Recently I tried updating to CUDA 12, which still supports the GTX 960M, along with CTranslate2.
I expected the update to work, since the documentation still reports compatibility with Compute Capability 3.5 (https://opennmt.net/CTranslate2/hardware_support.html), and I had no major issues updating PyTorch.
Unfortunately this was not the case: I started receiving
RuntimeError: parallel_for failed: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device
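For reference, this error means the installed wheel ships no kernel images for the GPU's architecture. The snippet below is a hypothetical illustration (not CTranslate2's actual check, and `is_supported` is a made-up helper): the GTX 960M is a Maxwell part with Compute Capability 5.0, while native FP16 arithmetic first appeared in Compute Capability 5.3.

```python
# Hypothetical sketch (not CTranslate2 code): decide whether a device's
# compute capability is covered by the architectures a wheel was built for.
# Capabilities compare naturally as (major, minor) tuples.

def is_supported(device_cc, wheel_archs):
    """True if at least one compiled architecture is usable on the device.

    Real CUDA binary compatibility rules are stricter (exact cubins only
    run within the same major generation; PTX JIT is more permissive);
    the simple >= rule is used here for illustration only.
    """
    return any(device_cc >= arch for arch in wheel_archs)

GTX_960M = (5, 0)  # Maxwell, Compute Capability 5.0

# A wheel built only for 5.3 and newer (e.g. FP16 kernels) rejects the 960M.
print(is_supported(GTX_960M, [(5, 3), (6, 0), (7, 5)]))  # False

# A wheel that also targets 5.0 covers it.
print(is_supported(GTX_960M, [(5, 0), (5, 3)]))          # True
```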
I tried compiling the code for Compute Capability 5.0, but this was not enough: some recently introduced code requires Compute Capability 5.3, which my GPU does not support. After disabling that code I recompiled successfully.
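For anyone hitting the same error, here is a rough sketch of the rebuild. The option name `CUDA_ARCH_LIST` and the Python packaging steps are what I found in my checkout; verify them against the CMakeLists.txt and the install-from-sources documentation for your version, and note that the 5.3-only code still has to be patched out by hand as described above.

```shell
# Rough sketch of a source build targeting Compute Capability 5.0.
# Verify option names against your checkout before running.
git clone --recursive https://github.com/OpenNMT/CTranslate2.git
cd CTranslate2
mkdir build && cd build
cmake .. -DWITH_CUDA=ON -DCUDA_ARCH_LIST="5.0"
make -j"$(nproc)"
sudo make install

# Then rebuild the Python wheel against the patched library:
cd ../python
pip install -r install_requirements.txt
python setup.py bdist_wheel
```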
After that I was able to run through the quickstart using cuda instead of cpu.
I was also able to run faster-whisper.
Would it be possible to reintroduce support for Compute Capability 5.0 in the distributed wheels?
If so, I would be happy to provide a pull request.
Apparently this is a common problem; I have seen related reports from other users of GTX 9xx GPUs:
SYSTRAN/faster-whisper#806
https://forums.developer.nvidia.com/t/runtimeerror-parallel-for-failed-cudaerrornokernelimagefordevice-no-kernel-image-is-available-for-execution-on-the-device/291404
m-bain/whisperX#794