Issue with trying this with YouTube video, Docker, and Nvidia #311

Closed
NightHawkATL opened this issue Oct 2, 2024 · 10 comments
Labels: bug Something isn't working

@NightHawkATL commented Oct 2, 2024

Which OS are you using?

  • OS: Linux/Docker

I have cloned the repo, built the image, and launched the UI correctly as far as I can tell. It is running on my AI VM with an Nvidia M40 12GB passed through; it does say that it detected CUDA, and I can see it loading the model when testing YouTube transcription. However, it errors out and stops the container once it tries to create the file. I saw in another issue that someone was having a similar problem and you suggested they change the compute type, but I only have "float32" as an option for my setup. This is the same VM as my Ollama and InvokeAI setups; those have access to the GPUs and are currently not in use.
Here is my compose file:

```yaml
  app:
    # build: .
    image: jhj0517/whisper-webui:latest
    container_name: whisper_webui
    volumes:
      # Update paths to mount models and output paths to your custom paths:
      - /portainer/Files/AppData/Config/Whisper-WebUI/models:/Whisper-WebUI/models
      - /portainer/Files/AppData/Config/Whisper-WebUI/outputs:/Whisper-WebUI/outputs
      - /portainer/Files/AppData/Config/Whisper-WebUI/configs:/Whisper-WebUI/configs

    ports:
      - "7860:7860"

    stdin_open: true
    tty: true

    entrypoint: ["python", "app.py", "--server_port", "7860", "--server_name", "0.0.0.0"]

    # If you're not using an Nvidia GPU, update the device entry to match yours.
    # See more info at: https://docs.docker.com/compose/compose-file/deploy/#driver
    # GPU support
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              capabilities:
                - compute
                - utility
                - gpu
```
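For what it's worth, a quick way to confirm the container actually sees the GPU (a minimal sketch, assuming torch is installed in the image, which it is as a project dependency):

```python
# Run inside the container, e.g.: docker exec -it whisper_webui python
import torch

print(torch.cuda.is_available())      # True if the CUDA runtime is usable
print(torch.cuda.get_device_name(0))  # should report the passed-through Tesla M40
```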

@NightHawkATL added the bug (Something isn't working) label Oct 2, 2024
@NightHawkATL (Author)

[six screenshots attached, including nvidia-smi output and the container error log]

@jhj0517 (Owner) commented Oct 2, 2024

Hi. I don't know why ctranslate2.get_supported_compute_types("cuda") returns only float32 for your environment; it's probably a bug in ctranslate2.

For now, I have allowed custom values in #312, as the error message says, but you may have to manually enter float16 in the dropdown as you did.
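For anyone hitting this, the check in question can be reproduced outside the app (a minimal sketch using ctranslate2's public API):

```python
import ctranslate2

# Lists the compute types CTranslate2 thinks the GPU supports;
# on this setup it reportedly returns only {"float32"}.
print(ctranslate2.get_supported_compute_types("cuda"))
```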

@NightHawkATL (Author)
I loaded the new image and am still getting an error.

[screenshot attached]

@NightHawkATL (Author)
The model seems to load 98MiB into the GPU and then just stops.

[screenshot attached]

@dng-nguyn
float16 requires the Maxwell architecture with Compute Capability 5.3 or above, while the Tesla M40 only supports 5.2.
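You can confirm the compute capability yourself (a minimal sketch, assuming torch is available):

```python
import torch

# The Tesla M40 should report (5, 2); native float16 kernels need 5.3 or higher.
major, minor = torch.cuda.get_device_capability(0)
print(f"Compute capability: {major}.{minor}")
```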

@jhj0517 (Owner) commented Oct 3, 2024

faster-whisper now needs at least CUDA 12.1.

You can see CUDA version compatibility with your GPU here:

As @dng-nguyn said, it seems that the M40 only supports fairly old versions of CUDA, and it may take some struggling to set up.

Using just the CPU may be a better choice, although it's slower.
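If you go the CPU route, faster-whisper's int8 quantization keeps it reasonably fast (a minimal sketch; the model name and audio path are just examples):

```python
from faster_whisper import WhisperModel

# int8 keeps memory use and runtime reasonable on CPU; float32 also works but is slower.
model = WhisperModel("large-v2", device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.mp3")
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```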

@NightHawkATL (Author) commented Oct 3, 2024

I am running CUDA 12.6, as shown in my nvidia-smi screenshot from earlier. I guess if your container can't support my card, I will just not use it until I can afford a better card.

@dng-nguyn
The card is supported in CUDA 12 though (Maxwell microarchitecture), just not float16.

Have you tried running it bare-metal, or changing the whisper model to openai's?

Since you're manually building the image, this may help you.

@jhj0517 (Owner) commented Oct 3, 2024

@NightHawkATL Ah, sorry, I misread the table. The Tesla M40 does support CUDA 12.x.

> Could not load library libcudnn_ops_infer.so.8

I didn't notice that. Lately GitHub doesn't let me open images in a new tab, which makes small screenshots difficult to read.
This seems to be the same as #271. If your OS were Windows, I would recommend Purfview's solution, but since yours is Linux, the comment @dng-nguyn pointed out should help:

```sh
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
dpkg -i cuda-keyring_1.0-1_all.deb
apt update && apt upgrade
apt install libcudnn8 libcudnn8-dev
```

It seems that this will manually install some missing .so files.
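After installing, you can verify that the missing library is now resolvable (a minimal sketch using ctypes):

```python
import ctypes

# Raises OSError if the dynamic loader still can't find the cuDNN library
# that faster-whisper (CTranslate2) tries to load.
ctypes.CDLL("libcudnn_ops_infer.so.8")
print("libcudnn_ops_infer.so.8 loaded OK")
```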

And if it's still problematic, you can try openai's whisper implementation instead by editing the entrypoint in docker-compose.yaml from:

```yaml
entrypoint: ["python", "app.py", "--server_port", "7860", "--server_name", "0.0.0.0"]
```

to

```yaml
entrypoint: ["python", "app.py", "--server_port", "7860", "--server_name", "0.0.0.0", "--whisper_type", "whisper"]
```
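The openai implementation runs on plain PyTorch rather than CTranslate2, so it avoids the cuDNN shared-library path entirely. Independently of the WebUI, its API looks like this (a minimal sketch; the model name and audio path are just examples):

```python
import whisper

model = whisper.load_model("base", device="cuda")
# fp16=False avoids half-precision, which the M40 (compute capability 5.2) lacks.
result = model.transcribe("audio.mp3", fp16=False)
print(result["text"])
```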

@jhj0517 (Owner) commented Oct 4, 2024

I investigated the issue.

> Could not load library libcudnn_ops_infer.so.8

This was caused by a version incompatibility between faster-whisper (CTranslate2) and torch >= 2.4.0, so I downgraded torch in #318.

The new image works fine now!

@NightHawkATL If you still face the same bug, please feel free to re-open!

jhj0517 closed this as completed Oct 4, 2024