I've done everything needed to call pipeline.to(torch.device('cuda:n')), where n is my GPU index.
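For context, this is roughly how the pipeline is set up (a minimal sketch; the checkpoint name, audio path, and GPU index are placeholders rather than my exact setup):

```python
import torch
from pyannote.audio import Pipeline

# Load a pretrained diarization pipeline (checkpoint name is illustrative;
# gated checkpoints also require a Hugging Face auth token).
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1")

# Move the whole pipeline (segmentation + embedding models) to GPU n.
n = 0  # GPU index
pipeline.to(torch.device(f"cuda:{n}"))

# Run diarization on a file.
diarization = pipeline("meeting_30min.wav")
```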
The model appears to load onto the GPU (I see a process occupying ~1-2 GB of VRAM), and diarization runs fine: about 30 seconds for a 30-minute file.
However, while diarization is running, the pyannote process is essentially maxing out my CPU (an i5-13600K).
Is this expected behavior when pyannote is "running on the GPU"?
I'm running an ASR chain that involves:
NeMo MSDD diarization
pyannote diarization
faster-whisper
whisperX
a punctuation model
The problem is that if I run two separate ASR chains (each on a separate RTX 3090), the pyannote step slows down dramatically as the available CPU compute is split between the two processes. Runtime goes from ~30 seconds up to ~280-320 seconds. (All else being equal, I would have expected splitting the CPU compute to double or triple the time, to 60-90 seconds, but not to add an order of magnitude.)
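For reference, each chain runs as its own process pinned to one GPU, along these lines (a sketch; my actual launch wrapper differs):

```python
import os

# Pin this process to a single GPU *before* torch is imported, so each ASR
# chain only sees its own RTX 3090 (the index is passed per process: "0" or "1").
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch  # imported after the env var is set, so it takes effect

# Inside this process the one visible GPU is always cuda:0.
device = torch.device("cuda:0")
```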
Such high CPU usage initially made me suspect that pyannote might actually be running on the CPU even though the model appeared to be loaded onto the GPU. I'm now confident it really is on the GPU: when I forced it onto the CPU, the runtime was vastly longer.
I'm still curious, though, about the high CPU usage even when the model is running on the GPU.
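One diagnostic I plan to try (not a confirmed fix): capping PyTorch's CPU thread pools in each process, so two concurrent chains stop oversubscribing the cores. The thread counts below are arbitrary starting points.

```python
import torch

# Call these before any model work starts; set_num_interop_threads()
# raises an error if inter-op parallel work has already begun.
torch.set_num_threads(4)          # intra-op parallelism (per-op thread pool)
torch.set_num_interop_threads(2)  # inter-op parallelism (between ops)
```

If core oversubscription is the culprit, two capped processes should land closer to the 60-90 second range I'd expect than the 280-320 seconds I'm seeing.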