I ran the streaming inference code in a CUDA environment (flashlight and wav2letter built with CUDA), but the results were the same between CPU and CUDA.
So I have a question: how do I run inference with CUDA?
Thanks!
For running inference with CUDA, you can use the fl_asr_decode binary, which performs beam-search decoding with an LM.
If you mean the streaming inference from w2l@anywhere, we currently support only a CPU version.
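For reference, an invocation looks roughly like the sketch below. All paths are placeholders and the exact flag set depends on your flashlight/wav2letter version, so please check `fl_asr_decode --help` on your build:

```
fl_asr_decode \
  --am=/path/to/acoustic_model.bin \
  --tokens=/path/to/tokens.txt \
  --lexicon=/path/to/lexicon.txt \
  --lm=/path/to/lm.bin \
  --test=/path/to/test.lst \
  --lmweight=2.0 --wordscore=1.0 --beamsize=100
```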
Thanks for your comment @vineelpratap.
Do you have any ideas for streaming inference with CUDA?
And is the performance of streaming inference with CUDA actually better than CPU? I ask because we have to copy many small chunks to VRAM during streaming; see the sketch below.
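In plain CUDA the per-chunk host-to-device copy cost is usually hidden with pinned host buffers and asynchronous copies on alternating streams. This is only an illustration of that technique, not wav2letter code; the chunk size, buffer count, and the dummy kernel are made-up assumptions:

```cuda
// Sketch: double-buffered streaming so chunk c+1's H2D copy can overlap
// with chunk c's compute. Pinned (page-locked) host memory is required
// for cudaMemcpyAsync to be truly asynchronous.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void dummyAcousticModel(const float* in, float* out, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) out[i] = in[i] * 0.5f;  // stand-in for the real forward pass
}

int main() {
  const int kChunk = 8000;                  // e.g. 500 ms of 16 kHz audio (assumption)
  const int kNumChunks = 8;
  const size_t kBytes = kChunk * sizeof(float);

  float* hostChunk[2];
  float* devIn[2];
  float* devOut[2];
  cudaStream_t stream[2];
  for (int s = 0; s < 2; ++s) {
    cudaHostAlloc((void**)&hostChunk[s], kBytes, cudaHostAllocDefault);  // pinned
    cudaMalloc((void**)&devIn[s], kBytes);
    cudaMalloc((void**)&devOut[s], kBytes);
    cudaStreamCreate(&stream[s]);
  }

  for (int c = 0; c < kNumChunks; ++c) {
    int s = c % 2;                          // alternate buffers/streams
    cudaStreamSynchronize(stream[s]);       // wait until buffer s is free again
    for (int i = 0; i < kChunk; ++i) hostChunk[s][i] = 0.f;  // fill with next audio chunk
    cudaMemcpyAsync(devIn[s], hostChunk[s], kBytes,
                    cudaMemcpyHostToDevice, stream[s]);
    dummyAcousticModel<<<(kChunk + 255) / 256, 256, 0, stream[s]>>>(
        devIn[s], devOut[s], kChunk);
  }
  for (int s = 0; s < 2; ++s) cudaStreamSynchronize(stream[s]);

  for (int s = 0; s < 2; ++s) {
    cudaFreeHost(hostChunk[s]);
    cudaFree(devIn[s]);
    cudaFree(devOut[s]);
    cudaStreamDestroy(stream[s]);
  }
  printf("processed %d chunks\n", kNumChunks);
  return 0;
}
```

Even with this kind of overlap, for very small chunks the launch and transfer overhead can dominate, which is why I am unsure CUDA would beat the CPU path for streaming.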