Transcription error related to BatchedInferencePipeline and numpy #1102

shkstar · 2024-10-29T10:45:40Z

I am using BatchedInferencePipeline from faster-whisper in Google Colab, installed with:

! pip install --force-reinstall "faster-whisper @ https://github.com/SYSTRAN/faster-whisper/archive/refs/heads/master.tar.gz"
! pip install ctranslate2==4.4.0

Today, when I ran the transcription, it failed with the error below:

/usr/local/lib/python3.10/dist-packages/faster_whisper/transcribe.py in transcribe(self, audio, language, task, beam_size, best_of, patience, length_penalty, repetition_penalty, no_repeat_ngram_size, temperature, compression_ratio_threshold, log_prob_threshold, log_prob_low_threshold, no_speech_threshold, condition_on_previous_text, prompt_reset_on_temperature, initial_prompt, prefix, suppress_blank, suppress_tokens, without_timestamps, max_initial_timestamp, word_timestamps, prepend_punctuations, append_punctuations, multilingual, output_language, vad_filter, vad_parameters, max_new_tokens, chunk_length, clip_timestamps, hallucination_silence_threshold, hotwords, language_detection_threshold, language_detection_segments)
    758             audio = torch.from_numpy(audio)
    759         elif not isinstance(audio, torch.Tensor):
--> 760             audio = decode_audio(audio, sampling_rate=sampling_rate)
    761 
    762         duration = audio.shape[0] / sampling_rate

/usr/local/lib/python3.10/dist-packages/faster_whisper/audio.py in decode_audio(input_file, sampling_rate, split_stereo)
     75         return torch.from_numpy(left_channel), torch.from_numpy(right_channel)
     76 
---> 77     return torch.from_numpy(audio)
     78 
     79 

TypeError: expected np.ndarray (got numpy.ndarray)

What is the problem, and how can I solve it? It is strange, because this used to work without problems.
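
A possible cause worth ruling out, offered as an assumption rather than a confirmed diagnosis: pip's --force-reinstall can also reinstall dependencies, so the numpy on disk may no longer match the numpy that the running Colab session (and torch) already loaded. The sketch below compares the installed version of each package with the version currently imported in the process. The helper names installed_version, loaded_version and version_report are hypothetical, and the package list is taken from the install commands and the traceback above.

```python
import sys
from importlib import metadata

# Distribution name -> import name. faster-whisper and ctranslate2 come from the
# install commands above; numpy and torch appear in the traceback.
PACKAGES = {
    "faster-whisper": "faster_whisper",
    "ctranslate2": "ctranslate2",
    "numpy": "numpy",
    "torch": "torch",
}


def installed_version(dist_name):
    """Version recorded on disk by pip, or None if the package is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def loaded_version(module_name):
    """Version of the module already imported in this process, or None."""
    module = sys.modules.get(module_name)
    if module is None:
        return None
    return getattr(module, "__version__", None)


def version_report(packages=PACKAGES):
    """Map each distribution name to an (installed, loaded) version pair."""
    return {
        dist: (installed_version(dist), loaded_version(module))
        for dist, module in packages.items()
    }


if __name__ == "__main__":
    for dist, (on_disk, in_memory) in version_report().items():
        print(f"{dist}: installed={on_disk} loaded={in_memory}")
```

If the installed and loaded versions differ for numpy, restarting the runtime after the pip install cells is a cheap thing to try before running the transcription again.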


MahmoudAshraf97 · 2024-10-29T12:40:49Z

Please use a debugger to check the value of audio, or upload the audio file here.
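
For anyone unsure what to look at in the debugger, a minimal sketch: the helper below (describe is a hypothetical name, not part of faster-whisper) summarises the type, dtype and shape of whatever object is passed in, without importing numpy or torch itself.

```python
def describe(value):
    """Summarise an object: fully qualified type name, dtype and shape if present."""
    cls = type(value)
    dtype = getattr(value, "dtype", None)
    shape = getattr(value, "shape", None)
    return {
        "type": f"{cls.__module__}.{cls.__qualname__}",
        "dtype": None if dtype is None else str(dtype),
        "shape": None if shape is None else tuple(shape),
    }


if __name__ == "__main__":
    # A file path, as passed to transcribe() in this report, is a plain str.
    print(describe("9ez8lm9I26Y.wav"))
```

In Colab, running the %debug magic in a new cell right after the exception opens a post-mortem debugger at the failing frame, where describe(audio) can be called once the helper has been defined in the session.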

shkstar · 2024-10-30T04:04:05Z

I converted a YouTube video to WAV using ! pip install yt-

Type of video_path_local: <class 'str'>
File exists: True
File size: 22384718
Error processing 9ez8lm9I26Y.wav: expected np.ndarray (got numpy.ndarray)

https://app.box.com/s/okmln29g34hdkbsn5r8no7gbg0orb8ny

MahmoudAshraf97 · 2024-11-14T13:27:09Z

Should be solved after #1106

MahmoudAshraf97 closed this as completed Nov 14, 2024
