Cannot make offline speaker diarization work #1615
Hieroglyph17 started this conversation in Development
Replies: 0 comments
Dear All,
My code and the resulting error message are below. No matter what I try, I cannot supply the parameters that the error message asks for. I am happy to use different code; all I want is to run speaker diarization offline.
The code:
```python
from pyannote.audio import Pipeline

# Load the pretrained pipeline
offline_vad = Pipeline.from_pretrained("config.yaml")

# Check if the pipeline requires instantiation
if hasattr(offline_vad, 'instantiate'):
    # Define the parameters for instantiation
    params = {
        'clustering': {
            'method': 'centroid',
            'min_cluster_size': 12,
            'threshold': 0.7045654963945799
        },
        'segmentation': {
            'min_duration_off': 0.0
        }
    }
    # Instantiate the pipeline with the provided parameters
    offline_vad.instantiate(params)

# Path to your audio file
audio_file_path = "/path/to/your/audiofile.wav"
audio_file_path = "./LarrySinclairObama_02.wav"

# Apply the pipeline to your audio file
diarization = offline_vad({'audio': audio_file_path})
```

**The config.yaml file:**

```yaml
version: 3.1.0

pipeline:
  name: pyannote.audio.pipelines.SpeakerDiarization
  params:
    clustering: AgglomerativeClustering
    embedding: pyannote/wespeaker-voxceleb-resnet34-LM
    embedding_batch_size: 32
    embedding_exclude_overlap: true
    segmentation: pyannote/segmentation-3.0
    segmentation_batch_size: 32

params:
  clustering:
    method: centroid
    min_cluster_size: 12
    threshold: 0.7045654963945799
  segmentation:
    min_duration_off: 0.0
```
The error message, preceded by warnings:
/.venv/lib/python3.11/site-packages/pyannote/audio/core/io.py:43: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
torchaudio.set_audio_backend("soundfile")
/.venv/lib/python3.11/site-packages/torch_audiomentations/utils/io.py:27: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
torchaudio.set_audio_backend("soundfile")
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.1.2. To apply the upgrade to your files permanently, run
python -m pytorch_lightning.utilities.upgrade_checkpoint pytorch_model.bin
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.1.2. Bad things might happen unless you revert torch to 1.x.
Traceback (most recent call last):
  File "/.venv/lib/python3.11/site-packages/pyannote/audio/core/pipeline.py", line 302, in __call__
    default_parameters = self.default_parameters()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/pyannote/audio/pipelines/speaker_diarization.py", line 190, in default_parameters
    raise NotImplementedError()
NotImplementedError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "SpeakerDiarization.py", line 27, in <module>
    diarization = offline_vad({'audio': audio_file_path})
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/pyannote/audio/core/pipeline.py", line 304, in __call__
    raise RuntimeError(
RuntimeError: A pipeline must be instantiated with
pipeline.instantiate(parameters)
before it can be applied.
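For anyone reading the traceback: as far as I can tell, it shows the call falling back to `default_parameters()`, which raises `NotImplementedError`, and that being re-raised as the `RuntimeError`. Below is a minimal, self-contained sketch of that control flow. `ToyPipeline`, its `instantiated` flag, and the returned string are hypothetical stand-ins inferred from the traceback, not pyannote's actual implementation.

```python
class ToyPipeline:
    """Hypothetical stand-in that mimics the control flow seen in the traceback."""

    def __init__(self):
        self.instantiated = False  # assumed flag; set by instantiate()
        self.params = None

    def default_parameters(self):
        # Mirrors the traceback: no defaults are provided.
        raise NotImplementedError()

    def instantiate(self, params):
        # Store the hyperparameters and mark the pipeline as ready.
        self.params = dict(params)
        self.instantiated = True
        return self

    def __call__(self, file):
        if not self.instantiated:
            try:
                default_parameters = self.default_parameters()
            except NotImplementedError:
                # Raising here produces the chained "During handling of the
                # above exception, another exception occurred" output.
                raise RuntimeError(
                    "A pipeline must be instantiated with "
                    "pipeline.instantiate(parameters) before it can be applied."
                )
            self.instantiate(default_parameters)
        return f"applied to {file['audio']} with {sorted(self.params)}"


pipeline = ToyPipeline()
try:
    pipeline({"audio": "example.wav"})
except RuntimeError as error:
    print("not instantiated:", error)

pipeline.instantiate({"clustering": {}, "segmentation": {}})
print(pipeline({"audio": "example.wav"}))
```

If the real pipeline behaves like this sketch, the error would mean that `instantiate` never took effect on the object that is later called, so it may be worth checking that the `instantiate(params)` line actually runs in the script being executed.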