Only two speakers in an audio but pyannote assigned a new speaker for each segment. #1591

Hieroglyph17 · 2023-12-20T03:04:35Z

I get a number of warnings when running the pipeline, which I presume I can ignore. The real problem is that the output assigns a new speaker to each segment, although the audio file is a professional-quality interview with only two speakers.
Can you help?
Christoph

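Code:
from pyannote.audio import Pipeline

# Load the pipeline from a local config file.
pipeline = Pipeline.from_pretrained("config.yaml")

DEMO_FILE = {'uri': 'blabal', 'audio': '/Users/christophschnelle/Documents/Larry Sinclair Obama_02.wav'}
dz = pipeline(DEMO_FILE)

# Save the full result as text.
with open("diarization.txt", "w") as text_file:
    text_file.write(str(dz))

# Print the first 100 tracks as (segment, track, label) tuples.
print(*list(dz.itertracks(yield_label=True))[:100], sep="\n")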

Output:
(...)
torchvision is not available - cannot save figures

Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.1.2. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint pytorch_model.bin

Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.

Model was trained with torch 1.10.0+cu102, yours is 2.1.2. Bad things might happen unless you revert torch to 1.x.
(<Segment(0.605802, 12.9096)>, 'A', 'SPEECH')
(<Segment(13.0973, 25.1109)>, 'B', 'SPEECH')
(<Segment(25.2645, 27.8413)>, 'C', 'SPEECH')
(<Segment(28.0802, 51.971)>, 'D', 'SPEECH')
(<Segment(53.3362, 59.9061)>, 'E', 'SPEECH')
(<Segment(61.0495, 69.0529)>, 'F', 'SPEECH')
(<Segment(70.4181, 77.7389)>, 'G', 'SPEECH')
(<Segment(78.4386, 86.5785)>, 'H', 'SPEECH')
(<Segment(87.1587, 89.2065)>, 'I', 'SPEECH')
(<Segment(89.3942, 98.9846)>, 'J', 'SPEECH')
(<Segment(99.4283, 102.21)>, 'K', 'SPEECH')
(<Segment(103.951, 126.766)>, 'L', 'SPEECH')

hbredin · 2023-12-20T06:26:18Z

Maintainer

Looks like you are using a VoiceActivityDetection pipeline while what you are looking for is a SpeakerDiarization pipeline.
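
The printed tuples already hint at this: `itertracks(yield_label=True)` yields `(segment, track, label)`, so the letters `A`, `B`, `C` are track names, and the label is `SPEECH` on every line. A minimal sketch of a check for that pattern, using plain tuples as a stand-in for pyannote objects (the helper name `looks_like_vad_output` is hypothetical, not part of pyannote):

```python
def looks_like_vad_output(tracks):
    """Return True if the only label present is 'SPEECH'.

    `tracks` is an iterable of (start, end, track_name, label) tuples,
    a plain-Python stand-in for what itertracks(yield_label=True) yields.
    """
    labels = {label for _, _, _, label in tracks}
    return labels == {"SPEECH"}


# First three rows of the output posted above, copied by hand.
posted = [
    (0.605802, 12.9096, "A", "SPEECH"),
    (13.0973, 25.1109, "B", "SPEECH"),
    (25.2645, 27.8413, "C", "SPEECH"),
]

print(looks_like_vad_output(posted))
```

A diarization result would instead carry one label per speaker, repeated across that speaker's segments.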

1 reply

Hieroglyph17 Dec 20, 2023
Author

Thank you, Hervé!

Doh!

Hieroglyph17 · 2023-12-20T17:34:17Z

Author

Hi Hervé,

I presume you mean https://huggingface.co/pyannote/speaker-diarization-3.1
Is there a way to run this offline?

Many thanks, Christoph
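
For reference, a minimal sketch of calling that pipeline on a two-speaker file. It assumes pyannote.audio 3.1 is installed and that the model's user conditions have been accepted on Hugging Face; `hf_token`, `diarize`, and `speaker_turns` are placeholder names, and the first load needs network access.

```python
def diarize(audio_path, hf_token, num_speakers=2):
    """Run the pretrained diarization pipeline on one audio file."""
    # Imported lazily so the helper below can be used without pyannote installed.
    from pyannote.audio import Pipeline

    pipeline = Pipeline.from_pretrained(
        "pyannote/speaker-diarization-3.1",
        use_auth_token=hf_token,
    )
    # Passing num_speakers tells the pipeline how many speakers to expect.
    return pipeline(audio_path, num_speakers=num_speakers)


def speaker_turns(diarization):
    """Flatten a diarization result into (start, end, speaker) tuples."""
    return [
        (turn.start, turn.end, speaker)
        for turn, _, speaker in diarization.itertracks(yield_label=True)
    ]
```

With only two speakers in the interview, passing `num_speakers=2` avoids leaving the speaker count for the pipeline to estimate.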

1 reply

Hieroglyph17 Dec 22, 2023
Author

Hi Hervé @hbredin,
I have now spent ages trying to find a way to use SpeakerDiarization offline but haven't found anything. Is it possible to run SpeakerDiarization offline? If so, what would be required?
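
One possible approach, sketched here under assumptions rather than as confirmed maintainer guidance: download the pipeline's `config.yaml` and the model checkpoints it references once, edit the config so that its `segmentation` and `embedding` entries point at local files, then load it with `Pipeline.from_pretrained("config.yaml")` as in the original post. The key layout (`pipeline` then `params`), the model names, and all file paths below are assumptions to be checked against the actual `config.yaml`.

```python
import copy


def point_config_at_local_files(config, segmentation_path, embedding_path):
    """Return a copy of a parsed pipeline config that references local checkpoints.

    `config` is the dict obtained by parsing config.yaml; the key layout
    used here is an assumption about that file.
    """
    local = copy.deepcopy(config)
    params = local["pipeline"]["params"]
    params["segmentation"] = segmentation_path
    params["embedding"] = embedding_path
    return local


# Illustrative stand-in for a parsed config.yaml, not copied from the real file.
example = {
    "pipeline": {
        "name": "pyannote.audio.pipelines.SpeakerDiarization",
        "params": {
            "segmentation": "pyannote/segmentation-3.0",
            "embedding": "pyannote/wespeaker-voxceleb-resnet34-LM",
        },
    }
}

# Hypothetical local paths; use wherever the checkpoints were saved.
local = point_config_at_local_files(
    example,
    "models/segmentation/pytorch_model.bin",
    "models/embedding/pytorch_model.bin",
)
print(local["pipeline"]["params"])
```

Once the edited config is written back to YAML, loading it should no longer need the Hub, provided both checkpoints are present on disk.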
