Hello all,
I'm not sure if this is possible, but my use case relates to building datasets for TTS models. If I have a .wav file that contains audio from two different speakers, is it possible to split the audio, with reasonable accuracy, into two batches according to the identity of the speaker?
I've been trying to find out myself, and with the help of GPT I got this: https://github.com/rikabi89/diarization_script/blob/main/diarization_script.py
However, the issue I found was that there was a lot of overlapping, and the result was not accurate. P.S. I don't know any coding, and I would appreciate any steer here.
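To make the overlap problem concrete, here is a minimal sketch of one possible post-processing step. It is not what the linked script does. It assumes some diarization tool has already produced a list of `(start_sec, end_sec, speaker_label)` tuples; that tuple format, the function name `split_by_speaker`, and the `min_len` parameter are all hypothetical and chosen only for illustration. The idea is to keep only the stretches of time where exactly one speaker is active and to discard anything where two speakers overlap.

```python
from collections import defaultdict


def split_by_speaker(segments, min_len=0.0):
    """Group time ranges by speaker, dropping regions where speakers overlap.

    segments: list of (start_sec, end_sec, speaker_label) tuples
              (an assumed format, not tied to any specific library).
    min_len:  discard cleaned pieces shorter than this many seconds.
    """
    # Every segment boundary is a point where the set of active speakers can change.
    bounds = sorted({t for start, end, _ in segments for t in (start, end)})
    clean = defaultdict(list)
    for lo, hi in zip(bounds, bounds[1:]):
        # Speakers whose segment fully covers this elementary interval.
        active = {spk for start, end, spk in segments if start <= lo and hi <= end}
        if len(active) != 1:
            continue  # nobody speaking, or overlapping speech: skip it
        (spk,) = active
        if clean[spk] and clean[spk][-1][1] == lo:
            # Touches the previous piece for this speaker: extend it.
            clean[spk][-1] = (clean[spk][-1][0], hi)
        else:
            clean[spk].append((lo, hi))
    return {
        spk: [(s, e) for s, e in parts if e - s >= min_len]
        for spk, parts in clean.items()
    }


# Made-up example: speakers A and B talk over each other between 4 s and 5 s.
example = [(0.0, 5.0, "A"), (4.0, 9.0, "B"), (9.0, 12.0, "A")]
print(split_by_speaker(example))
# {'A': [(0.0, 4.0), (9.0, 12.0)], 'B': [(5.0, 9.0)]}
```

The resulting time ranges could then be used to cut the original .wav into per-speaker clips with whatever audio tool is at hand. Raising `min_len` drops very short fragments, which are often of little use for TTS training anyway.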