HQ LJSpeech voice (finished, download available) #202
-
Thanks for doing this! I'd recommend running the English test phrases through as well (https://github.com/rhasspy/piper/tree/master/etc/test_sentences). The last 4 sentences are pangrams, which attempt to use every letter of the alphabet in a single sentence. They usually give a good hint of how the voice will perform on text outside the dataset. Regarding stopping criteria, none of the automated methods I've found work when you have multiple optimizers. Since VITS has both a generator and a discriminator loss, I'm not sure there's a good (automated) way to say a voice is "done".
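
In case it's useful to anyone doing the same, here's roughly how those test phrases could be batched through an exported voice with the piper CLI. This is only a minimal sketch: the model filename, the test-sentence filename, and the output directory are placeholders, not anything specific to this thread.

```python
import subprocess
from pathlib import Path

# Placeholders -- point these at your own exported voice and a local copy of
# the test sentences from the piper repo (etc/test_sentences).
MODEL = "en_US-ljspeech-high.onnx"
SENTENCES = Path("etc/test_sentences/test_en-us.txt")  # filename is an assumption
OUT_DIR = Path("test_output")
OUT_DIR.mkdir(exist_ok=True)

for i, line in enumerate(SENTENCES.read_text(encoding="utf-8").splitlines()):
    text = line.strip()
    if not text:
        continue
    # piper reads text on stdin and writes a WAV to --output_file.
    subprocess.run(
        ["piper", "--model", MODEL, "--output_file", str(OUT_DIR / f"sentence_{i:02d}.wav")],
        input=text.encode("utf-8"),
        check=True,
    )
```

Listening through the resulting WAVs (especially the pangrams at the end) makes it easier to compare checkpoints on the same out-of-dataset text.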
-
Ok, I just passed 1,000 epochs and updated the page with examples of what the voice is producing. Perhaps halfway there?
-
Ok, I finished training at 2,000 epochs, and here is what I ended up with. The .ckpt and voice files are dedicated to the public domain. Samples are available here.
-
As I mentioned in another topic, I'm currently training a voice from scratch on the high-quality setting using the LJSpeech dataset, which is in the public domain. When I'm done, I'll release the .ckpt and .onnx files into the public domain, too.
I'm not sure how long I'll end up letting it train; for a while yet, at least. The training docs say to watch for when certain graphs in TensorBoard "level off". I don't know exactly what that means, but I'm watching those graphs. I don't have a particularly spectacular GPU, so it'll take a while.
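
For anyone curious what "watching the graphs" looks like in code, here's the kind of rough plateau check I've been approximating by eye. It's only a sketch: it reads scalars back out of the TensorBoard event files, and the log directory and scalar tag name below are assumptions, so check the TensorBoard UI for what your run actually logs. With more than one loss curve in play, this is a hint at best, not a real stopping criterion.

```python
# Rough, hypothetical "leveling off" check: compare the average of the most
# recent window of a loss curve against the window before it.
# Requires the tensorboard package (pip install tensorboard).
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

def has_leveled_off(logdir: str, tag: str, window: int = 50, tolerance: float = 0.01) -> bool:
    acc = EventAccumulator(logdir)
    acc.Reload()
    values = [event.value for event in acc.Scalars(tag)]
    if len(values) < 2 * window:
        return False  # not enough history to judge yet
    recent = sum(values[-window:]) / window
    earlier = sum(values[-2 * window:-window]) / window
    # "Leveled off" here means less than `tolerance` relative change between windows.
    return abs(earlier - recent) / max(abs(earlier), 1e-8) < tolerance

# Example usage -- the log directory and tag name are placeholders:
# print(has_leveled_off("lightning_logs/version_0", "loss_gen_all"))
```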
If anybody is interested, I set up a page with some results as of epoch 286 this morning: TTS Voice Training Experiments