Hi @MooTheCow,
If the dataset is very small (fewer than 100 wavs), 2-3 hours could work well. For larger ones, I recommend taking advantage of the six free GPU hours that Colab offers, although lately Colab has had a bug where it disconnects Drive after 4 hours of execution.
PiperTTS is fantastic. Well done!
I have a few questions about the training colab notebook.
What determines model size?
I am fine-tuning from lessac-high. Does that determine the model size? I am also setting the model Quality to high; if I select something else, will that drop the quality to that level? Finally, when I export the model with the Exporter notebook, I can choose a quality there too. Does that change the model size, or is it just used to name the model? Ideally I'd like to train a high model, then quickly convert it into a medium one to compare render time and fidelity. I'd rather not go through the entire training process again for a medium model if I don't need to, and I'm not sure whether I would then have to train from a medium checkpoint instead of a high one.
How long to train for?
There is no information given by the notebook other than a log message:
DEBUG:fsspec.local:open file: /content/drive/MyDrive/colab/piper/Test/lightning_logs/version_0/checkpoints/last.ckpt
There is no indication of how far the training has progressed, which makes it hard to judge how long to leave it running. I've had success fine-tuning on 100 × 20-second wavs for 3.5 hours, but I don't know whether the model would improve with longer training or whether that was already overkill. Colab starts to get antsy around that point and wants to kick me off, and I've completely failed to get WSL to recognise CUDA so far, so local training isn't an option.
If Colab kills the process for running too long, can I still use last.ckpt (e.g. with the Extraction notebook), or is it now invalid?
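For what it's worth, resuming from a checkpoint is also how fine-tuning is normally started in the first place. A minimal sketch, assuming the piper_train invocation documented in Piper's TRAINING.md (the dataset path is taken from the log line above; the batch size and epoch count here are illustrative, not values from the notebook):

```shell
# Hypothetical resume command, modelled on Piper's TRAINING.md;
# adjust paths and flag values to match your own Drive layout.
python3 -m piper_train \
    --dataset-dir /content/drive/MyDrive/colab/piper/Test \
    --accelerator 'gpu' \
    --devices 1 \
    --batch-size 16 \
    --validation-split 0.0 \
    --num-test-examples 0 \
    --max_epochs 6000 \
    --resume_from_checkpoint /content/drive/MyDrive/colab/piper/Test/lightning_logs/version_0/checkpoints/last.ckpt \
    --checkpoint-epochs 1 \
    --precision 32
```

Since last.ckpt is written at a checkpoint boundary rather than on shutdown, it should normally still be loadable this way even after Colab terminates the session, though I can't speak for what the Extraction notebook expects.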
Other information and community?
I came here from CoquiTTS, which has a useful Discord and is a great model, though in my opinion not as good as Piper. Is there anywhere other than this GitHub (or, worse, comments on YouTube videos) to discuss PiperTTS? And is there anywhere people are sharing voice models, other than links to random HuggingFace pages?