KAN-TTS

This model is built on ModelScope KAN-TTS and is a fine-tuned version of the Korean TTS model.

Training Process For Sambert

Use modify_sr to resample the audio to 16000 Hz.
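After resampling, it can help to confirm that every file really is at 16000 Hz before moving on. The following is a minimal sketch, not part of this repository: the function name find_wrong_sample_rates is hypothetical, and it assumes the dataset consists of plain PCM .wav files, which is what Python's standard wave module can read.

```python
import wave
from pathlib import Path

TARGET_SR = 16000  # sampling rate this README resamples to


def find_wrong_sample_rates(wav_dir, target_sr=TARGET_SR):
    """Return {path: sample_rate} for every .wav file whose rate differs from target_sr.

    Hypothetical helper: it only reads the WAV header and does not modify any file.
    """
    mismatches = {}
    for wav_path in sorted(Path(wav_dir).rglob("*.wav")):
        with wave.open(str(wav_path), "rb") as wav_file:
            rate = wav_file.getframerate()
        if rate != target_sr:
            mismatches[str(wav_path)] = rate
    return mismatches
```

Calling find_wrong_sample_rates on the dataset folder and getting an empty dictionary back means all files it found are at the target rate.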

Use korean_recognition to automatically recognize the Korean speech.

Use text2phone to convert the text into the required form.

Copy (cp) the raw_metafile into the labeled folder.

Extract the relevant features with the command below. Before running it, edit the raw_metafile location in data_process_korean.py (line 113) and the raw folder in audio_processor (line 717).

python kantts/preprocess/data_process_korean.py --voice_input_dir data/hanyu2h_label --voice_output_dir data/hanyu2h_labeled --audio_config kantts/configs/audio_config_se_16k.yaml --speaker F7 --se_model speech_personal_sambert-hifigan_nsf_tts_zh-cn_pretrain_16k/basemodel_16k/speaker_embedding/se.model
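Because the preprocessing command is long, it is easy to mistype a flag. The sketch below assembles the same command from Python using only the flags shown above. The helper names build_preprocess_cmd and run_preprocess are hypothetical and not part of this repository. The hand edits to data_process_korean.py and audio_processor described above are still required before a real run.

```python
import shlex
import subprocess
import sys

# Paths copied from the preprocessing command in this README.
SCRIPT = "kantts/preprocess/data_process_korean.py"
AUDIO_CONFIG = "kantts/configs/audio_config_se_16k.yaml"
SE_MODEL = (
    "speech_personal_sambert-hifigan_nsf_tts_zh-cn_pretrain_16k/"
    "basemodel_16k/speaker_embedding/se.model"
)


def build_preprocess_cmd(input_dir, output_dir, speaker):
    """Assemble the argument list for one preprocessing run."""
    return [
        sys.executable, SCRIPT,  # use the interpreter that runs this helper
        "--voice_input_dir", input_dir,
        "--voice_output_dir", output_dir,
        "--audio_config", AUDIO_CONFIG,
        "--speaker", speaker,
        "--se_model", SE_MODEL,
    ]


def run_preprocess(input_dir, output_dir, speaker, dry_run=True):
    """Print the command (dry run) or execute it; returns the argument list."""
    cmd = build_preprocess_cmd(input_dir, output_dir, speaker)
    if dry_run:
        print(shlex.join(cmd))
    else:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return cmd
```

For example, run_preprocess("data/hanyu2h_label", "data/hanyu2h_labeled", "F7") prints the command for inspection; passing dry_run=False runs it from the repository root.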

Edit the am_train file using the raw_metafile.

CUDA_VISIBLE_DEVICES=1,2 python kantts/bin/train_sambert.py --model_config speech_sambert-hifigan_tts_kyong_Korean_16k/basemodel_16k/sambert/config.yaml --root_dir data/hanyu_label data/tts1_label data/kss_label --stage_dir training_stage/test_male_ptts_sambert_ckpt --resume_path speech_sambert-hifigan_tts_kyong_Korean_16k/basemodel_16k/sambert/ckpt/checkpoint_630000.pth

Training Process For Hifigan

Preprocess the Korean dataset as described in the training process for Sambert, then run the following command to train Hifigan on your own voice:

CUDA_VISIBLE_DEVICES=2 python kantts/bin/train_hifigan.py --model_config speech_sambert-hifigan_tts_kyong_Korean_16k/basemodel_16k/hifigan/config.yaml --root_dir data/korean_labeled --stage_dir training_stage_hifigan --resume_path speech_sambert-hifigan_tts_kyong_Korean_16k/basemodel_16k/hifigan/ckpt/checkpoint_160000.pth

Inference Process

You can download the pretrained Korean model from ModelScope (https://modelscope.cn/models/speech_tts/speech_sambert-hifigan_tts_kyong_Korean_16k/files).

Then run the following command:

CUDA_VISIBLE_DEVICES=0 python kantts/bin/text_to_wav.py --txt test.txt --output_dir res/test_male_ptts_syn --res_zip speech_sambert-hifigan_tts_kyong_Korean_16k/resource.zip --am_ckpt speech_sambert-hifigan_tts_kyong_Korean_16k/basemodel_16k/sambert/ckpt/checkpoint_630000.pth --voc_ckpt speech_sambert-hifigan_tts_kyong_Korean_16k/basemodel_16k/hifigan/ckpt/checkpoint_160000.pth
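The inference command depends on three files from the downloaded model. A quick check before launching can catch an incomplete download. This is a minimal sketch under the assumption that the model was unpacked into the current directory with the layout used in the command above; the names REQUIRED_FILES and find_missing are hypothetical.

```python
from pathlib import Path

MODEL_DIR = Path("speech_sambert-hifigan_tts_kyong_Korean_16k")

# Files referenced by --res_zip, --am_ckpt and --voc_ckpt in the inference command.
REQUIRED_FILES = {
    "res_zip": MODEL_DIR / "resource.zip",
    "am_ckpt": MODEL_DIR / "basemodel_16k/sambert/ckpt/checkpoint_630000.pth",
    "voc_ckpt": MODEL_DIR / "basemodel_16k/hifigan/ckpt/checkpoint_160000.pth",
}


def find_missing(required, base_dir="."):
    """Return the sorted names of required files that do not exist under base_dir."""
    base = Path(base_dir)
    return sorted(
        name for name, rel_path in required.items()
        if not (base / rel_path).is_file()
    )
```

If find_missing(REQUIRED_FILES) returns an empty list, all three files are in place; otherwise the returned names show which arguments point to missing files.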
