Release Version 1.4.0rc1 · mesolitica/malaya-speech

Starting with Malaya-Boilerplate 0.0.24, if Tensorflow is not installed locally, it is replaced with Mock Tensorflow, https://malaya-speech.readthedocs.io/en/latest/mock-tensorflow.html. From this release onwards, we are going to focus on PyTorch.
Added PyTorch RNNT using TorchAudio, https://malaya-speech.readthedocs.io/en/latest/load-stt-transducer-model-pt.html, which beat Google ASR on the Malaya-Speech Malay test set, FLEURS Malay test set and Singlish test set. Requires TorchAudio.
Added PyTorch multi-language RNNT using TorchAudio, so you can now predict multiple languages in a single audio sample, https://malaya-speech.readthedocs.io/en/latest/load-stt-transducer-model-pt-multilanguage.html, which beat Google ASR on the Malaya-Speech Malay test set, FLEURS Malay test set and Singlish test set. Requires TorchAudio.
Added more ASR CTC models, https://malaya-speech.readthedocs.io/en/latest/stt-ctc-huggingface.html
Added Finetuned Whisper models, trained on Malaya-Speech Malay train set and IMDA Singlish train set, https://malaya-speech.readthedocs.io/en/latest/stt-seq2seq-whisper.html
Added HuggingFace ASR Seq2Seq models, https://malaya-speech.readthedocs.io/en/latest/stt-seq2seq-whisper.html
Added Force Alignment using PyTorch RNNT, https://malaya-speech.readthedocs.io/en/latest/force-alignment-transducer-pt.html
Added Force Alignment using HuggingFace ASR Seq2Seq models, https://malaya-speech.readthedocs.io/en/latest/force-alignment-seq2seq-huggingface.html
Added orkid, bunga, jebat, tuah, male, female speakers for TTS VITS, https://malaya-speech.readthedocs.io/en/latest/tts-vits.html
Added multispeaker TTS VITS, https://malaya-speech.readthedocs.io/en/latest/tts-vits-multispeaker.html
Added is-clean detection, very useful if you want very clean voice activities, https://malaya-speech.readthedocs.io/en/latest/load-is-clean.html
Added speaker embedding models from Nemo, without requiring Nemo to be installed, https://malaya-speech.readthedocs.io/en/latest/load-speaker-vector-nemo.html. These are the best in terms of EER score on the VoxCeleb2 test set.
Added an interface to combine multiple diarization results into a single diarization result, https://malaya-speech.readthedocs.io/en/latest/combine-longer-speaker-diarization.html
Added TorchAudio streaming interface, streaming VAD, https://malaya-speech.readthedocs.io/en/latest/long-audio-vad-torchaudio.html
Added TorchAudio streaming interface, streaming ASR, https://malaya-speech.readthedocs.io/en/latest/long-audio-asr-torchaudio.html
Added Enformer Streaming PyTorch RNNT, https://malaya-speech.readthedocs.io/en/latest/long-audio-asr-torchaudio.html
Added TorchAudio streaming interface, streaming ASR and diarization on YouTube videos, https://malaya-speech.readthedocs.io/en/latest/youtube-asr-diarization-torchaudio.html
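
The Mock Tensorflow fallback in the first item can be pictured with a small, generic sketch. This is not Malaya-Boilerplate's actual code; the helper name `import_or_mock` is hypothetical and the snippet uses only the Python standard library. It illustrates the general idea of substituting a placeholder when an optional dependency is missing:

```python
import importlib
import sys
from unittest.mock import MagicMock


def import_or_mock(module_name):
    """Return the real module if it can be imported, otherwise a mock.

    Hypothetical helper for illustration only; the real fallback in
    Malaya-Boilerplate may work differently.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Register the placeholder so later `import module_name`
        # statements resolve to it instead of raising ImportError.
        placeholder = MagicMock(name=module_name)
        sys.modules[module_name] = placeholder
        return placeholder


# Real module if Tensorflow is installed, a MagicMock placeholder if not.
tf = import_or_mock("tensorflow")
```

With a fallback of this kind, code that only imports the optional package keeps loading, while anything that really needs Tensorflow still has to check for it explicitly.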

To install it,

pip3 install malaya-speech==1.4.0rc1
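
Because 1.4.0rc1 is a release candidate, pip skips it by default unless the version is pinned as above. To confirm which version ended up installed, a minimal check using only the Python standard library could look like this (the helper name `installed_version` is an illustration, not part of malaya-speech):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("malaya-speech")
    print(found if found else "malaya-speech is not installed")
```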

