2021-07-21

@comodoro comodoro released this 21 Jul 13:24
· 3 commits to master since this release

This model is based on a smaller alphabet that contains only the letters of the Czech alphabet, without the symbols for noise and non-speech sounds; see the file alphabet.txt.
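
To illustrate what a letters-only alphabet implies for transcripts, here is a minimal sketch that reports characters falling outside an allowed set. The character set below is a hypothetical stand-in written out by hand (lowercase Czech letters plus a space); the authoritative list is the one in alphabet.txt, and the function names are illustrative only.

```python
import unicodedata

# Hypothetical stand-in for the model alphabet: lowercase Czech letters plus space.
# The real character set is defined in alphabet.txt and may differ.
CZECH_LETTERS = "aábcčdďeéěfghiíjklmnňoópqrřsštťuúůvwxyýzž"
ALLOWED = set(unicodedata.normalize("NFC", CZECH_LETTERS)) | {" "}


def out_of_alphabet(transcript):
    """Return the set of characters in the transcript that are not in ALLOWED."""
    # Normalize to composed form so that letters with diacritics are single characters.
    text = unicodedata.normalize("NFC", transcript.lower())
    return set(text) - ALLOWED


def is_in_alphabet(transcript):
    """True if every character of the transcript belongs to ALLOWED."""
    return not out_of_alphabet(transcript)
```

A transcript containing a noise marker such as `[noise]` would be flagged because of the brackets, whereas plain Czech text would pass.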

Results on some test sets (without a language model):

  • Czech Common Voice 6.1 test dataset: WER: 0.423823, CER: 0.112101, loss: 15.059019
  • Vystadial 2016 test set: WER: 0.507822, CER: 0.195558, loss: 17.671772
  • Large Corpus of Czech Parliament Plenary Hearings test set: WER: 0.214612, CER: 0.051837, loss: 19.688087
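
For readers unfamiliar with the metrics, the sketch below shows the usual definitions of WER (word error rate) and CER (character error rate): the Levenshtein edit distance between reference and hypothesis, divided by the reference length in words or characters. The release does not state how its figures were aggregated; summing errors and lengths over the whole test set is an assumption here, and the function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(pairs):
    """Word error rate over (reference, hypothesis) pairs, aggregated corpus-wide."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    total = sum(len(r.split()) for r, _ in pairs)
    return errors / total


def cer(pairs):
    """Character error rate over (reference, hypothesis) pairs, aggregated corpus-wide."""
    errors = sum(edit_distance(r, h) for r, h in pairs)
    total = sum(len(r) for r, _ in pairs)
    return errors / total


# Toy data, not taken from any of the test sets above.
pairs = [("dobrý den", "dobrý den"), ("jak se máte", "jak se mate")]
print(f"WER: {wer(pairs):.6f}, CER: {cer(pairs):.6f}")
```

Under this definition, a WER of 0.423823 corresponds to roughly 42 word-level errors per 100 reference words.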