-
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition, Daniel S. Park et al, Google, 2019.04
-
SpliceOut: A Simple and Efficient Audio Augmentation Method, Arjit Jain et al, 2021.09
-
Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks, Alex Graves et al, 2006.01
-
SCALA: Supervised Contrastive Learning for End-to-End Automatic Speech Recognition, Li Fu et al, 2021.10
-
Transformers with convolutional context for ASR, Abdelrahman Mohamed et al, Facebook, 2019.04
-
Conformer: Convolution-augmented Transformer for Speech Recognition, Anmol Gulati et al, Google, 2020.05
-
Simplified Self-Attention for Transformer-Based End-to-End Speech Recognition, Haoneng Luo et al, 2020.05
-
Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers, Takaaki Hori et al, 2021.04
-
A Survey of Transformers, Tianyang Lin et al, 2021.06
-
3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition, Zhao You et al, 2022.04
-
Squeezeformer: An Efficient Transformer for Automatic Speech Recognition, Sehoon Kim et al, 2022.06
-
A Better and Faster End-to-End Model for Streaming ASR, Bo Li et al, 2020.11
-
Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models, Thibault Doutre et al, 2021.04
-
Reducing Streaming ASR Model Delay with Self Alignment, Jaeyoung Kim et al, 2021.05
-
Streaming End-to-End ASR based on Blockwise Non-Autoregressive Models, Tianzi Wang et al, 2021.07
-
Transformer-Transducer: End-to-End Speech Recognition with Self-Attention, Ching-Feng Yeh et al, 2019.10
-
Improving Accuracy of Rare Words for RNN-Transducer Through Unigram Shallow Fusion, Vijay Ravi et al, 2020.12
-
Less Is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging, Rohit Prabhavalkar et al, 2020.12
-
Input Length Matters: An Empirical Study of RNN-T and MWER Training for Long-Form Telephony Speech Recognition, Zhiyun Lu et al, 2021.10
-
Modular End-to-end Automatic Speech Recognition Framework for Acoustic-to-word Model, Qi Liu et al, 2020.08
-
Transformer Transducer: One Model Unifying Streaming and Non-Streaming Speech Recognition, Anshuman Tripathi et al, 2020.10
-
Universal ASR: Unifying Streaming and Non-Streaming ASR Using a Single Encoder-Decoder Model, Zhifu Gao et al, 2020.10
-
Cascaded Encoders for Unifying Streaming and Non-Streaming ASR, Arun Narayanan et al, 2020.10
-
Cascade RNN-Transducer: Syllable Based Streaming On-Device Mandarin Speech Recognition with a Syllable-to-Character Converter, Xiong Wang et al, 2020.11
-
Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition, Binbin Zhang et al, 2020.12
-
Transformer Based Deliberation for Two-Pass Speech Recognition, Ke Hu et al, 2021.01
-
TSNAT: Two-Step Non-Autoregressvie Transformer Models for Speech Recognition, Zhengkun Tian et al, 2021.04
-
Decoupling Recognition and Transcription in Mandarin ASR, Jiahong Yuan et al, 2021.08
-
Have best of both worlds: two-pass hybrid and E2E cascading framework for speech recognition, Guoli Ye et al, 2021.10
-
A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes, Shaojin Ding et al, Google, 2022.04
-
Learning a Dual-Mode Speech Recognition Model via Self-Pruning, Chunxi Liu et al, Facebook, 2022.07
-
Hybrid Autoregressive Transducer (HAT), Ehsan Variani et al, Google, 2020.03
-
Internal Language Model Training for Domain-Adaptive End-to-End Speech Recognition, Zhong Meng et al, Microsoft, 2021.02
-
Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion, Duc Le et al, 2021.04
-
Investigating Methods to Improve Language Model Integration for Attention-based Encoder-Decoder ASR Models, Mohammad Zeineldeen et al, RWTH, 2021.04
-
Tree-constrained Pointer Generator for End-to-end Contextual Speech Recognition, Guangzhi Sun et al, 2021.09
-
Integrating Categorical Features in End-to-End ASR, Rongqing Huang et al, 2021.10
-
Context-Aware Transformer Transducer for Speech Recognition, Feng-Ju Chang et al, Amazon, 2021.11
-
Consistent Training and Decoding for End-to-End Speech Recognition Using Lattice-Free MMI, Jinchuan Tian et al, 2021.12
-
Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model, Jinchuan Tian et al, 2022.01
-
Knowledge Transfer from Large-Scale Pretrained Language Models to End-to-End Speech Recognizers, Yotaro Kubo et al, Google, 2022.02
-
Scaling End-to-End Models for Large-Scale Multilingual ASR, Bo Li et al, 2021.04
-
A Study of Multilingual End-to-End Speech Recognition for Kazakh, Russian, and English, Saida Mussakhojayeva et al, 2021.08
-
Multilingual Speech Recognition for Low-Resource Indian Languages using Multi-Task Conformer, Krishna D N et al, 2021.09
-
Accent-Robust Automatic Speech Recognition Using Supervised and Unsupervised wav2vec Embeddings, Jialu Li et al, Facebook, 2021.10
-
Scaling Up Deliberation for Multilingual ASR, Ke Hu et al, Google, 2022.10
-
Robust Speech Recognition via Large-Scale Weak Supervision, Alec Radford et al, OpenAI, 2022.12
-
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages, Yu Zhang et al, Google, 2023.03
-
Acoustic data augmentation for Mandarin-English code-switching speech recognition, Yanhua Long et al, 2019.11
-
Bi-encoder Transformer Network for Mandarin-English Code-switching Speech Recognition using Mixture of Experts, Yizhou Lu et al, 2020.05
-
Mandarin-English Code-Switching Speech Recognition with Self-Supervised Speech Representation Models, Liang-Hsuan Tseng et al, 2021.10
-
Integrating Knowledge in End-to-End Automatic Speech Recognition for Mandarin-English Code-Switching, Chia-Yu Li et al, 2021.12
-
Unsupervised Cross-lingual Representation Learning for Speech Recognition, Alexis Conneau et al, 2020.06
-
Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition, Yu Zhang et al, 2020.10
-
Unsupervised Speech Recognition, Alexei Baevski et al, 2021.05
-
Unsupervised Automatic Speech Recognition: A Review, Hanan Aldarmaki et al, 2021.06
-
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units, Wei-Ning Hsu et al, 2021.06
-
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition, Yu Zhang et al, 2021.09
-
Improving Pseudo-Label Training for End-to-End Speech Recognition Using Gradient Mask, Shaoshi Ling et al, Bytedance, 2021.10
-
Word Order Does Not Matter for Speech Recognition, Vineel Pratap et al, 2021.10
-
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing, Sanyuan Chen et al, 2021.10
-
Pseudo-Labeling for Massively Multilingual Speech Recognition, Loren Lugosch et al, Facebook, 2021.11
-
Scaling ASR Improves Zero and Few Shot Learning, Alex Xiao et al, Facebook, 2021.11
-
Efficient Adapter Transfer of Self-Supervised Speech Models for Automatic Speech Recognition, Bethan Thomas et al, 2022.02
-
Towards End-to-end Unsupervised Speech Recognition, Alexander H. Liu et al, Facebook, 2022.04
-
Confidence Estimation for Black Box Automatic Speech Recognition Systems Using Lattice Recurrent Neural Networks, A. Kastanos et al, 2019.10
-
Confidence Estimation for Attention-Based Sequence-to-Sequence Models for Speech Recognition, Qiujia Li et al, Google, 2020.10
-
Residual Energy-Based Models for End-to-End Speech Recognition, Qiujia Li et al, Google, 2021.03
-
Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction, David Qiu et al, Google, 2021.04
-
Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition, Qiujia Li et al, Google, 2021.10
-
FastCorrect 2: Fast Error Correction on Multiple Candidates for Automatic Speech Recognition, Yichong Leng et al, 2021.09
-
On the Comparison of Popular End-to-End Models for Large Scale Speech Recognition, Jinyu Li et al, 2020.05
-
Adaptation Algorithms for Neural Network-Based Speech Recognition: An Overview, Peter Bell et al, 2020.08
-
Automatic speech recognition: a survey, Mishaim Malik et al, 2020.11
-
A Review of On-Device Fully Neural End-to-End Automatic Speech Recognition Algorithms, Chanwoo Kim et al, 2020.12
-
Thank you for Attention: A survey on Attention-based Artificial Neural Networks for Automatic Speech Recognition, Priyabrata Karmakar et al, 2021.02
-
Accented Speech Recognition: A Survey, Arthur Hinsvark et al, 2021.04
-
The History of Speech Recognition to the Year 2030, Awni Hannun, 2021.08
-
Automatic Speech Recognition using limited vocabulary: A survey, Jean Louis K. E. Fendji et al, 2021.08
-
Recent Advances in End-to-End Automatic Speech Recognition, Jinyu Li et al, 2021.11