- Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context, Zihang Dai et al., 1901
- Data Augmentation Approaches in Natural Language Processing: A Survey, Bohan Li et al., 2110
- Must-read papers on prompt-based tuning for pre-trained language models, Ning Ding and Shengding Hu, 2201
- Transformer Quality in Linear Time, Weizhe Hua et al. (Google), 2202