Recurrent Neural Network Regularization

TLDR; The authors show that applying dropout only to the non-recurrent connections of an LSTM (the connections between layers within the same timestep) works well and improves results on several sequence tasks.
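As a rough illustration of the idea, here is a minimal pure-Python sketch of a stacked LSTM where dropout touches only the vertical (layer-to-layer) connections, while the hidden and cell states carried across timesteps are left alone. The function names (`dropout`, `lstm_step`, `run_stack`, `init_layer`), the gate ordering, the weight layout, and the use of inverted dropout are all assumptions made for this example, not the authors' code.

```python
import math
import random

def dropout(vec, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(vec)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in vec]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W maps concat(x, h_prev) to 4 gate blocks (i, f, o, g)."""
    n = len(h_prev)
    z = x + h_prev  # list concatenation of input and previous hidden state
    pre = [sum(w * v for w, v in zip(row, z)) + bi for row, bi in zip(W, b)]
    i = [sigmoid(p) for p in pre[0:n]]
    f = [sigmoid(p) for p in pre[n:2 * n]]
    o = [sigmoid(p) for p in pre[2 * n:3 * n]]
    g = [math.tanh(p) for p in pre[3 * n:4 * n]]
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c

def run_stack(inputs, layers, hidden, rate, rng, training=True):
    """Run a multi-layer LSTM over a sequence, with dropout on vertical connections only."""
    h = [[0.0] * hidden for _ in layers]
    c = [[0.0] * hidden for _ in layers]
    outputs = []
    for x in inputs:
        below = x
        for l, (W, b) in enumerate(layers):
            # Non-recurrent connection: the input arriving from the layer below is dropped.
            below_dropped = dropout(below, rate, rng, training)
            # Recurrent connection: h[l] and c[l] from the previous timestep pass through untouched.
            h[l], c[l] = lstm_step(below_dropped, h[l], c[l], W, b)
            below = h[l]
        # The top layer's output is also dropped before it would feed a classifier.
        outputs.append(dropout(below, rate, rng, training))
    return outputs

def init_layer(n_in, hidden, rng):
    """Hypothetical helper: small random weights, zero biases."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in + hidden)] for _ in range(4 * hidden)]
    b = [0.0] * (4 * hidden)
    return W, b
```

A usage example under the same assumptions: build two layers with `init_layer(3, 4, rng)` and `init_layer(4, 4, rng)`, then call `run_stack(sequence, layers, 4, 0.5, rng)` during training and pass `training=False` at evaluation time so that no units are dropped.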

Data Sets and model performance

PTB Language Modeling Perplexity: 78.4
Google Icelandic Speech Dataset Frame Accuracy: 70.5
WMT'14 English to French Machine Translation BLEU: 29.03
MS COCO Image Caption Generation BLEU: 24.3
