My team placed first in the movie review prediction task (MSE: 2.57611).
- Syllable embedding
- 2-layer bi-directional LSTM
- Variational dropout, Weight dropout
- Self-attention
- Learning rate decay
- Output clamping
- Data flip
- Weight moving average
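
Three of the training tricks listed above (output clamping, learning rate decay, and weight moving average) can be sketched in plain Python. This is only an illustration, not the code used in the competition: the rating range of 1 to 10, the decay factor, the averaging momentum, and all function and class names are assumptions that do not come from the original write-up.

```python
def clamp_prediction(score, low=1.0, high=10.0):
    """Clip a raw regression output into the valid rating range.

    The 1-10 range is an assumption; use the real label range of the dataset.
    """
    return max(low, min(high, score))


def decayed_learning_rate(base_lr, epoch, decay=0.9):
    """Exponential decay: base_lr * decay**epoch (decay factor is hypothetical)."""
    return base_lr * decay ** epoch


class WeightAverager:
    """Exponential moving average of model weights, kept as a plain dict.

    The averaged ("shadow") weights are the ones used at evaluation time.
    """

    def __init__(self, weights, momentum=0.99):
        self.momentum = momentum  # hypothetical value
        self.shadow = dict(weights)

    def update(self, weights):
        # Blend the newest weights into the running average.
        for name, value in weights.items():
            self.shadow[name] = (
                self.momentum * self.shadow[name] + (1.0 - self.momentum) * value
            )
        return self.shadow
```

Clamping can only lower the squared error when every label lies inside the range, since a prediction outside the range is always farther from the label than the nearest boundary.
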

| Name | Value |
|---|---|
| Learning rate | 1e-3 |
| Number of characters | 4500 |
| Embedding size | 256 |
| Hidden size | 256 |
| Dropout rate | 0.2 |
| Gradient clipping | True |
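
The settings in the table could be collected in a small config object, with gradient clipping shown as rescaling by the global norm. This is a sketch under assumptions: the field names, the helper `clip_by_global_norm`, and the `max_norm` threshold are hypothetical, because the table only states that gradient clipping was enabled and not how it was done.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperParams:
    """Values taken from the table; field names are illustrative."""
    learning_rate: float = 1e-3
    num_characters: int = 4500
    embedding_size: int = 256
    hidden_size: int = 256
    dropout_rate: float = 0.2
    gradient_clip: bool = True


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradients so their L2 norm is at most max_norm.

    max_norm is a hypothetical threshold; the source does not give one.
    """
    total = math.sqrt(sum(g * g for g in grads))
    if total <= max_norm:
        return list(grads)
    scale = max_norm / total
    return [g * scale for g in grads]
```

For example, `clip_by_global_norm([3.0, 4.0], 1.0)` shrinks a gradient of norm 5 to norm 1 while keeping its direction, and a gradient already inside the threshold is returned unchanged.
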