You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In practice, this BERT model is bound to fall into local convergence if the learning rate is 0.001;
I think the learning rate should be reduced to 0.0001.
The experimental results show that when the learning rate is 0.0001, after about 100 iterations, the loss value will be reduced to 0.1, while if the learning rate is 0.001, the loss value will almost never be less than 2.0.
In Line 209:
In practice, this BERT model is bound to fall into local convergence if the learning rate is 0.001;
I think the learning rate should be reduced to 0.0001.
The experimental results show that when the learning rate is 0.0001, after about 100 iterations, the loss value will be reduced to 0.1, while if the learning rate is 0.001, the loss value will almost never be less than 2.0.
when lr=0.01
when lr=0.0001
The text was updated successfully, but these errors were encountered: