About the prediction accuracy after training #9

Open
XinGP opened this issue Apr 19, 2024 · 4 comments

Comments

@XinGP

XinGP commented Apr 19, 2024

Hello author! I reproduced your AV1 training code, changing only the number of epochs to 50 and the batch_size to 4. Why do the validation results I get after training differ so much from those of your model?

@MasterIzumi
Collaborator

@XinGP Generally, if you reduce the batch size, you should also reduce the learning rate (ref: https://arxiv.org/pdf/1812.01187.pdf). A learning rate that is too large for the batch size leads to unstable training.
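
The rule of thumb above is often applied as a linear scaling of the learning rate with the batch size. A minimal sketch follows, assuming linear scaling is what is intended here. The function name and the numeric values are hypothetical and are not this repository's defaults.

```python
def scale_learning_rate(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Scale a learning rate linearly with the batch size.

    base_lr is the learning rate that was tuned for base_batch_size.
    """
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch_size / base_batch_size


# Hypothetical values for illustration only: an LR tuned for batch size 8,
# rescaled for batch size 4.
new_lr = scale_learning_rate(base_lr=1e-3, base_batch_size=8, new_batch_size=4)
print(new_lr)
```

Linear scaling is only a starting point. The value that actually reproduces the reported results may still need some tuning.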

@penglo

penglo commented May 24, 2024

Hello, I've also encountered similar issues while tuning parameters, and I can't pinpoint the specific learning rate. Could you share what learning rate you ended up using? I'd like to discuss this issue further. My email is [email protected].

@Family-Liao

Family-Liao commented Jun 3, 2024

Have you found a learning rate for batch_size 4 that gets close to the results the author reports? @XinGP

@XinGP
Author

XinGP commented Jun 26, 2024

@XinGP Generally, if you reduce the batch size, you should also reduce the learning rate (ref: https://arxiv.org/pdf/1812.01187.pdf). Too large an LR leads to unstable training.

Yes. After reducing all learning rates by half, I can obtain validation set results similar to those in the paper.
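
A minimal sketch of what "reducing all learning rates by half" could look like when a training setup has more than one learning rate. It assumes the rates live in a list of dicts with an `"lr"` key, the same layout that PyTorch optimizers expose as `param_groups`. The group names and values are hypothetical, not taken from this repository.

```python
def scale_all_learning_rates(param_groups, factor):
    """Multiply the "lr" entry of every parameter group by factor, in place."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    for group in param_groups:
        group["lr"] *= factor
    return param_groups


# Hypothetical parameter groups for illustration only.
groups = [
    {"name": "backbone", "lr": 1e-3},
    {"name": "head", "lr": 1e-4},
]
scale_all_learning_rates(groups, 0.5)
for group in groups:
    print(group["name"], group["lr"])
```

In practice it is usually simpler to halve the values in the training config before the optimizer and any scheduler are built, so that every component sees the same starting rates.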
