
Hyperparameters for ImageNet training? #120

Open
InugYoon opened this issue Sep 25, 2022 · 4 comments
@InugYoon

Hi, I am trying to reproduce the results.
Could you share the hyperparameters for the ImageNet experiment?

@HobbitLong
Owner

Hi @InugYoon,

I assume you mean the implementation with the momentum encoder trick. If so, I adopted most hyper-parameters from MoCo-v2. IIRC, the differences are: (1) a queue size of 8192; (2) a temperature of 0.07; (3) a batch size of 1024.
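Collecting the settings above into one place, a hypothetical sketch (not code from this repo) might look like the following: the three values confirmed in this thread layered over MoCo-v2's published defaults. Everything in `mocov2_defaults` is an assumption carried over from the MoCo-v2 paper/repo and may differ from what was actually used here.

```python
# Hypothetical ImageNet config sketch. Only moco_k, temperature, and
# batch_size are confirmed in this thread; the rest are MoCo-v2's stock
# values and are assumptions.
mocov2_defaults = {
    "lr": 0.03,           # MoCo-v2 default for batch 256; likely rescaled for 1024
    "momentum": 0.9,      # SGD momentum
    "weight_decay": 1e-4,
    "epochs": 200,        # MoCo-v2 default; not confirmed in this thread
    "moco_m": 0.999,      # momentum-encoder update coefficient
    "moco_dim": 128,      # projection head output dimension
}

thread_overrides = {
    "moco_k": 8192,       # queue size stated above (MoCo-v2 uses 65536)
    "temperature": 0.07,  # stated above (MoCo-v2 uses 0.2)
    "batch_size": 1024,   # stated above (MoCo-v2 uses 256)
}

config = {**mocov2_defaults, **thread_overrides}
```

With a linear learning-rate scaling rule, the batch-size change from 256 to 1024 would suggest roughly a 4x larger base learning rate, but that too is an assumption, not something confirmed in this thread.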

If you want to be faithful to the paper (not using the momentum encoder trick), please consider following the paper.

@InugYoon
Author

Hello @HobbitLong, thank you for the quick reply.

First of all, I wanted to re-implement the results based on the code here, without the MoCo trick, following the paper.
For the CIFAR-10/100 datasets, I could follow the script you uploaded (with detailed hyperparameters, including learning rate, schedule method (milestone vs. cosine), weight decay, etc.) and successfully reproduced the results.

However, for ImageNet I couldn't find the hyperparameters listed either on GitHub or in the paper.
I tried some hyperparameters, but got roughly 10% lower performance.

From your kind reply, I now understand that you used the hyperparameters from MoCo-v2.
Is there any source that explains what exactly the MoCo trick is?
Is there a GitHub repo or code I could look at?

About the MoCo-v2 hyperparameters, did you use the ones from here?
https://github.com/facebookresearch/moco

@kiimmm

kiimmm commented Apr 3, 2023

Hello @HobbitLong,

I have incorporated most of the hyper-parameters from MoCo-v2, which align with the ones you previously mentioned. However, there is still a gap in accuracy, and I suspect this may be due to the number of epochs.
Could you please share the number of epochs, or any other hyper-parameters, that would help close this gap?

Thank you for your time and consideration.

@yaoerqin

Hello @kiimmm,

Would you please release your MoCo-version code? I'm trying to re-implement it but got stuck on the MoCo code. Thanks.
