
Training converges slowly with very deep PolyNet #2

Open
clks-wzz opened this issue Mar 2, 2018 · 2 comments

clks-wzz commented Mar 2, 2018

We are training a very deep PolyNet with 10 GPUs and a batch size of 30. It converges very slowly (almost no improvement).
Is there any solution?

zhangxc11 (Collaborator) commented

There are several tricks listed in the paper that may help:

  • initialization by insertion: first train a relatively shallow model, say a normal Inception-ResNet, and then use it to initialize the very deep PolyNet. You can find the details in Figure 9 of the paper; a pycaffe sketch of the weight copy is given after this list.
  • choose a small residual scale: 0.3 is the default residual scale, but you can use a smaller value to get a better convergence rate (see the second sketch below).
  • synchronized batch normalization: if you use 10 GPUs and a total batch size of 30, there are only 3 samples per GPU, which is too small; you can apply synchronized batch normalization to get a larger effective batch size.
  • about stochastic paths: this strategy is used to overcome overfitting; in the early stage of training it is not necessary.
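
For initialization by insertion, a minimal pycaffe net-surgery sketch is below. It assumes the shallow Inception-ResNet and the deep PolyNet prototxts share layer names for their common layers; the file names are placeholders, not files from this repo.

```python
import caffe

# Placeholder paths -- substitute the actual prototxt/caffemodel
# files for your shallow and deep models.
shallow = caffe.Net('inception_resnet.prototxt',
                    'inception_resnet.caffemodel', caffe.TEST)
deep = caffe.Net('polynet_very_deep.prototxt', caffe.TEST)

# Copy weights for every layer whose name and blob shapes match;
# the newly inserted poly blocks keep their random initialization.
for name, blobs in shallow.params.items():
    if name in deep.params:
        for i, blob in enumerate(blobs):
            if (i < len(deep.params[name]) and
                    deep.params[name][i].data.shape == blob.data.shape):
                deep.params[name][i].data[...] = blob.data

deep.save('polynet_init.caffemodel')
```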

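On the residual scale: each residual branch output is multiplied by a scalar beta before the elementwise sum with the shortcut, so a smaller beta damps the residual signal and stabilizes the early training of very deep nets. A minimal numpy sketch of the forward computation, with `branch_fn` standing in for an arbitrary residual branch:

```python
import numpy as np

def scaled_residual(x, branch_fn, beta=0.3):
    # y = x + beta * F(x); beta is the residual scale (default 0.3).
    # A smaller beta (e.g. 0.1) shrinks the branch's contribution,
    # which can improve the convergence rate.
    return x + beta * branch_fn(x)

# Example: identity shortcut plus a damped nonlinear branch.
x = np.ones((4, 8), dtype=np.float32)
y = scaled_residual(x, np.tanh, beta=0.1)
```
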
clks-wzz (Author) commented Mar 2, 2018

Thanks for answering.
I want to train the normal Inception-ResNet first and then make some changes for PolyNet. However, I don't have Python Caffe scripts (a Python layer) for PolyNet to do this kind of operation. Could you offer me one?
Thanks a lot!
