
Training converges slowly with very deep PolyNet #2

Open
clks-wzz opened this issue Mar 2, 2018 · 2 comments

clks-wzz commented Mar 2, 2018

We are training a very deep PolyNet with 10 GPUs and a batch size of 30. It converges very slowly (almost no improvement).
Is there any solution?

zhangxc11 (Collaborator) commented

There are several tricks listed in the paper that may help:

  • initialization by insertion: first train a relatively shallow model, say a normal Inception-ResNet, and then use it to initialize the very deep PolyNet. You can find the details in Figure 9 of the paper; a pycaffe sketch of the weight copy is given after this list.
  • choose a small residual scale: 0.3 is the default residual scale, but you can use a smaller value to get a better convergence rate (see the second sketch below).
  • synchronized batch normalization: if you use 10 GPUs and a total batch size of 30, there are only 3 samples per GPU, which is too small; you can apply synchronized batch normalization to get a larger effective batch size.
  • about stochastic paths: this strategy is used to overcome overfitting; in the early stage of training it is not necessary.
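
For initialization by insertion, a minimal pycaffe net-surgery sketch is below. It assumes the shallow Inception-ResNet and the deep PolyNet prototxts share layer names for their common layers; the file names are placeholders, not files from this repo.

```python
import caffe

# Placeholder paths -- substitute the actual prototxt/caffemodel
# files for your shallow and deep models.
shallow = caffe.Net('inception_resnet.prototxt',
                    'inception_resnet.caffemodel', caffe.TEST)
deep = caffe.Net('polynet_very_deep.prototxt', caffe.TEST)

# Copy weights for every layer whose name and blob shapes match;
# the newly inserted poly blocks keep their random initialization.
for name, blobs in shallow.params.items():
    if name in deep.params:
        for i, blob in enumerate(blobs):
            if (i < len(deep.params[name]) and
                    deep.params[name][i].data.shape == blob.data.shape):
                deep.params[name][i].data[...] = blob.data

deep.save('polynet_init.caffemodel')
```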

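On the residual scale: each residual branch output is multiplied by a scalar beta before the elementwise sum with the shortcut, so a smaller beta damps the residual signal and stabilizes the early training of very deep nets. A minimal numpy sketch of the forward computation, with `branch_fn` standing in for an arbitrary residual branch:

```python
import numpy as np

def scaled_residual(x, branch_fn, beta=0.3):
    # y = x + beta * F(x); beta is the residual scale (default 0.3).
    # A smaller beta (e.g. 0.1) shrinks the branch's contribution,
    # which can improve the convergence rate.
    return x + beta * branch_fn(x)

# Example: identity shortcut plus a damped nonlinear branch.
x = np.ones((4, 8), dtype=np.float32)
y = scaled_residual(x, np.tanh, beta=0.1)
```
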
clks-wzz (Author) commented Mar 2, 2018

Thanks for answering.
I want to train the normal Inception-ResNet first and then make some changes for PolyNet. However, I don't have Python Caffe scripts (a Python layer) for PolyNet to do this kind of operation. Could you offer me one?
Thanks a lot!
