Initialization by insertion: first train a relatively shallow model, say a normal Inception-ResNet, and then use it to initialize the very deep PolyNet. You can find the details in Figure 9 of the paper.
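In case it is useful, here is a rough pycaffe net-surgery sketch of that initialization; the prototxt/caffemodel file names and the layer-name mapping are placeholders, not the actual PolyNet release files.

```python
import caffe

caffe.set_mode_cpu()

# Load the deep PolyNet definition with randomly initialized weights.
deep_net = caffe.Net('polynet_deep.prototxt', caffe.TEST)

# copy_from() fills every layer whose name matches a layer in the shallow
# Inception-ResNet checkpoint; newly inserted blocks keep their random init.
deep_net.copy_from('inception_resnet_shallow.caffemodel')

# Optionally seed an inserted block from the shallow block it was cloned from.
# The layer names below are hypothetical -- adapt them to your prototxt.
for new_layer, source_layer in [('stage_b2_conv1', 'stage_b1_conv1')]:
    for new_blob, src_blob in zip(deep_net.params[new_layer],
                                  deep_net.params[source_layer]):
        new_blob.data[...] = src_blob.data

deep_net.save('polynet_deep_init.caffemodel')
```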
Choose a small residual scale: 0.3 is the default, but you can use a smaller value to get a better convergence rate.
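For reference, the residual scale is just a multiplier on the residual branch before it is added to the shortcut; a minimal numpy sketch (shapes and values are arbitrary):

```python
import numpy as np

def scaled_residual(identity, residual, scale=0.3):
    # A smaller scale (e.g. 0.1-0.2 instead of the default 0.3) damps the
    # residual branch, which tends to stabilize very deep models early on.
    return identity + scale * residual

x = np.random.randn(2, 256, 17, 17).astype(np.float32)    # shortcut input
f_x = np.random.randn(2, 256, 17, 17).astype(np.float32)  # residual branch output
y = scaled_residual(x, f_x, scale=0.2)
```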
Synchronized batch normalization: if you use 10 GPUs with a total batch size of 30, there are only 3 samples per GPU, which is too small. You can apply synchronized batch normalization so the BN statistics are effectively computed over the full batch.
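To illustrate what synchronized BN buys you here, a numpy sketch of the statistics only (real implementations all-reduce these sums across GPUs; the 10x3 split just mirrors the setup discussed above):

```python
import numpy as np

num_gpus, per_gpu_batch, channels = 10, 3, 256
shards = [np.random.randn(per_gpu_batch, channels) for _ in range(num_gpus)]

# Plain BN: each GPU normalizes with statistics from only 3 samples -> noisy.
local_means = [s.mean(axis=0) for s in shards]

# Synchronized BN: aggregate count, sum and squared sum over all GPUs, so the
# mean/variance are effectively computed over the full batch of 30 samples.
count = sum(s.shape[0] for s in shards)
total_sum = sum(s.sum(axis=0) for s in shards)
total_sq = sum((s ** 2).sum(axis=0) for s in shards)
sync_mean = total_sum / count
sync_var = total_sq / count - sync_mean ** 2
```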
About stochastic paths: this strategy is used to overcome overfitting; it is not necessary in the early stage of training.
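If it helps, here is a rough numpy sketch of stochastic paths for one poly module; the keep probability and the test-time rescaling convention are assumptions, not the exact scheme from the paper.

```python
import numpy as np

def stochastic_poly_block(identity, branch_outputs, keep_prob=0.75, training=True):
    # Training: drop each branch independently with probability 1 - keep_prob.
    if training:
        kept = [b for b in branch_outputs if np.random.rand() < keep_prob]
        return identity + sum(kept) if kept else identity
    # Test: keep all branches, scaled so the expectation matches training.
    return identity + keep_prob * sum(branch_outputs)
```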
Thanks for answering.
I want to train the normal Inception-ResNet and make some changes to PolyNet.
I don't have Python Caffe scripts (a Python layer) for PolyNet to do simple operations. Could you offer me one?
Thanks a lot!
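For reference, a Caffe Python layer is just a class exposing setup/reshape/forward/backward, wired into the prototxt with type: "Python" and a python_param block giving the module and layer name. A minimal, hypothetical skeleton (not the actual PolyNet scripts):

```python
import caffe

class ScaleResidualLayer(caffe.Layer):
    """Hypothetical example layer that scales its single bottom blob."""

    def setup(self, bottom, top):
        # param_str comes from python_param { param_str: "0.3" } in the prototxt.
        self.scale = float(self.param_str) if self.param_str else 0.3

    def reshape(self, bottom, top):
        top[0].reshape(*bottom[0].data.shape)

    def forward(self, bottom, top):
        top[0].data[...] = self.scale * bottom[0].data

    def backward(self, top, propagate_down, bottom):
        if propagate_down[0]:
            bottom[0].diff[...] = self.scale * top[0].diff
```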
We train the very deep PolyNet with 10 GPUs and a total batch size of 30. It converges very slowly (almost no convergence).
Is there any solution?