multi-GPU #105

wanghechong · 2018-04-20T15:46:05Z

I read your code carefully, but I do not fully understand lines like `decoder_clones = clone_many_times(decoder, opt.max_sent_l_targ)`. Why do we need to copy the decoder `opt.max_sent_l_targ` times? The clones share parameters but do not share gradInput, gradOutput, and other state, and I do not clearly understand the logical relationship between these. Which things should be cloned?

My model is also {encoder (MLP), decoder}. If I want to train it, what should I watch out for? Because I increased the computation in my model, I have to train it on two GPUs in order to compare with others under the same hyperparameters.
