In line 523 of word2vec.c there is this formula:

```c
g = (1 - vocab[word].code[d] - f) * alpha;
```

Can you please help me understand its logic?
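For context, here is the hierarchical-softmax inner loop that surrounds that line, as I read it in word2vec.c (reproduced from memory with my own comments, so treat it as a sketch; the same formula appears in both the CBOW and skip-gram paths, with the input word's vector taking the place of `neu1` in skip-gram):

```c
for (d = 0; d < vocab[word].codelen; d++) {
  f = 0;
  l2 = vocab[word].point[d] * layer1_size;  /* offset of inner node d in syn1 */
  /* Propagate hidden -> output: dot product of context and node vectors */
  for (c = 0; c < layer1_size; c++) f += neu1[c] * syn1[c + l2];
  if (f <= -MAX_EXP) continue;
  else if (f >= MAX_EXP) continue;
  /* Table lookup of the sigmoid, so f ends up in (0, 1) */
  else f = expTable[(int)((f + MAX_EXP) * (EXP_TABLE_SIZE / MAX_EXP / 2))];
  /* The formula in question: 'g' is the gradient times the learning rate */
  g = (1 - vocab[word].code[d] - f) * alpha;
  /* Propagate errors output -> hidden */
  for (c = 0; c < layer1_size; c++) neu1e[c] += g * syn1[c + l2];
  /* Learn weights hidden -> output */
  for (c = 0; c < layer1_size; c++) syn1[c + l2] += g * neu1[c];
}
```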
Since `f` is the sigmoid of the dot product of the embedding and the context (the inner-node vector), in the case of hierarchical softmax we want it to be as close as possible to the turn (0 or 1) we have to take in the Huffman tree for this previous word (embedding) and the current word's node index (context). In that case I would expect we just need

```c
g = (vocab[word].code[d] - f) * alpha;
```

Taking into account that `vocab[word].code[d]` can only be 0 or 1, the `1 - vocab[word].code[d]` term merely swaps the left and right node labels, so what is its purpose?
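To make the question concrete, here is a minimal, self-contained sketch (my own illustration, not code from the repository; the names `sigmoid`, `g_orig`, and `g_mine` are mine) contrasting the two update rules:

```c
#include <stdio.h>
#include <math.h>

/* Logistic sigmoid; word2vec approximates this with a precomputed expTable. */
static double sigmoid(double x) { return 1.0 / (1.0 + exp(-x)); }

int main(void) {
  double alpha = 0.025;       /* learning rate */
  double score = 0.8;         /* dot product of context and node vectors */
  double f = sigmoid(score);  /* predicted probability, in (0, 1) */
  for (int code = 0; code <= 1; code++) {
    /* Rule from word2vec.c: the target is (1 - code), i.e. a 0 bit
       means "push f toward 1" and a 1 bit means "push f toward 0". */
    double g_orig = (1 - code - f) * alpha;
    /* The rule I expected: the target is the code bit itself. */
    double g_mine = (code - f) * alpha;
    printf("code=%d  g_orig=%+.5f  g_mine=%+.5f\n", code, g_orig, g_mine);
  }
  return 0;
}
```

Both rules drive `f` toward a fixed target per code bit and differ only in which child of an inner node is treated as the positive class, which is exactly why I am asking whether the `1 -` has a deeper purpose.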
I have summarized some more details here: https://datascience.stackexchange.com/questions/129865/intuition-behind-g-variable-calculation-in-the-original-word2vec-implementation