Gradient calculation formula of word2vec #34

Open
DamirTenishev opened this issue Aug 5, 2024 · 0 comments

At line 523 of word2vec.c there is this formula:

g = (1 - vocab[word].code[d] - f) * alpha;

Can you please help me understand its logic?
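
For context, here is a minimal, self-contained C sketch of what the quoted line does for a single Huffman-tree node. The vector values and dimensionality are made-up illustrations, and a direct expf() call stands in for the original's precomputed expTable; the names neu1, syn1, alpha, and code follow the original source.

```c
#include <stdio.h>
#include <math.h>

#define LAYER1_SIZE 4  /* illustrative embedding size; the real default is larger */

int main(void) {
  /* hypothetical hidden (context) vector and inner-node vector */
  float neu1[LAYER1_SIZE] = {0.10f, -0.20f, 0.30f, 0.05f};
  float syn1[LAYER1_SIZE] = {0.20f,  0.10f, -0.10f, 0.40f};
  float alpha = 0.025f;  /* learning rate */
  int code = 1;          /* Huffman code bit for this node: 0 or 1 */

  /* f = sigmoid(dot(neu1, syn1)); the original looks this up in expTable */
  float s = 0.0f;
  for (int c = 0; c < LAYER1_SIZE; c++) s += neu1[c] * syn1[c];
  float f = 1.0f / (1.0f + expf(-s));

  /* the line in question from the hierarchical-softmax update */
  float g = (1 - code - f) * alpha;
  printf("s = %f  f = %f  g = %f\n", s, f, g);

  /* the inner-node update that follows in the original loop */
  for (int c = 0; c < LAYER1_SIZE; c++) syn1[c] += g * neu1[c];
  return 0;
}
```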

Since f is the sigmoid of the dot product between the embedding and the context vectors, in the case of hierarchical softmax we want it to be as close as possible to the turn (0 or 1) in the Huffman tree that has to be taken for this previous word (embedding) and the current word's node index (context). In that case we would just need

g = (vocab[word].code[d] - f) * alpha;

Taking into account that vocab[word].code[d] can only be 0 or 1, the "1 - vocab[word].code[d]" term simply inverts the left/right node labels; what is its purpose?
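
To restate the question in equation form (standard sigmoid cross-entropy algebra, not something taken from the repository): for a node score $s$ with prediction $f = \sigma(s)$ and binary target $t$,

$$L = -\bigl[\, t \log f + (1 - t) \log (1 - f) \,\bigr], \qquad \frac{\partial L}{\partial s} = f - t,$$

so a gradient step with learning rate $\alpha$ uses $g = (t - f)\,\alpha$. The line in the source therefore corresponds to the target $t = 1 - \texttt{vocab[word].code[d]}$, while the formula proposed above corresponds to $t = \texttt{vocab[word].code[d]}$; the question is why the inverted target is used.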

I summarized some details here: https://datascience.stackexchange.com/questions/129865/intuition-behind-g-variable-calculation-in-the-original-word2vec-implementation
