
bnb optimizers could use bnb.nn.StableEmbedding instead of torch.nn.Embedding #1769

Closed
mtasic85 opened this issue Oct 3, 2024 · 2 comments · Fixed by #1770
Labels
enhancement New feature or request

Comments

@mtasic85
Contributor

mtasic85 commented Oct 3, 2024

According to bnb documentation here:

https://huggingface.co/docs/bitsandbytes/main/optimizers
https://huggingface.co/docs/bitsandbytes/main/explanations/optimizers#stable-embedding-layer

This line could alternate between bnb.nn.StableEmbedding and torch.nn.Embedding, or the choice could be made configurable in the config file:

wte=nn.Embedding(config.padded_vocab_size, config.n_embd),

There are also other places in the code where torch.nn.Embedding is used.
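A minimal sketch of what this proposal could look like: a factory function that picks the embedding class at model-creation time. The function name and the `use_stable_embedding` flag are hypothetical, not part of the existing config.

```python
import torch.nn as nn


def make_embedding(padded_vocab_size, n_embd, use_stable_embedding=False):
    # Hypothetical factory: choose the embedding class when the model is built.
    if use_stable_embedding:
        # bnb.nn.StableEmbedding is what the bitsandbytes docs recommend
        # when training embedding layers with 8-bit optimizers.
        import bitsandbytes as bnb

        return bnb.nn.StableEmbedding(padded_vocab_size, n_embd)
    return nn.Embedding(padded_vocab_size, n_embd)
```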

@mtasic85 mtasic85 added the question Further information is requested label Oct 3, 2024
@rasbt
Collaborator

rasbt commented Oct 3, 2024

Thanks for the note and good point; I didn't know about this.

One challenge I see with configuring it in the config file is that the config is used at model creation. But one can later optionally run with --quantize bnb.nf4 or not. So, ideally, that swap should only take place when the inference/training functions are called, leaving the original model as is.
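The swap-at-call-time idea could be sketched roughly as follows. The helper name, the `attr` parameter, and passing the replacement class explicitly are all assumptions for illustration; in practice `embedding_cls` would be `bnb.nn.StableEmbedding`, which shares `nn.Embedding`'s `(num_embeddings, embedding_dim)` constructor signature.

```python
import torch
import torch.nn as nn


def swap_embedding(model, attr, embedding_cls):
    # Hypothetical post-init swap: replace the embedding stored at `attr`
    # with an instance of `embedding_cls` (e.g. bnb.nn.StableEmbedding when
    # training with 8-bit optimizers), copying the existing weights so the
    # original model definition stays untouched.
    old = getattr(model, attr)
    new = embedding_cls(old.num_embeddings, old.embedding_dim)
    with torch.no_grad():
        new.weight.copy_(old.weight)
    setattr(model, attr, new)
    return model
```

This keeps the model class itself free of any bitsandbytes dependency; only the training entry point needs to know about the swap.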

@rasbt rasbt added enhancement New feature or request and removed question Further information is requested labels Oct 3, 2024
@rasbt
Collaborator

rasbt commented Oct 4, 2024

Upon reading a bit more, this would only be required for training (due to the optimizer choice). I added it in #1770.
