Add presets for Electra and checkpoint conversion script #1384
Please format your code with |
Looks overall good! Left a few comments. See Kaggle comment below.
"lowercase": False,
},
# TODO: Upload weights on GCS.
"weights_url": "https://storage.googleapis.com/pranav-keras/electra-base-generator/model.weights.h5",
We actually just moved all our weights over to Kaggle. https://github.com/keras-team/keras-nlp/releases/tag/v0.7.0
This will make it easier to upload models long term, but let me get back to you next week on exact steps for upload. If you have a kaggle username, could you reply here with it?
kaggle here is my kaggle username
Sorry for the delay. I'll make the requested changes. I have also left my Kaggle username above.
I have made all the changes as suggested.
This looks great! Just let me know where the final assets to copy over are and I will pull this in.
"path": "electra",
"model_card": "https://github.com/google-research/electra",
},
"kaggle_handle": "kaggle://pranavprajapati16/electra/keras/electra_base_discriminator_en/1",
I don't see anything at https://www.kaggle.com/models/pranavprajapati16/electra.
You should now have the ability to make models public, can you do so? Or is the actual model here? https://www.kaggle.com/models/pranavprajapati16/electra_base_discriminator_en (in which case these links are still wrong).
Let me know where to get the proper assets and I will copy to the Keras org.
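For anyone checking a handle against the model page, here is a minimal sketch of how the `kaggle_handle` above could map to the URL being tested. It assumes the handle layout is `kaggle://<owner>/<model>/<framework>/<variation>/<version>`; the helper names `parse_kaggle_handle` and `model_page_url` are hypothetical and not part of keras-nlp.

```python
from typing import NamedTuple


class KaggleHandle(NamedTuple):
    # Field names are an assumption about the handle layout, not an official schema.
    owner: str
    model: str
    framework: str
    variation: str
    version: str


def parse_kaggle_handle(handle: str) -> KaggleHandle:
    """Split a 'kaggle://owner/model/framework/variation/version' handle."""
    prefix = "kaggle://"
    if not handle.startswith(prefix):
        raise ValueError(f"Expected a handle starting with {prefix!r}: {handle!r}")
    parts = handle[len(prefix):].split("/")
    if len(parts) != 5 or not all(parts):
        raise ValueError(f"Expected 5 non-empty path segments: {handle!r}")
    return KaggleHandle(*parts)


def model_page_url(handle: str) -> str:
    """Build the public model page URL that should exist for this handle."""
    parsed = parse_kaggle_handle(handle)
    return f"https://www.kaggle.com/models/{parsed.owner}/{parsed.model}"


handle = "kaggle://pranavprajapati16/electra/keras/electra_base_discriminator_en/1"
print(model_page_url(handle))
```

Under that assumption, the handle in the diff points at the `electra` model page, with `electra_base_discriminator_en` as a variation of it rather than a separate model.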
Sorry, the model was private. I just made it public: https://www.kaggle.com/models/pranavprajapati16/electra
Thanks! Uploading now! I can just patch the new links into this PR and land. I'll ping here if I run into any issues.
Actually, it does look like there is an error here. It looks like the tokenizer should be configured to lowercase input, but is not. This is leading to some test failures. E.g. Can you take a look and confirm that we should be lowercasing input for all electra presets? If so I can go ahead and make the changes here. There will be some annoying renames to stick to our conventions--we should call the variants |
Also, could you try converting the https://huggingface.co/collections/google/electra-release-64ff6e8b18830fabea30a1ab |
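To illustrate why a missing lowercase step would show up as test failures, here is a toy sketch. It is not the keras-nlp tokenizer: `toy_tokenize` and the tiny vocabulary are made up for illustration, and it only does whitespace splitting with an all-lowercase (uncased) vocabulary.

```python
def toy_tokenize(text, vocab, lowercase):
    """Whitespace-split the text and map out-of-vocabulary words to [UNK]."""
    if lowercase:
        text = text.lower()
    return [word if word in vocab else "[UNK]" for word in text.split()]


# Hypothetical uncased vocabulary: every entry is lowercase.
vocab = {"[UNK]", "the", "quick", "brown", "fox"}

without_lowercasing = toy_tokenize("The quick Brown fox", vocab, lowercase=False)
with_lowercasing = toy_tokenize("The quick Brown fox", vocab, lowercase=True)

print(without_lowercasing)  # ['[UNK]', 'quick', '[UNK]', 'fox']
print(with_lowercasing)     # ['the', 'quick', 'brown', 'fox']
```

With an uncased vocabulary, any capitalized input falls through to `[UNK]` unless the tokenizer lowercases first, which is the kind of mismatch the `"lowercase": False` setting in the preset config would cause.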
3780fe9 to 2889350
# Conflicts: # keras_nlp/models/electra/electra_tokenizer.py
Regarding the tokenizer, I found this config for one of the presets. I have also uploaded the weights for the large models.
Thanks! I'll get large uploaded, and fix up the lowercasing issues.
OK, I think this is working! Going to pull this in. @pranavvp16 please let us know if you spot any issues. We should probably test this out with an end-to-end Colab to make sure things are working as intended.
Note that I changed all the preset names to include "uncased", in keeping with BERT conventions.
…1384)

* Added ElectraBackbone
* Added backbone tests for ELECTRA
* Fix config
* Add model import to __init__
* add electra tokenizer
* add tests for tokenizer
* add __init__ file
* add tokenizer and backbone to models __init__
* Fix Failing tokenization test
* Add example on usage of the tokenizer with custom vocabulary
* Add conversion script to convert weights from checkpoint
* Add electra preprocessor
* Add presets and tests
* Add presets config with model weights
* Add checkpoint conversion script
* Name conversion for electra models
* Update naming conventions according to preset names
* Fix failing tokenizer tests
* Update checkpoint conversion script according to kaggle
* Add validate function
* Kaggle preset
* update preset link
* Add electra presets
* Complete run_small_preset test for electra
* Add large variations of electra in presets
* Fix case issues with electra presets
* Fix format

---------

Co-authored-by: Matt Watson <[email protected]>
I have uploaded the weights to a personal Google Cloud bucket. The from_preset method works properly in my local setup, but it throws an error in a Google Colab notebook.