Add presets for Electra and checkpoint conversion script #1384

Merged
37 commits merged from pranavvp16:electra into keras-team:master on Mar 26, 2024

Conversation

pranavvp16 (Contributor)

I have uploaded the weights to a personal Google Cloud Storage bucket. The from_preset method works properly in my local setup, but it throws an error in a Google Colab notebook.
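(For reference, the check being described is loading the backbone via from_preset, roughly as in the sketch below. This is a minimal sketch; the preset name shown is illustrative and follows the uncased naming adopted later in this thread, not the GCS-backed preset under review at this point.)

```python
import keras_nlp

# Download the preset weights and instantiate the backbone from its config.
# The preset name is illustrative (final uncased naming from later in this
# thread), not the GCS-backed preset being reviewed here.
backbone = keras_nlp.models.ElectraBackbone.from_preset(
    "electra_base_generator_uncased_en"
)
backbone.summary()
```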

@mattdangerw (Member)

Please format your code with ./shell/format.sh. It also looks like we have a fairly simple merge conflict to resolve.

@mattdangerw mattdangerw self-requested a review January 4, 2024 21:40
@mattdangerw mattdangerw (Member) left a comment

Looks good overall! I left a few comments; see the Kaggle comment below.

Resolved review threads: keras_nlp/models/electra/electra_backbone.py (3), keras_nlp/models/electra/electra_presets.py

Snippet from the presets config under review:
"lowercase": False,
},
# TODO: Upload weights on GCS.
"weights_url": "https://storage.googleapis.com/pranav-keras/electra-base-generator/model.weights.h5",
mattdangerw (Member) commented on the snippet above:

We actually just moved all our weights over to Kaggle. https://github.com/keras-team/keras-nlp/releases/tag/v0.7.0

This will make it easier to upload models long term, but let me get back to you next week on the exact steps for uploading. If you have a Kaggle username, could you reply here with it?

pranavvp16 (Contributor, Author) replied:

My Kaggle username is pranavprajapati16.

Resolved review thread: keras_nlp/models/electra/electra_preprocessor.py
@pranavvp16 (Contributor, Author)

Sorry for the delay. I'll make the requested changes above; I have also left my Kaggle username above.

@pranavvp16 (Contributor, Author)

I have made all the changes as suggested.

@mattdangerw mattdangerw (Member) left a comment

This looks great! Just let me know where the final assets to copy over are and I will pull this in.

"path": "electra",
"model_card": "https://github.com/google-research/electra",
},
"kaggle_handle": "kaggle://pranavprajapati16/electra/keras/electra_base_discriminator_en/1",
mattdangerw (Member) commented on the snippet above:

I don't see anything at https://www.kaggle.com/models/pranavprajapati16/electra.

You should now have the ability to make models public; can you do so? Or is the actual model here: https://www.kaggle.com/models/pranavprajapati16/electra_base_discriminator_en (in which case these links are still wrong)?

Let me know where to get the proper assets and I will copy them to the Keras org.

pranavvp16 (Contributor, Author) replied:

Sorry, the model was private; I just made it public: https://www.kaggle.com/models/pranavprajapati16/electra

mattdangerw (Member) replied:

Thanks! Uploading now! I can just patch the new links into this PR and land it. I'll ping here if I run into any issues.

@mattdangerw mattdangerw added the kokoro:force-run Runs Tests on GPU label Mar 22, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Mar 22, 2024
@mattdangerw mattdangerw added the kokoro:force-run Runs Tests on GPU label Mar 23, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Mar 23, 2024
@mattdangerw (Member)

Actually, it does look like there is an error here. It looks like the tokenizer should be configured to lowercase input, but is not, and this is leading to some test failures.

E.g. input_data=["The quick brown fox."] produces an UNK token at the beginning of the sequence and failing tests.

Can you take a look and confirm that we should be lowercasing input for all Electra presets? If so, I can go ahead and make the changes here; there will be some annoying renames to stick to our conventions (we should call the variants uncased_en instead of _en).
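For illustration, the lowercasing behavior can be reproduced with a toy vocabulary. This is a minimal sketch that assumes the ElectraTokenizer added in this PR follows the usual KerasNLP WordPiece tokenizer API (vocabulary and lowercase arguments); real presets ship a full vocabulary file.

```python
import keras_nlp

# A toy uncased vocabulary standing in for a full WordPiece vocab.
vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "the", "quick", "brown", "fox", "."]

# Without lowercasing, "The" has no match in the uncased vocab and maps to [UNK].
cased = keras_nlp.models.ElectraTokenizer(vocabulary=vocab, lowercase=False)
print(cased(["The quick brown fox."]))

# With lowercasing, "The" is normalized to "the" and tokenizes cleanly.
uncased = keras_nlp.models.ElectraTokenizer(vocabulary=vocab, lowercase=True)
print(uncased(["The quick brown fox."]))
```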

@mattdangerw (Member) commented Mar 23, 2024

Also, could you try converting the large versions and adding presets for them? We should probably include those.

https://huggingface.co/collections/google/electra-release-64ff6e8b18830fabea30a1ab

@pranavvp16 pranavvp16 force-pushed the electra branch 2 times, most recently from 3780fe9 to 2889350 on March 23, 2024 21:07
# Conflicts:
#	keras_nlp/models/electra/electra_tokenizer.py
@pranavvp16 (Contributor, Author)

Regarding the tokenizer, I found the relevant config for one of the presets. I have also uploaded the weights for the large models.
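(For reference, one way to double-check the lowercasing setting of the original checkpoints is to read the tokenizer config that the upstream google/electra models publish on Hugging Face. A minimal sketch, assuming the repo ships a tokenizer_config.json with a do_lower_case entry; the repo id below is the upstream checkpoint, not anything defined in this PR.)

```python
import json

from huggingface_hub import hf_hub_download

# Fetch the tokenizer config of an upstream ELECTRA checkpoint.
config_path = hf_hub_download(
    repo_id="google/electra-base-discriminator",
    filename="tokenizer_config.json",
)
with open(config_path) as f:
    config = json.load(f)

# The upstream checkpoints are uncased, so this is expected to be True.
print(config.get("do_lower_case"))
```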

@mattdangerw (Member)

Thanks! I'll get the large models uploaded and fix up the lowercasing issues.

@mattdangerw mattdangerw added the kokoro:force-run Runs Tests on GPU label Mar 26, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Mar 26, 2024
@mattdangerw mattdangerw added the kokoro:force-run Runs Tests on GPU label Mar 26, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Mar 26, 2024
@mattdangerw (Member)

OK, I think this is working! Going to pull this in. @pranavvp16, please let us know if you spot any issues. We should probably test this out with an end-to-end Colab to make sure things are working as intended.

@mattdangerw (Member)

Note that I changed all the preset names to include "uncased", in keeping with BERT conventions, e.g. electra_small_generator_uncased_en.
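(An end-to-end smoke test along these lines could be run in Colab. A minimal sketch, assuming the renamed preset above and the standard KerasNLP preprocessor/backbone from_preset APIs.)

```python
import keras_nlp

preset = "electra_small_generator_uncased_en"

# Load the matching preprocessor and backbone from the renamed preset.
preprocessor = keras_nlp.models.ElectraPreprocessor.from_preset(preset)
backbone = keras_nlp.models.ElectraBackbone.from_preset(preset)

# Tokenize raw strings, then run them through the backbone.
features = preprocessor(["The quick brown fox."])
outputs = backbone(features)
print({name: tuple(value.shape) for name, value in outputs.items()})
```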

@mattdangerw mattdangerw merged commit a6700eb into keras-team:master Mar 26, 2024
8 of 10 checks passed
abuelnasr0 pushed a commit to abuelnasr0/keras-nlp that referenced this pull request Apr 2, 2024
Add presets for Electra and checkpoint conversion script (#1384)

* Added ElectraBackbone

* Added backbone tests for ELECTRA

* Fix config

* Add model import to __init__

* add electra tokenizer

* add tests for tokenizer

* add __init__ file

* add tokenizer and backbone to models __init__

* Fix Failing tokenization test

* Add example on usage of the tokenizer with custom vocabulary

* Add conversion script to convert weights from checkpoint

* Add electra preprocessor

* Add presets and tests

* Add presets config with model weights

* Add checkpoint conversion script

* Name conversion for electra models

* Update naming conventions according to preset names

* Fix failing tokenizer tests

* Update checkpoint conversion script according to kaggle

* Add validate function

* Kaggle preset

* update preset link

* Add electra presets

* Complete run_small_preset test for electra

* Add large variations of electra in presets

* Fix case issues with electra presets

* Fix format

---------

Co-authored-by: Matt Watson <[email protected]>
4 participants