Add FalconBackbone #1475

Merged: 16 commits, Mar 1, 2024

Member

@SamanehSaadat SamanehSaadat commented Feb 27, 2024

This is part of addressing #1372 to add the Falcon model to KerasNLP. This PR adds the FalconBackbone.

Checkpoint conversion colab


alibi = self._build_alibi_tensor(
    self.num_attention_heads, decoder_padding_mask
)
Member Author

@mattdangerw Right now, I'm calculating alibi in every layer even though it doesn't change from layer to layer. It would be more efficient to calculate it once in the backbone, but the backbone's init doesn't know the shapes yet. If you have any suggestions for calculating the alibi only once, let me know.

Member Author

Discussed this offline. This is fine for now because it doesn't seem to be a particularly compute-intensive part. Ideally, we would measure the impact of this repetition and remove it if it significantly affects the model's performance.
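To illustrate why the bias is identical in every layer, here is a minimal NumPy sketch of an ALiBi-style tensor. It is not the PR's `_build_alibi_tensor`: the helper name `build_alibi`, the output shape, and the restriction to a power-of-two head count are assumptions made for illustration. The point is only that the result depends on the head count and the padding mask, neither of which varies across layers.

```python
import numpy as np


def build_alibi(num_heads, padding_mask):
    """Hypothetical helper; assumes num_heads is a power of two."""
    # Per-head slopes form a geometric sequence: 2^(-8/n), 2^(-16/n), ...
    slopes = 2.0 ** (-8.0 * np.arange(1, num_heads + 1) / num_heads)
    # Position index of each non-padded token (0, 1, 2, ...); padding stays 0.
    mask = padding_mask.astype("int32")
    positions = (np.cumsum(mask, axis=-1) - 1) * mask
    # Broadcast to (batch, num_heads, 1, seq_len).
    return slopes[None, :, None, None] * positions[:, None, None, :]


mask = np.array([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])

# Computed once, as it could be if the shapes were known up front.
alibi = build_alibi(num_heads=4, padding_mask=mask)

# Recomputing inside each of three (hypothetical) layers gives the same tensor.
per_layer = [build_alibi(4, mask) for _ in range(3)]
assert all(np.array_equal(alibi, layer_alibi) for layer_alibi in per_layer)
```

Because the inputs are the same for every layer, the repeated work amounts to one cumulative sum and one broadcasted multiply per layer, which is consistent with treating it as a low-priority optimization until it has been measured.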

Member

@mattdangerw mattdangerw left a comment

Looks great! Left a few comments.


class FalconCausalLM(GenerativeTask):
    def __init__(self, backbone, preprocessor=None, **kwargs):
        inputs = backbone.input
Member

Member Author

Thanks for the reference.
Yeah! I was thinking of adding this in a follow-up PR.

Member Author

Removed the file.

keras.config.disable_traceback_filtering()


def convert_checkpoints(hf_model):
Member

You could consider modeling this after #1402 for a more fleshed-out conversion script (though the PR I'm linking is missing the numerics validation). It will save a full preset directory and print emojis, both very important.

Alternatively, we could wait to shift to something like this until after we add the tokenizer. Either way.
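As a rough idea of what a numerics validation step in a conversion script could look like, here is a minimal sketch that compares the outputs of the original and converted models on the same input. The function name `check_numerics` and the default tolerance are hypothetical; they are not taken from #1402 or from this PR's script.

```python
import numpy as np


def check_numerics(reference, converted, atol=1e-5):
    """Hypothetical check: compare two output arrays element-wise.

    Returns (passed, max_abs_diff).
    """
    reference = np.asarray(reference, dtype="float32")
    converted = np.asarray(converted, dtype="float32")
    if reference.shape != converted.shape:
        raise ValueError(
            f"Shape mismatch: {reference.shape} vs {converted.shape}"
        )
    # Largest element-wise deviation between the two models' outputs.
    max_abs_diff = float(np.max(np.abs(reference - converted)))
    return max_abs_diff <= atol, max_abs_diff


# Stand-in arrays; in a real script these would be the two models' outputs.
passed, diff = check_numerics([[0.1, 0.2]], [[0.1, 0.2]])
print("numerics match" if passed else f"mismatch, max abs diff {diff}")
```

Reporting the maximum absolute difference, rather than only a pass/fail flag, makes it easier to judge whether a mismatch is a precision issue or a weight-mapping bug.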

Member Author

Thanks for the info! It would be great if I could add it later.

Member

@mattdangerw mattdangerw left a comment

lgtm! feel free to merge if gpu testing looks good!

@SamanehSaadat SamanehSaadat merged commit 87eec69 into keras-team:master Mar 1, 2024
10 checks passed
@SamanehSaadat SamanehSaadat deleted the falcon-backbone branch March 1, 2024 01:18
abuelnasr0 pushed a commit to abuelnasr0/keras-nlp that referenced this pull request Apr 2, 2024
* Add Falcon backbone.

* Add docstring.

* Add dtype.

* Add checkpoint conversion script.

* Fix tests.

* Random fixes.

* Add cache.

* Cast cumsum to int32.

* Make sublayers public.

* Address backbone comments.

* Update attention computation to use einsum.

* Falcon only works with Keras3.

* Fix tests.

* Remove falcon_causal_lm file.

* Remove commented/unused codes.
2 participants