Add FalconBackbone
#1475
Conversation
alibi = self._build_alibi_tensor(
    self.num_attention_heads, decoder_padding_mask
)
@mattdangerw Right now I'm calculating alibi in every layer, even though it doesn't change from layer to layer. It would be more efficient to calculate it once in the backbone, but the backbone's `__init__` doesn't know the input shapes yet. If you have any suggestions for computing alibi only once, let me know.
Discussed this offline. This is fine for now because it doesn't seem to be a particularly compute-intensive part. Ideally, we would profile the impact of this repetition and avoid it if it turns out to have a significant effect on model performance.
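For context, here is a minimal NumPy sketch of how an ALiBi bias can be derived from a padding mask. `build_alibi_bias` is a hypothetical stand-in for `_build_alibi_tensor`; the exact slope handling and output shape in the PR may differ.

```python
import numpy as np


def build_alibi_bias(num_heads, padding_mask):
    """Hypothetical helper: build an ALiBi attention bias from a padding mask.

    `padding_mask` is a (batch, seq_len) array of 0/1 values. The returned
    bias has shape (batch, num_heads, 1, seq_len) and is added to the
    attention scores before the softmax.
    """
    # Standard ALiBi slopes for a power-of-two head count:
    # 2^(-8/n), 2^(-16/n), ..., 2^(-8).
    base = 2.0 ** (-8.0 / num_heads)
    slopes = np.array([base ** (i + 1) for i in range(num_heads)])

    # Position of each non-padding token along the sequence, zeroed on pads.
    positions = (np.cumsum(padding_mask, axis=-1) - 1) * padding_mask

    # (1, num_heads, 1) * (batch, 1, seq_len) -> (batch, num_heads, seq_len)
    bias = slopes[None, :, None] * positions[:, None, :]
    return bias[:, :, None, :]


# Example: 2 heads, one sequence with a single padded position.
mask = np.array([[1, 1, 1, 0]])
print(build_alibi_bias(2, mask).shape)  # (1, 2, 1, 4)
```

Because this bias depends only on the head count and the padding mask, computing it once per forward pass and passing it to every layer would avoid the repetition discussed above.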
Looks great! Left a few comments.
class FalconCausalLM(GenerativeTask):
    def __init__(self, backbone, preprocessor=None, **kwargs):
        inputs = backbone.input
Use this form: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/gemma/gemma_causal_lm.py#L148-L169
Though you could also leave this class as a follow-up.
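A sketch of that linked pattern, assuming the Falcon backbone exposes a tied `token_embedding` (a reversible embedding) the way the Gemma backbone does; imports are omitted and `GenerativeTask` is the base class already used above:

```python
class FalconCausalLM(GenerativeTask):
    def __init__(self, backbone, preprocessor=None, **kwargs):
        # === Layers ===
        self.backbone = backbone
        self.preprocessor = preprocessor

        # === Functional Model ===
        # Run the backbone over its own symbolic inputs, then project the
        # hidden states back to vocabulary logits with the tied embedding.
        inputs = backbone.input
        hidden_states = backbone(inputs)
        outputs = backbone.token_embedding(hidden_states, reverse=True)
        super().__init__(
            inputs=inputs,
            outputs=outputs,
            **kwargs,
        )
```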
Thanks for the reference.
Yeah! I was thinking of adding this as a follow-up PR.
Removed the file.
keras.config.disable_traceback_filtering()
def convert_checkpoints(hf_model):
You could consider modeling this after #1402 for a more fleshed-out conversion script (though the PR I'm linking is missing the numerics validation). It will save a full preset directory and print emojis, both very important.
Also, you could wait to switch to something like this until after we add the tokenizer. Either way.
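For the missing numerics validation, a minimal sketch of what such a check could look like, assuming `hf_model` is the base `transformers` Falcon model (no LM head) and the KerasNLP backbone takes the usual `token_ids`/`padding_mask` inputs; the prompt, tolerance, and helper name are illustrative:

```python
import numpy as np
from keras import ops


def validate_numerics(hf_model, hf_tokenizer, keras_backbone):
    """Compare final hidden states of the HF model and the KerasNLP backbone."""
    hf_inputs = hf_tokenizer(["What is Keras?"], return_tensors="pt")
    hf_hidden = hf_model(**hf_inputs).last_hidden_state

    keras_hidden = keras_backbone(
        {
            "token_ids": hf_inputs["input_ids"].numpy(),
            "padding_mask": hf_inputs["attention_mask"].numpy(),
        }
    )

    np.testing.assert_allclose(
        hf_hidden.detach().numpy(),
        ops.convert_to_numpy(keras_hidden),
        atol=1e-3,
    )
    print("✅ Numerics match")
```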
Thanks for the info! It would be great to add this in a later PR.
lgtm! feel free to merge if gpu testing looks good!
* Add Falcon backbone.
* Add docstring.
* Add dtype.
* Add checkpoint conversion script.
* Fix tests.
* Random fixes.
* Add cache.
* Cast cumsum to int32.
* Make sublayers public.
* Address backbone comments.
* Update attention computation to use einsum.
* Falcon only works with Keras 3.
* Fix tests.
* Remove falcon_causal_lm file.
* Remove commented-out/unused code.
This is part of addressing #1372 to add the Falcon model to KerasNLP. This PR adds the `FalconBackbone`.

Checkpoint conversion colab