Update gemma_backbone.py for sharding config. #1491

qlzh727 · 2024-03-06T20:37:11Z

This tries to address #1464.

The new setting is based on the internal Gemma training script.

Here are some perf benchmarks on TPU v3-8:

(Smaller values are better)

===================
Baseline (current setting):
generate: 1342 ms per 100 tokens
finetune with LoRA: 125 ms/step

=====================
This PR's setting:
generate: 1245 ms per 100 tokens
finetune with LoRA: 64 ms/step
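
To put the two sets of numbers side by side, here is a minimal sketch that computes the relative change from the figures quoted above. The dictionary keys and the `relative_change` helper are illustrative names, not part of the PR or of KerasNLP.

```python
# Benchmark figures quoted in the PR description (TPU v3-8); lower is better.
# The key names below are hypothetical labels chosen for this sketch.
baseline = {"generate_ms_per_100_tokens": 1342, "lora_finetune_ms_per_step": 125}
this_pr = {"generate_ms_per_100_tokens": 1245, "lora_finetune_ms_per_step": 64}


def relative_change(before, after):
    """Return (percent reduction in time, speedup factor)."""
    reduction_pct = (before - after) / before * 100
    speedup = before / after
    return reduction_pct, speedup


results = {
    name: relative_change(baseline[name], this_pr[name]) for name in baseline
}

for name, (reduction_pct, speedup) in results.items():
    print(f"{name}: {reduction_pct:.1f}% less time, {speedup:.2f}x speedup")
```

On these figures, generation time drops by roughly 7% while the LoRA fine-tuning step time is roughly halved.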

qlzh727 · 2024-03-12T17:41:15Z

PTAL again.

mattdangerw

This LGTM, though we still might want to check with other folks to help decide between the two.

qlzh727 · 2024-03-12T18:14:43Z

Ack, I will leave the PR here; feel free to merge it when ready.

mattdangerw

looks good!

* Update gemma_backbone.py for sharding config.
* Update unit test and fix format.
* Update sharding spec for gemma based on gemma training.

Update gemma_backbone.py for sharding config.

e95675a

github-actions bot added the "Gemma" label (Gemma model specific issues) Mar 6, 2024

mattdangerw self-requested a review March 8, 2024 01:42

Update unit test and fix format.

d11fb86

mattdangerw approved these changes Mar 12, 2024

Update sharding spec for gemma based on gemma training.

fcc94d5

qlzh727 requested a review from mattdangerw March 14, 2024 17:22

mattdangerw approved these changes Mar 14, 2024

mattdangerw merged commit 4511580 into master Mar 14, 2024
18 of 19 checks passed

qlzh727 mentioned this pull request Mar 15, 2024

Question about Gemma tensor parallel sharding policy #1464

Closed

josharian mentioned this pull request May 3, 2024

GemmaBackbone.get_layout_map broken for gemma_2b_en #1613

Open

mattdangerw deleted the qlzh727-patch-2 branch August 22, 2024 00:15
