[Finetune] enable converting checkpoints without optimizer state generation #424

Open · wants to merge 1 commit into main
Conversation

billishyahao

Our LLaMa finetune showcase consists of two parts:

  1. Converting a Hugging Face checkpoint into a Megatron-DeepSpeed-compatible checkpoint.
  2. Finetuning on a supervised dataset.

The latest version of the checkpoint conversion generates not only the model weights but also the optimizer state and LR scheduler state, which occupies a huge amount of disk storage. This patch skips generating those states and instead re-creates them when the finetune program starts. Take LLaMa 7B as an example:

```
# du -sh llama-7b-mega-ds-T2P2.*
13G     llama-7b-mega-ds-T2P2.with-patch
38G     llama-7b-mega-ds-T2P2.without-patch
```
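Conceptually, the change amounts to persisting only the model weights at conversion time and building a fresh optimizer and LR scheduler when finetuning begins. The sketch below is a minimal, hypothetical illustration of that idea in plain PyTorch (the function names and hyperparameters are ours, not the converter's actual code, and the real converter writes Megatron-DeepSpeed layer/partition files rather than a single `.pt` file):

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

# --- conversion side: persist only the model weights ---
# (hypothetical sketch; no optimizer.state_dict() or
# scheduler.state_dict() is written, which is where most of the
# 38G -> 13G saving would come from)
def save_weights_only(model, path):
    torch.save({"model": model.state_dict()}, path)

# --- finetune side: rebuild optimizer / LR scheduler from scratch ---
def load_and_rebuild(model, path, lr=2e-5, warmup_steps=100):
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    # Fresh optimizer state: Adam moments start at zero and are
    # repopulated during the first finetuning steps.
    optimizer = AdamW(model.parameters(), lr=lr)
    scheduler = LambdaLR(
        optimizer,
        lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps),
    )
    return optimizer, scheduler
```

Since finetuning starts from pretrained weights anyway, zeroed Adam moments are acceptable: they are repopulated within the first training steps, which is consistent with the loss convergence shown below.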

Running the finetune task on the alpaca dataset shows that the loss converges well:

[Image: training loss curve]
