Quark dataset importer for fp8 #96

dan-garvey · 2024-07-09T18:47:13Z

Provides an entry point for Quark models that are stored as a safetensors file plus a config.json. With a little effort this could also be adapted to become the equivalent of the hf importer from gguf, without the intermediate step.
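For reviewers unfamiliar with the input layout, here is a minimal sketch of inspecting such a checkpoint using only the standard library. It is not the importer from this PR. It relies on the safetensors file layout (an 8-byte little-endian header length followed by a JSON header); the file name `model.safetensors` and the helper names `read_safetensors_header` and `summarize_checkpoint` are assumptions made for illustration.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian u64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # The header itself is UTF-8 JSON describing every tensor.
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize_checkpoint(model_dir):
    """Load config.json and list tensor dtypes/shapes without reading weights.

    Assumes the directory holds `config.json` and `model.safetensors`
    (hypothetical file names; adjust to the actual export).
    """
    model_dir = Path(model_dir)
    config = json.loads((model_dir / "config.json").read_text())
    header = read_safetensors_header(model_dir / "model.safetensors")
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    tensors = {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
    }
    return config, tensors
```

Calling `summarize_checkpoint(some_dir)` returns the parsed config and a name-to-(dtype, shape) mapping, which is a quick way to see which parameters are stored as fp8 (safetensors dtype strings such as `F8_E4M3`) before running a full import.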

I removed a lot of the useful debugging tooling I developed for this because it doesn't play nicely with torch.export, but I imagine a lot of it could be re-added if I guarded it in some way, so I'm keeping a branch with those changes for reference.

stellaraccident

Please create an import script specific to llama, probably in models/llama/tools. Fine to start from here, but these will diverge a lot.

sharktank/sharktank/models/llama/tools/import_quark_dataset.py

sharktank/sharktank/layers/linear.py

dan-garvey requested review from harsh-amd and stellaraccident July 9, 2024 18:50

stellaraccident requested changes Jul 9, 2024

dan-garvey force-pushed the llama_fp8 branch from e0d100a to 586f3e2 Compare July 10, 2024 23:06

dan-garvey force-pushed the llama_fp8 branch 2 times, most recently from 142aeb7 to 4ed3c9d Compare July 19, 2024 20:59

dan-garvey force-pushed the llama_fp8 branch from da0f0a2 to d2f81d4 Compare August 13, 2024 20:34

dan-garvey and others added 20 commits August 28, 2024 13:44

(WIP) llama fp8 safetensor conversion

6e2e1ec

add in changes for loading model.

b5e0545

its ugly

move file

2967d86

re-add original

409e928

fix importer

70cadbb

TODO: split qkv prior to irpa

dont quantize quantized parameters facedesk

be8352b

datatype

2acb54c

add some fixes to run

92c2f7c

mid-debug

d285bec

more debug

0e82323

update

3c0690d

rebased

82878ec

checkpoint before swapping quant style

2979f7b

holy $*@($@ a working importer

d212566

stable prefill

cb9e642

Rework import quark dataset

d3bc063

import directly to gguf format

some cleanup

0d2d78f

remove some device calls, add some comments

65256e2

undo some uneccessary changes

remove some default values

7433ae9

last pass?

7f3c963

and rebase!

dan-garvey changed the title ~~(WIP) (DNM) llama fp8 safetensor conversion~~ Quark dataset importer for fp8 Aug 28, 2024

dan-garvey force-pushed the llama_fp8 branch from 6b85967 to 7f3c963 Compare August 28, 2024 20:46

add a test for Theta.pop()

b93aa68

dan-garvey marked this pull request as ready for review August 29, 2024 18:21

dan-garvey force-pushed the llama_fp8 branch from 07e6c5e to b93aa68 Compare August 29, 2024 18:23

Merge branch 'main' into llama_fp8

9becccf

dan-garvey requested review from rsuderman, archana-ramalingam and IanNod September 4, 2024 22:40

IanNod requested changes Sep 5, 2024

sharktank/sharktank/models/llama/tools/import_quark_dataset.py (outdated, resolved)

sharktank/sharktank/models/llama/tools/import_quark_dataset.py (outdated, resolved)

IanNod reviewed Sep 5, 2024

sharktank/sharktank/models/llama/tools/import_quark_dataset.py (resolved)

sharktank/sharktank/layers/linear.py (outdated, resolved)

dan-garvey and others added 5 commits September 5, 2024 14:14

address comments

57368b5

remove print from linear

0fe9da1

Merge branch 'main' into llama_fp8

6cd7880

adds support for fn->fnuz

83d385c
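For context on fn->fnuz: I have not confirmed how this commit implements it, so the following is only a sketch of one common approach, written in plain Python rather than torch. The `float8_e4m3fn` format uses exponent bias 7 and `float8_e4m3fnuz` uses bias 8, so the same bit pattern decodes to half the value under fnuz; reinterpreting the bytes and doubling the quantization scale preserves the dequantized result. The names `decode_e4m3` and `fn_to_fnuz` are hypothetical helpers, not sharktank APIs.

```python
import math


def decode_e4m3(byte, fnuz=False):
    """Decode one FP8 E4M3 byte (fn or fnuz variant) to a Python float."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if fnuz:
        if byte == 0x80:  # the single NaN encoding in e4m3fnuz
            return math.nan
        bias = 8
    else:
        if exp == 0xF and man == 0x7:  # 0x7F and 0xFF are NaN in e4m3fn
            return math.nan
        bias = 7
    if exp == 0:  # subnormal numbers (and zero)
        return sign * (man / 8.0) * 2.0 ** (1 - bias)
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - bias)


def fn_to_fnuz(raw, scale):
    """Reinterpret e4m3fn bytes as e4m3fnuz, compensating through the scale.

    Assumption: NaN bytes (0x7F, 0xFF) do not occur in the weights; they are
    not handled here and would decode as +/-240 under fnuz.
    """
    # 0x80 is -0.0 in fn but NaN in fnuz, so map it to +0.0.
    fixed = bytes(0x00 if b == 0x80 else b for b in raw)
    # The same bits are worth half as much under fnuz, so double the scale.
    return fixed, scale * 2.0
```

With this, `decode_e4m3(b) * scale` before conversion equals `decode_e4m3(b2, fnuz=True) * new_scale` after it for every non-NaN byte, which is the property a weight importer would need to preserve.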

remove spurious clone

f6f6245

dan-garvey requested a review from IanNod September 24, 2024 17:55

IanNod approved these changes Sep 24, 2024

Merge branch 'main' into llama_fp8

f1e1bcb

dan-garvey enabled auto-merge (squash) September 24, 2024 20:39

stellaraccident approved these changes Sep 24, 2024

dan-garvey merged commit 9f4283d into main Sep 24, 2024
7 checks passed

dan-garvey deleted the llama_fp8 branch September 24, 2024 20:47
