initial grok #169
Conversation
a45d444 to 4095db0 (Compare)
Posted a comment here; it doesn't show up on the main page.
Let's chat at the meeting.
I did say I didn't mind if we did this, but looking at it now, all the args are just values from the config anyway. I feel like separating them just makes our code harder to follow.
Not a big deal either way, but I'd personally prefer we just drop this commit. @KyleHerndon what do you think?
I think this is the right way to set things up. The weak argument is that llama.cpp has a similar setup, and we're referencing them for accuracy baselines. The better argument goes something like:
The KV cache is not ontologically related to the config; it just (currently) exclusively uses parameters from it. In the near future we will want things like sharding, at which point the KV cache might need additional args (number of devices, for example, which is not a ModelConfig parameter, but something like an ExecutionConfig parameter).
This change doesn't feel related to grok so much as a code refactor, which I think is fine to include in a bigger patch for efficiency but might make things harder to follow when looking at commit history. Otherwise, I find the code just as easy to follow, if not slightly easier, because I think it is better organized.
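A minimal sketch of the setup being discussed (all names here are hypothetical, not the actual sharktank classes): the cache constructor takes plain values rather than the config object, so a future ExecutionConfig-style parameter such as a device count can be added without touching the model config at all.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    # Hypothetical stand-in for the model hyperparameter config.
    block_count: int = 32
    attn_head_count: int = 8
    attn_head_dim: int = 128


class KVCache:
    # Explicit args: the cache does not depend on ModelConfig, so
    # non-model parameters (e.g. device_count for sharding) can be
    # added here later without widening the config's responsibilities.
    def __init__(self, *, block_count: int, head_count: int,
                 head_dim: int, device_count: int = 1):
        self.block_count = block_count
        self.head_count = head_count
        self.head_dim = head_dim
        self.device_count = device_count


config = ModelConfig()
cache = KVCache(
    block_count=config.block_count,
    head_count=config.attn_head_count,
    head_dim=config.attn_head_dim,
)
```

The trade-off debated above is exactly this: the call site repeats the config fields, but the cache class itself stays decoupled from where those values come from.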
I don't mind either way, but do agree with Kyle that this keeps it better organized. We can consult Rob or Stella to see if there is a better way to do this.
Not necessary; if you two agree, that's more than enough for me.
Looks good to me. @KyleHerndon @archana-ramalingam any final changes you think are needed?
```diff
@@ -233,7 +233,6 @@ def main():
    device = torch.device(args.device) if args.device else None
    activation_dtype = getattr(torch, args.activation_dtype)
    attention_dtype = getattr(torch, args.attention_dtype)
```
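For context, the diff above uses `getattr(torch, name)` to turn a dtype string from the CLI into a `torch.dtype`. A small sketch of that lookup pattern with validation (`resolve_dtype` is a hypothetical helper, not sharktank API, demonstrated on a stand-in namespace so the example runs without torch installed):

```python
import types


def resolve_dtype(module, name):
    # Hypothetical helper: look up a dtype attribute by name,
    # e.g. resolve_dtype(torch, "float16") -> torch.float16.
    if name is None:
        return None
    dtype = getattr(module, name, None)
    if dtype is None:
        raise ValueError(f"unknown dtype name: {name!r}")
    return dtype


# Stand-in for the torch module, so the sketch is self-contained.
fake_torch = types.SimpleNamespace(float16="f16", float32="f32")
print(resolve_dtype(fake_torch, "float16"))  # f16
```

The explicit `None` check is the main difference from a bare `getattr`: a typo in the CLI flag fails with a clear error instead of an `AttributeError` deep in model setup.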
nice catch
I tested the Llama and Grok models again; LGTM.
@dan-garvey, here is the error: this fails

```python
output = weight * to(output, weight.dtype)
```

but this is fine:

```python
output = elementwise(torch.mul, weight, to(output, weight.dtype))
```

When calling `weight * …`, Python dispatches to

```python
def __mul__(self, rhs):
    from ..ops import elementwise
    return elementwise(torch.mul, self, rhs)
```

Instead of the operator, calling

```python
output = weight.my_mul(to(output, weight.dtype))
```

with

```python
class InferenceTensor(ABC):
    ...
    def my_mul(self, rhs):
        from ..ops import elementwise
        return elementwise(torch.mul, self, rhs)
```

works. This is essentially the same code, but PyTorch does something special about binary operators. `+` also suffers from the same problem. Also, when running with env vars …, the tracer reports … It is doing something special about …
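For background (this illustrates general Python semantics only, not the PyTorch internals suspected above): binary operators go through the operator protocol, where `a * b` is looked up as `type(a).__mul__` and can fall back to the reflected `__rmul__` via `NotImplemented`, whereas a named method like `my_mul` is an ordinary attribute call with no such indirection. `Scaled` here is a made-up class for illustration.

```python
class Scaled:
    def __init__(self, v):
        self.v = v

    def __mul__(self, rhs):
        # Invoked for `self * rhs`. Returning NotImplemented tells Python
        # to try the reflected operand (rhs.__rmul__) before raising.
        if not isinstance(rhs, Scaled):
            return NotImplemented
        return Scaled(self.v * rhs.v)

    def my_mul(self, rhs):
        # Plain method call: no operator-protocol indirection at all.
        return Scaled(self.v * rhs.v)


a, b = Scaled(3), Scaled(4)
print((a * b).v, a.my_mul(b).v)  # 12 12
```

Tensor libraries can additionally intercept operator dispatch (e.g. PyTorch's `__torch_function__` protocol), which is one place where `a * b` and `a.my_mul(b)` can diverge even when the method bodies are identical.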
I opened an issue with PyTorch. I think that may be a bug there.
Initial grok work, also does some refactoring