Support seqpos slicing #294
base: main
Conversation
Hey @callummcdougall I've pushed:
Got it, sorry for causing undue work - yes, in the future I'll make sure to add tests! I wasn't sure about putting it in the SAE config because it's about the SAE's training data (or what inputs make sense for it), not about e.g. the SAE's actual architecture. I was basing this on the fact that
@callummcdougall I think the idea is that if you couldn't evaluate the SAE without knowing about this property, then it needs to be in the SAE config. Speaking of which, I don't see any changes to evals.py, but presumably we should ensure that evals are only run on seqpos positions? Are you able to do this?
Code-wise this looks good to me, and looks like a reasonable addition to the library! Will defer to @jbloomAus on whether this is OK to merge. I guess there's a question of whether the expectation is that this would require different evals, or if this is something that only affects training.
```python
activations = activation_store.get_activations(batch)

assert batch.shape == (1, 10)  # Full context size
assert activations.shape == (1, 6, 1, cfg.d_in)  # Only 6 positions (2 to 7)
```
nice! Really great test 🥇
Think it does seem valuable to also have the logged metrics during training only apply to the right sequence positions - is that what you meant @jbloomAus, or did you mean evals that are applied in a non-training context? Either way I can likely get to that later this week.
This allows seqpos slicing during training. Basically we add a `seqpos_slice` arg to the `LanguageModelSAERunnerConfig` (in the form of a tuple, which gets converted to a slice via `slice(*seqpos_slice)` - this is because slice objects aren't serializable when we're saving the config).

Apart from this config, the only other file getting changed is `activations_store.py`. It now has a `seqpos_slice` attribute, and it uses this to slice the activations which are fetched from `get_activations` (and which are used in `get_buffer`).

Note that the default behaviour is `seqpos_slice = (None,)`, which slices over all sequence positions. Also note that `seqpos_slice` can be used in conjunction with `context_size` (i.e. one doesn't make the other redundant).
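
The tuple-to-slice mechanics described above can be sketched as follows. This is a minimal illustration of the `slice(*seqpos_slice)` conversion, not the library's actual code; the variable names and the toy data are assumptions.

```python
# Hypothetical config value: stored as a tuple because slice objects
# aren't serializable when the config is saved.
seqpos_slice = (2, 8)

# Converted to a slice object at use time.
sl = slice(*seqpos_slice)  # equivalent to slice(2, 8)

# A toy "sequence" of 10 positions standing in for the sequence dimension
# of an activations tensor with context_size = 10.
positions = list(range(10))
print(positions[sl])  # [2, 3, 4, 5, 6, 7] - 6 positions, as in the test above

# The default (None,) expands to slice(None), which keeps all positions.
assert positions[slice(*(None,))] == positions
```

This also shows why `seqpos_slice` and `context_size` compose rather than overlap: `context_size` fixes how many positions exist, while the slice selects which of them feed the SAE.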