chore: setting up pyright type checking and fixing typing errors #18

chanind · 2024-02-28T23:09:03Z

This PR sets up Pyright for type-checking, and attempts to fix all type errors in the codebase.

Some things to note:

- I struggled a lot with the types of the various configs expected throughout the code, so I might have made a mistake somewhere. I switched to just typing the configs as Any in later parts of the code.
- I was pretty liberal about casting to Any when I didn't know what the correct type of something was.
- I was pretty liberal about adding assertions to help the type checker understand what's going on.
- This PR does make some minor changes to the codebase itself where necessary for the type checking to pass. Many of these likely address actual bugs in the code, but it's also possible I've messed something up and broken something.
- I left TODO notes where there was some weird typing thing I didn't know how to handle properly.
- The geom_median dir is ignored by the type checker, since it's sort of independent of this codebase.
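
For readers unfamiliar with the trade-off in the notes above, here is a minimal sketch of casting to Any versus checking the value at runtime. The loader and the config keys (`load_raw_config`, `d_in`, `lr`) are hypothetical and not taken from this codebase.

```python
from typing import Any, cast


def load_raw_config() -> object:
    # Hypothetical loader; its declared return type is too loose to index into.
    return {"lr": 0.001, "d_in": 512}


def read_d_in_untyped() -> int:
    # cast() does nothing at runtime: it only tells the checker to treat the
    # value as Any, so a wrong key or value type is not flagged statically.
    cfg = cast(Any, load_raw_config())
    return cfg["d_in"]


def read_d_in_checked() -> int:
    # A stricter alternative: the isinstance assertions validate the shape at
    # runtime and also narrow the type for the checker.
    cfg = load_raw_config()
    assert isinstance(cfg, dict)
    value = cfg["d_in"]
    assert isinstance(value, int)
    return value
```

The cast version is quicker to write but hides mistakes from the checker, which is why it is usually worth revisiting once the correct config types are known.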

closes #15

chanind · 2024-02-28T23:13:17Z

sae_analysis/visualizer/data_fns.py

@@ -462,11 +465,7 @@ def get_all_html(self, debug: bool = False, split_scripts: bool = False) -> str:
        if debug:
            display(HTML(html_string))

-        if split_scripts:
-            scripts, html_string = extract_and_remove_scripts(html_string)
-            return scripts, html_string


This split_scripts option causes the return type to be a tuple instead of a string, which breaks typing. It looks like this option is never used anyway throughout the code so I just deleted it here. If this is not OK I can try going back and casting everything to str everywhere this function is called, but IMO it's probably cleaner to have functions have only a single return type in general.
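
If the tuple-returning behaviour is ever needed again, one option is a separate function rather than a flag, so each function keeps a single return type. This is only a sketch: the regex-based `extract_and_remove_scripts` below is a hypothetical stand-in for the real helper, and `get_all_html_split` is an invented name.

```python
import re
from typing import Tuple

# Hypothetical stand-in for the helper referenced in the diff; the real
# implementation may differ.
_SCRIPT_RE = re.compile(r"<script\b[^>]*>.*?</script>", re.DOTALL | re.IGNORECASE)


def extract_and_remove_scripts(html_string: str) -> Tuple[str, str]:
    """Return (concatenated script tags, html with the script tags removed)."""
    scripts = "".join(_SCRIPT_RE.findall(html_string))
    return scripts, _SCRIPT_RE.sub("", html_string)


def get_all_html(html_string: str) -> str:
    # Always returns a single string, so callers never have to narrow the type.
    return html_string


def get_all_html_split(html_string: str) -> Tuple[str, str]:
    # Separate entry point for callers that want the scripts split out.
    return extract_and_remove_scripts(get_all_html(html_string))
```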

chanind · 2024-02-28T23:14:49Z

Looks like tests are failing due to the huggingface hub being down 🤦‍♂️. https://twitter.com/huggingface/status/1762954032312639702

jbloomAus · 2024-02-28T23:42:50Z

> Looks like tests are failing due to the huggingface hub being down 🤦‍♂️. https://twitter.com/huggingface/status/1762954032312639702

No worries! Can wait.

jbloomAus · 2024-02-28T23:46:11Z

@chanind Amazing work! I think we're likely to delete a lot of the SAE visualizer stuff (sorry for not telling you earlier) as the new version of SAE vis (https://github.com/callummcdougall/sae_vis) should more or less work with our code. I'll make an issue for making a script to show how to use this with those SAEs.

Also #17 might cause some further type checking issues. Idk if it makes more sense to merge this PR or that PR first, do you have thoughts? Many thanks!

chanind · 2024-02-29T00:19:22Z

No worries! Go ahead and merge #17 first and I'll fix up any typing issues in this PR. That's likely easier than you having to figure out the typing stuff just to get the PR merged. It shouldn't be too much to fix up.

jbloomAus · 2024-02-29T21:00:15Z

@chanind other PR has been merged if you want to rebase :)

chanind · 2024-02-29T22:56:19Z

sae_analysis/dashboard_runner.py


+        assert sparse_autoencoder.cfg.d_sae is not None  # keep pyright happy


Is d_sae really optional? It seems like throughout the code it's assumed to be present
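
For context, here is a sketch of one pattern that would make an assertion like this necessary. It assumes, without confirmation from the code, that the field is declared Optional and filled in after construction; `SAEConfig`, `d_in`, and `expansion_factor` here are hypothetical illustrations, not the project's real config.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SAEConfig:
    # Hypothetical config, not the project's real class.
    d_in: int = 512
    expansion_factor: int = 4
    d_sae: Optional[int] = None  # filled in by __post_init__ when left unset

    def __post_init__(self) -> None:
        if self.d_sae is None:
            self.d_sae = self.d_in * self.expansion_factor


def feature_count(cfg: SAEConfig) -> int:
    # The declared type is still Optional[int], so the checker needs the
    # assertion even though __post_init__ guarantees a value at runtime.
    assert cfg.d_sae is not None
    return cfg.d_sae
```

If the value really is always present after construction, declaring the field as a plain int would remove the need for the assertion.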

chanind · 2024-02-29T22:57:24Z

sae_analysis/dashboard_runner.py

            self.activation_store,
        ) = LMSparseAutoencoderSessionloader.load_session_from_pretrained(self.sae_path)
+        self.sparse_autoencoder = sae_group.autoencoders[0]


I had to pull out just the first autoencoder from the group here, since it looks like this file is expecting a single autoencoder to come out of this function rather than a group. IIRC this file will be removed in the future anyway, so probably fine?

chanind · 2024-02-29T22:58:11Z

scripts/generate_dashboards.py

            self.activation_store,
        ) = LMSparseAutoencoderSessionloader.load_session_from_pretrained(self.sae_path)
+        # TODO: handle multiple autoencoders
+        self.sparse_autoencoder = sae_group.autoencoders[0]


I had to pull out just the first autoencoder from the group here as well. Is this file also going to be removed in the future?
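
As a rough idea of what addressing the TODO could look like, the sketch below applies a step to every autoencoder in a group instead of only the first. `FakeAutoencoder`, `FakeSAEGroup`, and both helper functions are hypothetical placeholders; only the `autoencoders` attribute name comes from the diff.

```python
from dataclasses import dataclass, field
from typing import Callable, List, TypeVar

T = TypeVar("T")


@dataclass
class FakeAutoencoder:
    # Hypothetical placeholder for a single sparse autoencoder.
    name: str


@dataclass
class FakeSAEGroup:
    # Hypothetical placeholder exposing the `autoencoders` attribute from the diff.
    autoencoders: List[FakeAutoencoder] = field(default_factory=list)


def first_autoencoder(group: FakeSAEGroup) -> FakeAutoencoder:
    # Mirrors the diff, but fails with a clear message on an empty group.
    if not group.autoencoders:
        raise ValueError("SAE group contains no autoencoders")
    return group.autoencoders[0]


def for_each_autoencoder(
    group: FakeSAEGroup, fn: Callable[[FakeAutoencoder], T]
) -> List[T]:
    # One possible way to handle multiple autoencoders: run the same step on each.
    return [fn(sae) for sae in group.autoencoders]
```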

chanind · 2024-02-29T22:59:26Z

👍 should be updated for the new changes.

jbloomAus · 2024-03-01T16:11:59Z

thanks @chanind

chore: setting up pyright type checking and fixing typing errors

351995c

Merge branch 'main' into type-checking

57c4582

jbloomAus merged commit bd5fc43 into jbloomAus:main Mar 1, 2024
2 checks passed

chanind deleted the type-checking branch March 1, 2024 16:23

tom-pollak pushed a commit to tom-pollak/SAELens that referenced this pull request Oct 22, 2024

Merge pull request jbloomAus#18 from chanind/type-checking

73219a8

chore: setting up pyright type checking and fixing typing errors
