Make load_state_dict use strict=False #95

neelnanda-io · 2024-04-20T19:59:59Z

Your library is too strict: it raises errors when loading state dicts that are missing a parameter, e.g. a scaling factor. I added a quick fix so that loading a state dict containing only some of the parameters no longer raises an error. This runs the risk that someone accidentally loads a state dict with too few parameters, which would be bad, but that seems pretty unlikely to me.
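For readers unfamiliar with the flag, here is a minimal sketch of what `strict=False` changes in PyTorch's `load_state_dict`. The `TinySAE` class and its parameter names (`W_enc`, `W_dec`, `scaling_factor`) are hypothetical stand-ins, not the actual SAELens code.

```python
import torch
from torch import nn


class TinySAE(nn.Module):
    """Hypothetical stand-in for an SAE; the parameter names are assumptions."""

    def __init__(self, d_in: int = 4, d_sae: int = 8) -> None:
        super().__init__()
        self.W_enc = nn.Parameter(torch.zeros(d_in, d_sae))
        self.W_dec = nn.Parameter(torch.zeros(d_sae, d_in))
        # Newer parameter that older checkpoints do not contain.
        self.scaling_factor = nn.Parameter(torch.ones(d_sae))


model = TinySAE()

# Simulate an older checkpoint saved before `scaling_factor` existed.
old_state_dict = {
    k: v for k, v in model.state_dict().items() if k != "scaling_factor"
}

# strict=True (the default) raises RuntimeError because a key is missing.
try:
    model.load_state_dict(old_state_dict)
    strict_load_failed = False
except RuntimeError:
    strict_load_failed = True

# strict=False loads the keys that match and reports the rest instead of raising.
result = model.load_state_dict(old_state_dict, strict=False)
missing = list(result.missing_keys)
unexpected = list(result.unexpected_keys)
```

Note that with `strict=False` the mismatch is only reported in the return value, so a caller who ignores `missing_keys` gets no warning. That is the risk described above.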

codecov · 2024-04-20T20:02:11Z

Codecov Report

Attention: Patch coverage is 25.00%, with 3 lines in your changes missing coverage. Please review.

Project coverage is 57.62%. Comparing base (6a056b7) to head (c22fbbd).

Files	Patch %	Lines
sae_lens/training/sparse_autoencoder.py	25.00%	2 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #95      +/-   ##
==========================================
- Coverage   57.74%   57.62%   -0.13%     
==========================================
  Files          16       16              
  Lines        1394     1397       +3     
  Branches      227      228       +1     
==========================================
  Hits          805      805              
- Misses        543      545       +2     
- Partials       46       47       +1

hijohnnylin · 2024-04-20T20:06:29Z

Perhaps check that the state dict’s keys are a superset of a set of required keys?

I recently made a very similar change to sae_vis here: https://github.com/callummcdougall/sae_vis/pull/39/files
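A rough sketch of the superset check suggested here, written against plain mappings so it works on any state dict. The function name and the key names in `REQUIRED_KEYS` are hypothetical and are not taken from SAELens or sae_vis.

```python
from typing import Iterable, Mapping


def check_required_keys(
    state_dict: Mapping[str, object], required_keys: Iterable[str]
) -> None:
    """Raise if any required key is absent; extra keys are tolerated."""
    missing = sorted(set(required_keys) - set(state_dict.keys()))
    if missing:
        raise KeyError(f"state dict is missing required keys: {missing}")


# Hypothetical key names, for illustration only.
REQUIRED_KEYS = {"W_enc", "W_dec", "b_enc", "b_dec"}
```

Calling `check_required_keys(state_dict, REQUIRED_KEYS)` before a non-strict load would keep the permissive behaviour for optional parameters while still failing loudly when a core weight is absent.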

jbloomAus · 2024-04-21T11:09:41Z

Hmm. This results from the decoder fine-tuning changes, which added a new state_dict parameter to SAEs. This parameter doesn't affect the output if unused.

I think we should:

1. Explicitly check whether the extra weight is the scale parameter.
2. Deliberately allow strict=False if that's the only difference.
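The two steps above could be sketched as a small helper that decides which value to pass as `strict`. This is an illustration under assumptions, not the code that was merged: the helper name and the `scaling_factor` key are hypothetical.

```python
from typing import Iterable


def strict_flag_for(
    model_keys: Iterable[str],
    provided_keys: Iterable[str],
    optional_keys: Iterable[str],
) -> bool:
    """Return the value to pass as `strict`.

    The result is False only when the sole mismatch is that one or more
    optional keys are missing from the provided state dict. Any other
    mismatch keeps strict=True so that loading still raises.
    """
    model_set, provided_set = set(model_keys), set(provided_keys)
    missing = model_set - provided_set
    unexpected = provided_set - model_set
    relax = bool(missing) and not unexpected and missing <= set(optional_keys)
    return not relax


# Hypothetical name for the scale parameter.
OPTIONAL_KEYS = {"scaling_factor"}
```

A caller could then write something like `model.load_state_dict(sd, strict=strict_flag_for(model.state_dict().keys(), sd.keys(), OPTIONAL_KEYS))`, so only legacy checkpoints that lack the scale parameter are loaded permissively.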

Make load_state_dict use strict=False

fdf7fe9

jbloom-md added 2 commits April 21, 2024 12:34

fix load pretrained legacy with state dict change

b5e97f8

fix accidental bug

c22fbbd

jbloomAus merged commit 4a9e274 into main Apr 21, 2024
5 of 7 checks passed

jbloomAus deleted the load-state-dict-not-strict branch May 20, 2024 13:36

tom-pollak pushed a commit to tom-pollak/SAELens that referenced this pull request Oct 22, 2024

Merge pull request jbloomAus#95 from jbloomAus/load-state-dict-not-st…

2784e34

…rict Make load_state_dict use strict=False
