fix: hotfix scale decoder norm is not passed to training sae #377
Description
In #365, we fixed a bug where the scale_sparsity_penalty_by_decoder_norm option was being ignored and the SAE was always scaling the sparsity penalty by the decoder norm regardless. However, that fix revealed a second bug: we are not passing scale_sparsity_penalty_by_decoder_norm through to the training SAE at all. Bugs like this are easy to introduce because we create the TrainingSAEConfig from the runner config by building a dictionary without any type checking. This PR adds a test that scale_sparsity_penalty_by_decoder_norm is now passed through correctly, so we can get the fix out ASAP; I'll make a follow-up PR with a more robust fix (better tests, type checking, or something similar) after this is merged.
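To illustrate the failure mode, here is a minimal, self-contained sketch of the untyped config hand-off and the kind of regression test this PR adds. The class names echo the repo, but the field sets and conversion helpers below are simplified assumptions, not the actual sae_lens code:

```python
from dataclasses import dataclass


# Simplified stand-ins for the real configs (field sets are assumptions).
@dataclass
class RunnerConfig:
    l1_coefficient: float = 1e-3
    scale_sparsity_penalty_by_decoder_norm: bool = True


@dataclass
class TrainingSAEConfig:
    l1_coefficient: float = 1e-3
    scale_sparsity_penalty_by_decoder_norm: bool = False


def to_training_sae_config_buggy(cfg: RunnerConfig) -> TrainingSAEConfig:
    # Hand-built kwargs: any field omitted here is silently dropped, so the
    # TrainingSAEConfig default wins -- nothing type-checks the hand-off.
    return TrainingSAEConfig(l1_coefficient=cfg.l1_coefficient)


def to_training_sae_config_fixed(cfg: RunnerConfig) -> TrainingSAEConfig:
    # The fix: forward the flag explicitly.
    return TrainingSAEConfig(
        l1_coefficient=cfg.l1_coefficient,
        scale_sparsity_penalty_by_decoder_norm=cfg.scale_sparsity_penalty_by_decoder_norm,
    )


# Regression test in the spirit of the one this PR adds: the flag set on the
# runner config must survive the conversion to the training SAE config.
runner_cfg = RunnerConfig(scale_sparsity_penalty_by_decoder_norm=True)
assert not to_training_sae_config_buggy(runner_cfg).scale_sparsity_penalty_by_decoder_norm
assert to_training_sae_config_fixed(runner_cfg).scale_sparsity_penalty_by_decoder_norm
```

A dict-based hand-off like this silently swallows missing keys, which is why a direct pass-through test (or stricter typing on the conversion) catches the bug.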
Checklist:
You have tested formatting, typing and unit tests (acceptance tests not currently in use): run make check-ci to check format and linting. (You can run make format to format code if needed.)