Proposal
Now that we support JumpReLU training, the `l1_coefficient` parameter is confusing, since JumpReLU uses an L0 loss, not an L1 loss, for training. We should rename this parameter to `sparsity_coefficient`, since it is a coefficient used to promote sparsity in general. We should also rename `l1_warmup_steps` to `sparsity_warmup_steps`.
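For illustration, here is a minimal sketch of what the rename might look like in a dataclass-style config. The class name `TrainingConfig` and the defaults are hypothetical, not SAELens's actual API; only `l1_coefficient` / `l1_warmup_steps` and their proposed replacements come from the proposal.

```python
# Hypothetical sketch of the proposed rename; TrainingConfig and its
# defaults are illustrative, not SAELens's actual config class.
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    # Renamed from l1_coefficient: scales whichever sparsity penalty the
    # architecture uses (L1 for standard SAEs, L0 for JumpReLU).
    sparsity_coefficient: float = 1.0
    # Renamed from l1_warmup_steps: number of steps over which the
    # sparsity coefficient is warmed up from zero.
    sparsity_warmup_steps: int = 0


# Same meaning for either architecture, without implying an L1 loss:
cfg = TrainingConfig(sparsity_coefficient=5.0, sparsity_warmup_steps=1000)
print(cfg)
```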
Motivation
It is confusing to see `l1_coefficient` used for JumpReLU training, which doesn't use an L1 loss.
Alternatives
Alternatively, we could add separate `l0_coefficient` / `l0_warmup_steps` parameters that are only used for JumpReLU training, and error if `l1_coefficient` is provided. This would also potentially allow training a JumpReLU SAE with both L0 and L1 losses if desired.
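A minimal sketch of this alternative, assuming a dataclass-style config; the class name, the `architecture` field, and the validation hook are illustrative assumptions, not SAELens's actual API.

```python
# Hypothetical sketch: separate L0 parameters plus a validation error
# when l1_coefficient is set for JumpReLU. Names are illustrative only.
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrainingConfig:
    architecture: str = "standard"  # assumed values: "standard", "jumprelu"
    l1_coefficient: Optional[float] = None
    l1_warmup_steps: int = 0
    l0_coefficient: Optional[float] = None  # only meaningful for JumpReLU
    l0_warmup_steps: int = 0

    def __post_init__(self) -> None:
        # Strict variant: reject l1_coefficient for JumpReLU. Relaxing this
        # check is what would allow combining L0 and L1 losses if desired.
        if self.architecture == "jumprelu" and self.l1_coefficient is not None:
            raise ValueError(
                "l1_coefficient is not used for JumpReLU training; "
                "set l0_coefficient instead."
            )


# A standard SAE with an L1 penalty works as before:
cfg = TrainingConfig(architecture="standard", l1_coefficient=5.0)

# A JumpReLU config with l1_coefficient would raise:
# TrainingConfig(architecture="jumprelu", l1_coefficient=5.0)  # ValueError
```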
Checklist
I have checked that there is no similar issue in the repo (required)
Hi @chanind!
I've created PR #376 that implements one of the alternatives you suggested: adding a separate `l0_lambda` parameter specifically for JumpReLU training.
The PR:
- Adds a dedicated `l0_lambda` parameter (default: 0.0)
- Requires explicit `l0_lambda` specification when using JumpReLU
- Maintains a clear separation between the L1 and L0 regularization terms
Would love to get your thoughts on this implementation!