Sample weights #120
Conversation
This looks great; I left a couple of comments. I think once we've settled on a design for the implicit factorization models, we can extend it to all the other models as well.
Thanks for your help, I really appreciate it!
spotlight/losses.py (Outdated)

```python
mask = mask.float()
loss = loss * mask
return loss.sum() / mask.sum()
if sample_weights is not None or mask is not None:
```
We don't need these checks here: we can always call the base loss function. I think that will make for less code repetition in all the loss functions.
totally
spotlight/losses.py (Outdated)

```diff
@@ -15,7 +15,42 @@
 from spotlight.torch_utils import assert_no_grad


-def pointwise_loss(positive_predictions, negative_predictions, mask=None):
+def base_loss(loss, sample_weights=None, mask=None):
```
Could we call this `_weighted_loss`? This would reflect the fact that:
- it is an internal function;
- its essence lies in applying weights.
I'm not opposed to this name, but it would be a misnomer if we're called with a mask and no weights. Although, conceptually at least, a mask could be thought of as a weight of zero.

`_base_loss` vs `_weighted_loss` vs `_modified_loss`? ¯\_(ツ)_/¯
If, as you suggest, we subsume masks under weights this will all be fine!
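The equivalence is easy to check numerically. A small sketch (the values are made up) showing that a 0/1 mask gives exactly the same mean as zero/one weights:

```python
loss = [0.5, 2.0, 1.5]
mask = [1, 1, 0]                      # exclude the last element

# Masked mean, as the losses compute it today.
masked_mean = sum(l * m for l, m in zip(loss, mask)) / sum(mask)

# The same thing expressed as weights: a mask entry is a weight of 0 or 1.
weights = [float(m) for m in mask]
weighted_mean = (sum(l * w for l, w in zip(loss, weights))
                 / sum(weights))

assert masked_mean == weighted_mean   # both equal 1.25
```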
spotlight/factorization/implicit.py (Outdated)

```python
raise ValueError('Degenerate epoch loss: {}'
                 .format(epoch_loss))

def fit_weighted(self, interactions, verbose=False):
```
I think we should keep a single `fit` method. If weights are present, we use them; if not, we don't.

This could probably be accomplished by changing the `minibatch` function to just infinitely yield `None` for those arguments that are `None` rather than tensors. This way, it always yields `None` for `batch_sample_weights`, which ties in nicely with how the loss functions are changed.
I definitely tried to keep it to a single fit function at first, but I want to avoid consuming RAM for an unnecessary tensor of Nones.

Wouldn't this lead to a tensor of Nones (rather than a single None) being passed to the loss functions, unless the tensor is never created by yielding Nones? We would also have to augment torch_utils.shuffle and add a few `if all(weights == None)` checks, which may slow it down a bit.

What if we just checked whether weights are specified at the beginning of fit, and if they are, have fit call _fit_weighted? That way we maintain the API of having the user always call fit, but still keep the code relatively simple.

Sequence weight question for you: I was thinking about the sequence interactions yesterday, and my guess is to make a parallel sequence tensor full of sample weights, with the same dimensions as the sequence tensor of item-ids. Does this sound right?
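For what it's worth, that layout can be sketched like this; the shapes, values, and the padding convention (item-id 0) are assumptions for illustration, not Spotlight's actual representation:

```python
# Item-id sequences, left-padded with 0; one row per user.
sequences = [
    [0, 0, 12, 7, 3],
    [0, 5,  5, 9, 1],
]

# A parallel tensor of per-interaction sample weights with the same
# dimensions, where padding positions simply carry zero weight.
sequence_weights = [
    [0.0, 0.0, 1.0, 0.5, 2.0],
    [0.0, 1.0, 1.0, 1.0, 1.0],
]

# The two structures line up element for element.
assert len(sequence_weights) == len(sequences)
assert all(len(w) == len(s) for w, s in zip(sequence_weights, sequences))
```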
No, we won't need a tensor of nones. I'll add a comment with a prototype of what I meant.
…ed_loss. rename fit_weighted to _fit_weighted and call it from fit to streamline the user-api. add cscope database files to gitignore.
I notice that the sequence losses already make pretty heavy use of the mask argument via the PADDING_INDEX. Is it correct for sample weights to override them in `_weighted_loss`, or is there something more sophisticated we could be doing?

This is a good point: we can definitely subsume masks under weights by just setting weights to zero where the mask would be false.
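One way to combine the two, sketched in plain Python (the `PADDING_IDX = 0` convention and the helper name are assumptions for illustration): derive a 0/1 weight from the padding mask and multiply it into any user-supplied weights, so neither overrides the other:

```python
PADDING_IDX = 0  # assumed padding value for item ids


def merge_mask_into_weights(item_ids, sample_weights=None):
    # A mask is a weight of zero: padding positions get weight 0,
    # real items get weight 1 (times any user-supplied weight).
    mask = [0.0 if i == PADDING_IDX else 1.0 for i in item_ids]
    if sample_weights is None:
        return mask
    return [w * m for w, m in zip(sample_weights, mask)]
```

A loss helper then only ever sees one weights argument, whether the caller supplied a mask, weights, or both.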
spotlight/factorization/implicit.py (Outdated)

```python
raise ValueError('Degenerate epoch loss: {}'
                 .format(epoch_loss))

def _fit_weighted(self, interactions, verbose=False):
```
I'm still not convinced about having a separate `_fit_weighted` function, even if it is internal. I think it introduces too much code duplication that will have to be kept in sync.

What about modifying `minibatch` to look roughly like this:

```python
def minibatch(*tensors, **kwargs):

    batch_size = kwargs.get('batch_size', 128)

    if len(tensors) == 1:
        tensor = tensors[0]
        for i in range(0, len(tensor), batch_size):
            yield tensor[i:i + batch_size]
    else:
        for i in range(0, len(tensors[0]), batch_size):
            yield tuple(x[i:i + batch_size] if x is not None else None
                        for x in tensors)
```

This way, it emits tensor slices if an argument is a tensor (as before), but also emits `None` in the tuple if an argument is `None`.
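A quick usage check of the proposed generator, restated here so the snippet runs standalone, with plain lists standing in for tensors:

```python
def minibatch(*tensors, **kwargs):
    # Prototype from the review: slice tensor-like arguments,
    # pass None arguments through unchanged.
    batch_size = kwargs.get('batch_size', 128)
    if len(tensors) == 1:
        tensor = tensors[0]
        for i in range(0, len(tensor), batch_size):
            yield tensor[i:i + batch_size]
    else:
        for i in range(0, len(tensors[0]), batch_size):
            yield tuple(x[i:i + batch_size] if x is not None else None
                        for x in tensors)


user_ids = list(range(10))
item_ids = list(range(10, 20))
weights = None  # no sample weights supplied

batches = list(minibatch(user_ids, item_ids, weights, batch_size=4))
# Every batch carries None in the weights slot, so a single fit loop can
# pass batch_sample_weights to the loss unconditionally.
```

Note the prototype assumes the first argument is never `None` (it takes the length from `tensors[0]`), which holds for user ids in `fit`.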
I like the idea of making masks simply zero weights. But I still think we should try to keep only one `fit` method :)
…ights and removing _fit_weighted.
Closing in favour of #122
Commit corresponds to the discussion at #118. Loss functions and implicit factorization are done; implicit sequence support is coming.