
loss computation: mean and not sum #135

Open
CassNot opened this issue Aug 3, 2023 · 4 comments
Comments

CassNot commented Aug 3, 2023

Dear authors,

Thank you for your code!

We had a question about the loss implementation. We noticed that for each minibatch the loss is averaged rather than summed, as in the paper (https://arxiv.org/pdf/2004.11362.pdf, Equation 2):

loss = loss.view(anchor_count, batch_size).mean()

We were wondering if there was a reason for this choice.

Thank you

@HobbitLong (Owner)
Good catch! I think Eq. 2 in the paper omits the 1/(2N) normalization factor.
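To make the relationship concrete, here is a minimal sketch (with hypothetical per-anchor loss values, not the repository's actual tensors) showing that the implementation's mean is just the paper's sum rescaled by the constant 1/(2N):

```python
# Hedged sketch: suppose N = 4 samples, each with 2 augmented views,
# giving 2N = 8 anchors. The per-anchor loss values below are made up.
N = 4
per_anchor_loss = [0.5, 1.0, 0.25, 0.75, 2.0, 1.5, 0.1, 0.9]  # length 2N

paper_sum = sum(per_anchor_loss)                         # Eq. 2 as written (no prefactor)
code_mean = sum(per_anchor_loss) / len(per_anchor_loss)  # what .mean() computes

# The mean is the sum divided by the constant 2N, so the two losses
# differ only by a constant scale, which folds into the learning rate.
assert abs(code_mean - paper_sum / (2 * N)) < 1e-12
print(code_mean)  # 0.875
```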

@dave4422
Hi,

I've been reviewing the implementation, and I noticed the line loss = loss.view(anchor_count, batch_size).mean(). Given the computations, it seems that the result would be equivalent to simply using loss.mean(). Could you kindly explain the rationale behind the reshaping here?
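For what it's worth, a quick sketch (pure Python standing in for the tensors, with made-up values) confirms the equivalence: reshaping to (anchor_count, batch_size) before averaging yields the same number as averaging the flat tensor, since the mean reduces over all elements regardless of shape.

```python
# Hypothetical flat per-anchor losses for anchor_count = 2, batch_size = 4.
anchor_count, batch_size = 2, 4
loss = [0.5, 1.0, 0.25, 0.75, 2.0, 1.5, 0.1, 0.9]

# Mimic loss.view(anchor_count, batch_size): reshape into a 2x4 grid...
grid = [loss[i * batch_size:(i + 1) * batch_size] for i in range(anchor_count)]
reshaped_mean = sum(sum(row) for row in grid) / (anchor_count * batch_size)

# ...and compare with loss.mean() on the flat tensor.
flat_mean = sum(loss) / len(loss)
assert abs(reshaped_mean - flat_mean) < 1e-12  # identical: mean ignores shape
```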

@dave4422
I assume it's just for readability?

@HobbitLong (Owner)

Yeah, it's just there to make the shape explicit (it may help readers understand what's going on).
