Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DotLayer reduce over dynamic axis should respect the seq mask #629

Closed
albertz opened this issue Sep 3, 2021 · 3 comments
Closed

DotLayer reduce over dynamic axis should respect the seq mask #629

albertz opened this issue Sep 3, 2021 · 3 comments

Comments

@albertz
Copy link
Member

albertz commented Sep 3, 2021

DotLayer should respect the dyn size (or seq mask) when you reduce dynamic axes, just like ReduceLayer etc would do over dynamic axes. The DotLayer currently completely ignores this. This is fine when the user explicitly performs masking beforehand. E.g. SoftmaxOverSpatialLayer will do the right thing. However, otherwise it will be wrong in general. This is esp also a problem for the concept of dynamic axes with extended dynamic sizes.

Implementing this for DotLayer is not so nice, though, because this additional masking is not needed for the common case where there was a SoftmaxOverSpatialLayer before. The additional masking is not wrong, but it would make it a bit slower. So we definitely want to avoid this. But how can we know when this can be skipped? It could explicitly check whether one of the inputs comes from SoftmaxOverSpatialLayer but this would be very ugly, hacky, and also fail in some cases (e.g. by any automatic internal wrapping layers such as SelectSearchSourcesLayer). But is there a better more generic way? Somehow some way that the input can say "I'm already masked, padded values are 0, no need to mask again". Other layers like ReduceLayer could also use this information. The padded values are also relevant. For DotLayer we want 0. For ReduceLayer with sum or avg we want 0. For ReduceLayer with max or also SoftmaxOverSpatialLayer we want -inf. But then, I'm not sure if we are maybe making it too complicated in end.

Originally posted by @albertz in #391 (comment)

This was referenced Sep 3, 2021
@Zettelkasten

This comment has been minimized.

@albertz

This comment has been minimized.

@Zettelkasten

This comment has been minimized.

albertz added a commit that referenced this issue Sep 7, 2021
albertz added a commit that referenced this issue Sep 7, 2021
@albertz albertz closed this as completed in e0367d2 Sep 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants