
Why use 'torch.nn.Bilinear' in 'self.channelMixMLPs01' ? #9

Open
linhaojia13 opened this issue Aug 18, 2022 · 2 comments

Comments

@linhaojia13

What's the purpose of using 'torch.nn.Bilinear'?
The formulation of a bilinear transformation is y = x_1^T A x_2 + b, while that of a linear transformation is y = x A^T + b.
It seems that a bilinear layer just applies a slightly more sophisticated linear transformation than the linear layer?

@LifeBeyondExpectations
Owner

LifeBeyondExpectations commented Aug 18, 2022

We empirically found that nn.Bilinear(...) results in good performance.
I remember that the performance gap is quite large.
(ToDo) I will report the ablation study soon. ⚡

Nonetheless, in my personal opinion, this function can be seen as the 'simplest' version of a conditional MLP.

import torch.nn as nn

class BilinearFeedForward(nn.Module):
    def __init__(self, in_planes1, in_planes2, out_planes):
        super().__init__()
        self.bilinear = nn.Bilinear(in_planes1, in_planes2, out_planes)

    def forward(self, x):
        x = x.contiguous()
        x = self.bilinear(x, x)
        return x

As you can see in the code above, the bilinear function takes the same input x for both of its arguments (self.bilinear(x, x)).
Accordingly, we can say that

y = Bilinear(x, x)
  = (x^T W) x
  = W' x
  = Linear(x | W')

where W' = x^T W is conditioned on the input x.
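The identity above can be checked numerically. Here is a minimal NumPy sketch (torch-free, so the shapes and names are assumptions): it evaluates the bilinear form y_k = x^T W_k x + b directly, then again as a "conditional" linear layer whose weight W' = x^T W depends on the input, and confirms the two agree.

```python
import numpy as np

rng = np.random.default_rng(0)
in_planes, out_planes = 4, 3

# W has shape (out, in1, in2) and b has shape (out,),
# mirroring the parameter shapes of nn.Bilinear(in1, in2, out).
W = rng.standard_normal((out_planes, in_planes, in_planes))
b = rng.standard_normal(out_planes)
x = rng.standard_normal(in_planes)

# Direct bilinear evaluation: y_k = x^T W_k x + b_k
y_bilinear = np.einsum('i,kij,j->k', x, W, x) + b

# Conditional-linear view: first form W' = x^T W (shape (out, in2)),
# then apply it as an ordinary linear layer y = W' x + b.
W_prime = np.einsum('i,kij->kj', x, W)
y_linear = W_prime @ x + b

# The two computations coincide, so Bilinear(x, x) = Linear(x | W'(x)).
assert np.allclose(y_bilinear, y_linear)
```

This is only a sanity check of the algebra; PyTorch's nn.Bilinear computes the same form on batched tensors.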

In short, I do not think there is much deeper 'purpose' behind this function.
However, nn.Bilinear(...) can be viewed as another type of MLP, as its name implies.

I hope you are satisfied with my understanding.

@linhaojia13
Author

Thank you for your detailed explanation. I think it is reasonable to regard the bilinear layer as the simplest implementation of a conditional MLP.
I'm looking forward to seeing the results of the ablation study. Thank you!
