
Question on the parameter count presented in Table 1 #3

Closed · Nikolai10 opened this issue Apr 13, 2023 · 4 comments

@Nikolai10

Hello @jmliu206,

Thank you very much for sharing your interesting work.

Could you please explain in more detail how exactly you calculated the parameter counts given in Table 1? According to your paper, your small model should have 44.96M parameters, while I get about 76M when running your code. I have created a Colab notebook to reproduce this result:

https://colab.research.google.com/drive/1KdwoC1i-TYMtc3akyuX83exipynKEE4v?usp=sharing

I used the default setting with C=128, so probably I am just missing a detail here.
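For reference, the count in the notebook boils down to something like this (a minimal sketch; the `TCM` class name and import path are assumptions based on this repository, not verified):

```python
# Minimal parameter-count sketch (PyTorch). The `TCM` class name and its
# import path are assumptions based on this repository, not verified.
from models import TCM  # hypothetical import path

model = TCM(N=128)  # default setting used in the notebook

# Total number of trainable parameters, in millions.
num_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"{num_params / 1e6:.2f} M")  # ~76 M with N=128
```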

I was also a bit surprised by the reported number of model parameters for SwinT-ChARM. According to Zhu et al., it has a total of 32.6M parameters (their Table 3), whereas you report 60.55M.

It would be great if you could provide further insights here.

Thanks in advance,
Nikolai

@jmliu206 (Owner)

I apologize for the confusion. In the code, N corresponds to half of C in the paper: the C channels are split in two, with N channels fed into the Transformer branch and N channels fed into the CNN branch. Therefore, for the small model (C=128), N should be set to 64.
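To illustrate the split (a sketch of the idea only, not the actual repository code):

```python
# Sketch of the channel split described above: the C channels from the
# paper are divided into two halves of N channels each, one per branch.
import torch

C = 128      # paper-level channel count (small model)
N = C // 2   # code-level N: channels fed to each branch

x = torch.randn(1, C, 16, 16)
x_trans, x_cnn = torch.chunk(x, 2, dim=1)  # N channels to the Transformer
assert x_trans.shape[1] == N == x_cnn.shape[1]  # ... and N to the CNN
```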

Regarding SwinT-ChARM: since the official code was not available when we finished our work, we reproduced the method from the paper. The difference between our implementation and the open-source code is likely in the slice transform. Since the paper does not specify the output channels of the intermediate convolutional layers, we used the same convolutional layers as in our own method, which may account for the higher parameter count.

Sorry for any confusion caused. I will update the README to clarify both points.

@Nikolai10 (Author)

Thanks a lot for your help. Now I get the following values with the DeepSpeed profiler (get_model_profile()):

| N   | FLOPs    | MACs     | Params  |
|-----|----------|----------|---------|
| 64  | 441.38 G | 215.32 G | 45.18 M |
| 96  | 865.73 G | 425.09 G | 59.13 M |
| 128 | 1454.5 G | 717.08 G | 76.57 M |

The parameter counts now closely match the reported numbers. The FLOPs, however, are approximately twice as high as the reported numbers. Do you have any idea why? How are you profiling your model?

I also used an RTX 3090 GPU.
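For completeness, the numbers above came from a call along these lines (a sketch; the input resolution and the `TCM` import path are assumptions):

```python
# Sketch of the DeepSpeed profiling call; the 256x256 input resolution
# and the `TCM` import path are assumptions, not verified settings.
from deepspeed.profiling.flops_profiler import get_model_profile
from models import TCM  # hypothetical import path

model = TCM(N=64).cuda().eval()

flops, macs, params = get_model_profile(
    model=model,
    input_shape=(1, 3, 256, 256),  # (batch, channels, height, width)
    print_profile=False,
    as_string=True,
)
print(flops, macs, params)  # e.g. "441.38 G", "215.32 GMACs", "45.18 M"
```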

Thanks again!

@jmliu206 (Owner) commented Apr 14, 2023

For all methods in Table 1, we used flops-counter.pytorch to calculate complexity. Specifically, the FLOPs we report are actually MACs. In many CV papers FLOPs and MACs are used interchangeably, and some versions of the profiling packages also conflate the two. Since one multiply-accumulate counts as roughly two floating-point operations (one multiplication plus one addition), this would explain why your FLOP numbers are about twice ours; see the sketch after the links below.

I think the following references might be helpful:
open-mmlab/mmcv#785 (comment)
sovrasov/flops-counter.pytorch#16 (comment)
https://github.com/sovrasov/flops-counter.pytorch/blob/1ad0ed1999620c0170e5854dde39805d30d9b6aa/sample.py#L36
https://github.com/Lyken17/pytorch-OpCounter/tree/160004dd1535323d71763c93482d2a8f5f260301
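
A call to flops-counter.pytorch (the ptflops package) looks roughly like this (a sketch; the input resolution and the `TCM` import path are assumptions):

```python
# Sketch of a flops-counter.pytorch (ptflops) call. Note that the value
# it reports is a MAC count; a FLOP count is roughly twice as large.
from ptflops import get_model_complexity_info
from models import TCM  # hypothetical import path

model = TCM(N=64).eval()

macs, params = get_model_complexity_info(
    model,
    (3, 256, 256),  # input resolution is an assumption
    as_strings=True,
    print_per_layer_stat=False,
)
print(macs, params)  # the GMac figure is what Table 1 reports as "FLOPs"
```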

@Nikolai10 (Author)

Thank you very much :)
