
Switching the backbone to Swin-L degrades performance #146

Open
Blissy-32 opened this issue Jun 19, 2024 · 3 comments

Comments

@Blissy-32

I'm training on a public dataset. With a ResNet-50 backbone I reach 0.415 after 12 epochs, but after switching to Swin-L (batch size set to 1, multi-scale sizes and everything else unchanged) the best I get is 0.39, and that is after enlarging the learning rate; with a linearly scaled learning rate it's only 0.34. Why is that?

@TempleX98
Collaborator

A batch size of 1 is too small; even with a linearly scaled learning rate you will lose some accuracy.

@Whitefish-by

My GPUs are two L40s. For training co_deformable_detr_swin_large_1x_coco.py, what values of samples_per_gpu and base_batch_size are appropriate?
When I first started training I left the learning rate unchanged (2e-4) and left those two parameters unchanged (2, 16). However, after the first epoch every evaluation metric was 0 (I ran into this before with co_deformable_detr_r50_1x_coco.py, where lowering the learning rate fixed it).
Could you suggest suitable values for the learning rate, samples_per_gpu, and base_batch_size? (The defaults are below.)

auto_scale_lr = dict(enable=False, base_batch_size=16)
optimizer = dict(
    type='AdamW',
    lr=2e-4,
    weight_decay=1e-4,
    paramwise_cfg=dict(
        custom_keys={
            'backbone': dict(lr_mult=0.1),
            'sampling_offsets': dict(lr_mult=0.1),
            'reference_points': dict(lr_mult=0.1)
        }))

@TempleX98 Thank you for the advice!

@TempleX98
Collaborator

Sorry for the late reply. You can use 2 GPUs with samples_per_gpu=8 on each, keeping the learning rate at 2e-4. If you run out of GPU memory, drop samples_per_gpu to 4 and halve the learning rate to 1e-4, and so on.
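The advice above follows the linear learning-rate scaling rule: keep the ratio of learning rate to total batch size constant relative to the base config (lr=2e-4 at base_batch_size=16 in the thread). A minimal sketch of that arithmetic (the helper function is illustrative, not part of the repository):

```python
def scaled_lr(base_lr, base_batch_size, num_gpus, samples_per_gpu):
    """Linear LR scaling: learning rate grows/shrinks with total batch size."""
    total_batch_size = num_gpus * samples_per_gpu
    return base_lr * total_batch_size / base_batch_size

# Base config from the thread: lr=2e-4 at base_batch_size=16.
print(scaled_lr(2e-4, 16, 2, 8))  # 2 GPUs x 8 = 16 samples -> lr stays 2e-4
print(scaled_lr(2e-4, 16, 2, 4))  # 2 GPUs x 4 = 8 samples  -> lr halves to 1e-4
```

This also explains the collaborator's earlier point: at batch size 1 the scaled learning rate becomes 16x smaller than the base setting, and accuracy typically still drops at such small batches even with correct scaling.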
