Struggling to Train Downstream Classifier #58
Facing a similar challenge.
Same issue.
I have been able to fine-tune, but I'm seeing some weird behavior during training, so I'm still investigating; the model does train despite the variance in performance. Accuracy drops suddenly between epochs 13 and 14, then starts to recover, but finishes at 61% top-1. I'm fine-tuning the whole model (ImageNet-22k pretrained weights) on the PlantNet-300K dataset with 8 GPUs, batch size 64, and 4 gradient-accumulation iterations. I'm also using the original warmup scheduler with start lr = 1e-4, lr = 7.5e-4, and final lr = 1e-6.
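For reference, the schedule behaves roughly like the sketch below (linear warmup into cosine decay with the values above; the helper and its argument names are illustrative only, not the repo's scheduler API):

```python
import math

# Rough sketch of the schedule described above: linear warmup from start_lr to
# peak_lr, then cosine decay down to final_lr. Illustrative only, not the
# repository's actual scheduler class or signature.
def lr_at_step(step, warmup_steps, total_steps,
               start_lr=1e-4, peak_lr=7.5e-4, final_lr=1e-6):
    if step < warmup_steps:
        # linear warmup
        return start_lr + (peak_lr - start_lr) * step / max(1, warmup_steps)
    # cosine decay after warmup
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return final_lr + 0.5 * (peak_lr - final_lr) * (1.0 + math.cos(math.pi * progress))
```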
Anyone able to find a solution?
Other than this weird behavior of accuracy dropping out of nowhere (which I suspect is related either to the size of the dataset or to the learning rate), I didn't have problems with fine-tuning. I have also compared fine-tuning and linear probing on the Intel dataset (https://www.kaggle.com/datasets/puneet6060/intel-image-classification) and didn't have any problems with it.
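For the linear-probing runs I simply froze the encoder and trained only the head; a minimal sketch (all names and values are placeholders, not the repo's API):

```python
import torch
import torch.nn as nn

# Minimal linear-probe setup: freeze every encoder parameter and optimize only
# the classification head. `encoder`, `embed_dim`, and `num_classes` are
# placeholders; the Identity module just stands in for the pretrained backbone.
embed_dim, num_classes = 1280, 6   # e.g. ViT-H embed dim, 6 Intel classes

encoder = nn.Identity()
for p in encoder.parameters():
    p.requires_grad = False        # freeze the backbone
encoder.eval()

head = nn.Linear(embed_dim, num_classes)
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3, weight_decay=0.0)
```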
Hi,
I'm working on training a downstream classification task from the ImageNet-22k checkpoint. When I use a TinyViT checkpoint, average over the first dimension of the output, and feed that into a linear classification head, the model trains appropriately. However, when I replace TinyViT with the target encoder of I-JEPA, once again averaging over the first dimension of the final layer and feeding the result into a linear classification head, the model fails to train at all. Has anyone been able to successfully train on a downstream task?
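Concretely, the setup looks roughly like the sketch below (the dummy module stands in for the loaded I-JEPA target encoder; shapes and the class count are placeholders):

```python
import torch
import torch.nn as nn

# Sketch of the probing setup described above. DummyEncoder stands in for the
# I-JEPA target encoder loaded from the ImageNet-22k checkpoint and returns
# patch tokens of shape (batch, num_tokens, embed_dim).
class DummyEncoder(nn.Module):
    def forward(self, x):
        return torch.randn(x.shape[0], 256, 1280)

target_encoder = DummyEncoder()
head = nn.Linear(1280, 1000)        # embed_dim -> num_classes (placeholder)

images = torch.randn(4, 3, 224, 224)
tokens = target_encoder(images)     # (B, N, D) patch tokens
features = tokens.mean(dim=1)       # average over the token dimension
logits = head(features)             # (B, num_classes)
```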
Thank you!