Experiments around deleting the last layer of SigLip encoder. #335

Natyren · 2024-11-07T14:27:41Z

LLaVA-NeXT/llava/model/multimodal_encoder/siglip_encoder.py

Line 570 in 79ef45a

del self.vision_tower.vision_model.encoder.layers[-1:]

Hello! I noticed that the last layer of the SigLip encoder is removed here. Could you please explain why you decided to do this? It seems a bit unintuitive. Did experiments show that this improves quality? I assume it might be due to specific properties of the last layer's features, but I couldn't find any confirmation of this. In your paper, you mentioned comparing features before and after the last layer; could you share the results?
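
To make sure I am reading the line correctly: as far as I understand, the slice deletion drops only the final encoder block, so the encoder's output becomes what used to be the penultimate hidden state. Here is a minimal sketch of that reading, with plain Python functions standing in for the layers. `make_layer` and `run_encoder` are hypothetical names for illustration, not part of LLaVA-NeXT or SigLip, and the real model may apply further steps after the encoder that this toy ignores.

```python
# Toy stand-in for an encoder: each "layer" is a plain function, not a real SigLip block.
def make_layer(k):
    # Hypothetical layer: adds k to every element so each layer's effect is traceable.
    return lambda xs: [x + k for x in xs]


def run_encoder(layers, xs):
    """Apply the layers in order and return the hidden state after each one."""
    hidden_states = [xs]
    for layer in layers:
        xs = layer(xs)
        hidden_states.append(xs)
    return hidden_states


layers = [make_layer(k) for k in (1, 10, 100)]
inputs = [0, 1]

# Hidden states of the full, untouched stack.
full = run_encoder(layers, inputs)

# Copy the stack, then apply the same slice deletion as in the quoted line.
truncated_layers = list(layers)
del truncated_layers[-1:]
truncated = run_encoder(truncated_layers, inputs)

# The truncated stack's final output equals the full stack's penultimate hidden state.
same_as_penultimate = truncated[-1] == full[-2]
print(same_as_penultimate)
```

If that reading is right, the question is really why the penultimate features work better than the final ones as input to the language model.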
