Hello! I noticed that the last layer of SigLIP is removed. Could you please explain why you decided to do this? It seems a bit unintuitive. Did experiments show that this improves quality? I assume it might be due to specific properties of the last layer's features, but I couldn't find any confirmation of this. In your paper, you mentioned comparing features before and after the last layer. Could you share the results?
LLaVA-NeXT/llava/model/multimodal_encoder/siglip_encoder.py
Line 570 in 79ef45a