
CLIPVisionTower unable to obtain model #31

Open
chuznhiwu opened this issue Apr 2, 2024 · 0 comments
Hello! I use finetune_lora.sh and set the following:
--model_name_or_path liuhaotian/llava-v1.5-7b \
--vision_tower openai/clip-vit-large-patch14-336 \

and got this error:

File "/home/wucz/remote-sensing/GeoChat/geochat/model/multimodal_encoder/clip_encoder.py", line 97, in init
self.clip_interpolate_embeddings(image_size=504, patch_size=14)
File "/home/wucz/remote-sensing/GeoChat/geochat/model/multimodal_encoder/clip_encoder.py", line 34, in clip_interpolate_embeddings
n, seq_length, hidden_dim = pos_embedding.shape
ValueError: not enough values to unpack (expected 3, got 2)

    pos_embedding = state_dict['weight']
    print(pos_embedding.shape)  # torch.Size([0])
    pos_embedding = pos_embedding.unsqueeze(0)
    print(pos_embedding.shape)  # torch.Size([1, 0])
    n, seq_length, hidden_dim = pos_embedding.shape
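
For comparison, here is a minimal check I would expect to work outside the GeoChat code (a sketch, assuming the openai/clip-vit-large-patch14-336 checkpoint downloads correctly), showing the shape the position embedding should have when the vision tower weights are actually loaded:

    # Sketch: inspect the position embedding of the stock CLIP vision model.
    # Not GeoChat code; just illustrates the expected (seq_length, hidden_dim).
    from transformers import CLIPVisionModel

    model = CLIPVisionModel.from_pretrained("openai/clip-vit-large-patch14-336")
    pos_embedding = model.vision_model.embeddings.position_embedding.weight
    print(pos_embedding.shape)               # torch.Size([577, 1024]) = 1 CLS + 24*24 patches, hidden 1024
    print(pos_embedding.unsqueeze(0).shape)  # torch.Size([1, 577, 1024]) -> unpacks into (n, seq_length, hidden_dim)

In my run the tensor is empty (torch.Size([0])), so it looks like the vision tower weights were never actually loaded before clip_interpolate_embeddings is called.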

Which setting did I get wrong that prevents the model from being loaded?
