
CLIPVisionTower unable to obtain model #31

Open
chuznhiwu opened this issue Apr 2, 2024 · 0 comments
Hello! I use finetune_lora.sh and set the following:
--model_name_or_path liuhaotian/llava-v1.5-7b \
--vision_tower openai/clip-vit-large-patch14-336 \

and got this error:

File "/home/wucz/remote-sensing/GeoChat/geochat/model/multimodal_encoder/clip_encoder.py", line 97, in init
self.clip_interpolate_embeddings(image_size=504, patch_size=14)
File "/home/wucz/remote-sensing/GeoChat/geochat/model/multimodal_encoder/clip_encoder.py", line 34, in clip_interpolate_embeddings
n, seq_length, hidden_dim = pos_embedding.shape
ValueError: not enough values to unpack (expected 3, got 2)

    pos_embedding = state_dict['weight']
    print(pos_embedding.shape)  # torch.Size([0])
    pos_embedding = pos_embedding.unsqueeze(0)
    print(pos_embedding.shape)  # torch.Size([1, 0])
    n, seq_length, hidden_dim = pos_embedding.shape
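
For comparison, here is a minimal check I would expect to work outside the GeoChat code (a sketch, assuming the openai/clip-vit-large-patch14-336 checkpoint downloads correctly), showing the shape the position embedding should have when the vision tower weights are actually loaded:

    # Sketch: inspect the position embedding of the stock CLIP vision model.
    # Not GeoChat code; just illustrates the expected (seq_length, hidden_dim).
    from transformers import CLIPVisionModel

    model = CLIPVisionModel.from_pretrained("openai/clip-vit-large-patch14-336")
    pos_embedding = model.vision_model.embeddings.position_embedding.weight
    print(pos_embedding.shape)               # torch.Size([577, 1024]) = 1 CLS + 24*24 patches, hidden 1024
    print(pos_embedding.unsqueeze(0).shape)  # torch.Size([1, 577, 1024]) -> unpacks into (n, seq_length, hidden_dim)

In my run the tensor is empty (torch.Size([0])), so it looks like the vision tower weights were never actually loaded before clip_interpolate_embeddings is called.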

Which setting did I get wrong that prevents the model from being loaded?
