ValueError: Cannot load ./ckpt/img2mvimg because decoder.conv_in.bias expected shape tensor(..., device='meta', size=(64,)), but got torch.Size([512]). #91

Open
sherwoodwoo opened this issue Aug 20, 2024 · 0 comments

(py311) G:\Unique3D>python app/gradio_local.py --port 7860
Warning! extra parameter in cli is not verified, may cause erros.
The config attributes {'condition_offset': True, 'feature_extractor': ['transformers', 'CLIPImageProcessor'], 'image_encoder': ['transformers', 'CLIPVisionModelWithProjection'], 'requires_safety_checker': True, 'safety_checker': [None, None], 'scheduler': ['diffusers', 'DDIMScheduler'], 'unet': ['diffusers', 'UNet2DConditionModel'], 'vae': ['diffusers', 'AutoencoderKL']} were passed to UnifieldWrappedUNet, but are not expected and will be ignored. Please verify your config.json configuration file.
Some weights of UnifieldWrappedUNet were not initialized from the model checkpoint at ./ckpt/img2mvimg and are newly initialized because the shapes did not match:

  • conv_in.weight: found shape torch.Size([320, 8, 3, 3]) in the checkpoint and torch.Size([320, 4, 3, 3]) in the model instantiated
  • down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight: found shape torch.Size([320, 768]) in the checkpoint and torch.Size([320, 1280]) in the model instantiated
  • down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_v.weight: found shape torch.Size([320, 768]) in the checkpoint and torch.Size([320, 1280]) in the model instantiated
  • down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_k.weight: found shape torch.Size([320, 768]) in the checkpoint and torch.Size([320, 1280]) in the model instantiated
  • down_blocks.0.attentions.1.transformer_blocks.0.attn2.to_v.weight: found shape torch.Size([320, 768]) in the checkpoint and torch.Size([320, 1280]) in the model instantiated
  • down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight: found shape torch.Size([640, 768]) in the checkpoint and torch.Size([640, 1280]) in the model instantiated
  • down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.weight: found shape torch.Size([640, 768]) in the checkpoint and torch.Size([640, 1280]) in the model instantiated
  • down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.weight: found shape torch.Size([640, 768]) in the checkpoint and torch.Size([640, 1280]) in the model instantiated
  • down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.weight: found shape torch.Size([640, 768]) in the checkpoint and torch.Size([640, 1280]) in the model instantiated
  • down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.weight: found shape torch.Size([1280, 768]) in the checkpoint and torch.Size([1280, 1280]) in the model instantiated
  • down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.weight: found shape torch.Size([1280, 768]) in the checkpoint and torch.Size([1280, 1280]) in the model instantiated
  • down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_k.weight: found shape torch.Size([1280, 768]) in the checkpoint and torch.Size([1280, 1280]) in the model instantiated
  • down_blocks.2.attentions.1.transformer_blocks.0.attn2.to_v.weight: found shape torch.Size([1280, 768]) in the checkpoint and torch.Size([1280, 1280]) in the model instantiated
  • mid_block.attentions.0.transformer_blocks.0.attn2.to_k.weight: found shape torch.Size([1280, 768]) in the checkpoint and torch.Size([1280, 1280]) in the model instantiated
  • mid_block.attentions.0.transformer_blocks.0.attn2.to_v.weight: found shape torch.Size([1280, 768]) in the checkpoint and torch.Size([1280, 1280]) in the model instantiated
  • up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.weight: found shape torch.Size([1280, 768]) in the checkpoint and torch.Size([1280, 1280]) in the model instantiated
  • up_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.weight: found shape torch.Size([1280, 768]) in the checkpoint and torch.Size([1280, 1280]) in the model instantiated
  • up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.weight: found shape torch.Size([1280, 768]) in the checkpoint and torch.Size([1280, 1280]) in the model instantiated
  • up_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.weight: found shape torch.Size([1280, 768]) in the checkpoint and torch.Size([1280, 1280]) in the model instantiated
  • up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_k.weight: found shape torch.Size([1280, 768]) in the checkpoint and torch.Size([1280, 1280]) in the model instantiated
  • up_blocks.1.attentions.2.transformer_blocks.0.attn2.to_v.weight: found shape torch.Size([1280, 768]) in the checkpoint and torch.Size([1280, 1280]) in the model instantiated
  • up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.weight: found shape torch.Size([640, 768]) in the checkpoint and torch.Size([640, 1280]) in the model instantiated
  • up_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.weight: found shape torch.Size([640, 768]) in the checkpoint and torch.Size([640, 1280]) in the model instantiated
  • up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_k.weight: found shape torch.Size([640, 768]) in the checkpoint and torch.Size([640, 1280]) in the model instantiated
  • up_blocks.2.attentions.1.transformer_blocks.0.attn2.to_v.weight: found shape torch.Size([640, 768]) in the checkpoint and torch.Size([640, 1280]) in the model instantiated
  • up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_k.weight: found shape torch.Size([640, 768]) in the checkpoint and torch.Size([640, 1280]) in the model instantiated
  • up_blocks.2.attentions.2.transformer_blocks.0.attn2.to_v.weight: found shape torch.Size([640, 768]) in the checkpoint and torch.Size([640, 1280]) in the model instantiated
  • up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_k.weight: found shape torch.Size([320, 768]) in the checkpoint and torch.Size([320, 1280]) in the model instantiated
  • up_blocks.3.attentions.0.transformer_blocks.0.attn2.to_v.weight: found shape torch.Size([320, 768]) in the checkpoint and torch.Size([320, 1280]) in the model instantiated
  • up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_k.weight: found shape torch.Size([320, 768]) in the checkpoint and torch.Size([320, 1280]) in the model instantiated
  • up_blocks.3.attentions.1.transformer_blocks.0.attn2.to_v.weight: found shape torch.Size([320, 768]) in the checkpoint and torch.Size([320, 1280]) in the model instantiated
  • up_blocks.3.attentions.2.transformer_blocks.0.attn2.to_k.weight: found shape torch.Size([320, 768]) in the checkpoint and torch.Size([320, 1280]) in the model instantiated
  • up_blocks.3.attentions.2.transformer_blocks.0.attn2.to_v.weight: found shape torch.Size([320, 768]) in the checkpoint and torch.Size([320, 1280]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
The config attributes {'condition_offset': True, 'feature_extractor': ['transformers', 'CLIPImageProcessor'], 'image_encoder': ['transformers', 'CLIPVisionModelWithProjection'], 'requires_safety_checker': True, 'safety_checker': [None, None], 'scheduler': ['diffusers', 'DDIMScheduler'], 'unet': ['diffusers', 'UNet2DConditionModel'], 'vae': ['diffusers', 'AutoencoderKL']} were passed to AutoencoderKL, but are not expected and will be ignored. Please verify your config.json configuration file.
Traceback (most recent call last):
  File "G:\Unique3D\app\gradio_local.py", line 20, in <module>
    from app.gradio_3dgen import create_ui as create_3d_ui
  File "G:\Unique3D\app\gradio_3dgen.py", line 6, in <module>
    from app.custom_models.mvimg_prediction import run_mvprediction
  File "G:\Unique3D\app\custom_models\mvimg_prediction.py", line 14, in <module>
    trainer, pipeline = load_pipeline(training_config, checkpoint_path)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\Unique3D\app\custom_models\utils.py", line 59, in load_pipeline
    shared_modules = trainer.init_shared_modules(shared_modules)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\Unique3D\custum_3d_diffusion\trainings\image2mvimage_trainer.py", line 55, in init_shared_modules
    vae = AutoencoderKL.from_pretrained(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\anaconda3\envs\py311\Lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "E:\anaconda3\envs\py311\Lib\site-packages\diffusers\models\modeling_utils.py", line 667, in from_pretrained
    unexpected_keys = load_model_dict_into_meta(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\anaconda3\envs\py311\Lib\site-packages\diffusers\models\modeling_utils.py", line 152, in load_model_dict_into_meta
    raise ValueError(
ValueError: Cannot load ./ckpt/img2mvimg because decoder.conv_in.bias expected shape tensor(..., device='meta', size=(64,)), but got torch.Size([512]). If you want to instead overwrite randomly initialized weights, please make sure to pass both low_cpu_mem_usage=False and ignore_mismatched_sizes=True. For more information, see also: Adding additional input channels to model after intialization / Converting text2img to mask inpainting to resume training huggingface/diffusers#1619 (comment) as an example.
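For reference, the two flags named at the end of the ValueError can be passed directly to AutoencoderKL.from_pretrained. Below is a minimal sketch of that workaround, not the project's actual fix: the checkpoint path is taken from the error message, while subfolder="vae" is an assumption about the checkpoint layout, since the call's real arguments in image2mvimage_trainer.py are cut off in the traceback.

```python
# Hedged sketch of the workaround suggested by the ValueError itself.
# The path and subfolder are assumptions and should match whatever
# init_shared_modules in image2mvimage_trainer.py already passes.
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained(
    "./ckpt/img2mvimg",
    subfolder="vae",               # assumption: standard diffusers pipeline layout
    low_cpu_mem_usage=False,       # load real tensors instead of meta tensors
    ignore_mismatched_sizes=True,  # skip mismatched weights, leaving them newly initialized
)
```

Note that this only silences the load error: mismatched parameters stay randomly initialized, exactly as the earlier UnifieldWrappedUNet warning describes. The expected size of 64 matches AutoencoderKL's default configuration, which suggests the VAE's own config.json under ./ckpt/img2mvimg may not be found or read correctly (consistent with the pipeline-level config attributes being passed to AutoencoderKL), so checking the checkpoint download and directory layout may be the more durable fix.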