Input type and weight type error in scene graph code #16

Open
Yassin-fan opened this issue May 4, 2023 · 5 comments

@Yassin-fan

Hi, I have installed the code with Python 3.8, PyTorch 1.8.0, and CUDA 11, and the Debug Dataloader and Debug Model parts of debug_relationformer.ipynb run fine.
However, when I run train.py with "nohup python3 train.py --config configs/scene_2d.yaml --cuda_visible_device 0 1 2 --exp_name VGtest1 --nproc_per_node 3 --b 16 &> log/Muti.out& ", I get the following error:

*** Config file
configs/scene_2d.yaml
Experiment Name : VGtest1
Batch size : 16
Running Distributed: True ; GPU: 0 ; RANK: 0
Number of parameters : 92944451
ERROR:ignite.engine.engine.RelationformerTrainer:Current run is terminating due to exception: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
ERROR:ignite.engine.engine.RelationformerTrainer:Current run is terminating due to exception: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
ERROR:ignite.engine.engine.RelationformerTrainer:Current run is terminating due to exception: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
ERROR:ignite.engine.engine.RelationformerTrainer:Engine run is terminating due to exception: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
ERROR:ignite.engine.engine.RelationformerTrainer:Engine run is terminating due to exception: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
ERROR:ignite.engine.engine.RelationformerTrainer:Engine run is terminating due to exception: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
Traceback (most recent call last):
File "train.py", line 292, in
parallel.run(main, args)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/distributed/launcher.py", line 275, in run
idist.spawn(self.backend, func, args=args, kwargs_dict=kwargs, **self._spawn_params)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/distributed/utils.py", line 323, in spawn
comp_model_cls.spawn(
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/distributed/comp_models/native.py", line 304, in spawn
start_processes(
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
while not context.join():
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
fn(i, *args)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/distributed/comp_models/native.py", line 272, in _dist_worker_task_fn
fn(local_rank, *args, **kw_dict)
File "/home/ymf/dockerFile/relationformer/train.py", line 282, in main
trainer.run()
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/monai/engines/trainer.py", line 56, in run
super().run()
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/monai/engines/workflow.py", line 250, in run
super().run(data=self.data_loader, max_epochs=self.state.max_epochs)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/engine/engine.py", line 702, in run
return self._internal_run()
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/engine/engine.py", line 775, in _internal_run
self._handle_exception(e)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/engine/engine.py", line 469, in _handle_exception
raise e
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/engine/engine.py", line 745, in _internal_run
time_taken = self._run_once_on_dataset()
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/engine/engine.py", line 850, in _run_once_on_dataset
self._handle_exception(e)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/engine/engine.py", line 469, in _handle_exception
raise e
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/engine/engine.py", line 833, in _run_once_on_dataset
self.state.output = self._process_function(self, self.state.batch)
File "/home/ymf/dockerFile/relationformer/trainer.py", line 40, in _iteration
h, out = self.network(images)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 711, in forward
output = self.module(*inputs, **kwargs)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ymf/dockerFile/relationformer/models/relationformer_2D.py", line 108, in forward
features, pos = self.backbone(samples)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ymf/dockerFile/relationformer/models/deformable_detr_backbone.py", line 117, in forward
xs = self[0](tensor_list)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ymf/dockerFile/relationformer/models/deformable_detr_backbone.py", line 84, in forward
xs = self.body(tensor_list.tensors)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torchvision/models/_utils.py", line 63, in forward
x = module(x)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 399, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 395, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
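
For context, this error means the input batch is already on the GPU while the convolution weights are still on the CPU. A minimal snippet, independent of this codebase, that reproduces the same RuntimeError:

```python
# Reproduces the same device mismatch: the conv layer's weights stay on the
# CPU while the input tensor is moved to the GPU.
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 64, kernel_size=7)       # weights on the CPU
x = torch.randn(1, 3, 224, 224).cuda()       # input on the GPU

conv(x)  # RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
```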

Do you have any clue about this error and how to fix it? Thanks!

Bartopt commented Aug 7, 2023


I have the same problem. How did you fix it?

MeklitMa commented Oct 13, 2023

@Bartopt @Yassin-fan Hi, did anyone fix this problem? If you did, please share how. Thank you.

@tyxtyxtyxtyx

@Bartopt @Yassin-fan @MeklitMa Hi, please let me know if you managed to fix it. Thanks very much!

@incredibledays

@Bartopt @tyxtyxtyxtyx @Yassin-fan @MeklitMa @suprosanna Hi, can anyone send me the source code? Thank you very much. My email is [email protected]

JoseLuisNeves commented Jun 4, 2024

In trainer.py, I simply moved self.network to the CUDA device before line 41 (h, out = self.network(images)), which is where the error was coming from. @Bartopt @tyxtyxtyxtyx @Yassin-fan @incredibledays @MeklitMa
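
A minimal sketch of that workaround (the Trainer class below is hypothetical; only the self.network(images) call mirrors trainer.py):

```python
# Sketch of the workaround: move the model to the same device as the input
# batch before the forward pass. Class and method names here are illustrative,
# not the actual relationformer trainer.
import torch
import torch.nn as nn

class Trainer:
    def __init__(self, network: nn.Module):
        self.network = network

    def _iteration(self, images: torch.Tensor):
        # Workaround: ensure the weights live on the same device as the batch,
        # so F.conv2d no longer sees CPU weights with a CUDA input.
        self.network.to(images.device)
        return self.network(images)

device = "cuda" if torch.cuda.is_available() else "cpu"
trainer = Trainer(nn.Conv2d(3, 64, kernel_size=7))
out = trainer._iteration(torch.randn(1, 3, 224, 224, device=device))
print(out.shape)
```

Note that under DistributedDataParallel the model is normally moved to the local rank's GPU before being wrapped, so this workaround likely papers over a missing model.to(device) call in the spawned worker.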
