RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.cuda.LongTensor for argument #2 'other' #62

sunshine-zkf · 2019-06-16T13:13:44Z

When I run train.py, I get the following error:

/home/sunshine_zkf/RetinaNet/pytorch-retinanet-master/loss.py:95: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  print('loc_loss: %.3f | cls_loss: %.3f' % (loc_loss.data[0]/num_pos, cls_loss.data[0]/num_peg), end=' | ')
Traceback (most recent call last):
  File "/home/sunshine_zkf/RetinaNet/pytorch-retinanet-master/train.py", line 116, in <module>
    train(epoch)
  File "/home/sunshine_zkf/RetinaNet/pytorch-retinanet-master/train.py", line 77, in train
    loss = criterion(loc_preds, loc_targets, cls_preds, cls_targets)
  File "/home/sunshine_zkf/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/sunshine_zkf/RetinaNet/pytorch-retinanet-master/loss.py", line 95, in forward
    print('loc_loss: %.3f | cls_loss: %.3f' % (loc_loss.data[0]/num_pos, cls_loss.data[0]/num_peg), end=' | ')
RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.cuda.LongTensor for argument #2 'other'

Why does this happen? Can you help me? Thank you very much!

sunshine-zkf · 2019-06-16T13:14:00Z

@kuangliu

wvalcke · 2019-06-16T13:50:27Z

In utils.py, replace the following:
a = torch.arange(0,x)
b = torch.arange(0,y)

with:

a = torch.arange(0,x,dtype=torch.float)
b = torch.arange(0,y,dtype=torch.float)

You will probably also need to replace every .data[0] with .item().
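The dtype difference behind this fix can be checked in a minimal CPU sketch. The values of x and y and the tensor preds are hypothetical stand-ins, not taken from utils.py. Note that recent PyTorch releases promote dtypes automatically in many operations, so the original RuntimeError may only reproduce on older versions such as the one in the traceback.

```python
import torch

x, y = 3, 2  # hypothetical grid width and height (assumption)

# With integer arguments and no dtype, arange returns an int64 (Long) tensor.
a_long = torch.arange(0, x)

# With dtype=torch.float the result is float32, matching float predictions.
a = torch.arange(0, x, dtype=torch.float)
b = torch.arange(0, y, dtype=torch.float)

preds = torch.rand(x)   # float32, standing in for network outputs
shifted = preds + a     # both operands are float32

# A 0-dim tensor is converted to a Python number with .item().
loss = shifted.sum()
loss_value = loss.item()

print(a_long.dtype, a.dtype, type(loss_value))
```

Passing the dtype at creation time keeps the fix in one place, rather than casting at every later use.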

sunshine-zkf · 2019-06-17T03:38:27Z

I made the changes you suggested, but now the following error occurs.
I also tried changing the calls to loc_loss.data[0].item() and so on, with the same result:
Traceback (most recent call last):
  File "/home/sunshine_zkf/RetinaNet/pytorch-retinanet-master/train.py", line 116, in <module>
    train(epoch)
  File "/home/sunshine_zkf/RetinaNet/pytorch-retinanet-master/train.py", line 77, in train
    loss = criterion(loc_preds, loc_targets, cls_preds, cls_targets)
  File "/home/sunshine_zkf/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/sunshine_zkf/RetinaNet/pytorch-retinanet-master/loss.py", line 95, in forward
    print('loc_loss: %.3f | cls_loss: %.3f' % (loc_loss.item()/num_pos, cls_loss.item()/num_peg), end=' | ')
  File "/home/sunshine_zkf/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 320, in __rdiv__
    return self.reciprocal() * other
RuntimeError: reciprocal is not implemented for type torch.cuda.LongTensor

What is the problem? Can you help me? Thank you very much!

wvalcke · 2019-06-17T17:21:08Z

You did
loc_loss.data[0].item()

But it should be
loc_loss.item()

Check other references like this and change them all

sunshine-zkf · 2019-06-18T03:48:02Z

Thank you very much! train.py now runs successfully!
The root cause seems to be the PyTorch version.
Besides changing to loc_loss.item(), it is also necessary to make the following change:
num_pos = pos.data.long().sum().item()
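The second traceback (reciprocal is not implemented for type torch.cuda.LongTensor) suggests that num_pos was still a Long tensor, so dividing a Python float by it went through the tensor's reverse division. A minimal CPU sketch of the fix, with hypothetical targets and a hypothetical loss value standing in for the ones in loss.py:

```python
import torch

# Hypothetical class targets for 6 anchors: 0 = background, >0 = object class.
cls_targets = torch.tensor([0, 2, 0, 1, 0, 3])
pos = cls_targets > 0  # mask of positive anchors

# Count positives as a plain Python int, as in the fix above.
num_pos = pos.long().sum().item()

# Hypothetical summed loss, standing in for loc_loss in loss.py.
loc_loss = torch.tensor(4.5)

# .item() returns a Python float, so the division happens in plain Python
# and no tensor dtype is involved.
avg_loc_loss = loc_loss.item() / num_pos
print('loc_loss: %.3f' % avg_loc_loss)
```

With both operands converted to Python numbers, the logging line no longer depends on how a given PyTorch version handles mixed tensor dtypes.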

sunshine-zkf · 2019-06-18T11:18:08Z

Sorry to bother you again. I was wondering whether a loss like this is normal at the start of training.

wvalcke · 2019-06-18T12:13:39Z

Difficult to say without knowing what you want to train.
If possible, send me your train/test index files.
What are your training images?
Are you training on your own set or an existing one?
If you are starting from an already trained model, it can be normal for the loss to be very low at the beginning.

sunshine-zkf · 2019-06-18T13:41:09Z

I am training on the VOC2012 dataset, using the files ./data/voc12_train.txt and voc12_val.txt in this repo.
I use the net.pth that I downloaded online. So am I starting from an already trained model?

Then I modified loss.py following your #56. The problem got a little better, but it didn't make much difference.

wvalcke · 2019-06-18T14:10:10Z

Have you used the script get_state_dict.py?
It initializes net.pth with ResNet-50 pretrained weights (I guess from ImageNet), and the RetinaNet-specific layers are initialized from a Gaussian distribution.
The net.pth created this way has not been trained at all.
That is what I did, and training (on the specific set I used) started with a loss of 2.1, which then decreased during training.

sunshine-zkf · 2019-06-18T14:45:42Z

Yes, I used the script get_state_dict.py to generate net.pth. Did you train on VOC? How do I know that the training is working correctly?

wvalcke · 2019-06-19T08:15:27Z

I started training on the Pascal VOC set; the loss starts at 1.4.
But during the first test evaluation it fails to load the test images. I can't find them. Where did you download them from?

wvalcke · 2019-06-19T09:03:53Z

I took the loss implementation from issue #52 and started training on VOC.
The loss started at 0.7, and training seems more stable than with the original code, which sometimes went to 'nan'.

sunshine-zkf · 2019-06-19T11:41:29Z

I downloaded the images from VOC2007 test, but when I ran test.py there were many boxes on the detected image. I think there is a problem with that code. What about you?
I use the loss from issue #52; the loss is very low but stable.
Can I add you on WeChat?

wvalcke · 2019-06-26T17:56:46Z

I trained on the VOC dataset and saw that it trained with the loss from #52, but the results were not OK (hundreds of boxes detected).
I changed the loss function to the definition below, retrained from scratch, and then tested on one of the images. Now the objects were correctly detected.

 def focal_loss_alt(self, x, y):
    '''Focal loss alternative.

    Args:
      x: (tensor) sized [N,D].
      y: (tensor) sized [N,].

    Return:
      (tensor) focal loss.
    '''
    alpha = 0.25

    t = one_hot_embedding(y.data.cpu(), 1+self.num_classes)
    t = t[:,1:]
    t = Variable(t).cuda()

    xt = x*(2*t-1)  # xt = x if t > 0 else -x
    pt = (2*xt+1).sigmoid()
    pt = pt.clamp(1e-7, 1.0)

    w = alpha*t + (1-alpha)*(1-t)
    loss = -w*pt.log() / 2
    return loss.sum()
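For readers who want to try this loss outside the repository, here is a standalone CPU sketch of the same computation. It is an adaptation, not the code from the thread: it assumes torch.nn.functional.one_hot can stand in for the repo's one_hot_embedding helper, takes num_classes as an argument instead of self.num_classes, and expects labels that are all >= 0 (0 = background). In the line for xt, t is 0 or 1, so 2*t-1 is -1 or +1: the logit keeps its sign for the target class and is flipped for every other class, and sigmoid(2*xt+1) is close to 1 when the prediction agrees with the target. The form appears to correspond to the alternative focal loss with gamma = 2 and beta = 1.

```python
import torch
import torch.nn.functional as F

def focal_loss_alt(x, y, num_classes, alpha=0.25):
    """x: float logits [N, num_classes]; y: int64 labels [N], 0 = background."""
    # One-hot encode with an extra background column, then drop that column.
    t = F.one_hot(y, num_classes + 1)[:, 1:].to(x.dtype)

    xt = x * (2 * t - 1)  # +x where t == 1, -x where t == 0
    pt = (2 * xt + 1).sigmoid().clamp(1e-7, 1.0)

    # alpha weights the positive entries, (1 - alpha) the negative ones.
    w = alpha * t + (1 - alpha) * (1 - t)
    return (-w * pt.log() / 2).sum()

# Hypothetical toy batch: anchor 0 is class 1, anchor 1 is background.
logits = torch.tensor([[4.0, -4.0], [-4.0, -4.0]])
labels = torch.tensor([1, 0])

good = focal_loss_alt(logits, labels, num_classes=2)   # predictions agree with labels
bad = focal_loss_alt(-logits, labels, num_classes=2)   # predictions contradict labels
print(good.item(), bad.item())
```

The loss is small when the logits agree with the labels and large when they are flipped, which is a quick sanity check before plugging a variant like this into training.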

sunshine-zkf · 2019-06-29T15:07:28Z

Why was it modified like this? I don't quite understand xt. I also used the author's other repo, torchcv, but I get 20.3 mAP on the VOC2007 test set. Can I see your modified code?

wvalcke mentioned this issue Jun 16, 2019

Expected object of type torch.LongTensor but found type torch.FloatTensor #40

Open

sunshine-zkf closed this as completed Jun 29, 2019
