training_fail_/seq2seq.lua:62: attempt to call field 'recursiveCopy' #80

Closed
ArashHosseini opened this issue Oct 15, 2017 · 10 comments

@ArashHosseini

ArashHosseini commented Oct 15, 2017

neuralconvo is running on two other machines with CUDA and OpenCL without any problem. Now I have set up a fresh OS (14.04) with CUDA 8 and switched from cuDNN 5 to cuDNN 6 because of TensorFlow 1.3. The Torch install completed.

trace:

`$ th train.lua --cuda --dataset 5000 --hiddenSize 100
-- Loading dataset
data/vocab.t7 not found
-- Parsing Cornell movie dialogs data set ...
[==================== 387810/387810 ==========>] Tot: 1s238ms | Step: 0ms
-- Pre-processing data
[==================== 5000/5000 ==============>] Tot: 771ms | Step: 0ms
-- Shuffling
Writing data/examples.t7 ...
[==================== 8151/8151 ==============>] Tot: 703ms | Step: 0ms
Writing data/vocab.t7 ...

Dataset stats:
Vocabulary size: 7061
Examples: 8151

-- Epoch 1 / 50 (LR= 0.001)

~/torch/install/bin/luajit: ./seq2seq.lua:62: attempt to call field 'recursiveCopy' (a nil value)
stack traceback:
./seq2seq.lua:62: in function 'forwardConnect'
train.lua:97: in function 'opfunc'
/home/flyn/torch/install/share/lua/5.1/optim/adam.lua:37: in function 'adam'
train.lua:131: in main chunk
[C]: in function 'dofile'
...flyn/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406670
`
I have never hit this issue before and can't find any related details on the web. Thanks for any help.

@ArashHosseini ArashHosseini changed the title training_fail_/seq2seq.lua:62: attempt to call field 'recursiveCopy' training_fail_/seq2seq.lua:62: attempt to call field 'recursiveCopy'........cudnn6 support? Oct 18, 2017
@ArashHosseini ArashHosseini changed the title training_fail_/seq2seq.lua:62: attempt to call field 'recursiveCopy'........cudnn6 support? training_fail_/seq2seq.lua:62: attempt to call field 'recursiveCopy' Oct 18, 2017
@ArashHosseini
Author

ArashHosseini commented Oct 27, 2017

I talked with Marc about this; he will look into it soonish. Can you post your environment settings, including CUDA and cuDNN versions, please?

@ArashHosseini
Author

@hit-lacus there are no other participants yet. Yes, please.

@shrutiphadke

I have the same problem, with the exact same error. I am using Torch without CUDA/OpenCL on Ubuntu 16.04.

@baaleze

baaleze commented Jan 23, 2018

It looks like the way to call recursiveCopy has changed.
In seq2seq.lua, lines 62 and 64, if I replace `nn.rnn.recursiveCopy` with `nn.utils.recursiveCopy`, it works for me.
Hope that helps.
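
For anyone who needs the same seq2seq.lua to run on both older and newer installs, the rename above could be wrapped in a small lookup. This is only a sketch: `findRecursiveCopy` is a hypothetical helper, not part of neuralconvo, nn, or rnn, and `fakeNN` is a stand-in table so the snippet runs in plain Lua without Torch. The stub copy function is also only a placeholder and does not mimic the real copy behaviour.

```lua
-- Hypothetical helper (not part of neuralconvo, nn, or rnn): return the first
-- recursiveCopy function found under the given namespaces of `lib`.
local function findRecursiveCopy(lib, namespaces)
  for _, name in ipairs(namespaces) do
    local ns = lib[name]
    if type(ns) == "table" and type(ns.recursiveCopy) == "function" then
      return ns.recursiveCopy, name
    end
  end
  error("recursiveCopy not found under: " .. table.concat(namespaces, ", "))
end

-- Stand-in for an install where only nn.utils.recursiveCopy exists.
-- With Torch installed you would pass the real `nn` table instead.
local fakeNN = {
  utils = { recursiveCopy = function(dst, src) return src end }, -- placeholder
}

-- Try the old location first, then the new one.
local recursiveCopy, foundIn = findRecursiveCopy(fakeNN, { "rnn", "utils" })
print("using nn." .. foundIn .. ".recursiveCopy")
```

With the real `nn` table, lines 62 and 64 of seq2seq.lua could then call the returned `recursiveCopy` instead of naming either location directly. If you only target one install, the direct rename described above is simpler.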

@ArashHosseini
Author

Works perfectly, I can confirm as well.

@ArashHosseini
Author

ArashHosseini commented Jan 24, 2018

@baaleze, thanks again. Did you also get the following error on eval after training?

Loading vocabulary from data/vocab.t7 ...	
-- Loading model	

Type a sentence and hit enter to submit.	
CTRL+C then enter to quit.
	
you> hello
/home/flyn/torch/install/bin/luajit: /home/flyn/torch/install/share/lua/5.1/nn/Container.lua:67: 
In 3 module of nn.Sequential:
/home/flyn/torch/install/share/lua/5.1/torch/Tensor.lua:466: Wrong size for view. Input size: 1000. Output size: 25931
stack traceback:
	[C]: in function 'error'
	/home/flyn/torch/install/share/lua/5.1/torch/Tensor.lua:466: in function 'view'
	/home/flyn/torch/install/share/lua/5.1/rnn/utils.lua:191: in function 'recursiveZeroMask'
	/home/flyn/torch/install/share/lua/5.1/rnn/MaskZero.lua:37: in function 'updateOutput'
	/home/flyn/torch/install/share/lua/5.1/rnn/Recursor.lua:13: in function '_updateOutput'
	...yn/torch/install/share/lua/5.1/rnn/AbstractRecurrent.lua:50: in function 'updateOutput'
	/home/flyn/torch/install/share/lua/5.1/rnn/Sequencer.lua:53: in function </home/flyn/torch/install/share/lua/5.1/rnn/Sequencer.lua:34>
	[C]: in function 'xpcall'
	/home/flyn/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	/home/flyn/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	./seq2seq.lua:87: in function 'eval'
	eval.lua:55: in function 'say'
	eval.lua:69: in main chunk
	[C]: in function 'dofile'
	...flyn/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00406670

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
	[C]: in function 'error'
	/home/flyn/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
	/home/flyn/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	./seq2seq.lua:87: in function 'eval'
	eval.lua:55: in function 'say'
	eval.lua:69: in main chunk
	[C]: in function 'dofile'
	...flyn/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00406670

@fengmao31

I have this bug too.

@Jeavy

Jeavy commented May 18, 2018

Hi, I trained a model with "th train.lua --cuda --dataset 50000 --hiddenSize 1000" and afterwards got the same error as @ArashHosseini (Wrong size for view) when I tried to chat.
Does anyone know how to fix it?
Thanks for help.

@ghost

ghost commented Jul 3, 2018

I found a simple solution. Try this.
In the file seq2seq.lua line 87, change
local prediction = self.decoder:forward(torch.Tensor(output))[#output]
to
local prediction = self.decoder:forward(torch.Tensor({output}):t())[#output][1]

@ArashHosseini
Author

@Tak-o-m great report, I agree as well.
