[ADD] PyTorch: Tolstoi Char RNN #40
base: develop
Conversation
Working training parameters are:

- batch size ``50``
- ``200`` epochs
- SGD with a learning rate of :math:`\\approx 0.1` works
This is copied from TensorFlow and not verified.

In TensorFlow, the cross-entropy loss takes the mean across the time axis and the sum across the batch axis. Such an option does not exist in PyTorch: the only choices are `"sum"` or `"mean"`, applied across both axes. Currently `"mean"` is chosen. In this case, the learning rate should be a factor of `batch_size` bigger, because the gradients are a factor of `batch_size` smaller.
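A minimal sketch of this scaling argument (the shapes below are made up for illustration; only the batch size of 50 comes from the diff): summing the per-element losses and dividing only by the time-axis width, as TensorFlow does, yields gradients exactly `batch_size` times larger than PyTorch's `reduction="mean"`.

```python
import torch
import torch.nn.functional as F

# Made-up shapes for illustration.
batch_size, seq_length, vocab_size = 50, 10, 83
logits = torch.randn(batch_size, vocab_size, seq_length, requires_grad=True)
targets = torch.randint(vocab_size, (batch_size, seq_length))

# reduction="mean" divides the summed loss by batch_size * seq_length.
F.cross_entropy(logits, targets, reduction="mean").backward()
grad_mean = logits.grad.clone()

# TensorFlow convention: mean across time, sum across the batch,
# i.e. divide the summed loss by seq_length only.
logits.grad = None
(F.cross_entropy(logits, targets, reduction="sum") / seq_length).backward()
grad_tf = logits.grad

# The gradients differ by exactly a factor of batch_size, hence the
# learning rate under "mean" must be a factor of batch_size bigger.
assert torch.allclose(grad_tf, grad_mean * batch_size)
```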
Could you instead use `reduction="sum"` and divide by `seq_length` (or whatever the variable name for the width of the time axis is)? It would be great if running, e.g., SGD with `lr=0.1` produced similar results in PyTorch and TensorFlow.
This is exactly the idea we discussed in person. However, it turns out that this didn't work: the division by `seq_length` must happen only after the `CrossEntropyLoss`, so it cannot be part of the model. I see two possibilities:

- introduce a custom cross-entropy loss that divides by `seq_length` after applying `CrossEntropyLoss` (a sketch follows below)
- leave it as it is

In my opinion, both options are quite bad.
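A minimal sketch of the first option, assuming logits of shape `(batch, vocab, seq_length)`; the class name and constructor argument are made up for this illustration:

```python
import torch.nn as nn
import torch.nn.functional as F

class TimeAveragedCrossEntropyLoss(nn.Module):
    """Hypothetical wrapper: sum the cross-entropy over all elements, then
    divide by the sequence length, reproducing TensorFlow's
    mean-over-time / sum-over-batch reduction."""

    def __init__(self, seq_length):
        super().__init__()
        self.seq_length = seq_length

    def forward(self, logits, targets):
        # logits: (batch, vocab, seq_length), targets: (batch, seq_length)
        return F.cross_entropy(logits, targets, reduction="sum") / self.seq_length
```

Since the wrapper lives outside the model, the division happens after the loss as required, at the cost of hard-coding the sequence length into the loss object.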
Easiest would be to change the definition of the loss in the TensorFlow version to something compatible with PyTorch. Let me think about this; I will address and merge it once I find time for DeepOBS again.
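For illustration, such a change might look like the following sketch; the `losses` tensor and its shape are assumptions, not the actual DeepOBS code:

```python
import tensorflow as tf

# Hypothetical per-element cross-entropy values of shape (batch, seq_length).
losses = tf.random.uniform((50, 10))

# Current convention: mean across the time axis, then sum across the batch.
loss_current = tf.reduce_sum(tf.reduce_mean(losses, axis=1))

# PyTorch-compatible definition: plain mean over both axes,
# matching CrossEntropyLoss(reduction="mean").
loss_compatible = tf.reduce_mean(losses)
```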
Introduces a PyTorch version of the Tolstoi Char RNN (previously only available in TensorFlow).