CumConcatLayer #589

Merged: 2 commits merged into master, Sep 12, 2021
Conversation

albertz (Member) commented Aug 26, 2021

This is for #391.

@albertz albertz mentioned this pull request Aug 26, 2021
@albertz albertz force-pushed the albert-generalized-self-att branch from 265929c to 38ef10b on August 26, 2021 16:14
@albertz albertz marked this pull request as ready for review August 26, 2021 17:39
@albertz albertz force-pushed the albert-generalized-self-att branch 2 times, most recently from 27504f3 to 580fcee on August 26, 2021 21:23
@albertz albertz requested a review from a team as a code owner August 26, 2021 21:23
albertz (Member, Author) commented Aug 26, 2021

Implementing a test case is a bit tricky here.

I cannot use DotLayer at the moment to reduce the new dim tag (see this comment).

I also cannot really use any other layer to reduce the new dim tag, because none of them would handle the extended dynamic size correctly. Also, operations using the extended dynamic size would usually expect all axes of the extended dynamic size (rec_time_dim) to be present, which is not the case inside the loop; only DotLayer would introduce this, as outlined in #391.

I also currently cannot return the CumConcatLayer directly as the output inside the rec layer, because the rec layer logic expects that the output template inside the rec layer becomes the final output with an additional rec_time_dim. This is not true for CumConcatLayer, where rec_time_dim is not added (except in the extended dynamic size).
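
For orientation, here is a minimal sketch (plain TensorFlow, not the RETURNN layer API; all names are hypothetical) of what cumulative concatenation does per decoder step. It shows where the new "accumulated time" dim tag comes from, i.e. the axis that some later layer (such as DotLayer) then has to reduce again:

```python
import tensorflow as tf

def cum_concat_step(accumulated, frame):
    """One step of cumulative concatenation (hypothetical helper, illustration only).

    accumulated: [B, t, D]   frames of steps 0..t-1
    frame:       [B, D]      frame of the current step t
    returns:     [B, t+1, D] frames of steps 0..t
    """
    return tf.concat([accumulated, frame[:, tf.newaxis, :]], axis=1)

# Toy loop: the accumulated-time axis grows by one every step.
acc = tf.zeros([2, 0, 4])                 # batch 2, feature dim 4, empty accumulator
for t in range(3):
    frame = tf.random.normal([2, 4])
    acc = cum_concat_step(acc, frame)
print(acc.shape)                          # (2, 3, 4)
```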

Zettelkasten (Member) left a comment

Looks good! I only had minor comments.
Like you suggested, I think we can then merge this and go from there. EDIT: after we fix the other issues.

Review threads (all outdated, resolved): returnn/tf/util/data.py (×2), returnn/tf/layers/rec.py (×2)
@albertz albertz force-pushed the albert-generalized-self-att branch from f53d68e to 4791acd on August 27, 2021 09:07
albertz (Member, Author) commented Aug 27, 2021

Maybe I will directly implement all the needed things in this PR (as minimal as possible) so that the test can run, i.e. also:

@albertz albertz force-pushed the albert-generalized-self-att branch 3 times, most recently from c9d9d8b to e99be05 on August 27, 2021 11:46
@albertz albertz force-pushed the albert-generalized-self-att branch 2 times, most recently from 2ed67bb to 2384c1b on September 1, 2021 13:09
albertz (Member, Author) commented Sep 1, 2021

I wonder a bit how to best handle such huge PRs (e.g. how to do the cleanup effectively).
I wrote that down here on Reddit as a question, but I am not sure that is the best place to ask such things, and I don't really know of any good place where I could ask them. Any recommendations on where to ask such things?

albertz (Member, Author) commented Sep 1, 2021

I will probably first move everything out into a new branch, except for the two new test cases and the CumConcatLayer itself (which is what this PR is about, plus the further things we need for the test cases to pass), because all the remaining changes should always pass all the tests.

albertz (Member, Author) commented Sep 2, 2021

Wow, this was quite some effort just to get the basics ready so that I can continue working on this itself.

Basically:
#592 #599 #600 #601 #602 #603 #604 #605 #606 #607 #608 #609 #610 #611 #612 #613 #614 #615 #616 #617 #618 #619 #620 #621 #622 #623 #624

albertz and others added 2 commits September 12, 2021 02:40
This is for generalized self attention (#391).
Fixes #391.

Co-authored-by: Frithjof <[email protected]>
albertz (Member, Author) commented Sep 12, 2021

The test case is a complete yet simple test of auto-regressive self-attention, so it covers #391. CumConcatLayer is the final missing piece, so this finally fixes #391 (at least the basic support).
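
For a concrete picture of what such a test exercises, below is a conceptual sketch (plain TensorFlow, not the RETURNN layer API; all names are illustrative) of auto-regressive self-attention built on cumulative concatenation. The accumulated-time axis created at every step is exactly the new dim tag that the final dot-product reduction removes again:

```python
import tensorflow as tf

def self_att_step(q_t, k_acc, v_acc):
    """One auto-regressive self-attention step (conceptual sketch only).

    q_t:   [B, D]      query of the current step
    k_acc: [B, T', D]  keys accumulated over steps 0..t (T' = t + 1)
    v_acc: [B, T', D]  values accumulated the same way
    returns: [B, D]    attention context for the current step
    """
    d_k = tf.cast(tf.shape(k_acc)[-1], tf.float32)
    energy = tf.einsum("bd,btd->bt", q_t, k_acc) / tf.sqrt(d_k)
    weights = tf.nn.softmax(energy, axis=-1)         # softmax over the accumulated-time axis
    return tf.einsum("bt,btd->bd", weights, v_acc)   # reduces the accumulated-time axis again

# Toy loop: keys/values grow by one frame per step, like cumulative concatenation in the rec loop.
B, D, steps = 2, 4, 3
k_acc = tf.zeros([B, 0, D])
v_acc = tf.zeros([B, 0, D])
for t in range(steps):
    x_t = tf.random.normal([B, D])                   # stand-in for the per-step input
    k_acc = tf.concat([k_acc, x_t[:, None, :]], axis=1)
    v_acc = tf.concat([v_acc, x_t[:, None, :]], axis=1)
    context = self_att_step(x_t, k_acc, v_acc)
    print(t, context.shape)                          # (B, D) each step
```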

@albertz albertz merged commit 3ab8667 into master Sep 12, 2021
albertz added a commit that referenced this pull request Sep 12, 2021
This is for generalized self attention (#391).
Fixes #391.

Co-authored-by: Frithjof <[email protected]>
@albertz albertz deleted the albert-generalized-self-att branch September 12, 2021 00:59