Transformer model based on TensorLayer #1027
base: master
Conversation
Documentation hasn't been done yet.
import tensorlayer as tl


class MultiHeadAttentionLayer(tl.layers.Layer):
MultiHeadAttention is better than MultiHeadAttentionLayer?
K = tf.keras.backend


class LazyAdam(tf.keras.optimizers.Adam):
move this part to tl.optimizer
??
OKAY.
Also, should I modify the copyright notice at the top? I referenced most of the code from the official TensorFlow implementation.
The dependency on Keras should be removed gradually: tf.keras.optimizers -> tf.optimizers
ready to merge?
Add attention visualisation util
not yet
Add documentation
Hi, could you provide example code in the examples folder and update changelog.md? Thanks.
"""The :class:`MultiHeadAttentionLayer` layer is for multi-head attention computation. | ||
The weight computation is between "key" and "query", which will then matmul with "value" to generate information | ||
that selectively focuses on the "query" messages. | ||
Parameters |
missing space
done
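For context, the computation that docstring describes can be sketched in a few lines of plain TensorFlow. This is an illustration, not the PR's code; it assumes float32 inputs of shape (batch, length, d_model) and omits the learned query/key/value projections a real layer would contain:

import tensorflow as tf

def multi_head_attention(q, k, v, num_heads):
    d_model = q.shape[-1]          # must be divisible by num_heads
    depth = d_model // num_heads

    def split_heads(x):
        # (batch, length, d_model) -> (batch, num_heads, length, depth)
        x = tf.reshape(x, (tf.shape(x)[0], -1, num_heads, depth))
        return tf.transpose(x, perm=[0, 2, 1, 3])

    q, k, v = split_heads(q), split_heads(k), split_heads(v)
    # Weight computation between "query" and "key" (scaled dot product)...
    logits = tf.matmul(q, k, transpose_b=True) / tf.sqrt(float(depth))
    weights = tf.nn.softmax(logits, axis=-1)
    # ...which is then matmul'd with "value".
    context = tf.matmul(weights, v)
    # Merge the heads back to (batch, length, d_model).
    context = tf.transpose(context, perm=[0, 2, 1, 3])
    return tf.reshape(context, (tf.shape(context)[0], -1, d_model))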
):
    """Search for sequence of subtoken ids with the largest probability.

    Args:
Incorrect RST format; it should be `Parameters` with an underline.
done
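For reference, the numpydoc/RST convention the reviewer is asking for looks like the following; the signature and parameter names here are illustrative, not the PR's actual ones:

def sequence_beam_search(symbols_to_logits_fn, initial_ids, vocab_size, beam_size):
    """Search for sequence of subtoken ids with the largest probability.

    Parameters
    ----------
    symbols_to_logits_fn : callable
        Decoder interface that maps partial sequences to next-step logits.
    initial_ids : Tensor
        Starting token ids for each example, with shape (batch_size,).
    vocab_size : int
        Size of the subtoken vocabulary.
    beam_size : int
        Number of beams to keep per example.
    """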
There are some naming issues with this PR. Please don't merge it for now.
tensorlayer/optimizers/lazyAdam.py
Outdated
K = tf.keras.backend


class LazyAdam(tf.optimizers.Adam):
LazyAdamOptimizer
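To make the discussion concrete: the point of LazyAdam is that, for sparse gradients (e.g. from embedding lookups), only the rows that actually received gradients have their moment slots updated. A condensed sketch of that idea, modeled on the TensorFlow Addons implementation; note `_resource_apply_sparse` is a private Keras optimizer hook whose signature has varied across TF releases:

import tensorflow as tf

class LazyAdam(tf.keras.optimizers.Adam):

    def _resource_apply_sparse(self, grad, var, indices):
        beta_1 = self._get_hyper('beta_1', var.dtype)
        beta_2 = self._get_hyper('beta_2', var.dtype)
        lr = self._decayed_lr(var.dtype)
        epsilon = tf.convert_to_tensor(self.epsilon, var.dtype)
        local_step = tf.cast(self.iterations + 1, var.dtype)

        m = self.get_slot(var, 'm')
        v = self.get_slot(var, 'v')

        # Update only the rows of the m/v slots touched by this gradient.
        m_t = beta_1 * tf.gather(m, indices) + (1 - beta_1) * grad
        v_t = beta_2 * tf.gather(v, indices) + (1 - beta_2) * tf.square(grad)
        m.scatter_update(tf.IndexedSlices(m_t, indices))
        v.scatter_update(tf.IndexedSlices(v_t, indices))

        # Bias-corrected step, applied only to the touched rows.
        lr_t = lr * tf.sqrt(1 - tf.pow(beta_2, local_step)) / (1 - tf.pow(beta_1, local_step))
        return var.scatter_sub(tf.IndexedSlices(lr_t * m_t / (tf.sqrt(v_t) + epsilon), indices))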
tensorlayer/optimizers/lazyAdam.py
Outdated
@@ -0,0 +1,147 @@
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
Rename the file to lazy_adam.py
done
tensorlayer/optimizers/lazyAdam.py
Outdated
self.verbose = verbose
if init_steps is None:
    init_steps = 0.0
self.steps = float(init_steps)  # Total steps during training.
Why is steps a float and not an int?
I will get rid of this part. It has been updated in the tf.keras library. Thanks.
import tensorlayer as tl


class FeedForwardLayer(tl.layers.Layer):
How is this FeedForwardLayer different from a conventional feedforward layer? If this is a special implementation, can it have a more specific name?
FeedForwardLayer here is specifically designed for the Transformer model and includes two transformations; the conventional feedforward layer only contains one.
Any suggestions on the naming?
TransformerFeedForwardLayer?
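For reference, the "two transformations" are the position-wise feed-forward sublayer from "Attention Is All You Need": an expanding dense layer with ReLU followed by a projecting dense layer. A minimal sketch, written against plain tf.keras to stay self-contained (a TensorLayer version would subclass tl.layers.Layer instead):

import tensorflow as tf

class TransformerFeedForward(tf.keras.layers.Layer):
    # FFN(x) = max(0, x W1 + b1) W2 + b2 -- two transformations, not one.

    def __init__(self, d_model, d_ff):
        super().__init__()
        self.expand = tf.keras.layers.Dense(d_ff, activation='relu')  # first transformation
        self.project = tf.keras.layers.Dense(d_model)                 # second transformation

    def call(self, inputs):
        return self.project(self.expand(inputs))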
tensorlayer/optimizers/lazyAdam.py
Outdated
import numpy as np
import tensorflow as tf

K = tf.keras.backend
Does this have to be a global variable of tensorlayer?
import tensorflow as tf
import tensorlayer.models.transformer.beamsearchHelper.beam_search_v1 as v1

_StateKeys = v1._StateKeys  # pylint: disable=protected-access
We should avoid global variables in the library as much as possible.
I will try to optimise this one.
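One way to carry that out, as a sketch: fetch the private container at its point of use rather than re-exporting it as a module-level alias. The `CUR_INDEX` key below is illustrative, not confirmed from the v1 module:

import tensorlayer.models.transformer.beamsearchHelper.beam_search_v1 as v1

def sequence_beam_search(symbols_to_logits_fn, initial_ids, **kwargs):
    # Access the private name where it is needed instead of binding it
    # to a module-level global at import time.
    keys = v1._StateKeys  # pylint: disable=protected-access
    state = {keys.CUR_INDEX: 0}  # illustrative; build the rest of the loop state here
    ...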
@ArnoldLIULJ any update?
I was on vacation and will be working on a simplified tutorial today.
Add examples in example/translation_task/tutorial_transformer
done
Hi, the RST format is not correct in many functions, please check~
7b3be01 to 2f316b0
please check
I think this one can be merged after the Travis build passes~