Hi,

Thanks for the wonderful explanation. I am using this code as a guideline to build a speech recognition network. I feed speech frames (a sequence of 40-dimensional feature vectors) into the encoder and try to predict characters at the decoder output. The number of speech frames can be very large (>1000) compared to the output length (<100).
So I have set the attention MAX_LENGTH to 5000. Unfortunately, the model never predicts the `<eos>` token and keeps predicting until it hits the 5000-character limit.
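For reference, here is a minimal sketch of the kind of greedy decoding loop in question, where MAX_LENGTH is only supposed to be a safety cap if `<eos>` never appears. The decoder call signature and the SOS/EOS token names are assumptions following the tutorial's conventions, not my exact code:

```python
import torch

MAX_LENGTH = 5000  # safety cap; decoding should normally stop at <eos> first

def greedy_decode(decoder, decoder_hidden, encoder_outputs,
                  sos_token, eos_token, max_length=MAX_LENGTH):
    # Hypothetical decoder signature: (input, hidden, encoder_outputs)
    # -> (output logits, new hidden, attention weights), as in the tutorial.
    decoder_input = torch.tensor([[sos_token]])
    decoded_tokens = []
    for _ in range(max_length):
        output, decoder_hidden, _ = decoder(
            decoder_input, decoder_hidden, encoder_outputs)
        topi = output.argmax(dim=-1)          # greedy pick of next character
        if topi.item() == eos_token:          # stop as soon as <eos> is emitted
            break
        decoded_tokens.append(topi.item())
        decoder_input = topi.detach().view(1, 1)  # feed prediction back in
    return decoded_tokens
```

In my case the `break` is never reached, so the loop always runs the full 5000 steps.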
I am using a bidirectional LSTM as the encoder. I concatenate the encoder's outputs and hidden state to feed into the decoder, which is a plain LSTM.
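To be concrete, a minimal sketch of the encoder hookup I described (the dimensions and class/variable names here are illustrative, not the tutorial's exact code):

```python
import torch
import torch.nn as nn

class BiLSTMEncoder(nn.Module):
    def __init__(self, input_dim=40, hidden_dim=256):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, bidirectional=True)

    def forward(self, frames):
        # frames: (seq_len, batch, 40) speech feature vectors
        outputs, (h, c) = self.lstm(frames)
        # outputs: (seq_len, batch, 2 * hidden_dim), fwd/bwd concatenated
        # h, c:    (2, batch, hidden_dim), one slice per direction
        # Concatenate the two directions so a unidirectional decoder LSTM
        # with hidden size 2 * hidden_dim can take them as its initial state:
        h = torch.cat([h[0], h[1]], dim=-1).unsqueeze(0)
        c = torch.cat([c[0], c[1]], dim=-1).unsqueeze(0)
        return outputs, (h, c)
```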
I would greatly appreciate any pointers.
Thanks
Brij