TL;DR: The authors present a novel Attention-over-Attention (AoA) model for Machine Comprehension. Given a document and a cloze-style question, the model predicts a single-word answer. The model:
- Embeds both context and query using a bidirectional GRU
- Computes a pairwise matching matrix between document and query words
- Computes query-to-document attention values
- Computes document-to-query attention values and averages them over document words to get one weight per query word
- Multiplies the two attention vectors to get final attention scores for words in the document
- Maps attention results back into the vocabulary space
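The attention steps above can be sketched with plain NumPy. This is a minimal illustration of the AoA computation (not the authors' implementation): the dimensions, random embeddings, and toy document are invented for the example, and in the real model the embeddings would come from the bidirectional GRU.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical sizes: 5 document words, 3 query words, hidden size 4
rng = np.random.default_rng(0)
D = rng.normal(size=(5, 4))  # contextual embeddings of document words
Q = rng.normal(size=(3, 4))  # contextual embeddings of query words

# Pairwise matching matrix: M[i, j] = dot(D[i], Q[j])
M = D @ Q.T  # shape (5, 3)

# Query-to-document attention: softmax over document words (per column)
alpha = softmax(M, axis=0)  # shape (5, 3), each column sums to 1

# Document-to-query attention: softmax over query words (per row),
# then averaged over document words to get one weight per query word
beta = softmax(M, axis=1).mean(axis=0)  # shape (3,)

# Attention-over-attention: weight the query-to-document columns by beta
s = alpha @ beta  # shape (5,), final attention over document words

# Map attention back into vocabulary space: sum the scores of
# repeated words (toy document, for illustration only)
doc_words = ["the", "cat", "sat", "the", "mat"]
vocab_scores = {}
for w, score in zip(doc_words, s):
    vocab_scores[w] = vocab_scores.get(w, 0.0) + score
```

Because each column of `alpha` and the vector `beta` are both normalized, the final scores `s` also sum to 1, so they can be read directly as a distribution over document positions.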
The authors evaluate the model on the CNN news and Children's Book Test (CBTest) question answering datasets, obtaining state-of-the-art results and beating other models such as EpiReader and ASReader.
- Very good model visualization in the paper
- I like that this model is much simpler than EpiReader while also performing better