The following is an inventory of datasets for two Natural Language Processing (NLP) tasks: Question Generation (QG) and Question Answering (QA). QA datasets are included in this repository because the two tasks often occur together.

Type | Name | Link |
---|---|---|
QA | SQuAD2.0 - The Stanford Question Answering Dataset | https://rajpurkar.github.io/SQuAD-explorer/ |
QA | Question-Answer Dataset | http://www.cs.cmu.edu/~ark/QA-data/ |
QA | A Corpus for Complex Question Answering over Knowledge Graphs | http://sda.cs.uni-bonn.de/projects/qa-dataset/ |
QA | WebQuestions | https://nlp.stanford.edu/software/sempre/ |
QG | Question Generation Shared Task & Evaluation Challenge (QGSTEC) 2010 - Generating Questions from Sentences | https://github.com/bjwyse/QGSTEC2010 |
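
Since QA and QG often draw on the same data, a QA dataset can be read as (context, question, answer) triples and reused for either task. The snippet below is a minimal sketch of that idea, assuming the nested JSON layout used by SQuAD2.0 (`data` → `paragraphs` → `qas`). The helper names `load_squad` and `iter_squad_examples` and the tiny `SAMPLE` record are made up for illustration and are not part of any dataset's tooling.

```python
import json


def load_squad(path):
    """Read a SQuAD-style JSON file from disk (path is supplied by the caller)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def iter_squad_examples(squad):
    """Yield (context, question, answers, is_impossible) tuples.

    Assumes the SQuAD2.0 layout: data -> paragraphs -> qas.
    Unanswerable questions carry an empty answer list.
    """
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                answers = [a["text"] for a in qa.get("answers", [])]
                yield context, qa["question"], answers, qa.get("is_impossible", False)


# Hypothetical two-question sample in the assumed layout (not real dataset content).
SAMPLE = {
    "version": "v2.0",
    "data": [
        {
            "title": "Example",
            "paragraphs": [
                {
                    "context": "Paris is the capital of France.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "What is the capital of France?",
                            "answers": [{"text": "Paris", "answer_start": 0}],
                            "is_impossible": False,
                        },
                        {
                            "id": "q2",
                            "question": "Who founded Paris?",
                            "answers": [],
                            "is_impossible": True,
                        },
                    ],
                }
            ],
        }
    ],
}

examples = list(iter_squad_examples(SAMPLE))
```

For QA, the context and question are the input and the answers are the target. For QG, the same triples can be turned around so that the context (and optionally an answer) is the input and the question is the target. The other datasets in the table use their own formats, so this reader would not apply to them unchanged.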