Code for the paper "Answering Ambiguous Questions via Iterative Prompting" (ACL 2023).
Download the Wikipedia text split into 100-word passages from DPR, put it at data/wikipedia/psgs_w100.tsv, and run the following command to build the Wikipedia Redis cache:
python dataset.py
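For reference, here is a minimal sketch of what the cache-building step amounts to, assuming a local Redis server and the standard DPR TSV format (id, text, title); the key naming and batching below are illustrative and not necessarily what dataset.py does.

import csv
import redis

# Illustrative only: dataset.py may use different keys, encodings, or batching.
r = redis.Redis(host="localhost", port=6379, db=0)

with open("data/wikipedia/psgs_w100.tsv", newline="", encoding="utf-8") as f:
    reader = csv.reader(f, delimiter="\t")
    next(reader)  # skip the header row: id, text, title
    pipe = r.pipeline(transaction=False)
    for i, (pid, text, title) in enumerate(reader, 1):
        # Store each 100-word passage under its DPR passage id.
        pipe.set(f"wiki:{pid}", f"{title}\t{text}")
        if i % 10000 == 0:
            pipe.execute()  # flush in batches to keep memory bounded
    pipe.execute()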
Download the NQ and AmbigNQ data from shmsw25/AmbigQA and put them under data/nq and data/ambig, respectively.
Train a dense passage retrieval model using luyug/dense:
python train_dense.py
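As background, the core of dual-encoder retrieval training is a contrastive loss over in-batch negatives. The sketch below shows that loss only; it is not the luyug/dense or train_dense.py implementation, and it assumes question and passage embeddings of the same dimension.

import torch
import torch.nn.functional as F

def in_batch_negative_loss(q_emb: torch.Tensor, p_emb: torch.Tensor) -> torch.Tensor:
    """Contrastive loss with in-batch negatives.

    q_emb: (batch, dim) question embeddings.
    p_emb: (batch, dim) embeddings of each question's gold passage.
    Every other passage in the batch serves as a negative.
    """
    scores = q_emb @ p_emb.T                               # (batch, batch) similarity matrix
    labels = torch.arange(q_emb.size(0), device=q_emb.device)
    return F.cross_entropy(scores, labels)                 # diagonal entries are the positives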
Encode the passages and perform passage retrieval using Faiss:
python inference_dense.py
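For intuition, here is a minimal sketch of the encode-then-retrieve step with a flat inner-product Faiss index; inference_dense.py may use a different index type or sharding, and the embedding file paths and dimension below are placeholders, not paths produced by this repo.

import faiss
import numpy as np

dim = 768   # embedding size of the dense encoder (assumed)
k = 100     # number of passages to retrieve per question

# Placeholders: assume these arrays were produced by the trained encoders.
passage_embs = np.load("out/ambig/passage_embs.npy").astype("float32")    # (num_passages, dim)
question_embs = np.load("out/ambig/question_embs.npy").astype("float32")  # (num_questions, dim)

index = faiss.IndexFlatIP(dim)   # exact inner-product search
index.add(passage_embs)          # add all passage vectors
scores, passage_ids = index.search(question_embs, k)  # top-100 passages per question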
This step produces QA data that includes the 100 retrieved passages for each question, e.g., data/ambig/dev.json.
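As a quick sanity check on the retrieval output, something like the snippet below can be used; the field names are an assumption based on the common DPR/FiD format (question, answers, ctxs), not a specification of this repo's schema.

import json

with open("data/ambig/dev.json") as f:
    data = json.load(f)

example = data[0]
print(example["question"])
print(example["answers"])    # one or more gold answers per question (assumed field name)
print(len(example["ctxs"]))  # should be 100 retrieved passages (assumed field name)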
Download the pre-trained checkpoint from facebookresearch/FiD.
Train the prompting model and QA model on multi-answer QA data:
accelerate launch train.py --data_path data/ambig/train.json --save_path out/ambig/model --do_train true --do_eval false
Evaluate the model:
accelerate launch train.py --data_path data/ambig/dev.json --checkpoint out/ambig/model/9.pt --do_train false --do_eval true
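For context, multi-answer QA is commonly scored with an F1 between the predicted and gold answer sets. The sketch below shows that generic metric only; the evaluation in train.py may differ, e.g., in answer normalization or in how AmbigQA's answer clusters are handled.

import re
import string

def normalize(text: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace (SQuAD-style)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def answer_set_f1(predictions: list[str], golds: list[str]) -> float:
    """F1 between the sets of predicted and gold answer strings."""
    pred = {normalize(p) for p in predictions}
    gold = {normalize(g) for g in golds}
    if not pred or not gold:
        return float(pred == gold)
    tp = len(pred & gold)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(pred), tp / len(gold)
    return 2 * precision * recall / (precision + recall)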
Train a span selection baseline using the scripts in shmsw25/AmbigQA and predict answers on each of the 100 retrieved passages. Detailed scripts and the produced datasets are coming soon.
@inproceedings{Sun2023IsCG,
title={Answering Ambiguous Questions via Iterative Prompting},
author={Weiwei Sun and Hengyi Cai and Hongshen Chen and Pengjie Ren and Zhumin Chen and Maarten de Rijke and Zhaochun Ren},
booktitle={ACL},
year={2023},
}