Neural image captioning implementation with Keras based on Show and Tell.
To train from scratch using the IAPR2012 dataset:
- Download the IAPR2012 dataset from here
- Move the downloaded file to the datasets/IAPR_2012/ directory
- Untar the file:
tar xvf iaprtc12.tgz
- Edit the file train.py by changing the flag extract_image_features to True.
- Download the image features:
  - Download the extracted image features from here
  - Move them to the datasets/IAPR_2012/preprocessed_data/ directory
- Start training by running the script:
python3 train.py
- Extracting the image features might take 1-2 hours on a GTX 860M.
- Training for 50 epochs should give you reasonable results.
- I will provide models pre-trained on COCO soon (hopefully)
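
The dataset preparation steps above can also be scripted. The following is only a minimal sketch, not part of this repository: the helper names `extract_archive` and `has_precomputed_features` are hypothetical, and it assumes the archive is named iaprtc12.tgz and already sits in datasets/IAPR_2012/ as described in the steps.

```python
import tarfile
from pathlib import Path

# Paths taken from the instructions above; adjust them if your checkout differs.
DATASET_DIR = Path("datasets/IAPR_2012")
ARCHIVE_NAME = "iaprtc12.tgz"
FEATURES_DIR = DATASET_DIR / "preprocessed_data"


def extract_archive(dataset_dir=DATASET_DIR, archive_name=ARCHIVE_NAME):
    """Unpack the dataset archive in place (hypothetical helper, mirrors `tar xvf`)."""
    dataset_dir = Path(dataset_dir)
    archive_path = dataset_dir / archive_name
    if not archive_path.is_file():
        raise FileNotFoundError(f"Expected the downloaded archive at {archive_path}")
    # tarfile detects the gzip compression of a .tgz file automatically.
    with tarfile.open(archive_path) as archive:
        archive.extractall(path=dataset_dir)
    return dataset_dir


def has_precomputed_features(features_dir=FEATURES_DIR):
    """Return True if the features directory exists and is not empty (hypothetical helper)."""
    features_dir = Path(features_dir)
    return features_dir.is_dir() and any(features_dir.iterdir())
```

Calling `extract_archive()` from the repository root unpacks the dataset. If `has_precomputed_features()` then returns False, the downloaded features are not in place yet, so either move them there or set extract_image_features to True in train.py before running it.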