The InPhO Corpus Builder matches a plaintext bibliography to volumes in the HathiTrust.
-
Install anystyle.io and Ruby dependencies by following the directions:
sudo apt-get install ruby-dev
gem install --user-install anystyle
Note: you might need to add the ~/.gem/ruby/2.3.0/bin dir to your PATH. The gem install will tell you.
-
Install rython with pip:
pip install rython
-
Use parse.py to parse a file to the JSON format for the browser:
python parse.py FILENAME
-
Launch the Corpus Builder:
python server.py -p 9024
-
Open the Corpus Builder in a browser: http://localhost:9024/
-
When finished, use extractids.py to create a file with one HathiTrust ID per line for use with corpus download tools:
python extractids.py www/out.json
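
Before handing the ID list to a download tool, it can be worth checking it for blank lines and duplicates. The following is a minimal sketch, not part of the Corpus Builder itself: it assumes the output of extractids.py has been saved to a text file with one HathiTrust ID per line, and the function name read_htids is hypothetical.

```python
from pathlib import Path


def read_htids(path):
    """Return the unique, non-empty IDs from a one-ID-per-line file.

    Assumption: `path` points to saved output of extractids.py.
    The original order of first occurrences is preserved.
    """
    seen = set()
    ids = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        htid = line.strip()  # drop surrounding whitespace and line endings
        if htid and htid not in seen:  # skip blank lines and repeats
            seen.add(htid)
            ids.append(htid)
    return ids
```

Calling read_htids("ids.txt") (a placeholder file name) returns a plain list of strings, so the number of matched volumes is len(read_htids("ids.txt")).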