Review Rob's documentation #80
Comments
Some suggestions for the homepage README:

This software library automatically discovers papers and datasets published by HHMI Janelia scientists and stores them in a MongoDB database. The automated scripts, which are run on a weekly basis, also make "educated guesses" about metadata that are of strategic interest to Janelia, such as labs, teams, and employees who contributed to the work. Utility scripts allow the librarian or database administrator to curate these metadata in a semi-automated fashion. A Flask-based application provides a user interface, visualizations, and a REST API.

The two most important collections in the DIS database are the

utility: utility programs meant to be run interactively on the command line, for querying and manipulating database records

Common command line parameters
Comments on documentation in sync/:
For documentation in utility/:
weekly_pubs.py

A wrapper script for the whole weekly curation pipeline. For a DOI or batch of DOIs, it runs update_dois.py, name_match.py, update_tags.py, and get_citation.py, in that order. It will not re-add DOIs that are already in the database, but it will still run the rest of the scripts on those DOIs.

Example usage:

If you want to simply add a DOI to the database without running the rest of the pipeline, run with the --sync_only flag. As always, you must add the --write flag for the change to persist in the database. It is better to add DOIs to the database this way, rather than running sync/bin/update_dois.py directly, because this script performs a couple of additional quality checks on the DOIs.

add_newsletter.py

Add, change, or remove a date in a DOI's jrc_newsletter field. Importantly, only papers that have a jrc_newsletter field will go on janelia.org. Also, these papers' jrc_author field won't be automatically updated (though it can still be manually updated). Usually, you can add a newsletter date during the weekly curation process, when update_tags.py prompts you to set the newsletter date to today. Sometimes, though, you'll need to set a newsletter date to a date that's not today. The actual date itself isn't used in our automated systems, so it won't be catastrophic if you set it to a silly date. However, it's nice to set jrc_newsletter to the same date for all the papers that went into a particular newsletter issue.

add_preprint.py

Add a preprint relationship for a particular preprint-article pair. It's not unusual for a preprint relationship to be missed both by Crossref and by the DIS system's "educated guessing". I always check Google for preprints before putting a journal article into the newsletter. If you discover a preprint or preprint relationship that's not in our system, use this script to add the relationship and/or the preprint DOI. The relationship is stored in the jrc_preprint field, which is simply an array of DOIs. For a journal article, jrc_preprint will contain the preprint DOI(s), and for a preprint, it will contain the journal article DOI(s). If the preprint DOI is not in the database, the script will prompt you to add it. This script cannot be used to remove preprint relationships.

set_alumni.py

Add an alumni tag to, or remove an alumni tag from, a Janelian's record in the orcid collection. Alumni are not included in jrc_author, so if an alum has a profile on janelia.org, a paper they co-authored won't be added to that profile. The alumni field is automatically created and set to true when an employeeId that we have in the orcid collection is no longer in the People system.

name_match.py

Interactively curate the list of Janelia authors for one or more DOIs. This list is stored in the DOI metadata under the jrc_author field. Because of the way the database is set up, the list of Janelia authors on the browser interface will not reflect your changes to jrc_author. Rest assured, though, your changes will be stored in the database, as long as you use the --write flag. The new Janelia.org uses jrc_author to determine Janelia authors for a paper.

update_tags.py

Modify the tags on one or more DOIs, and optionally add a newsletter date. 'Tags' is our jargon for labels representing labs, project teams, or support teams. Tags are derived from the Janelia authors' HHMI People profiles. They include HHMI supervisory organization codes ('supOrg codes'), as well as supOrg names and cost center descriptions. It is EXTREMELY IMPORTANT that you tag each DOI with ALL applicable tags. So, for example, if you encounter a DOI with possible tags "Srinivas Turaga Lab" and "Srini Turaga Lab", select both. If a postdoc or research assistant is an author but their group leader is not, do not tag the DOI with that lab's tag(s).

Example usage:

get_citation.py

Print one or more article citations in the Janelia newsletter format. The DOIs must be in the database already.

Typical usage:
Sometimes, you'll want a citation for a DOI that can't be added to the database because it's not in Crossref. (This can happen with bioRxiv.) In these cases, you can add the DOI to EndNote, export the citation in the "Janelia Science News" format, and feed the resulting text file to this script to produce a usable citation. (EndNote won't let you export citations without journal names.)
Run the script like so:
In particular, make sure the section on data sources is comprehensive. Also get some installation instructions up on GitHub.
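It might also help to illustrate the reciprocal jrc_preprint relationship described under add_preprint.py above. Here is a minimal sketch, not the script's actual implementation: it uses a plain dict as a stand-in for the database, and the function name `link_preprint`, the record layout, and the example DOIs are all hypothetical. It only shows the idea that each side of the pair lists the other's DOI in its jrc_preprint array.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for the DOI collection: DOI -> record.
Records = Dict[str, dict]


def link_preprint(records: Records, article_doi: str, preprint_doi: str) -> None:
    """Record a preprint relationship on both records (illustrative only)."""
    # Check both DOIs first so a missing record never leaves a one-sided link.
    for doi in (article_doi, preprint_doi):
        if doi not in records:
            raise KeyError(f"{doi} is not in the database; add it first")

    # The article lists the preprint, and the preprint lists the article.
    for doi, other in ((article_doi, preprint_doi), (preprint_doi, article_doi)):
        related: List[str] = records[doi].setdefault("jrc_preprint", [])
        if other not in related:  # keep the array free of duplicates
            related.append(other)


if __name__ == "__main__":
    # Made-up DOIs, for illustration only.
    records: Records = {"10.1234/article": {}, "10.1234/preprint": {}}
    link_preprint(records, "10.1234/article", "10.1234/preprint")
    print(records)
```

Checking both DOIs before writing anything mirrors the documented behavior that a preprint missing from the database has to be added before the relationship is stored.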