The pipeline has several external data dependencies (databases for kraken2, mash sketches, etc.). There should be a way to verify whether those databases are in an expected state, or whether they have changed. For example, two 'standard' kraken2 databases built on different dates may have different contents, because the contents of RefSeq change over time.
We may be able to use some sort of pre-computed checksum to verify database integrity. We may not want to verify on every pipeline run, because calculating the hashes can be slow. Maybe provide a separate 'database verification' script that could be run periodically, or once before a set of pipeline runs is submitted.
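As a rough sketch of what such a verification script could look like (not a proposal for the final implementation): record a SHA-256 checksum for every file in a database directory once, then compare against that record later. The function names `hash_file`, `build_manifest` and `verify_database` are hypothetical, and the choice of SHA-256 and of a per-file manifest are assumptions rather than anything the pipeline currently does.

```python
import hashlib
from pathlib import Path


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(db_dir):
    """Map each file path (relative to db_dir, POSIX-style) to its checksum."""
    db_dir = Path(db_dir)
    return {
        p.relative_to(db_dir).as_posix(): hash_file(p)
        for p in sorted(db_dir.rglob("*"))
        if p.is_file()
    }


def verify_database(db_dir, manifest):
    """Compare db_dir against a previously built manifest.

    Returns a list of problem descriptions; an empty list means the
    directory matches the manifest exactly.
    """
    current = build_manifest(db_dir)
    problems = []
    for name, expected in manifest.items():
        if name not in current:
            problems.append(f"missing: {name}")
        elif current[name] != expected:
            problems.append(f"changed: {name}")
    for name in current:
        if name not in manifest:
            problems.append(f"unexpected: {name}")
    return problems
```

The manifest would be built once when a database is installed (for example, saved as JSON somewhere outside the database directory, so it is not itself reported as an unexpected file) and then checked with `verify_database` before a batch of runs. Hashing still reads every byte of the database, so this keeps the slow step out of the per-run path rather than making it faster.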
Take a peek at Kive http://cfe-lab.github.io/Kive/ - and ask Don Kirkby about what they did in the hashing department. Kive had great foresight in hashing all inputs and using that to be able to stop/continue jobs and know which parts had to be rerun. Not sure if that included reference databases but I wouldn't be surprised if so. They may have some quick hashing tips.