Disabled distance calculation, improved error handling & integration tests
Pre-release
Adjusted behaviour for upstream/downstream gene variants
When a distance to the gene is reported for upstream/downstream gene variants, a single variant can produce multiple output records if the gene it affects has multiple transcription start or end sites. Because this functionality is not currently intended to be used, it is disabled by default and can be enabled via the `--report-distance` flag.
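A minimal sketch of why omitting the distance collapses the duplicates, and of how an opt-in flag like this is commonly wired up with `argparse`. Only the `--report-distance` flag name comes from these notes; `build_parser`, `format_record`, `dedupe`, and the record layout are hypothetical and are not the pipeline's actual code.

```python
import argparse


def build_parser():
    # Hypothetical parser: only the --report-distance flag name is taken from the notes.
    parser = argparse.ArgumentParser(description="Illustrative sketch only")
    parser.add_argument(
        "--report-distance",
        action="store_true",  # off unless the flag is passed explicitly
        help="Include the variant-to-gene distance in the output",
    )
    return parser


def format_record(variant_id, gene_id, consequence, distance, report_distance):
    # Assumed tab-separated layout; the distance column is appended only on request.
    fields = [variant_id, gene_id, consequence]
    if report_distance:
        fields.append(str(distance))
    return "\t".join(fields)


def dedupe(records):
    # Drop exact duplicates while keeping the original order.
    return list(dict.fromkeys(records))


if __name__ == "__main__":
    args = build_parser().parse_args()
    # One variant near a gene with two transcription start sites (made-up distances).
    records = [
        format_record("variant_1", "gene_A", "upstream_gene_variant", d, args.report_distance)
        for d in (100, 2500)
    ]
    print("\n".join(dedupe(records)))
```

Without the flag, both records are identical and collapse into one; with it, the differing distances keep them apart.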
Improved error handling
- The `@retry` decorator which handles VEP errors will now catch all possible exception types, not just `requests.HTTPError`, because more things can go wrong (e.g. protocol errors, or the server responds OK but the data is malformed).
- The wrapper script now sets multiple options (`set -euxo pipefail`) which cause the script to fail if any of its components fails.
- If at least one VEP worker fails (having exhausted its retry attempts), the entire workflow is stopped immediately and a failure is raised.
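The first point can be sketched as follows. This is a minimal illustration of a retry decorator that catches any exception rather than a single HTTP error class; the parameter names (`tries`, `delay`, `backoff`) and their defaults are assumptions and may differ from the project's own `@retry`.

```python
import functools
import time


def retry(tries=3, delay=0.0, backoff=1.0):
    """Retry the wrapped function up to `tries` times on any Exception.

    Hypothetical signature, for illustration only.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, tries + 1):
                try:
                    return func(*args, **kwargs)
                # Catching Exception covers HTTP errors, protocol errors and
                # parsing failures alike. KeyboardInterrupt and SystemExit
                # derive from BaseException, so they still propagate.
                except Exception:
                    if attempt == tries:
                        raise  # retries exhausted: surface the last error
                    time.sleep(wait)
                    wait *= backoff
        return wrapper
    return decorator
```

Re-raising after the last attempt is what lets a caller notice that a worker has failed for good and stop the rest of the workflow.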
Integration tests
The simple integration test runs a toy dataset of 2,000 variants (originally randomly sampled from ClinVar) through VEP and compares the annotation result with the expected one. This means the test will break whenever VEP updates; this is intentional, because it lets us compare the results, check that the changes make sense, and update the test data.
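A comparison of this kind could look roughly like the sketch below. The helper name `compare_annotations` is hypothetical, and sorting the lines before comparing is an assumption made here so that record order does not matter; the actual test may compare the files differently.

```python
import difflib


def compare_annotations(actual_lines, expected_lines):
    """Return a unified diff between expected and actual annotation lines.

    An empty list means the two outputs match. Lines are sorted first
    (an assumption of this sketch) to make the check order-independent.
    """
    actual = sorted(line.rstrip("\n") for line in actual_lines)
    expected = sorted(line.rstrip("\n") for line in expected_lines)
    return list(
        difflib.unified_diff(
            expected, actual, fromfile="expected", tofile="actual", lineterm=""
        )
    )
```

Returning the diff rather than a plain boolean makes it easier to see what changed after a VEP update and to decide whether the expected data should be refreshed.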