Update VEP to release 101 (August 2020). (#11)
* Update VEP to release 101 (August 2020).

* Small updates from 91->101; code comment path change.
mbookman authored Feb 19, 2021
1 parent 7167931 commit 935a7d3
Showing 3 changed files with 17 additions and 10 deletions.
6 changes: 4 additions & 2 deletions batch/vep/Dockerfile
@@ -17,7 +17,7 @@
#
# Example:
#
-# docker build . --build-arg ENSEMBL_RELEASE=91 --tag vep_91
+# docker build . --build-arg ENSEMBL_RELEASE=101 --tag vep_101
#
# To run vep through containers created by this file, the VEP cache has to be
# downloaded separately and made available through command line arguments.
@@ -27,13 +27,15 @@
# retry logic.
FROM gcr.io/cloud-genomics-pipelines/io

-ARG ENSEMBL_RELEASE=91
+ARG ENSEMBL_RELEASE=101
ARG VEP_BASE=/opt/variant_effect_predictor

RUN apt-get -y update && apt-get install -y procps\
build-essential \
git \
libarchive-zip-perl \
+libbz2-dev \
+liblzma-dev \
libdbd-mysql-perl \
libdbi-perl \
libfile-copy-recursive-perl \
10 changes: 5 additions & 5 deletions batch/vep/README.md
@@ -27,17 +27,17 @@ Inside this directory, run:

This will download the source from
[VEP GitHub repo](https://github.com/Ensembl/ensembl-vep) and build VEP from
-that source. By default, it uses version 91 of VEP. This can be changed by
+that source. By default, it uses version 101 of VEP. This can be changed by
`ENSEMBL_RELEASE` build argument, e.g.,

`docker build . -t [IMAGE_TAG] --build-arg ENSEMBL_RELEASE=90`

Let's say we want to push this image to the
[Container Registry](https://cloud.google.com/container-registry/) of
`my-project` on Google Cloud, so we can pick `[IMAGE_TAG]` as
-`gcr.io/my-project/vep:91`. Then push this image by:
+`gcr.io/my-project/vep:101`. Then push this image by:

-`gcloud docker -- push gcr.io/my-project/vep:91`
+`gcloud docker -- push gcr.io/my-project/vep:101`

**TODO**: Add `cloudbuild.yaml` files for both easy push and integration test.

@@ -48,7 +48,7 @@ download and integrate different pieces of the VEP database or cache files.
Then from within that directory run the
[`build_vep_cache.sh`](build_vep_cache.sh) script. By default this script
creates the database for human (homo_sapiens), reference sequence `GRCh38`,
-and release 91 of VEP. These values can be overwritten by the following
+and release 101 of VEP. These values can be overwritten by the following
environment variables (note you should use the same VEP release
that you used for creating VEP docker image above):

@@ -74,7 +74,7 @@ gcloud alpha genomics pipelines run \
--inputs VCF_INFO_FILED=CSQ_RERUN
```

-Note the `vep_cache_homo_sapiens_GRCh38_91.tar.gz` file that is referenced in
+Note the `vep_cache_homo_sapiens_GRCh38_101.tar.gz` file that is referenced in
the sample `yaml` file, is the output file that you get from the above database
creation step.
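
The archive name in the note above appears to follow from the build script's defaults. A minimal sketch, assuming the output is named `vep_cache_<species>_<assembly>_<release>.tar.gz` (a pattern inferred from the single file name shown here, not confirmed by the parts of the script visible in this diff); `cache_archive_name` is a hypothetical helper, while the three environment variables and their defaults are the ones read by `build_vep_cache.sh`:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of this commit): derives the archive name from
# the same environment variables and defaults that build_vep_cache.sh reads.
# The name pattern itself is an assumption based on the README example.
cache_archive_name() {
  local release="${ENSEMBL_RELEASE:-101}"
  local species="${VEP_SPECIES:-homo_sapiens}"
  local assembly="${GENOME_ASSEMBLY:-GRCh38}"
  echo "vep_cache_${species}_${assembly}_${release}.tar.gz"
}

cache_archive_name
```

Setting `ENSEMBL_RELEASE=91` before the call should yield the pre-update name, which is a quick way to check that the `yaml` file and the Docker image refer to the same release.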

11 changes: 8 additions & 3 deletions batch/vep/build_vep_cache.sh
@@ -26,14 +26,14 @@
# Capital letter variables refer to environment variables that can be set from
# outside. Internal variables have small letters. All environment variables
# have a default value as well to set up cache for homo_sapiens with reference
-# GRCh38 and release 91 of VEP.
+# GRCh38 and release 101 of VEP.
#
# More details on cache files can be found here:
# https://ensembl.org/info/docs/tools/vep/script/vep_cache.html

set -euo pipefail

-readonly release="${ENSEMBL_RELEASE:-91}"
+readonly release="${ENSEMBL_RELEASE:-101}"
readonly species="${VEP_SPECIES:-homo_sapiens}"
readonly assembly="${GENOME_ASSEMBLY:-GRCh38}"
readonly work_dir="vep_cache"
@@ -81,7 +81,12 @@ else
curl -O "${remote_fasta}.gzi"
fi

-readonly remote_cache="${ftp_base}/variation/VEP/${cache_file}"
+# The path naming convention changed from "VEP" to "vep" after build 95.
+if (( release <= 95 )); then
+  readonly remote_cache="${ftp_base}/variation/VEP/${cache_file}"
+else
+  readonly remote_cache="${ftp_base}/variation/vep/${cache_file}"
+fi
echo "Downloading ${remote_cache} ..."
curl -O "${remote_cache}"
echo "Decompressing cache files ..."
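
The release check in this hunk can be exercised in isolation, which makes the cutoff easy to verify before a long download starts. A minimal sketch, assuming a hypothetical helper name `vep_cache_dir` (not part of this commit); the cutoff at release 95 is taken from the comment in the diff and has not been independently checked against the Ensembl FTP layout:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of this commit): prints the FTP subdirectory
# that holds the VEP cache for a given Ensembl release number. Per the comment
# in the diff, releases up to and including 95 use "VEP"; later ones use "vep".
vep_cache_dir() {
  local release="$1"
  if (( release <= 95 )); then
    echo "variation/VEP"
  else
    echo "variation/vep"
  fi
}

vep_cache_dir 91    # prints variation/VEP
vep_cache_dir 101   # prints variation/vep
```

The arithmetic test `(( ... ))` expects an integer, so a non-numeric `ENSEMBL_RELEASE` would need to be validated before it reaches a check like this one.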