SRA tools docker

skripche edited this page Dec 15, 2020 · 6 revisions

Documentation and Help for the NCBI SRA Toolkit Docker image

The NCBI SRA Toolkit is now available as a maintained Docker image, ncbi/sra-tools.

Example usage:

Simple fasterq-dump

% docker run -t --rm -v $PWD:/output:rw -w /output ncbi/sra-tools fasterq-dump -e 2 -p SRR10985476
Unable to find image 'ncbi/sra-tools:latest' locally
latest: Pulling from ncbi/sra-tools
c9b1b535fdd9: Already exists 
0a6856f8fd06: Pull complete 
2d9bc7db21a2: Pull complete 
3de524257044: Pull complete 
Digest: sha256:631578b15625cc5390928772f1bf945847ce2981a81a95042729a47579396099
Status: Downloaded newer image for ncbi/sra-tools:latest
lookup :|-------------------------------------------------- 100%   
merge  : 16319508
join   :|-------------------------------------------------- 100%   
concat :|-------------------------------------------------- 100%   
spots read      : 14,965,183
reads read      : 14,965,183
reads written   : 14,965,183

The examples include two suggested options:

  • mounting the current host directory as a writable volume: -v $PWD:/output:rw
  • setting the container's working directory to that volume: -w /output

Most tools write to the current working directory unless told otherwise, and you probably do not want them writing into the container's file system, which is discarded when a container started with --rm exits. So, please set the working directory to a host volume.
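To avoid retyping these options, they can be wrapped in a small shell function. The following is a minimal sketch, not part of the SRA Toolkit: the function name sra_tools and the SRA_DRY_RUN variable are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the toolkit): run any tool from the
# ncbi/sra-tools image with the two suggested options applied.
sra_tools() {
    # If SRA_DRY_RUN is set and non-empty, the docker command is printed
    # instead of executed, which is handy for checking the options.
    ${SRA_DRY_RUN:+echo} docker run -t --rm \
        -v "$PWD":/output:rw \
        -w /output \
        ncbi/sra-tools "$@"
}

# Example (same command as above):
#   sra_tools fasterq-dump -e 2 -p SRR10985476
```

Because the current directory is mounted at /output and used as the working directory, the tool's output files should appear in the directory the function was called from.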

prefetch + fasterq-dump

% docker run -t --rm -v $PWD:/output:rw -w /output ncbi/sra-tools prefetch SRR10985476

2020-06-23T18:07:35 prefetch.2.10.8: 1) Downloading 'SRR10985476'...
2020-06-23T18:07:35 prefetch.2.10.8:  Downloading via HTTPS...
2020-06-23T18:07:45 prefetch.2.10.8:  HTTPS download succeed
2020-06-23T18:07:45 prefetch.2.10.8:  'SRR10985476' is valid
2020-06-23T18:07:45 prefetch.2.10.8: 1) 'SRR10985476' was downloaded successfully
2020-06-23T18:08:27 prefetch.2.10.8: 'SRR10985476' has 454 unresolved dependencies
2020-06-23T18:08:27 prefetch.2.10.8: 2) Downloading 'ncbi-acc:NC_000001.11?vdb-ctx=refseq'...
2020-06-23T18:08:27 prefetch.2.10.8:  Downloading via HTTPS...
2020-06-23T18:08:33 prefetch.2.10.8:  HTTPS download succeed
2020-06-23T18:08:33 prefetch.2.10.8: 2) 'ncbi-acc:NC_000001.11?vdb-ctx=refseq' was downloaded successfully

...

2020-06-23T18:10:25 prefetch.2.10.8: 455) Downloading 'ncbi-acc:NW_004504305.1?vdb-ctx=refseq'...
2020-06-23T18:10:25 prefetch.2.10.8:  Downloading via HTTPS...
2020-06-23T18:10:25 prefetch.2.10.8:  HTTPS download succeed
2020-06-23T18:10:25 prefetch.2.10.8: 455) 'ncbi-acc:NW_004504305.1?vdb-ctx=refseq' was downloaded successfully
% docker run -t --rm -v $PWD:/output:rw -w /output ncbi/sra-tools fasterq-dump -p SRR10985476
lookup :|-------------------------------------------------- 100%   
merge  : 17976103
join   :|-------------------------------------------------- 100%   
concat :|-------------------------------------------------- 100%   
spots read      : 14,965,183
reads read      : 14,965,183
reads written   : 14,965,183

Please note that both commands use the same host volume as the working directory. This is what allows fasterq-dump to find the files that prefetch retrieved.
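The two steps can be chained in one script so that they are guaranteed to share the same host volume. This is a sketch under stated assumptions: fetch_and_dump and SRA_DRY_RUN are hypothetical names, and the docker options are the ones used in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: fetch_and_dump ACCESSION
# Runs prefetch and then fasterq-dump in two separate containers that
# mount the same host directory, so the second step sees the first
# step's downloads.
fetch_and_dump() {
    acc=${1:?usage: fetch_and_dump ACCESSION}
    # If SRA_DRY_RUN is set and non-empty, commands are printed, not run.
    # fasterq-dump only starts if prefetch exited successfully (&&).
    ${SRA_DRY_RUN:+echo} docker run -t --rm -v "$PWD":/output:rw -w /output \
        ncbi/sra-tools prefetch "$acc" &&
    ${SRA_DRY_RUN:+echo} docker run -t --rm -v "$PWD":/output:rw -w /output \
        ncbi/sra-tools fasterq-dump -p "$acc"
}

# Example:
#   fetch_and_dump SRR10985476
```

Both calls must be made from the same host directory; changing directory between them would give fasterq-dump a different /output than the one prefetch wrote to.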

Known issues and work-arounds:

TLS failures

We have seen TLS errors when running on AWS, like these:

2020-06-19T15:50:53 prefetch.2.10.7:  Downloading via HTTPS...
2020-06-19T15:50:53 prefetch.2.10.7 sys: mbedtls_ssl_get_verify_result returned 0x8 (  !! The certificate is not correctly signed by the trusted CA  )
2020-06-19T15:50:53 prefetch.2.10.7 int: connection failed while opening file within cryptographic module - Cannot KClientHttpRequestGET: /scratch/SRR5709848/SRR5709848.sra
2020-06-19T15:50:53 prefetch.2.10.7:  HTTPS download failed

The work-around is to make the host's certificates visible inside the container by mounting them read-only:

docker run -v /etc/pki:/etc/pki:ro -v /etc/ssl:/etc/ssl:ro ...
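Not every host has both /etc/pki and /etc/ssl, and bind-mounting a missing path with -v typically makes Docker create it as an empty directory on the host. The sketch below mounts only the certificate directories that actually exist. It is an illustration, not an official work-around: sra_tools_tls, SRA_CERT_DIRS, and SRA_DRY_RUN are hypothetical names, and the output volume options are carried over from the earlier examples.

```shell
#!/bin/sh
# Hypothetical helper: sra_tools_tls TOOL [ARGS...]
# Adds read-only mounts for the host's certificate directories, skipping
# any that are absent on this host.
sra_tools_tls() {
    cert_mounts=""
    # SRA_CERT_DIRS (hypothetical) overrides the default directory list.
    for dir in ${SRA_CERT_DIRS:-/etc/pki /etc/ssl}; do
        if [ -d "$dir" ]; then
            cert_mounts="$cert_mounts -v $dir:$dir:ro"
        fi
    done
    # $cert_mounts is deliberately unquoted so that it splits into separate
    # arguments; this assumes the directory paths contain no whitespace.
    # If SRA_DRY_RUN is set and non-empty, the command is printed, not run.
    ${SRA_DRY_RUN:+echo} docker run -t --rm $cert_mounts \
        -v "$PWD":/output:rw -w /output \
        ncbi/sra-tools "$@"
}

# Example:
#   sra_tools_tls prefetch SRR10985476
```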