Tutorial for general UNIX computers without docker

Download cromwell.

$ cd
$ wget https://github.com/broadinstitute/cromwell/releases/download/34/cromwell-34.jar
$ chmod +rx cromwell-34.jar

Git clone this pipeline and move into it.

$ cd
$ git clone https://github.com/ENCODE-DCC/chip-seq-pipeline2
$ cd chip-seq-pipeline2

Download a SUBSAMPLED paired-end sample of ENCSR936XTK.

$ wget https://storage.googleapis.com/encode-pipeline-test-samples/encode-chip-seq-pipeline/ENCSR936XTK/ENCSR936XTK_fastq_subsampled.tar
$ tar xvf ENCSR936XTK_fastq_subsampled.tar

Download pre-built chr19/chrM-only genome database for hg38.

$ wget https://storage.googleapis.com/encode-pipeline-genome-data/test_genome_database_hg38_chr19_chrM_chip.tar
$ tar xvf test_genome_database_hg38_chr19_chrM_chip.tar

Install Conda. Skip this if you already have equivalent Conda alternatives (Anaconda Python). Download and run the installer. Agree to the license term by typing yes. It will ask you about the installation location. On Stanford clusters (Sherlock and SCG4), we recommend to install it outside of your $HOME directory since its filesystem is slow and has very limited space. At the end of the installation, choose yes to add Miniconda's binary to $PATH in your BASH startup script.
```
$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ bash Miniconda3-latest-Linux-x86_64.sh
```

Install Conda dependencies.

$ bash conda/uninstall_dependencies.sh  # to remove any existing pipeline env
$ bash conda/install_dependencies.sh

Run a pipeline for the test sample.

$ source activate encode-chip-seq-pipeline # IMPORTANT!
$ INPUT=examples/local/ENCSR936XTK_subsampled_chr19_only.json
$ PIPELINE_METADATA=metadata.json
$ java -jar -Dconfig.file=backends/backend.conf cromwell-34.jar run chip.wdl -i ${INPUT} -m ${PIPELINE_METADATA}

It will take about 6 hours. You will be able to find all outputs on cromwell-executions/chip/[RANDOM_HASH_STRING]/. See output directory structure for details.
See full specification for input JSON file.
You can resume a failed pipeline from where it left off by using PIPELINE_METADATA(metadata.json) file. This file is created for each pipeline run. See here for details. Once you get a new input JSON file from the resumer, use it INPUT=resume.[FAILED_WORKFLOW_ID].json instead of INPUT=examples/local/ENCSR936XTK_subsampled_chr19_only.json.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

tutorial_local_conda.md

tutorial_local_conda.md

Tutorial for general UNIX computers without docker

Files

tutorial_local_conda.md

Latest commit

History

tutorial_local_conda.md

File metadata and controls

Tutorial for general UNIX computers without docker