How to modify an existing WDL file to run on Cromwell on Azure

For any pipeline, you can create a WDL file that calls your tools in Docker containers. Please note that, for security reasons, Cromwell on Azure only supports tasks that define a Docker container.

To run a WDL file, create or modify the workflow so that each task defines the following runtime attributes, which are compliant with TES (Task Execution Schemas):

runtime {
    cpu: 1
    memory: "2 GB"
    disk: "10 GB"
    docker: "<your Docker image>"
    maxRetries: 0
}

Ensure that the memory and disk attributes include units (note: use the singular form disk, NOT disks). Cromwell supports the following units:

KB - "KB", "K", "KiB", "Ki"
MB - "MB", "M", "MiB", "Mi"
GB - "GB", "G", "GiB", "Gi"
TB - "TB", "T", "TiB", "Ti"

The preemptible attribute is a boolean (not an integer). You can set preemptible to true or false for each task. When it is set to true, Cromwell on Azure uses a low-priority Azure Batch VM to run the task.

bootDiskSizeGb and zones attributes are not supported by the TES backend.
Each of these runtime attributes is specific to your workflow and to the tasks within it. The default values for resource requirements are as set above.
Learn more about Cromwell's runtime attributes here.
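
As a minimal sketch of how these attributes fit into a complete task, consider the example below. The task name `count_lines`, its command, and the `ubuntu:22.04` image are assumptions chosen for illustration, not part of Cromwell on Azure itself.

```wdl
version 1.0

# Hypothetical task for illustration only.
task count_lines {
  input {
    File input_file
  }

  command <<<
    wc -l < ~{input_file} > line_count.txt
  >>>

  output {
    Int line_count = read_int("line_count.txt")
  }

  runtime {
    cpu: 1
    memory: "2 GB"          # units are required
    disk: "10 GB"           # singular "disk", with units
    docker: "ubuntu:22.04"  # assumed image; every task must define one
    maxRetries: 0
    preemptible: true       # boolean: true requests a low-priority VM
  }
}
```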

Runtime attributes comparison with a GCP WDL file

The left panel shows a WDL file created for GCP, whereas the right panel shows the modified WDL that runs on Azure.

[Image: Runtime Attributes]
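
For readers without access to the image, the kind of change described in this document can be sketched in text form. The two task names and all values below are hypothetical. The first task uses GCP-style attributes and the second shows a possible Azure equivalent.

```wdl
version 1.0

# Hypothetical GCP-style task, shown only for comparison.
task say_hello_gcp {
  command <<<
    echo "hello"
  >>>

  runtime {
    docker: "ubuntu:22.04"
    cpu: 1
    memory: "2 GB"
    disks: "local-disk 10 HDD"  # GCP form: plural "disks"
    bootDiskSizeGb: 10          # not supported by the TES backend
    zones: "us-central1-a"      # not supported by the TES backend
    preemptible: 3              # GCP form: integer
  }
}

# The same task modified for Cromwell on Azure.
task say_hello_azure {
  command <<<
    echo "hello"
  >>>

  runtime {
    docker: "ubuntu:22.04"
    cpu: 1
    memory: "2 GB"
    disk: "10 GB"      # singular "disk", with units
    preemptible: true  # boolean instead of integer
  }
}
```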

Using maxRetries to replace the preemptible attribute

In a GCP WDL, preemptible is an integer that specifies the number of retries when using the flag. In Cromwell on Azure, preemptible is a boolean, so if a task uses preemptible but does not set maxRetries, consider adding maxRetries to keep the retry functionality. Remember that for each task in a workflow, you can use either a low-priority VM in Azure Batch (the default configuration) or a dedicated VM by setting preemptible to true or false, respectively.

[Image: Preemptible Attribute]
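
A sketch of this conversion for a single task might look like the following. The task itself is hypothetical, and carrying the GCP retry count of 3 over to maxRetries is an assumption about what the workflow author wants.

```wdl
version 1.0

# Hypothetical task; only the runtime section matters here.
task retry_example {
  command <<<
    echo "work that may be interrupted"
  >>>

  runtime {
    docker: "ubuntu:22.04"
    memory: "2 GB"
    disk: "10 GB"
    # The GCP version of this task had: preemptible: 3
    preemptible: true  # run on a low-priority VM
    maxRetries: 3      # assumed value, added to keep the retry behavior
  }
}
```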

You can choose to ALWAYS run dedicated VMs for every task by modifying the UsePreemptibleVmsOnly setting in docker-compose.yml, as described in this section. The preemptible runtime attribute overrides this environment variable setting.

Accompanying index files for BAM or VCF files

If a tool you are using within a task assumes that an index file for your data (BAM or VCF file) is located in the same folder, add an index file to the list of parameters when defining and calling the task to ensure the accompanying index file is copied to the correct location for access:

[Image: Index file parameter]

[Image: Index file called in task]
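
The pattern can be sketched as follows. The workflow and task names, the `chr1` region, and the Docker image name are hypothetical; substitute an image that provides samtools. The index is declared as a task input even though the command never references it, so that it is copied alongside the BAM file.

```wdl
version 1.0

workflow index_example {
  input {
    File input_bam
    File input_bam_index
  }

  call count_reads {
    input:
      bam = input_bam,
      bam_index = input_bam_index  # pass the index when calling the task
  }

  output {
    Int read_count = count_reads.read_count
  }
}

task count_reads {
  input {
    File bam
    File bam_index  # declared so the index is copied for the tool to find
  }

  command <<<
    # A region query needs the index file next to the BAM file.
    samtools view -c ~{bam} chr1 > count.txt
  >>>

  output {
    Int read_count = read_int("count.txt")
  }

  runtime {
    cpu: 1
    memory: "2 GB"
    disk: "10 GB"
    docker: "example/samtools:latest"  # hypothetical image name
  }
}
```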

Calculating disk_size when scattering

In the current implementation, the entire input file is passed to each task created by the WDL scatter operation. If you calculate the disk_size runtime attribute dynamically within the task, use the full size of the input file instead of dividing it by the number of shards, so that each task has enough disk space. Do not forget to add the size of the index files if you added them as parameters:

[Image: Disk size scatter]
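
A minimal sketch of such a calculation is shown below. The workflow and task names, the 10 GB of headroom, and the Docker image name are assumptions for illustration. Note that disk_size is computed from the full size of the BAM file plus its index, without dividing by the number of shards.

```wdl
version 1.0

workflow scatter_example {
  input {
    File input_bam
    File input_bam_index
    Array[String] regions
  }

  scatter (region in regions) {
    call count_region {
      input:
        bam = input_bam,
        bam_index = input_bam_index,
        region = region
    }
  }
}

task count_region {
  input {
    File bam
    File bam_index
    String region
  }

  # Full input size plus index size; the extra 10 GB is an assumed margin.
  Int disk_size = ceil(size(bam, "GB") + size(bam_index, "GB")) + 10

  command <<<
    samtools view -c ~{bam} ~{region} > count.txt
  >>>

  output {
    Int read_count = read_int("count.txt")
  }

  runtime {
    cpu: 1
    memory: "2 GB"
    disk: "~{disk_size} GB"
    docker: "example/samtools:latest"  # hypothetical image name
  }
}
```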