Modifying WDLs for Azure
For any pipeline, you can create a WDL file that calls your tools in Docker containers. Please note that, for security reasons, Cromwell on Azure only supports tasks that define a Docker container.
To run a WDL file, create or modify the workflow so that its tasks use the following runtime attributes, which are compliant with the TES (Task Execution Schemas):
runtime {
    cpu: 1
    memory: "2 GB"
    disk: "10 GB"
    docker:
    maxRetries: 0
}
Ensure that the attributes memory and disk (note: use the singular form disk, NOT disks) have units. Supported units from Cromwell:
KB - "KB", "K", "KiB", "Ki"
MB - "MB", "M", "MiB", "Mi"
GB - "GB", "G", "GiB", "Gi"
TB - "TB", "T", "TiB", "Ti"
The preemptible attribute is a boolean. You can specify preemptible as true or false for each task. When set to true, Cromwell on Azure will use a low-priority batch VM to run the task.
Starting with Cromwell on Azure version 3.2, integer values for preemptible are accepted and will be converted to boolean: true for positive values, false otherwise.
The bootDiskSizeGb and zones attributes are not supported by the TES backend.
Each of these runtime attributes is specific to your workflow and to the tasks within it. The default values for resource requirements are as set above.
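As an illustration of the attributes described above, here is a minimal sketch of a complete task. The task name, its command, and the ubuntu:22.04 image are assumptions chosen for the example; substitute the image that contains your own tool.

```wdl
version 1.0

# Hypothetical task, used only to show where the runtime attributes go.
task count_lines {
  input {
    File input_file
  }

  command <<<
    wc -l < ~{input_file} > line_count.txt
  >>>

  output {
    Int line_count = read_int("line_count.txt")
  }

  runtime {
    docker: "ubuntu:22.04"   # assumed image; a Docker image is required on Azure
    cpu: 1
    memory: "2 GB"           # units are required
    disk: "10 GB"            # singular "disk", with units
    preemptible: true        # true = low-priority batch VM, false = dedicated VM
    maxRetries: 0
  }
}
```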
Learn more about Cromwell's runtime attributes here.
Left panel shows a WDL file created for GCP whereas the right panel is the modified WDL that runs on Azure.
For a GCP WDL, preemptible is an integer specifying the number of retries when using the flag. Starting with Cromwell on Azure version 3.2, integer values for preemptible are accepted and will be converted to boolean (true for positive values, false otherwise), but the retry functionality is not provided. Consider adding maxRetries to keep the retry functionality. Remember that for each task in a workflow, you can either use a low-priority VM in batch (default configuration) or use a dedicated VM, by setting preemptible to true or false respectively.
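The conversion described above might look like the following sketch. The task itself and the original GCP value of 3 are assumptions made for illustration. Note that maxRetries applies to task failures in general, so it only approximates the GCP preemption-retry behavior.

```wdl
version 1.0

# Hypothetical task showing a GCP-style runtime block adapted for Azure.
task say_hello {
  command <<<
    echo "hello"
  >>>

  output {
    String greeting = read_string(stdout())
  }

  runtime {
    docker: "ubuntu:22.04"   # assumed image
    # Assumed GCP original:  preemptible: 3
    preemptible: true        # Azure: boolean, run on a low-priority batch VM
    maxRetries: 3            # added to keep a retry count on Azure
  }
}
```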
You can choose to ALWAYS run dedicated VMs for every task by modifying the docker-compose.yml setting UsePreemptibleVmsOnly, as described in this section. The preemptible runtime attribute will override the environment variable setting.
If a tool you are using within a task assumes that an index file for your data (BAM or VCF file) is located in the same folder, add an index file to the list of parameters when defining and calling the task to ensure the accompanying index file is copied to the correct location for access:
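A minimal sketch of this pattern follows. The task, workflow, and input names are hypothetical, and the Docker image name is a placeholder for whichever image provides samtools in your environment. The index input is never referenced in the command; it is declared only so that it is localized alongside the BAM file.

```wdl
version 1.0

# Hypothetical task: samtools idxstats expects the index next to the BAM.
task bam_idxstats {
  input {
    File input_bam
    File input_bam_index   # declared so the index is copied with the BAM
  }

  command <<<
    samtools idxstats ~{input_bam} > idxstats.txt
  >>>

  output {
    File idxstats = "idxstats.txt"
  }

  runtime {
    docker: "your-registry/samtools:latest"   # placeholder image name
    cpu: 1
    memory: "2 GB"
    disk: "10 GB"
  }
}

workflow bam_idxstats_wf {
  input {
    File bam
    File bam_index
  }

  # Pass the index explicitly when calling the task.
  call bam_idxstats {
    input:
      input_bam = bam,
      input_bam_index = bam_index
  }

  output {
    File idxstats = bam_idxstats.idxstats
  }
}
```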
In the current implementation, the entire input file is passed to each task created by the WDL scatter operation. If you calculate the disk_size runtime attribute dynamically within the task, use the full size of the input file instead of dividing by the number of shards, to allow enough disk space to perform the task. Do not forget to add the size of index files if you added them as parameters:
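One way to size the disk under these constraints is sketched below. The task and workflow names, the 10 GB of headroom, and the Docker image name are assumptions for illustration; adjust them to your own pipeline.

```wdl
version 1.0

# Hypothetical scattered task: each shard receives the whole BAM and index.
task extract_interval {
  input {
    File input_bam
    File input_bam_index
    String interval
  }

  # Full BAM size plus index size, NOT divided by the number of shards.
  # The extra 10 GB of headroom for outputs is an assumed value.
  Int disk_size = ceil(size(input_bam, "GB") + size(input_bam_index, "GB")) + 10

  command <<<
    samtools view -b ~{input_bam} ~{interval} > shard.bam
  >>>

  output {
    File shard_bam = "shard.bam"
  }

  runtime {
    docker: "your-registry/samtools:latest"   # placeholder image name
    cpu: 1
    memory: "2 GB"
    disk: "~{disk_size} GB"
  }
}

workflow extract_intervals_wf {
  input {
    File bam
    File bam_index
    Array[String] intervals
  }

  scatter (interval in intervals) {
    call extract_interval {
      input:
        input_bam = bam,
        input_bam_index = bam_index,
        interval = interval
    }
  }

  output {
    Array[File] shard_bams = extract_interval.shard_bam
  }
}
```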