diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md
index 7d9aa01d08..2a80c57fff 100644
--- a/.github/CONTRIBUTING.md
+++ b/.github/CONTRIBUTING.md
@@ -19,7 +19,7 @@ If you'd like to write some code for nf-core/sarek, the standard workflow is as
 1. Check that there isn't already an issue about your idea in the [nf-core/sarek issues](https://github.com/nf-core/sarek/issues) to avoid duplicating work. If there isn't one already, please create one so that others know you're working on this
 2. [Fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) the [nf-core/sarek repository](https://github.com/nf-core/sarek) to your GitHub account
 3. Make the necessary changes / additions within your forked repository following [Pipeline conventions](#pipeline-contribution-conventions)
-4. Use `nf-core schema build` and add any new parameters to the pipeline JSON schema (requires [nf-core tools](https://github.com/nf-core/tools) >= 1.10).
+4. Use `nf-core pipelines schema build` and add any new parameters to the pipeline JSON schema (requires [nf-core tools](https://github.com/nf-core/tools) >= 1.10).
 5. Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged
 
 If you're not used to this workflow with git, you can start with some [docs from GitHub](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests) or even their [excellent `git` resources](https://try.github.io/).
@@ -40,7 +40,7 @@ There are typically two types of tests that run:
 ### Lint tests
 
 `nf-core` has a [set of guidelines](https://nf-co.re/developers/guidelines) which all pipelines must adhere to.
-To enforce these and ensure that all pipelines stay in sync, we have developed a helper tool which runs checks on the pipeline code. This is in the [nf-core/tools repository](https://github.com/nf-core/tools) and once installed can be run locally with the `nf-core lint ` command.
+To enforce these and ensure that all pipelines stay in sync, we have developed a helper tool which runs checks on the pipeline code. This is in the [nf-core/tools repository](https://github.com/nf-core/tools) and once installed can be run locally with the `nf-core pipelines lint ` command.
 
 If any failures or warnings are encountered, please follow the listed URL for more documentation.
 
@@ -75,7 +75,7 @@ If you wish to contribute a new step, please use the following coding standards:
 2. Write the process block (see below).
 3. Define the output channel if needed (see below).
 4. Add any new parameters to `nextflow.config` with a default (see below).
-5. Add any new parameters to `nextflow_schema.json` with help text (via the `nf-core schema build` tool).
+5. Add any new parameters to `nextflow_schema.json` with help text (via the `nf-core pipelines schema build` tool).
 6. Add sanity checks and validation for all relevant parameters.
 7. Perform local tests to validate that the new code works as expected.
 8. If applicable, add a new test command in `.github/workflow/ci.yml`.
@@ -86,11 +86,11 @@ If you wish to contribute a new step, please use the following coding standards:
 
 Parameters should be initialised / defined with default values in `nextflow.config` under the `params` scope.
 
-Once there, use `nf-core schema build` to add to `nextflow_schema.json`.
+Once there, use `nf-core pipelines schema build` to add to `nextflow_schema.json`.
 
 ### Default processes resource requirements
 
-Sensible defaults for process resource requirements (CPUs / memory / time) for a process should be defined in `conf/base.config`. These should generally be specified generic with `withLabel:` selectors so they can be shared across multiple processes/steps of the pipeline. A nf-core standard set of labels that should be followed where possible can be seen in the [nf-core pipeline template](https://github.com/nf-core/tools/blob/master/nf_core/pipeline-template/conf/base.config), which has the default process as a single core-process, and then different levels of multi-core configurations for increasingly large memory requirements defined with standardised labels.
+Sensible defaults for process resource requirements (CPUs / memory / time) for a process should be defined in `conf/base.config`. These should generally be specified generic with `withLabel:` selectors so they can be shared across multiple processes/steps of the pipeline. A nf-core standard set of labels that should be followed where possible can be seen in the [nf-core pipeline template](https://github.com/nf-core/tools/blob/main/nf_core/pipeline-template/conf/base.config), which has the default process as a single core-process, and then different levels of multi-core configurations for increasingly large memory requirements defined with standardised labels.
 
 The process resources can be passed on to the tool dynamically within the process with the `${task.cpus}` and `${task.memory}` variables in the `script:` block.
 
@@ -103,7 +103,7 @@ Please use the following naming schemes, to make it easy to understand what is g
 
 ### Nextflow version bumping
 
-If you are using a new feature from core Nextflow, you may bump the minimum required version of nextflow in the pipeline with: `nf-core bump-version --nextflow . [min-nf-version]`
+If you are using a new feature from core Nextflow, you may bump the minimum required version of nextflow in the pipeline with: `nf-core pipelines bump-version --nextflow . [min-nf-version]`
 
 ### Images and figures
 
diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
index 934d8f2600..a52619f09f 100644
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -17,7 +17,7 @@ Learn more about contributing: [CONTRIBUTING.md](https://github.com/nf-core/sare
 - [ ] If you've fixed a bug or added code that should be tested, add tests!
 - [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-core/sarek/tree/master/.github/CONTRIBUTING.md)
 - [ ] If necessary, also make a PR on the nf-core/sarek _branch_ on the [nf-core/test-datasets](https://github.com/nf-core/test-datasets) repository.
-- [ ] Make sure your code lints (`nf-core lint`).
+- [ ] Make sure your code lints (`nf-core pipelines lint`).
 - [ ] Ensure the test suite passes (`nextflow run . -profile test,docker --outdir `).
 - [ ] Check for unexpected warnings in debug mode (`nextflow run . -profile debug,test,docker --outdir `).
 - [ ] Usage Documentation in `docs/usage.md` is updated.
diff --git a/.github/nf-test-tags.yml b/.github/nf-test-tags.yml
new file mode 100644
index 0000000000..1c4f61b4d4
--- /dev/null
+++ b/.github/nf-test-tags.yml
@@ -0,0 +1,50 @@
+exclude:
+  - tags: "bcftools/annotate"
+  - tags: "bcftools/concat"
+  - tags: "bcftools/mpileup"
+  - tags: "bcftools/sort"
+  - tags: "bwa/index"
+  - tags: "bwa/mem"
+  - tags: "bwamem2/index"
+  - tags: "bwamem2/mem"
+  - tags: "cat/cat"
+  - tags: "cat/fastq"
+  - tags: "cnvkit/antitarget"
+  - tags: "cnvkit/batch"
+  - tags: "cnvkit/reference"
+  - tags: "deepvariant"
+  - tags: "dragmap/align"
+  - tags: "dragmap/hashtable"
+  - tags: "ensemblvep/download"
+  - tags: "ensemblvep/vep"
+  - tags: "fastp"
+  - tags: "fastqc"
+  - tags: "fgbio/fastqtobam"
+  - tags: "freebayes"
+  - tags: "gatk4/applybqsr"
+  - tags: "gatk4/baserecalibrator"
+  - tags: "gatk4/estimatelibrarycomplexity"
+  - tags: "gatk4/genomicsdbimport"
+  - tags: "gatk4/haplotypecaller"
+  - tags: "gatk4/markduplicates"
+  - tags: "gatk4/mergevcfs"
+  - tags: "gatk4/mutect2"
+  - tags: "gatk4spark/applybqsr"
+  - tags: "gatk4spark/markduplicates"
+  - tags: "gawk"
+  - tags: "lofreq/callparallel"
+  - tags: "mosdepth"
+  - tags: "multiqc"
+  - tags: "ngscheckmate/ncm"
+  - tags: "samblaster"
+  - tags: "samtools/convert"
+  - tags: "samtools/mpileup"
+  - tags: "samtools/stats"
+  - tags: "snpeff/snpeff"
+  - tags: "strelka/germline"
+  - tags: "strelka/somatic"
+  - tags: "subworkflows/utils_nfvalidation_plugin"
+  - tags: "tabix/bgziptabix"
+  - tags: "tabix/tabix"
+  - tags: "tiddit/sv"
+  - tags: "untar"
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index c48d11e7c4..c66c4c23c8 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -1,305 +1,91 @@
 name: nf-core CI
 # This workflow runs the pipeline with the minimal test dataset to check that it completes without any syntax errors
 on:
+  push:
+    branches:
+      - dev
   pull_request:
+    paths-ignore:
+      - "docs/**"
   release:
     types: [published]
-  merge_group:
-    types:
-      - checks_requested
-    branches:
-      - master
-      - dev
+  workflow_dispatch:
 
 env:
+  NFT_DIFF: "pdiff"
+  NFT_DIFF_ARGS: "--line-numbers --width 120 --expand-tabs=2"
+  NFT_VER: "0.9.2"
+  NFT_WORKDIR: "~"
   NXF_ANSI_LOG: false
-  NFTEST_VER: "0.8.1"
+  NXF_SINGULARITY_CACHEDIR: ${{ github.workspace }}/.singularity
+  NXF_SINGULARITY_LIBRARYDIR: ${{ github.workspace }}/.singularity
+  SENTIEON_LICENSE_BASE64: ${{ secrets.SENTIEON_LICENSE_BASE64 }}
+  TEST_DATA_BASE: "${{ github.workspace }}/test-datasets"
 
-# Cancel if a newer run is started
 concurrency:
-  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  group: "${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}"
   cancel-in-progress: true
 
 jobs:
-  pytest-changes:
-    name: Check for changes (pytest)
-    runs-on: ubuntu-latest
-    outputs:
-      # Expose matched filters as job 'tags' output variable
-      tags: ${{ steps.filter.outputs.changes }}
-    steps:
-      - uses: actions/checkout@v4
-
-      - uses: frouioui/paths-filter@main
-        id: filter
-        with:
-          filters: "tests/config/pytesttags.yml"
-          token: ""
-
-  pytest:
-    name: ${{ matrix.tags }} ${{ matrix.profile }} NF ${{ matrix.NXF_VER }}
+  test:
     runs-on: ubuntu-latest
-    needs: pytest-changes
-    if: needs.pytest-changes.outputs.tags != '[]'
+    name: "Test ${{ matrix.filter }} | ${{ matrix.profile }} | ${{ matrix.NXF_VER }} | ${{ matrix.shard }}/5"
     strategy:
       fail-fast: false
       matrix:
-        tags: ["${{ fromJson(needs.pytest-changes.outputs.tags) }}"]
-        profile: ["docker"]
-        # profile: ["docker", "singularity", "conda"]
-        TEST_DATA_BASE:
-          - "test-datasets/data"
         NXF_VER:
-          - "23.04.0"
+          - "24.04.2"
           - "latest-everything"
+        filter: ["workflow", "function", "pipeline"]
+        # filter: ["process", "workflow", "function", "pipeline"]
+        profile: ["conda", "docker", "singularity"]
+        shard: [1, 2, 3, 4, 5]
+        isMaster:
+          - ${{ github.base_ref == 'master' }}
         exclude:
-          - tags: "sentieon/bwamem"
-          - tags: "sentieon/dedup"
-          - tags: "sentieon/dnascope"
-          - tags: "sentieon/dnascope_joint_germline"
-          - tags: "sentieon/dnascope_skip_filter"
-          - tags: "sentieon/haplotyper"
-          - tags: "sentieon/haplotyper_joint_germline"
-          - tags: "sentieon/haplotyper_skip_filter"
-          - NXF_VER: "latest-everything"
-            tags: "joint_germline"
-    env:
-      NXF_ANSI_LOG: false
-      TEST_DATA_BASE: "${{ github.workspace }}/test-datasets"
-      SENTIEON_LICENSE_BASE64: ${{ secrets.SENTIEON_LICENSE_BASE64 }}
+          - isMaster: false
+            profile: "conda"
+          - isMaster: false
+            profile: "singularity"
 
     steps:
       - name: Check out pipeline code
-        uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b
-
-      - name: Hash Github Workspace
-        id: hash_workspace
-        run: |
-          echo "digest=$(echo sarek3_${{ github.workspace }} | md5sum | cut -c 1-25)" >> $GITHUB_OUTPUT
-
-      - name: Set up Python
-        uses: actions/setup-python@v4
+        uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4
         with:
-          python-version: "3.x"
-          cache: "pip"
-          cache-dependency-path: |
-            **/requirements.txt
+          fetch-depth: 0
 
-      - name: Install Python dependencies
-        run: pip install --upgrade -r tests/requirements.txt
-
-      - name: Install Nextflow ${{ matrix.NXF_VER }}
-        uses: nf-core/setup-nextflow@v2
-        with:
-          version: "${{ matrix.NXF_VER }}"
-
-      - name: Set up Singularity
-        if: matrix.profile == 'singularity'
-        uses: eWaterCycle/setup-singularity@v5
-        with:
-          singularity-version: 3.7.1
-
-      - name: Set up miniconda
-        if: matrix.profile == 'conda'
-        uses: conda-incubator/setup-miniconda@v2
+      - uses: actions/setup-python@f677139bbe7f9c59b41e40162b753c062f5d49a3 # v5
         with:
-          auto-update-conda: true
-          channels: conda-forge,bioconda,defaults
-          python-version: ${{ matrix.python-version }}
+          python-version: "3.11"
+          architecture: "x64"
 
-      - name: Cache test data
-        id: cache-testdata
-        uses: actions/cache@v3
+      - uses: actions/setup-java@8df1039502a15bceb9433410b1a100fbe190c53b # v4
         with:
-          path: test-datasets/
-          key: ${{ steps.hash_workspace.outputs.digest }}
+          distribution: "temurin"
+          java-version: "17"
 
-      - name: Check out test data
-        if: steps.cache-testdata.outputs.cache-hit != 'true'
-        uses: actions/checkout@v4
+      - name: Set up Nextflow
+        uses: nf-core/setup-nextflow@v2
         with:
-          repository: nf-core/test-datasets
-          ref: sarek3
-          path: test-datasets/
-
-      - name: Replace remote paths in samplesheets
-        run: |
-          for f in tests/csv/3.0/*csv; do
-            sed -i "s=https://raw.githubusercontent.com/nf-core/test-datasets/modules/=${{ github.workspace }}/test-datasets/=g" $f
-            echo "========== $f ============"
-            cat $f
-            echo "========================================"
-          done;
+          version: "${{ matrix.NXF_VER }}"
 
-      # Set up secrets
-      - name: Set up nextflow secrets
+      - name: Set up Nextflow secrets
         if: env.SENTIEON_LICENSE_BASE64 != null
         run: |
           nextflow secrets set SENTIEON_LICENSE_BASE64 ${{ secrets.SENTIEON_LICENSE_BASE64 }}
           nextflow secrets set SENTIEON_AUTH_MECH_BASE64 ${{ secrets.SENTIEON_AUTH_MECH_BASE64 }}
           SENTIEON_ENCRYPTION_KEY=$(echo -n "${{ secrets.ENCRYPTION_KEY_BASE64 }}" | base64 -d)
           SENTIEON_LICENSE_MESSAGE=$(echo -n "${{ secrets.LICENSE_MESSAGE_BASE64 }}" | base64 -d)
-          SENTIEON_AUTH_DATA=$(python bin/license_message.py encrypt --key "$SENTIEON_ENCRYPTION_KEY" --message "$SENTIEON_LICENSE_MESSAGE")
+          SENTIEON_AUTH_DATA=$(python3 bin/license_message.py encrypt --key "$SENTIEON_ENCRYPTION_KEY" --message "$SENTIEON_LICENSE_MESSAGE")
           SENTIEON_AUTH_DATA_BASE64=$(echo -n "$SENTIEON_AUTH_DATA" | base64 -w 0)
           nextflow secrets set SENTIEON_AUTH_DATA_BASE64 $SENTIEON_AUTH_DATA_BASE64
 
-      - name: Conda clean
-        if: matrix.profile == 'conda'
-        run: conda clean -a
-
-      - name: Disk space cleanup
-        uses: jlumbroso/free-disk-space@v1.3.1
-
-      - name: Run pytest-workflow
-        uses: Wandalen/wretry.action@v1
-        with:
-          command: TMPDIR=~ PROFILE=${{ matrix.profile }} pytest --tag ${{ matrix.tags }} --symlink --kwdof --git-aware --color=yes
-          attempt_limit: 3
-
-      - name: Output log on failure
-        if: failure()
-        run: |
-          sudo apt install bat > /dev/null
-          batcat --decorations=always --color=always /home/runner/pytest_workflow_*/*/log.{out,err}
-
-      - name: Upload logs on failure
-        if: failure()
-        uses: actions/upload-artifact@v2
-        with:
-          name: logs-${{ matrix.profile }}
-          path: |
-            /home/runner/pytest_workflow_*/*/.nextflow.log
-            /home/runner/pytest_workflow_*/*/log.out
-            /home/runner/pytest_workflow_*/*/log.err
-            /home/runner/pytest_workflow_*/*/work
-            !/home/runner/pytest_workflow_*/*/work/conda
-            !/home/runner/pytest_workflow_*/*/work/singularity
-
-  nftest-changes:
-    name: Check for changes (nf-test)
-    runs-on: ubuntu-latest
-    outputs:
-      tags: ${{ steps.filter.outputs.changes }}
-
-    steps:
-      - uses: actions/checkout@v4
-
-      - name: Combine all tags.yml files
-        id: get_tags
-        run: find . -name "tags.yml" -not -path "./.github/*" -exec cat {} + > .github/tags.yml
-
-      - name: debug
-        run: cat .github/tags.yml
-
-      - uses: frouioui/paths-filter@main
-        id: filter
-        with:
-          filters: ".github/tags.yml"
-          token: ""
-
-  nftest:
-    name: ${{ matrix.tags }} ${{ matrix.profile }} NF ${{ matrix.NXF_VER }}
-    runs-on: ubuntu-latest
-    needs: nftest-changes
-    if: needs.nftest-changes.outputs.tags != '[]'
-    strategy:
-      fail-fast: false
-      matrix:
-        tags: ["${{ fromJson(needs.nftest-changes.outputs.tags) }}"]
-        profile: ["docker"]
-        # profile: ["docker", "singularity", "conda"]
-        TEST_DATA_BASE:
-          - "test-datasets/data"
-        NXF_VER:
-          - "23.04.0"
-          - "latest-everything"
-        exclude:
-          - tags: "bcftools/annotate"
-          - tags: "bcftools/concat"
-          - tags: "bcftools/mpileup"
-          - tags: "bcftools/sort"
-          - tags: "bwa/index"
-          - tags: "bwa/mem"
-          - tags: "bwamem2/index"
-          - tags: "bwamem2/mem"
-          - tags: "cat/cat"
-          - tags: "cat/fastq"
-          - tags: "cnvkit/antitarget"
-          - tags: "cnvkit/batch"
-          - tags: "cnvkit/reference"
-          - tags: "deepvariant"
-          - tags: "dragmap/align"
-          - tags: "dragmap/hashtable"
-          - tags: "ensemblvep/download"
-          - tags: "ensemblvep/vep"
-          - tags: "fastp"
-          - tags: "fastqc"
-          - tags: "fgbio/fastqtobam"
-          - tags: "freebayes"
-          - tags: "gatk4/applybqsr"
-          - tags: "gatk4/baserecalibrator"
-          - tags: "gatk4/estimatelibrarycomplexity"
-          - tags: "gatk4/genomicsdbimport"
-          - tags: "gatk4/haplotypecaller"
-          - tags: "gatk4/markduplicates"
-          - tags: "gatk4/mergevcfs"
-          - tags: "gatk4/mutect2"
-          - tags: "gatk4spark/applybqsr"
-          - tags: "gatk4spark/markduplicates"
-          - tags: "gawk"
-          - tags: "mosdepth"
-          - tags: "multiqc"
-          - tags: "ngscheckmate/ncm"
-          - tags: "samblaster"
-          - tags: "samtools/convert"
-          - tags: "samtools/mpileup"
-          - tags: "samtools/stats"
-          - tags: "snpeff/snpeff"
-          - tags: "strelka/germline"
-          - tags: "strelka/somatic"
-          - tags: "subworkflows/utils_nfvalidation_plugin"
-          - tags: "tabix/bgziptabix"
-          - tags: "tabix/tabix"
-          - tags: "tiddit/sv"
-          - tags: "untar"
-          - tags: "pipeline_sarek"
-        include:
-          - tags: "pipeline_sarek"
-            profile: "test,docker"
-    env:
-      NXF_ANSI_LOG: false
-      TEST_DATA_BASE: "${{ github.workspace }}/test-datasets"
-      SENTIEON_LICENSE_BASE64: ${{ secrets.SENTIEON_LICENSE_BASE64 }}
-
-    steps:
-      - uses: actions/checkout@v4
-
-      - uses: actions/setup-java@v3
-        with:
-          distribution: "temurin"
-          java-version: "17"
-
-      - name: Install Nextflow ${{ matrix.NXF_VER }}
-        uses: nf-core/setup-nextflow@v2
-        with:
-          version: "${{ matrix.NXF_VER }}"
-
-      - name: Cache nf-test installation
-        id: cache-software
-        uses: actions/cache@v3
+      - name: Set up nf-test
+        uses: nf-core/setup-nf-test@v1
         with:
-          path: |
-            /usr/local/bin/nf-test
-            /home/runner/.nf-test/nf-test.jar
-          key: ${{ runner.os }}-${{ env.NFTEST_VER }}-nftest
-
-      - name: Install nf-test
-        if: steps.cache-software.outputs.cache-hit != 'true'
-        run: |
-          wget -qO- https://code.askimed.com/install/nf-test | bash
-          sudo mv nf-test /usr/local/bin/
+          version: ${{ env.NFT_VER }}
 
-      - name: Setup apptainer
+      - name: Set up apptainer
         if: matrix.profile == 'singularity'
         uses: eWaterCycle/setup-apptainer@main
 
@@ -309,62 +95,84 @@ jobs:
           mkdir -p $NXF_SINGULARITY_CACHEDIR
           mkdir -p $NXF_SINGULARITY_LIBRARYDIR
 
+      - name: Cache pdiff
+        uses: actions/cache@0c45773b623bea8c8e75f6c82b208c3cf94ea4f9 # v4
+        id: cache-pip-pdiff
+        with:
+          path: ~/.cache/pip
+          key: ${{ runner.os }}-pip-pdiff
+
+      - name: Install pdiff
+        run: python -m pip install --upgrade pip pdiff cryptography
+
       - name: Set up miniconda
-        uses: conda-incubator/setup-miniconda@v2
+        if: matrix.profile == 'conda'
+        uses: conda-incubator/setup-miniconda@a4260408e20b96e80095f42ff7f1a15b27dd94ca # v3
         with:
           miniconda-version: "latest"
           auto-update-conda: true
-          channels: conda-forge,bioconda,defaults
-          python-version: ${{ matrix.python-version }}
+          conda-solver: libmamba
+          channels: conda-forge,bioconda
 
-      - name: Conda setup
+      - name: Set up Conda
+        if: matrix.profile == 'conda'
         run: |
-          conda clean -a
-          conda install -n base conda-libmamba-solver
-          conda config --set solver libmamba
           echo $(realpath $CONDA)/condabin >> $GITHUB_PATH
           echo $(realpath python) >> $GITHUB_PATH
 
-      # Set up secrets
-      - name: Set up nextflow secrets
-        if: env.SENTIEON_LICENSE_BASE64 != null
-        run: |
-          nextflow secrets set SENTIEON_LICENSE_BASE64 ${{ secrets.SENTIEON_LICENSE_BASE64 }}
-          nextflow secrets set SENTIEON_AUTH_MECH_BASE64 ${{ secrets.SENTIEON_AUTH_MECH_BASE64 }}
-          SENTIEON_ENCRYPTION_KEY=$(echo -n "${{ secrets.ENCRYPTION_KEY_BASE64 }}" | base64 -d)
-          SENTIEON_LICENSE_MESSAGE=$(echo -n "${{ secrets.LICENSE_MESSAGE_BASE64 }}" | base64 -d)
-          SENTIEON_AUTH_DATA=$(python3 bin/license_message.py encrypt --key "$SENTIEON_ENCRYPTION_KEY" --message "$SENTIEON_LICENSE_MESSAGE")
-          SENTIEON_AUTH_DATA_BASE64=$(echo -n "$SENTIEON_AUTH_DATA" | base64 -w 0)
-          nextflow secrets set SENTIEON_AUTH_DATA_BASE64 $SENTIEON_AUTH_DATA_BASE64
-
       - name: Disk space cleanup
         uses: jlumbroso/free-disk-space@v1.3.1
 
-      # Test the module
-      - name: Run nf-test
+      - name: Start summary
+        id: print-test
         run: |
+          echo "## nf-test tests summary :rocket:" >> $GITHUB_STEP_SUMMARY
+          echo "" >> $GITHUB_STEP_SUMMARY
+          echo "This \`${{ matrix.filter }}\` ${{ matrix.shard }}/5 shard was run on \`${{ matrix.profile }}\` | \`NXF_VER=${{ matrix.NXF_VER }}\`, and contains the following test(s):" >> $GITHUB_STEP_SUMMARY
+          echo "" >> $GITHUB_STEP_SUMMARY
           nf-test test \
-            --profile=${{ matrix.profile }} \
-            --tag ${{ matrix.tags }} \
-            --tap=test.tap \
-            --verbose
+            --ci \
+            --dryRun \
+            --junitxml="TEST-${{ matrix.filter }}_${{ matrix.profile }}_${{ matrix.shard }}.xml" \
+            --shard ${{ matrix.shard }}/5 \
+            --changed-since HEAD^ \
+            --follow-dependencies \
+            --profile "+${{ matrix.profile }}" \
+            --filter ${{ matrix.filter }} \
+            | grep PASSED | cut -d "'" -f 2 | sed 's/^/- /' | sort -u >> $GITHUB_STEP_SUMMARY
+
+      - name: "Run tests | ${{ matrix.filter }}_${{ matrix.profile }} | ${{ matrix.shard }}/5"
+        run: |
+          nf-test test \
+            --ci \
+            --debug \
+            --verbose \
+            --junitxml="TEST-${{ matrix.filter }}_${{ matrix.profile }}_${{ matrix.shard }}.xml" \
+            --shard ${{ matrix.shard }}/5 \
+            --changed-since HEAD^ \
+            --follow-dependencies \
+            --profile "+${{ matrix.profile }}" \
+            --filter ${{ matrix.filter }}
+
+      - name: Print success in summary
+        if: success()
+        run: |
+          echo "" >> $GITHUB_STEP_SUMMARY
+          echo "All test(s) successfull :tada:" >> $GITHUB_STEP_SUMMARY
 
-  confirm-pass:
-    runs-on: ubuntu-latest
-    needs:
-      - pytest
-      - nftest
-    if: always()
-    steps:
-      - name: All tests ok
-        if: ${{ success() || !contains(needs.*.result, 'failure') }}
-        run: exit 0
-      - name: One or more tests failed
-        if: ${{ contains(needs.*.result, 'failure') }}
-        run: exit 1
+      - name: Print failure in summary
+        if: failure()
+        run: |
+          echo "" >> $GITHUB_STEP_SUMMARY
+          echo "Some test(s) failed :cold_sweat:" >> $GITHUB_STEP_SUMMARY
+
+      - name: Publish Test Report
+        uses: mikepenz/action-junit-report@v4
+        if: success() || failure() # always run even if the previous step fails
+        with:
+          report_paths: "TEST-*.xml"
 
-      - name: debug-print
-        if: always()
+      - name: Clean up
+        if: success() || failure()
         run: |
-          echo "toJSON(needs) = ${{ toJSON(needs) }}"
-          echo "toJSON(needs.*.result) = ${{ toJSON(needs.*.result) }}"
+          sudo rm -rf /home/ubuntu/tests/
diff --git a/.github/workflows/cloudtest.yml b/.github/workflows/cloudtest.yml
index 546634c42e..932951b07c 100644
--- a/.github/workflows/cloudtest.yml
+++ b/.github/workflows/cloudtest.yml
@@ -1,8 +1,13 @@
-name: nf-core cloud test
+name: nf-core cloud full size tests
+# This workflow is triggered on PRs opened against the master branch.
+# It can be additionally triggered manually with GitHub actions workflow dispatch button.
+# It runs the -profile 'test_full' on cloud
 
 on:
   release:
     types: [created]
+  pull_request_review:
+    types: [submitted]
   workflow_dispatch:
     inputs:
       test:
@@ -31,10 +36,7 @@ on:
         default: true
 
 jobs:
-  trigger-test:
-    name: launch
-    runs-on: ubuntu-latest
-    if: ${{ github.repository == 'nf-core/sarek' }}
+  run-platform:
     strategy:
       fail-fast: false
       matrix:
@@ -74,10 +76,26 @@ jobs:
             cloud: aws
             compute_env: TOWER_COMPUTE_ENV
             workdir: TOWER_BUCKET_AWS
-
+    name: Run AWS full tests
+    # run only if the PR is approved by at least 2 reviewers and against the master branch or manually triggered
+    if: github.repository == 'nf-core/sarek' && github.event.review.state == 'approved' && github.event.pull_request.base.ref == 'master' || github.event_name == 'workflow_dispatch'
+    runs-on: ubuntu-latest
     steps:
+      - uses: octokit/request-action@v2.x
+        id: check_approvals
+        with:
+          route: GET /repos/${{ github.repository }}/pulls/${{ github.event.pull_request.number }}/reviews
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+      - id: test_variables
+        if: github.event_name != 'workflow_dispatch'
+        run: |
+          JSON_RESPONSE='${{ steps.check_approvals.outputs.data }}'
+          CURRENT_APPROVALS_COUNT=$(echo $JSON_RESPONSE | jq -c '[.[] | select(.state | contains("APPROVED")) ] | length')
+          test $CURRENT_APPROVALS_COUNT -ge 2 || exit 1 # At least 2 approvals are required
+
       # Launch workflow on AWS Batch
-      - name: Launch
+      - name: Launch workflow via Seqera Platform
         uses: seqeralabs/action-tower-launch@v2
         # If inputs item exists (i.e. workflow_dispatch), then we find matrix.test and check it is false
         # If is false, we negate and run the job
diff --git a/.github/workflows/download_pipeline.yml b/.github/workflows/download_pipeline.yml
index 2d20d64422..713dc3e739 100644
--- a/.github/workflows/download_pipeline.yml
+++ b/.github/workflows/download_pipeline.yml
@@ -1,4 +1,4 @@
-name: Test successful pipeline download with 'nf-core download'
+name: Test successful pipeline download with 'nf-core pipelines download'
 
 # Run the workflow when:
 #  - dispatched manually
@@ -8,7 +8,7 @@ on:
   workflow_dispatch:
     inputs:
       testbranch:
-        description: "The specific branch you wish to utilize for the test execution of nf-core download."
+        description: "The specific branch you wish to utilize for the test execution of nf-core pipelines download."
         required: true
         default: "dev"
   pull_request:
@@ -39,9 +39,11 @@ jobs:
         with:
           python-version: "3.12"
           architecture: "x64"
-      - uses: eWaterCycle/setup-singularity@931d4e31109e875b13309ae1d07c70ca8fbc8537 # v7
+
+      - name: Setup Apptainer
+        uses: eWaterCycle/setup-apptainer@4bb22c52d4f63406c49e94c804632975787312b3 # v2.0.0
         with:
-          singularity-version: 3.8.3
+          apptainer-version: 1.3.4
 
       - name: Install dependencies
         run: |
@@ -54,33 +56,64 @@ jobs:
           echo "REPOTITLE_LOWERCASE=$(basename ${GITHUB_REPOSITORY,,})" >> ${GITHUB_ENV}
           echo "REPO_BRANCH=${{ github.event.inputs.testbranch || 'dev' }}" >> ${GITHUB_ENV}
 
+      - name: Make a cache directory for the container images
+        run: |
+          mkdir -p ./singularity_container_images
+
       - name: Download the pipeline
         env:
-          NXF_SINGULARITY_CACHEDIR: ./
+          NXF_SINGULARITY_CACHEDIR: ./singularity_container_images
         run: |
-          nf-core download ${{ env.REPO_LOWERCASE }} \
+          nf-core pipelines download ${{ env.REPO_LOWERCASE }} \
           --revision ${{ env.REPO_BRANCH }} \
           --outdir ./${{ env.REPOTITLE_LOWERCASE }} \
           --compress "none" \
           --container-system 'singularity' \
-          --container-library "quay.io" -l "docker.io" -l "ghcr.io" \
+          --container-library "quay.io" -l "docker.io" -l "community.wave.seqera.io" \
           --container-cache-utilisation 'amend' \
-          --download-configuration
+          --download-configuration 'yes'
 
       - name: Inspect download
         run: tree ./${{ env.REPOTITLE_LOWERCASE }}
 
+      - name: Count the downloaded number of container images
+        id: count_initial
+        run: |
+          image_count=$(ls -1 ./singularity_container_images | wc -l | xargs)
+          echo "Initial container image count: $image_count"
+          echo "IMAGE_COUNT_INITIAL=$image_count" >> ${GITHUB_ENV}
+
       - name: Run the downloaded pipeline (stub)
         id: stub_run_pipeline
         continue-on-error: true
         env:
-          NXF_SINGULARITY_CACHEDIR: ./
+          NXF_SINGULARITY_CACHEDIR: ./singularity_container_images
           NXF_SINGULARITY_HOME_MOUNT: true
         run: nextflow run ./${{ env.REPOTITLE_LOWERCASE }}/$( sed 's/\W/_/g' <<< ${{ env.REPO_BRANCH }}) -stub -profile test,singularity --outdir ./results
       - name: Run the downloaded pipeline (stub run not supported)
         id: run_pipeline
         if: ${{ job.steps.stub_run_pipeline.status == failure() }}
         env:
-          NXF_SINGULARITY_CACHEDIR: ./
+          NXF_SINGULARITY_CACHEDIR: ./singularity_container_images
           NXF_SINGULARITY_HOME_MOUNT: true
         run: nextflow run ./${{ env.REPOTITLE_LOWERCASE }}/$( sed 's/\W/_/g' <<< ${{ env.REPO_BRANCH }}) -profile test,singularity --outdir ./results
+
+      - name: Count the downloaded number of container images
+        id: count_afterwards
+        run: |
+          image_count=$(ls -1 ./singularity_container_images | wc -l | xargs)
+          echo "Post-pipeline run container image count: $image_count"
+          echo "IMAGE_COUNT_AFTER=$image_count" >> ${GITHUB_ENV}
+
+      - name: Compare container image counts
+        run: |
+          if [ "${{ env.IMAGE_COUNT_INITIAL }}" -ne "${{ env.IMAGE_COUNT_AFTER }}" ]; then
+            initial_count=${{ env.IMAGE_COUNT_INITIAL }}
+            final_count=${{ env.IMAGE_COUNT_AFTER }}
+            difference=$((final_count - initial_count))
+            echo "$difference additional container images were \n downloaded at runtime . The pipeline has no support for offline runs!"
+            tree ./singularity_container_images
+            exit 1
+          else
+            echo "The pipeline can be downloaded successfully!"
+          fi
diff --git a/.github/workflows/linting.yml b/.github/workflows/linting.yml
index 1fcafe8805..a502573c5a 100644
--- a/.github/workflows/linting.yml
+++ b/.github/workflows/linting.yml
@@ -1,6 +1,6 @@
 name: nf-core linting
 # This workflow is triggered on pushes and PRs to the repository.
-# It runs the `nf-core lint` and markdown lint tests to ensure
+# It runs the `nf-core pipelines lint` and markdown lint tests to ensure
 # that the code meets the nf-core guidelines.
 on:
   push:
@@ -41,17 +41,32 @@ jobs:
           python-version: "3.12"
           architecture: "x64"
 
+      - name: read .nf-core.yml
+        uses: pietrobolcato/action-read-yaml@1.1.0
+        id: read_yml
+        with:
+          config: ${{ github.workspace }}/.nf-core.yml
+
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
-          pip install nf-core
+          pip install nf-core==${{ steps.read_yml.outputs['nf_core_version'] }}
+
+      - name: Run nf-core pipelines lint
+        if: ${{ github.base_ref != 'master' }}
+        env:
+          GITHUB_COMMENTS_URL: ${{ github.event.pull_request.comments_url }}
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          GITHUB_PR_COMMIT: ${{ github.event.pull_request.head.sha }}
+        run: nf-core -l lint_log.txt pipelines lint --dir ${GITHUB_WORKSPACE} --markdown lint_results.md
 
-      - name: Run nf-core lint
+      - name: Run nf-core pipelines lint --release
+        if: ${{ github.base_ref == 'master' }}
         env:
           GITHUB_COMMENTS_URL: ${{ github.event.pull_request.comments_url }}
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
           GITHUB_PR_COMMIT: ${{ github.event.pull_request.head.sha }}
-        run: nf-core -l lint_log.txt lint --dir ${GITHUB_WORKSPACE} --markdown lint_results.md
+        run: nf-core -l lint_log.txt pipelines lint --release --dir ${GITHUB_WORKSPACE} --markdown lint_results.md
 
       - name: Save PR number
         if: ${{ always() }}
diff --git a/.github/workflows/linting_comment.yml b/.github/workflows/linting_comment.yml
index 40acc23f5b..42e519bfac 100644
--- a/.github/workflows/linting_comment.yml
+++ b/.github/workflows/linting_comment.yml
@@ -11,7 +11,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - name: Download lint results
-        uses: dawidd6/action-download-artifact@09f2f74827fd3a8607589e5ad7f9398816f540fe # v3
+        uses: dawidd6/action-download-artifact@bf251b5aa9c2f7eeb574a96ee720e24f801b7c11 # v6
         with:
           workflow: linting.yml
           workflow_conclusion: completed
diff --git a/.github/workflows/pytest.yml b/.github/workflows/pytest.yml
new file mode 100644
index 0000000000..bd1dcae14c
--- /dev/null
+++ b/.github/workflows/pytest.yml
@@ -0,0 +1,213 @@
+name: pytest-workflow
+# This workflow runs the pipeline with the minimal test dataset to check that it completes without any syntax errors
+on:
+  pull_request:
+  release:
+    types: [published]
+  merge_group:
+    types:
+      - checks_requested
+    branches:
+      - master
+      - dev
+
+env:
+  NXF_ANSI_LOG: false
+  TEST_DATA_BASE: "${{ github.workspace }}/test-datasets"
+  SENTIEON_LICENSE_BASE64: ${{ secrets.SENTIEON_LICENSE_BASE64 }}
+  NXF_SINGULARITY_CACHEDIR: ${{ github.workspace }}/.singularity
+  NXF_SINGULARITY_LIBRARYDIR: ${{ github.workspace }}/.singularity
+
+# Cancel if a newer run is started
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  pytest-changes:
+    name: Check for changes (pytest)
+    runs-on: ubuntu-latest
+    outputs:
+      # Expose matched filters as job 'tags' output variable
+      tags: ${{ steps.filter.outputs.changes }}
+    steps:
+      - uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4
+
+      - uses: frouioui/paths-filter@main
+        id: filter
+        with:
+          filters: "tests/config/pytesttags.yml"
+          token: ""
+
+  pytest:
+    name: ${{ matrix.tags }} ${{ matrix.profile }} NF ${{ matrix.NXF_VER }}
+    runs-on: ubuntu-latest
+    needs: pytest-changes
+    if: needs.pytest-changes.outputs.tags != '[]'
+    strategy:
+      fail-fast: false
+      matrix:
+        tags: ["${{ fromJson(needs.pytest-changes.outputs.tags) }}"]
+        profile: ["docker"]
+        # profile: ["docker", "singularity", "conda"]
+        TEST_DATA_BASE:
+          - "test-datasets/data"
+        NXF_VER:
+          - "24.04.2"
+          - "latest-everything"
+        exclude:
+          - tags: "sentieon/bwamem"
+          - tags: "sentieon/dedup"
+          - tags: "sentieon/dnascope"
+          - tags: "sentieon/dnascope_joint_germline"
+          - tags: "sentieon/dnascope_skip_filter"
+          - tags: "sentieon/haplotyper"
+          - tags: "sentieon/haplotyper_joint_germline"
+          - tags: "sentieon/haplotyper_skip_filter"
+          - NXF_VER: "latest-everything"
+            tags: "joint_germline"
+
+    steps:
+      - name: Check out pipeline code
+        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4
+
+      - name: Hash Github Workspace
+        id: hash_workspace
+        run: |
+          echo "digest=$(echo sarek3_${{ github.workspace }} | md5sum | cut -c 1-25)" >> $GITHUB_OUTPUT
+
+      - name: Set up Python
+        uses: actions/setup-python@f677139bbe7f9c59b41e40162b753c062f5d49a3 # v5
+        with:
+          python-version: "3.11"
+          cache: "pip"
+          cache-dependency-path: |
+            **/requirements.txt
+
+      - name: Install Python dependencies
+        run: pip install --upgrade -r tests/requirements.txt
+
+      - uses: actions/setup-java@8df1039502a15bceb9433410b1a100fbe190c53b # v4
+        with:
+          distribution: "temurin"
+          java-version: "17"
+
+      - name: Install Nextflow ${{ matrix.NXF_VER }}
+        uses: nf-core/setup-nextflow@v2
+        with:
+          version: "${{ matrix.NXF_VER }}"
+
+      - name: Setup apptainer
+        if: matrix.profile == 'singularity'
+        uses: eWaterCycle/setup-apptainer@main
+
+      - name: Set up Singularity
+        if: matrix.profile == 'singularity'
+        run: |
+          mkdir -p $NXF_SINGULARITY_CACHEDIR
+          mkdir -p $NXF_SINGULARITY_LIBRARYDIR
+
+      - name: Set up miniconda
+        if: matrix.profile == 'conda'
+        uses: conda-incubator/setup-miniconda@a4260408e20b96e80095f42ff7f1a15b27dd94ca # v3
+        with:
+          miniconda-version: "latest"
+          auto-update-conda: true
+          channels: conda-forge,bioconda
+
+      - name: Conda setup
+        if: matrix.profile == 'conda'
+        run: |
+          conda clean -a
+          conda install -n base conda-libmamba-solver
+          conda config --set solver libmamba
+          echo $(realpath $CONDA)/condabin >> $GITHUB_PATH
+          echo $(realpath python) >> $GITHUB_PATH
+
+      - name: Cache test data
+        id: cache-testdata
+        uses: actions/cache@0c45773b623bea8c8e75f6c82b208c3cf94ea4f9 # v4
+        with:
+          path: test-datasets/
+          key: ${{ steps.hash_workspace.outputs.digest }}
+
+      - name: Check out test data
+        if: steps.cache-testdata.outputs.cache-hit != 'true'
+        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4
+        with:
+          repository: nf-core/test-datasets
+          ref: sarek3
+          path: test-datasets/
+
+      - name: Replace remote paths in samplesheets
+        run: |
+          for f in tests/csv/3.0/*csv; do
+            sed -i "s=https://raw.githubusercontent.com/nf-core/test-datasets/modules/=${{ github.workspace }}/test-datasets/=g" $f
+            echo "========== $f ============"
+            cat $f
+            echo "========================================"
+          done;
+
+      # Set up secrets
+      - name: Set up nextflow secrets
+        if: env.SENTIEON_LICENSE_BASE64 != null
+        run: |
+          nextflow secrets set SENTIEON_LICENSE_BASE64 ${{ secrets.SENTIEON_LICENSE_BASE64 }}
+          nextflow secrets set SENTIEON_AUTH_MECH_BASE64 ${{ secrets.SENTIEON_AUTH_MECH_BASE64 }}
+          SENTIEON_ENCRYPTION_KEY=$(echo -n "${{ secrets.ENCRYPTION_KEY_BASE64 }}" | base64 -d)
+          SENTIEON_LICENSE_MESSAGE=$(echo -n "${{ secrets.LICENSE_MESSAGE_BASE64 }}" | base64 -d)
+          SENTIEON_AUTH_DATA=$(python bin/license_message.py encrypt --key "$SENTIEON_ENCRYPTION_KEY" --message "$SENTIEON_LICENSE_MESSAGE")
+          SENTIEON_AUTH_DATA_BASE64=$(echo -n "$SENTIEON_AUTH_DATA" | base64 -w 0)
+          nextflow secrets set SENTIEON_AUTH_DATA_BASE64 $SENTIEON_AUTH_DATA_BASE64
+
+      - name: Conda clean
+        if: matrix.profile == 'conda'
+        run: conda clean -a
+
+      - name: Disk space cleanup
+        uses: jlumbroso/free-disk-space@v1.3.1
+
+      - name: Run pytest-workflow
+        uses: Wandalen/wretry.action@v1
+        with:
+          command: TMPDIR=~ PROFILE=${{ matrix.profile }} pytest --tag ${{ matrix.tags }} --symlink --kwdof --git-aware --color=yes
+          attempt_limit: 3
+
+      - name:
Output log on failure + if: failure() + run: | + sudo apt install bat > /dev/null + batcat --decorations=always --color=always /home/runner/pytest_workflow_*/*/log.{out,err} + + - name: Upload logs on failure + if: failure() + uses: actions/upload-artifact@50769540e7f4bd5e21e526ee35c689e35e0d6874 # v4 + with: + name: logs-${{ matrix.profile }} + path: | + /home/ubuntu/pytest_workflow_*/*/.nextflow.log + /home/ubuntu/pytest_workflow_*/*/log.out + /home/ubuntu/pytest_workflow_*/*/log.err + /home/ubuntu/pytest_workflow_*/*/work + !/home/ubuntu/pytest_workflow_*/*/work/conda + !/home/ubuntu/pytest_workflow_*/*/work/singularity + !${{ github.workspace }}/.singularity + + confirm-pass: + runs-on: ubuntu-latest + needs: + - pytest + if: always() + steps: + - name: All tests ok + if: ${{ success() || !contains(needs.*.result, 'failure') }} + run: exit 0 + - name: One or more tests failed + if: ${{ contains(needs.*.result, 'failure') }} + run: exit 1 + + - name: debug-print + if: always() + run: | + echo "toJSON(needs) = ${{ toJSON(needs) }}" + echo "toJSON(needs.*.result) = ${{ toJSON(needs.*.result) }}" diff --git a/.github/workflows/release-announcements.yml b/.github/workflows/release-announcements.yml index 03ecfcf720..c6ba35df48 100644 --- a/.github/workflows/release-announcements.yml +++ b/.github/workflows/release-announcements.yml @@ -12,7 +12,7 @@ jobs: - name: get topics and convert to hashtags id: get_topics run: | - echo "topics=$(curl -s https://nf-co.re/pipelines.json | jq -r '.remote_workflows[] | select(.full_name == "${{ github.repository }}") | .topics[]' | awk '{print "#"$0}' | tr '\n' ' ')" >> $GITHUB_OUTPUT + echo "topics=$(curl -s https://nf-co.re/pipelines.json | jq -r '.remote_workflows[] | select(.full_name == "${{ github.repository }}") | .topics[]' | awk '{print "#"$0}' | tr '\n' ' ')" | sed 's/-//g' >> $GITHUB_OUTPUT - uses: rzr/fediverse-action@master with: diff --git a/.github/workflows/template_version_comment.yml 
b/.github/workflows/template_version_comment.yml new file mode 100644 index 0000000000..e8aafe44d6 --- /dev/null +++ b/.github/workflows/template_version_comment.yml @@ -0,0 +1,46 @@ +name: nf-core template version comment +# This workflow is triggered on PRs to check if the pipeline template version matches the latest nf-core version. +# It posts a comment to the PR, even if it comes from a fork. + +on: pull_request_target + +jobs: + template_version: + runs-on: ubuntu-latest + steps: + - name: Check out pipeline code + uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4 + with: + ref: ${{ github.event.pull_request.head.sha }} + + - name: Read template version from .nf-core.yml + uses: nichmor/minimal-read-yaml@v0.0.2 + id: read_yml + with: + config: ${{ github.workspace }}/.nf-core.yml + + - name: Install nf-core + run: | + python -m pip install --upgrade pip + pip install nf-core==${{ steps.read_yml.outputs['nf_core_version'] }} + + - name: Check nf-core outdated + id: nf_core_outdated + run: echo "OUTPUT=$(pip list --outdated | grep nf-core)" >> ${GITHUB_ENV} + + - name: Post nf-core template version comment + uses: mshick/add-pr-comment@b8f338c590a895d50bcbfa6c5859251edc8952fc # v2 + if: | + contains(env.OUTPUT, 'nf-core') + with: + repo-token: ${{ secrets.NF_CORE_BOT_AUTH_TOKEN }} + allow-repeats: false + message: | + > [!WARNING] + > Newer version of the nf-core template is available. + > + > Your pipeline is using an old version of the nf-core template: ${{ steps.read_yml.outputs['nf_core_version'] }}. + > Please update your pipeline to the latest version. + > + > For more documentation on how to update your pipeline, please see the [nf-core documentation](https://github.com/nf-core/tools?tab=readme-ov-file#sync-a-pipeline-with-the-template) and [Synchronisation documentation](https://nf-co.re/docs/contributing/sync). 
+ # diff --git a/.gitignore b/.gitignore index c807bd5d3f..9cc2a80834 100644 --- a/.gitignore +++ b/.gitignore @@ -6,6 +6,7 @@ results/ testing/ testing* *.pyc +null/ *.code-workspace .nf-test* .nf-test/ diff --git a/.gitpod.yml b/.gitpod.yml index 105a1821a1..4611863760 100644 --- a/.gitpod.yml +++ b/.gitpod.yml @@ -4,17 +4,14 @@ tasks: command: | pre-commit install --install-hooks nextflow self-update - - name: unset JAVA_TOOL_OPTIONS - command: | - unset JAVA_TOOL_OPTIONS vscode: extensions: # based on nf-core.nf-core-extensionpack - - esbenp.prettier-vscode # Markdown/CommonMark linting and style checking for Visual Studio Code + #- esbenp.prettier-vscode # Markdown/CommonMark linting and style checking for Visual Studio Code - EditorConfig.EditorConfig # override user/workspace settings with settings found in .editorconfig files - Gruntfuggly.todo-tree # Display TODO and FIXME in a tree view in the activity bar - mechatroner.rainbow-csv # Highlight columns in csv files in different colors - # - nextflow.nextflow # Nextflow syntax highlighting + - nextflow.nextflow # Nextflow syntax highlighting - oderwat.indent-rainbow # Highlight indentation level - streetsidesoftware.code-spell-checker # Spelling checker for source code - charliermarsh.ruff # Code linter Ruff diff --git a/.nf-core.yml b/.nf-core.yml index e0b3aa1f76..7beff01213 100644 --- a/.nf-core.yml +++ b/.nf-core.yml @@ -1,16 +1,28 @@ -repository_type: pipeline -nf_core_version: "2.14.1" +bump_version: null lint: - actions_ci: False + actions_ci: false files_exist: - .github/workflows/awsfulltest.yml - .github/workflows/awstest.yml - conf/modules.config files_unchanged: - .gitignore - - .github/PULL_REQUEST_TEMPLATE.md - assets/nf-core-sarek_logo_light.png - docs/images/nf-core-sarek_logo_dark.png - docs/images/nf-core-sarek_logo_light.png - modules_config: False - template_strings: False + modules_config: false + template_strings: false +nf_core_version: 3.0.2 +org_path: null +repository_type: pipeline 
+template: + author: Maxime Garcia, Szilveszter Juhos, Friederike Hanssen + description: An open-source analysis pipeline to detect germline or somatic variants from whole genome or targeted sequencing + force: false + is_nfcore: true + name: sarek + org: nf-core + outdir: . + skip_features: null + version: 3.5.0dev +update: null diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 4dc0f1dcd7..9e9f0e1c4e 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -7,7 +7,7 @@ repos: - prettier@3.2.5 - repo: https://github.com/editorconfig-checker/editorconfig-checker.python - rev: "2.7.3" + rev: "3.0.3" hooks: - id: editorconfig-checker alias: ec diff --git a/.vscode/settings.json b/.vscode/settings.json new file mode 100644 index 0000000000..e810756abd --- /dev/null +++ b/.vscode/settings.json @@ -0,0 +1,3 @@ +{ + "nextflow.formatting.harshilAlignment": true +} diff --git a/CHANGELOG.md b/CHANGELOG.md index c80878bad6..abdf922c27 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,92 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [3.5.0](https://github.com/nf-core/sarek/releases/tag/3.5.0) - Áhkájiegna + +A set of connecting glaciers. + +### Added + +- [1613](https://github.com/nf-core/sarek/pull/1613) - add indexcov +- [1638](https://github.com/nf-core/sarek/pull/1638) - Added additional documentation detailing ASCAT WES usage. 
+- [1640](https://github.com/nf-core/sarek/pull/1620) - Add `lofreq` as a tumor-only variant caller +- [1642](https://github.com/nf-core/sarek/pull/1642) - Back to dev +- [1653](https://github.com/nf-core/sarek/pull/1653) - Updates `sarek_subway` files with `lofreq` +- [1660](https://github.com/nf-core/sarek/pull/1642) - Add `--length_required` for minimal reads length with `FASTP` +- [1663](https://github.com/nf-core/sarek/pull/1663) - Massive conda modules update +- [1664](https://github.com/nf-core/sarek/pull/1664) - Check if flowcell ID matches for read pair +- [1730](https://github.com/nf-core/sarek/pull/1730) - Enable Harshil Alignment™️ in VS Code workspace settings + +### Changed + +- [1579](https://github.com/nf-core/sarek/pull/1579) - Update Sentieon usage docs +- [1635](https://github.com/nf-core/sarek/pull/1635) - Fix docs to reflect variant calling tool - data type correctly +- [1668](https://github.com/nf-core/sarek/pull/1668) - Add nf-test sharding CI +- [1669](https://github.com/nf-core/sarek/pull/1669) - Better nf-test pipeline level tests +- [1677](https://github.com/nf-core/sarek/pull/1677) - Migrate pytest aligner and pipeline default tests to nf-test +- [1680](https://github.com/nf-core/sarek/pull/1680) - Template update for nf-core/tools v3.0.0 +- [1681](https://github.com/nf-core/sarek/pull/1681) - Template update for nf-core/tools v3.0.1 +- [1686](https://github.com/nf-core/sarek/pull/1686) - Template update for nf-core/tools v3.0.2 +- [1692](https://github.com/nf-core/sarek/pull/1692) - Update ensemblvep +- [1695](https://github.com/nf-core/sarek/pull/1695) - Update all modules +- [1707](https://github.com/nf-core/sarek/pull/1707) - Un-hide parameters and clean up Json schema +- [1708](https://github.com/nf-core/sarek/pull/1708) - Migrate pipeline pytest alignment and annotation tests to nf-test +- [1711](https://github.com/nf-core/sarek/pull/1711) - Migrate pipeline pytest strelka tests to nf-test +- 
[1731](https://github.com/nf-core/sarek/pull/1731) - Migrate pipeline pytest controlfreec tests to nf-test + +### Fixed + +- [1624](https://github.com/nf-core/sarek/pull/1624) - Fix channel stalling for bcftools index +- [1657](https://github.com/nf-core/sarek/pull/1657) - Update all actions used in the GHA CI +- [1661](https://github.com/nf-core/sarek/pull/1661) - nf-test pipeline level tests +- [1673](https://github.com/nf-core/sarek/pull/1673) - Print warning message instead of silent error with Nextflow versions prior to 24.08.0edge +- [1693](https://github.com/nf-core/sarek/pull/1693) - Fixes flowcell retrieval during samplesheet parsing +- [1694](https://github.com/nf-core/sarek/pull/1694) - Fix manifest DOI display on CLI +- [1695](https://github.com/nf-core/sarek/pull/1695) - Fix and update input_schema.json +- [1702](https://github.com/nf-core/sarek/pull/1702) - Update nf-schema tests that were not failing on lenient mode +- [1712](https://github.com/nf-core/sarek/pull/1712) - Fix missing import statements on error messages when starting without samplesheet +- [1743](https://github.com/nf-core/sarek/pull/1743) - Add setup java 17 in GHA for latest Nextflow version +- [1745](https://github.com/nf-core/sarek/pull/1745) - Fix bug where workflow can hang if the email parameter is set +- [1746](https://github.com/nf-core/sarek/pull/1746) - Fix Sentieon module inputs +- [1752](https://github.com/nf-core/sarek/pull/1752) - Add `indexcov` and `lofreq` to full size tests. Amend overview figures. 
+- [1754](https://github.com/nf-core/sarek/pull/1754) - Fix test string +- [1755](https://github.com/nf-core/sarek/pull/1755) - Remove `default` channel and name from local modules + +### Removed + +- [1656](https://github.com/nf-core/sarek/pull/1656) - Retiring parameter `snpeff_genome` +- [1709](https://github.com/nf-core/sarek/pull/1709) - Remove `Strelka` tumor-only somatic variant calling +- [1728](https://github.com/nf-core/sarek/pull/1728) - Remove BAM to CRAM conversion of input files for post-alignment entry points + +### Dependencies + +| Dependency | Old version | New version | +| ------------- | ----------- | ----------- | +| `coreutils` | 8.30 | 9.5 | +| `deepvariant` | 1.5.0 | 1.6.1 | +| `ensemblvep` | 111.0 | 113.0 | +| `fgbio` | 2.0.2 | 2.1.2 | +| `gawk` | 5.1.0 | 5.3.0 | +| `htslib` | 1.20 | 1.21 | +| `lofreq` | | 2.1.5 | +| `multiqc` | 1.21 | 1.25.1 | +| `samtools` | 1.20 | 1.21 | +| `sentieon` | 202308.02 | 202308.03 | +| `svdb` | 2.8.1 | 2.8.2 | + +### Parameters + +| Params | Status | +| ------------------------------------ | ------- | +| `--help_full` | New | +| `--show_hidden` | New | +| `--snpeff_db` | Updated | +| `--snpeff_genome` | Removed | +| `--validationFailUnrecognisedParams` | Removed | +| `--validationLenientMode` | Removed | +| `--validationSchemaIgnoreParams` | Removed | +| `--validationShowHiddenParams` | Removed | + ## [3.4.4](https://github.com/nf-core/sarek/releases/tag/3.4.4) - Ruopsokjåkhå Ruopsokjåkhå is another peak of the Pårte massif. @@ -39,7 +125,7 @@ Loametjåhkkå is another one of the main peaks of the Pårte massif. 
### Added - [#1502](https://github.com/nf-core/sarek/pull/1502) - export CNVs into VCF format in `bam_variant_calling_cnvkit` -- [#1534](https://github.com/nf-core/sarek/pull/1534), [#1573](https://github.com/nf-core/sarek/pull/1573) - Handling `.fastq.gz.spring` files as input +- [#1534](https://github.com/nf-core/sarek/pull/1534), [#1573](https://github.com/nf-core/sarek/pull/1573), [#1734](https://github.com/nf-core/sarek/pull/1534) - Handling `.fastq.gz.spring` files as input - [#1593](https://github.com/nf-core/sarek/pull/1593) - Prepare release `3.4.2` ### Changed diff --git a/CITATIONS.md b/CITATIONS.md index f72966d98c..1c4a22cade 100644 --- a/CITATIONS.md +++ b/CITATIONS.md @@ -146,6 +146,10 @@ > Danecek P, Auton A, Abecasis G, et al.: The variant call format and VCFtools. Bioinformatics. 2011 Aug 1;27(15):2156-8. doi: 10.1093/bioinformatics/btr330. Epub 2011 Jun 7. PubMed PMID: 21653522; PubMed Central PMCID: PMC3137218. +- [Lofreq](https://pubmed.ncbi.nlm.nih.gov/23066108/) + + > Wilm et al. LoFreq: A sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets. Nucleic Acids Res. 2012; 40(22):11189-201. 
+ ## R packages - [R](https://www.R-project.org/) diff --git a/README.md b/README.md index 4afd61518c..c4f7b5443b 100644 --- a/README.md +++ b/README.md @@ -12,7 +12,7 @@ [![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.3476425-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.3476425) [![nf-test](https://img.shields.io/badge/unit_tests-nf--test-337ab7.svg)](https://www.nf-test.com) -[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A523.04.0-23aa62.svg)](https://www.nextflow.io/) +[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A524.04.2-23aa62.svg)](https://www.nextflow.io/) [![run with conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/) [![run with docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/) [![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/) @@ -54,12 +54,14 @@ Depending on the options and samples provided, the pipeline can currently perfor - `freebayes` - `GATK HaplotypeCaller` - `Manta` + - `indexcov` - `mpileup` - `MSIsensor-pro` - `Mutect2` - `Sentieon Haplotyper` - `Strelka2` - `TIDDIT` + - `Lofreq` - Variant filtering and annotation (`SnpEff`, `Ensembl VEP`, `BCFtools annotate`) - Summarise and represent QC (`MultiQC`) @@ -93,8 +95,7 @@ nextflow run nf-core/sarek \ ``` > [!WARNING] -> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; -> see [docs](https://nf-co.re/usage/configuration#custom-configuration-files). +> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. 
Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; see [docs](https://nf-co.re/docs/usage/getting_started/configuration#custom-configuration-files). For more details and further functionality, please refer to the [usage documentation](https://nf-co.re/sarek/usage) and the [parameter documentation](https://nf-co.re/sarek/parameters). @@ -131,6 +132,7 @@ We thank the following people for their extensive assistance in the development - [Abhinav Sharma](https://github.com/abhi18av) - [Adam Talbot](https://github.com/adamrtalbot) - [Adrian Lärkeryd](https://github.com/adrlar) +- [Àitor Olivares](https://github.com/AitorPeseta) - [Alexander Peltzer](https://github.com/apeltzer) - [Alison Meynert](https://github.com/ameynert) - [Anders Sune Pedersen](https://github.com/asp8200) @@ -144,6 +146,7 @@ We thank the following people for their extensive assistance in the development - [Edmund Miller](https://github.com/edmundmiller) - [Famke Bäuerle](https://github.com/famosab) - [Francesco Lescai](https://github.com/lescai) +- [Francisco Martínez](https://github.com/nevinwu) - [Gavin Mackenzie](https://github.com/GCJMackenzie) - [Gisela Gabernet](https://github.com/ggabernet) - [Grant Neilson](https://github.com/grantn5) @@ -169,6 +172,7 @@ We thank the following people for their extensive assistance in the development - [pallolason](https://github.com/pallolason) - [Paul Cantalupo](https://github.com/pcantalupo) - [Phil Ewels](https://github.com/ewels) +- [Pierre Lindenbaum](https://github.com/lindenb) - [Sabrina Krakau](https://github.com/skrakau) - [Sam Minot](https://github.com/sminot) - [Sebastian-D](https://github.com/Sebastian-D) diff --git a/assets/multiqc_config.yml b/assets/multiqc_config.yml index e4512a603e..4165fec686 100644 --- a/assets/multiqc_config.yml +++ b/assets/multiqc_config.yml @@ -3,9 +3,9 @@ custom_logo_url: https://github.com/nf-core/sarek/ custom_logo_title: 
"nf-core/sarek" report_comment: > - This report has been generated by the nf-core/sarek + This report has been generated by the nf-core/sarek analysis pipeline. For information about how to interpret these results, please see the - documentation. + documentation. report_section_order: "nf-core-sarek-methods-description": order: -1000 diff --git a/assets/schema_input.json b/assets/schema_input.json index ce010b51dd..4284b97430 100644 --- a/assets/schema_input.json +++ b/assets/schema_input.json @@ -1,5 +1,5 @@ { - "$schema": "http://json-schema.org/draft-07/schema", + "$schema": "https://json-schema.org/draft/2020-12/schema", "$id": "https://raw.githubusercontent.com/nf-core/sarek/master/assets/schema_input.json", "title": "nf-core/sarek pipeline - params.input schema", "description": "Schema for the file provided with params.input", @@ -23,191 +23,89 @@ "errorMessage": "Sex cannot contain spaces", "meta": ["sex"], "default": "NA", - "anyOf": [ - { - "type": "string", - "pattern": "^\\S+$" - }, - { - "type": "string", - "maxLength": 0 - } - ] + "type": "string", + "pattern": "^\\S+$" }, "status": { "type": "integer", "errorMessage": "Status can only be 0 (normal) or 1 (tumor). 
Defaults to 0, if none is supplied.", "meta": ["status"], - "default": "0", + "default": 0, "minimum": 0, "maximum": 1 }, "lane": { "type": "string", "pattern": "^\\S+$", - "unique": ["patient", "sample"], - "anyOf": [ - { - "dependentRequired": ["bam"] - }, - { - "dependentRequired": ["fastq_1"] - }, - { - "dependentRequired": ["spring_1"] - } - ], "meta": ["lane"] }, "fastq_1": { - "errorMessage": "Gzipped FastQ file for reads 1 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'", - "anyOf": [ - { - "type": "string", - "pattern": "^\\S+\\.f(ast)?q\\.gz$" - }, - { - "type": "string", - "maxLength": 0 - } - ], + "errorMessage": "FastQ file for reads 1 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'", + "type": "string", + "pattern": "^\\S+\\.f(ast)?q\\.gz$", "format": "file-path", "exists": true }, "fastq_2": { - "errorMessage": "Gzipped FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'", - "dependentRequired": ["fastq_1"], - "anyOf": [ - { - "type": "string", - "pattern": "^\\S+\\.f(ast)?q\\.gz$" - }, - { - "type": "string", - "maxLength": 0 - } - ], + "errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'", + "type": "string", + "pattern": "^\\S+\\.f(ast)?q\\.gz$", "format": "file-path", "exists": true }, "spring_1": { "errorMessage": "Gzipped and spring-compressed FastQ file for reads 1 cannot contain spaces and must have extension '.fq.gz.spring' or '.fastq.gz.spring'", - "anyOf": [ - { - "type": "string", - "pattern": "^\\S+\\.f(ast)?q\\.gz.spring$" - }, - { - "type": "string", - "maxLength": 0 - } - ], + "type": "string", + "pattern": "^\\S+\\.f(ast)?q\\.gz.spring$", "format": "file-path", "exists": true }, "spring_2": { "errorMessage": "Gzipped and spring-compressed FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz.spring' or '.fastq.gz.spring'", - "dependentRequired": ["spring_1"], - "anyOf": [ 
- { - "type": "string", - "pattern": "^\\S+\\.f(ast)?q\\.gz.spring$" - }, - { - "type": "string", - "maxLength": 0 - } - ], + "type": "string", + "pattern": "^\\S+\\.f(ast)?q\\.gz.spring$", "format": "file-path", "exists": true }, "table": { "errorMessage": "Recalibration table cannot contain spaces and must have extension '.table'", - "anyOf": [ - { - "type": "string", - "pattern": "^\\S+\\.table$" - }, - { - "type": "string", - "maxLength": 0 - } - ], + "type": "string", + "pattern": "^\\S+\\.table$", "format": "file-path", "exists": true }, "cram": { "errorMessage": "CRAM file cannot contain spaces and must have extension '.cram'", - "anyOf": [ - { - "type": "string", - "pattern": "^\\S+\\.cram$" - }, - { - "type": "string", - "maxLength": 0 - } - ], + "type": "string", + "pattern": "^\\S+\\.cram$", "format": "file-path", "exists": true }, "crai": { "errorMessage": "CRAM index file cannot contain spaces and must have extension '.crai'", - "anyOf": [ - { - "type": "string", - "pattern": "^\\S+\\.crai$" - }, - { - "type": "string", - "maxLength": 0 - } - ], + "type": "string", + "pattern": "^\\S+\\.crai$", "format": "file-path", "exists": true }, "bam": { "errorMessage": "BAM file cannot contain spaces and must have extension '.bam'", - "anyOf": [ - { - "type": "string", - "pattern": "^\\S+\\.bam$" - }, - { - "type": "string", - "maxLength": 0 - } - ], + "type": "string", + "pattern": "^\\S+\\.bam$", "format": "file-path", "exists": true }, "bai": { "errorMessage": "BAM index file cannot contain spaces and must have extension '.bai'", - "anyOf": [ - { - "type": "string", - "pattern": "^\\S+\\.bai$" - }, - { - "type": "string", - "maxLength": 0 - } - ], + "type": "string", + "pattern": "^\\S+\\.bai$", "format": "file-path", "exists": true }, "vcf": { "errorMessage": "VCF file for reads 1 cannot contain spaces and must have extension '.vcf' or '.vcf.gz'", - "anyOf": [ - { - "type": "string", - "pattern": "^\\S+\\.vcf(\\.gz)?$" - }, - { - "type": "string", - 
"maxLength": 0 - } - ], + "type": "string", + "pattern": "^\\S+\\.vcf(\\.gz)?$", "format": "file-path", "exists": true }, @@ -215,6 +113,28 @@ "type": "string" } }, - "required": ["patient", "sample"] + "anyOf": [ + { + "dependentRequired": { + "lane": ["fastq_1"] + } + }, + { + "dependentRequired": { + "lane": ["spring_1"] + } + }, + { + "dependentRequired": { + "lane": ["bam"] + } + } + ], + "dependentRequired": { + "fastq_2": ["fastq_1"], + "spring_2": ["spring_1"] + }, + "required": ["patient", "sample"], + "uniqueEntries": ["lane", "patient", "sample"] } } diff --git a/conf/base.config b/conf/base.config index cf19c7081c..667218480e 100644 --- a/conf/base.config +++ b/conf/base.config @@ -9,10 +9,11 @@ */ process { - cpus = { check_max( 1 * task.attempt, 'cpus' ) } - memory = { check_max( 6.GB * task.attempt, 'memory' ) } - time = { check_max( 4.h * task.attempt, 'time' ) } - shell = ['/bin/bash', '-euo', 'pipefail'] + + // TODO nf-core: Check the defaults for all processes + cpus = { 1 * task.attempt } + memory = { 6.GB * task.attempt } + time = { 4.h * task.attempt } // memory errors which should be retried. otherwise error out errorStrategy = { task.exitStatus in ((130..145) + 104) ? 
'retry' : 'finish' } @@ -29,76 +30,76 @@ process { maxRetries = 2 } withLabel:process_single { - cpus = { check_max( 1 , 'cpus' ) } - memory = { check_max( 6.GB * task.attempt, 'memory' ) } - time = { check_max( 4.h * task.attempt, 'time' ) } + cpus = { 1 } + memory = { 6.GB * task.attempt } + time = { 4.h * task.attempt } } withLabel:process_low { - cpus = { check_max( 2 * task.attempt, 'cpus' ) } - memory = { check_max( 12.GB * task.attempt, 'memory' ) } - time = { check_max( 4.h * task.attempt, 'time' ) } + cpus = { 2 * task.attempt } + memory = { 12.GB * task.attempt } + time = { 4.h * task.attempt } } withLabel:process_medium { - cpus = { check_max( 6 * task.attempt, 'cpus' ) } - memory = { check_max( 36.GB * task.attempt, 'memory' ) } - time = { check_max( 8.h * task.attempt, 'time' ) } + cpus = { 6 * task.attempt } + memory = { 36.GB * task.attempt } + time = { 8.h * task.attempt } } withLabel:process_high { - cpus = { check_max( 12 * task.attempt, 'cpus' ) } - memory = { check_max( 72.GB * task.attempt, 'memory' ) } - time = { check_max( 16.h * task.attempt, 'time' ) } + cpus = { 12 * task.attempt } + memory = { 72.GB * task.attempt } + time = { 16.h * task.attempt } } withLabel:process_long { - time = { check_max( 20.h * task.attempt, 'time' ) } + time = { 20.h * task.attempt } } withLabel:process_high_memory { - memory = { check_max( 200.GB * task.attempt, 'memory' ) } + memory = { 200.GB * task.attempt } } withName: 'UNZIP.*|UNTAR.*|TABIX.*|BUILD_INTERVALS|CREATE_INTERVALS_BED|VCFTOOLS|BCFTOOLS.*|SAMTOOLS_INDEX' { - cpus = { check_max( 1 * task.attempt, 'cpus' ) } - memory = { check_max( 1.GB * task.attempt, 'memory' ) } + cpus = { 1 * task.attempt } + memory = { 1.GB * task.attempt } } withName: 'FASTQC'{ - cpus = { check_max( 4 * task.attempt, 'cpus' ) } - memory = { check_max( 4.GB * task.attempt, 'memory' ) } + cpus = { 4 * task.attempt } + memory = { 4.GB * task.attempt } } withName: 'FASTP'{ - cpus = { check_max( 12 * task.attempt, 'cpus' ) } - 
memory = { check_max( 4.GB * task.attempt, 'memory' ) } + cpus = { 12 * task.attempt } + memory = { 4.GB * task.attempt } } withName: 'BWAMEM1_MEM|BWAMEM2_MEM' { - cpus = { check_max( 24 * task.attempt, 'cpus' ) } - memory = { check_max( 30.GB * task.attempt, 'memory' ) } + cpus = { 24 * task.attempt } + memory = { 30.GB * task.attempt } } withName:'CNVKIT_BATCH' { label = "process_high" - memory = { check_max( 36.GB * task.attempt, 'memory' ) } + memory = { 36.GB * task.attempt } } withName: 'GATK4_MARKDUPLICATES|GATK4SPARK_MARKDUPLICATES' { - cpus = { check_max( 6 * task.attempt, 'cpus' ) } - memory = { check_max( 30.GB * task.attempt, 'memory' ) } + cpus = { 6 * task.attempt } + memory = { 30.GB * task.attempt } } withName:'GATK4_APPLYBQSR|GATK4SPARK_APPLYBQSR|GATK4_BASERECALIBRATOR|GATK4SPARK_BASERECALIBRATOR|GATK4_GATHERBQSRREPORTS'{ - cpus = { check_max( 2 * task.attempt, 'cpus' ) } - memory = { check_max( 4.GB * task.attempt, 'memory' ) } + cpus = { 2 * task.attempt } + memory = { 4.GB * task.attempt } } withName:'MOSDEPTH'{ - cpus = { check_max( 4 * task.attempt, 'cpus' ) } - memory = { check_max( 4.GB * task.attempt, 'memory' ) } + cpus = { 4 * task.attempt } + memory = { 4.GB * task.attempt } } withName:'STRELKA.*|MANTA.*' { - cpus = { check_max( 10 * task.attempt, 'cpus' ) } - memory = { check_max( 8.GB * task.attempt, 'memory' ) } + cpus = { 10 * task.attempt } + memory = { 8.GB * task.attempt } } withName:'SAMTOOLS_CONVERT'{ - memory = { check_max( 4.GB * task.attempt, 'memory' ) } + memory = { 4.GB * task.attempt } } withName:'GATK4_MERGEVCFS'{ - cpus = { check_max( 2 * task.attempt, 'cpus' ) } - memory = { check_max( 4.GB * task.attempt, 'memory' ) } + cpus = { 2 * task.attempt } + memory = { 4.GB * task.attempt } } withName: 'MULTIQC' { - cpus = { check_max( 4 * task.attempt, 'cpus' ) } - memory = { check_max( 12.GB * task.attempt, 'memory' ) } + cpus = { 4 * task.attempt } + memory = { 12.GB * task.attempt } } } diff --git a/conf/igenomes.config 
b/conf/igenomes.config index af199c5e6d..afc253a919 100644 --- a/conf/igenomes.config +++ b/conf/igenomes.config @@ -36,9 +36,8 @@ params { known_indels_vqsr = '--resource:1000G,known=false,training=true,truth=true,prior=10.0 1000G_phase1.indels.b37.vcf.gz --resource:mills,known=false,training=true,truth=true,prior=10.0 Mills_and_1000G_gold_standard.indels.b37.vcf.gz' mappability = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/Control-FREEC/out100m2_hg19.gem" ngscheckmate_bed = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/NGSCheckMate/SNP_GRCh37_hg19_woChr.bed" - snpeff_db = '87' - snpeff_genome = 'GRCh37' - vep_cache_version = '111' + snpeff_db = 'GRCh37.87' + vep_cache_version = '113' vep_genome = 'GRCh37' vep_species = 'homo_sapiens' } @@ -73,9 +72,8 @@ params { pon = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/1000g_pon.hg38.vcf.gz" pon_tbi = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/1000g_pon.hg38.vcf.gz.tbi" sentieon_dnascope_model = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/Sentieon/SentieonDNAscopeModel1.1.model" - snpeff_db = '105' - snpeff_genome = 'GRCh38' - vep_cache_version = '111' + snpeff_db = 'GRCh38.105' + vep_cache_version = '113' vep_genome = 'GRCh38' vep_species = 'homo_sapiens' } @@ -84,9 +82,8 @@ params { fasta = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/genome.fa" ngscheckmate_bed = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Annotation/NGSCheckMate/SNP_GRCh37_hg19_woChr.bed" readme = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Annotation/README.txt" - snpeff_db = '87' - snpeff_genome = 'GRCh37' - vep_cache_version = '111' + snpeff_db = 'GRCh37.87' + vep_cache_version = '113' vep_genome = 'GRCh37' vep_species = 'homo_sapiens' } @@ -94,9 +91,8 @@ params { bwa = "${params.igenomes_base}/Homo_sapiens/NCBI/GRCh38/Sequence/BWAIndex/version0.6.0/" fasta = 
"${params.igenomes_base}/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa" ngscheckmate_bed ="${params.igenomes_base}/Homo_sapiens/NCBI/GRCh38/Annotation/NGSCheckMate/SNP_GRCh38_hg38_wChr.bed" - snpeff_db = '105' - snpeff_genome = 'GRCh38' - vep_cache_version = '111' + snpeff_db = 'GRCh38.105' + vep_cache_version = '113' vep_genome = 'GRCh38' vep_species = 'homo_sapiens' } @@ -118,8 +114,7 @@ params { known_indels_tbi = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/MouseGenomeProject/mgp.v5.merged.indels.dbSNP142.normed.vcf.gz.tbi" mappability = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Annotation/Control-FREEC/GRCm38_68_mm10.gem" readme = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Annotation/README.txt" - snpeff_db = '99' - snpeff_genome = 'GRCm38' + snpeff_db = 'GRCm38.99' vep_cache_version = '102' vep_genome = 'GRCm38' vep_species = 'mus_musculus' @@ -138,8 +133,7 @@ params { bwa = "${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Sequence/BWAIndex/version0.6.0/" fasta = "${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Sequence/WholeGenomeFasta/genome.fa" readme = "${params.igenomes_base}/Bos_taurus/Ensembl/UMD3.1/Annotation/README.txt" - snpeff_db = '75' - snpeff_genome = 'UMD3.1' + snpeff_db = 'UMD3.1.75' vep_cache_version = '94' vep_genome = 'UMD3.1' vep_species = 'bos_taurus' @@ -147,9 +141,8 @@ params { 'WBcel235' { bwa = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Sequence/BWAIndex/version0.6.0/" fasta = "${params.igenomes_base}/Caenorhabditis_elegans/Ensembl/WBcel235/Sequence/WholeGenomeFasta/genome.fa" - snpeff_db = '105' - snpeff_genome = 'WBcel235' - vep_cache_version = '111' + snpeff_db = 'WBcel235.105' + vep_cache_version = '113' vep_genome = 'WBcel235' vep_species = 'caenorhabditis_elegans' } @@ -157,8 +150,7 @@ params { bwa = "${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Sequence/BWAIndex/version0.6.0/" fasta = 
"${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Sequence/WholeGenomeFasta/genome.fa" readme = "${params.igenomes_base}/Canis_familiaris/Ensembl/CanFam3.1/Annotation/README.txt" - snpeff_db = '99' - snpeff_genome = 'CanFam3.1' + snpeff_db = 'CanFam3.1.99' vep_cache_version = '104' vep_genome = 'CanFam3.1' vep_species = 'canis_lupus_familiaris' @@ -215,9 +207,8 @@ params { 'R64-1-1' { bwa = "${params.igenomes_base}/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Sequence/BWAIndex/version0.6.0/" fasta = "${params.igenomes_base}/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Sequence/WholeGenomeFasta/genome.fa" - snpeff_db = '105' - snpeff_genome = 'R64-1-1' - vep_cache_version = '111' + snpeff_db = 'R64-1-1.105' + vep_cache_version = '113' vep_genome = 'R64-1-1' vep_species = 'saccharomyces_cerevisiae' } @@ -243,9 +234,8 @@ params { 'hg38' { bwa = "${params.igenomes_base}/Homo_sapiens/UCSC/hg38/Sequence/BWAIndex/version0.6.0/" fasta = "${params.igenomes_base}/Homo_sapiens/UCSC/hg38/Sequence/WholeGenomeFasta/genome.fa" - snpeff_db = '105' - snpeff_genome = 'GRCh38' - vep_cache_version = '111' + snpeff_db = 'GRCh38.105' + vep_cache_version = '113' vep_genome = 'GRCh38' vep_species = 'homo_sapiens' } @@ -253,9 +243,8 @@ params { bwa = "${params.igenomes_base}/Homo_sapiens/UCSC/hg19/Sequence/BWAIndex/version0.6.0/" fasta = "${params.igenomes_base}/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa" readme = "${params.igenomes_base}/Homo_sapiens/UCSC/hg19/Annotation/README.txt" - snpeff_db = '87' - snpeff_genome = 'GRCh37' - vep_cache_version = '111' + snpeff_db = 'GRCh37.87' + vep_cache_version = '113' vep_genome = 'GRCh37' vep_species = 'homo_sapiens' } @@ -263,8 +252,7 @@ params { bwa = "${params.igenomes_base}/Mus_musculus/UCSC/mm10/Sequence/BWAIndex/version0.6.0/" fasta = "${params.igenomes_base}/Mus_musculus/UCSC/mm10/Sequence/WholeGenomeFasta/genome.fa" readme = "${params.igenomes_base}/Mus_musculus/UCSC/mm10/Annotation/README.txt" - snpeff_db = '99' - 
snpeff_genome = 'GRCm38' + snpeff_db = 'GRCm38.99' vep_cache_version = '102' vep_genome = 'GRCm38' vep_species = 'mus_musculus' @@ -334,9 +322,8 @@ params { known_indels_tbi = "${params.igenomes_base}/genomics/homo_sapiens/genome/vcf/mills_and_1000G.indels.vcf.gz.tbi" known_indels_vqsr = '--resource:mills,known=false,training=true,truth=true,prior=10.0 mills_and_1000G.indels.vcf.gz' ngscheckmate_bed = "${params.igenomes_base}/genomics/homo_sapiens/genome/chr21/germlineresources/SNP_GRCh38_hg38_wChr.bed" - snpeff_db = '105' - snpeff_genome = 'WBcel235' - vep_cache_version = '111' + snpeff_db = 'WBcel235.105' + vep_cache_version = '113' vep_genome = 'WBcel235' vep_species = 'caenorhabditis_elegans' } diff --git a/conf/igenomes_ignored.config b/conf/igenomes_ignored.config new file mode 100644 index 0000000000..b4034d8243 --- /dev/null +++ b/conf/igenomes_ignored.config @@ -0,0 +1,9 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Nextflow config file for iGenomes paths +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Empty genomes dictionary to use when igenomes is ignored. +---------------------------------------------------------------------------------------- +*/ + +params.genomes = [:] diff --git a/conf/modules/aligner.config b/conf/modules/aligner.config index 5f44e199b0..70926d573a 100644 --- a/conf/modules/aligner.config +++ b/conf/modules/aligner.config @@ -30,11 +30,11 @@ process { } withName: 'SENTIEON_BWAMEM' { + ext.prefix = { params.split_fastq > 1 ? "${meta.id}".concat('.').concat(reads.get(0).name.tokenize('.')[0]).concat('.bam') : "${meta.id}.sorted.bam" } ext.when = { params.aligner == 'sentieon-bwamem' } } withName: 'BWAMEM.*_MEM|DRAGMAP_ALIGN|SENTIEON_BWAMEM' { - ext.prefix = { params.split_fastq > 1 ? 
"${meta.id}".concat('.').concat(reads.get(0).name.tokenize('.')[0]) : "${meta.id}.sorted" } publishDir = [ mode: params.publish_dir_mode, path: { "${params.outdir}/preprocessing/" }, @@ -56,6 +56,7 @@ process { } withName: 'BWAMEM.*_MEM|DRAGMAP_ALIGN' { + ext.prefix = { params.split_fastq > 1 ? "${meta.id}".concat('.').concat(reads.get(0).name.tokenize('.')[0]) : "${meta.id}.sorted" } // Markduplicates Spark NEEDS name-sorted reads or runtime goes through the roof // However if it's skipped, reads need to be coordinate-sorted // Only name sort if Spark for Markduplicates + duplicate marking is not skipped diff --git a/conf/modules/annotate.config b/conf/modules/annotate.config index b02c1b3ef3..6459a686fa 100644 --- a/conf/modules/annotate.config +++ b/conf/modules/annotate.config @@ -64,12 +64,12 @@ process { // BCFTOOLS ANNOTATE if (params.tools && params.tools.split(',').contains('bcfann')) { withName: 'NFCORE_SAREK:SAREK:VCF_ANNOTATE_ALL:VCF_ANNOTATE_BCFTOOLS:BCFTOOLS_ANNOTATE' { - ext.args = { '--output-type z' } + ext.args = { '--output-type z --write-index=tbi' } ext.prefix = { input.baseName - '.vcf' + '_BCF.ann' } publishDir = [ mode: params.publish_dir_mode, path: { "${params.outdir}/annotation/${meta.variantcaller}/${meta.id}/" }, - pattern: "*{gz}" + pattern: "*{gz,gz.tbi}" ] } } diff --git a/conf/modules/deepvariant.config b/conf/modules/deepvariant.config index 021990f7f6..ef5e31b796 100644 --- a/conf/modules/deepvariant.config +++ b/conf/modules/deepvariant.config @@ -15,8 +15,8 @@ process { - withName: 'DEEPVARIANT' { - ext.args = { params.wes ? "--model_type WES" : "--model_type WGS" } + withName: 'DEEPVARIANT_RUNDEEPVARIANT' { + ext.args = { params.wes ? "--model_type=WES" : "--model_type=WGS" } ext.prefix = { meta.num_intervals <= 1 ? 
"${meta.id}.deepvariant" : "${meta.id}.deepvariant.${intervals.baseName}" } ext.when = { params.tools && params.tools.split(',').contains('deepvariant') } publishDir = [ diff --git a/conf/modules/indexcov.config b/conf/modules/indexcov.config new file mode 100644 index 0000000000..082ea3b7cc --- /dev/null +++ b/conf/modules/indexcov.config @@ -0,0 +1,21 @@ + +// INDEXCOV + +process { + if (params.tools && params.tools.split(',').contains('indexcov')) { + + withName: 'SAMTOOLS_REINDEX_BAM' { + ext.args = { ' -F 3844 -q 30 ' } // high mapq , primary read paired properly mapped + } + + withName: 'GOLEFT_INDEXCOV' { + publishDir = [ + mode: params.publish_dir_mode, + path: { "${params.outdir}/variant_calling/indexcov/" } + ] + + } + + } + +} diff --git a/conf/modules/lofreq.config b/conf/modules/lofreq.config new file mode 100644 index 0000000000..253b252b3b --- /dev/null +++ b/conf/modules/lofreq.config @@ -0,0 +1,45 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Config file for defining DSL2 per module options and publishing paths +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Available keys to override module options: + ext.args = Additional arguments appended to command in module. + ext.args2 = Second set of arguments appended to command in module (multi-tool modules). + ext.args3 = Third set of arguments appended to command in module (multi-tool modules). + ext.prefix = File name prefix for output files. + ext.when = When to run the module. +---------------------------------------------------------------------------------------- +*/ + +//LOFREQ + +process { + if (params.tools && params.tools.split(',').contains('lofreq')) { + + withName: "LOFREQ_CALLPARALLEL" { + ext.args = { "--call-indels" } + ext.prefix = { meta.num_intervals <= 1 ? 
"${meta.id}.lofreq" : "${meta.id}.lofreq.${intervals.baseName}" } + ext.when = { params.tools && params.tools.split(',').contains('lofreq') } + publishDir = [ + mode: params.publish_dir_mode, + path: { "${params.outdir}/variant_calling/" }, + pattern: "*{vcf.gz,vcf.gz.tbi}", + saveAs: { meta.num_intervals > 1 ? null : "lofreq/${meta.id}/${it}" } + ] + } + + withName:'VCFTOOLS_TSTV_COUNT'{ + errorStrategy = 'ignore' + } + + withName: 'MERGE_LOFREQ.*' { + ext.prefix = { "${meta.id}.lofreq" } + publishDir = [ + mode: params.publish_dir_mode, + path: { "${params.outdir}/variant_calling/lofreq/${meta.id}" }, + pattern: "*{vcf.gz,vcf.gz.tbi}" + ] + } + } + +} diff --git a/conf/modules/sentieon_dedup.config b/conf/modules/sentieon_dedup.config index df52c3bb95..35f89720f7 100644 --- a/conf/modules/sentieon_dedup.config +++ b/conf/modules/sentieon_dedup.config @@ -16,7 +16,7 @@ process { withName: 'SENTIEON_DEDUP' { - ext.prefix = { "${meta.id}.dedup" } + ext.prefix = { "${meta.id}.dedup.cram" } ext.when = { params.tools && params.tools.split(',').contains('sentieon_dedup') } publishDir = [ [ diff --git a/conf/modules/tiddit.config b/conf/modules/tiddit.config index 335ecf0951..ac81cdab5e 100644 --- a/conf/modules/tiddit.config +++ b/conf/modules/tiddit.config @@ -47,6 +47,7 @@ process { // SVDB withName: 'NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_SOMATIC_ALL:BAM_VARIANT_CALLING_SOMATIC_TIDDIT:SVDB_MERGE' { + ext.args2 = { '--output-type z' } ext.prefix = { "${meta.id}.tiddit_sv_merge" } publishDir = [ mode: params.publish_dir_mode, diff --git a/conf/modules/trimming.config b/conf/modules/trimming.config index 5fc6f7646b..0b7eff7b2d 100644 --- a/conf/modules/trimming.config +++ b/conf/modules/trimming.config @@ -23,7 +23,8 @@ process { params.three_prime_clip_r1 > 0 ? "--trim_tail1 ${params.three_prime_clip_r1}" : '', // Remove bp from the 3' end of read 1 AFTER adapter/quality trimming has been performed params.three_prime_clip_r2 > 0 ? 
"--trim_tail2 ${params.three_prime_clip_r2}" : '', // Remove bp from the 3' end of read 2 AFTER adapter/quality trimming has been performed params.trim_nextseq ? '--trim_poly_g' : '', // Apply the --nextseq=X option, to trim based on quality after removing poly-G tails - params.split_fastq > 0 ? "--split_by_lines ${params.split_fastq * 4}" : '' + params.split_fastq > 0 ? "--split_by_lines ${params.split_fastq * 4}" : '', // Output by limiting lines of each file with this option + params.length_required > 0 ? "--length_required ${params.length_required}": '', // Reads shorter will be discarded ].join(' ').trim() publishDir = [ [ diff --git a/conf/modules/umi.config b/conf/modules/umi.config index 7973dd16d8..336a02088f 100644 --- a/conf/modules/umi.config +++ b/conf/modules/umi.config @@ -73,7 +73,7 @@ process { } withName: 'CALLUMICONSENSUS' { - ext.args = { '-M 1 -S Coordinate' } + ext.args = { '-S Coordinate' } ext.prefix = { "${meta.id}_umi-consensus" } publishDir = [ path: { "${params.outdir}/preprocessing/umi/${meta.sample}" }, diff --git a/conf/test.config b/conf/test.config index 81567aed43..5f38bfd90d 100644 --- a/conf/test.config +++ b/conf/test.config @@ -9,23 +9,26 @@ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ +process { + resourceLimits = [ + cpus: 4, + memory: '15.GB', + time: '1.h' + ] +} + params { config_profile_name = 'Test profile' config_profile_description = 'Minimal test dataset to check pipeline function' - // Limit resources so that this can run on GitHub Actions - max_cpus = 2 - max_memory = '6.5GB' - max_time = '8.h' - // Base directory for nf-core/modules test data - modules_testdata_base_path = 's3://ngi-igenomes/testdata/nf-core/modules/' + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' // Input data input = "${projectDir}/tests/csv/3.0/fastq_single.csv" // small genome on igenomes - igenomes_base = 
's3://ngi-igenomes/testdata/nf-core/modules' + igenomes_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' genome = 'testdata.nf-core.sarek' // Small reference genome @@ -41,9 +44,6 @@ params { // default params split_fastq = 0 // no FASTQ splitting tools = 'strelka' // Variant calling with Strelka - - // Ignore params that will throw warning through params validation - validationSchemaIgnoreParams = 'genomes' } process { diff --git a/conf/test/markduplicates_bam.config b/conf/test/markduplicates_bam.config deleted file mode 100644 index 16060a2ba8..0000000000 --- a/conf/test/markduplicates_bam.config +++ /dev/null @@ -1,16 +0,0 @@ -/* -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Nextflow config file for running minimal tests -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Defines input files and everything required to run a fast and simple pipeline test. - - Use as follows: - nextflow run nf-core/sarek -profile test,, --outdir -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -*/ - -params { - input = "${projectDir}/tests/csv/3.0/mapped_single_bam.csv" - step = 'markduplicates' - tools = null -} diff --git a/conf/test/markduplicates_cram.config b/conf/test/markduplicates_cram.config deleted file mode 100644 index e8f1d7c6f3..0000000000 --- a/conf/test/markduplicates_cram.config +++ /dev/null @@ -1,16 +0,0 @@ -/* -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Nextflow config file for running minimal tests -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Defines input files and everything required to run a fast and simple pipeline test. 
- - Use as follows: - nextflow run nf-core/sarek -profile test,, --outdir -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -*/ - -params { - input = "${projectDir}/tests/csv/3.0/mapped_single_cram.csv" - step = 'markduplicates' - tools = null -} diff --git a/conf/test/prepare_recalibration_bam.config b/conf/test/prepare_recalibration_bam.config deleted file mode 100644 index 20a209b438..0000000000 --- a/conf/test/prepare_recalibration_bam.config +++ /dev/null @@ -1,16 +0,0 @@ -/* -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Nextflow config file for running minimal tests -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Defines input files and everything required to run a fast and simple pipeline test. - - Use as follows: - nextflow run nf-core/sarek -profile test,, --outdir -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -*/ - -params { - input = "${projectDir}/tests/csv/3.0/mapped_single_bam.csv" - step = 'prepare_recalibration' - tools = null -} diff --git a/conf/test/prepare_recalibration_cram.config b/conf/test/prepare_recalibration_cram.config deleted file mode 100644 index ccab4977c9..0000000000 --- a/conf/test/prepare_recalibration_cram.config +++ /dev/null @@ -1,16 +0,0 @@ -/* -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Nextflow config file for running minimal tests -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Defines input files and everything required to run a fast and simple pipeline test. 
- - Use as follows: - nextflow run nf-core/sarek -profile test,, --outdir -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -*/ - -params { - input = "${projectDir}/tests/csv/3.0/mapped_single_cram.csv" - step = 'prepare_recalibration' - tools = null -} diff --git a/conf/test/tools_germline_deepvariant.config b/conf/test/tools_germline_deepvariant.config new file mode 100644 index 0000000000..e50a48cbec --- /dev/null +++ b/conf/test/tools_germline_deepvariant.config @@ -0,0 +1,23 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Nextflow config file for running minimal tests +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Defines input files and everything required to run a fast and simple pipeline test. + + Use as follows: + nextflow run nf-core/sarek -profile test,, --outdir +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +*/ + +params { + input = "${projectDir}/tests/csv/3.0/mapped_single_cram.csv" + genome = null + igenomes_ignore = true + fasta = "${params.modules_testdata_base_path}/genomics/homo_sapiens/genome/genome.fasta" + fasta_fai = "${params.modules_testdata_base_path}/genomics/homo_sapiens/genome/genome.fasta.fai" + intervals = "${params.modules_testdata_base_path}/genomics/homo_sapiens/genome/genome.bed" + nucleotides_per_second = 20 + step = 'variant_calling' + tools = null + wes = true +} diff --git a/conf/test/trimming.config b/conf/test/trimming.config index d904d17660..6786564037 100644 --- a/conf/test/trimming.config +++ b/conf/test/trimming.config @@ -14,6 +14,7 @@ params { clip_r2 = 1 three_prime_clip_r1 = 1 three_prime_clip_r2 = 1 + length_required = 50 tools = null trim_fastq = true } diff --git a/conf/test_full.config b/conf/test_full.config index 1ba5ad2c78..0d00a965fc 100644 --- a/conf/test_full.config +++ b/conf/test_full.config @@ -18,7 +18,7 @@ params { input = 
'https://raw.githubusercontent.com/nf-core/test-datasets/sarek/testdata/csv/HCC1395_WXS_somatic_full_test.csv' // Other params - tools = 'ngscheckmate,strelka,mutect2,freebayes,ascat,manta,cnvkit,tiddit,controlfreec,vep,snpeff' + tools = 'ngscheckmate,lofreq,strelka,mutect2,freebayes,ascat,manta,cnvkit,tiddit,controlfreec,vep,snpeff' split_fastq = 20000000 intervals = 's3://ngi-igenomes/test-data/sarek/S07604624_Padded_Agilent_SureSelectXT_allexons_V6_UTR.bed' wes = true diff --git a/conf/test_full_germline.config b/conf/test_full_germline.config index d731a25709..4b2421c625 100644 --- a/conf/test_full_germline.config +++ b/conf/test_full_germline.config @@ -18,6 +18,6 @@ params { input = 'https://raw.githubusercontent.com/nf-core/test-datasets/sarek/testdata/csv/NA12878_WGS_30x_full_test.csv' // Other params - tools = 'strelka,freebayes,haplotypecaller,deepvariant,manta,tiddit,cnvkit,vep,snpeff' + tools = 'indexcov,strelka,freebayes,haplotypecaller,deepvariant,manta,tiddit,cnvkit,vep,snpeff' split_fastq = 50000000 } diff --git a/docs/images/mqc_fastqc_adapter.png b/docs/images/mqc_fastqc_adapter.png deleted file mode 100755 index 361d0e47ac..0000000000 Binary files a/docs/images/mqc_fastqc_adapter.png and /dev/null differ diff --git a/docs/images/mqc_fastqc_counts.png b/docs/images/mqc_fastqc_counts.png deleted file mode 100755 index cb39ebb80a..0000000000 Binary files a/docs/images/mqc_fastqc_counts.png and /dev/null differ diff --git a/docs/images/mqc_fastqc_quality.png b/docs/images/mqc_fastqc_quality.png deleted file mode 100755 index a4b89bf56a..0000000000 Binary files a/docs/images/mqc_fastqc_quality.png and /dev/null differ diff --git a/docs/images/sarek_subway.png b/docs/images/sarek_subway.png index 02937f57e5..a381343500 100644 Binary files a/docs/images/sarek_subway.png and b/docs/images/sarek_subway.png differ diff --git a/docs/images/sarek_subway.svg b/docs/images/sarek_subway.svg index 2a7831b02b..6d8d172652 100644 --- a/docs/images/sarek_subway.svg 
+++ b/docs/images/sarek_subway.svg @@ -2,13 +2,13 @@ [SVG markup omitted. Text labels in the updated subway map: "Example analysis pathways", pre-processing, cram, spring, SNPs & Indels, MSI, indexcov, deepvariant, freebayes, haplotypecaller, mpileup, Sentieon haplotyper, Sentieon dnascope, manta, strelka2, lofreq, tiddit, mutect2, ascat, msisensorpro, controlfreec, cnvkit.] diff --git a/docs/images/sarek_workflow.png b/docs/images/sarek_workflow.png index 7fb4cd52c2..8b993ab268 100644 Binary files a/docs/images/sarek_workflow.png and b/docs/images/sarek_workflow.png differ diff --git a/docs/images/sarek_workflow.svg b/docs/images/sarek_workflow.svg index 5f4cbd2ddd..c28d400a13 100644 --- a/docs/images/sarek_workflow.svg +++ b/docs/images/sarek_workflow.svg @@ -4,15 +4,15 @@ [SVG markup omitted. Text labels in the updated workflow diagram: inputs (fastq, spring, ubam); Preprocessing; Variant Calling, Germline (deepvariant, freebayes, GATK haplotypecaller, mpileup, strelka2, Sentieon haplotyper; indexcov, manta, tiddit; cnvkit); Variant Calling, Somatic (freebayes, mutect2, strelka2, lofreq; manta, tiddit; ascat, cnvkit, controlfreec; msisensorpro); Annotation (bcftools annotate, snpeff, vep); Reports.] diff --git a/docs/output.md b/docs/output.md index 7f8455f95d..6d723ba03b 100644 --- a/docs/output.md +++ b/docs/output.md @@ -13,6 +13,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d - [Directory Structure](#directory-structure) - [Preprocessing](#preprocessing) - [Preparation of input files (FastQ or (u)BAM)](#preparation-of-input-files-fastq-or-ubam) + - [Clip and filter read length](#clip-and-filter-read-length) - [Trim adapters](#trim-adapters) - [Split FastQ 
files](#split-fastq-files) - [UMI consensus](#umi-consensus) @@ -42,7 +43,9 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d - [Sentieon Haplotyper](#sentieon-haplotyper) - [Sentieon Haplotyper joint germline variant calling](#sentieon-haplotyper-joint-germline-variant-calling) - [Strelka](#strelka) + - [Lofreq](#lofreq) - [Structural Variants](#structural-variants) + - [Indexcov](#indexcov) - [Manta](#manta) - [TIDDIT](#tiddit) - [Sample heterogeneity, ploidy and CNVs](#sample-heterogeneity-ploidy-and-cnvs) @@ -106,6 +109,10 @@ Sarek pre-processes raw FastQ files or unmapped BAM files, based on [GATK best p [FastP](https://github.com/OpenGene/fastp) is a tool designed to provide all-in-one preprocessing for FastQ files and as such is used for trimming and splitting. By default, these files are not published. However, if publishing is enabled, please be aware that these files are only published once, meaning if trimming and splitting is enabled, then the resulting files will be sharded FastQ files with trimmed reads. If only one of them is enabled then the files contain either trimmed or split reads, respectively. +#### Clip and filter read length + +[FastP](https://github.com/OpenGene/fastp) enables efficient clipping of reads from either the 5' end (`--clip_r1`, `--clip_r2`) or the 3' end (`--three_prime_clip_r1`, `--three_prime_clip_r2`). Additionally, FastP can filter reads by length: reads shorter than the minimum set with the `--length_required` parameter (default: 15bp) are discarded. It is recommended to optimize these parameters according to the specific characteristics of your data. + #### Trim adapters [FastP](https://github.com/OpenGene/fastp) supports global trimming, which means it trims all reads in the front or the tail. This function is useful since sometimes you want to drop some cycles of a sequencing run. In the current implementation in Sarek 
In the current implementation in Sarek @@ -548,7 +555,7 @@ In Sentieon's package DNAseq, joint germline variant calling is done by first ru For further downstream analysis, take a look [here](https://github.com/Illumina/strelka/blob/v2.9.x/docs/userGuide/README.md#interpreting-the-germline-multi-sample-variants-vcf).
-Output files for all single samples (normal or tumor-only) +Output files for single samples (normal) **Output directory: `{outdir}/variantcalling/strelka//`** @@ -570,8 +577,46 @@ For further downstream analysis, take a look [here](https://github.com/Illumina/
+#### Lofreq + +[Lofreq](https://github.com/CSB5/lofreq) is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data. It makes full use of base-call qualities and other sources of errors inherent in sequencing, which are usually ignored by other methods or only used for filtering. For further reading and documentation see the [Lofreq user guide](https://csb5.github.io/lofreq/). + +
+Output files for tumor-only samples + +**Output directory: `{outdir}/variant_calling/lofreq//`** + +-`.vcf.gz` +-VCF which provides a detailed description of the detected genetic variants. + +
+ ### Structural Variants +#### indexcov + +[indexcov](https://github.com/brentp/goleft/tree/master/indexcov) quickly estimates coverage from a whole-genome bam or cram index. +A bam index has 16KB resolution and it is used as a coverage estimate. +The output is scaled to around 1, so a long stretch with values of 1.5 would be a heterozygous duplication. This is useful as a quick QC to get coverage values across the genome. + +**Output directory: `{outdir}/variant_calling/indexcov/`** + +In addition to the interactive HTML files, `indexcov` outputs a number of text files: + +- `-indexcov.ped`: a .ped/.fam file with the inferred sex in the appropriate column if the sex chromosomes were found. + The CNX and CNY columns indicate the floating-point estimate of copy-number for those chromosomes. + `bins.out`: how many bins had a coverage value outside of (0.85, 1.15). High values can indicate high-bias samples. + `bins.lo`: number of bins with value < 0.15. High values indicate missing data. + `bins.hi`: number of bins with value > 1.15. + `bins.in`: number of bins with value inside of (0.85, 1.15). + `p.out`: `bins.out/bins.in` + `PC1...PC5`: PCA projections calculated with depth of autosomes. + +- `-indexcov.roc`: tab-delimited columns of chrom, scaled coverage cutoff, and $n_samples columns where each indicates the + proportion of 16KB blocks at or above that scaled coverage value. +- `-indexcov.bed.gz`: a bed file with columns of chrom, start, end, and a column per sample where the values indicate their + scaled coverage for that sample in that 16KB chunk. + #### Manta [Manta](https://github.com/Illumina/manta) calls structural variants (SVs) and indels from mapped paired-end sequencing reads. 
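Note on the indexcov columns documented in this hunk: the `bins.*` definitions are easier to read with numbers attached. The following is a minimal Python sketch, not goleft's implementation. The function names are hypothetical, the treatment of values exactly on the 0.85 / 1.15 boundaries (counted as "out") is an assumption, and the copy-number helper assumes a diploid autosome, which is what makes a scaled value of 1.5 correspond to three copies.

```python
def summarize_bins(scaled):
    """Count bins following the `bins.*` column descriptions in docs/output.md.

    `scaled` is a list of per-bin coverage values already scaled to ~1.
    Boundary handling (open interval) is an assumption of this sketch.
    """
    bins_in = sum(1 for v in scaled if 0.85 < v < 1.15)
    bins_out = len(scaled) - bins_in          # everything not strictly inside
    bins_lo = sum(1 for v in scaled if v < 0.15)
    bins_hi = sum(1 for v in scaled if v > 1.15)
    # p.out is documented as bins.out / bins.in; guard against division by zero.
    p_out = bins_out / bins_in if bins_in else float("inf")
    return {
        "bins.in": bins_in,
        "bins.out": bins_out,
        "bins.lo": bins_lo,
        "bins.hi": bins_hi,
        "p.out": p_out,
    }


def approx_copy_number(scaled_value, ploidy=2):
    """Rough copy number for a scaled coverage value (hypothetical helper)."""
    return scaled_value * ploidy


# Eight illustrative bins: four near 1, two duplicated, one missing, one low.
example = [1.0, 1.02, 0.97, 1.1, 1.5, 1.5, 0.1, 0.5]
summary = summarize_bins(example)
print(summary)
print(approx_copy_number(1.5))
```

With these eight bins, half fall outside (0.85, 1.15), so `p.out` is 1.0; a real whole-genome sample would be expected to have a much smaller ratio.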
diff --git a/docs/usage.md b/docs/usage.md index 6a91830492..8231d40be8 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -53,9 +53,9 @@ The above pipeline run specified with a params file in yaml format: nextflow run nf-core/sarek -params-file params.yaml ``` -with `params.yaml` containing: +with: -```yaml +```yaml title="params.yaml" input: './samplesheet.csv' outdir: './results/' genome: 'GATK.GRCh38' @@ -559,7 +559,7 @@ Some of the currently, available test profiles: | no_intervals | `nextflow run main.nf -profile test_cache,no_intervals,docker` | | targeted | `nextflow run main.nf -profile test_cache,targeted,docker` | | tools_germline | `nextflow run main.nf -profile test_cache,tools_germline,docker --tools strelka` | -| tools_tumoronly | `nextflow run main.nf -profile test_cache,tools_tumoronly,docker --tools strelka` | +| tools_tumoronly | `nextflow run main.nf -profile test_cache,tools_tumoronly,docker --tools mutect2` | | tools_somatic | `nextflow run main.nf -profile test_cache,tools_somatic,docker --tools strelka` | | trimming | `nextflow run main.nf -profile test_cache,trim_fastq,docker` | | umi | `nextflow run main.nf -profile test_cache,umi,docker` | @@ -575,20 +575,22 @@ Sarek can be started at different points in the analysis by setting the paramete This list is by no means exhaustive and it will depend on the specific analysis you would like to run. This is a suggestion based on the individual docs of the tools specifically for human genomes and a garden-variety sequencing run as well as what has been added to the pipeline. 
-| Tool | WGS | WES |  Panel |  Normal | Tumor | Somatic | -| :------------------------------------------------------------------------------------------------------ | :-: | :-: | :----: | :-----: | :---: | :-----: | -| [DeepVariant](https://github.com/google/deepvariant) | x | x | x | x | - | - | -| [FreeBayes](https://github.com/ekg/freebayes) | x | x | x | x | x | x | -| [GATK HaplotypeCaller](https://gatk.broadinstitute.org/hc/en-us/articles/5358864757787-HaplotypeCaller) | x | x | x | x | - | - | -| [GATK Mutect2](https://gatk.broadinstitute.org/hc/en-us/articles/5358911630107-Mutect2) | x | x | x | - | x | x | -| [mpileup](https://www.htslib.org/doc/samtools-mpileup.html) | x | x | x | x | x | - | -| [Strelka](https://github.com/Illumina/strelka) | x | x | x | x | x | x | -| [Manta](https://github.com/Illumina/manta) | x | x | x | x | x | x | -| [TIDDIT](https://github.com/SciLifeLab/TIDDIT) | x | x | x | x | x | x | -| [ASCAT](https://github.com/VanLoo-lab/ascat) | x | x | - | - | - | x | -| [CNVKit](https://cnvkit.readthedocs.io/en/stable/) | x | x | - | x | x | x | -| [Control-FREEC](https://github.com/BoevaLab/FREEC) | x | x | x | - | x | x | -| [MSIsensorPro](https://github.com/xjtu-omics/msisensor-pro) | x | x | x | - | - | x | +| Tool | WGS | WES |  Panel |  Germline | Tumor-Only | Somatic (Tumor-Normal) | +| :------------------------------------------------------------------------------------------------------ | :-: | :-: | :----: | :-------: | :--------: | :--------------------: | +| [DeepVariant](https://github.com/google/deepvariant) | x | x | x | x | - | - | +| [FreeBayes](https://github.com/ekg/freebayes) | x | x | x | x | x | x | +| [GATK HaplotypeCaller](https://gatk.broadinstitute.org/hc/en-us/articles/5358864757787-HaplotypeCaller) | x | x | x | x | - | - | +| [GATK Mutect2](https://gatk.broadinstitute.org/hc/en-us/articles/5358911630107-Mutect2) | x | x | x | - | x | x | +| [lofreq](https://github.com/CSB5/lofreq) | x | x | x | - | x | - | 
+| [mpileup](https://www.htslib.org/doc/samtools-mpileup.html) | x | x | x | x | x | - | +| [Strelka](https://github.com/Illumina/strelka) | x | x | x | x | - | x | +| [Manta](https://github.com/Illumina/manta) | x | x | x | x | x | x | +| [indexcov](https://github.com/brentp/goleft/tree/master/indexcov) | x | - | - | x | - | x | +| [TIDDIT](https://github.com/SciLifeLab/TIDDIT) | x | x | x | x | x | x | +| [ASCAT](https://github.com/VanLoo-lab/ascat) | x | x | - | - | - | x | +| [CNVKit](https://cnvkit.readthedocs.io/en/stable/) | x | x | - | x | x | x | +| [Control-FREEC](https://github.com/BoevaLab/FREEC) | x | x | x | - | x | x | +| [MSIsensorPro](https://github.com/xjtu-omics/msisensor-pro) | x | x | x | - | - | x | ## How to run ASCAT with whole-exome sequencing data? @@ -646,6 +648,89 @@ mv *loci* battenberg_loci_on_target_hg38/ 3. Copy the `targets_with_chr.bed` and `GC_G1000_on_target_hg38.txt` files into the newly created `battenberg_loci_on_target_hg38` folder before running the next set of steps. ASCAT generates a list of GC correction loci with sufficient coverage in a sample, then intersects that with the list of all loci with tumour logR values in that sample. If the intersection is <10% the size of the latter, it will fail with an error. Because the Battenberg loci/allele sets are very dense, subsetting to on-target regions is still too many loci. This script ensures that all SNPs with GC correction information are included in the loci list, plus a random sample of another 30% of all on target loci. You may need to vary this proportion depending on your set of targets. A good rule of thumb is that the size of your GC correction loci list should be about 15% the size of your total loci list. This allows for a margin of error. 
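The 15% rule of thumb above can be checked before running ASCAT. The following is a minimal sketch, not part of Sarek or ASCAT: the function name `gc_loci_percent` is hypothetical, and it assumes that GC correction rows start with `<chrom>_<position>` (as the `grep "^chr${i}_"` step in this section implies) and that each loci file holds one locus per line with no header.

```bash
# Hypothetical helper (an assumption, not shipped with Sarek or ASCAT):
# print the size of the GC correction loci list as a whole-number
# percentage of the total loci list.
gc_loci_percent() {
    local gc_file=$1   # GC correction file, e.g. GC_G1000_on_target_hg38.txt
    shift              # remaining arguments: per-chromosome loci files
    local gc_count total_count

    # Count GC rows that look like <chrom>_<position>; this skips a header line
    gc_count=$(grep -cE '^(chr)?([0-9]+|X|Y)_[0-9]+' "$gc_file")

    # Count loci across all loci files (assumes one locus per line, no header)
    total_count=$(( $(cat "$@" | wc -l) ))

    if [ "$total_count" -eq 0 ]; then
        echo "Error: no loci found in the given loci files." >&2
        return 1
    fi

    # Integer percentage, rounded down
    echo $(( gc_count * 100 / total_count ))
}
```

Once the loci files exist, it could be called as `gc_loci_percent GC_G1000_on_target_hg38.txt battenberg_loci_on_target_hg38_chr*.txt`. A result well below 15 suggests sampling a smaller share of the on-target loci than 30%.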
+### 'chr'-based versus non 'chr'-based reference + +Please note that the loci files provided by the ASCAT developers (https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WES) are not 'chr'-based (chromosome names are '1', '2', '3', etc. and not 'chr1', 'chr2', 'chr3', etc.). If your BAMs are 'chr'-based, you will need to add the 'chr' prefix to the loci files: + +```bash +for i in {1..22} X; + do sed -i 's/^/chr/' G1000_loci_hg19_chr${i}.txt; +done +``` + +ASCAT will internally remove 'chr', so the other files (allele, GC correction and RT correction) should not be modified and chrom_names (ascat.prepareHTS) should be c(1:22,'X'). + +If using ASCAT provided references: + +```bash + +cd .../G1000_lociAll_hg38_unzipped/G1000_lociAll_hg38 + +# Function to check and correct 'chr' prefix +check_and_correct_chr_prefix() { + local file=$1 + local chr_number=$2 + + # Check if file exists + if [ ! -f "$file" ]; then + echo "Error: File $file not found." + exit 1 + fi + + # Check first line of the file + first_line=$(head -n 1 "$file") + + if [[ $first_line == chr${chr_number}* ]]; then + echo "File $file already has correct 'chr' prefix. No changes needed." + elif [[ $first_line == chrchr${chr_number}* ]]; then + echo "File $file has duplicate 'chr' prefix. Correcting..." + sed -i 's/^chrchr/chr/' "$file" + elif [[ $first_line == ${chr_number}* ]]; then + echo "File $file is missing 'chr' prefix. Adding..." + sed -i 's/^/chr/' "$file" + else + echo "Error: Unexpected format in $file. Please check manually." 
+ exit 1 + fi +} + +# Check and correct 'chr' prefix for each loci file +for i in {1..22} X; do + check_and_correct_chr_prefix "G1000_loci_hg38_chr${i}.txt" "${i}" +done + +for i in {1..22} X +do + # Generate BED file from the tailored loci set + awk '{ print $1 "\t" $2-1 "\t" $2 }' G1000_loci_hg38_chr${i}.txt > chr${i}.bed + + # Extract relevant GC content data for this chromosome + grep "^chr${i}_" GC_G1000_on_target_hg38.txt > chr${i}.txt + + # Intersect BED file with target regions to find loci on target + bedtools intersect -a chr${i}.bed -b targets_with_chr.bed | awk '{ print $1 "_" $3 }' > chr${i}_on_target.txt + + # Calculate the number of lines needed for random sampling (30% of total) + n=$(wc -l < chr${i}_on_target.txt) + count=$((n * 3 / 10)) + + # Get loci that are both on target and match the GC content data + grep -xf chr${i}.txt chr${i}_on_target.txt > chr${i}.temp + + # Add random subset of on-target loci to the list + shuf -n $count chr${i}_on_target.txt >> chr${i}.temp + + # Sort, remove duplicates, and format output + sort -n -k2 -t '_' chr${i}.temp | uniq | awk 'BEGIN { FS="_" } ; { print $1 "\t" $2 }' > battenberg_loci_on_target_hg38_chr${i}.txt +done + +# Compress the resulting loci files into a zip archive +zip battenberg_loci_on_target_hg38.zip battenberg_loci_on_target_hg38_chr*.txt + +``` + +If using Battenberg provided references: + ```bash cd battenberg_loci_on_target_hg38/ rm *chrstring* @@ -837,30 +922,30 @@ nextflow run nf-core/sarek --known_indels false --genome GRCh38.GATK For GATK.GRCh38 the links for each reference file and the corresponding processes that use them is listed below. 
For GATK.GRCh37 the files originate from the same sources: -| File | Tools | Origin | Docs | -| :-------------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :-------------------------------------------------------------------------------------------------------------------- | :----------------------------------------------------------------------------------- | -| ascat_alleles | ASCAT | https://www.dropbox.com/s/uouszfktzgoqfy7/G1000_alleles_hg38.zip | https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS | -| ascat_loci | ASCAT | https://www.dropbox.com/s/80cq0qgao8l1inj/G1000_loci_hg38.zip | https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS | -| ascat_loci_gc | ASCAT | https://www.dropbox.com/s/80cq0qgao8l1inj/G1000_loci_hg38.zip | https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS | -| ascat_loci_rt | ASCAT | https://www.dropbox.com/s/xlp99uneqh6nh6p/RT_G1000_hg38.zip | https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS | -| bwa | bwa-mem | `bwa index -p bwa/${fasta.baseName} $fasta` | | -| bwamem2 | bwa-mem2 | `bwa-mem2 index -p bwamem2/${fasta} $fasta` | | -| dragmap | DragMap | `dragen-os --build-hash-table true --ht-reference $fasta --output-directory dragmap` | | -| dbsnp | Baserecalibrator, ControlFREEC, GenotypeGVCF, HaplotypeCaller | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | https://gatk.broadinstitute.org/hc/en-us/articles/360035890811-Resource-bundle | -| dbsnp_tbi | Baserecalibrator, ControlFREEC, GenotypeGVCF, 
HaplotypeCaller | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | | -| dict | Baserecalibrator(Spark), CNNScoreVariant, EstimateLibraryComplexity, FilterMutectCalls, FilterVariantTranches, GatherPileupSummaries,GenotypeGVCF, GetPileupSummaries, HaplotypeCaller, MarkDulpicates(Spark), MergeVCFs, Mutect2, Variantrecalibrator | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | https://gatk.broadinstitute.org/hc/en-us/articles/360035890811-Resource-bundle | -| fasta | ApplyBQSR(Spark), ApplyVQSR, ASCAT, Baserecalibrator(Spark), BWA, BWAMem2, CNNScoreVariant, CNVKit, ControlFREEC, DragMap, DEEPVariant, EnsemblVEP, EstimateLibraryComplexity, FilterMutectCalls, FilterVariantTranches, FreeBayes, GatherPileupSummaries,GenotypeGVCF, GetPileupSummaries, HaplotypeCaller, interval building, Manta, MarkDuplicates(Spark),MergeVCFs,MSISensorPro, Mutect2, Samtools, SnpEff, Strelka, Tiddit, Variantrecalibrator | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | https://gatk.broadinstitute.org/hc/en-us/articles/360035890811-Resource-bundle | -| fasta_fai | ApplyBQSR(Spark), ApplyVQSR, ASCAT, Baserecalibrator(Spark), BWA, BWAMem2, CNNScoreVariant, CNVKit, ControlFREEC, DragMap, DEEPVariant, EnsemblVEP, EstimateLibraryComplexity, FilterMutectCalls, FilterVariantTranches, FreeBayes, GatherPileupSummaries,GenotypeGVCF, GetPileupSummaries, HaplotypeCaller, interval building, Manta, MarkDuplicates(Spark),MergeVCFs,MSISensorPro, Mutect2, Samtools, SnpEff, Strelka, Tiddit, Variantrecalibrator | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | https://gatk.broadinstitute.org/hc/en-us/articles/360035890811-Resource-bundle | -| germline_resource | GetPileupsummaries,Mutect2 | 
[GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | | -| germline_resource_tbi | GetPileupsummaries,Mutect2 | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | | -| intervals | ApplyBQSR(Spark), ASCAT, Baserecalibrator(Spark), BCFTools, CNNScoreVariants, ControlFREEC, Deepvariant, FilterVariantTranches, FreeBayes, GenotypeGVCF, GetPileupSummaries, HaplotypeCaller, Strelka, mpileup, MSISensorPro, Mutect2, VCFTools | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | | -| known_indels | BaseRecalibrator(Spark), FilterVariantTranches | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | | -| known_indels_tbi | BaseRecalibrator(Spark), FilterVariantTranches | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | | -| known_snps | BaseRecalibrator(Spark), FilterVariantTranches, VariantRecalibrator | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | | -| known_snps_tbi | BaseRecalibrator(Spark), FilterVariantTranches, VariantRecalibrator | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | -| mappability | ControlFREEC | http://xfer.curie.fr/get/vyIi4w8EONl/out100m2_hg38.zip | http://boevalab.inf.ethz.ch/FREEC/tutorial.html | -| pon | Mutect2 | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | https://gatk.broadinstitute.org/hc/en-us/articles/360035890631-Panel-of-Normals-PON- | -| pon_tbi | Mutect2 | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | 
https://gatk.broadinstitute.org/hc/en-us/articles/360035890631-Panel-of-Normals-PON- | +| File | Tools | Origin | Docs | +| :-------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :-------------------------------------------------------------------------------------------------------------------- | :----------------------------------------------------------------------------------- | +| ascat_alleles | ASCAT | https://www.dropbox.com/s/uouszfktzgoqfy7/G1000_alleles_hg38.zip | https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS | +| ascat_loci | ASCAT | https://www.dropbox.com/s/80cq0qgao8l1inj/G1000_loci_hg38.zip | https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS | +| ascat_loci_gc | ASCAT | https://www.dropbox.com/s/80cq0qgao8l1inj/G1000_loci_hg38.zip | https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS | +| ascat_loci_rt | ASCAT | https://www.dropbox.com/s/xlp99uneqh6nh6p/RT_G1000_hg38.zip | https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS | +| bwa | bwa-mem | `bwa index -p bwa/${fasta.baseName} $fasta` | | +| bwamem2 | bwa-mem2 | `bwa-mem2 index -p bwamem2/${fasta} $fasta` | | +| dragmap | DragMap | `dragen-os --build-hash-table true --ht-reference $fasta --output-directory dragmap` | | +| dbsnp | Baserecalibrator, ControlFREEC, GenotypeGVCF, HaplotypeCaller | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | https://gatk.broadinstitute.org/hc/en-us/articles/360035890811-Resource-bundle | +| dbsnp_tbi | 
Baserecalibrator, ControlFREEC, GenotypeGVCF, HaplotypeCaller | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | | +| dict | Baserecalibrator(Spark), CNNScoreVariant, EstimateLibraryComplexity, FilterMutectCalls, FilterVariantTranches, GatherPileupSummaries,GenotypeGVCF, GetPileupSummaries, HaplotypeCaller, MarkDulpicates(Spark), MergeVCFs, Mutect2, Variantrecalibrator | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | https://gatk.broadinstitute.org/hc/en-us/articles/360035890811-Resource-bundle | +| fasta | ApplyBQSR(Spark), ApplyVQSR, ASCAT, Baserecalibrator(Spark), BWA, BWAMem2, CNNScoreVariant, CNVKit, ControlFREEC, DragMap, DEEPVariant, EnsemblVEP, EstimateLibraryComplexity, FilterMutectCalls, FilterVariantTranches, FreeBayes, GatherPileupSummaries,GenotypeGVCF, GetPileupSummaries, HaplotypeCaller, indexcov, interval building, Manta, MarkDuplicates(Spark),MergeVCFs,MSISensorPro, Mutect2, Samtools, SnpEff, Strelka, Tiddit, Variantrecalibrator | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | https://gatk.broadinstitute.org/hc/en-us/articles/360035890811-Resource-bundle | +| fasta_fai | ApplyBQSR(Spark), ApplyVQSR, ASCAT, Baserecalibrator(Spark), BWA, BWAMem2, CNNScoreVariant, CNVKit, ControlFREEC, DragMap, DEEPVariant, EnsemblVEP, EstimateLibraryComplexity, FilterMutectCalls, FilterVariantTranches, FreeBayes, GatherPileupSummaries,GenotypeGVCF, GetPileupSummaries, HaplotypeCaller, indexcov, interval building, Manta, MarkDuplicates(Spark),MergeVCFs,MSISensorPro, Mutect2, Samtools, SnpEff, Strelka, Tiddit, Variantrecalibrator | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | https://gatk.broadinstitute.org/hc/en-us/articles/360035890811-Resource-bundle | +| germline_resource | 
GetPileupsummaries,Mutect2 | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | | +| germline_resource_tbi | GetPileupsummaries,Mutect2 | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | | +| intervals | ApplyBQSR(Spark), ASCAT, Baserecalibrator(Spark), BCFTools, CNNScoreVariants, ControlFREEC, Deepvariant, FilterVariantTranches, FreeBayes, GenotypeGVCF, GetPileupSummaries, HaplotypeCaller, Strelka, mpileup, MSISensorPro, Mutect2, VCFTools | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | | +| known_indels | BaseRecalibrator(Spark), FilterVariantTranches | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | | +| known_indels_tbi | BaseRecalibrator(Spark), FilterVariantTranches | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | | +| known_snps | BaseRecalibrator(Spark), FilterVariantTranches, VariantRecalibrator | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | | +| known_snps_tbi | BaseRecalibrator(Spark), FilterVariantTranches, VariantRecalibrator | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | +| mappability | ControlFREEC | http://xfer.curie.fr/get/vyIi4w8EONl/out100m2_hg38.zip | http://boevalab.inf.ethz.ch/FREEC/tutorial.html | +| pon | Mutect2 | [GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | https://gatk.broadinstitute.org/hc/en-us/articles/360035890631-Panel-of-Normals-PON- | +| pon_tbi | Mutect2 | 
[GATKBundle](https://console.cloud.google.com/storage/browser/_details/genomics-public-data/resources/broad/hg38/v0/) | https://gatk.broadinstitute.org/hc/en-us/articles/360035890631-Panel-of-Normals-PON- | ## How to customise SnpEff and VEP annotation @@ -897,7 +982,6 @@ By default all is specified in the [igenomes.config](https://github.com/nf-core/ Explanation can be found for all params in the documentation: - [snpeff_db](https://nf-co.re/sarek/parameters#snpeff_db) -- [snpeff_genome](https://nf-co.re/sarek/parameters#snpeff_genome) - [vep_genome](https://nf-co.re/sarek/parameters#vep_genome) - [vep_species](https://nf-co.re/sarek/parameters#vep_species) - [vep_cache_version](https://nf-co.re/sarek/parameters#vep_cache_version) @@ -905,8 +989,7 @@ Explanation can be found for all params in the documentation: With the previous example of `GRCh38`, these are the values that were used for these params: ```bash -snpeff_db = '105' -snpeff_genome = 'GRCh38' +snpeff_db = 'GRCh38.105' vep_cache_version = '110' vep_genome = 'GRCh38' vep_species = 'homo_sapiens' @@ -1013,6 +1096,12 @@ This command could be used to point to the recently downloaded cache and run Snp nextflow run nf-core/sarek --outdir results --vep_cache /path_to/my-own-cache/vep_cache --snpeff_cache /path_to/my-own-cache/snpeff_cache --tools vep,snpeff --input samplesheet_vcf.csv ``` +Here is an example of how sarek may be used to download the SnpEff cache for Candida auris: + +```bash +nextflow run nf-core/sarek --outdir results --outdir_cache /path_to/my-own-cache --tools snpeff --download_cache --build_only_index --input false --snpeff_db _candida_auris_gca_001189475 --step annotate --genome null --igenomes_ignore +``` + ### Create containers with pre-downloaded cache nf-core is no longer maintaining containers with pre-downloaded cache. Hosting the cache within the container is not recommended as it can cause a number of problems. Instead we recommend using an external cache. 
The following is left for legacy reasons. @@ -1099,12 +1188,20 @@ Sentieon supply license in the form of a string-value (a url) or a file. It shou nextflow secrets set SENTIEON_LICENSE_BASE64 $(echo -n | base64 -w 0) ``` +:::note + is formatted as `IP:Port` for example: `12.12.12.12:8990` +::: + If a license file is supplied, then the nextflow secret should be set like this: ```bash nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat | base64 -w 0) ``` +:::note +If you're looking for documentation on how the nf-core Sentieon GitHub Actions and Sentieon License Server are set up: [Here be dragons.](https://github.com/nf-core/ops/blob/main/pulumi/sentieon_license_server/README.md) +::: + ### Available Sentieon functions Sarek contains the following Sentieon functions from [DnaSeq](https://support.sentieon.com/manual/DNAseq_usage/dnaseq/) : [bwa mem](https://support.sentieon.com/manual/usages/general/#bwa-mem-syntax), [LocusCollector](https://support.sentieon.com/manual/usages/general/#locuscollector-algorithm) + [Dedup](https://support.sentieon.com/manual/usages/general/#dedup-algorithm), [Haplotyper](https://support.sentieon.com/manual/usages/general/#haplotyper-algorithm), [GVCFtyper](https://support.sentieon.com/manual/usages/general/#gvcftyper-algorithm) and [VarCal](https://support.sentieon.com/manual/usages/general/#varcal-algorithm) + [ApplyVarCal](https://support.sentieon.com/manual/usages/general/#applyvarcal-algorithm), so the basic processing of alignment of fastq-files to VCF-files can be done using speedup Sentieon functions. @@ -1147,7 +1244,7 @@ Currently, Sentieon's version of BQSR, QualCal, is not available in Sarek. Recen ## Requested resources for the tools Resource requests are difficult to generalize and are often dependent on input data size. Currently, the number of cpus and memory requested by default were adapted from tests on 5 ICGC paired whole-genome sequencing samples with approximately 40X and 80X depth. 
-For targeted data analysis, this is overshooting by a lot. In this case resources for each process can be limited by either setting `--max_memory` and `-max_cpus` or tailoring the request by process name as described [here](#resource-requests). If you are using sarek for a certain data type regularly, and would like to make these requests available to others on your system, an institution-specific, pipeline-specific config file can be added [here](https://github.com/nf-core/configs/tree/master/conf/pipeline/sarek). +For targeted data analysis, this is overshooting by a lot. In this case resources for each process can be limited by tailoring the request by process name as described [here](#resource-requests). If you are using sarek for a certain data type regularly, and would like to make these requests available to others on your system, an institution-specific, pipeline-specific config file can be added [here](https://github.com/nf-core/configs/tree/master/conf/pipeline/sarek). ## CNV calling with CNVkit diff --git a/docs/usage/variantcalling/img/bqsr.excalidraw.svg b/docs/usage/variantcalling/img/bqsr.excalidraw.svg new file mode 100644 index 0000000000..4dc41bd095 --- /dev/null +++ b/docs/usage/variantcalling/img/bqsr.excalidraw.svg @@ -0,0 +1,17 @@ + + + 
eyJ2ZXJzaW9uIjoiMSIsImVuY29kaW5nIjoiYnN0cmluZyIsImNvbXByZXNzZWQiOnRydWUsImVuY29kZWQiOiJ4nO1daXPiSrL9fn+Fo9/XQbeqstaJeDGBN7CNd1x1MDAxNptcdTAwMTdcdTAwMTNcdTAwMGVcdTAwMTZhMKtZbMzE/e8vi7aNXGaIxVxiXHUwMDEwd0xHuLtBWCUpT548WVlZ//ljb+9X963l/vrn3i+3X8jVKsV27vXXP+z7L267U2k28CM2/H+n2WtcdTAwMTeGR5a73Vbnn3/+Wc+1q263VctcdTAwMTVcXOel0unlap1ur1hpOoVm/c9K1613/mV/XuTq7v+2mvVit+2MTlx1MDAxMnGLlW6z/ftcXG7NrbuNblx1MDAwN3/7/+H/9/b+M/zpXHUwMDE5XdstdHONx5o7/MLwI89cdTAwMDDBjL970WxcZlx1MDAwN0ulXHUwMDA2SjhcdTAwMDXxeUSlc4jn67pF/LiEY3ZHn9i3fjHdXHUwMDE0pzcq0X+4T3b3eb6duY/HRqctVWq12+5b7fetyFx1MDAxNcq9tmdQnW67WXUzlWK3bM8+9v7n9zpNvFx1MDAwYqNvtZu9x3LD7XS+fKfZylx1MDAxNSrdN/tcdTAwMWUhn+/+vlx1MDAwYv/cXHUwMDFivdPH/1x1MDAwMTBHKFx1MDAwMpSNLnT4TSlcdTAwMWOtxNg4XHUwMDBlmjW89TiO/6Gu/TNcdTAwMWFJPleoPuJwXHUwMDFhxdExpVLBXHUwMDE0zOiY1/erY+rzrbJbeSx38T3uuWR3eIulVkZcdFxun+/bX986KVx1MDAwZVx1MDAxZva/R7e1jWZyYr/Q6NVq3jvTKL7fmVx1MDAwZqNcdTAwMTiZXHUwMDA1vL/z12j89vijcXPymtRcdTAwMTez6rr97ueFeWwg2y9D8Spuolx1MDAwM/f5NFx1MDAxNT1p7KfL6tfncX+9/2s0/F6rmPttUlRcdTAwMWElqVx1MDAxMUKi7X1+Xqs0quPXVmtcdTAwMTaqIyv8w3MhXHUwMDEz5v9lnFx1MDAxZcunXGZ8LVx1MDAxZowyTFOpXHUwMDE2tvzpV1x1MDAxZHrLN46gaPnccMqJXHUwMDFjXHUwMDA3gIJVXHUwMDAx0G3nXHUwMDFhnVaujSY1XHRcdTAwMDJcblx1MDAwZaXGXHUwMDE4Qami3lx1MDAxM31CwjOgd0gwXCJcdTAwMTRcdTAwMDVmQK1cdTAwMGWKL1x1MDAxZkxY/1xmXHUwMDAz5Vx1MDAxY1x1MDAxNFnGQEejaja6t5WBO7ySL+9cdTAwMWXn6pXa25dHObRcXLyJ0V9f3orWKo/WgH9cdTAwMTVwqG77i213K0hcdTAwMTKfXHUwMDA31CvFotftXHUwMDE38Dy5SsNtnyzirZvtymOlkaslJ4eBV+zGP1x1MDAxZVx1MDAxMnU8jymf67j2U/u+nonL2bSk1Pi7I3BcdTAwMTJcdTAwMDKMXHQygu88cJ5n1eAulii81XJFJlq1WvnuIFx1MDAxYnpwXCL9SMFcdTAwMDBtbZyXpIPI1V8++Fx1MDAwZTnlWYnl898lJ1xynGnieX9cdTAwMTfI6Tj9QNv64u4sm7spvFx1MDAxZaVuXHUwMDFljq7d5chJXHUwMDEz9FhrJyfJfO1fMSqogCXCsulXXHUwMDFkdvtXXHUwMDE06YGBIFxcgpaKfVx1MDAwNYHigYBgNkFJhyhjtFx1MDAxNIoxKbz8OIOiKEaSmiCzXHUwMDA1XHUwMDEwt32bovCuXHS2OYpKromi5njucYpKbpCiOPVXTohQjGeEXHUwMDE53ah5XHUwMDEw3Y81j1pcdTAwMWR4Ulx1MDAwMorxzlu+lzrfT4RcdTAwMWSi3GjnS+D2IZvQXHUwMDAyv0Z0m9dOnEngUpnd4qc7mtaRezeSP2imT/drl1x1MDAwNXmS
qSzHT4qQpYD/LX5cdTAwMDLiK55cdTAwMTjajZEon8zCxj/9qkNu/IJcYocyJVxi437iKVxiXHUwMDE4XHUwMDA0rqAwflx1MDAxNozg8MTqyPi+gmJcdTAwMWNV/s4rqDlee5tcbopcdTAwMTP/XGJcdTAwMTLpiSst5OJcYs3V6vVMXlx1MDAwZVx1MDAxMrSffL6KPldl8fw17FxixfDQMZRQMYFNpCg+/sHmXHUwMDE1XHUwMDE0Plx1MDAwNlx1MDAwNpKQXHUwMDAwUlx1MDAxOVx1MDAxYqSoa2ViXHRyxKB8v3/Qfa2blHt/XHUwMDEzQlx0xbR/XG6BXHUwMDFiYDhcdTAwMDa2OFx1MDAwMKZfdchcdTAwMDHAXHSKJEVQTJqpXHUwMDEyXG5cdTAwMDJBwToklGCCU0Y8k1x1MDAxM9uQUFx1MDAxOD4uY6bhlFBzXFz3diWUL0cxTVx0SFx1MDAxMIsjNHpcdTAwMTCN9+LnzfjD4PW0d3liXHUwMDEy6kiGXHUwMDFloZI7hCM8p6koyVfP8ZVKbsF8W0RRjdG8XHUwMDAxYnaLokyqdNqJkqwpkofLwsNL92L/xSxHUUYzSbxIWFx1MDAwYkVcdTAwMTlfilJaXHUwMDAxum65eJJv+kWH3v6NTXKjXsRAXGJcdTAwMDT1aIJcdTAwMGaKXG5cdTAwMDBcdTAwMDWzXHUwMDE5Sjhcblx1MDAxOcpo/DlcdTAwMDbEXHUwMDE5XGaF5GCMlDqA9ML3XHSKXCK1siWMdDWCiq2JoOY47nGCim2QoICT8Xc/XHUwMDAwKlx1MDAxNEHZb/Tis1Bnrlx1MDAxONxcZpKtWOauk9tcdTAwMWaI46tidD/0XHUwMDAwtcVcdTAwMTFoalOqI6QzXHUwMDFl0X2HoHKiqEulb1x1MDAxM1x1MDAxNFx1MDAxOFx1MDAwMG2k2q1cdTAwMWGJXHUwMDAyjVx1MDAwZjrJ6PUgdtG8a1x1MDAxNc5fdfWos6SGMtRzi9ZFUJ6HO2b/XFxcdTAwMWKCXHUwMDEwWLxCYvo1h978jSNQXHUwMDA1gERcdTAwMGUy2oP2j0moXHUwMDAwMDCbnpijrYAyknFiZ4P1JC6mXGIoxVx1MDAwMC1Fku1cbigtzebmoFx1MDAwZdbET3P89jg/XHUwMDFkbJCfuPLFJ1x1MDAxM1RrotXi/HRcZk9cdTAwMTeFg8fru/ZAXHUwMDE03GI8fXuVb4ZcdTAwMWSgXHUwMDAyiKOIX/VcdTAwMWVdkZxWVE9SKUEl2TFy4vePtergNZXWT49gjlx1MDAwN/00f6qFUD2B9FxyzijGZVx1MDAwNL2fWnxcdTAwMDJ2+lWH3vqlXHUwMDAz1Fq/XHUwMDFlyqdJ9bQqXHUwMDA01iCdUDRRkHSrXHUwMDE1fIJcdTAwMDKly1hoOKXTXHUwMDFjl71N6SSILzVcdTAwMTlU/ELRJaojKoVa863CUpHW/v15rFx1MDAxN+/H+8/R0IPTaGd8XHUwMDAy+LN+j1x1MDAxMLUyO61YvyekoFxmXHUwMDA3uFPkVFf3hcOj64O+ubmOt6Rrzs1dKoSzT2D8ycn6TIE+e3H7n37VIbd/SbhDlLKT/dSngC9cdTAwMDBcdTAwMTSsYfaJcaYwppTbze0pafjGXGJqXZNPc1x1MDAxY/dWJ5880nRi5Vx1MDAxM0YnNrG0ePb90bylajelu8YzPbt8ZbFz3rhcdH39npDc4WpcZlx1MDAxOFx1MDAxZlx1MDAxNLVy/d5qmT27XCJLci52i55cdTAwMWWOs/3Dx1o/NThcdTAwMWM8PVx1MDAxNY5Zj1x1MDAxY/ZDmNhcdTAwMDPtW78nmFx1MDAwNq5cYl08szf9osNu+4o6XHUwMDAymJTUL7O33sq9b6X1NMfghfMgXHUwMDEyXG4rMJPgS4n71ZhpXVm9OVx1MDAwZXubWT3hiTwm
mFx0pNSGicWZSSQ7hVx1MDAxN1x1MDAxYcmlyvck91CusHg8nVx0OzolUIeC9Fx1MDAxMU9i9cq9XHUwMDE1M3tcdTAwMDKQnjBC2C12MqTNuFx1MDAxNq+ty1xuTd6ddPOJ5OVLXGIze5xLX1x1MDAwMGjQSlx1MDAwMZWLi6fpV1x1MDAxZHpcdTAwMDBIR3FcdTAwMDTAe2HEeGaPXHUwMDA3gYI1JPcoYVxuwFx1MDAxOLXV7Fx1MDAxZSM44o1R1Lqye3M895qze8OjpoCTXHUwMDFhX3AyyvHW21U3XHUwMDBigzNVauXKhdvDSDVD44/VZidJT0XowYkkZFxmmyQnLVx1MDAxY7rmciWhXHUwMDFjxagxwI2dgFdT4kakTo2yXHUwMDBlnZ7Ch8GYR+h+xJFcbqNILWhcdTAwMDCzw5P0xebTly92XHUwMDE5QY9H1DL6x3Onc+3ufqVRrDRcdTAwMWXHv+I2ij6f1HKd7kGzXq90cVx1MDAxOFfNSqM7fsTw90bb7eZr2c1N3Fx1MDAwYvzNvp+17K/7ytejf+2NjGr4n89///tcdTAwMWZTj/Z98PY1+chHv+5cdTAwMGbv30ujnfnnMalRWlx1MDAxYrXEQuTzhGy0aEFV04c0X0KTaJ89lsOOdsqkQ7TRwnCGt1x1MDAxN8alolx1MDAwNkeC4lZcdTAwMTlcdTAwMTlOgfGxkVx1MDAwNUjJRk5Odn/CnjjoeL/w9MeEXHUwMDFiXHUwMDA2TJqvKdO/XG7a8X5cdTAwMTnF+bfSnH9ztE971MMvfjzkJfE9W3D6lzlqglpcdTAwMTOALVx1MDAxZW7HXHUwMDFln5/LvTNVT5zc9rNcdTAwMWS3er1fOFxyO8ZcdTAwMDHA4VNToVxm7CxcdTAwMWXCOoBiklXWM1PCXHJjwut9dkFy0kxfxtL87jTaUFx1MDAxN93n50FcdTAwMWSNOoRcdTAwMGKaucdtj1c6gi2vkItcdTAwMTdSTb/msCOAXHUwMDEzXHUwMDA3pTWTmkg5uZyZcVx1MDAxOVxmXGaCX89MhP1cbvVcdTAwMWO9XHUwMDA1xclcdIb9XHUwMDFiU5zrWs88x3dvcz2zUL7TdYyidYJRYnGInmSvXlx1MDAwNoy9uZ1i9bpWU53zRCv8XHUwMDFkcVB2XG42noz5TVLGIYE03Vitplx1MDAwNEFIhaFSXHUwMDA0gMVccpLUyVlbJ6B5dHjDqHyq64P723Q0hEUlXFzO6LpBOVx1MDAwNlx1MDAwN2ZxLTb9osNcdTAwMGVcdTAwMDFFXHUwMDFko1xyXHUwMDEwXHUwMDEwWk/UlDCugoHBXHUwMDFhikpQPEqqaVx1MDAxMCmXXWGpdVx1MDAxNZXMcd7bLCpR/ulcdTAwMTKNQVxyoVx1MDAxYZZYMFa8uk9cXMHxeVwi0oxmtLygd+Y+7Fxi5Vx1MDAwNqnIXHUwMDE2bkxU5FslxY0mXFyu3lx1MDAxOWq1snwjXHUwMDEwwGo92c+1kVR+8PCaaKpu7OAqcnVcdTAwMTg9vLy5Se+HcPJO+lx1MDAxN/5cbs2FXHUwMDE0yyRcdTAwMTOmX3TIISBcYnekslx1MDAxMFAgJlx1MDAxNzVbKVx1MDAxNVxiXHUwMDBl1jB9XHUwMDA3RFx1MDAwM2OwZS2lJWyu9HFds3dzvPc2a/OVf/c25ChcZq+MYIuXfyUhXHUwMDE5Ob4pZ1x1MDAwZbLPqcFJtaYuY24y7Fx1MDAxOFx1MDAwNaQppZXtbjGhpbRDeVx1MDAxML03VtRS+CRsXHUwMDEzXHUwMDFjvVs1JskyfXZcdTAwMGJlOG6rU3NK6oem3i+EUEuhfvDDgG26j5H94oHa9GtcdTAwMGU5XHUwMDAyOFx1MDAxMVx1MDAwZVxuJkolmSqlZDAoWEd3KKKUYkJstT7/7yGl5vjurUop6p/skMY6xyVcdTAwMWFg
l1x1MDAwN9nHXHUwMDE461aPKieF6v5Ftdvv5UNcdTAwMGZQyVx1MDAxZEOn1ZlYilx1MDAwMs3V6t03VlRShqProLBjSirdeTt/aSa6pdTJoJpcdTAwMTankTM2uFx1MDAwZaGSXHUwMDEyM1ao2KZhTCyzhnL6VYdcdTAwMWRcdTAwMDOK2JS3sZUtPlIqXHUwMDEwIKyjXHUwMDEyklNFrLlstVx1MDAxMnKzNLW2dc6z/fc2tZT0uMWJNqOccaPUXHUwMDEyfXbbhMXzlav0WfKy43aahy/JlFxi/Vx1MDAwNlpcdTAwMWPAXHUwMDAxM1x1MDAxZSt+XHUwMDEwlZFMXHUwMDFist1Ou+gjtF10vlvTUlx1MDAwZq88XHUwMDFmSTxlTM28qoebWPf27egqjFKK+XfapcxokGxxLTX9okOPXHUwMDAw7WghuVxyS6drqUBQsFx1MDAwNi0lMZKQYqu94Fx1MDAwNeeb7MWxLik1x3VvVUop//U0dlx1MDAxMZPQQFx1MDAxN0doSbZYsV2vZtK5y4eyvGtkn9/Cv9xcdTAwMTOog5dppnOUQFx1MDAxZFx1MDAxM0SB3ypiSlx0oSRTbLfSfS+DWHH/pa8rMfeqltf7kZI8ulx1MDAwYqGWkjPWVFx1MDAxMnzwiFx1MDAwMLV4PmH6VYdcdTAwMWVcdTAwMDLKQbgjXHUwMDA0zFBLTUqpQGCwjma7tllcdTAwMTDh6r8o4bcuJTXHeW9TSWniz1LCgEUoXVxcSSWzt3HzmrpM311ccnKvoC6iifBvRSyMcabvxTqs8IPx+aotlKFLyuxcdTAwMWFcdTAwMTi+WzT1VO5WX/l1Vsf711x1MDAwZjl1WHHb8adcdTAwMTCWoUvjv2uPIFx1MDAxNFx1MDAxNPAlXHUwMDE2V06/6pBjQFx1MDAxMulwQFxmwHsh+lx1MDAxOE2pYHBcdTAwMTB4XHUwMDFkOvonNJXtKilBtdz9KvQ5rnubVehcdTAwMTi7zOAobavQl+hcYj/oXHUwMDFjg5u5XHUwMDE3pVrq9un2gqfrqVj4lZRcdTAwMTRcdTAwMGVw21p6arZcdTAwMGZ/rK6kVmxcbm/bXHUwMDFla1xyOyalTkUkVyGlQlx1MDAwZurtxln8+CTJr09C2DtKat+lUtSgz15u49PpV1x1MDAxZHZcZijmcGIk5UKyieZRw3xfXHUwMDEwOFhHY3hOXHUwMDA0M8RsN+G3USm1rlx1MDAwZVJzvPc2O0jpXHUwMDE524ejZTI0zSWqJ55pXlx1MDAwYpbMkXRF9tNSiIp8iodcdTAwMWSiXHUwMDEymEOmtpBcdTAwMWFKKa1cdTAwMDKYlFqxeoJcdTAwMDJgXHUwMDFjXHUwMDFiRFx1MDAxOdNcdTAwMDZZ6rXazpLsY6ZcdTAwMTFcdTAwMDF2XHUwMDEwrVx1MDAwZXpXp3dh7CGluO+kXHUwMDE0XHUwMDA3wySVsLiQmn7RoUeAciRDXHUwMDA0vNdOjOf7VDAoWEfpXHUwMDA02L3PtFx1MDAwZWJn1F1hqXUl/OY47y01kVx1MDAwMv9Un6KUMWaWSPWxJ54t3p9mrkk1dX/4pJv8pVx1MDAxNfq9WcFcdTAwMTbYTltoT6kzvt9WkNxEnFx1MDAxOUhkjlx1MDAxOVu49clVWlxuXHUwMDAzPIj97oLuISNcdTAwMTlVniVFPz1kPlx1MDAwZph41PY1esij7//h/XtZLPvvQaQ5Y0DpXHUwMDEyc8vVkmrLk448fLw5Ok88XHUwMDBm7nqJdPjX5SvjUCpsU1BjhNJipI8/XHUwMDEwLant26ox3pOeLlx1MDAwNSHAte0oJkQgzTNcdTAwMDLHNVx1MDAwZZj84HrK0ZvBNeX+5cd2XHUwMDBiXHUwMDE3RVx1MDAxOF9iPq4myEvpJnpD3Dt29HJcdTAwMTUjt+nzsCOb02HViOBKK07l
uJAkdraOaW6okFx1MDAwNLx73W1cdTAwMWbZOCZcdTAwMDZayjBcdTAwMDL766q2XHUwMDFmYH9cdTAwMWWwXHUwMDE5YDPlK42Nsr10+Fx1MDAxMtL4rGS6MpFzXHUwMDBib4lWunZcdTAwMTm5OM6ch37bdcGHXHUwMDEzfMKureF2fcFcdTAwMThh44PQXHUwMDFhdYig2ni3oNg+rKkmhNreJ6Fr3Wo5gXP908xxytFcdTAwMWLCtfSfnGRIXlx1MDAwNDXnXHUwMDEyO9pAJFx1MDAwNk+Z29LhQbZ4mrvP3MVSoVfVXFxxR2pBNFx1MDAxMVx1MDAwNu85XHUwMDFiK1x1MDAxZUBgXHUwMDEzSTk3YljRXHUwMDExqkjcXHUwMDAwulx1MDAxYsJoXHUwMDE4XHTb7jT/rVxu7Vx1MDAxZmC/f38lYIN/7ztjJFKCdzPNueXbR/y+8HjLbuP1bKp7ljx7LOFcdTAwMTNcYjmuUVU7VFxyu1NcdTAwMGVxPfotXHUwMDFmXG7bNpZcdTAwMTHCNthcblx1MDAxOa5ta2gwPJSBuJT6J3M27ejN4JpcdTAwMTP/npbIXHUwMDA0XG448MVcdDtS18WbwlupUTYvpW66Sp9cdTAwMWJng7BcdTAwMDNbcrypQFx1MDAxNd5cbsrt1PRcdTAwMDSwXHUwMDE5XHUwMDExXHUwMDAyODFcdTAwMDCMj49ru8CWRGBQXHUwMDE1ypS4XHUwMDA06ZlA+Vx1MDAwMfbnXHUwMDAxwVx1MDAwMtut1SqtzvTsXHUwMDE58a2SUlx1MDAwNFlba754kZSqmOPH0+L9yWv+XCJcdTAwMTWRXHUwMDBmrd51yW/+udBudjqRcq5bKG9cdTAwMWbeXHUwMDFjhfbUUkFqXHUwMDFjObtNp0uAXHUwMDAy/V4tu3A+liOT4Wv0KD6BrVx1MDAxY0qGXHUwMDFiKP0+goxcdTAwMWPPZz0vp8xcdTAwMTC5ln7rK8LbMO/+jCtWUMwyYulcdTAwMWZ4gsIwTCwxt/OSr9cj0SO3+li5uzxPRkX27Vbvglx1MDAxMaOPd4RcdTAwMWXfq3hoxlxmXHUwMDA1pZjdyG/bdmz1XHUwMDAxMd5lXHUwMDAzIbJjaWBcdTAwMTma+q5cdTAwMWQz/1xyLtBcdTAwMTNzXHUwMDA12iwuoFx1MDAwZU8z3VKsqF9cdTAwMDd1nS2nXHUwMDFh96R+5tfeP1R2LDGa4or7uGOhOFx1MDAwYrVDtlx1MDAxYuxoiq4vhPvYKFx1MDAwMsSj8NZU00b9e42gf1x1MDAxMmjEni3D55nxXHUwMDFi2c9cdTAwMTbJ/nU1cv5Ya+Xf8lx1MDAwZr0rv35AoTJjipJcdTAwMDH9hlx1MDAwMlx1MDAwMPRsdlx1MDAwMu6LLVMhXHUwMDFjXG6c2V2PjDbafzumVVxmmkrhXHUwMDEwu5aWSKNcdTAwMTn3JFx1MDAxYWfUtXF0xGjhsK11rLazp51cdTAwMTJkm9tcdTAwMWKx7Zbctos2uvfoNpp1j3f+UuVWc0tfLf5rjVu32fIrcPsy8vFqNv+zXHUwMDA3Udzm21x1MDAwZsV/ek1rXHUwMDAwo9TiKD18qt/UK+WT04db95pcdTAwMTZq6qXeNruAUnTTjmBCS2RcdTAwMWPFXHUwMDA0XHUwMDFmT8VzhZRgS2IkJ1xuI23/nFx1MDAxZCtcdTAwMTmX8+/B1IhhU1x1MDAxNDyFXVx1MDAxY6P5XHUwMDE03pla7KZcdTAwMDQ6l22tkdhcdTAwMDJML2K3e20rvjdcbs/Js65cdTAwMTOW1DO1MlF0avBFOCxcdTAwMGXMKKukL4/jkUg1XHUwMDFlbyePS2dcdTAwMTe6kdtcdTAwMDVgUoxcdTAwMDKtdDOGXHUwMDEyRVx1MDAwMb6yJ9NcYluBelx1MDAwMYM1XGbF9Kpz3z7s
+[figure labels: ATATGCGTCGATGTGTGACG; reference genome; NGS read; original Phred score; present in dbSNP?; no; yes; no; potential errors?; error; error; 10101020202020101010]
\ No newline at end of file
diff --git a/docs/usage/variantcalling/img/clinvar_results.png b/docs/usage/variantcalling/img/clinvar_results.png
new file mode 100644
index 0000000000..5e88f3d961
Binary files /dev/null and b/docs/usage/variantcalling/img/clinvar_results.png differ
diff --git a/docs/usage/variantcalling/img/clinvar_search.png b/docs/usage/variantcalling/img/clinvar_search.png
new file mode 100644
index 0000000000..df8c58d84d
Binary files /dev/null and b/docs/usage/variantcalling/img/clinvar_search.png differ
diff --git a/docs/usage/variantcalling/img/gnomAD_COL6A1_v2.1.png b/docs/usage/variantcalling/img/gnomAD_COL6A1_v2.1.png
new file mode 100644
index 0000000000..2fbc497fe0
Binary files /dev/null and b/docs/usage/variantcalling/img/gnomAD_COL6A1_v2.1.png differ
diff --git a/docs/usage/variantcalling/img/gnomAD_COL6A1_v4.0.png b/docs/usage/variantcalling/img/gnomAD_COL6A1_v4.0.png
new file mode 100644
index 0000000000..ee148dba85
Binary files /dev/null and b/docs/usage/variantcalling/img/gnomAD_COL6A1_v4.0.png differ
diff --git a/docs/usage/variantcalling/img/gnomAD_constraint.png b/docs/usage/variantcalling/img/gnomAD_constraint.png
new file mode 100644
index 0000000000..9183272073
Binary files /dev/null and b/docs/usage/variantcalling/img/gnomAD_constraint.png differ
diff --git a/docs/usage/variantcalling/img/gnomad_search.png b/docs/usage/variantcalling/img/gnomad_search.png
new file mode 100644
index 0000000000..c5577353cb
Binary files /dev/null and b/docs/usage/variantcalling/img/gnomad_search.png differ
diff --git
a/docs/usage/variantcalling/img/gnomad_var_present.png b/docs/usage/variantcalling/img/gnomad_var_present.png
new file mode 100644
index 0000000000..5e42034bc2
Binary files /dev/null and b/docs/usage/variantcalling/img/gnomad_var_present.png differ
diff --git a/docs/usage/variantcalling/img/interpretation.excalidraw.svg b/docs/usage/variantcalling/img/interpretation.excalidraw.svg
new file mode 100644
index 0000000000..dd9c7f6a3a
--- /dev/null
+++ b/docs/usage/variantcalling/img/interpretation.excalidraw.svg
@@ -0,0 +1,17 @@
+[figure labels: annotations (prediction); clinical and family information; filter variants; population databases; conclusion]
\ No newline at end of file
diff --git a/docs/usage/variantcalling/img/overview.excalidraw.svg b/docs/usage/variantcalling/img/overview.excalidraw.svg
new file mode 100644
index 0000000000..6b17b2f6e5
--- /dev/null
+++
b/docs/usage/variantcalling/img/overview.excalidraw.svg
@@ -0,0 +1,17 @@
+[figure labels: FASTQ; alignment; mark duplicates; base quality score recalibration; variant sites; genotype assignment; variant calling; variant quality score recalibration; annotation; interpretation; BAM /CRAM; BAM /CRAM; VCF; VCF; answer; VCF; pre processing]
\ No newline at end of file
diff --git a/docs/usage/variantcalling/img/sarek_subway.png b/docs/usage/variantcalling/img/sarek_subway.png
new file mode 100644
index 0000000000..e2a689b1ca
Binary files /dev/null and b/docs/usage/variantcalling/img/sarek_subway.png differ
diff --git a/docs/usage/variantcalling/interpretation.md b/docs/usage/variantcalling/interpretation.md
new file mode 100644
index 0000000000..0e3b1e3954
--- /dev/null
+++ b/docs/usage/variantcalling/interpretation.md
@@ -0,0 +1,126 @@
+---
+order: 4
+---
+
+# Interpretation
+
+Once variants have been called, the following steps depend on the type of study and the experimental design.
+For large population studies, like case-control association analyses, an appropriate large-scale statistical approach will be chosen and different statistical or analytical tools will be used to carry out the tertiary analyses.
+
+When only a few individuals are involved, and in particular in clinical contexts, the goal will be to interpret the findings in light of different sources of information and pinpoint a causative variant for the investigated phenotype.
+
+## Overview
+
+When variants have been called, and a diagnosis is necessary, investigators will need to combine:
+
+- the predictions resulting from annotations like the one we carried out
+- biological and clinical information
+
+with the goal of narrowing the search space and reducing the number of variants to be inspected.
+This approach is summarised in the diagram below:
+
+![interpretation](./img/interpretation.excalidraw.svg)
+
+Once the list of variants has been reduced, more in-depth analyses of the reported cases and the genomic region in existing databases might be useful to reach a conclusion.
+
+## Finding Causative Variants
+
+Some of these steps might be carried out via software. For this tutorial, however, we chose to perform these steps one by one in order to get a better view of the rationale behind this approach.
+
+We will start by looking at the annotated VCF, which is found at this location in our GitPod environment:
+
+```bash
+cd /workspace/gitpod/training/annotation/haplotypecaller/joint_variant_calling
+```
+
+Here, we should verify in which order the two samples we used for this analysis have been written in the VCF file. We can do that by grepping the header row of the file and printing to the screen the fields from the 10th onwards, i.e.
the sample columns:
+
+```bash
+zcat joint_germline_recalibrated_snpEff.ann.vcf.gz | grep "#CHROM" | cut -f 10-
+```
+
+This returns:
+
+```bash
+case_case control_control
+```
+
+showing that case variants have been written in the 10th field and control variants in the 11th.
+
+Next, in this educational scenario we might assume that an affected individual (case) will carry at least one alternative allele for the causative variant, while the control individual will be homozygous for the reference allele.
+With this assumption in mind, and a bit of one-liner code, we could first filter for variants that are homozygous for the alternative allele in our case, and then for the heterozygous ones.
+
+For the first filter, we can use the following code:
+
+```bash
+# Perl's autosplit array @F is zero-based: $F[9] is the case (10th field), $F[10] the control (11th field)
+zcat joint_germline_recalibrated_snpEff.ann.vcf.gz | grep PASS | grep HIGH | perl -nae 'if($F[10]=~/0\/0/ && $F[9]=~/1\/1/){print $_;}'
+```
+
+which results in the following variant.
+
+```bash
+chr21 32576780 rs541034925 A AC 332.43 PASS AC=2;AF=0.5;AN=4;DB;DP=94;ExcessHet=0;FS=0;MLEAC=2;MLEAF=0.5;MQ=60;POSITIVE_TRAIN_SITE;QD=33.24;SOR=3.258;VQSLOD=953355.11;culprit=FS;ANN=AC|frameshift_variant|HIGH|TCP10L|ENSG00000242220|transcript|ENST00000300258.8|protein_coding|5/5|c.641dupG|p.Val215fs|745/3805|641/648|214/215||,AC|frameshift_variant|HIGH|CFAP298-TCP10L|ENSG00000265590|transcript|ENST00000673807.1|protein_coding|8/8|c.1163dupG|p.Val389fs|1785/4781|1163/1170|388/389|| GT:AD:DP:GQ:PL 1/1:0,10:10:30:348,30,0 0/0:81,0:81:99:0,119,1600
+```
+
+Now we can search for this variant in the [gnomAD database](https://gnomad.broadinstitute.org), which hosts variants and genomic information from sequencing data of almost one million individuals (see [v4 release](https://gnomad.broadinstitute.org/news/2023-11-gnomad-v4-0/)).
+
+In order to search for the variant we can type its coordinates in the search field and choose the proposed variant corresponding to the exact position we need.
See the figure below:
+
+![gnomad search](./img/gnomad_search.png)
+
+The resulting [variant data](https://gnomad.broadinstitute.org/region/21-32576780-32576780?dataset=gnomad_r4) show that our variant is present, and that it has already been described in [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar/), where the provided interpretation (Clinical Significance) is "Benign".
+
+We can see the resulting table in the following image:
+
+![gnomad results](./img/gnomad_var_present.png)
+
+Quite importantly, the gnomAD database allows us to gather more information on the gene this variant occurs in. We can inspect the so-called "constraint data" by clicking on the gene name and inspecting the "constraint" table on the top right of the page.
+
+![constraint](./img/gnomAD_constraint.png)
+
+This information gives us a better view of the selective pressure that variation in this gene might be subject to, and therefore informs our understanding of the potential impact of a loss-of-function variant in this location.
+
+In this specific case, however, the gene is under purifying selection neither for loss-of-function variants (LOEUF 0.89) nor for missense ones.
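
The genotype filter shown earlier matches `0/0` and `1/1` anywhere in the sample columns. A stricter variant, given here only as a minimal sketch, compares the `GT` sub-field alone. It assumes tab-separated VCF records, the case sample in column 10 and the control in column 11 (as verified above), `GT` as the first `FORMAT` sub-field, and unphased genotypes. The helper name `filter_gt` is hypothetical and not part of sarek.

```shell
# Hypothetical helper: keep PASS / HIGH-impact records where the case
# (column 10) has genotype $1 and the control (column 11) has genotype $2.
filter_gt() {
  awk -F'\t' -v case_gt="$1" -v ctrl_gt="$2" '
    /^#/ { next }                          # skip header lines
    $7 != "PASS" { next }                  # FILTER column must be PASS
    index($8, "|HIGH|") == 0 { next }      # snpEff impact must be HIGH
    {
      split($10, c, ":")                   # c[1] is the case genotype
      split($11, k, ":")                   # k[1] is the control genotype
      if (c[1] == case_gt && k[1] == ctrl_gt) print
    }'
}
```

Under those assumptions it could be used as `zcat joint_germline_recalibrated_snpEff.ann.vcf.gz | filter_gt "1/1" "0/0"`, and with `"0/1" "0/0"` for heterozygous calls in the case.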
+
+We can continue our analysis by looking at the heterozygous variants in our case, for which the control is homozygous for the reference allele, with the code:
+
+```bash
+# As before, $F[9] is the case sample and $F[10] the control sample
+zcat joint_germline_recalibrated_snpEff.ann.vcf.gz | grep PASS | grep HIGH | perl -nae 'if($F[10]=~/0\/0/ && $F[9]=~/0\/1/){print $_;}'
+```
+
+This will result in the following list of variants:
+
+```bash
+chr21 44339194 rs769070783 T C 57.91 PASS AC=1;AF=0.25;AN=4;BaseQRankSum=-2.373;DB;DP=84;ExcessHet=0;FS=0;MLEAC=1;MLEAF=0.25;MQ=60;MQRankSum=0;POSITIVE_TRAIN_SITE;QD=3.41;ReadPosRankSum=-0.283;SOR=0.859;VQSLOD=198.85;culprit=FS;ANN=C|start_lost|HIGH|CFAP410|ENSG00000160226|transcript|ENST00000397956.7|protein_coding|1/7|c.1A>G|p.Met1?|200/1634|1/1128|1/375||,C|upstream_gene_variant|MODIFIER|ENSG00000232969|ENSG00000232969|transcript|ENST00000426029.1|pseudogene||n.-182T>C|||||182|,C|downstream_gene_variant|MODIFIER|ENSG00000184441|ENSG00000184441|transcript|ENST00000448927.1|pseudogene||n.*3343T>C|||||3343|;LOF=(CFAP410|ENSG00000160226|1|1.00) GT:AD:DP:GQ:PL 0/1:8,9:17:66:66,0,71 0/0:67,0:67:99:0,118,999
+chr21 44406660 rs139273180 C T 35.91 PASS AC=1;AF=0.25;AN=4;BaseQRankSum=-4.294;DB;DP=127;ExcessHet=0;FS=5.057;MLEAC=1;MLEAF=0.25;MQ=60;MQRankSum=0;POSITIVE_TRAIN_SITE;QD=0.51;ReadPosRankSum=0.526;SOR=1.09;VQSLOD=269.00;culprit=FS;ANN=T|stop_gained|HIGH|TRPM2|ENSG00000142185|transcript|ENST00000397932.6|protein_coding|19/33|c.2857C>T|p.Gln953*|2870/5216|2857/4662|953/1553||;LOF=(TRPM2|ENSG00000142185|1|1.00);NMD=(TRPM2|ENSG00000142185|1|1.00) GT:AD:DP:GQ:PL 0/1:48,22:71:44:44,0,950 0/0:51,0:51:99:0,100,899
+chr21 45989090 .
C T 43.91 PASS AC=1;AF=0.25;AN=4;BaseQRankSum=2.65;DP=89;ExcessHet=0;FS=4.359;MLEAC=1;MLEAF=0.25;MQ=60;MQRankSum=0;QD=2.58;ReadPosRankSum=-1.071;SOR=1.863;VQSLOD=240.19;culprit=FS;ANN=T|stop_gained|HIGH|COL6A1|ENSG00000142156|transcript|ENST00000361866.8|protein_coding|9/35|c.811C>T|p.Arg271*|892/4203|811/3087|271/1028||;LOF=(COL6A1|ENSG00000142156|1|1.00);NMD=(COL6A1|ENSG00000142156|1|1.00) GT:AD:DP:GQ:PL 0/1:10,7:18:51:52,0,51 0/0:70,0:70:99:0,120,1800
+```
+
+If we search for them one by one, we will see that one in particular occurs in a gene (COL6A1) which was previously reported as constrained for loss-of-function variants in version 2.1 of the database:
+
+![col6a1v2](./img/gnomAD_COL6A1_v2.1.png)
+
+while version 4.0 of the database, resulting from almost one million samples, reports the gene as _not_ constrained:
+
+![col6a1v4](./img/gnomAD_COL6A1_v4.0.png)
+
+We can search for this variant in ClinVar by using an advanced search and limiting our search to both chromosome and base position, as indicated in the figure below:
+
+![clinvar search](./img/clinvar_search.png)
+
+This will return two results: one deletion and one single nucleotide variant C>T corresponding to the one we called in the case individual:
+
+![clinvar results](./img/clinvar_results.png)
+
+If we click on the nomenclature of the variant we found, we will be able to access the data provided with the submission. On [this page](https://www.ncbi.nlm.nih.gov/clinvar/variation/497373/) we can see that multiple submitters have provided an interpretation for this nonsense mutation (2 stars).
+Under the section "Submitted interpretations and evidence" we can gather additional data on the clinical information that led the submitters to classify the variant as "pathogenic".
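
The candidate records above carry their functional annotation in the `ANN` INFO field, which is hard to read in raw form. As a minimal sketch, the first annotation of each record can be condensed into a short table before looking the variants up in gnomAD or ClinVar. It assumes tab-separated VCF input and the standard snpEff `ANN` layout (allele, effect, impact, gene name, with HGVS.c and HGVS.p in the 10th and 11th sub-fields). The helper name `summarise_ann` is hypothetical.

```shell
# Hypothetical helper: print position, change, gene, effect, impact,
# HGVS.c and HGVS.p for the first ANN entry of each VCF record.
summarise_ann() {
  awk -F'\t' -v OFS='\t' '
    /^#/ { next }                                  # skip header lines
    {
      n = split($8, info, ";")                     # INFO key=value pairs
      for (i = 1; i <= n; i++) {
        if (substr(info[i], 1, 4) == "ANN=") {
          split(substr(info[i], 5), anns, ",")     # one entry per transcript
          split(anns[1], f, "|")                   # first annotation only
          pos = $1 ":" $2
          change = $4 ">" $5
          print pos, change, f[4], f[2], f[3], f[10], f[11]
          break
        }
      }
    }'
}
```

It could be appended to either filter, for example `... | perl -nae '...' | summarise_ann`. Only the first transcript annotation is reported, so records with several `ANN` entries still deserve a look at the full line.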
+ +## Conclusions + +After narrowing down our search and inspecting genomic context and clinical information, we can conclude that the variant + +```bash +chr21 45989090 C T AC=1;AF=0.25;AN=4;BaseQRankSum=2.37;DP=86;ExcessHet=0;FS=0;MLEAC=1;MLEAF=0.25;MQ=60;MQRankSum=0;QD=2.99;ReadPosRankSum=-0.737;SOR=1.022;VQSLOD=9.09;culprit=QD;ANN=T|stop_gained|HIGH|COL6A1|ENSG00000142156|transcript|ENST00000361866.8|protein_coding|9/35|c.811C>T|p.Arg271*|892/4203|811/3087|271/1028||;LOF=(COL6A1|ENSG00000142156|1|1.00);NMD=(COL6A1|ENSG00000142156|1|1.00) GT:AD:DP:GQ:PL 0/1:8,6:15:40:50,0,40 0/0:70,0:70:99:0,112,1494 +``` + +is most likely the causative one, because it creates a premature stop in the COL6A1 gene, with loss of function variants on this gene known to be pathogenic. diff --git a/docs/usage/variantcalling/introduction.md b/docs/usage/variantcalling/introduction.md new file mode 100644 index 0000000000..78e9c79075 --- /dev/null +++ b/docs/usage/variantcalling/introduction.md @@ -0,0 +1,45 @@ +--- +order: 1 +--- + +# Variant Calling Tutorial + +These pages are a tutorial workshop for the [Nextflow](https://www.nextflow.io) pipeline [nf-core/sarek](https://nf-co.re/sarek). + +In this workshop, we will recap the application of next generation sequencing to identify genetic variations in a genome. You will learn how to use the pipeline sarek to carry out this data-intensive workflow efficiently. We will cover topics such as experimental design, configuration of the pipeline and code execution. + +Please note that this is not an introductory workshop, and we will assume some basic familiarity with Nextflow. 
+ +By the end of this workshop, you will be able to: + +- understand the key concepts behind variant calling, as adopted in this pipeline +- analyse simple NGS datasets with the sarek workflow +- customise some of its features for your own variant calling analyses +- integrate different sources of information to identify candidate variants +- make a hypothesis about variant interpretation using the output of sarek + +Let's get started! + +## Running with Gitpod + +In order to run this using GitPod, please make sure: + +1. You have a GitHub account: if not, create one [here](https://github.com/signup) +2. Once you have a GitHub account, sign up for GitPod using your GitHub user [here](https://gitpod.io/login/) choosing "continue with GitHub". + +Now you're all set and can use the following button to launch the service: + +[![Open in GitPod](https://img.shields.io/badge/Gitpod-%20Open%20in%20Gitpod-908a85?logo=gitpod)](https://gitpod.io/#https://github.com/lescai-teaching/sarek-tutorial) + +## Additional documentation + +- You can find detailed documentation on **Nextflow** [here](https://www.nextflow.io/docs/latest/) +- You can find additional training on [these pages](https://training.nextflow.io) + +## Credits & Copyright + +This training material has been written by [Francesco Lescai](https://github.com/lescai) during the [nf-core](https://nf-co.re) Hackathon in Barcelona, 2023. It was originally meant as a contribution for the nf-core community, and aimed at anyone who is interested in using nf-core pipelines for their studies or research activities. + +The Docker image and Gitpod environment used in this repository have been created by [Seqera](https://seqera.io) but have been made open-source ([CC BY-NC-ND](https://creativecommons.org/licenses/by-nc-nd/4.0/)) for the community. 
+ +All examples and descriptions are licensed under the [Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License](http://creativecommons.org/licenses/by-nc-nd/4.0/). diff --git a/docs/usage/variantcalling/sarek.md b/docs/usage/variantcalling/sarek.md new file mode 100644 index 0000000000..9e071937cd --- /dev/null +++ b/docs/usage/variantcalling/sarek.md @@ -0,0 +1,158 @@ +--- +order: 3 +--- + +# Using Sarek for Variant Calling + +In order to carry out a germline variant calling analysis we will use the nf-core pipeline [sarek](https://nf-co.re/sarek/3.3.2). + +## Overview + +The pipeline is organised following the three main analysis blocks we previously described: pre-processing, variant calling and annotation. + +![sarek_overview](./img/sarek_subway.png) + +In each analysis block, the user can choose among a range of different options in terms of aligners, callers and software to carry out the annotation. +The analysis can also start from different steps, depending on the input available and whether it has been partially processed already. + +## Experimental Design + +In order to choose among the different options Sarek offers, the user should collect a few key elements of the experimental design before beginning the analysis. + +### Library design + +If the experiment used a capture (or targeted) strategy, the user will need to make sure the `bed` file with the target regions is available. +This file will be useful if the user wants to limit variant calling and annotation to those regions. +In this case the file can be passed to the Sarek command line using the `--intervals target.bed` parameter. +Should the sequencing strategy be a _whole exome_ or _panel_, the pipeline can enable specific settings for this library type, using the parameter `--wes`. + +### Reference genome + +nf-core pipelines make use of the Illumina iGenomes collection as [reference genomes](https://nf-co.re/docs/usage/reference_genomes). 
+Before starting the analysis, the user might want to check whether the genome they need is part of this collection. +They also might want to consider downloading the reference locally, when running on premises: this would be useful for multiple runs and to speed up the analysis. In this case the parameter `--igenomes_base` might be used to pass the root directory of the downloaded references. + +One might also need to use custom files: in this case the user might either provide specific parameters on the command line, or create a config file adding a new section to the `genome` object. See [here](https://nf-co.re/docs/usage/reference_genomes#custom-genomes) for more details. + +We will follow this specific approach in this tutorial, since the data we will be using have been simulated on chromosome 21 of the Human GRCh38 reference, and we have prepared fasta, indexes and annotation files containing only this chromosome locally. + +### Input files + +The input data should be provided in a CSV file, according to a format that is largely common for nf-core pipelines. +The format is described in the [sarek usage page](https://nf-co.re/sarek/3.3.2/docs/usage#input-sample-sheet-configurations). + +## GATK Best Practices + +During this tutorial we will use the options Sarek offers to follow the [GATK best practices workflow](https://gatk.broadinstitute.org/hc/en-us/articles/360035535932-Germline-short-variant-discovery-SNPs-Indels-). + +This is solely for educational purposes, since the tutorial dataset includes only 2 samples: while joint-genotyping is a valid choice, the use of soft filtering for such a limited dataset will not offer significant improvements. Additionally, running VQSR on a small dataset will run into issues with some annotations and will require limiting this step to fewer parameters than usual. 
+ +## Running Sarek Germline + +In the following sections we will first prepare our references, then set our computational resources in order to be able to run the pipeline on a Gitpod VM, edit the filtering settings and finally run the pipeline. + +### Reference Genome + +Following the considerations above, we will first of all edit the `nextflow.config` file in our working directory to add a new genome. +It is sufficient to add the following code to the `params` scope in the config. + +```groovy +igenomes_base = '/workspace/gitpod/training/data/refs/' +genomes { + 'GRCh38chr21' { + bwa = "${params.igenomes_base}/sequence/Homo_sapiens_assembly38_chr21.fasta.{amb,ann,bwt,pac,sa}" + dbsnp = "${params.igenomes_base}/annotations/dbsnp_146.hg38_chr21.vcf.gz" + dbsnp_tbi = "${params.igenomes_base}/annotations/dbsnp_146.hg38_chr21.vcf.gz.tbi" + dbsnp_vqsr = '--resource:dbsnp,known=false,training=true,truth=false,prior=2.0 dbsnp_146.hg38_chr21.vcf.gz' + dict = "${params.igenomes_base}/sequence/Homo_sapiens_assembly38_chr21.dict" + fasta = "${params.igenomes_base}/sequence/Homo_sapiens_assembly38_chr21.fasta" + fasta_fai = "${params.igenomes_base}/sequence/Homo_sapiens_assembly38_chr21.fasta.fai" + germline_resource = "${params.igenomes_base}/annotations/gnomAD.r2.1.1.GRCh38.PASS.AC.AF.only_chr21.vcf.gz" + germline_resource_tbi = "${params.igenomes_base}/annotations/gnomAD.r2.1.1.GRCh38.PASS.AC.AF.only_chr21.vcf.gz.tbi" + known_snps = "${params.igenomes_base}/annotations/1000G_phase1.snps.high_confidence.hg38_chr21.vcf.gz" + known_snps_tbi = "${params.igenomes_base}/annotations/1000G_phase1.snps.high_confidence.hg38_chr21.vcf.gz.tbi" + known_snps_vqsr = '--resource:1000G,known=false,training=true,truth=true,prior=10.0 1000G_phase1.snps.high_confidence.hg38_chr21.vcf.gz' + known_indels = "${params.igenomes_base}/annotations/Mills_and_1000G_gold_standard.indels.hg38_chr21.vcf.gz" + known_indels_tbi = 
"${params.igenomes_base}/annotations/Mills_and_1000G_gold_standard.indels.hg38_chr21.vcf.gz.tbi" + known_indels_vqsr = '--resource:mills,known=false,training=true,truth=true,prior=10.0 Mills_and_1000G_gold_standard.indels.hg38_chr21.vcf.gz' + snpeff_db = '105' + snpeff_genome = 'GRCh38' + } +} +``` + +### Computing resources + +Based on the choices we made when starting up the gitpod environment, we recommend to use the following additional parameters. +They can also be added to the parameters directive in the config file we just edited. + +```groovy +params { + use_annotation_cache_keys = true +} + +process { + resourceLimits = [ + cpus: 2, + memory: '6.5GB', + time: '2.h' + ] +} + +``` + +The parameter `use_annotation_cache_keys` allows the annotation software to deal with the local paths when the cache is downloaded on the environment. + +### Filtering parameters + +As we mentioned earlier, we will be using the VQSR filtering tool once the variants have been called. +However, this tool should be used to take advantage of larger amount of variant annotations and improve filtering: when a small tutorial dataset is used, some of the annotations will not have sufficient data or might even have no variance. +In order to account for this, we have to change the filtering options and limit this approach to a subset of variant annotations. + +We can do this by editing the process descriptors for the Sarek modules running VQSR for both single nucleotide variants and insertion/deletions. 
+ +```groovy +process { + withName: 'VARIANTRECALIBRATOR_INDEL' { + ext.prefix = { "${meta.id}_INDEL" } + ext.args = "-an QD -an FS -an SOR -an DP -mode INDEL" + publishDir = [ + enabled: false + ] + } + + withName: 'VARIANTRECALIBRATOR_SNP' { + ext.prefix = { "${meta.id}_SNP" } + ext.args = "-an QD -an MQ -an FS -an SOR -mode SNP" + publishDir = [ + enabled: false + ] + } +} +``` + +### Launching the pipeline + +Now we are ready to launch the pipeline, and we can use the following command line: + +```bash +nextflow run nf-core/sarek -r 3.4.0 \ +--input /workspace/gitpod/training/data/reads/sarek-input.csv \ +--outdir . \ +--tools haplotypecaller,snpeff \ +--genome GRCh38chr21 \ +--joint_germline \ +--intervals /workspace/gitpod/training/exome_target_hg38_chr21.bed \ +--wes +``` + +Notice that we have selected `--joint_germline` to enable the joint-genotyping workflow, we have specified our library strategy is using a capture with `--wes` and we have provided a bed file with the targets with `--intervals`. +The target file in this case refers to the capture intervals on chromosome 21 only, where the data have been simulated. + +The whole pipeline from FASTQ input to annotated VCF should run in about 25 minutes. + +Our final VCF file will be located in + +```bash +./annotation/haplotypecaller/joint_variant_calling +``` diff --git a/docs/usage/variantcalling/theory.md b/docs/usage/variantcalling/theory.md new file mode 100644 index 0000000000..e8e8a14b80 --- /dev/null +++ b/docs/usage/variantcalling/theory.md @@ -0,0 +1,140 @@ +--- +order: 1 +--- + +# Calling Variants on Sequencing Data + +Before we dive into one of the nf-core pipelines used for variant calling, it's worth looking at some theoretical aspects of variant calling. + +## Overview + +The term "variant calling" is rooted in the history of DNA sequencing, and it indicates an approach where we identify (i.e. call) positions in a genome (loci) which are variable in a population (genetic variants). 
The specific genotype of an individual at that variant locus is then assigned. + +There are many different approaches for calling variants from sequencing data: here, we will look more specifically at a reference-based variant calling approach, i.e. where a reference genome is needed and variant sites are identified by comparing the reads to this reference. + +Over the years, also thanks to the work carried out by the [GATK team](https://gatk.broadinstitute.org/hc/en-us) at the Broad Institute, there has been a convergence on a "best practices" workflow, which is summarised in the diagram below: + +![overview](./img/overview.excalidraw.svg) + +In this scheme we can identify a few key phases in the workflow. Pre-processing is the first part, where raw data are handled and mapped to a genome reference, to be then transformed in order to increase the accuracy of the following analyses. Then, variant calling is carried out. This is followed by filtering and annotation. +Here we will briefly discuss these key steps, which might vary depending on the specific type of data one is performing variant calling on. + +## Alignment + +The alignment step is where reads obtained from genome fragments of a sample are identified as originating from a specific location in the genome. +This step is essential in a reference-based workflow, because it is the comparison of the raw data with the reference that tells us whether a position in the genome might be variable or not. + +Mismatches, insertions and deletions (INDELs) as well as duplicated regions sometimes make this step challenging: this is the reason why an appropriate aligner has to be chosen, depending on the sequencing application and data type. + +Once each raw read has been aligned to the region of the genome it is most likely originating from, the sequence of all reads overlapping each locus can be used to identify potentially variable sites. 
Each read will support the presence of an allele identical to the reference, or a different one (alternative allele), and the variant calling algorithm will measure the weighted support for each allele. + +However, the support given by the raw data to alternative variants might be biased. For this reason, one can apply certain corrections to the data to ensure the support for the alleles is assessed correctly. This is done by performing the two steps described below: marking duplicates, and recalibrating base quality scores. + +## Marking Duplicates + +Duplicates are non-independent measurements of a sequence fragment. + +Since DNA fragmentation is theoretically random, reads originating from different fragments provide independent information. An algorithm can use this information to assess the support for different alleles. +However, when these measurements are not independent, the raw data might provide biased support for a specific allele. + +Duplicates can be caused by PCR during library preparation (library duplicates) or might occur during sequencing, when the instrument is reading the signal from different clusters (as in Illumina short read sequencing). The latter are called "optical duplicates". + +A specific step called "marking duplicates" identifies these identical pairs using their orientation and 5' position (before any clipping), which will be assumed to be coming from the same input DNA template: one representative pair is then chosen based on quality scores and other criteria, while the other ones are marked. +Marked reads are then ignored in the following steps. + +## Base Quality Score Recalibration + +Among the parameters used by a variant calling algorithm to weigh the support for different alleles, the quality score of the base in the read at the variant locus is quite important. +Sequencing instruments, however, can make systematic errors when reading the signal at each cycle, and cannot account for errors originating in PCR. 
+ +Once a read has been aligned to the reference, however, an appropriate algorithm can compare the error rate estimated from the existing base quality scores with the actual differences observed with the reference sequence (empirical quality), and perform appropriate corrections. +This process is called "base quality score recalibration" (BQSR). + +To calculate empirical qualities, the algorithm simply counts the number of mismatches in the observed bases. Any mismatch which does not overlap a known variant is considered an error. The empirical error rate is simply the ratio between counted errors and the total observed bases. +A Yates correction is applied to this, to avoid either dividing by 0 or dealing with small counts. + +$$ +e_{empirical} = \frac{n_{mismatches} + 1}{n_{bases} +2} +$$ + +The empirical error is expressed as a Quality in Phred-scale: + +$$ +Q_{empirical} = -10 \times log_{10}(e_{empirical}) +$$ + +Let's use a simple example like the one in the diagram below, where for illustrative purposes we only consider the bases belonging to the same read. + +![bqsr](./img/bqsr.excalidraw.svg) + +In this example we have 3 mismatches, but one is a reported variant site: we therefore only count 2 errors, over 10 observed bases. According to the approach we just explained, + +$$ +Q_{empirical} = -10 \times log_{10}(\frac{2 + 1}{10 +2}) = 6.02 +$$ + +To calculate the average reported Q score, we should average the error probabilities and then convert the result back into Phred scale: + +$$ +Q_{average} = -10 \times log_{10}(\frac {0.1 + 0.1 + 0.01 + 0.1 + 0.01 + 0.01 + 0.01 + 0.1 + 0.1 + 0.1}{10}) = 11.94 +$$ + +Our empirical Q score would be 6.02, the average reported Q score is 11.94, and therefore the $\Delta = 11.94 - 6.02 = 5.92$ + +The recalibrated Q score of each base would correspond to the reported Q score minus this $\Delta$. 
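The single-read example above can be reproduced by evaluating the formulas directly. The following is a minimal Python sketch, not GATK's implementation: the helper names are hypothetical, and the reported qualities (six Q10 and four Q20 bases) are inferred from the error probabilities used in the average.

```python
import math


def phred(error_probability: float) -> float:
    """Convert an error probability into a Phred-scaled quality."""
    return -10 * math.log10(error_probability)


def empirical_quality(n_mismatches: int, n_bases: int) -> float:
    """Empirical quality with the +1 / +2 correction described in the text."""
    return phred((n_mismatches + 1) / (n_bases + 2))


def average_reported_quality(reported_q: list[float]) -> float:
    """Average the error probabilities, then convert back to Phred scale."""
    probabilities = [10 ** (-q / 10) for q in reported_q]
    return phred(sum(probabilities) / len(probabilities))


# Assumed reported qualities of the ten bases (0.1 -> Q10, 0.01 -> Q20)
reported = [10, 10, 20, 10, 20, 20, 20, 10, 10, 10]

# 3 mismatches, one of them at a known variant site: 2 counted errors
q_emp = empirical_quality(n_mismatches=2, n_bases=10)
q_avg = average_reported_quality(reported)
delta = q_avg - q_emp

# Each base is shifted by the same delta in this single-bin example
recalibrated = [q - delta for q in reported]

print(f"Q_empirical = {q_emp:.2f}")
print(f"Q_average   = {q_avg:.2f}")
print(f"delta       = {delta:.2f}")
print("recalibrated:", [round(q, 2) for q in recalibrated])
```

Because the example uses a single bin, every base receives the same shift; with several bins, each base would be corrected by the deltas of all the bins it belongs to.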
+ +In a real sequencing dataset, this calculation is performed for different groups (bins) of bases: those in the same lane, those with the same original quality score, per machine cycle, per sequencing context. +In each bin, the difference ($\Delta$) between the average reported quality and the empirical quality is calculated. +The recalibrated score would then be the reported score minus the sum of all deltas calculated in each bin the base belongs to. + +A detailed summary of this approach can be found on the [GATK BQSR page](https://gatk.broadinstitute.org/hc/en-us/articles/360035890531-Base-Quality-Score-Recalibration-BQSR-). We also found this [step by step guide](https://rstudio-pubs-static.s3.amazonaws.com/64456_4778547202f24f32b0edc325e96b061a.html) through the mathematical approach quite useful. Full details are explained in the [publication](https://www.nature.com/articles/ng.806) that first proposed this method. + +## Calling Variants + +Once we have prepared the data for an accurate identification of the variants, we are ready to perform the next steps. +The most important innovation introduced some years ago in this part of the workflow has been to separate the identification of a variant site (i.e. variant calling itself) from the assignment of the genotype to each individual. +This approach makes the computation more tractable, especially for large sample cohorts: BAM files are only accessed per-sample in the first step, while multi-sample cohort data are used together in the second step in order to increase the accuracy of genotype assignment. + +### Identifying Variants + +In this phase, which is performed on each sample independently, a first step uses a sliding window to count differences compared to the reference (i.e. mismatches, INDELs) and potentially variable regions are identified. GATK calls these "active regions". 
+Then, a local graph assembly of the reads is created to identify plausible haplotypes, which are aligned to the reference with a traditional alignment algorithm called "Smith-Waterman": this is used to identify variants. +For each read in an active region, the support for each of the haplotypes is counted and a likelihood score for each combination of read/haplotype is calculated. +The likelihoods at this step make it possible to calculate the support for each of the alleles in a variant site, and read-haplotype likelihoods are a key input for the Bayesian statistics used to determine the most likely genotype. +This first genotype assignment could be sufficient if one analysed a single sample only. + +### Assigning Genotypes + +When multiple samples are analysed, information from each of them could collectively improve the genotype assignment. +This is because the magnitude of potential biases (for example, strand bias) can be better estimated, and because the distributions of those annotations used to inform the genotype assignment become more stable when more data are available, by combining multiple samples. +The use of a larger cohort also increases the sensitivity. + +This is possible if the variant calling step is run by producing a variation of the VCF file format called GVCF: this format includes, in addition to variant sites, also non-variant intervals in the genome of each sample. Moreover, it reports the likelihoods of a non-reference symbolic allele at these non-variant intervals. +This information makes it possible to re-genotype each sample by using data from the whole cohort. + +You can read more on the GATK website about the [logic of joint calling](https://gatk.broadinstitute.org/hc/en-us/articles/360035890431-The-logic-of-joint-calling-for-germline-short-variants). + +### Filtering Variants + +There are several ways to spot potential false positives through filtering. 
+ +_Hard filtering_ uses pre-defined thresholds of different variant annotations (allele-depth, mapping quality and many others) in order to flag variants passing all these criteria, and those failing to meet any of them. This approach is mostly useful when calling a few samples and not enough data are available for more sophisticated solutions. + +_Soft filtering_ infers the thresholds to be applied from the data themselves. This approach uses the distributions of the annotations, and their overlap with known and validated variants: it defines those combinations of annotations which are more likely to describe true positives (the variants they refer to in the analysis cohort overlap with those validated in other databases). This approach is used by a GATK tool called Variant Quality Score Recalibration (VQSR). + +More details can be found on the [GATK VQSR page](https://gatk.broadinstitute.org/hc/en-us/articles/360035531612-Variant-Quality-Score-Recalibration-VQSR-). + +More recently, pre-trained deep learning models have also become available to filter variants, based on neural network architectures trained on a large number of variants from population databases. + +## Annotation + +Once the analysis has produced a final VCF file, the final step which is necessary to interpret the results is called "annotation". +This step uses different databases to describe (annotate) each variant from a genomic, biological, or population point of view. +The software used to carry out this task will add information to the VCF file such as: + +- the gene each variant overlaps with +- the transcript the variant overlaps with +- the potential biological consequence on each of those transcripts +- population frequency (minor allele frequency, described in different databases such as gnomAD) + +as well as several other items we can use to interpret our findings from a biological or clinical point of view. 
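To make the annotation items listed above concrete, here is a minimal Python sketch that reads them back from a snpEff-style `ANN` entry in the INFO column. The function name `parse_ann` and the lower-case key names are hypothetical; the field order follows the standard `ANN` layout, and the example string is an abbreviated INFO field carrying a COL6A1 stop-gained annotation.

```python
# Subfields of one ANN entry, in the order they appear between '|' separators
ANN_FIELDS = [
    "allele", "annotation", "impact", "gene_name", "gene_id",
    "feature_type", "feature_id", "transcript_biotype", "rank",
    "hgvs_c", "hgvs_p", "cdna_pos", "cds_pos", "aa_pos",
    "distance", "errors",
]


def parse_ann(info: str) -> list[dict[str, str]]:
    """Return one dictionary per annotation found in the ANN key of an INFO string."""
    for item in info.split(";"):
        if item.startswith("ANN="):
            # Annotations for different transcripts are separated by commas
            return [
                dict(zip(ANN_FIELDS, entry.split("|")))
                for entry in item[len("ANN="):].split(",")
            ]
    return []


# Abbreviated INFO field: most keys other than ANN and LOF are omitted
info = (
    "AC=1;AF=0.25;"
    "ANN=T|stop_gained|HIGH|COL6A1|ENSG00000142156|transcript|"
    "ENST00000361866.8|protein_coding|9/35|c.811C>T|p.Arg271*|"
    "892/4203|811/3087|271/1028||;"
    "LOF=(COL6A1|ENSG00000142156|1|1.00)"
)

annotations = parse_ann(info)
for ann in annotations:
    print(ann["gene_name"], ann["feature_id"], ann["annotation"], ann["impact"], ann["hgvs_p"])
```

Each dictionary exposes the gene, the transcript and the predicted consequence of one annotation; population frequencies, when present, are stored under separate INFO keys and are not parsed here.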
diff --git a/main.nf b/main.nf index 92bcc47b2b..effa97ef16 100755 --- a/main.nf +++ b/main.nf @@ -56,7 +56,6 @@ params.pon = getGenomeAttribute('pon') params.pon_tbi = getGenomeAttribute('pon_tbi') params.sentieon_dnascope_model = getGenomeAttribute('sentieon_dnascope_model') params.snpeff_db = getGenomeAttribute('snpeff_db') -params.snpeff_genome = getGenomeAttribute('snpeff_genome') params.vep_cache_version = getGenomeAttribute('vep_cache_version') params.vep_genome = getGenomeAttribute('vep_genome') params.vep_species = getGenomeAttribute('vep_species') @@ -177,7 +176,7 @@ workflow NFCORE_SAREK { rt_file = PREPARE_GENOME.out.rt_file // Tabix indexed vcf files - bcftools_annotations_tbi = params.bcftools_annotations ? params.bcftools_annotations_tbi ? Channel.fromPath(params.bcftools_annotations_tbi).collect() : PREPARE_GENOME.out.bcftools_annotations_tbi : Channel.empty([]) + bcftools_annotations_tbi = params.bcftools_annotations ? params.bcftools_annotations_tbi ? Channel.fromPath(params.bcftools_annotations_tbi).collect() : PREPARE_GENOME.out.bcftools_annotations_tbi : Channel.value([]) dbsnp_tbi = params.dbsnp ? params.dbsnp_tbi ? Channel.fromPath(params.dbsnp_tbi).collect() : PREPARE_GENOME.out.dbsnp_tbi : Channel.value([]) germline_resource_tbi = params.germline_resource ? params.germline_resource_tbi ? Channel.fromPath(params.germline_resource_tbi).collect() : PREPARE_GENOME.out.germline_resource_tbi : [] //do not change to Channel.value([]), the check for its existence then fails for Getpileupsumamries known_indels_tbi = params.known_indels ? params.known_indels_tbi ? 
Channel.fromPath(params.known_indels_tbi).collect() : PREPARE_GENOME.out.known_indels_tbi : Channel.value([]) @@ -235,7 +234,7 @@ workflow NFCORE_SAREK { if (params.download_cache) { // Assuming that even if the cache is provided, if the user specify download_cache, sarek will download the cache ensemblvep_info = Channel.of([ [ id:"${params.vep_cache_version}_${params.vep_genome}" ], params.vep_genome, params.vep_species, params.vep_cache_version ]) - snpeff_info = Channel.of([ [ id:"${params.snpeff_genome}.${params.snpeff_db}" ], params.snpeff_genome, params.snpeff_db ]) + snpeff_info = Channel.of([ [ id:"${params.snpeff_db}" ], params.snpeff_db ]) DOWNLOAD_CACHE_SNPEFF_VEP(ensemblvep_info, snpeff_info) snpeff_cache = DOWNLOAD_CACHE_SNPEFF_VEP.out.snpeff_cache vep_cache = DOWNLOAD_CACHE_SNPEFF_VEP.out.ensemblvep_cache.map{ meta, cache -> [ cache ] } @@ -246,7 +245,6 @@ workflow NFCORE_SAREK { ANNOTATION_CACHE_INITIALISATION( (params.snpeff_cache && params.tools && (params.tools.split(',').contains("snpeff") || params.tools.split(',').contains('merge'))), params.snpeff_cache, - params.snpeff_genome, params.snpeff_db, (params.vep_cache && params.tools && (params.tools.split(',').contains("vep") || params.tools.split(',').contains('merge'))), params.vep_cache, @@ -309,7 +307,6 @@ workflow NFCORE_SAREK { vep_genome, vep_species ) - emit: multiqc_report = SAREK.out.multiqc_report // channel: /path/to/multiqc_report.html } @@ -322,13 +319,11 @@ workflow NFCORE_SAREK { workflow { main: - // // SUBWORKFLOW: Run initialisation tasks // PIPELINE_INITIALISATION( params.version, - params.help, params.validate_params, params.monochrome_logs, args, diff --git a/modules.json b/modules.json index 26d801647b..ad4fd57616 100644 --- a/modules.json +++ b/modules.json @@ -7,496 +7,507 @@ "nf-core": { "ascat": { "branch": "master", - "git_sha": "3f5420aa22e00bd030a2556dfdffc9e164ec0ec5", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, 
"bcftools/annotate": { "branch": "master", - "git_sha": "a5ba4d59c2b248c0379b0f8aeb4e7e754566cd1f", + "git_sha": "cb08035150685b11d890d90c9534d4f16869eaec", "installed_by": ["modules"], "patch": "modules/nf-core/bcftools/annotate/bcftools-annotate.diff" }, "bcftools/concat": { "branch": "master", - "git_sha": "a5ba4d59c2b248c0379b0f8aeb4e7e754566cd1f", + "git_sha": "d1e0ec7670fa77905a378627232566ce54c3c26d", "installed_by": ["modules"] }, "bcftools/mpileup": { "branch": "master", - "git_sha": "a5ba4d59c2b248c0379b0f8aeb4e7e754566cd1f", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["bam_ngscheckmate"] }, "bcftools/sort": { "branch": "master", - "git_sha": "a5ba4d59c2b248c0379b0f8aeb4e7e754566cd1f", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "bcftools/stats": { "branch": "master", - "git_sha": "a5ba4d59c2b248c0379b0f8aeb4e7e754566cd1f", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "bwa/index": { "branch": "master", - "git_sha": "086fa66260595e123b0ea47a6512539b72a9afa3", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "bwa/mem": { "branch": "master", - "git_sha": "0c34b8159f62cde451c4ff249629c9d0a4f3f9c3", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "bwamem2/index": { "branch": "master", - "git_sha": "7081e04c18de9480948d34513a1c1e2d0fa9126d", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "bwamem2/mem": { "branch": "master", - "git_sha": "3afb95b2e15fc4a2347470255a7ef654f650c8ec", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "cat/cat": { "branch": "master", - "git_sha": "9437e6053dccf4aafa022bfd6e7e9de67e625af8", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "cat/fastq": { "branch": "master", - "git_sha": 
"4fc983ad0b30e6e32696fa7d980c76c7bfe1c03e", + "git_sha": "a1abf90966a2a4016d3c3e41e228bfcbd4811ccc", "installed_by": ["modules"] }, "cnvkit/antitarget": { "branch": "master", - "git_sha": "7d8eff8f0cbc20cb83ce624e86c58ede51397054", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "cnvkit/batch": { "branch": "master", - "git_sha": "f53b071421340e6fac0806c86ba030e578e94826", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "cnvkit/call": { "branch": "master", - "git_sha": "a64788f5ad388f1d2ac5bd5f1f3f8fc81476148c", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "cnvkit/export": { "branch": "master", - "git_sha": "a64788f5ad388f1d2ac5bd5f1f3f8fc81476148c", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "cnvkit/genemetrics": { "branch": "master", - "git_sha": "a64788f5ad388f1d2ac5bd5f1f3f8fc81476148c", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "cnvkit/reference": { "branch": "master", - "git_sha": "f8693ff46b884892982d658271ed260380111c53", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "controlfreec/assesssignificance": { "branch": "master", - "git_sha": "e6c5689c1d4c7f255a7cc042b0a2fa25a9b3c4fa", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"], "patch": "modules/nf-core/controlfreec/assesssignificance/controlfreec-assesssignificance.diff" }, "controlfreec/freec": { "branch": "master", - "git_sha": "7b5827ac89358ad6dd3e8f328f6d1427d7f14a68", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "controlfreec/freec2bed": { "branch": "master", - "git_sha": "0c7fd5488d43188ee801c800461d259389d34c19", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "controlfreec/freec2circos": { "branch": "master", - "git_sha": 
"b626cd7bf99db4f42de314ee8b70d1c389a7b9f4", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "controlfreec/makegraph2": { "branch": "master", - "git_sha": "c3f338377c177a01847eeea2f77da33ce89f92e6", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, - "deepvariant": { + "deepvariant/rundeepvariant": { "branch": "master", - "git_sha": "199ba086a259e1933d6e0ab7596e4a977bbd483a", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "dragmap/align": { "branch": "master", - "git_sha": "dd2757cc22c5de8943fa38ba7cd6f8cc1eb65ac1", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"], "patch": "modules/nf-core/dragmap/align/dragmap-align.diff" }, "dragmap/hashtable": { "branch": "master", - "git_sha": "ae9e01cb5e77faada314047e78423b22b4f5bbc5", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"], "patch": "modules/nf-core/dragmap/hashtable/dragmap-hashtable.diff" }, "ensemblvep/download": { "branch": "master", - "git_sha": "3db4f8488315cd7d7cf3fcb64251f6603210e831", + "git_sha": "6e3585d9ad20b41adc7d271009f8cb5e191ecab4", "installed_by": ["modules"] }, "ensemblvep/vep": { "branch": "master", - "git_sha": "b42fec6f7c6e5d0716685cabb825ef6bf6e386b5", + "git_sha": "6e3585d9ad20b41adc7d271009f8cb5e191ecab4", "installed_by": ["modules", "vcf_annotate_ensemblvep"] }, "fastp": { "branch": "master", - "git_sha": "95cf5fe0194c7bf5cb0e3027a2eb7e7c89385080", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "fastqc": { "branch": "master", - "git_sha": "285a50500f9e02578d90b3ce6382ea3c30216acd", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "fgbio/callmolecularconsensusreads": { "branch": "master", - "git_sha": "3f5420aa22e00bd030a2556dfdffc9e164ec0ec5", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": 
["modules"] }, "fgbio/fastqtobam": { "branch": "master", - "git_sha": "19f81cab3b2a08f37c4f3727ddb30c01ebf07be6", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "fgbio/groupreadsbyumi": { "branch": "master", - "git_sha": "3f5420aa22e00bd030a2556dfdffc9e164ec0ec5", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "freebayes": { "branch": "master", - "git_sha": "77978839bef6d437f21edb900b49bcbc04f9f735", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/applybqsr": { "branch": "master", - "git_sha": "af273ea6618c50e82c372abe18b0a225e84fe6f7", + "git_sha": "6b3bf38285d94cc1ea3cd9fa93310d54b04c3819", "installed_by": ["modules"] }, "gatk4/applyvqsr": { "branch": "master", - "git_sha": "cee8fe33d3ef1a220dee67dac75a32f7c872f63f", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/baserecalibrator": { "branch": "master", - "git_sha": "8a223e11d4e6deb36484e01891eae9c1cacb5f5d", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/calculatecontamination": { "branch": "master", - "git_sha": "77ffba959bbe8b6e1d95d47688075d113e24f0d4", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/cnnscorevariants": { "branch": "master", - "git_sha": "60a7dbae179bcfa24c10294cc9a07423a239c19a", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/createsequencedictionary": { "branch": "master", - "git_sha": "e6fe277739f5894711405af3e717b2470bd956b5", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/estimatelibrarycomplexity": { "branch": "master", - "git_sha": "1943aa60f7490c3d6740e8872e6e69122ccc8087", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/filtermutectcalls": { "branch": "master", - "git_sha": 
"7d814815f638e1483995b24a23f5f23229036bbf", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/filtervarianttranches": { "branch": "master", - "git_sha": "d742e3143f2ccb8853c29b35cfcf50b5e5026980", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/gatherbqsrreports": { "branch": "master", - "git_sha": "d742e3143f2ccb8853c29b35cfcf50b5e5026980", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/gatherpileupsummaries": { "branch": "master", - "git_sha": "d742e3143f2ccb8853c29b35cfcf50b5e5026980", + "git_sha": "679f45cae4f603f12d7c38c042afee11150574a0", "installed_by": ["modules"] }, "gatk4/genomicsdbimport": { "branch": "master", - "git_sha": "5caf7640a9ef1d18d765d55339be751bb0969dfa", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/genotypegvcfs": { "branch": "master", - "git_sha": "d742e3143f2ccb8853c29b35cfcf50b5e5026980", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/getpileupsummaries": { "branch": "master", - "git_sha": "b632dcbf8bd3d7b9cb22fd0b2416e9e6cb8f4045", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/haplotypecaller": { "branch": "master", - "git_sha": "d742e3143f2ccb8853c29b35cfcf50b5e5026980", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/intervallisttobed": { "branch": "master", - "git_sha": "d742e3143f2ccb8853c29b35cfcf50b5e5026980", - "installed_by": ["modules"] + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"], + "patch": "modules/nf-core/gatk4/intervallisttobed/gatk4-intervallisttobed.diff" }, "gatk4/learnreadorientationmodel": { "branch": "master", - "git_sha": "d742e3143f2ccb8853c29b35cfcf50b5e5026980", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] 
}, "gatk4/markduplicates": { "branch": "master", - "git_sha": "194fca815cf594646e638fa5476acbcc296f1850", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/mergemutectstats": { "branch": "master", - "git_sha": "cafe91148ca110e52ceaa07f3e373b882800d04b", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/mergevcfs": { "branch": "master", - "git_sha": "194fca815cf594646e638fa5476acbcc296f1850", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/mutect2": { "branch": "master", - "git_sha": "5fd04feb37b58caa6a54d41e38c80066bdf71056", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/variantrecalibrator": { "branch": "master", - "git_sha": "d742e3143f2ccb8853c29b35cfcf50b5e5026980", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4spark/applybqsr": { "branch": "master", - "git_sha": "3b928d02096f928ef224d89f2a502afaa6e06556", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4spark/baserecalibrator": { "branch": "master", - "git_sha": "d742e3143f2ccb8853c29b35cfcf50b5e5026980", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4spark/markduplicates": { "branch": "master", - "git_sha": "3b928d02096f928ef224d89f2a502afaa6e06556", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gawk": { "branch": "master", - "git_sha": "cf3ed075695639b0a0924eb0901146df1996dc08", + "git_sha": "97321eded31a12598837a476d3615300af413bb7", + "installed_by": ["modules"] + }, + "goleft/indexcov": { + "branch": "master", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"] + }, + "lofreq/callparallel": { + "branch": "master", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, 
"manta/germline": { "branch": "master", - "git_sha": "ebc1733b77c702f19fe42076a5edfcbaa0d84f66", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "manta/somatic": { "branch": "master", - "git_sha": "ab693fbb906b3a1151ad21e270129a9d48437ab6", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "manta/tumoronly": { "branch": "master", - "git_sha": "8731a6221dd10fd9039e18518b390b43e14ef9ae", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "mosdepth": { "branch": "master", - "git_sha": "e0616fba0919adb190bfe070d17fb12d76ba3a26", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "msisensorpro/msisomatic": { "branch": "master", - "git_sha": "3f5420aa22e00bd030a2556dfdffc9e164ec0ec5", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "msisensorpro/scan": { "branch": "master", - "git_sha": "3f5420aa22e00bd030a2556dfdffc9e164ec0ec5", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "multiqc": { "branch": "master", - "git_sha": "b7ebe95761cd389603f9cc0e0dc384c0f663815a", + "git_sha": "cf17ca47590cc578dfb47db1c2a44ef86f89976d", "installed_by": ["modules"] }, "ngscheckmate/ncm": { "branch": "master", - "git_sha": "0e04b949c90e686c8b07495576832d78ab9210cf", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["bam_ngscheckmate"] }, "samblaster": { "branch": "master", - "git_sha": "310850152f3e1dec6ba28b28e1f1cb9ab8660a49", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "samtools/bam2fq": { "branch": "master", - "git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519", + "git_sha": "b13f07be4c508d6ff6312d354d09f2493243e208", "installed_by": ["modules"] }, "samtools/collatefastq": { "branch": "master", - "git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519", + "git_sha": 
"b13f07be4c508d6ff6312d354d09f2493243e208", "installed_by": ["modules"] }, "samtools/convert": { "branch": "master", - "git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519", + "git_sha": "b13f07be4c508d6ff6312d354d09f2493243e208", "installed_by": ["modules"] }, "samtools/faidx": { "branch": "master", - "git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519", + "git_sha": "b13f07be4c508d6ff6312d354d09f2493243e208", "installed_by": ["modules"] }, "samtools/index": { "branch": "master", - "git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519", + "git_sha": "b13f07be4c508d6ff6312d354d09f2493243e208", "installed_by": ["modules"] }, "samtools/merge": { "branch": "master", - "git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519", + "git_sha": "b13f07be4c508d6ff6312d354d09f2493243e208", "installed_by": ["modules"] }, "samtools/mpileup": { "branch": "master", - "git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519", + "git_sha": "13e7d1046922381df90cd8fe9bee8c3e57ae8457", "installed_by": ["modules"] }, "samtools/stats": { "branch": "master", - "git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519", + "git_sha": "b13f07be4c508d6ff6312d354d09f2493243e208", "installed_by": ["modules"] }, "samtools/view": { "branch": "master", - "git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519", + "git_sha": "b13f07be4c508d6ff6312d354d09f2493243e208", "installed_by": ["modules"] }, "sentieon/applyvarcal": { "branch": "master", - "git_sha": "e809c6b078d5343bdf8b5b2b78483096a2b5a973", + "git_sha": "eb7b70119bfb1877334c996d13e520c61b21067d", "installed_by": ["modules"] }, "sentieon/bwamem": { "branch": "master", - "git_sha": "e809c6b078d5343bdf8b5b2b78483096a2b5a973", + "git_sha": "eb7b70119bfb1877334c996d13e520c61b21067d", "installed_by": ["modules"] }, "sentieon/dedup": { "branch": "master", - "git_sha": "e809c6b078d5343bdf8b5b2b78483096a2b5a973", + "git_sha": "eb7b70119bfb1877334c996d13e520c61b21067d", "installed_by": ["modules"] }, "sentieon/dnamodelapply": { "branch": "master", - "git_sha": 
"e809c6b078d5343bdf8b5b2b78483096a2b5a973", + "git_sha": "eb7b70119bfb1877334c996d13e520c61b21067d", "installed_by": ["modules"] }, "sentieon/dnascope": { "branch": "master", - "git_sha": "e809c6b078d5343bdf8b5b2b78483096a2b5a973", + "git_sha": "eb7b70119bfb1877334c996d13e520c61b21067d", "installed_by": ["modules"] }, "sentieon/gvcftyper": { "branch": "master", - "git_sha": "e809c6b078d5343bdf8b5b2b78483096a2b5a973", + "git_sha": "eb7b70119bfb1877334c996d13e520c61b21067d", "installed_by": ["modules"] }, "sentieon/haplotyper": { "branch": "master", - "git_sha": "e809c6b078d5343bdf8b5b2b78483096a2b5a973", + "git_sha": "eb7b70119bfb1877334c996d13e520c61b21067d", "installed_by": ["modules"] }, "sentieon/varcal": { "branch": "master", - "git_sha": "e809c6b078d5343bdf8b5b2b78483096a2b5a973", + "git_sha": "eb7b70119bfb1877334c996d13e520c61b21067d", "installed_by": ["modules"] }, "snpeff/download": { "branch": "master", - "git_sha": "214d575774c172062924ad3564b4f66655600730", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "snpeff/snpeff": { "branch": "master", - "git_sha": "2f3db6f45147ebbb56b371536e31bdf622b5bfee", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules", "vcf_annotate_snpeff"] }, "spring/decompress": { "branch": "master", - "git_sha": "0a92fa8d17d9e3c411e01a0ce41a86eff02b1599", + "git_sha": "d7462e71f9129083ce10c3fe953ed401781e0ebd", "installed_by": ["modules"] }, "strelka/germline": { "branch": "master", - "git_sha": "e8f2c77a6e4174ee0a48d073d4cc8ff06c44bb4c", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "strelka/somatic": { "branch": "master", - "git_sha": "a626d7c63cb0ee675686a2f47b26cdc53266e186", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "svdb/merge": { "branch": "master", - "git_sha": "ba3f3df395d2719dcef5c67189042a1dc555c701", + "git_sha": "eb2c3f7ee2c938ab1a49764bdb1319adaa35492c", 
"installed_by": ["modules"] }, "tabix/bgziptabix": { "branch": "master", - "git_sha": "09d3c8c29b31a2dfd610305b10550f0e1dbcd4a9", + "git_sha": "f448e846bdadd80fc8be31fbbc78d9f5b5131a45", "installed_by": ["modules", "vcf_annotate_snpeff"] }, "tabix/tabix": { "branch": "master", - "git_sha": "9502adb23c0b97ed8e616bbbdfa73b4585aec9a1", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules", "vcf_annotate_ensemblvep"] }, "tiddit/sv": { "branch": "master", - "git_sha": "6af4979ee1a57c986102175d9e1bb7ab834f3ae8", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "untar": { "branch": "master", - "git_sha": "5caf7640a9ef1d18d765d55339be751bb0969dfa", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "unzip": { "branch": "master", - "git_sha": "3f5420aa22e00bd030a2556dfdffc9e164ec0ec5", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "vcftools": { "branch": "master", - "git_sha": "624ecdc43b72e0a45bf05d9b57215d18dcd538f8", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] } } @@ -505,22 +516,22 @@ "nf-core": { "bam_ngscheckmate": { "branch": "master", - "git_sha": "cfd937a668919d948f6fcbf4218e79de50c2f36f", + "git_sha": "c60c14b285b89bdd0607e371417dadb80385ad6e", "installed_by": ["subworkflows"] }, "utils_nextflow_pipeline": { "branch": "master", - "git_sha": "5caf7640a9ef1d18d765d55339be751bb0969dfa", + "git_sha": "3aa0aec1d52d492fe241919f0c6100ebf0074082", "installed_by": ["subworkflows"] }, "utils_nfcore_pipeline": { "branch": "master", - "git_sha": "92de218a329bfc9a9033116eb5f65fd270e72ba3", + "git_sha": "1b6b9a3338d011367137808b49b923515080e3ba", "installed_by": ["subworkflows"] }, - "utils_nfvalidation_plugin": { + "utils_nfschema_plugin": { "branch": "master", - "git_sha": "5caf7640a9ef1d18d765d55339be751bb0969dfa", + "git_sha": "2fd2cd6d0e7b273747f32e465fdc6bcc3ae0814e", "installed_by": 
["subworkflows"]
           },
           "vcf_annotate_ensemblvep": {
diff --git a/modules/local/add_info_to_vcf/environment.yml b/modules/local/add_info_to_vcf/environment.yml
index 34513c7f4a..315f6dc67e 100644
--- a/modules/local/add_info_to_vcf/environment.yml
+++ b/modules/local/add_info_to_vcf/environment.yml
@@ -1,7 +1,5 @@
-name: gawk
 channels:
   - conda-forge
   - bioconda
-  - defaults
 dependencies:
-  - anaconda::gawk=5.1.0
+  - conda-forge::gawk=5.3.0
diff --git a/modules/local/create_intervals_bed/environment.yml b/modules/local/create_intervals_bed/environment.yml
index 34513c7f4a..315f6dc67e 100644
--- a/modules/local/create_intervals_bed/environment.yml
+++ b/modules/local/create_intervals_bed/environment.yml
@@ -1,7 +1,5 @@
-name: gawk
 channels:
   - conda-forge
   - bioconda
-  - defaults
 dependencies:
-  - anaconda::gawk=5.1.0
+  - conda-forge::gawk=5.3.0
diff --git a/modules/local/create_intervals_bed/main.nf b/modules/local/create_intervals_bed/main.nf
index 6a3c9c5a47..ad42e6ad53 100644
--- a/modules/local/create_intervals_bed/main.nf
+++ b/modules/local/create_intervals_bed/main.nf
@@ -72,4 +72,18 @@ process CREATE_INTERVALS_BED {
         END_VERSIONS
         """
     }
+
+    stub:
+    def prefix = task.ext.prefix ?: "${intervals.baseName}"
+    def metrics = task.ext.metrics ?: "${prefix}.metrics"
+    // def prefix_basename = prefix.substring(0, prefix.lastIndexOf("."))
+
+    """
+    touch ${prefix}.stub.bed
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        gawk: \$(awk -Wversion | sed '1!d; s/.*Awk //; s/,.*//')
+    END_VERSIONS
+    """
 }
diff --git a/modules/local/samtools/reindex_bam/environment.yml b/modules/local/samtools/reindex_bam/environment.yml
new file mode 100644
index 0000000000..da2df5e43a
--- /dev/null
+++ b/modules/local/samtools/reindex_bam/environment.yml
@@ -0,0 +1,6 @@
+channels:
+  - conda-forge
+  - bioconda
+dependencies:
+  - bioconda::samtools=1.20
+  - bioconda::htslib=1.20
diff --git a/modules/local/samtools/reindex_bam/main.nf b/modules/local/samtools/reindex_bam/main.nf
new file mode 100644
index 0000000000..153f9093d6
--- /dev/null
+++ b/modules/local/samtools/reindex_bam/main.nf
@@ -0,0 +1,57 @@
+/**
+ * The aim of this process is to re-index the bam file without the duplicate, supplementary, unmapped etc, for goleft/indexcov
+ * It creates a BAM containing only a header (so indexcov can get the sample name) and a BAM index were low quality reads, supplementary etc, have been removed
+ */
+process SAMTOOLS_REINDEX_BAM {
+    tag "$meta.id"
+    label 'process_low'
+
+    conda "${moduleDir}/environment.yml"
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/samtools:1.20--h50ea8bc_0' :
+        'biocontainers/samtools:1.20--h50ea8bc_0' }"
+
+    input:
+    tuple val(meta), path(input), path(input_index)
+    tuple val(meta2), path(fasta)
+    tuple val(meta3), path(fai)
+
+    output:
+    tuple val(meta), path("${meta.id}.reindex.bam"), path("${meta.id}.reindex.bam.bai"),emit: output
+    path "versions.yml" , emit: versions
+
+    when:
+    task.ext.when == null || task.ext.when
+
+    script:
+    def args = task.ext.args ?: ''
+    def reference = fasta ? "--reference ${fasta}" : ""
+    """
+    # write header only
+    samtools \\
+        view \\
+        --header-only \\
+        --threads ${task.cpus} \\
+        -O BAM \\
+        -o "${meta.id}.reindex.bam" \\
+        ${reference} \\
+        ${input}
+
+    # write BAM index only, remove unmapped, supplementary, etc...
+ samtools \\ + view \\ + --uncompressed \\ + --write-index \\ + --threads ${task.cpus} \\ + -O BAM \\ + -o "/dev/null##idx##${meta.id}.reindex.bam.bai" \\ + ${reference} \\ + ${args} \\ + ${input} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + END_VERSIONS + """ +} diff --git a/modules/nf-core/ascat/environment.yml b/modules/nf-core/ascat/environment.yml index 52935f0978..63d87708d6 100644 --- a/modules/nf-core/ascat/environment.yml +++ b/modules/nf-core/ascat/environment.yml @@ -1,8 +1,6 @@ -name: ascat channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::ascat=3.1.1 - bioconda::cancerit-allelecount=4.3.0 diff --git a/modules/nf-core/ascat/meta.yml b/modules/nf-core/ascat/meta.yml index 34ea2e51d9..db7c92926a 100644 --- a/modules/nf-core/ascat/meta.yml +++ b/modules/nf-core/ascat/meta.yml @@ -6,105 +6,151 @@ keywords: - cram tools: - ascat: - description: ASCAT is a method to derive copy number profiles of tumour cells, accounting for normal cell admixture and tumour aneuploidy. ASCAT infers tumour purity (the fraction of tumour cells) and ploidy (the amount of DNA per tumour cell), expressed as multiples of haploid genomes from SNP array or massively parallel sequencing data, and calculates whole-genome allele-specific copy number profiles (the number of copies of both parental alleles for all SNP loci across the genome). + description: ASCAT is a method to derive copy number profiles of tumour cells, + accounting for normal cell admixture and tumour aneuploidy. ASCAT infers tumour + purity (the fraction of tumour cells) and ploidy (the amount of DNA per tumour + cell), expressed as multiples of haploid genomes from SNP array or massively + parallel sequencing data, and calculates whole-genome allele-specific copy number + profiles (the number of copies of both parental alleles for all SNP loci across + the genome). 
documentation: https://github.com/VanLoo-lab/ascat/tree/master/man tool_dev_url: https://github.com/VanLoo-lab/ascat doi: "10.1093/bioinformatics/btaa538" licence: ["GPL v3"] + identifier: biotools:ascat input: - - args: - type: map - description: | - Groovy Map containing tool parameters. MUST follow the structure/keywords below and be provided via modules.config. Parameters must be set between quotes. (optional) parameters can be removed from the map, if they are not set. For default values, please check the documentation above. - - ``` - { - [ - "gender": "XX", - "genomeVersion": "hg19" - "purity": (optional), - "ploidy": (optional), - "gc_files": (optional), - "minCounts": (optional), - "BED_file": (optional) but recommended for WES, - "chrom_names": (optional), - "min_base_qual": (optional), - "min_map_qual": (optional), - "ref_fasta": (optional), - "skip_allele_counting_tumour": (optional), - "skip_allele_counting_normal": (optional) - ] - } - ``` - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input_normal: - type: file - description: BAM/CRAM file, must adhere to chr1, chr2, ...chrX notation For modifying chromosome notation in bam files please follow https://josephcckuo.wordpress.com/2016/11/17/modify-chromosome-notation-in-bam-file/. - pattern: "*.{bam,cram}" - - index_normal: - type: file - description: index for normal_bam/cram - pattern: "*.{bai,crai}" - - input_tumor: - type: file - description: BAM/CRAM file, must adhere to chr1, chr2, ...chrX notation - pattern: "*.{bam,cram}" - - index_tumor: - type: file - description: index for tumor_bam/cram - pattern: "*.{bai,crai}" - - allele_files: - type: file - description: allele files for ASCAT WGS. Can be downloaded here https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS - - loci_files: - type: file - description: loci files for ASCAT WGS. 
Loci files without chromosome notation can be downloaded here https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS Make sure the chromosome notation matches the bam/cram input files. To add the chromosome notation to loci files (hg19/hg38) if necessary, you can run this command `if [[ $(samtools view | head -n1 | cut -f3)\" == *\"chr\"* ]]; then for i in {1..22} X; do sed -i 's/^/chr/' G1000_loci_hg19_chr_${i}.txt; done; fi` - - bed_file: - type: file - description: Bed file for ASCAT WES (optional, but recommended for WES) - - fasta: - type: file - description: Reference fasta file (optional) - - gc_file: - type: file - description: GC correction file (optional) - Used to do logR correction of the tumour sample(s) with genomic GC content - - rt_file: - type: file - description: replication timing correction file (optional, provide only in combination with gc_file) + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input_normal: + type: file + description: BAM/CRAM file, must adhere to chr1, chr2, ...chrX notation For + modifying chromosome notation in bam files please follow + https://josephcckuo.wordpress.com/2016/11/17/modify-chromosome-notation-in-bam-file/. + pattern: "*.{bam,cram}" + - index_normal: + type: file + description: index for normal_bam/cram + pattern: "*.{bai,crai}" + - input_tumor: + type: file + description: BAM/CRAM file, must adhere to chr1, chr2, ...chrX notation + pattern: "*.{bam,cram}" + - index_tumor: + type: file + description: index for tumor_bam/cram + pattern: "*.{bai,crai}" + - - allele_files: + type: file + description: allele files for ASCAT WGS. Can be downloaded here https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS + - - loci_files: + type: file + description: loci files for ASCAT WGS. 
Loci files without chromosome notation + can be downloaded here https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS + Make sure the chromosome notation matches the bam/cram input files. To add + the chromosome notation to loci files (hg19/hg38) if necessary, you can run + this command `if [[ $(samtools view | head -n1 | cut -f3)\" + == *\"chr\"* ]]; then for i in {1..22} X; do sed -i 's/^/chr/' G1000_loci_hg19_chr_${i}.txt; + done; fi` + - - bed_file: + type: file + description: Bed file for ASCAT WES (optional, but recommended for WES) + - - fasta: + type: file + description: Reference fasta file (optional) + - - gc_file: + type: file + description: GC correction file (optional) - Used to do logR correction of the + tumour sample(s) with genomic GC content + - - rt_file: + type: file + description: replication timing correction file (optional, provide only in combination + with gc_file) output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - allelefreqs: - type: file - description: Files containing allee frequencies per chromosome - pattern: "*{alleleFrequencies_chr*.txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*alleleFrequencies_chr*.txt": + type: file + description: Files containing allee frequencies per chromosome + pattern: "*{alleleFrequencies_chr*.txt}" + - bafs: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*BAF.txt": + type: file + description: BAF file + - cnvs: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*cnvs.txt": + type: file + description: CNV file + - logrs: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*LogR.txt": + type: file + description: LogR file - metrics: - type: file - description: File containing quality metrics - pattern: "*.{metrics.txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*metrics.txt": + type: file + description: File containing quality metrics + pattern: "*.{metrics.txt}" - png: - type: file - description: ASCAT plots - pattern: "*.{png}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*png": + type: file + description: ASCAT plots + pattern: "*.{png}" - purityploidy: - type: file - description: File with purity and ploidy data - pattern: "*.{purityploidy.txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*purityploidy.txt": + type: file + description: File with purity and ploidy data + pattern: "*.{purityploidy.txt}" - segments: - type: file - description: File with segments data - pattern: "*.{segments.txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*segments.txt": + type: file + description: File with segments data + pattern: "*.{segments.txt}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@aasNGC" - "@lassefolkersen" diff --git a/modules/nf-core/bcftools/annotate/bcftools-annotate.diff b/modules/nf-core/bcftools/annotate/bcftools-annotate.diff index d15bf86b54..a9a8f03558 100644 --- a/modules/nf-core/bcftools/annotate/bcftools-annotate.diff +++ b/modules/nf-core/bcftools/annotate/bcftools-annotate.diff @@ -1,28 +1,29 @@ Changes in module 'nf-core/bcftools/annotate' +Changes in 'bcftools/annotate/main.nf': --- modules/nf-core/bcftools/annotate/main.nf +++ modules/nf-core/bcftools/annotate/main.nf -@@ -8,7 +8,10 @@ +@@ -8,8 +8,10 @@ 'biocontainers/bcftools:1.20--h8b25389_0' }" input: -- tuple val(meta), path(input), path(index), path(annotations), path(annotations_index), path(header_lines) -+ tuple val(meta), path(input) +- tuple val(meta), path(input), path(index), path(annotations), path(annotations_index) +- path(header_lines) ++ tuple val(meta), path(input), path(index) + path annotations + path annotations_index + path header_lines output: tuple val(meta), path("*.{vcf,vcf.gz,bcf,bcf.gz}"), emit: vcf -@@ -29,6 +32,10 @@ - "vcf" - if ("$input" == "${prefix}.${extension}") error "Input and output names are the same, set prefix in module configuration to disambiguate!" 
- """ -+ bcftools \\ -+ index \\ -+ $input -+ - bcftools \\ - annotate \\ - $args \\ +'modules/nf-core/bcftools/annotate/meta.yml' is unchanged +'modules/nf-core/bcftools/annotate/environment.yml' is unchanged +'modules/nf-core/bcftools/annotate/tests/main.nf.test.snap' is unchanged +'modules/nf-core/bcftools/annotate/tests/vcf_gz_index.config' is unchanged +'modules/nf-core/bcftools/annotate/tests/vcf_gz_index_csi.config' is unchanged +'modules/nf-core/bcftools/annotate/tests/bcf.config' is unchanged +'modules/nf-core/bcftools/annotate/tests/vcf.config' is unchanged +'modules/nf-core/bcftools/annotate/tests/main.nf.test' is unchanged +'modules/nf-core/bcftools/annotate/tests/tags.yml' is unchanged +'modules/nf-core/bcftools/annotate/tests/vcf_gz_index_tbi.config' is unchanged ************************************************************ diff --git a/modules/nf-core/bcftools/annotate/environment.yml b/modules/nf-core/bcftools/annotate/environment.yml index 3d4e337992..5c00b116ad 100644 --- a/modules/nf-core/bcftools/annotate/environment.yml +++ b/modules/nf-core/bcftools/annotate/environment.yml @@ -1,7 +1,5 @@ -name: bcftools_annotate channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::bcftools=1.20 diff --git a/modules/nf-core/bcftools/annotate/main.nf b/modules/nf-core/bcftools/annotate/main.nf index 39c86bbe43..1b3a371ecf 100644 --- a/modules/nf-core/bcftools/annotate/main.nf +++ b/modules/nf-core/bcftools/annotate/main.nf @@ -8,13 +8,15 @@ process BCFTOOLS_ANNOTATE { 'biocontainers/bcftools:1.20--h8b25389_0' }" input: - tuple val(meta), path(input) + tuple val(meta), path(input), path(index) path annotations path annotations_index path header_lines output: tuple val(meta), path("*.{vcf,vcf.gz,bcf,bcf.gz}"), emit: vcf + tuple val(meta), path("*.tbi") , emit: tbi, optional: true + tuple val(meta), path("*.csi") , emit: csi, optional: true path "versions.yml" , emit: versions when: @@ -30,11 +32,11 @@ process BCFTOOLS_ANNOTATE { 
args.contains("--output-type z") || args.contains("-Oz") ? "vcf.gz" : args.contains("--output-type v") || args.contains("-Ov") ? "vcf" : "vcf" + def index_command = !index ? "bcftools index $input" : '' + if ("$input" == "${prefix}.${extension}") error "Input and output names are the same, set prefix in module configuration to disambiguate!" """ - bcftools \\ - index \\ - $input + $index_command bcftools \\ annotate \\ @@ -59,10 +61,16 @@ process BCFTOOLS_ANNOTATE { args.contains("--output-type z") || args.contains("-Oz") ? "vcf.gz" : args.contains("--output-type v") || args.contains("-Ov") ? "vcf" : "vcf" - + def index_extension = args.contains("--write-index=tbi") || args.contains("-W=tbi") ? "tbi" : + args.contains("--write-index=csi") || args.contains("-W=csi") ? "csi" : + args.contains("--write-index") || args.contains("-W") ? "csi" : + "" def create_cmd = extension.endsWith(".gz") ? "echo '' | gzip >" : "touch" + def create_index = extension.endsWith(".gz") && index_extension.matches("csi|tbi") ? "touch ${prefix}.${extension}.${index_extension}" : "" + """ ${create_cmd} ${prefix}.${extension} + ${create_index} cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/bcftools/annotate/meta.yml b/modules/nf-core/bcftools/annotate/meta.yml index f3aa463bf5..5bfccd2bd8 100644 --- a/modules/nf-core/bcftools/annotate/meta.yml +++ b/modules/nf-core/bcftools/annotate/meta.yml @@ -13,41 +13,64 @@ tools: documentation: https://samtools.github.io/bcftools/bcftools.html#annotate doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:bcftools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - input: - type: file - description: Query VCF or BCF file, can be either uncompressed or compressed - - index: - type: file - description: Index of the query VCF or BCF file - - annotations: - type: file - description: Bgzip-compressed file with annotations - - annotations_index: - type: file - description: Index of the annotations file - - header_lines: - type: file - description: Contains lines to append to the output VCF header + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: Query VCF or BCF file, can be either uncompressed or compressed + - index: + type: file + description: Index of the query VCF or BCF file + - annotations: + type: file + description: Bgzip-compressed file with annotations + - annotations_index: + type: file + description: Index of the annotations file + - - header_lines: + type: file + description: Contains lines to append to the output VCF header output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - vcf: - type: file - description: Compressed annotated VCF file - pattern: "*{vcf,vcf.gz,bcf,bcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.{vcf,vcf.gz,bcf,bcf.gz}": + type: file + description: Compressed annotated VCF file + pattern: "*{vcf,vcf.gz,bcf,bcf.gz}" + - tbi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tbi": + type: file + description: Alternative VCF file index + pattern: "*.tbi" + - csi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.csi": + type: file + description: Default VCF file index + pattern: "*.csi" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@projectoriented" - "@ramprasadn" diff --git a/modules/nf-core/bcftools/annotate/tests/bcf.config b/modules/nf-core/bcftools/annotate/tests/bcf.config index b8496b33c3..79d26779da 100644 --- a/modules/nf-core/bcftools/annotate/tests/bcf.config +++ b/modules/nf-core/bcftools/annotate/tests/bcf.config @@ -1,6 +1,4 @@ process { - withName: 'BCFTOOLS_ANNOTATE' { - ext.args = "-x ID,INFO/DP,FORMAT/DP --output-type u" - ext.prefix = { "${meta.id}_ann" } - } -} \ No newline at end of file + ext.args = "-x ID,INFO/DP,FORMAT/DP --output-type u" + ext.prefix = { "${meta.id}_ann" } +} diff --git a/modules/nf-core/bcftools/annotate/tests/main.nf.test b/modules/nf-core/bcftools/annotate/tests/main.nf.test index 609102f836..3a5c493314 100644 --- a/modules/nf-core/bcftools/annotate/tests/main.nf.test +++ b/modules/nf-core/bcftools/annotate/tests/main.nf.test @@ -9,20 +9,21 @@ nextflow_process { tag "bcftools" tag "bcftools/annotate" - test("sarscov2 - [vcf, tbi, vcf2, tbi2, []] - vcf_output") { + test("sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_output") { config "./vcf.config" when { process { """ - input[0] = [ [ id:'test', single_end:false ], // meta map - file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_vcf_gz_tbi'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test2_vcf_gz'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test2_vcf_gz_tbi'], checkIfExists: true), - [] + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 
'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) ] + input[1] = [] """ } } @@ -33,34 +34,161 @@ nextflow_process { { assert snapshot( process.out.vcf.collect { it.collect { it instanceof Map ? it : file(it).name }}, process.out.versions - ).match("vcf") } + ).match() } ) } } + test("sarscov2 - [vcf, [], annotation, annotation_tbi], [] - vcf_output") { - test("sarscov2 - [vcf, [], [], [], header] - bcf_output") { - - config "./bcf.config" + config "./vcf.config" when { process { """ - vcf = Channel.of([ [ id:'test', single_end:false ], // meta map - file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true), + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), [], - [], - [] - ]) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf.collect { it.collect { it instanceof Map ? 
it : file(it).name }}, + process.out.versions + ).match() } + ) + } + + } + test("sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index") { - header = Channel.of( - '##INFO=', - '##INFO=' - ) - .collectFile(name:"headers.vcf", newLine:true) + config "./vcf_gz_index.config" - input[0] = vcf.combine(header) + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.tbi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.csi.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_csi") { + + config "./vcf_gz_index_csi.config" + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.tbi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.csi.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_tbi") { + + config "./vcf_gz_index_tbi.config" + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.tbi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.csi.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.tbi[0][1].endsWith(".tbi") } + ) + } + + } + test("sarscov2 - [vcf, [], annotation, annotation_tbi], header - bcf_output") { + + config "./bcf.config" + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + [], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = Channel.of( + '##INFO=', + '##INFO=' + ).collectFile(name:"headers.vcf", newLine:true) """ } } @@ -77,7 +205,7 @@ nextflow_process { } - test("sarscov2 - [vcf, tbi, vcf2, tbi2, []] - stub") { + test("sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - stub") { config "./vcf.config" options "-stub" @@ -85,13 +213,14 @@ nextflow_process { when { process { """ - input[0] = [ [ id:'test', single_end:false ], // meta map - file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_vcf_gz_tbi'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test2_vcf_gz'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test2_vcf_gz_tbi'], checkIfExists: true), - [] + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) ] + input[1] 
= [] """ } } @@ -105,4 +234,94 @@ nextflow_process { } + test("sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index - stub") { + + config "./vcf_gz_index.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match()}, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_csi - stub") { + + config "./vcf_gz_index_csi.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_tbi - stub") { + + config "./vcf_gz_index_tbi.config" + options "-stub" + + when { + 
process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + ] + input[1] = [] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.tbi[0][1].endsWith(".tbi") } + ) + } + + } + } \ No newline at end of file diff --git a/modules/nf-core/bcftools/annotate/tests/main.nf.test.snap b/modules/nf-core/bcftools/annotate/tests/main.nf.test.snap index 359ab38dc1..bac2224a3b 100644 --- a/modules/nf-core/bcftools/annotate/tests/main.nf.test.snap +++ b/modules/nf-core/bcftools/annotate/tests/main.nf.test.snap @@ -18,9 +18,9 @@ "nf-test": "0.8.4", "nextflow": "24.04.2" }, - "timestamp": "2024-06-03T11:39:40.696827933" + "timestamp": "2024-06-12T16:39:33.331888" }, - "vcf": { + "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index": { "content": [ [ [ @@ -31,6 +31,18 @@ "test_vcf.vcf.gz" ] ], + [ + + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.csi" + ] + ], [ "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" ] @@ -39,9 +51,9 @@ "nf-test": "0.8.4", "nextflow": "24.04.2" }, - "timestamp": "2024-06-03T11:39:29.215629503" + "timestamp": "2024-08-15T10:07:59.658031137" }, - "sarscov2 - [vcf, tbi, vcf2, tbi2, []] - stub": { + "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_csi - stub": { "content": [ { "0": [ @@ -54,7 +66,304 @@ ] ], "1": [ + + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] 
+ ], + "3": [ "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" + ], + "csi": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-08-15T10:09:05.096883418" + }, + "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_csi": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz" + ] + ], + [ + + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.csi" + ] + ], + [ + "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-08-15T10:08:10.581301219" + }, + "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-08-15T10:08:43.975017625" + }, + "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_tbi": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.tbi" + ] + ], + [ + + ], + [ + 
"versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-08-15T10:08:21.354059092" + }, + "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_output": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz" + ] + ], + [ + "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-08-15T10:07:37.788393317" + }, + "sarscov2 - [vcf, [], annotation, annotation_tbi], [] - vcf_output": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz" + ] + ], + [ + "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-08-15T10:07:48.500746325" + }, + "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_tbi - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" + ], + "csi": [ + + ], + "tbi": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "vcf": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-08-15T10:09:16.094918834" + }, + "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + 
"test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" + ], + "csi": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tbi": [ + ], "vcf": [ [ @@ -74,6 +383,6 @@ "nf-test": "0.8.4", "nextflow": "24.04.2" }, - "timestamp": "2024-06-03T11:39:52.630055205" + "timestamp": "2024-08-15T10:08:54.366358502" } } \ No newline at end of file diff --git a/modules/nf-core/bcftools/annotate/tests/vcf.config b/modules/nf-core/bcftools/annotate/tests/vcf.config index cb809f614c..611868d55c 100644 --- a/modules/nf-core/bcftools/annotate/tests/vcf.config +++ b/modules/nf-core/bcftools/annotate/tests/vcf.config @@ -1,6 +1,4 @@ process { - withName: 'BCFTOOLS_ANNOTATE' { - ext.prefix = { "${meta.id}_vcf" } - ext.args = "-x ID,INFO/DP,FORMAT/DP --output-type z" - } + ext.args = "-x ID,INFO/DP,FORMAT/DP --output-type z" + ext.prefix = { "${meta.id}_vcf" } } diff --git a/modules/nf-core/bcftools/annotate/tests/vcf_gz_index.config b/modules/nf-core/bcftools/annotate/tests/vcf_gz_index.config new file mode 100644 index 0000000000..2fd9a225f0 --- /dev/null +++ b/modules/nf-core/bcftools/annotate/tests/vcf_gz_index.config @@ -0,0 +1,4 @@ +process { + ext.args = "--output-type z --write-index --no-version" + ext.prefix = { "${meta.id}_vcf" } +} diff --git a/modules/nf-core/bcftools/annotate/tests/vcf_gz_index_csi.config b/modules/nf-core/bcftools/annotate/tests/vcf_gz_index_csi.config new file mode 100644 index 0000000000..512c1dfb05 --- /dev/null +++ b/modules/nf-core/bcftools/annotate/tests/vcf_gz_index_csi.config @@ -0,0 +1,4 @@ +process { + ext.args = "--output-type z --write-index=csi --no-version" + ext.prefix = { "${meta.id}_vcf" } +} diff --git 
a/modules/nf-core/bcftools/annotate/tests/vcf_gz_index_tbi.config b/modules/nf-core/bcftools/annotate/tests/vcf_gz_index_tbi.config new file mode 100644 index 0000000000..7feb5ebbed --- /dev/null +++ b/modules/nf-core/bcftools/annotate/tests/vcf_gz_index_tbi.config @@ -0,0 +1,4 @@ +process { + ext.args = "--output-type z --write-index=tbi --no-version" + ext.prefix = { "${meta.id}_vcf" } +} diff --git a/modules/nf-core/bcftools/concat/environment.yml b/modules/nf-core/bcftools/concat/environment.yml index 6544e949c8..5c00b116ad 100644 --- a/modules/nf-core/bcftools/concat/environment.yml +++ b/modules/nf-core/bcftools/concat/environment.yml @@ -1,7 +1,5 @@ -name: bcftools_concat channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::bcftools=1.20 diff --git a/modules/nf-core/bcftools/concat/main.nf b/modules/nf-core/bcftools/concat/main.nf index 092d2c62e1..a94b28d86d 100644 --- a/modules/nf-core/bcftools/concat/main.nf +++ b/modules/nf-core/bcftools/concat/main.nf @@ -11,8 +11,10 @@ process BCFTOOLS_CONCAT { tuple val(meta), path(vcfs), path(tbi) output: - tuple val(meta), path("*.gz"), emit: vcf - path "versions.yml" , emit: versions + tuple val(meta), path("${prefix}.vcf.gz") , emit: vcf + tuple val(meta), path("${prefix}.vcf.gz.tbi"), emit: tbi, optional: true + tuple val(meta), path("${prefix}.vcf.gz.csi"), emit: csi, optional: true + path "versions.yml" , emit: versions when: task.ext.when == null || task.ext.when @@ -20,7 +22,11 @@ process BCFTOOLS_CONCAT { script: def args = task.ext.args ?: '' prefix = task.ext.prefix ?: "${meta.id}" + def tbi_names = tbi.findAll { file -> !(file instanceof List) }.collect { file -> file.name } + def create_input_index = vcfs.collect { vcf -> tbi_names.contains(vcf.name + ".tbi") ? 
"" : "tabix ${vcf}" }.join("\n ") """ + ${create_input_index} + bcftools concat \\ --output ${prefix}.vcf.gz \\ $args \\ @@ -34,9 +40,16 @@ process BCFTOOLS_CONCAT { """ stub: + def args = task.ext.args ?: '' prefix = task.ext.prefix ?: "${meta.id}" + def index = args.contains("--write-index=tbi") || args.contains("-W=tbi") ? "tbi" : + args.contains("--write-index=csi") || args.contains("-W=csi") ? "csi" : + args.contains("--write-index") || args.contains("-W") ? "csi" : + "" + def create_index = index.matches("csi|tbi") ? "touch ${prefix}.vcf.gz.${index}" : "" """ echo "" | gzip > ${prefix}.vcf.gz + ${create_index} cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/bcftools/concat/meta.yml b/modules/nf-core/bcftools/concat/meta.yml index 91cb54d5c7..d2565b289f 100644 --- a/modules/nf-core/bcftools/concat/meta.yml +++ b/modules/nf-core/bcftools/concat/meta.yml @@ -13,36 +13,68 @@ tools: documentation: http://www.htslib.org/doc/bcftools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:bcftools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcfs: - type: list - description: | - List containing 2 or more vcf files - e.g. [ 'file1.vcf', 'file2.vcf' ] - - tbi: - type: list - description: | - List containing 2 or more index files (optional) - e.g. [ 'file1.tbi', 'file2.tbi' ] + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcfs: + type: list + description: | + List containing 2 or more vcf files + e.g. [ 'file1.vcf', 'file2.vcf' ] + - tbi: + type: list + description: | + List containing 2 or more index files (optional) + e.g. [ 'file1.tbi', 'file2.tbi' ] output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - vcf: - type: file - description: VCF concatenated output file - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + pattern: "*.{vcf.gz}" + - ${prefix}.vcf.gz: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + pattern: "*.{vcf.gz}" + - tbi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + pattern: "*.tbi" + - ${prefix}.vcf.gz.tbi: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + pattern: "*.tbi" + - csi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + pattern: "*.csi" + - ${prefix}.vcf.gz.csi: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + pattern: "*.csi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@abhi18av" - "@nvnieuwk" diff --git a/modules/nf-core/bcftools/concat/tests/main.nf.test b/modules/nf-core/bcftools/concat/tests/main.nf.test index d5d9f78743..cb4642b29c 100644 --- a/modules/nf-core/bcftools/concat/tests/main.nf.test +++ b/modules/nf-core/bcftools/concat/tests/main.nf.test @@ -9,22 +9,23 @@ nextflow_process { tag "bcftools" tag "bcftools/concat" - config "./nextflow.config" - test("sarscov2 - [[vcf1, vcf2], [tbi1, tbi2]]") { + test("homo_sapiens - [[vcf1, vcf2], [tbi1, tbi2]]") { + + config "./nextflow.config" when { process { """ - input[0] = [ + input[0] = [ [ id:'test3' ], // meta map [ - file(params.test_data['homo_sapiens']['illumina']['test_haplotc_cnn_vcf_gz'], checkIfExists: true), - 
file(params.test_data['homo_sapiens']['illumina']['test_genome_vcf_gz'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true) ], [ - file(params.test_data['homo_sapiens']['illumina']['test_genome_vcf_gz_tbi'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_haplotc_cnn_vcf_gz_tbi'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz.tbi', checkIfExists: true) ] ] """ @@ -40,16 +41,130 @@ nextflow_process { } - test("sarscov2 - [[vcf1, vcf2], []]") { + test("homo_sapiens - [[vcf1, vcf2], [tbi1, tbi2]] - vcf_gz_index") { + + config "./vcf_gz_index.config" when { process { """ - input[0] = [ + input[0] = [ [ id:'test3' ], // meta map [ - file(params.test_data['homo_sapiens']['illumina']['test_haplotc_cnn_vcf_gz'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_genome_vcf_gz'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true) + ], + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz.tbi', checkIfExists: true) + ] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { 
assert snapshot( + process.out.vcf, + process.out.csi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.tbi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("homo_sapiens - [[vcf1, vcf2], [tbi1, tbi2]] - vcf_gz_index_csi") { + + config "./vcf_gz_index_csi.config" + + when { + process { + """ + input[0] = [ + [ id:'test3' ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true) + ], + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz.tbi', checkIfExists: true) + ] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.csi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.tbi.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("homo_sapiens - [[vcf1, vcf2], [tbi1, tbi2]] - vcf_gz_index_tbi") { + + config "./vcf_gz_index_tbi.config" + + when { + process { + """ + input[0] = [ + [ id:'test3' ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true) + ], + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz.tbi', checkIfExists: true) + ] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.csi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.tbi.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.tbi[0][1].endsWith(".tbi") } + ) + } + + } + + + test("homo_sapiens - [[vcf1, vcf2], []]") { + + config "./nextflow.config" + + when { + process { + """ + input[0] = [ + [ id:'test3' ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true) ], [] ] @@ -66,21 +181,23 @@ nextflow_process { } - test("sarscov2 - [[vcf1, vcf2], [tbi1, tbi2]] - stub") { + test("homo_sapiens - [[vcf1, vcf2], [tbi1, tbi2]] - stub") { + config "./nextflow.config" options "-stub" + when { process { """ - input[0] = [ + input[0] = [ [ id:'test3' ], // meta map [ - file(params.test_data['homo_sapiens']['illumina']['test_haplotc_cnn_vcf_gz'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_genome_vcf_gz'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true) ], [ - file(params.test_data['homo_sapiens']['illumina']['test_genome_vcf_gz_tbi'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_haplotc_cnn_vcf_gz_tbi'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz.tbi', checkIfExists: true) ] ] """ @@ -96,4 +213,104 @@ nextflow_process { } -} + test("homo_sapiens - [[vcf1, vcf2], [tbi1, tbi2]] - vcf_gz_index - stub") { + + config 
"./vcf_gz_index.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test3' ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true) + ], + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz.tbi', checkIfExists: true) + ] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("homo_sapiens - [[vcf1, vcf2], [tbi1, tbi2]] - vcf_gz_index_csi - stub") { + + config "./vcf_gz_index_csi.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test3' ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true) + ], + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz.tbi', checkIfExists: true) + ] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("homo_sapiens - [[vcf1, vcf2], [tbi1, tbi2]] - vcf_gz_index_tbi - stub") { + + config "./vcf_gz_index_tbi.config" + options "-stub" + + 
when { + process { + """ + input[0] = [ + [ id:'test3' ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true) + ], + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test_haplotcaller.cnn.vcf.gz.tbi', checkIfExists: true) + ] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.tbi[0][1].endsWith(".tbi") } + ) + } + + } + + +} \ No newline at end of file diff --git a/modules/nf-core/bcftools/concat/tests/main.nf.test.snap b/modules/nf-core/bcftools/concat/tests/main.nf.test.snap index bef0ff05be..09e87cd3e5 100644 --- a/modules/nf-core/bcftools/concat/tests/main.nf.test.snap +++ b/modules/nf-core/bcftools/concat/tests/main.nf.test.snap @@ -1,5 +1,60 @@ { - "sarscov2 - [[vcf1, vcf2], []]": { + "homo_sapiens - [[vcf1, vcf2], [tbi1, tbi2]] - vcf_gz_index - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + "versions.yml:md5,c6e19f105510a46af1c5da9064e2e659" + ], + "csi": [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,c6e19f105510a46af1c5da9064e2e659" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": 
"2024-09-26T11:04:11.178539482" + }, + "homo_sapiens - [[vcf1, vcf2], [tbi1, tbi2]]": { "content": [ { "0": [ @@ -11,7 +66,19 @@ ] ], "1": [ + + ], + "2": [ + + ], + "3": [ "versions.yml:md5,c6e19f105510a46af1c5da9064e2e659" + ], + "csi": [ + + ], + "tbi": [ + ], "vcf": [ [ @@ -27,12 +94,129 @@ } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-26T11:03:08.765639958" + }, + "homo_sapiens - [[vcf1, vcf2], [tbi1, tbi2]] - vcf_gz_index": { + "content": [ + [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz:md5,5f6796c3ae109a1a5b87353954693f5a" + ] + ], + [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz.csi" + ] + ], + [ + + ], + [ + "versions.yml:md5,c6e19f105510a46af1c5da9064e2e659" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-26T11:03:21.607274757" + }, + "homo_sapiens - [[vcf1, vcf2], [tbi1, tbi2]] - vcf_gz_index_tbi - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,c6e19f105510a46af1c5da9064e2e659" + ], + "csi": [ + + ], + "tbi": [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "vcf": [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,c6e19f105510a46af1c5da9064e2e659" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-31T15:16:00.637917586" + "timestamp": "2024-09-26T11:04:27.332133878" }, - "sarscov2 - [[vcf1, vcf2], [tbi1, tbi2]]": { + "homo_sapiens - [[vcf1, vcf2], [tbi1, tbi2]] - vcf_gz_index_csi": { + "content": [ + [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz:md5,5f6796c3ae109a1a5b87353954693f5a" + ] 
+ ], + [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz.csi" + ] + ], + [ + + ], + [ + "versions.yml:md5,c6e19f105510a46af1c5da9064e2e659" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-26T11:03:36.575719606" + }, + "homo_sapiens - [[vcf1, vcf2], []]": { "content": [ { "0": [ @@ -44,7 +228,19 @@ ] ], "1": [ + + ], + "2": [ + + ], + "3": [ "versions.yml:md5,c6e19f105510a46af1c5da9064e2e659" + ], + "csi": [ + + ], + "tbi": [ + ], "vcf": [ [ @@ -60,12 +256,12 @@ } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-31T15:15:55.750767959" + "timestamp": "2024-09-26T11:03:54.069826178" }, - "sarscov2 - [[vcf1, vcf2], [tbi1, tbi2]] - stub": { + "homo_sapiens - [[vcf1, vcf2], [tbi1, tbi2]] - stub": { "content": [ { "0": [ @@ -77,7 +273,19 @@ ] ], "1": [ + + ], + "2": [ + + ], + "3": [ "versions.yml:md5,c6e19f105510a46af1c5da9064e2e659" + ], + "csi": [ + + ], + "tbi": [ + ], "vcf": [ [ @@ -93,9 +301,95 @@ } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-26T11:04:02.45346063" + }, + "homo_sapiens - [[vcf1, vcf2], [tbi1, tbi2]] - vcf_gz_index_tbi": { + "content": [ + [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz:md5,5f6796c3ae109a1a5b87353954693f5a" + ] + ], + [ + + ], + [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz.tbi" + ] + ], + [ + "versions.yml:md5,c6e19f105510a46af1c5da9064e2e659" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-26T11:03:44.618596639" + }, + "homo_sapiens - [[vcf1, vcf2], [tbi1, tbi2]] - vcf_gz_index_csi - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + 
"versions.yml:md5,c6e19f105510a46af1c5da9064e2e659" + ], + "csi": [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test3" + }, + "test3_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,c6e19f105510a46af1c5da9064e2e659" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-31T15:16:05.717072932" + "timestamp": "2024-09-26T11:04:19.745768656" } } \ No newline at end of file diff --git a/modules/nf-core/bcftools/concat/tests/vcf_gz_index.config b/modules/nf-core/bcftools/concat/tests/vcf_gz_index.config new file mode 100644 index 0000000000..7dd696ee26 --- /dev/null +++ b/modules/nf-core/bcftools/concat/tests/vcf_gz_index.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args = "--output-type z --write-index --no-version" +} diff --git a/modules/nf-core/bcftools/concat/tests/vcf_gz_index_csi.config b/modules/nf-core/bcftools/concat/tests/vcf_gz_index_csi.config new file mode 100644 index 0000000000..aebffb6fb7 --- /dev/null +++ b/modules/nf-core/bcftools/concat/tests/vcf_gz_index_csi.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args = "--output-type z --write-index=csi --no-version" +} diff --git a/modules/nf-core/bcftools/concat/tests/vcf_gz_index_tbi.config b/modules/nf-core/bcftools/concat/tests/vcf_gz_index_tbi.config new file mode 100644 index 0000000000..b192ae7d19 --- /dev/null +++ b/modules/nf-core/bcftools/concat/tests/vcf_gz_index_tbi.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args = "--output-type z --write-index=tbi --no-version" +} diff --git a/modules/nf-core/bcftools/mpileup/environment.yml b/modules/nf-core/bcftools/mpileup/environment.yml index 7e479383ba..5c00b116ad 100644 --- a/modules/nf-core/bcftools/mpileup/environment.yml +++ 
b/modules/nf-core/bcftools/mpileup/environment.yml @@ -1,7 +1,5 @@ -name: bcftools_mpileup channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::bcftools=1.20 diff --git a/modules/nf-core/bcftools/mpileup/meta.yml b/modules/nf-core/bcftools/mpileup/meta.yml index 65410ddd66..febcb33f60 100644 --- a/modules/nf-core/bcftools/mpileup/meta.yml +++ b/modules/nf-core/bcftools/mpileup/meta.yml @@ -12,56 +12,79 @@ tools: documentation: http://www.htslib.org/doc/bcftools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:bcftools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: Input BAM file - pattern: "*.{bam}" - - intervals: - type: file - description: Input intervals file. A file (commonly '.bed') containing regions to subset - - meta: - type: map - description: | - Groovy Map containing information about the genome fasta, e.g. [ id: 'sarscov2' ] - - fasta: - type: file - description: FASTA reference file - pattern: "*.{fasta,fa}" - - save_mpileup: - type: boolean - description: Save mpileup file generated by bcftools mpileup + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bam: + type: file + description: Input BAM file + pattern: "*.{bam}" + - intervals: + type: file + description: Input intervals file. A file (commonly '.bed') containing regions + to subset + - - meta2: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: FASTA reference file + pattern: "*.{fasta,fa}" + - - save_mpileup: + type: boolean + description: Save mpileup file generated by bcftools mpileup output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - vcf: - type: file - description: VCF gzipped output file - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*vcf.gz": + type: file + description: VCF gzipped output file + pattern: "*.{vcf.gz}" - tbi: - type: file - description: tabix index file - pattern: "*.{vcf.gz.tbi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*vcf.gz.tbi": + type: file + description: tabix index file + pattern: "*.{vcf.gz.tbi}" - stats: - type: file - description: Text output file containing stats - pattern: "*{stats.txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*stats.txt": + type: file + description: Text output file containing stats + pattern: "*{stats.txt}" - mpileup: - type: file - description: mpileup gzipped output for all positions - pattern: "{*.mpileup.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.mpileup.gz": + type: file + description: mpileup gzipped output for all positions + pattern: "{*.mpileup.gz}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/bcftools/mpileup/tests/main.nf.test b/modules/nf-core/bcftools/mpileup/tests/main.nf.test index dc35c54266..665a349fc8 100644 --- a/modules/nf-core/bcftools/mpileup/tests/main.nf.test +++ b/modules/nf-core/bcftools/mpileup/tests/main.nf.test @@ -18,12 +18,12 @@ nextflow_process { """ input[0] = [ [ id:'test' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), [] ] input[1] = [ [ id:'sarscov2' ], // meta map - file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) ] input[2] = false """ @@ -51,12 +51,12 @@ nextflow_process { """ input[0] = [ [ id:'test' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), [] ] input[1] = [ [ id:'sarscov2' ], // meta map - file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) ] input[2] = false """ @@ -82,12 +82,12 @@ nextflow_process { """ input[0] = [ [ id:'test' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), + 
file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), [] ] input[1] = [ [ id:'sarscov2' ], // meta map - file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) ] input[2] = true """ @@ -116,12 +116,12 @@ nextflow_process { """ input[0] = [ [ id:'test' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), [] ] input[1] = [ [ id:'sarscov2' ], // meta map - file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) ] input[2] = true """ @@ -148,12 +148,12 @@ nextflow_process { """ input[0] = [ [ id:'test' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['sarscov2']['genome']['test_bed'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test.bed', checkIfExists: true) ] input[1] = [ [ id:'sarscov2' ], // meta map - file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) ] input[2] = false """ @@ -181,12 +181,12 @@ nextflow_process { """ input[0] = [ [ id:'test' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['sarscov2']['genome']['test_bed'], checkIfExists: true) + 
file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test.bed', checkIfExists: true) ] input[1] = [ [ id:'sarscov2' ], // meta map - file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) ] input[2] = false """ diff --git a/modules/nf-core/bcftools/sort/environment.yml b/modules/nf-core/bcftools/sort/environment.yml index 2295ecfd17..5c00b116ad 100644 --- a/modules/nf-core/bcftools/sort/environment.yml +++ b/modules/nf-core/bcftools/sort/environment.yml @@ -1,7 +1,5 @@ -name: bcftools_sort channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::bcftools=1.20 diff --git a/modules/nf-core/bcftools/sort/main.nf b/modules/nf-core/bcftools/sort/main.nf index d5e3ce9af7..7d4c9b8e9d 100644 --- a/modules/nf-core/bcftools/sort/main.nf +++ b/modules/nf-core/bcftools/sort/main.nf @@ -11,8 +11,10 @@ process BCFTOOLS_SORT { tuple val(meta), path(vcf) output: - tuple val(meta), path("*.{vcf,vcf.gz,bcf,bcf.gz}") , emit: vcf - path "versions.yml" , emit: versions + tuple val(meta), path("*.{vcf,vcf.gz,bcf,bcf.gz}"), emit: vcf + tuple val(meta), path("*.tbi") , emit: tbi, optional: true + tuple val(meta), path("*.csi") , emit: csi, optional: true + path "versions.yml" , emit: versions when: task.ext.when == null || task.ext.when @@ -49,9 +51,16 @@ process BCFTOOLS_SORT { args.contains("--output-type z") || args.contains("-Oz") ? "vcf.gz" : args.contains("--output-type v") || args.contains("-Ov") ? "vcf" : "vcf" + def index = args.contains("--write-index=tbi") || args.contains("-W=tbi") ? "tbi" : + args.contains("--write-index=csi") || args.contains("-W=csi") ? "csi" : + args.contains("--write-index") || args.contains("-W") ? "csi" : + "" def create_cmd = extension.endsWith(".gz") ? 
"echo '' | gzip >" : "touch" + def create_index = extension.endsWith(".gz") && index.matches("csi|tbi") ? "touch ${prefix}.${extension}.${index}" : "" + """ ${create_cmd} ${prefix}.${extension} + ${create_index} cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/bcftools/sort/meta.yml b/modules/nf-core/bcftools/sort/meta.yml index 84747c6d89..f7a6eff17d 100644 --- a/modules/nf-core/bcftools/sort/meta.yml +++ b/modules/nf-core/bcftools/sort/meta.yml @@ -12,30 +12,53 @@ tools: tool_dev_url: https://github.com/samtools/bcftools doi: "10.1093/bioinformatics/btp352" licence: ["MIT"] + identifier: biotools:bcftools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: The VCF/BCF file to be sorted - pattern: "*.{vcf.gz,vcf,bcf}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: The VCF/BCF file to be sorted + pattern: "*.{vcf.gz,vcf,bcf}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - vcf: - type: file - description: Sorted VCF file - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.{vcf,vcf.gz,bcf,bcf.gz}": + type: file + description: Sorted VCF file + pattern: "*.{vcf.gz}" + - tbi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tbi": + type: file + description: Alternative VCF file index + pattern: "*.tbi" + - csi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.csi": + type: file + description: Default VCF file index + pattern: "*.csi" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@Gwennid" maintainers: diff --git a/modules/nf-core/bcftools/sort/tests/main.nf.test b/modules/nf-core/bcftools/sort/tests/main.nf.test index 8a496dda7d..b9bdd76a09 100644 --- a/modules/nf-core/bcftools/sort/tests/main.nf.test +++ b/modules/nf-core/bcftools/sort/tests/main.nf.test @@ -15,7 +15,7 @@ nextflow_process { """ input[0] = [ [ id:'test' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_vcf'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) ] """ } @@ -30,6 +30,96 @@ nextflow_process { } + test("sarscov2 - vcf_gz_index") { + + config "./vcf_gz_index.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.csi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.tbi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - vcf_gz_index_csi") { + + config "./vcf_gz_index_csi.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.csi.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, + process.out.tbi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - vcf_gz_index_tbi") { + + config "./vcf_gz_index_tbi.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf, + process.out.csi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.tbi.collect { it.collect { it instanceof Map ? it : file(it).name } }, + process.out.versions + ).match() }, + { assert process.out.tbi[0][1].endsWith(".tbi") } + ) + } + + } + test("sarscov2 - vcf - stub") { options "-stub" when { @@ -37,7 +127,7 @@ nextflow_process { """ input[0] = [ [ id:'test' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_vcf'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) ] """ } @@ -51,4 +141,82 @@ nextflow_process { } } -} + + test("sarscov2 - vcf_gz_index - stub") { + + config "./vcf_gz_index.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - vcf_gz_index_csi - stub") { + + config "./vcf_gz_index_csi.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + """ + 
} + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.csi[0][1].endsWith(".csi") } + ) + } + + } + + test("sarscov2 - vcf_gz_index_tbi - stub") { + + config "./vcf_gz_index_tbi.config" + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert process.out.tbi[0][1].endsWith(".tbi") } + ) + } + + } +} \ No newline at end of file diff --git a/modules/nf-core/bcftools/sort/tests/main.nf.test.snap b/modules/nf-core/bcftools/sort/tests/main.nf.test.snap index 3f478d19d6..f38272cb28 100644 --- a/modules/nf-core/bcftools/sort/tests/main.nf.test.snap +++ b/modules/nf-core/bcftools/sort/tests/main.nf.test.snap @@ -1,4 +1,59 @@ { + "sarscov2 - vcf_gz_index_tbi - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ], + "csi": [ + + ], + "tbi": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:06:05.201680777" + }, "vcf": { "content": [ { @@ -11,7 +66,19 @@ ] ], "1": [ + + ], + "2": [ + + ], + "3": [ "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ], + "csi": [ + + ], + "tbi": [ + ], "vcf": [ [ @@ -30,7 +97,179 @@ "nf-test": "0.8.4", "nextflow": 
"24.04.2" }, - "timestamp": "2024-06-03T11:54:39.200647026" + "timestamp": "2024-06-05T12:04:43.889971134" + }, + "sarscov2 - vcf_gz_index": { + "content": [ + [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,8e722884ffb75155212a3fc053918766" + ] + ], + [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi" + ] + ], + [ + + ], + [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:04:55.385964497" + }, + "sarscov2 - vcf_gz_index_csi": { + "content": [ + [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,8e722884ffb75155212a3fc053918766" + ] + ], + [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi" + ] + ], + [ + + ], + [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:05:06.662818922" + }, + "sarscov2 - vcf_gz_index - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ], + "csi": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:05:40.012912381" + }, + "sarscov2 - vcf_gz_index_csi - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + 
"versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ], + "csi": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:05:52.405673587" }, "sarscov2 - vcf - stub": { "content": [ @@ -44,7 +283,19 @@ ] ], "1": [ + + ], + "2": [ + + ], + "3": [ "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ], + "csi": [ + + ], + "tbi": [ + ], "vcf": [ [ @@ -63,6 +314,37 @@ "nf-test": "0.8.4", "nextflow": "24.04.2" }, - "timestamp": "2024-06-03T11:54:55.594155692" + "timestamp": "2024-06-05T12:05:29.117946461" + }, + "sarscov2 - vcf_gz_index_tbi": { + "content": [ + [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz:md5,8e722884ffb75155212a3fc053918766" + ] + ], + [ + + ], + [ + [ + { + "id": "test" + }, + "test_vcf.vcf.gz.tbi" + ] + ], + [ + "versions.yml:md5,2c9f26ca356ef71199c3a7d1742974cb" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-05T12:05:17.217274984" } } \ No newline at end of file diff --git a/modules/nf-core/bcftools/sort/tests/vcf_gz_index.config b/modules/nf-core/bcftools/sort/tests/vcf_gz_index.config new file mode 100644 index 0000000000..aacd13464a --- /dev/null +++ b/modules/nf-core/bcftools/sort/tests/vcf_gz_index.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args = "--output-type z --write-index" +} diff --git a/modules/nf-core/bcftools/sort/tests/vcf_gz_index_csi.config b/modules/nf-core/bcftools/sort/tests/vcf_gz_index_csi.config new file mode 100644 index 0000000000..640eb0ba56 --- /dev/null +++ b/modules/nf-core/bcftools/sort/tests/vcf_gz_index_csi.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args = 
"--output-type z --write-index=csi" +} diff --git a/modules/nf-core/bcftools/sort/tests/vcf_gz_index_tbi.config b/modules/nf-core/bcftools/sort/tests/vcf_gz_index_tbi.config new file mode 100644 index 0000000000..589a50c65d --- /dev/null +++ b/modules/nf-core/bcftools/sort/tests/vcf_gz_index_tbi.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args = "--output-type z --write-index=tbi" +} diff --git a/modules/nf-core/bcftools/stats/environment.yml b/modules/nf-core/bcftools/stats/environment.yml index 128fe20422..93357b41ea 100644 --- a/modules/nf-core/bcftools/stats/environment.yml +++ b/modules/nf-core/bcftools/stats/environment.yml @@ -1,8 +1,6 @@ -name: bcftools_stats channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::bcftools=1.20 - bioconda::htslib=1.20 diff --git a/modules/nf-core/bcftools/stats/meta.yml b/modules/nf-core/bcftools/stats/meta.yml index 7ea2103e3b..655a61c5f4 100644 --- a/modules/nf-core/bcftools/stats/meta.yml +++ b/modules/nf-core/bcftools/stats/meta.yml @@ -13,58 +13,86 @@ tools: documentation: http://www.htslib.org/doc/bcftools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:bcftools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: VCF input file - pattern: "*.{vcf}" - - tbi: - type: file - description: | - The tab index for the VCF file to be inspected. Optional: only required when parameter regions is chosen. - pattern: "*.tbi" - - regions: - type: file - description: | - Optionally, restrict the operation to regions listed in this file. (VCF, BED or tab-delimited) - - targets: - type: file - description: | - Optionally, restrict the operation to regions listed in this file (doesn't rely upon tbi index files) - - samples: - type: file - description: | - Optional, file of sample names to be included or excluded. - e.g. 
'file.tsv' - - exons: - type: file - description: | - Tab-delimited file with exons for indel frameshifts (chr,beg,end; 1-based, inclusive, optionally bgzip compressed). - e.g. 'exons.tsv.gz' - - fasta: - type: file - description: | - Faidx indexed reference sequence file to determine INDEL context. - e.g. 'reference.fa' + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: VCF input file + pattern: "*.{vcf}" + - tbi: + type: file + description: | + The tab index for the VCF file to be inspected. Optional: only required when parameter regions is chosen. + pattern: "*.tbi" + - - meta2: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - regions: + type: file + description: | + Optionally, restrict the operation to regions listed in this file. (VCF, BED or tab-delimited) + - - meta3: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - targets: + type: file + description: | + Optionally, restrict the operation to regions listed in this file (doesn't rely upon tbi index files) + - - meta4: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - samples: + type: file + description: | + Optional, file of sample names to be included or excluded. + e.g. 'file.tsv' + - - meta5: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - exons: + type: file + description: | + Tab-delimited file with exons for indel frameshifts (chr,beg,end; 1-based, inclusive, optionally bgzip compressed). + e.g. 'exons.tsv.gz' + - - meta6: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - fasta: + type: file + description: | + Faidx indexed reference sequence file to determine INDEL context. + e.g. 'reference.fa' output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - stats: - type: file - description: Text output file containing stats - pattern: "*_{stats.txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*stats.txt": + type: file + description: Text output file containing stats + pattern: "*_{stats.txt}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/bwa/index/environment.yml b/modules/nf-core/bwa/index/environment.yml index 126e003448..d8789a2092 100644 --- a/modules/nf-core/bwa/index/environment.yml +++ b/modules/nf-core/bwa/index/environment.yml @@ -1,7 +1,5 @@ -name: bwa_index channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::bwa=0.7.18 diff --git a/modules/nf-core/bwa/index/meta.yml b/modules/nf-core/bwa/index/meta.yml index 4c7d30f3aa..4884bca2ab 100644 --- a/modules/nf-core/bwa/index/meta.yml +++ b/modules/nf-core/bwa/index/meta.yml @@ -11,32 +11,35 @@ tools: BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome. homepage: http://bio-bwa.sourceforge.net/ - documentation: http://www.htslib.org/doc/samtools.html + documentation: https://bio-bwa.sourceforge.net/bwa.shtml arxiv: arXiv:1303.3997 licence: ["GPL-3.0-or-later"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing reference information. - e.g. 
[ id:'test', single_end:false ] - - fasta: - type: file - description: Input genome fasta file + - - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: Input genome fasta file output: - - meta: - type: map - description: | - Groovy Map containing reference information. - e.g. [ id:'test', single_end:false ] - index: - type: file - description: BWA genome index files - pattern: "*.{amb,ann,bwt,pac,sa}" + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - bwa: + type: file + description: BWA genome index files + pattern: "*.{amb,ann,bwt,pac,sa}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@maxulysse" diff --git a/modules/nf-core/bwa/mem/environment.yml b/modules/nf-core/bwa/mem/environment.yml index 3aa9f0cca2..ef7b966c0f 100644 --- a/modules/nf-core/bwa/mem/environment.yml +++ b/modules/nf-core/bwa/mem/environment.yml @@ -1,10 +1,8 @@ -name: bwa_mem channels: - conda-forge - bioconda - - defaults + dependencies: - bwa=0.7.18 - # renovate: datasource=conda depName=bioconda/samtools - - samtools=1.20 - htslib=1.20.0 + - samtools=1.20 diff --git a/modules/nf-core/bwa/mem/meta.yml b/modules/nf-core/bwa/mem/meta.yml index 1532c2615d..37467d2912 100644 --- a/modules/nf-core/bwa/mem/meta.yml +++ b/modules/nf-core/bwa/mem/meta.yml @@ -14,58 +14,85 @@ tools: BWA is a software package for mapping DNA sequences against a large reference genome, such as the human genome. 
homepage: http://bio-bwa.sourceforge.net/ - documentation: http://www.htslib.org/doc/samtools.html + documentation: https://bio-bwa.sourceforge.net/bwa.shtml arxiv: arXiv:1303.3997 licence: ["GPL-3.0-or-later"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - reads: - type: file - description: | - List of input FastQ files of size 1 and 2 for single-end and paired-end data, - respectively. - - meta2: - type: map - description: | - Groovy Map containing reference information. - e.g. [ id:'test', single_end:false ] - - index: - type: file - description: BWA genome index files - pattern: "Directory containing BWA index *.{amb,ann,bwt,pac,sa}" - - fasta: - type: file - description: Reference genome in FASTA format - pattern: "*.{fasta,fa}" - - sort_bam: - type: boolean - description: use samtools sort (true) or samtools view (false) - pattern: "true or false" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - reads: + type: file + description: | + List of input FastQ files of size 1 and 2 for single-end and paired-end data, + respectively. + - - meta2: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - index: + type: file + description: BWA genome index files + pattern: "Directory containing BWA index *.{amb,ann,bwt,pac,sa}" + - - meta3: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - fasta: + type: file + description: Reference genome in FASTA format + pattern: "*.{fasta,fa}" + - - sort_bam: + type: boolean + description: use samtools sort (true) or samtools view (false) + pattern: "true or false" output: - bam: - type: file - description: Output BAM file containing read alignments - pattern: "*.{bam}" + - meta: + type: file + description: Output BAM file containing read alignments + pattern: "*.{bam}" + - "*.bam": + type: file + description: Output BAM file containing read alignments + pattern: "*.{bam}" - cram: - type: file - description: Output CRAM file containing read alignments - pattern: "*.{cram}" + - meta: + type: file + description: Output CRAM file containing read alignments + pattern: "*.{cram}" + - "*.cram": + type: file + description: Output CRAM file containing read alignments + pattern: "*.{cram}" - csi: - type: file - description: Optional index file for BAM file - pattern: "*.{csi}" + - meta: + type: file + description: Optional index file for BAM file + pattern: "*.{csi}" + - "*.csi": + type: file + description: Optional index file for BAM file + pattern: "*.{csi}" - crai: - type: file - description: Optional index file for CRAM file - pattern: "*.{crai}" + - meta: + type: file + description: Optional index file for CRAM file + pattern: "*.{crai}" + - "*.crai": + type: file + description: Optional index file for CRAM file + pattern: "*.{crai}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@jeremy1805" diff --git a/modules/nf-core/bwa/mem/tests/main.nf.test b/modules/nf-core/bwa/mem/tests/main.nf.test index 463b76f81c..5de2c2f453 100644 --- a/modules/nf-core/bwa/mem/tests/main.nf.test +++ b/modules/nf-core/bwa/mem/tests/main.nf.test @@ -9,21 +9,21 @@ nextflow_process { script "../main.nf" process "BWA_MEM" - 
test("Single-End") { - - setup { - run("BWA_INDEX") { - script "../../index/main.nf" - process { - """ - input[0] = [ - [id: 'test'], - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) - ] - """ - } + setup { + run("BWA_INDEX") { + script "../../index/main.nf" + process { + """ + input[0] = [ + [id: 'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ } } + } + + test("Single-End") { when { process { @@ -49,7 +49,8 @@ nextflow_process { process.out.csi, process.out.crai, process.out.versions, - file(process.out.bam[0][1]).name + bam(process.out.bam[0][1]).getHeaderMD5(), + bam(process.out.bam[0][1]).getReadsMD5() ).match() } ) @@ -59,20 +60,6 @@ nextflow_process { test("Single-End Sort") { - setup { - run("BWA_INDEX") { - script "../../index/main.nf" - process { - """ - input[0] = [ - [id: 'test'], - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) - ] - """ - } - } - } - when { process { """ @@ -97,7 +84,8 @@ nextflow_process { process.out.csi, process.out.crai, process.out.versions, - file(process.out.bam[0][1]).name + bam(process.out.bam[0][1]).getHeaderMD5(), + bam(process.out.bam[0][1]).getReadsMD5() ).match() } ) @@ -107,20 +95,6 @@ nextflow_process { test("Paired-End") { - setup { - run("BWA_INDEX") { - script "../../index/main.nf" - process { - """ - input[0] = [ - [id: 'test'], - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) - ] - """ - } - } - } - when { process { """ @@ -146,7 +120,8 @@ nextflow_process { process.out.csi, process.out.crai, process.out.versions, - file(process.out.bam[0][1]).name + bam(process.out.bam[0][1]).getHeaderMD5(), + bam(process.out.bam[0][1]).getReadsMD5() ).match() } ) @@ -156,20 +131,6 @@ nextflow_process { test("Paired-End Sort") { - setup { - run("BWA_INDEX") { - script "../../index/main.nf" - 
process { - """ - input[0] = [ - [id: 'test'], - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) - ] - """ - } - } - } - when { process { """ @@ -195,7 +156,8 @@ nextflow_process { process.out.csi, process.out.crai, process.out.versions, - file(process.out.bam[0][1]).name + bam(process.out.bam[0][1]).getHeaderMD5(), + bam(process.out.bam[0][1]).getReadsMD5() ).match() } ) @@ -205,20 +167,6 @@ nextflow_process { test("Paired-End - no fasta") { - setup { - run("BWA_INDEX") { - script "../../index/main.nf" - process { - """ - input[0] = [ - [id: 'test'], - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) - ] - """ - } - } - } - when { process { """ @@ -244,7 +192,8 @@ nextflow_process { process.out.csi, process.out.crai, process.out.versions, - file(process.out.bam[0][1]).name + bam(process.out.bam[0][1]).getHeaderMD5(), + bam(process.out.bam[0][1]).getReadsMD5() ).match() } ) @@ -253,20 +202,9 @@ nextflow_process { } test("Single-end - stub") { + options "-stub" - setup { - run("BWA_INDEX") { - script "../../index/main.nf" - process { - """ - input[0] = [ - [id: 'test'], - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) - ] - """ - } - } - } + when { process { """ @@ -286,30 +224,15 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot( - file(process.out.bam[0][1]).name, - file(process.out.csi[0][1]).name, - process.out.versions - ).match() } + { assert snapshot(process.out).match() } ) } } test("Paired-end - stub") { + options "-stub" - setup { - run("BWA_INDEX") { - script "../../index/main.nf" - process { - """ - input[0] = [ - [id: 'test'], - file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) - ] - """ - } - } - } + when { process { """ @@ -330,11 +253,7 @@ nextflow_process { then { assertAll( { assert 
process.success }, - { assert snapshot( - file(process.out.bam[0][1]).name, - file(process.out.csi[0][1]).name, - process.out.versions - ).match() } + { assert snapshot(process.out).match() } ) } } diff --git a/modules/nf-core/bwa/mem/tests/main.nf.test.snap b/modules/nf-core/bwa/mem/tests/main.nf.test.snap index 038ee7b7a2..2079ea2240 100644 --- a/modules/nf-core/bwa/mem/tests/main.nf.test.snap +++ b/modules/nf-core/bwa/mem/tests/main.nf.test.snap @@ -13,13 +13,14 @@ [ "versions.yml:md5,478b816fbd37871f5e8c617833d51d80" ], - "test.bam" + "b6d9cb250261a4c125413c5d867d87a7", + "798439cbd7fd81cbcc5078022dc5479d" ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-20T08:44:32.953673185" + "timestamp": "2024-08-02T12:22:28.051598" }, "Single-End Sort": { "content": [ @@ -35,13 +36,14 @@ [ "versions.yml:md5,478b816fbd37871f5e8c617833d51d80" ], - "test.bam" + "848434ae4b79cfdcb2281c60b33663ce", + "94fcf617f5b994584c4e8d4044e16b4f" ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-20T08:44:45.27066093" + "timestamp": "2024-08-02T12:22:39.671154" }, "Paired-End": { "content": [ @@ -57,13 +59,14 @@ [ "versions.yml:md5,478b816fbd37871f5e8c617833d51d80" ], - "test.bam" + "5b34d31be84478761f789e3e2e805e31", + "57aeef88ed701a8ebc8e2f0a381b2a6" ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-20T08:44:57.706852274" + "timestamp": "2024-08-02T12:22:51.919479" }, "Paired-End Sort": { "content": [ @@ -79,27 +82,91 @@ [ "versions.yml:md5,478b816fbd37871f5e8c617833d51d80" ], - "test.bam" + "69003376d9a8952622d8587b39c3eaae", + "af8628d9df18b2d3d4f6fd47ef2bb872" ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-20T08:45:10.376505036" + "timestamp": 
"2024-08-02T12:23:00.833562" }, "Single-end - stub": { "content": [ - "test.bam", - "test.csi", - [ - "versions.yml:md5,478b816fbd37871f5e8c617833d51d80" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test", + "single_end": true + }, + "test.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": true + }, + "test.crai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + "versions.yml:md5,478b816fbd37871f5e8c617833d51d80" + ], + "bam": [ + [ + { + "id": "test", + "single_end": true + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "crai": [ + [ + { + "id": "test", + "single_end": true + }, + "test.crai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "cram": [ + + ], + "csi": [ + [ + { + "id": "test", + "single_end": true + }, + "test.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,478b816fbd37871f5e8c617833d51d80" + ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-20T08:46:07.182072398" + "timestamp": "2024-08-02T12:31:29.46282" }, "Paired-End - no fasta": { "content": [ @@ -115,26 +182,90 @@ [ "versions.yml:md5,478b816fbd37871f5e8c617833d51d80" ], - "test.bam" + "5b34d31be84478761f789e3e2e805e31", + "57aeef88ed701a8ebc8e2f0a381b2a6" ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-20T08:45:53.813076501" + "timestamp": "2024-08-02T12:23:09.942545" }, "Paired-end - stub": { "content": [ - "test.bam", - "test.csi", - [ - "versions.yml:md5,478b816fbd37871f5e8c617833d51d80" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + 
"test.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.crai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + "versions.yml:md5,478b816fbd37871f5e8c617833d51d80" + ], + "bam": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "crai": [ + [ + { + "id": "test", + "single_end": false + }, + "test.crai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "cram": [ + + ], + "csi": [ + [ + { + "id": "test", + "single_end": false + }, + "test.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,478b816fbd37871f5e8c617833d51d80" + ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-20T08:46:18.412916364" + "timestamp": "2024-08-02T12:31:37.757037" } } \ No newline at end of file diff --git a/modules/nf-core/bwamem2/index/environment.yml b/modules/nf-core/bwamem2/index/environment.yml index 26b439172a..15cee23876 100644 --- a/modules/nf-core/bwamem2/index/environment.yml +++ b/modules/nf-core/bwamem2/index/environment.yml @@ -1,7 +1,5 @@ -name: bwamem2_index channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::bwa-mem2=2.2.1 diff --git a/modules/nf-core/bwamem2/index/meta.yml b/modules/nf-core/bwamem2/index/meta.yml index c14a109252..74f54ef0d8 100644 --- a/modules/nf-core/bwamem2/index/meta.yml +++ b/modules/nf-core/bwamem2/index/meta.yml @@ -13,29 +13,32 @@ tools: homepage: https://github.com/bwa-mem2/bwa-mem2 documentation: https://github.com/bwa-mem2/bwa-mem2#usage licence: ["MIT"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - fasta: - type: file - description: Input genome fasta file + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - fasta: + type: file + description: Input genome fasta file output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - index: - type: file - description: BWA genome index files - pattern: "*.{0123,amb,ann,bwt.2bit.64,pac}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bwamem2: + type: file + description: BWA genome index files + pattern: "*.{0123,amb,ann,bwt.2bit.64,pac}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@maxulysse" maintainers: diff --git a/modules/nf-core/bwamem2/mem/environment.yml b/modules/nf-core/bwamem2/mem/environment.yml index cbf06d3993..7e0b5a3479 100644 --- a/modules/nf-core/bwamem2/mem/environment.yml +++ b/modules/nf-core/bwamem2/mem/environment.yml @@ -1,10 +1,8 @@ -name: bwamem2_mem channels: - conda-forge - bioconda - - defaults + dependencies: - bwa-mem2=2.2.1 - # renovate: datasource=conda depName=bioconda/samtools - - samtools=1.19.2 - htslib=1.19.1 + - samtools=1.19.2 diff --git a/modules/nf-core/bwamem2/mem/meta.yml b/modules/nf-core/bwamem2/mem/meta.yml index 931f712943..c6333ca171 100644 --- a/modules/nf-core/bwamem2/mem/meta.yml +++ b/modules/nf-core/bwamem2/mem/meta.yml @@ -17,69 +17,96 @@ tools: documentation: http://www.htslib.org/doc/samtools.html arxiv: arXiv:1303.3997 licence: ["MIT"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - reads: - type: file - description: | - List of input FastQ files of size 1 and 2 for single-end and paired-end data, - respectively. - - meta2: - type: map - description: | - Groovy Map containing reference/index information - e.g. 
[ id:'test' ] - - index: - type: file - description: BWA genome index files - pattern: "Directory containing BWA index *.{0132,amb,ann,bwt.2bit.64,pac}" - - meta3: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fasta: - type: file - description: Reference genome in FASTA format - pattern: "*.{fa,fasta,fna}" - - sort_bam: - type: boolean - description: use samtools sort (true) or samtools view (false) - pattern: "true or false" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - reads: + type: file + description: | + List of input FastQ files of size 1 and 2 for single-end and paired-end data, + respectively. + - - meta2: + type: map + description: | + Groovy Map containing reference/index information + e.g. [ id:'test' ] + - index: + type: file + description: BWA genome index files + pattern: "Directory containing BWA index *.{0132,amb,ann,bwt.2bit.64,pac}" + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fasta: + type: file + description: Reference genome in FASTA format + pattern: "*.{fa,fasta,fna}" + - - sort_bam: + type: boolean + description: use samtools sort (true) or samtools view (false) + pattern: "true or false" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - sam: - type: file - description: Output SAM file containing read alignments - pattern: "*.{sam}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.sam": + type: file + description: Output SAM file containing read alignments + pattern: "*.{sam}" - bam: - type: file - description: Output BAM file containing read alignments - pattern: "*.{bam}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.bam": + type: file + description: Output BAM file containing read alignments + pattern: "*.{bam}" - cram: - type: file - description: Output CRAM file containing read alignments - pattern: "*.{cram}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.cram": + type: file + description: Output CRAM file containing read alignments + pattern: "*.{cram}" - crai: - type: file - description: Index file for CRAM file - pattern: "*.{crai}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.crai": + type: file + description: Index file for CRAM file + pattern: "*.{crai}" - csi: - type: file - description: Index file for BAM file - pattern: "*.{csi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.csi": + type: file + description: Index file for BAM file + pattern: "*.{csi}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@maxulysse" - "@matthdsm" diff --git a/modules/nf-core/bwamem2/mem/tests/main.nf.test b/modules/nf-core/bwamem2/mem/tests/main.nf.test index 5e67f70b6a..9e0ab14aec 100644 --- a/modules/nf-core/bwamem2/mem/tests/main.nf.test +++ b/modules/nf-core/bwamem2/mem/tests/main.nf.test @@ -10,21 +10,21 @@ nextflow_process { tag "bwamem2/mem" tag "bwamem2/index" - test("sarscov2 - fastq, index, fasta, false") { - - setup { - run("BWAMEM2_INDEX") { - script "../../index/main.nf" - process { - """ - input[0] = Channel.of([ - [:], // meta map - [file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] - ]) - """ - } + setup { + run("BWAMEM2_INDEX") { + script "../../index/main.nf" + process 
{ + """ + input[0] = Channel.of([ + [:], // meta map + [file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + ]) + """ } } + } + + test("sarscov2 - fastq, index, fasta, false") { when { process { @@ -44,7 +44,8 @@ nextflow_process { assertAll( { assert process.success }, { assert snapshot( - file(process.out.bam[0][1]).name, + bam(process.out.bam[0][1]).getHeaderMD5(), + bam(process.out.bam[0][1]).getReadsMD5(), process.out.versions ).match() } ) @@ -54,20 +55,6 @@ nextflow_process { test("sarscov2 - fastq, index, fasta, true") { - setup { - run("BWAMEM2_INDEX") { - script "../../index/main.nf" - process { - """ - input[0] = Channel.of([ - [:], // meta map - [file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] - ]) - """ - } - } - } - when { process { """ @@ -86,7 +73,8 @@ nextflow_process { assertAll( { assert process.success }, { assert snapshot( - file(process.out.bam[0][1]).name, + bam(process.out.bam[0][1]).getHeaderMD5(), + bam(process.out.bam[0][1]).getReadsMD5(), process.out.versions ).match() } ) @@ -96,20 +84,6 @@ nextflow_process { test("sarscov2 - [fastq1, fastq2], index, fasta, false") { - setup { - run("BWAMEM2_INDEX") { - script "../../index/main.nf" - process { - """ - input[0] = Channel.of([ - [:], // meta map - [file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] - ]) - """ - } - } - } - when { process { """ @@ -131,7 +105,8 @@ nextflow_process { assertAll( { assert process.success }, { assert snapshot( - file(process.out.bam[0][1]).name, + bam(process.out.bam[0][1]).getHeaderMD5(), + bam(process.out.bam[0][1]).getReadsMD5(), process.out.versions ).match() } ) @@ -141,20 +116,6 @@ nextflow_process { test("sarscov2 - [fastq1, fastq2], index, fasta, true") { - setup { - run("BWAMEM2_INDEX") { - script "../../index/main.nf" - process { - """ - input[0] = Channel.of([ - [:], // meta map - 
[file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] - ]) - """ - } - } - } - when { process { """ @@ -176,7 +137,8 @@ nextflow_process { assertAll( { assert process.success }, { assert snapshot( - file(process.out.bam[0][1]).name, + bam(process.out.bam[0][1]).getHeaderMD5(), + bam(process.out.bam[0][1]).getReadsMD5(), process.out.versions ).match() } ) @@ -188,20 +150,6 @@ nextflow_process { options "-stub" - setup { - run("BWAMEM2_INDEX") { - script "../../index/main.nf" - process { - """ - input[0] = Channel.of([ - [:], // meta map - [file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] - ]) - """ - } - } - } - when { process { """ @@ -222,10 +170,7 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot( - file(process.out.bam[0][1]).name, - process.out.versions - ).match() } + { assert snapshot(process.out).match() } ) } diff --git a/modules/nf-core/bwamem2/mem/tests/main.nf.test.snap b/modules/nf-core/bwamem2/mem/tests/main.nf.test.snap index 9fb1e69d07..69bc3612bf 100644 --- a/modules/nf-core/bwamem2/mem/tests/main.nf.test.snap +++ b/modules/nf-core/bwamem2/mem/tests/main.nf.test.snap @@ -1,67 +1,129 @@ { "sarscov2 - [fastq1, fastq2], index, fasta, false": { "content": [ - "test.bam", + "eefa0f44493fd0504e734efd2f1f4a9e", + "57aeef88ed701a8ebc8e2f0a381b2a6", [ "versions.yml:md5,1c1a9566f189ec077b5179bbf453c51a" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-03-19T13:13:18.890289958" + "timestamp": "2024-08-02T12:23:37.929675" }, "sarscov2 - [fastq1, fastq2], index, fasta, true - stub": { "content": [ - "test.bam", - [ - "versions.yml:md5,1c1a9566f189ec077b5179bbf453c51a" - ] + { + "0": [ + + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + + ], + "3": [ + + ], 
+ "4": [ + [ + { + "id": "test", + "single_end": false + }, + "test.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "5": [ + "versions.yml:md5,1c1a9566f189ec077b5179bbf453c51a" + ], + "bam": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "crai": [ + + ], + "cram": [ + + ], + "csi": [ + [ + { + "id": "test", + "single_end": false + }, + "test.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "sam": [ + + ], + "versions": [ + "versions.yml:md5,1c1a9566f189ec077b5179bbf453c51a" + ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-03-19T13:45:51.821633029" + "timestamp": "2024-08-02T12:12:06.693567" }, "sarscov2 - [fastq1, fastq2], index, fasta, true": { "content": [ - "test.bam", + "7aba324f82d5b4e926a5dd7b46029cb4", + "af8628d9df18b2d3d4f6fd47ef2bb872", [ "versions.yml:md5,1c1a9566f189ec077b5179bbf453c51a" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-03-19T13:13:36.458291078" + "timestamp": "2024-08-02T12:23:53.488374" }, "sarscov2 - fastq, index, fasta, false": { "content": [ - "test.bam", + "bc02b41697b3a8f1021b02becec24052", + "798439cbd7fd81cbcc5078022dc5479d", [ "versions.yml:md5,1c1a9566f189ec077b5179bbf453c51a" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-03-19T13:12:44.084654507" + "timestamp": "2024-08-02T12:23:05.644682" }, "sarscov2 - fastq, index, fasta, true": { "content": [ - "test.bam", + "e41d67320815d29ba5f6e9d1ae21902a", + "94fcf617f5b994584c4e8d4044e16b4f", [ "versions.yml:md5,1c1a9566f189ec077b5179bbf453c51a" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-03-19T13:13:01.763341681" + "timestamp": "2024-08-02T12:23:21.837763" } } \ No newline 
at end of file diff --git a/modules/nf-core/cat/cat/environment.yml b/modules/nf-core/cat/cat/environment.yml index 17a04ef232..9b01c865a2 100644 --- a/modules/nf-core/cat/cat/environment.yml +++ b/modules/nf-core/cat/cat/environment.yml @@ -1,7 +1,5 @@ -name: cat_cat channels: - conda-forge - bioconda - - defaults dependencies: - conda-forge::pigz=2.3.4 diff --git a/modules/nf-core/cat/cat/main.nf b/modules/nf-core/cat/cat/main.nf index adbdbd7ba6..2862c64cd9 100644 --- a/modules/nf-core/cat/cat/main.nf +++ b/modules/nf-core/cat/cat/main.nf @@ -76,4 +76,3 @@ def getFileSuffix(filename) { def match = filename =~ /^.*?((\.\w{1,5})?(\.\w{1,5}\.gz$))/ return match ? match[0][1] : filename.substring(filename.lastIndexOf('.')) } - diff --git a/modules/nf-core/cat/cat/meta.yml b/modules/nf-core/cat/cat/meta.yml index 00a8db0bca..81778a0671 100644 --- a/modules/nf-core/cat/cat/meta.yml +++ b/modules/nf-core/cat/cat/meta.yml @@ -9,25 +9,32 @@ tools: description: Just concatenation documentation: https://man7.org/linux/man-pages/man1/cat.1.html licence: ["GPL-3.0-or-later"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - files_in: - type: file - description: List of compressed / uncompressed files - pattern: "*" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - files_in: + type: file + description: List of compressed / uncompressed files + pattern: "*" output: - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - file_out: - type: file - description: Concatenated file. Will be gzipped if file_out ends with ".gz" - pattern: "${file_out}" + - meta: + type: file + description: Concatenated file. Will be gzipped if file_out ends with ".gz" + pattern: "${file_out}" + - ${prefix}: + type: file + description: Concatenated file. 
Will be gzipped if file_out ends with ".gz" + pattern: "${file_out}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@erikrikarddaniel" - "@FriederikeHanssen" diff --git a/modules/nf-core/cat/cat/tests/main.nf.test b/modules/nf-core/cat/cat/tests/main.nf.test index fcee2d19f2..9cb1617883 100644 --- a/modules/nf-core/cat/cat/tests/main.nf.test +++ b/modules/nf-core/cat/cat/tests/main.nf.test @@ -29,7 +29,8 @@ nextflow_process { then { assertAll( { assert !process.success }, - { assert process.stdout.toString().contains("The name of the input file can't be the same as for the output prefix") } + { assert process.stdout.toString().contains("The name of the input file can't be the same as for the output prefix") }, + { assert snapshot(process.out.versions).match() } ) } } @@ -83,8 +84,12 @@ nextflow_process { def lines = path(process.out.file_out.get(0).get(1)).linesGzip assertAll( { assert process.success }, - { assert snapshot(lines[0..5]).match("test_cat_zipped_zipped_lines") }, - { assert snapshot(lines.size()).match("test_cat_zipped_zipped_size")} + { assert snapshot( + lines[0..5], + lines.size(), + process.out.versions + ).match() + } ) } } @@ -142,8 +147,12 @@ nextflow_process { def lines = path(process.out.file_out.get(0).get(1)).linesGzip assertAll( { assert process.success }, - { assert snapshot(lines[0..5]).match("test_cat_unzipped_zipped_lines") }, - { assert snapshot(lines.size()).match("test_cat_unzipped_zipped_size")} + { assert snapshot( + lines[0..5], + lines.size(), + process.out.versions + ).match() + } ) } } @@ -170,8 +179,12 @@ nextflow_process { def lines = path(process.out.file_out.get(0).get(1)).linesGzip assertAll( { assert process.success }, - { assert snapshot(lines[0..5]).match("test_cat_one_file_unzipped_zipped_lines") }, - { assert snapshot(lines.size()).match("test_cat_one_file_unzipped_zipped_size")} + { assert snapshot( + lines[0..5], + lines.size(), + 
process.out.versions + ).match() + } ) } } diff --git a/modules/nf-core/cat/cat/tests/main.nf.test.snap b/modules/nf-core/cat/cat/tests/main.nf.test.snap index 423571ba27..b7623ee650 100644 --- a/modules/nf-core/cat/cat/tests/main.nf.test.snap +++ b/modules/nf-core/cat/cat/tests/main.nf.test.snap @@ -1,10 +1,4 @@ { - "test_cat_unzipped_zipped_size": { - "content": [ - 375 - ], - "timestamp": "2023-10-16T14:33:08.049445686" - }, "test_cat_unzipped_unzipped": { "content": [ { @@ -34,6 +28,10 @@ ] } ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, "timestamp": "2023-10-16T14:32:18.500464399" }, "test_cat_zipped_unzipped": { @@ -65,9 +63,13 @@ ] } ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, "timestamp": "2023-10-16T14:32:49.642741302" }, - "test_cat_zipped_zipped_lines": { + "test_cat_zipped_zipped": { "content": [ [ "MT192765.1\tGenbank\ttranscript\t259\t29667\t.\t+\t.\tID=unknown_transcript_1;geneID=orf1ab;gene_name=orf1ab", @@ -76,11 +78,31 @@ "MT192765.1\tGenbank\tCDS\t13461\t21548\t.\t+\t0\tParent=unknown_transcript_1;exception=\"ribosomal slippage\";gbkey=CDS;gene=orf1ab;note=\"pp1ab;translated=by -1 ribosomal frameshift\";product=\"orf1ab polyprotein\";protein_id=QIK50426.1", "MT192765.1\tGenbank\tCDS\t21556\t25377\t.\t+\t0\tParent=unknown_transcript_1;gbkey=CDS;gene=S;note=\"structural protein\";product=\"surface glycoprotein\";protein_id=QIK50427.1", "MT192765.1\tGenbank\tgene\t21556\t25377\t.\t+\t.\tParent=unknown_transcript_1" + ], + 78, + [ + "versions.yml:md5,115ed6177ebcff24eb99d503fa5ef894" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:51:46.802978" + }, + "test_cat_name_conflict": { + "content": [ + [ + ] ], - "timestamp": "2023-10-16T14:32:33.629048645" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:51:29.45394" }, - "test_cat_unzipped_zipped_lines": { + "test_cat_one_file_unzipped_zipped": { "content": [ [ 
">MT192765.1 Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/PC00101P/2020, complete genome", @@ -89,11 +111,19 @@ "TAACTCGTCTATCTTCTGCAGGCTGCTTACGGTTTCGTCCGTGTTGCAGCCGATCATCAGCACATCTAGGTTTTGTCCGG", "GTGTGACCGAAAGGTAAGATGGAGAGCCTTGTCCCTGGTTTCAACGAGAAAACACACGTCCAACTCAGTTTGCCTGTTTT", "ACAGGTTCGCGACGTGCTCGTACGTGGCTTTGGAGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCTTAAAG" + ], + 374, + [ + "versions.yml:md5,115ed6177ebcff24eb99d503fa5ef894" ] ], - "timestamp": "2023-10-16T14:33:08.038830506" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:52:02.774016" }, - "test_cat_one_file_unzipped_zipped_lines": { + "test_cat_unzipped_zipped": { "content": [ [ ">MT192765.1 Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/PC00101P/2020, complete genome", @@ -102,20 +132,16 @@ "TAACTCGTCTATCTTCTGCAGGCTGCTTACGGTTTCGTCCGTGTTGCAGCCGATCATCAGCACATCTAGGTTTTGTCCGG", "GTGTGACCGAAAGGTAAGATGGAGAGCCTTGTCCCTGGTTTCAACGAGAAAACACACGTCCAACTCAGTTTGCCTGTTTT", "ACAGGTTCGCGACGTGCTCGTACGTGGCTTTGGAGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCTTAAAG" + ], + 375, + [ + "versions.yml:md5,115ed6177ebcff24eb99d503fa5ef894" ] ], - "timestamp": "2023-10-16T14:33:21.39642399" - }, - "test_cat_zipped_zipped_size": { - "content": [ - 78 - ], - "timestamp": "2023-10-16T14:32:33.641869244" - }, - "test_cat_one_file_unzipped_zipped_size": { - "content": [ - 374 - ], - "timestamp": "2023-10-16T14:33:21.4094373" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:51:57.581523" } } \ No newline at end of file diff --git a/modules/nf-core/cat/fastq/environment.yml b/modules/nf-core/cat/fastq/environment.yml index 8c69b121f7..71e04c3d71 100644 --- a/modules/nf-core/cat/fastq/environment.yml +++ b/modules/nf-core/cat/fastq/environment.yml @@ -1,7 +1,5 @@ -name: cat_fastq channels: - conda-forge - bioconda - - defaults dependencies: - - conda-forge::coreutils=8.30 + - conda-forge::coreutils=9.5 
diff --git a/modules/nf-core/cat/fastq/main.nf b/modules/nf-core/cat/fastq/main.nf index f132b2adc1..4364a389b7 100644 --- a/modules/nf-core/cat/fastq/main.nf +++ b/modules/nf-core/cat/fastq/main.nf @@ -4,8 +4,8 @@ process CAT_FASTQ { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/ubuntu:20.04' : - 'nf-core/ubuntu:20.04' }" + 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/c2/c262fc09eca59edb5a724080eeceb00fb06396f510aefb229c2d2c6897e63975/data' : + 'community.wave.seqera.io/library/coreutils:9.5--ae99c88a9b28c264' }" input: tuple val(meta), path(reads, stageAs: "input*/*") @@ -53,9 +53,9 @@ process CAT_FASTQ { def prefix = task.ext.prefix ?: "${meta.id}" def readList = reads instanceof List ? reads.collect{ it.toString() } : [reads.toString()] if (meta.single_end) { - if (readList.size > 1) { + if (readList.size >= 1) { """ - touch ${prefix}.merged.fastq.gz + echo '' | gzip > ${prefix}.merged.fastq.gz cat <<-END_VERSIONS > versions.yml "${task.process}": @@ -64,10 +64,10 @@ process CAT_FASTQ { """ } } else { - if (readList.size > 2) { + if (readList.size >= 2) { """ - touch ${prefix}_1.merged.fastq.gz - touch ${prefix}_2.merged.fastq.gz + echo '' | gzip > ${prefix}_1.merged.fastq.gz + echo '' | gzip > ${prefix}_2.merged.fastq.gz cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/cat/fastq/meta.yml b/modules/nf-core/cat/fastq/meta.yml index db4ac3c79a..91ff2fb5f6 100644 --- a/modules/nf-core/cat/fastq/meta.yml +++ b/modules/nf-core/cat/fastq/meta.yml @@ -10,30 +10,33 @@ tools: The cat utility reads files sequentially, writing them to the standard output. 
documentation: https://www.gnu.org/software/coreutils/manual/html_node/cat-invocation.html licence: ["GPL-3.0-or-later"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - reads: - type: file - description: | - List of input FastQ files to be concatenated. + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - reads: + type: file + description: | + List of input FastQ files to be concatenated. output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - reads: - type: file - description: Merged fastq file - pattern: "*.{merged.fastq.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.merged.fastq.gz": + type: file + description: Merged fastq file + pattern: "*.{merged.fastq.gz}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/cat/fastq/tests/main.nf.test b/modules/nf-core/cat/fastq/tests/main.nf.test index a71dcb8dfa..f88a78b6ca 100644 --- a/modules/nf-core/cat/fastq/tests/main.nf.test +++ b/modules/nf-core/cat/fastq/tests/main.nf.test @@ -13,9 +13,6 @@ nextflow_process { test("test_cat_fastq_single_end") { when { - params { - outdir = "$outputDir" - } process { """ input[0] = Channel.of([ @@ -38,9 +35,6 @@ nextflow_process { test("test_cat_fastq_paired_end") { when { - params { - outdir = "$outputDir" - } process { """ input[0] = Channel.of([ @@ -65,9 +59,6 @@ nextflow_process { test("test_cat_fastq_single_end_same_name") { when { - params { - outdir = "$outputDir" - } process { """ input[0] = Channel.of([ @@ -90,9 +81,6 @@ 
nextflow_process { test("test_cat_fastq_paired_end_same_name") { when { - params { - outdir = "$outputDir" - } process { """ input[0] = Channel.of([ @@ -117,9 +105,129 @@ nextflow_process { test("test_cat_fastq_single_end_single_file") { when { - params { - outdir = "$outputDir" + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:true ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true)] + ]) + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test_cat_fastq_single_end - stub") { + + options "-stub" + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:true ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true)] + ]) + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test_cat_fastq_paired_end - stub") { + + options "-stub" + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test2_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test2_2.fastq.gz', checkIfExists: true)] + ]) + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test_cat_fastq_single_end_same_name - stub") { + + options "-stub" + + 
when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:true ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true)] + ]) + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test_cat_fastq_paired_end_same_name - stub") { + + options "-stub" + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true)] + ]) + """ } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test_cat_fastq_single_end_single_file - stub") { + + options "-stub" + + when { process { """ input[0] = Channel.of([ diff --git a/modules/nf-core/cat/fastq/tests/main.nf.test.snap b/modules/nf-core/cat/fastq/tests/main.nf.test.snap index 43dfe28fc7..f8689a1ce5 100644 --- a/modules/nf-core/cat/fastq/tests/main.nf.test.snap +++ b/modules/nf-core/cat/fastq/tests/main.nf.test.snap @@ -12,7 +12,7 @@ ] ], "1": [ - "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" ], "reads": [ [ @@ -24,11 +24,15 @@ ] ], "versions": [ - "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" ] } ], - "timestamp": "2024-01-17T17:30:39.816981" + "meta": { + 
"nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-19T20:02:07.519211144" }, "test_cat_fastq_single_end_same_name": { "content": [ @@ -43,7 +47,7 @@ ] ], "1": [ - "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" ], "reads": [ [ @@ -55,11 +59,15 @@ ] ], "versions": [ - "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" ] } ], - "timestamp": "2024-01-17T17:32:35.229332" + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-19T20:02:31.618628921" }, "test_cat_fastq_single_end_single_file": { "content": [ @@ -74,7 +82,7 @@ ] ], "1": [ - "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" ], "reads": [ [ @@ -86,11 +94,15 @@ ] ], "versions": [ - "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" ] } ], - "timestamp": "2024-01-17T17:34:00.058829" + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-19T20:02:57.904149581" }, "test_cat_fastq_paired_end_same_name": { "content": [ @@ -108,7 +120,7 @@ ] ], "1": [ - "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" ], "reads": [ [ @@ -123,11 +135,126 @@ ] ], "versions": [ - "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" ] } ], - "timestamp": "2024-01-17T17:33:33.031555" + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-19T20:02:44.577183829" + }, + "test_cat_fastq_single_end - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" + ], + "reads": [ + [ + { + "id": "test", + "single_end": true + }, + 
"test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-19T20:03:10.603734777" + }, + "test_cat_fastq_paired_end_same_name - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "versions": [ + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-19T20:03:46.041808828" + }, + "test_cat_fastq_single_end_same_name - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" + ], + "reads": [ + [ + { + "id": "test", + "single_end": true + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-19T20:03:34.13865402" }, "test_cat_fastq_paired_end": { "content": [ @@ -145,7 +272,7 @@ ] ], "1": [ - "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" ], "reads": [ [ @@ -160,10 +287,90 @@ ] ], "versions": [ - "versions.yml:md5,d42d6e24d67004608495883e00bd501b" + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" + ] + } + ], + "meta": { + 
"nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-19T20:02:19.64383573" + }, + "test_cat_fastq_paired_end - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "versions": [ + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-19T20:03:22.597246066" + }, + "test_cat_fastq_single_end_single_file - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" + ], + "reads": [ + [ + { + "id": "test", + "single_end": true + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,6ef4fd28546a005865b9454bbedbf81a" ] } ], - "timestamp": "2024-01-17T17:32:02.270935" + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-19T20:03:58.44849001" } } \ No newline at end of file diff --git a/modules/nf-core/cnvkit/antitarget/environment.yml b/modules/nf-core/cnvkit/antitarget/environment.yml index a33a12e23c..b683406cc5 100644 --- a/modules/nf-core/cnvkit/antitarget/environment.yml +++ b/modules/nf-core/cnvkit/antitarget/environment.yml @@ -1,7 +1,5 @@ -name: cnvkit_antitarget channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::cnvkit=0.9.11 diff --git a/modules/nf-core/cnvkit/antitarget/meta.yml b/modules/nf-core/cnvkit/antitarget/meta.yml 
index d879092d33..13f12a10c0 100644 --- a/modules/nf-core/cnvkit/antitarget/meta.yml +++ b/modules/nf-core/cnvkit/antitarget/meta.yml @@ -15,30 +15,33 @@ tools: tool_dev_url: "https://github.com/etal/cnvkit" doi: 10.1371/journal.pcbi.1004873 licence: ["Apache-2.0"] + identifier: biotools:cnvkit input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - targets: - type: file - description: File containing genomic regions - pattern: "*.{bed}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - targets: + type: file + description: File containing genomic regions + pattern: "*.{bed}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - bed: - type: file - description: File containing off-target regions - pattern: "*.{bed}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.bed": + type: file + description: File containing off-target regions + pattern: "*.{bed}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@adamrtalbot" - "@priesgo" diff --git a/modules/nf-core/cnvkit/antitarget/tests/main.nf.test b/modules/nf-core/cnvkit/antitarget/tests/main.nf.test index 558abb6768..84f3818007 100644 --- a/modules/nf-core/cnvkit/antitarget/tests/main.nf.test +++ b/modules/nf-core/cnvkit/antitarget/tests/main.nf.test @@ -25,8 +25,7 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot(process.out.versions).match("version") }, - { assert snapshot(file(process.out.bed.get(0).get(1)).readLines()[0..5]).match() } + { assert snapshot(process.out).match() } ) } diff --git a/modules/nf-core/cnvkit/antitarget/tests/main.nf.test.snap b/modules/nf-core/cnvkit/antitarget/tests/main.nf.test.snap index f793a5275f..57bccf9f99 100644 --- a/modules/nf-core/cnvkit/antitarget/tests/main.nf.test.snap +++ b/modules/nf-core/cnvkit/antitarget/tests/main.nf.test.snap @@ -34,31 +34,35 @@ }, "human - bed": { "content": [ - [ - "chr21\t23354500\t23509999\tAntitarget", - "chr21\t23509999\t23665499\tAntitarget", - "chr21\t23665499\t23820999\tAntitarget", - "chr21\t23820999\t23976499\tAntitarget", - "chr21\t23976499\t24131999\tAntitarget", - "chr21\t24911498\t25066997\tAntitarget" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.0" - }, - "timestamp": "2024-05-13T13:58:33.088665" - }, - "version": { - "content": [ - [ - "versions.yml:md5,b5e73ea85743cedc68ca6ef8006e5030" - ] + { + "0": [ + [ + { + "id": "test" + }, + "test.antitarget.bed:md5,3d4d20f9f23b39970865d29ef239d20b" + ] + ], + "1": [ + "versions.yml:md5,b5e73ea85743cedc68ca6ef8006e5030" + ], + "bed": [ + [ + { + "id": "test" + }, + 
"test.antitarget.bed:md5,3d4d20f9f23b39970865d29ef239d20b" + ] + ], + "versions": [ + "versions.yml:md5,b5e73ea85743cedc68ca6ef8006e5030" + ] + } ], "meta": { "nf-test": "0.8.4", "nextflow": "23.10.0" }, - "timestamp": "2024-05-13T13:58:33.070317" + "timestamp": "2024-06-10T10:32:33.936192" } } \ No newline at end of file diff --git a/modules/nf-core/cnvkit/batch/environment.yml b/modules/nf-core/cnvkit/batch/environment.yml index 10c5d6b753..5d79360119 100644 --- a/modules/nf-core/cnvkit/batch/environment.yml +++ b/modules/nf-core/cnvkit/batch/environment.yml @@ -1,11 +1,8 @@ -name: cnvkit_batch - channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::cnvkit=0.9.10 - - bioconda::htslib=1.19.1 - - bioconda::samtools=1.19.2 + - bioconda::htslib=1.17 # Matched with the container + - bioconda::samtools=1.17 # Matched with the container diff --git a/modules/nf-core/cnvkit/batch/meta.yml b/modules/nf-core/cnvkit/batch/meta.yml index f14efe553c..30f7a1a29b 100644 --- a/modules/nf-core/cnvkit/batch/meta.yml +++ b/modules/nf-core/cnvkit/batch/meta.yml @@ -12,94 +12,127 @@ tools: homepage: https://cnvkit.readthedocs.io/en/stable/index.html documentation: https://cnvkit.readthedocs.io/en/stable/index.html licence: ["Apache-2.0"] + identifier: biotools:cnvkit input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - tumor: - type: file - description: | - Input tumour sample bam file (or cram) - - normal: - type: file - description: | - Input normal sample bam file (or cram) - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'test' ] - - fasta: - type: file - description: | - Input reference genome fasta file (only needed for cram_input and/or when normal_samples are provided) - - meta3: - type: map - description: | - Groovy Map containing reference information - e.g. 
[ id:'test' ] - - fasta_fai: - type: file - description: | - Input reference genome fasta index (optional, but recommended for cram_input) - - meta4: - type: map - description: | - Groovy Map containing information about target file - e.g. [ id:'test' ] - - targets: - type: file - description: | - Input target bed file - - meta5: - type: map - description: | - Groovy Map containing information about reference file - e.g. [ id:'test' ] - - reference: - type: file - description: | - Input reference cnn-file (only for germline and tumor-only running) - - panel_of_normals: - type: file - description: | - Input panel of normals file + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - tumor: + type: file + description: | + Input tumour sample bam file (or cram) + - normal: + type: file + description: | + Input normal sample bam file (or cram) + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test' ] + - fasta: + type: file + description: | + Input reference genome fasta file (only needed for cram_input and/or when normal_samples are provided) + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test' ] + - fasta_fai: + type: file + description: | + Input reference genome fasta index (optional, but recommended for cram_input) + - - meta4: + type: map + description: | + Groovy Map containing information about target file + e.g. [ id:'test' ] + - targets: + type: file + description: | + Input target bed file + - - meta5: + type: map + description: | + Groovy Map containing information about reference file + e.g. 
[ id:'test' ] + - reference: + type: file + description: | + Input reference cnn-file (only for germline and tumor-only running) + - - panel_of_normals: + type: file + description: | + Input panel of normals file output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - bed: - type: file - description: File containing genomic regions - pattern: "*.{bed}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bed": + type: file + description: File containing genomic regions + pattern: "*.{bed}" - cnn: - type: file - description: File containing coverage information - pattern: "*.{cnn}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.cnn": + type: file + description: File containing coverage information + pattern: "*.{cnn}" - cnr: - type: file - description: File containing copy number ratio information - pattern: "*.{cnr}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.cnr": + type: file + description: File containing copy number ratio information + pattern: "*.{cnr}" - cns: - type: file - description: File containing copy number segment information - pattern: "*.{cns}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.cns": + type: file + description: File containing copy number segment information + pattern: "*.{cns}" - pdf: - type: file - description: File with plot of copy numbers or segments on chromosomes - pattern: "*.{pdf}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.pdf": + type: file + description: File with plot of copy numbers or segments on chromosomes + pattern: "*.{pdf}" - png: - type: file - description: File with plot of bin-level log2 coverages and segmentation calls - pattern: "*.{png}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.png": + type: file + description: File with plot of bin-level log2 coverages and segmentation calls + pattern: "*.{png}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@adamrtalbot" - "@drpatelh" diff --git a/modules/nf-core/cnvkit/batch/tests/main.nf.test b/modules/nf-core/cnvkit/batch/tests/main.nf.test index b2c0a9b720..f191a4b952 100644 --- a/modules/nf-core/cnvkit/batch/tests/main.nf.test +++ b/modules/nf-core/cnvkit/batch/tests/main.nf.test @@ -18,12 +18,12 @@ nextflow_process { """ input[0] = [ [ id:'test' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_single_end_sorted_bam'], checkIfExists: true) - ] - input[1] = [[:],file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true)] + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.single_end.sorted.bam', checkIfExists: true) + ] + input[1] = [[:],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] input[2] = [[:],[]] - input[3] = [[:],file(params.test_data['sarscov2']['genome']['baits_bed'], checkIfExists: true)] + input[3] = [[:],file(params.modules_testdata_base_path + 
'genomics/sarscov2/genome/bed/baits.bed', checkIfExists: true)] input[4] = [[:],[]] input[5] = false """ @@ -34,7 +34,7 @@ nextflow_process { println process.out.bed[0][1] assertAll( { assert process.success }, - { assert snapshot(process.out.version).match() } + { assert snapshot(process.out.versions).match() } ) } @@ -49,10 +49,10 @@ nextflow_process { """ input[0] = [ [ id:'test'], // meta map - file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true) - ] - input[1] = [[:],file(params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true)] + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true) + ] + input[1] = [[:],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] input[2] = [[:],[]] input[3] = [[:],[]] input[4] = [[:],[]] @@ -64,7 +64,7 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot(process.out.version).match() } + { assert snapshot(process.out.versions).match() } ) } @@ -79,11 +79,11 @@ nextflow_process { """ input[0] = [ [ id:'test'], // meta map - file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_sorted_cram'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_cram'], checkIfExists: true) - ] - input[1] = [[:],file(params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true)] - input[2] = [[:],file(params.test_data['homo_sapiens']['genome']['genome_fasta_fai'], checkIfExists: true)] + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test2.paired_end.sorted.cram', checkIfExists: 
true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true) + ] + input[1] = [[:],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] + input[2] = [[:],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true)] input[3] = [[:],[]] input[4] = [[:],[]] input[5] = false @@ -94,7 +94,7 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot(process.out.version).match() } + { assert snapshot(process.out.versions).match() } ) } @@ -109,13 +109,13 @@ nextflow_process { """ input[0] = [ [ id:'test'], // meta map - file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_recalibrated_sorted_bam'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam', checkIfExists: true), [] - ] + ] input[1] = [[:],[]] input[2] = [[:],[]] input[3] = [[:],[]] - input[4] = [[:],file(params.test_data['homo_sapiens']['genome']['genome_21_reference_cnn'], checkIfExists: true)] + input[4] = [[:],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/reference_chr21.cnn', checkIfExists: true)] input[5] = false """ } @@ -124,7 +124,7 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot(process.out.version).match() } + { assert snapshot(process.out.versions).match() } ) } @@ -139,13 +139,13 @@ nextflow_process { """ input[0] = [ [ id:'test'], // meta map - file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_recalibrated_sorted_cram'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test2.paired_end.recalibrated.sorted.cram', checkIfExists: true), [] - ] - input[1] = [[:],file(params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: 
true)] + ] + input[1] = [[:],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] input[2] = [[:],[]] input[3] = [[:],[]] - input[4] = [[:],file(params.test_data['homo_sapiens']['genome']['genome_21_reference_cnn'], checkIfExists: true)] + input[4] = [[:],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/reference_chr21.cnn', checkIfExists: true)] input[5] = false """ } @@ -154,7 +154,7 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot(process.out.version).match() } + { assert snapshot(process.out.versions).match() } ) } @@ -168,11 +168,11 @@ nextflow_process { input[0] = [ [ id:'test'], // meta map [], - file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_recalibrated_sorted_cram'], checkIfExists: true) - ] - input[1] = [[:],file(params.test_data['homo_sapiens']['genome']['genome_21_fasta'], checkIfExists: true)] - input[2] = [[:],file(params.test_data['homo_sapiens']['genome']['genome_21_fasta_fai'], checkIfExists: true)] - input[3] = [[:],file(params.test_data['homo_sapiens']['genome']['genome_21_multi_interval_bed'], checkIfExists: true)] + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test2.paired_end.recalibrated.sorted.cram', checkIfExists: true) + ] + input[1] = [[:],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true)] + input[2] = [[:],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai', checkIfExists: true)] + input[3] = [[:],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed', checkIfExists: true)] input[4] = [[:],[]] input[5] = false """ @@ -182,7 +182,7 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot(process.out.version).match() } + { assert 
snapshot(process.out.versions).match() } ) } @@ -195,13 +195,13 @@ nextflow_process { process { """ input[0] = [ - [ id:'test' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_single_end_sorted_bam'], checkIfExists: true) - ] - input[1] = [[:],file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true)] + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.single_end.sorted.bam', checkIfExists: true) + ] + input[1] = [[:],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] input[2] = [[:],[]] - input[3] = [[:],file(params.test_data['sarscov2']['genome']['baits_bed'], checkIfExists: true)] + input[3] = [[:],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/baits.bed', checkIfExists: true)] input[4] = [[:],[]] input[5] = false """ @@ -211,7 +211,7 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot(process.out.version).match() } + { assert snapshot(process.out.versions).match() } ) } @@ -228,11 +228,12 @@ nextflow_process { input[0] = [ [ id:'test'], // meta map [], - [file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_sorted_bam'], checkIfExists: true) + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.sorted.bam', checkIfExists: true) ] - ] - input[1] = [[:],file(params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true)] + 
] + input[1] = [[:],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] input[2] = [[:],[]] input[3] = [[:],[]] input[4] = [[:],[]] @@ -244,7 +245,7 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot(process.out.version).match() } + { assert snapshot(process.out.versions).match() } ) } @@ -260,12 +261,12 @@ nextflow_process { """ input[0] = [ [ id:'test' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_single_end_sorted_bam'], checkIfExists: true) - ] - input[1] = [[:],file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true)] + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.single_end.sorted.bam', checkIfExists: true) + ] + input[1] = [[:],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] input[2] = [[:],[]] - input[3] = [[:],file(params.test_data['sarscov2']['genome']['baits_bed'], checkIfExists: true)] + input[3] = [[:],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/baits.bed', checkIfExists: true)] input[4] = [[:],[]] input[5] = false """ @@ -275,7 +276,11 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot(process.out.bed).match() } + { assert snapshot( + process.out.bed, + process.out.versions + ).match() + } ) } diff --git a/modules/nf-core/cnvkit/batch/tests/main.nf.test.snap b/modules/nf-core/cnvkit/batch/tests/main.nf.test.snap index 5d7cb143e9..205d43f8d0 100644 --- a/modules/nf-core/cnvkit/batch/tests/main.nf.test.snap +++ b/modules/nf-core/cnvkit/batch/tests/main.nf.test.snap @@ -1,19 +1,27 @@ { "cnvkit batch tumouronly mode - bam": { - "content": null, 
+ "content": [ + [ + "versions.yml:md5,5737e02065ca6359586a4078708c73e6" + ] + ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.3" }, - "timestamp": "2024-03-20T14:30:55.150317561" + "timestamp": "2024-08-07T10:07:07.53837" }, "cnvkit batch tumouronly mode - cram": { - "content": null, + "content": [ + [ + "versions.yml:md5,0310a792526148b05f434944a1167835" + ] + ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.3" }, - "timestamp": "2024-03-20T14:31:31.039652656" + "timestamp": "2024-08-07T10:07:48.900117" }, "cnvkit batch - bam - stub": { "content": [ @@ -27,60 +35,87 @@ "baits.target.bed:md5,26d25ff2d6c45b6d92169b3559c6acdb" ] ] + ], + [ + "versions.yml:md5,5737e02065ca6359586a4078708c73e6" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.3" }, - "timestamp": "2024-03-20T14:33:25.790391941" + "timestamp": "2024-08-07T10:09:40.098703" }, "cnvkit batch wgs - bam": { - "content": null, + "content": [ + [ + "versions.yml:md5,5737e02065ca6359586a4078708c73e6" + ] + ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.3" }, - "timestamp": "2024-03-20T14:30:10.677690173" + "timestamp": "2024-08-07T10:06:25.023798" }, "cnvkit batch germline hybrid mode - bam": { - "content": null, + "content": [ + [ + "versions.yml:md5,5737e02065ca6359586a4078708c73e6" + ] + ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.3" }, - "timestamp": "2024-03-20T14:32:50.217076532" + "timestamp": "2024-08-07T10:09:19.191221" }, "cnvkit batch hybrid mode - bam": { - "content": null, + "content": [ + [ + "versions.yml:md5,5737e02065ca6359586a4078708c73e6" + ] + ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.3" }, - "timestamp": "2024-03-20T13:54:41.69602289" + "timestamp": "2024-08-07T10:06:10.438545" }, "cnvkit batch wgs - cram": { - "content": null, + "content": [ + [ + "versions.yml:md5,0310a792526148b05f434944a1167835" + 
] + ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.3" }, - "timestamp": "2024-03-20T14:30:27.271060826" + "timestamp": "2024-08-07T10:06:39.492881" }, "cnvkit batch pon mode - bam": { - "content": null, + "content": [ + [ + "versions.yml:md5,5737e02065ca6359586a4078708c73e6" + ] + ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.3" }, - "timestamp": "2024-03-20T14:33:06.391306794" + "timestamp": "2024-08-07T10:09:29.636924" }, "cnvkit batch germline mode - cram": { - "content": null, + "content": [ + [ + "versions.yml:md5,0310a792526148b05f434944a1167835" + ] + ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.3" }, - "timestamp": "2024-03-20T14:32:23.38326412" + "timestamp": "2024-08-07T10:09:07.307311" } } \ No newline at end of file diff --git a/modules/nf-core/cnvkit/call/environment.yml b/modules/nf-core/cnvkit/call/environment.yml index 3b96de7008..152af54d19 100644 --- a/modules/nf-core/cnvkit/call/environment.yml +++ b/modules/nf-core/cnvkit/call/environment.yml @@ -1,7 +1,5 @@ -name: cnvkit_call channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::cnvkit=0.9.10 diff --git a/modules/nf-core/cnvkit/call/main.nf b/modules/nf-core/cnvkit/call/main.nf index fade6df0fd..06d51e857e 100644 --- a/modules/nf-core/cnvkit/call/main.nf +++ b/modules/nf-core/cnvkit/call/main.nf @@ -33,4 +33,15 @@ process CNVKIT_CALL { cnvkit: \$(cnvkit.py version | sed -e 's/cnvkit v//g') END_VERSIONS """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + """ + touch ${prefix}.cns + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + cnvkit: \$(cnvkit.py version | sed -e 's/cnvkit v//g') + END_VERSIONS + """ } diff --git a/modules/nf-core/cnvkit/call/meta.yml b/modules/nf-core/cnvkit/call/meta.yml index 64dc336869..b3b4a4a78a 100644 --- a/modules/nf-core/cnvkit/call/meta.yml +++ b/modules/nf-core/cnvkit/call/meta.yml @@ -1,5 +1,6 @@ name: cnvkit_call -description: 
Given segmented log2 ratio estimates (.cns), derive each segment’s absolute integer copy number +description: Given segmented log2 ratio estimates (.cns), derive each segment’s absolute + integer copy number keywords: - cnvkit - bam @@ -12,34 +13,37 @@ tools: homepage: https://cnvkit.readthedocs.io/en/stable/index.html documentation: https://cnvkit.readthedocs.io/en/stable/index.html licence: ["Apache-2.0"] + identifier: biotools:cnvkit input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - cns: - type: file - description: CNVKit CNS file. - pattern: "*.cns" - - vcf: - type: file - description: Germline VCF file for BAF. - pattern: "*.vcf{,.gz}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - cns: + type: file + description: CNVKit CNS file. + pattern: "*.cns" + - vcf: + type: file + description: Germline VCF file for BAF. + pattern: "*.vcf{,.gz}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] + - cns: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.cns": + type: file + description: CNS file. + pattern: "*.cns" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - - output: - type: file - description: File containing copy number information in new format. 
-      pattern: "*.{bed,vcf,cdt,jtv,seg,interval_count}"
+      - versions.yml:
+          type: file
+          description: File containing software versions
+          pattern: "versions.yml"
 authors:
   - "@adamrtalbot"
   - "@priesgo"
diff --git a/modules/nf-core/cnvkit/call/tests/main.nf.test b/modules/nf-core/cnvkit/call/tests/main.nf.test
new file mode 100644
index 0000000000..6012ef137f
--- /dev/null
+++ b/modules/nf-core/cnvkit/call/tests/main.nf.test
@@ -0,0 +1,83 @@
+
+nextflow_process {
+
+    name "Test Process CNVKIT_CALL"
+    script "../main.nf"
+    process "CNVKIT_CALL"
+
+    tag "modules"
+    tag "modules_nfcore"
+    tag "cnvkit"
+    tag "cnvkit/call"
+
+    test("test-cnvkit-call") {
+
+        when {
+            process {
+                """
+                input[0] = [
+                    [ id:'test', single_end:false ], // meta map
+                    file('https://raw.githubusercontent.com/etal/cnvkit/v0.9.9/test/formats/amplicon.cns', checkIfExists: true),
+                    []
+                ]
+
+                """
+            }
+        }
+
+        then {
+            assertAll(
+                { assert process.success },
+                { assert snapshot(process.out).match() }
+            )
+        }
+    }
+
+    test("test-cnvkit-call-with-vcf") {
+
+        when {
+            process {
+                """
+                input[0] = [
+                    [ id:'test', single_end:false ], // meta map
+                    file('https://raw.githubusercontent.com/etal/cnvkit/v0.9.9/test/formats/amplicon.cns', checkIfExists: true),
+                    file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true)
+                ]
+
+                """
+            }
+        }
+
+        then {
+            assertAll(
+                { assert process.success },
+                { assert snapshot(process.out).match() }
+            )
+        }
+    }
+
+    test("test-cnvkit-call-with-vcf-stub") {
+        options '-stub'
+
+        when {
+            process {
+                """
+                input[0] = [
+                    [ id:'test', single_end:false ], // meta map
+                    file('https://raw.githubusercontent.com/etal/cnvkit/v0.9.9/test/formats/amplicon.cns', checkIfExists: true),
+                    file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true)
+                ]
+
+                """
+            }
+        }
+
+        then {
+            assertAll(
+                { assert process.success },
+                { assert snapshot(process.out).match() }
+            )
+        }
+    }
+
+}
diff --git a/modules/nf-core/cnvkit/call/tests/main.nf.test.snap b/modules/nf-core/cnvkit/call/tests/main.nf.test.snap
new file mode 100644
index 0000000000..844a415ecf
--- /dev/null
+++ b/modules/nf-core/cnvkit/call/tests/main.nf.test.snap
@@ -0,0 +1,107 @@
+{
+    "test-cnvkit-call": {
+        "content": [
+            {
+                "0": [
+                    [
+                        {
+                            "id": "test",
+                            "single_end": false
+                        },
+                        "test.cns:md5,7746029caf9ecc134a075a2d50be269f"
+                    ]
+                ],
+                "1": [
+                    "versions.yml:md5,f47253e21b991f72a741d6b5c9a351a5"
+                ],
+                "cns": [
+                    [
+                        {
+                            "id": "test",
+                            "single_end": false
+                        },
+                        "test.cns:md5,7746029caf9ecc134a075a2d50be269f"
+                    ]
+                ],
+                "versions": [
+                    "versions.yml:md5,f47253e21b991f72a741d6b5c9a351a5"
+                ]
+            }
+        ],
+        "meta": {
+            "nf-test": "0.9.0",
+            "nextflow": "24.04.4"
+        },
+        "timestamp": "2024-09-05T22:24:41.048386"
+    },
+    "test-cnvkit-call-with-vcf": {
+        "content": [
+            {
+                "0": [
+                    [
+                        {
+                            "id": "test",
+                            "single_end": false
+                        },
+                        "test.cns:md5,2a4b3da8a8131a4ed4ae902a9f96a405"
+                    ]
+                ],
+                "1": [
+                    "versions.yml:md5,f47253e21b991f72a741d6b5c9a351a5"
+                ],
+                "cns": [
+                    [
+                        {
+                            "id": "test",
+                            "single_end": false
+                        },
+                        "test.cns:md5,2a4b3da8a8131a4ed4ae902a9f96a405"
+                    ]
+                ],
+                "versions": [
+                    "versions.yml:md5,f47253e21b991f72a741d6b5c9a351a5"
+                ]
+            }
+        ],
+        "meta": {
+            "nf-test": "0.9.0",
+            "nextflow": "24.04.4"
+        },
+        "timestamp": "2024-09-05T22:24:50.134984"
+    },
+    "test-cnvkit-call-with-vcf-stub": {
+        "content": [
+            {
+                "0": [
+                    [
+                        {
+                            "id": "test",
+                            "single_end": false
+                        },
+                        "test.cns:md5,d41d8cd98f00b204e9800998ecf8427e"
+                    ]
+                ],
+                "1": [
+                    "versions.yml:md5,f47253e21b991f72a741d6b5c9a351a5"
+                ],
+                "cns": [
+                    [
+                        {
+                            "id": "test",
+                            "single_end": false
+                        },
+                        "test.cns:md5,d41d8cd98f00b204e9800998ecf8427e"
+                    ]
+                ],
+                "versions": [
+                    "versions.yml:md5,f47253e21b991f72a741d6b5c9a351a5"
+                ]
+            }
+        ],
+        "meta": {
+            "nf-test": "0.9.0",
+            "nextflow": "24.04.4"
+        },
+        "timestamp": "2024-09-05T22:24:56.743954"
+    }
+}
\ No newline at end of file
diff --git
a/modules/nf-core/cnvkit/export/environment.yml b/modules/nf-core/cnvkit/export/environment.yml index a61b2765f6..152af54d19 100644 --- a/modules/nf-core/cnvkit/export/environment.yml +++ b/modules/nf-core/cnvkit/export/environment.yml @@ -1,7 +1,5 @@ -name: cnvkit_export channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::cnvkit=0.9.10 diff --git a/modules/nf-core/cnvkit/export/meta.yml b/modules/nf-core/cnvkit/export/meta.yml index 836baa1b58..a573e03bac 100644 --- a/modules/nf-core/cnvkit/export/meta.yml +++ b/modules/nf-core/cnvkit/export/meta.yml @@ -1,5 +1,6 @@ name: cnvkit_export -description: Convert copy number ratio tables (.cnr files) or segments (.cns) to another format. +description: Convert copy number ratio tables (.cnr files) or segments (.cns) to another + format. keywords: - cnvkit - copy number @@ -11,30 +12,32 @@ tools: homepage: https://cnvkit.readthedocs.io/en/stable/index.html documentation: https://cnvkit.readthedocs.io/en/stable/index.html licence: ["Apache-2.0"] + identifier: biotools:cnvkit input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - cns: - type: file - description: CNVKit CNS file. - pattern: "*.cns" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - cns: + type: file + description: CNVKit CNS file. + pattern: "*.cns" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] + - output: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}.${suffix}: + type: file + description: Output file - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - - cns: - type: file - description: File containing copy number segment information - pattern: "*.{cns}" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@adamrtalbot" - "@priesgo" diff --git a/modules/nf-core/cnvkit/genemetrics/environment.yml b/modules/nf-core/cnvkit/genemetrics/environment.yml index 14deb0ef73..152af54d19 100644 --- a/modules/nf-core/cnvkit/genemetrics/environment.yml +++ b/modules/nf-core/cnvkit/genemetrics/environment.yml @@ -1,7 +1,5 @@ -name: cnvkit_genemetrics channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::cnvkit=0.9.10 diff --git a/modules/nf-core/cnvkit/genemetrics/main.nf b/modules/nf-core/cnvkit/genemetrics/main.nf old mode 100755 new mode 100644 diff --git a/modules/nf-core/cnvkit/genemetrics/meta.yml b/modules/nf-core/cnvkit/genemetrics/meta.yml old mode 100755 new mode 100644 index 4bef28c7d5..6b110accc2 --- a/modules/nf-core/cnvkit/genemetrics/meta.yml +++ b/modules/nf-core/cnvkit/genemetrics/meta.yml @@ -12,34 +12,47 @@ tools: homepage: https://cnvkit.readthedocs.io/en/stable/index.html documentation: https://cnvkit.readthedocs.io/en/stable/index.html licence: ["Apache-2.0"] + identifier: biotools:cnvkit input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - cnr: - type: file - description: CNR file - pattern: "*.cnr" - - cns: - type: file - description: CNS file [Optional] - pattern: "*.cns" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ]
+    - cnr:
+        type: file
+        description: CNR file
+        pattern: "*.cnr"
+    - cns:
+        type: file
+        description: CNS file [Optional]
+        pattern: "*.cns"
 output:
-  - meta:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g. [ id:'test', single_end:false ]
-  - txt:
-      type: file
-      description: TXT file
-      pattern: "*.txt"
+  - tsv:
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+      - "*.tsv":
+          type: file
+          description: TSV file
+          pattern: "*.tsv"
+  - cnn:
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+      - "*.cnn":
+          type: file
+          description: CNN file
+          pattern: "*.cnn"
   - versions:
-      type: file
-      description: File containing software versions
-      pattern: "versions.yml"
+      - versions.yml:
+          type: file
+          description: File containing software versions
+          pattern: "versions.yml"
 authors:
   - "@adamrtalbot"
   - "@marrip"
diff --git a/modules/nf-core/cnvkit/genemetrics/tests/main.nf.test b/modules/nf-core/cnvkit/genemetrics/tests/main.nf.test
new file mode 100644
index 0000000000..a2d8b4580c
--- /dev/null
+++ b/modules/nf-core/cnvkit/genemetrics/tests/main.nf.test
@@ -0,0 +1,59 @@
+
+nextflow_process {
+
+    name "Test Process CNVKIT_GENEMETRICS"
+    script "../main.nf"
+    process "CNVKIT_GENEMETRICS"
+
+    tag "modules"
+    tag "modules_nfcore"
+    tag "cnvkit"
+    tag "cnvkit/genemetrics"
+
+    test("test-cnvkit-genemetrics-with-cns") {
+
+        when {
+            process {
+                """
+                input[0] = [
+                    [ id:'test', single_end:false ], // meta map
+                    file('https://raw.githubusercontent.com/etal/cnvkit/v0.9.9/test/formats/amplicon.cnr', checkIfExists: true),
+                    file('https://raw.githubusercontent.com/etal/cnvkit/v0.9.9/test/formats/amplicon.cns', checkIfExists: true)
+                ]
+
+                """
+            }
+        }
+
+        then {
+            assertAll(
+                { assert process.success },
+                { assert snapshot(process.out).match() }
+            )
+        }
+    }
+
+
test("test-cnvkit-genemetrics-without-cns") {
+
+        when {
+            process {
+                """
+                input[0] = [
+                    [ id:'test', single_end:false ], // meta map
+                    file('https://raw.githubusercontent.com/etal/cnvkit/v0.9.9/test/formats/amplicon.cnr', checkIfExists: true),
+                    []
+                ]
+
+                """
+            }
+        }
+
+        then {
+            assertAll(
+                { assert process.success },
+                { assert snapshot(process.out).match() }
+            )
+        }
+    }
+
+}
diff --git a/modules/nf-core/cnvkit/genemetrics/tests/main.nf.test.snap b/modules/nf-core/cnvkit/genemetrics/tests/main.nf.test.snap
new file mode 100644
index 0000000000..53ed81f370
--- /dev/null
+++ b/modules/nf-core/cnvkit/genemetrics/tests/main.nf.test.snap
@@ -0,0 +1,72 @@
+{
+    "test-cnvkit-genemetrics-without-cns": {
+        "content": [
+            {
+                "0": [
+                    [
+                        {
+                            "id": "test",
+                            "single_end": false
+                        },
+                        "test.tsv:md5,622e154a107301da6f456b4b3196b79d"
+                    ]
+                ],
+                "1": [
+                    "versions.yml:md5,d3f23da560774564afa9c69e2d171e5f"
+                ],
+                "tsv": [
+                    [
+                        {
+                            "id": "test",
+                            "single_end": false
+                        },
+                        "test.tsv:md5,622e154a107301da6f456b4b3196b79d"
+                    ]
+                ],
+                "versions": [
+                    "versions.yml:md5,d3f23da560774564afa9c69e2d171e5f"
+                ]
+            }
+        ],
+        "meta": {
+            "nf-test": "0.8.4",
+            "nextflow": "24.04.4"
+        },
+        "timestamp": "2024-08-28T11:17:13.604558"
+    },
+    "test-cnvkit-genemetrics-with-cns": {
+        "content": [
+            {
+                "0": [
+                    [
+                        {
+                            "id": "test",
+                            "single_end": false
+                        },
+                        "test.tsv:md5,2a18eca552ea33faab1d39795d9e051c"
+                    ]
+                ],
+                "1": [
+                    "versions.yml:md5,d3f23da560774564afa9c69e2d171e5f"
+                ],
+                "tsv": [
+                    [
+                        {
+                            "id": "test",
+                            "single_end": false
+                        },
+                        "test.tsv:md5,2a18eca552ea33faab1d39795d9e051c"
+                    ]
+                ],
+                "versions": [
+                    "versions.yml:md5,d3f23da560774564afa9c69e2d171e5f"
+                ]
+            }
+        ],
+        "meta": {
+            "nf-test": "0.8.4",
+            "nextflow": "24.04.4"
+        },
+        "timestamp": "2024-08-28T11:17:04.195978"
+    }
+}
\ No newline at end of file
diff --git a/modules/nf-core/cnvkit/reference/environment.yml b/modules/nf-core/cnvkit/reference/environment.yml
index e3070044d0..b683406cc5 100644
---
a/modules/nf-core/cnvkit/reference/environment.yml +++ b/modules/nf-core/cnvkit/reference/environment.yml @@ -1,7 +1,5 @@ -name: cnvkit_reference channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::cnvkit=0.9.11 diff --git a/modules/nf-core/cnvkit/reference/meta.yml b/modules/nf-core/cnvkit/reference/meta.yml index 8747893b6f..965a7b5795 100644 --- a/modules/nf-core/cnvkit/reference/meta.yml +++ b/modules/nf-core/cnvkit/reference/meta.yml @@ -15,33 +15,32 @@ tools: tool_dev_url: https://github.com/etal/cnvkit doi: 10.1371/journal.pcbi.1004873 licence: ["Apache-2.0"] + identifier: biotools:cnvkit input: - - fasta: - type: file - description: File containing reference genome - pattern: "*.{fasta}" - - targets: - type: file - description: File containing genomic regions - pattern: "*.{bed}" - - antitargets: - type: file - description: File containing off-target genomic regions - pattern: "*.{bed}" + - - fasta: + type: file + description: File containing reference genome + pattern: "*.{fasta}" + - - targets: + type: file + description: File containing genomic regions + pattern: "*.{bed}" + - - antitargets: + type: file + description: File containing off-target genomic regions + pattern: "*.{bed}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - cnn: - type: file - description: File containing a copy-number reference (required for CNV calling in tumor_only mode) - pattern: "*.{cnn}" + - "*.cnn": + type: file + description: File containing a copy-number reference (required for CNV calling + in tumor_only mode) + pattern: "*.{cnn}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@adamrtalbot" - "@priesgo" diff --git a/modules/nf-core/cnvkit/reference/tests/main.nf.test b/modules/nf-core/cnvkit/reference/tests/main.nf.test index 47039e3086..f03dd231f3 100644 --- a/modules/nf-core/cnvkit/reference/tests/main.nf.test +++ b/modules/nf-core/cnvkit/reference/tests/main.nf.test @@ -24,8 +24,7 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot(process.out.versions).match("version") }, - { assert snapshot(file(process.out.cnn.get(0)).readLines()[0]).match() } + { assert snapshot(process.out).match() } ) } diff --git a/modules/nf-core/cnvkit/reference/tests/main.nf.test.snap b/modules/nf-core/cnvkit/reference/tests/main.nf.test.snap index 353f6378a6..eb5dee671d 100644 --- a/modules/nf-core/cnvkit/reference/tests/main.nf.test.snap +++ b/modules/nf-core/cnvkit/reference/tests/main.nf.test.snap @@ -1,13 +1,26 @@ { "human - [fasta, bed]": { "content": [ - "chromosome\tstart\tend\tgene\tlog2\tdepth\tgc\trmask\tspread" + { + "0": [ + "multi_intervals.reference.cnn:md5,7c4a7902f5ab101b1f9d6038d331b3d9" + ], + "1": [ + "versions.yml:md5,85ff8911567b4e1245b883541ad3cc1e" + ], + "cnn": [ + "multi_intervals.reference.cnn:md5,7c4a7902f5ab101b1f9d6038d331b3d9" + ], + "versions": [ + "versions.yml:md5,85ff8911567b4e1245b883541ad3cc1e" + ] + } ], "meta": { "nf-test": "0.8.4", "nextflow": "23.10.0" }, - "timestamp": "2024-05-13T14:00:12.102517" + "timestamp": "2024-06-10T10:25:05.273892" }, "human 
- [fasta, bed] - stub": { "content": [ @@ -31,17 +44,5 @@ "nextflow": "23.10.0" }, "timestamp": "2024-05-13T14:00:30.095718" - }, - "version": { - "content": [ - [ - "versions.yml:md5,85ff8911567b4e1245b883541ad3cc1e" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.0" - }, - "timestamp": "2024-05-13T14:00:12.090836" } } \ No newline at end of file diff --git a/modules/nf-core/controlfreec/assesssignificance/controlfreec-assesssignificance.diff b/modules/nf-core/controlfreec/assesssignificance/controlfreec-assesssignificance.diff index 96691730cd..49c0c9af21 100644 --- a/modules/nf-core/controlfreec/assesssignificance/controlfreec-assesssignificance.diff +++ b/modules/nf-core/controlfreec/assesssignificance/controlfreec-assesssignificance.diff @@ -1,4 +1,5 @@ Changes in module 'nf-core/controlfreec/assesssignificance' +Changes in 'controlfreec/assesssignificance/main.nf': --- modules/nf-core/controlfreec/assesssignificance/main.nf +++ modules/nf-core/controlfreec/assesssignificance/main.nf @@ -4,8 +4,8 @@ @@ -31,57 +32,19 @@ Changes in module 'nf-core/controlfreec/assesssignificance' touch ${prefix}.p.value.txt +'modules/nf-core/controlfreec/assesssignificance/meta.yml' is unchanged +Changes in 'controlfreec/assesssignificance/environment.yml': --- modules/nf-core/controlfreec/assesssignificance/environment.yml +++ modules/nf-core/controlfreec/assesssignificance/environment.yml -@@ -4,4 +4,4 @@ +@@ -2,4 +2,4 @@ + - conda-forge - bioconda - - defaults dependencies: - - bioconda::control-freec=11.6b + - bioconda::control-freec=11.6 ---- modules/nf-core/controlfreec/assesssignificance/tests/main.nf.test.snap -+++ modules/nf-core/controlfreec/assesssignificance/tests/main.nf.test.snap -@@ -13,7 +13,7 @@ - ] - ], - "1": [ -- "versions.yml:md5,81750d0b4c0e563bd392720d09ae024f" -+ "versions.yml:md5,0a11399a3318a7c75460c4eb71d58766" - ], - "p_value_txt": [ - [ -@@ -26,26 +26,26 @@ - ] - ], - "versions": [ -- 
"versions.yml:md5,81750d0b4c0e563bd392720d09ae024f" -+ "versions.yml:md5,0a11399a3318a7c75460c4eb71d58766" - ] - } - ], - "meta": { - "nf-test": "0.8.4", -- "nextflow": "23.10.0" -+ "nextflow": "24.02.0" - }, -- "timestamp": "2024-03-26T16:24:34.84551" -+ "timestamp": "2024-04-05T16:15:36.357267" - }, - "version": { - "content": [ - [ -- "versions.yml:md5,81750d0b4c0e563bd392720d09ae024f" -+ "versions.yml:md5,0a11399a3318a7c75460c4eb71d58766" - ] - ], - "meta": { - "nf-test": "0.8.4", -- "nextflow": "23.10.0" -+ "nextflow": "24.02.0" - }, -- "timestamp": "2024-03-26T17:23:22.833417" -+ "timestamp": "2024-04-05T16:14:54.943475" - } - } +'modules/nf-core/controlfreec/assesssignificance/tests/main.nf.test.snap' is unchanged +'modules/nf-core/controlfreec/assesssignificance/tests/nextflow.config' is unchanged +'modules/nf-core/controlfreec/assesssignificance/tests/main.nf.test' is unchanged +'modules/nf-core/controlfreec/assesssignificance/tests/tags.yml' is unchanged ************************************************************ diff --git a/modules/nf-core/controlfreec/assesssignificance/environment.yml b/modules/nf-core/controlfreec/assesssignificance/environment.yml index cb0b9c17c3..444d29ddd4 100644 --- a/modules/nf-core/controlfreec/assesssignificance/environment.yml +++ b/modules/nf-core/controlfreec/assesssignificance/environment.yml @@ -1,7 +1,5 @@ -name: controlfreec_assesssignificance channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::control-freec=11.6 diff --git a/modules/nf-core/controlfreec/assesssignificance/meta.yml b/modules/nf-core/controlfreec/assesssignificance/meta.yml index b8cda6dd50..8a6ab47edb 100644 --- a/modules/nf-core/controlfreec/assesssignificance/meta.yml +++ b/modules/nf-core/controlfreec/assesssignificance/meta.yml @@ -1,5 +1,6 @@ name: controlfreec_assesssignificance -description: Add both Wilcoxon test and Kolmogorov-Smirnov test p-values to each CNV output of FREEC +description: Add both Wilcoxon test and 
Kolmogorov-Smirnov test p-values to each CNV + output of FREEC keywords: - cna - cnv @@ -8,41 +9,49 @@ keywords: - tumor-only tools: - controlfreec/assesssignificance: - description: Copy number and genotype annotation from whole genome and whole exome sequencing data. + description: Copy number and genotype annotation from whole genome and whole exome + sequencing data. homepage: http://boevalab.inf.ethz.ch/FREEC documentation: http://boevalab.inf.ethz.ch/FREEC/tutorial.html tool_dev_url: https://github.com/BoevaLab/FREEC/ doi: "10.1093/bioinformatics/btq635" licence: ["GPL >=2"] + identifier: "" input: # Only when we have meta - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - cnvs: - type: file - description: _CNVs file generated by FREEC - pattern: "*._CNVs" - - ratio: - type: file - description: ratio file generated by FREEC - pattern: "*.ratio.txt" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - cnvs: + type: file + description: _CNVs file generated by FREEC + pattern: "*._CNVs" + - ratio: + type: file + description: ratio file generated by FREEC + pattern: "*.ratio.txt" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - p_value_txt: - type: file - description: CNV file containing p_values for each call - pattern: "*.p.value.txt" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.p.value.txt": + type: file + description: CNV file containing p_values for each call + pattern: "*.p.value.txt" + - ue_txt: + type: file + description: CNV file containing p_values for each call + pattern: "*.p.value.txt" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" maintainers: diff --git a/modules/nf-core/controlfreec/assesssignificance/tests/main.nf.test.snap b/modules/nf-core/controlfreec/assesssignificance/tests/main.nf.test.snap index 0498b9c5be..ab354908ec 100644 --- a/modules/nf-core/controlfreec/assesssignificance/tests/main.nf.test.snap +++ b/modules/nf-core/controlfreec/assesssignificance/tests/main.nf.test.snap @@ -13,7 +13,7 @@ ] ], "1": [ - "versions.yml:md5,0a11399a3318a7c75460c4eb71d58766" + "versions.yml:md5,81750d0b4c0e563bd392720d09ae024f" ], "p_value_txt": [ [ @@ -26,26 +26,26 @@ ] ], "versions": [ - "versions.yml:md5,0a11399a3318a7c75460c4eb71d58766" + "versions.yml:md5,81750d0b4c0e563bd392720d09ae024f" ] } ], "meta": { "nf-test": "0.8.4", - "nextflow": "24.02.0" + "nextflow": "23.10.0" }, - "timestamp": "2024-04-05T16:15:36.357267" + "timestamp": "2024-03-26T16:24:34.84551" }, "version": { "content": [ [ - "versions.yml:md5,0a11399a3318a7c75460c4eb71d58766" + "versions.yml:md5,81750d0b4c0e563bd392720d09ae024f" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "24.02.0" + "nextflow": "23.10.0" }, - "timestamp": "2024-04-05T16:14:54.943475" + "timestamp": "2024-03-26T17:23:22.833417" } } \ No newline at end of file diff --git a/modules/nf-core/controlfreec/freec/environment.yml b/modules/nf-core/controlfreec/freec/environment.yml index cb76c6ba93..f6b64529bc 100644 --- a/modules/nf-core/controlfreec/freec/environment.yml +++ b/modules/nf-core/controlfreec/freec/environment.yml @@ -1,7 +1,5 @@ -name: controlfreec_freec channels: - conda-forge - bioconda - - defaults dependencies: - 
bioconda::control-freec=11.6b diff --git a/modules/nf-core/controlfreec/freec/meta.yml b/modules/nf-core/controlfreec/freec/meta.yml index 1442bbe44a..6b35a38be7 100644 --- a/modules/nf-core/controlfreec/freec/meta.yml +++ b/modules/nf-core/controlfreec/freec/meta.yml @@ -1,5 +1,6 @@ name: controlfreec_freec -description: Copy number and genotype annotation from whole genome and whole exome sequencing data +description: Copy number and genotype annotation from whole genome and whole exome + sequencing data keywords: - cna - cnv @@ -8,172 +9,176 @@ keywords: - tumor-only tools: - controlfreec/freec: - description: Copy number and genotype annotation from whole genome and whole exome sequencing data. + description: Copy number and genotype annotation from whole genome and whole exome + sequencing data. homepage: http://boevalab.inf.ethz.ch/FREEC documentation: http://boevalab.inf.ethz.ch/FREEC/tutorial.html tool_dev_url: https://github.com/BoevaLab/FREEC/ doi: "10.1093/bioinformatics/btq635" licence: ["GPL >=2"] + identifier: "" input: - - args: - type: map - description: | - Groovy Map containing tool parameters. MUST follow the structure/keywords below and be provided via modules.config. - Parameters marked as (optional) can be removed from the map, if they are not set. All values must be surrounded by quotes, meta map parameters can be set with, i.e. `sex = meta.sex`: - For default values, please check the documentation above. 
- - ``` - { - [ - "general" :[ - "bedgraphoutput": (optional), - "breakpointthreshold": (optional), - "breakpointtype": (optional), - "coefficientofvariation": (optional), - "contamination": (optional), - "contaminationadjustment": (optional), - "degree": (optional), - "forcegccontentnormalization": (optional), - "gccontentprofile": (optional), - "intercept": (optional), - "mincnalength": (optional), - "minmappabilityperwindow": (optional), - "minexpectedgc": (optional), - "maxexpectedgc": (optional), - "minimalsubclonepresence": (optional), - "noisydata": (optional), - "ploidy": (optional), - "printNA": (optional), - "readcountthreshold": (optional), - "sex": (optional), - "step": (optional), - "telocentromeric": (optional), - "uniquematch": (optional), - "window": (optional) - ], - "control":[ - "inputformat": (required), - "mateorientation": (optional), - ], - "sample":[ - "inputformat": (required), - "mateorientation": (optional), - ], - "BAF":[ - "minimalcoverageperposition": (optional), - "minimalqualityperposition": (optional), - "shiftinquality": (optional) - ] - ] - } - ``` - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - mateFile_normal: - type: file - description: File with mapped reads - pattern: "*.{sam,bam,pileup(.gz),bowtie(.gz),eland(.gz),arachne(.gz),psl(.gz),bed(.gz)}" - - mateFile_tumor: - type: file - description: File with mapped reads - pattern: "*.{sam,bam,pileup(.gz),bowtie(.gz),eland(.gz),arachne(.gz),psl(.gz),bed(.gz)}" - - cpn_normal: - type: file - description: Raw copy number profiles (optional) - pattern: "*.cpn" - - cpn_tumor: - type: file - description: Raw copy number profiles (optional) - pattern: "*.cpn" - - minipileup_normal: - type: file - description: miniPileup file from previous run (optional) - pattern: "*.pileup" - - minipileup_tumor: - type: file - description: miniPileup file from previous run (optional) - pattern: "*.pileup" - - fasta: - type: file - description: Reference file (optional; required if args 'makePileup' is set) - pattern: "*.{fasta,fna,fa}" - - fai: - type: file - description: Fasta index - pattern: "*.fai" - - snp_position: - type: file - description: Path to a BED or VCF file with SNP positions to create a mini pileup file from the initial BAM file provided in mateFile (optional) - pattern: "*.{bed,vcf}" - - known_snps: - type: file - description: File with known SNPs - pattern: "*.{vcf,vcf.gz}" - - known_snps_tbi: - type: file - description: Index of known_snps - pattern: "*.tbi" - - chr_directory: - type: file - description: Path to directory with chromosome fasta files (optional, required if gccontentprofile is not provided) - pattern: "*/" - - mappability: - type: file - description: Contains information of mappable positions (optional) - pattern: "*.gem" - - target_bed: - type: file - description: Sorted bed file containing capture regions (optional) - pattern: "*.bed" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - mpileup_normal: + type: file + description: miniPileup file + - mpileup_tumor: + type: file + description: miniPileup file + - cpn_normal: + type: file + description: Raw copy number profiles (optional) + pattern: "*.cpn" + - cpn_tumor: + type: file + description: Raw copy number profiles (optional) + pattern: "*.cpn" + - minipileup_normal: + type: file + description: miniPileup file from previous run (optional) + pattern: "*.pileup" + - minipileup_tumor: + type: file + description: miniPileup file from previous run (optional) + pattern: "*.pileup" + - - fasta: + type: file + description: Reference file (optional; required if args 'makePileup' is set) + pattern: "*.{fasta,fna,fa}" + - - fai: + type: file + description: Fasta index + pattern: "*.fai" + - - snp_position: + type: file + description: Path to a BED or VCF file with SNP positions to create a mini pileup + file from the initial BAM file provided in mateFile (optional) + pattern: "*.{bed,vcf}" + - - known_snps: + type: file + description: File with known SNPs + pattern: "*.{vcf,vcf.gz}" + - - known_snps_tbi: + type: file + description: Index of known_snps + pattern: "*.tbi" + - - chr_directory: + type: file + description: Path to directory with chromosome fasta files (optional, required + if gccontentprofile is not provided) + pattern: "*/" + - - mappability: + type: file + description: Contains information of mappable positions (optional) + pattern: "*.gem" + - - target_bed: + type: file + description: Sorted bed file containing capture regions (optional) + pattern: "*.bed" + - - gccontent_profile: + type: file + description: File with GC-content profile output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - bedgraph: - type: file - description: Bedgraph format for the UCSC genome browser - pattern: ".bedgraph" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*_ratio.BedGraph": + type: file + description: Bedgraph format for the UCSC genome browser + pattern: ".bedgraph" - control_cpn: - type: file - description: files with raw copy number profiles - pattern: "*_control.cpn" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*_control.cpn": + type: file + description: files with raw copy number profiles + pattern: "*_control.cpn" - sample_cpn: - type: file - description: files with raw copy number profiles - pattern: "*_sample.cpn" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*_sample.cpn": + type: file + description: files with raw copy number profiles + pattern: "*_sample.cpn" - gcprofile_cpn: - type: file - description: file with GC-content profile. - pattern: "GC_profile.*.cpn" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - GC_profile.*.cpn: + type: file + description: file with GC-content profile. + pattern: "GC_profile.*.cpn" - BAF: - type: file - description: file B-allele frequencies for each possibly heterozygous SNP position - pattern: "*_BAF.txt" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*_BAF.txt": + type: file + description: file B-allele frequencies for each possibly heterozygous SNP position + pattern: "*_BAF.txt" - CNV: - type: file - description: file with coordinates of predicted copy number alterations. 
- pattern: "*_CNVs" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*_CNVs": + type: file + description: file with coordinates of predicted copy number alterations. + pattern: "*_CNVs" - info: - type: file - description: parsable file with information about FREEC run - pattern: "*_info.txt" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*_info.txt": + type: file + description: parsable file with information about FREEC run + pattern: "*_info.txt" - ratio: - type: file - description: file with ratios and predicted copy number alterations for each window - pattern: "*_ratio.txt" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*_ratio.txt": + type: file + description: file with ratios and predicted copy number alterations for each + window + pattern: "*_ratio.txt" - config: - type: file - description: Config file used to run Control-FREEC - pattern: "config.txt" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - config.txt: + type: file + description: Config file used to run Control-FREEC + pattern: "config.txt" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" maintainers: diff --git a/modules/nf-core/controlfreec/freec/tests/main.nf.test b/modules/nf-core/controlfreec/freec/tests/main.nf.test index 90312aebc8..8e38becdfd 100644 --- a/modules/nf-core/controlfreec/freec/tests/main.nf.test +++ b/modules/nf-core/controlfreec/freec/tests/main.nf.test @@ -70,9 +70,9 @@ nextflow_process { path(process.out.BAF.get(0).get(1)).readLines()[0], path(process.out.ratio.get(0).get(1)).readLines()[0], path(process.out.config.get(0).get(1)).readLines()[0], - path(process.out.info.get(0).get(1)).readLines()[0] - ).match() }, - { assert snapshot(process.out.versions).match("version") } + path(process.out.info.get(0).get(1)).readLines()[0], + process.out.versions + ).match() } ) } @@ -159,7 +159,5 @@ nextflow_process { { assert snapshot(process.out).match() } ) } - } - } diff --git a/modules/nf-core/controlfreec/freec/tests/main.nf.test.snap b/modules/nf-core/controlfreec/freec/tests/main.nf.test.snap index 39850d0256..eb0c468ae6 100644 --- a/modules/nf-core/controlfreec/freec/tests/main.nf.test.snap +++ b/modules/nf-core/controlfreec/freec/tests/main.nf.test.snap @@ -178,21 +178,9 @@ ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.0" + "nextflow": "24.04.2" }, - "timestamp": "2024-03-26T13:42:56.629043" - }, - "version": { - "content": [ - [ - "versions.yml:md5,e704fc0e6d1ac333dc419498fa128769" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.0" - }, - "timestamp": "2024-03-27T10:01:40.977048" + "timestamp": "2024-07-09T10:41:07.003311" }, "human - mpileup": { "content": [ @@ -201,12 +189,15 @@ "Chromosome\tPosition\tBAF\tFittedA\tFittedB\tA\tB\tuncertainty", 
"Chromosome\tStart\tRatio\tMedianRatio\tCopyNumber\tBAF\testimatedBAF\tGenotype\tUncertaintyOfGT", "[general]", - "Program_Version\tv11.6" + "Program_Version\tv11.6", + [ + "versions.yml:md5,e704fc0e6d1ac333dc419498fa128769" + ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.0" + "nextflow": "24.04.2" }, - "timestamp": "2024-03-27T10:14:50.936823" + "timestamp": "2024-07-09T10:40:42.538035" } } \ No newline at end of file diff --git a/modules/nf-core/controlfreec/freec2bed/environment.yml b/modules/nf-core/controlfreec/freec2bed/environment.yml index 12601ffa55..f6b64529bc 100644 --- a/modules/nf-core/controlfreec/freec2bed/environment.yml +++ b/modules/nf-core/controlfreec/freec2bed/environment.yml @@ -1,7 +1,5 @@ -name: controlfreec_freec2bed channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::control-freec=11.6b diff --git a/modules/nf-core/controlfreec/freec2bed/meta.yml b/modules/nf-core/controlfreec/freec2bed/meta.yml index b10c8ab377..e01f870f4d 100644 --- a/modules/nf-core/controlfreec/freec2bed/meta.yml +++ b/modules/nf-core/controlfreec/freec2bed/meta.yml @@ -8,36 +8,40 @@ keywords: - tumor-only tools: - controlfreec: - description: Copy number and genotype annotation from whole genome and whole exome sequencing data. + description: Copy number and genotype annotation from whole genome and whole exome + sequencing data. homepage: http://boevalab.inf.ethz.ch/FREEC documentation: http://boevalab.inf.ethz.ch/FREEC/tutorial.html tool_dev_url: https://github.com/BoevaLab/FREEC/ doi: "10.1093/bioinformatics/btq635" licence: ["GPL >=2"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - ratio: - type: file - description: ratio file generated by FREEC - pattern: "*.ratio.txt" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ratio: + type: file + description: ratio file generated by FREEC + pattern: "*.ratio.txt" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - bed: - type: file - description: Bed file - pattern: "*.bed" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bed": + type: file + description: Bed file + pattern: "*.bed" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" maintainers: diff --git a/modules/nf-core/controlfreec/freec2circos/environment.yml b/modules/nf-core/controlfreec/freec2circos/environment.yml index 1915abfd7f..f6b64529bc 100644 --- a/modules/nf-core/controlfreec/freec2circos/environment.yml +++ b/modules/nf-core/controlfreec/freec2circos/environment.yml @@ -1,7 +1,5 @@ -name: controlfreec_freec2circos channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::control-freec=11.6b diff --git a/modules/nf-core/controlfreec/freec2circos/meta.yml b/modules/nf-core/controlfreec/freec2circos/meta.yml index 2c6b77d611..5368c0429e 100644 --- a/modules/nf-core/controlfreec/freec2circos/meta.yml +++ b/modules/nf-core/controlfreec/freec2circos/meta.yml @@ -8,36 +8,40 @@ keywords: - tumor-only tools: - controlfreec: - description: Copy number and genotype annotation from whole genome and whole exome sequencing data. + description: Copy number and genotype annotation from whole genome and whole exome + sequencing data. 
homepage: http://boevalab.inf.ethz.ch/FREEC documentation: http://boevalab.inf.ethz.ch/FREEC/tutorial.html tool_dev_url: https://github.com/BoevaLab/FREEC/ doi: "10.1093/bioinformatics/btq635" licence: ["GPL >=2"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - ratio: - type: file - description: ratio file generated by FREEC - pattern: "*.ratio.txt" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ratio: + type: file + description: ratio file generated by FREEC + pattern: "*.ratio.txt" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - circos: - type: file - description: Txt file - pattern: "*.circos.txt" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.circos.txt": + type: file + description: Txt file + pattern: "*.circos.txt" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" maintainers: diff --git a/modules/nf-core/controlfreec/makegraph2/environment.yml b/modules/nf-core/controlfreec/makegraph2/environment.yml index 720c2e950e..f6b64529bc 100644 --- a/modules/nf-core/controlfreec/makegraph2/environment.yml +++ b/modules/nf-core/controlfreec/makegraph2/environment.yml @@ -1,7 +1,5 @@ -name: controlfreec_makegraph2 channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::control-freec=11.6b diff --git a/modules/nf-core/controlfreec/makegraph2/meta.yml b/modules/nf-core/controlfreec/makegraph2/meta.yml index 1325da4a8b..6614b39ba9 100644 --- a/modules/nf-core/controlfreec/makegraph2/meta.yml +++ b/modules/nf-core/controlfreec/makegraph2/meta.yml @@ -8,51 +8,65 @@ keywords: - tumor-only tools: - controlfreec: - description: Copy number and genotype annotation from whole genome and whole exome sequencing data. + description: Copy number and genotype annotation from whole genome and whole exome + sequencing data. homepage: http://boevalab.inf.ethz.ch/FREEC documentation: http://boevalab.inf.ethz.ch/FREEC/tutorial.html tool_dev_url: https://github.com/BoevaLab/FREEC/ doi: "10.1093/bioinformatics/btq635" licence: ["GPL >=2"] + identifier: "" input: # Only when we have meta - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - ratio: - type: file - description: ratio file generated by FREEC - pattern: "*.ratio.txt" - - baf: - type: file - description: .BAF file generated by FREEC - pattern: "*.BAF" - + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ratio: + type: file + description: ratio file generated by FREEC + pattern: "*.ratio.txt" + - baf: + type: file + description: .BAF file generated by FREEC + pattern: "*.BAF" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - png_baf: - type: file - description: Image of BAF plot - pattern: "*_BAF.png" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*_BAF.png": + type: file + description: Image of BAF plot + pattern: "*_BAF.png" - png_ratio_log2: - type: file - description: Image of ratio log2 plot - pattern: "*_ratio.log2.png" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*_ratio.log2.png": + type: file + description: Image of ratio log2 plot + pattern: "*_ratio.log2.png" - png_ratio: - type: file - description: Image of ratio plot - pattern: "*_ratio.png" - + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*_ratio.png": + type: file + description: Image of ratio plot + pattern: "*_ratio.png" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" diff --git a/modules/nf-core/deepvariant/README.md b/modules/nf-core/deepvariant/README.md deleted file mode 100644 index ca112a7d33..0000000000 --- a/modules/nf-core/deepvariant/README.md +++ /dev/null @@ -1,9 +0,0 @@ -# Conda is not supported at the moment - -The [bioconda](https://bioconda.github.io/recipes/deepvariant/README.html) recipe is not fully working as expected - -Hence, we are using the docker container provided by the authors of the tool: - -- [google/deepvariant](https://hub.docker.com/r/google/deepvariant) - -This image is mirrored on the [nf-core quay.io](https://quay.io/repository/nf-core/deepvariant) for convenience. diff --git a/modules/nf-core/deepvariant/meta.yml b/modules/nf-core/deepvariant/meta.yml deleted file mode 100644 index a50dc57d9a..0000000000 --- a/modules/nf-core/deepvariant/meta.yml +++ /dev/null @@ -1,83 +0,0 @@ -name: deepvariant -description: DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data -keywords: - - variant calling - - machine learning - - neural network -tools: - - deepvariant: - description: DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data - homepage: https://github.com/google/deepvariant - documentation: https://github.com/google/deepvariant - tool_dev_url: https://github.com/google/deepvariant - doi: "10.1038/nbt.4235" - licence: ["BSD-3-clause"] -input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM file - pattern: "*.bam/cram" - - index: - type: file - description: Index of BAM/CRAM file - pattern: "*.bai/crai" - - interval: - type: file - description: Interval file for targeted regions - pattern: "*.bed" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - meta3: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fai: - type: file - description: Index of reference fasta file - pattern: "*.fai" - - meta4: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - gzi: - type: file - description: GZI index of reference fasta file - pattern: "*.gzi" -output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: Compressed VCF file - pattern: "*.vcf.gz" - - gvcf: - type: file - description: Compressed GVCF file - pattern: "*.g.vcf.gz" - - version: - type: file - description: File containing software version - pattern: "*.{version.txt}" -authors: - - "@abhi18av" - - "@ramprasadn" -maintainers: - - "@abhi18av" - - "@ramprasadn" diff --git a/modules/nf-core/deepvariant/main.nf b/modules/nf-core/deepvariant/rundeepvariant/main.nf similarity index 71% rename from modules/nf-core/deepvariant/main.nf rename to modules/nf-core/deepvariant/rundeepvariant/main.nf index 507b6c1174..7f99c53f6d 100644 --- a/modules/nf-core/deepvariant/main.nf +++ b/modules/nf-core/deepvariant/rundeepvariant/main.nf @@ -1,15 +1,18 @@ -process DEEPVARIANT { +process DEEPVARIANT_RUNDEEPVARIANT { tag "$meta.id" label 'process_high' - //Conda is not supported at the moment - container "nf-core/deepvariant:1.5.0" + // FIXME Conda is not supported at the moment + // BUG 
https://github.com/nf-core/modules/issues/1754 + // BUG https://github.com/bioconda/bioconda-recipes/issues/30310 + container "nf-core/deepvariant:1.6.1" input: tuple val(meta), path(input), path(index), path(intervals) tuple val(meta2), path(fasta) tuple val(meta3), path(fai) tuple val(meta4), path(gzi) + tuple val(meta5), path(par_bed) output: tuple val(meta), path("${prefix}.vcf.gz") , emit: vcf @@ -29,6 +32,10 @@ process DEEPVARIANT { def args = task.ext.args ?: '' prefix = task.ext.prefix ?: "${meta.id}" def regions = intervals ? "--regions=${intervals}" : "" + def par_regions = par_bed ? "--par_regions_bed=${par_bed}" : "" + // WARN https://github.com/nf-core/modules/pull/5801#issuecomment-2194293755 + // FIXME Revert this on next version bump + def VERSION = '1.6.1' """ /opt/deepvariant/bin/run_deepvariant \\ @@ -38,12 +45,13 @@ process DEEPVARIANT { --output_gvcf=${prefix}.g.vcf.gz \\ ${args} \\ ${regions} \\ - --intermediate_results_dir=. \\ + ${par_regions} \\ + --intermediate_results_dir=tmp \\ --num_shards=${task.cpus} cat <<-END_VERSIONS > versions.yml "${task.process}": - deepvariant: \$(echo \$(/opt/deepvariant/bin/run_deepvariant --version) | sed 's/^.*version //; s/ .*\$//' ) + deepvariant: $VERSION END_VERSIONS """ @@ -53,6 +61,9 @@ process DEEPVARIANT { error "DEEPVARIANT module does not support Conda. Please use Docker / Singularity / Podman instead." 
} prefix = task.ext.prefix ?: "${meta.id}" + // WARN https://github.com/nf-core/modules/pull/5801#issuecomment-2194293755 + // FIXME Revert this on next version bump + def VERSION = '1.6.1' """ touch ${prefix}.vcf.gz touch ${prefix}.vcf.gz.tbi @@ -61,7 +72,7 @@ process DEEPVARIANT { cat <<-END_VERSIONS > versions.yml "${task.process}": - deepvariant: \$(echo \$(/opt/deepvariant/bin/run_deepvariant --version) | sed 's/^.*version //; s/ .*\$//' ) + deepvariant: $VERSION END_VERSIONS """ } diff --git a/modules/nf-core/deepvariant/rundeepvariant/meta.yml b/modules/nf-core/deepvariant/rundeepvariant/meta.yml new file mode 100644 index 0000000000..29b45ff917 --- /dev/null +++ b/modules/nf-core/deepvariant/rundeepvariant/meta.yml @@ -0,0 +1,122 @@ +name: deepvariant_rundeepvariant +description: DeepVariant is an analysis pipeline that uses a deep neural network to + call genetic variants from next-generation DNA sequencing data +keywords: + - variant calling + - machine learning + - neural network +tools: + - deepvariant: + description: DeepVariant is an analysis pipeline that uses a deep neural network + to call genetic variants from next-generation DNA sequencing data + homepage: https://github.com/google/deepvariant + documentation: https://github.com/google/deepvariant + tool_dev_url: https://github.com/google/deepvariant + doi: "10.1038/nbt.4235" + licence: ["BSD-3-clause"] + identifier: biotools:deepvariant +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM file + pattern: "*.bam/cram" + - index: + type: file + description: Index of BAM/CRAM file + pattern: "*.bai/crai" + - intervals: + type: file + description: file containing intervals + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. 
[ id:'genome' ] + - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fai: + type: file + description: Index of reference fasta file + pattern: "*.fai" + - - meta4: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - gzi: + type: file + description: GZI index of reference fasta file + pattern: "*.gzi" + - - meta5: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - par_bed: + type: file + description: BED file containing PAR regions + pattern: "*.bed" +output: + - vcf: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.vcf.gz: + type: file + description: Compressed VCF file + pattern: "*.vcf.gz" + - vcf_tbi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.vcf.gz.tbi: + type: file + description: Tabix index file of compressed VCF + pattern: "*.vcf.gz.tbi" + - gvcf: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.g.vcf.gz: + type: file + description: Compressed GVCF file + pattern: "*.g.vcf.gz" + - gvcf_tbi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}.g.vcf.gz.tbi: + type: file + description: Tabix index file of compressed GVCF + pattern: "*.g.vcf.gz.tbi" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@abhi18av" + - "@ramprasadn" +maintainers: + - "@abhi18av" + - "@ramprasadn" diff --git a/modules/nf-core/deepvariant/tests/main.nf.test b/modules/nf-core/deepvariant/rundeepvariant/tests/main.nf.test similarity index 67% rename from modules/nf-core/deepvariant/tests/main.nf.test rename to modules/nf-core/deepvariant/rundeepvariant/tests/main.nf.test index 91612c1e28..0790fb8136 100644 --- a/modules/nf-core/deepvariant/tests/main.nf.test +++ b/modules/nf-core/deepvariant/rundeepvariant/tests/main.nf.test @@ -1,9 +1,10 @@ nextflow_process { - name "Test Process DEEPVARIANT" + name "Test Process DEEPVARIANT_RUNDEEPVARIANT" script "../main.nf" - process "DEEPVARIANT" + process "DEEPVARIANT_RUNDEEPVARIANT" + tag "deepvariant/rundeepvariant" tag "deepvariant" tag "modules" tag "modules_nfcore" @@ -31,6 +32,9 @@ nextflow_process { input[3] = [ [],[] ] + input[4] = [ + [],[] + ] """ } } @@ -66,6 +70,48 @@ nextflow_process { input[3] = [ [],[] ] + input[4] = [ + [],[] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("homo_sapiens - [cram, crai, genome_bed] - fasta - fai - par_bed") { + config "./nextflow-non-autosomal-calling.config" + tag "test" + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + '/genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + '/genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), + file(params.modules_testdata_base_path + '/genomics/homo_sapiens/genome/genome.bed', checkIfExists: true) 
+ ] + input[1] = [ + [ id:'genome'], + file(params.modules_testdata_base_path + '/genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) + ] + input[2] = [ + [ id:'genome'], + file(params.modules_testdata_base_path + '/genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true) + ] + input[3] = [ + [],[] + ] + input[4] = [ + [ id:'genome'], + file(params.modules_testdata_base_path + '/genomics/homo_sapiens/genome/genome.blacklist_intervals.bed', checkIfExists: true) + ] """ } } @@ -102,6 +148,9 @@ nextflow_process { [ id:'genome'], file(params.modules_testdata_base_path + '/genomics/homo_sapiens/genome/genome.fasta.gz.gzi', checkIfExists: true) ] + input[4] = [ + [],[] + ] """ } } @@ -114,4 +163,4 @@ nextflow_process { } } -} \ No newline at end of file +} diff --git a/modules/nf-core/deepvariant/rundeepvariant/tests/main.nf.test.snap b/modules/nf-core/deepvariant/rundeepvariant/tests/main.nf.test.snap new file mode 100644 index 0000000000..1ec351eecc --- /dev/null +++ b/modules/nf-core/deepvariant/rundeepvariant/tests/main.nf.test.snap @@ -0,0 +1,358 @@ +{ + "homo_sapiens - [bam, bai] - fasta_gz - fasta_gz_fai": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.vcf.gz:md5,8b8ab4a675f01e437aa72e1438a717d0" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.vcf.gz.tbi:md5,0000833138104e87b05eaa906821eb21" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.g.vcf.gz:md5,0a629e1745926cfcedf4b169046a921a" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.g.vcf.gz.tbi:md5,49503913c28ec70a6f4aa52f6b357b4d" + ] + ], + "4": [ + "versions.yml:md5,a251d8d9f5e8b737d8298eead96c0890" + ], + "gvcf": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.g.vcf.gz:md5,0a629e1745926cfcedf4b169046a921a" + ] + ], + "gvcf_tbi": [ + [ + { + "id": "test", + "single_end": false + }, + 
"test_out.g.vcf.gz.tbi:md5,49503913c28ec70a6f4aa52f6b357b4d" + ] + ], + "vcf": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.vcf.gz:md5,8b8ab4a675f01e437aa72e1438a717d0" + ] + ], + "vcf_tbi": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.vcf.gz.tbi:md5,0000833138104e87b05eaa906821eb21" + ] + ], + "versions": [ + "versions.yml:md5,a251d8d9f5e8b737d8298eead96c0890" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-29T11:36:27.325363" + }, + "homo_sapiens - [bam, bai] - fasta - fai": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.vcf.gz:md5,8b8ab4a675f01e437aa72e1438a717d0" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.vcf.gz.tbi:md5,0000833138104e87b05eaa906821eb21" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.g.vcf.gz:md5,0a629e1745926cfcedf4b169046a921a" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.g.vcf.gz.tbi:md5,49503913c28ec70a6f4aa52f6b357b4d" + ] + ], + "4": [ + "versions.yml:md5,a251d8d9f5e8b737d8298eead96c0890" + ], + "gvcf": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.g.vcf.gz:md5,0a629e1745926cfcedf4b169046a921a" + ] + ], + "gvcf_tbi": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.g.vcf.gz.tbi:md5,49503913c28ec70a6f4aa52f6b357b4d" + ] + ], + "vcf": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.vcf.gz:md5,8b8ab4a675f01e437aa72e1438a717d0" + ] + ], + "vcf_tbi": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.vcf.gz.tbi:md5,0000833138104e87b05eaa906821eb21" + ] + ], + "versions": [ + "versions.yml:md5,a251d8d9f5e8b737d8298eead96c0890" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-29T11:34:41.779153" + }, + "homo_sapiens - [cram, crai, genome_bed] - fasta - fai": { + "content": [ + { + "0": [ + 
[ + { + "id": "test", + "single_end": false + }, + "test_out.vcf.gz:md5,8b8ab4a675f01e437aa72e1438a717d0" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.vcf.gz.tbi:md5,0000833138104e87b05eaa906821eb21" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.g.vcf.gz:md5,0a629e1745926cfcedf4b169046a921a" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.g.vcf.gz.tbi:md5,49503913c28ec70a6f4aa52f6b357b4d" + ] + ], + "4": [ + "versions.yml:md5,a251d8d9f5e8b737d8298eead96c0890" + ], + "gvcf": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.g.vcf.gz:md5,0a629e1745926cfcedf4b169046a921a" + ] + ], + "gvcf_tbi": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.g.vcf.gz.tbi:md5,49503913c28ec70a6f4aa52f6b357b4d" + ] + ], + "vcf": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.vcf.gz:md5,8b8ab4a675f01e437aa72e1438a717d0" + ] + ], + "vcf_tbi": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.vcf.gz.tbi:md5,0000833138104e87b05eaa906821eb21" + ] + ], + "versions": [ + "versions.yml:md5,a251d8d9f5e8b737d8298eead96c0890" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-29T11:35:16.993129" + }, + "homo_sapiens - [cram, crai, genome_bed] - fasta - fai - par_bed": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.vcf.gz:md5,d2e26d65dbbcea9b087ed191b5c9841c" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.vcf.gz.tbi:md5,0801296d0356415bbf1ef8deb4ec84c3" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.g.vcf.gz:md5,4fcaa9a8b55730d191382160c2b5bb0a" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.g.vcf.gz.tbi:md5,f468e846904733b3231ecf00ef7cd4a2" + ] + ], + "4": [ + "versions.yml:md5,a251d8d9f5e8b737d8298eead96c0890" + ], + "gvcf": [ + [ + { + 
"id": "test", + "single_end": false + }, + "test_out.g.vcf.gz:md5,4fcaa9a8b55730d191382160c2b5bb0a" + ] + ], + "gvcf_tbi": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.g.vcf.gz.tbi:md5,f468e846904733b3231ecf00ef7cd4a2" + ] + ], + "vcf": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.vcf.gz:md5,d2e26d65dbbcea9b087ed191b5c9841c" + ] + ], + "vcf_tbi": [ + [ + { + "id": "test", + "single_end": false + }, + "test_out.vcf.gz.tbi:md5,0801296d0356415bbf1ef8deb4ec84c3" + ] + ], + "versions": [ + "versions.yml:md5,a251d8d9f5e8b737d8298eead96c0890" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-29T11:35:52.23093" + } +} \ No newline at end of file diff --git a/modules/nf-core/deepvariant/tests/nextflow-intervals.config b/modules/nf-core/deepvariant/rundeepvariant/tests/nextflow-intervals.config similarity index 70% rename from modules/nf-core/deepvariant/tests/nextflow-intervals.config rename to modules/nf-core/deepvariant/rundeepvariant/tests/nextflow-intervals.config index 6d135f9f10..78d8d5982a 100644 --- a/modules/nf-core/deepvariant/tests/nextflow-intervals.config +++ b/modules/nf-core/deepvariant/rundeepvariant/tests/nextflow-intervals.config @@ -1,6 +1,6 @@ process { - withName: DEEPVARIANT { + withName: DEEPVARIANT_RUNDEEPVARIANT { ext.args = '--model_type=WGS ' ext.prefix = { "${meta.id}_out" } } diff --git a/modules/nf-core/deepvariant/rundeepvariant/tests/nextflow-non-autosomal-calling.config b/modules/nf-core/deepvariant/rundeepvariant/tests/nextflow-non-autosomal-calling.config new file mode 100644 index 0000000000..6d265292ab --- /dev/null +++ b/modules/nf-core/deepvariant/rundeepvariant/tests/nextflow-non-autosomal-calling.config @@ -0,0 +1,8 @@ +process { + + withName: DEEPVARIANT_RUNDEEPVARIANT { + ext.args = '--model_type=WGS --haploid_contigs chr22' + ext.prefix = { "${meta.id}_out" } + } + +} diff --git a/modules/nf-core/deepvariant/tests/nextflow.config 
b/modules/nf-core/deepvariant/rundeepvariant/tests/nextflow.config similarity index 75% rename from modules/nf-core/deepvariant/tests/nextflow.config rename to modules/nf-core/deepvariant/rundeepvariant/tests/nextflow.config index d335d30b54..77e355cae8 100644 --- a/modules/nf-core/deepvariant/tests/nextflow.config +++ b/modules/nf-core/deepvariant/rundeepvariant/tests/nextflow.config @@ -1,6 +1,6 @@ process { - withName: DEEPVARIANT { + withName: DEEPVARIANT_RUNDEEPVARIANT { ext.args = ' --regions=\"chr22:0-40001\" --model_type=WGS ' ext.prefix = { "${meta.id}_out" } } diff --git a/modules/nf-core/deepvariant/rundeepvariant/tests/tags.yml b/modules/nf-core/deepvariant/rundeepvariant/tests/tags.yml new file mode 100644 index 0000000000..958b8e4149 --- /dev/null +++ b/modules/nf-core/deepvariant/rundeepvariant/tests/tags.yml @@ -0,0 +1,2 @@ +deepvariant/rundeepvariant: + - modules/nf-core/deepvariant/rundeepvariant/** diff --git a/modules/nf-core/deepvariant/tests/main.nf.test.snap b/modules/nf-core/deepvariant/tests/main.nf.test.snap deleted file mode 100644 index 6ad76ae4c3..0000000000 --- a/modules/nf-core/deepvariant/tests/main.nf.test.snap +++ /dev/null @@ -1,269 +0,0 @@ -{ - "homo_sapiens - [bam, bai] - fasta_gz - fasta_gz_fai": { - "content": [ - { - "0": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.vcf.gz:md5,7cba1516f7cf0888765d5ee8caf275f4" - ] - ], - "1": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.vcf.gz.tbi:md5,02a78562bc83520a51010a01fb06f217" - ] - ], - "2": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.g.vcf.gz:md5,8d6ac08997bfd848a0a4d9d295e76952" - ] - ], - "3": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.g.vcf.gz.tbi:md5,37e2d8f4cca0a21113cede608f54885a" - ] - ], - "4": [ - "versions.yml:md5,4678f778b58276933b165fe3e84afc6a" - ], - "gvcf": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.g.vcf.gz:md5,8d6ac08997bfd848a0a4d9d295e76952" - ] - ], - 
"gvcf_tbi": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.g.vcf.gz.tbi:md5,37e2d8f4cca0a21113cede608f54885a" - ] - ], - "vcf": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.vcf.gz:md5,7cba1516f7cf0888765d5ee8caf275f4" - ] - ], - "vcf_tbi": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.vcf.gz.tbi:md5,02a78562bc83520a51010a01fb06f217" - ] - ], - "versions": [ - "versions.yml:md5,4678f778b58276933b165fe3e84afc6a" - ] - } - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-03-20T13:54:42.757335334" - }, - "homo_sapiens - [bam, bai] - fasta - fai": { - "content": [ - { - "0": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.vcf.gz:md5,7cba1516f7cf0888765d5ee8caf275f4" - ] - ], - "1": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.vcf.gz.tbi:md5,02a78562bc83520a51010a01fb06f217" - ] - ], - "2": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.g.vcf.gz:md5,8d6ac08997bfd848a0a4d9d295e76952" - ] - ], - "3": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.g.vcf.gz.tbi:md5,37e2d8f4cca0a21113cede608f54885a" - ] - ], - "4": [ - "versions.yml:md5,4678f778b58276933b165fe3e84afc6a" - ], - "gvcf": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.g.vcf.gz:md5,8d6ac08997bfd848a0a4d9d295e76952" - ] - ], - "gvcf_tbi": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.g.vcf.gz.tbi:md5,37e2d8f4cca0a21113cede608f54885a" - ] - ], - "vcf": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.vcf.gz:md5,7cba1516f7cf0888765d5ee8caf275f4" - ] - ], - "vcf_tbi": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.vcf.gz.tbi:md5,02a78562bc83520a51010a01fb06f217" - ] - ], - "versions": [ - "versions.yml:md5,4678f778b58276933b165fe3e84afc6a" - ] - } - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-03-20T13:54:18.409489045" - }, - "homo_sapiens - 
[cram, crai, genome_bed] - fasta - fai": { - "content": [ - { - "0": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.vcf.gz:md5,7cba1516f7cf0888765d5ee8caf275f4" - ] - ], - "1": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.vcf.gz.tbi:md5,02a78562bc83520a51010a01fb06f217" - ] - ], - "2": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.g.vcf.gz:md5,8d6ac08997bfd848a0a4d9d295e76952" - ] - ], - "3": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.g.vcf.gz.tbi:md5,37e2d8f4cca0a21113cede608f54885a" - ] - ], - "4": [ - "versions.yml:md5,4678f778b58276933b165fe3e84afc6a" - ], - "gvcf": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.g.vcf.gz:md5,8d6ac08997bfd848a0a4d9d295e76952" - ] - ], - "gvcf_tbi": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.g.vcf.gz.tbi:md5,37e2d8f4cca0a21113cede608f54885a" - ] - ], - "vcf": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.vcf.gz:md5,7cba1516f7cf0888765d5ee8caf275f4" - ] - ], - "vcf_tbi": [ - [ - { - "id": "test", - "single_end": false - }, - "test_out.vcf.gz.tbi:md5,02a78562bc83520a51010a01fb06f217" - ] - ], - "versions": [ - "versions.yml:md5,4678f778b58276933b165fe3e84afc6a" - ] - } - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-03-20T13:54:30.523871801" - } -} diff --git a/modules/nf-core/deepvariant/tests/tags.yml b/modules/nf-core/deepvariant/tests/tags.yml deleted file mode 100644 index 8e838c7ba2..0000000000 --- a/modules/nf-core/deepvariant/tests/tags.yml +++ /dev/null @@ -1,2 +0,0 @@ -deepvariant: - - modules/nf-core/deepvariant/** diff --git a/modules/nf-core/dragmap/align/dragmap-align.diff b/modules/nf-core/dragmap/align/dragmap-align.diff index 18a36816d3..cca557b628 100644 --- a/modules/nf-core/dragmap/align/dragmap-align.diff +++ b/modules/nf-core/dragmap/align/dragmap-align.diff @@ -1,4 +1,5 @@ Changes in module 'nf-core/dragmap/align' +Changes in 
'dragmap/align/main.nf': --- modules/nf-core/dragmap/align/main.nf +++ modules/nf-core/dragmap/align/main.nf @@ -4,8 +4,8 @@ @@ -13,18 +14,25 @@ Changes in module 'nf-core/dragmap/align' input: tuple val(meta) , path(reads) +'modules/nf-core/dragmap/align/meta.yml' is unchanged +Changes in 'dragmap/align/environment.yml': --- modules/nf-core/dragmap/align/environment.yml +++ modules/nf-core/dragmap/align/environment.yml -@@ -4,7 +4,7 @@ +@@ -1,8 +1,8 @@ + channels: + - conda-forge - bioconda - - defaults +- dependencies: - - dragmap=1.3.0 -+ - dragmap=1.2.1 - # renovate: datasource=conda depName=bioconda/samtools -- - samtools=1.18 - - pigz=2.8 -+ - samtools=1.19.2 +- - samtools=1.18 ++ - dragmap=1.2.1 + - pigz=2.3.4 ++ # renovate: datasource=conda depName=bioconda/samtools ++ - samtools=1.19.2 +'modules/nf-core/dragmap/align/tests/main.nf.test.snap' is unchanged +'modules/nf-core/dragmap/align/tests/main.nf.test' is unchanged +'modules/nf-core/dragmap/align/tests/tags.yml' is unchanged ************************************************************ diff --git a/modules/nf-core/dragmap/align/environment.yml b/modules/nf-core/dragmap/align/environment.yml index a443ce4455..507a1e9327 100644 --- a/modules/nf-core/dragmap/align/environment.yml +++ b/modules/nf-core/dragmap/align/environment.yml @@ -1,10 +1,8 @@ -name: dragmap_align channels: - conda-forge - bioconda - - defaults dependencies: - dragmap=1.2.1 - # renovate: datasource=conda depName=bioconda/samtools - - samtools=1.19.2 - pigz=2.3.4 + # renovate: datasource=conda depName=bioconda/samtools + - samtools=1.19.2 diff --git a/modules/nf-core/dragmap/align/meta.yml b/modules/nf-core/dragmap/align/meta.yml index 2270bd3397..80f020f58f 100644 --- a/modules/nf-core/dragmap/align/meta.yml +++ b/modules/nf-core/dragmap/align/meta.yml @@ -13,44 +13,104 @@ tools: documentation: https://github.com/Illumina/dragmap tool_dev_url: https://github.com/Illumina/dragmap#basic-command-line-usage licence: ["GPL v3"] + identifier: 
"" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - reads: - type: file - description: | - List of input FastQ files of size 1 and 2 for single-end and paired-end data, - respectively. - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'test', single_end:false ] - - hashmap: - type: file - description: DRAGMAP hash table - pattern: "Directory containing DRAGMAP hash table *.{cmp,.bin,.txt}" - - meta3: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome'] - - fasta: - type: file - description: Genome fasta reference files - pattern: "*.{fa,fasta,fna}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - reads: + type: file + description: | + List of input FastQ files of size 1 and 2 for single-end and paired-end data, + respectively. + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test', single_end:false ] + - hashmap: + type: file + description: DRAGMAP hash table + pattern: "Directory containing DRAGMAP hash table *.{cmp,.bin,.txt}" + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome'] + - fasta: + type: file + description: Genome fasta reference files + pattern: "*.{fa,fasta,fna}" + - - sort_bam: + type: boolean + description: Sort the BAM file output: + - sam: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.sam": + type: file + description: Output SAM file containing read alignments + pattern: "*.{sam}" - bam: - type: file - description: Output BAM file containing read alignments - pattern: "*.{bam}" + - meta: + type: file + description: Output BAM file containing read alignments + pattern: "*.{bam}" + - "*.bam": + type: file + description: Output BAM file containing read alignments + pattern: "*.{bam}" + - cram: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.cram": + type: file + description: Output CRAM file containing read alignments + pattern: "*.{cram}" + - crai: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.crai": + type: file + description: Index file for CRAM file + pattern: "*.{crai}" + - csi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.csi": + type: file + description: Index file for CRAM file + pattern: "*.{csi}" + - log: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.log": + type: file + description: Log file + pattern: "*.{log}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@edmundmiller" maintainers: diff --git a/modules/nf-core/dragmap/hashtable/dragmap-hashtable.diff b/modules/nf-core/dragmap/hashtable/dragmap-hashtable.diff index c2fba762b8..3e9ffb89c6 100644 --- a/modules/nf-core/dragmap/hashtable/dragmap-hashtable.diff +++ b/modules/nf-core/dragmap/hashtable/dragmap-hashtable.diff @@ -1,4 +1,5 @@ Changes in module 'nf-core/dragmap/hashtable' +Changes in 'dragmap/hashtable/main.nf': --- modules/nf-core/dragmap/hashtable/main.nf +++ modules/nf-core/dragmap/hashtable/main.nf @@ -4,8 +4,8 @@ @@ -13,13 +14,18 @@ Changes in module 'nf-core/dragmap/hashtable' input: tuple val(meta), path(fasta) +'modules/nf-core/dragmap/hashtable/meta.yml' is unchanged +Changes in 'dragmap/hashtable/environment.yml': --- modules/nf-core/dragmap/hashtable/environment.yml +++ modules/nf-core/dragmap/hashtable/environment.yml -@@ -4,4 +4,4 @@ +@@ -2,4 +2,4 @@ + - conda-forge - bioconda - - defaults dependencies: - - bioconda::dragmap=1.3.0 + - bioconda::dragmap=1.2.1 +'modules/nf-core/dragmap/hashtable/tests/main.nf.test.snap' is unchanged +'modules/nf-core/dragmap/hashtable/tests/main.nf.test' is unchanged +'modules/nf-core/dragmap/hashtable/tests/tags.yml' is unchanged ************************************************************ diff --git a/modules/nf-core/dragmap/hashtable/environment.yml b/modules/nf-core/dragmap/hashtable/environment.yml index 3c3d1404f4..9adca49bae 100644 --- a/modules/nf-core/dragmap/hashtable/environment.yml +++ b/modules/nf-core/dragmap/hashtable/environment.yml @@ -1,7 +1,5 @@ -name: dragmap_hashtable channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::dragmap=1.2.1 diff --git 
a/modules/nf-core/dragmap/hashtable/meta.yml b/modules/nf-core/dragmap/hashtable/meta.yml index 1d1f92f522..450a1e58e0 100644 --- a/modules/nf-core/dragmap/hashtable/meta.yml +++ b/modules/nf-core/dragmap/hashtable/meta.yml @@ -12,29 +12,32 @@ tools: documentation: https://github.com/Illumina/dragmap tool_dev_url: https://github.com/Illumina/dragmap#basic-command-line-usage licence: ["GPL v3"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'test', single_end:false ] - - fasta: - type: file - description: Input genome fasta file + - - meta: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: Input genome fasta file output: - - meta: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'test', single_end:false ] - hashmap: - type: file - description: DRAGMAP hash table - pattern: "*.{cmp,.bin,.txt}" + - meta: + type: map + description: | + Groovy Map containing reference information + e.g. 
[ id:'test', single_end:false ] + - dragmap: + type: file + description: DRAGMAP hash table + pattern: "*.{cmp,.bin,.txt}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@edmundmiller" maintainers: diff --git a/modules/nf-core/ensemblvep/download/environment.yml b/modules/nf-core/ensemblvep/download/environment.yml index 2ea8642fe4..3d36eb17c0 100644 --- a/modules/nf-core/ensemblvep/download/environment.yml +++ b/modules/nf-core/ensemblvep/download/environment.yml @@ -1,7 +1,5 @@ -name: ensemblvep_download channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::ensembl-vep=111.0 + - bioconda::ensembl-vep=113.0 diff --git a/modules/nf-core/ensemblvep/download/main.nf b/modules/nf-core/ensemblvep/download/main.nf index f9e025a552..0664a2dfb9 100644 --- a/modules/nf-core/ensemblvep/download/main.nf +++ b/modules/nf-core/ensemblvep/download/main.nf @@ -4,8 +4,8 @@ process ENSEMBLVEP_DOWNLOAD { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/ensembl-vep:111.0--pl5321h2a3209d_0' : - 'biocontainers/ensembl-vep:111.0--pl5321h2a3209d_0' }" + 'https://depot.galaxyproject.org/singularity/ensembl-vep:113.0--pl5321h2a3209d_0' : + 'biocontainers/ensembl-vep:113.0--pl5321h2a3209d_0' }" input: tuple val(meta), val(assembly), val(species), val(cache_version) diff --git a/modules/nf-core/ensemblvep/download/meta.yml b/modules/nf-core/ensemblvep/download/meta.yml index a4277ad7a7..8da9621cbf 100644 --- a/modules/nf-core/ensemblvep/download/meta.yml +++ b/modules/nf-core/ensemblvep/download/meta.yml @@ -1,5 +1,6 @@ name: ensemblvep_download -description: Ensembl Variant Effect Predictor (VEP). 
The cache downloading options are controlled through `task.ext.args`. +description: Ensembl Variant Effect Predictor (VEP). The cache downloading options + are controlled through `task.ext.args`. keywords: - annotation - cache @@ -12,33 +13,40 @@ tools: homepage: https://www.ensembl.org/info/docs/tools/vep/index.html documentation: https://www.ensembl.org/info/docs/tools/vep/script/index.html licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - assembly: - type: string - description: | - Genome assembly - - species: - type: string - description: | - Specie - - cache_version: - type: string - description: | - cache version + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - assembly: + type: string + description: | + Genome assembly + - species: + type: string + description: | + Specie + - cache_version: + type: string + description: | + cache version output: - cache: - type: file - description: cache - pattern: "*" + - meta: + type: file + description: cache + pattern: "*" + - prefix: + type: file + description: cache + pattern: "*" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@maxulysse" maintainers: diff --git a/modules/nf-core/ensemblvep/download/tests/main.nf.test b/modules/nf-core/ensemblvep/download/tests/main.nf.test index 760c9d5643..a558599578 100644 --- a/modules/nf-core/ensemblvep/download/tests/main.nf.test +++ b/modules/nf-core/ensemblvep/download/tests/main.nf.test @@ -16,7 +16,7 @@ nextflow_process { process { """ input[0] = Channel.of([ - [id:"111_WBcel235"], + [id:"113_WBcel235"], params.vep_genome, params.vep_species, params.vep_cache_version @@ -41,7 +41,7 @@ nextflow_process { process 
{ """ input[0] = Channel.of([ - [id:"111_WBcel235"], + [id:"113_WBcel235"], params.vep_genome, params.vep_species, params.vep_cache_version diff --git a/modules/nf-core/ensemblvep/download/tests/main.nf.test.snap b/modules/nf-core/ensemblvep/download/tests/main.nf.test.snap index 6ea596fbde..706bd28f19 100644 --- a/modules/nf-core/ensemblvep/download/tests/main.nf.test.snap +++ b/modules/nf-core/ensemblvep/download/tests/main.nf.test.snap @@ -5,282 +5,282 @@ "0": [ [ { - "id": "111_WBcel235" + "id": "113_WBcel235" }, [ [ [ [ - "1-1000000.gz:md5,4da54db2f781d08975630811fd831585", - "10000001-11000000.gz:md5,7bee73e51d08f5e6de1796614105c5de", - "1000001-2000000.gz:md5,e8beff9020e261d78988c37e804cc89a", - "11000001-12000000.gz:md5,1a31b2dcf58822e837cd91b7a74a8b4f", - "12000001-13000000.gz:md5,34522be2ee5bd3cf51a9b151c877fe35", - "13000001-14000000.gz:md5,4e5a7b604f8aea2ad9de79b75ed89a6b", - "14000001-15000000.gz:md5,65146be110ea58b64ab8518bcbfbde9d", - "15000001-16000000.gz:md5,a39fdb7b0b056c0254574353351722eb", - "2000001-3000000.gz:md5,b72a04531477615dc4d2c530d09e60df", - "3000001-4000000.gz:md5,50dae46f370e1468c8f8f2c97cc75f0b", - "4000001-5000000.gz:md5,e58e124073689212e5311fbd8ccce415", - "5000001-6000000.gz:md5,db27434dc8be9557f97aa09a95126836", - "6000001-7000000.gz:md5,a5131e3ee41b329eb55fb3849ddb1471", - "7000001-8000000.gz:md5,61e1fbe1a82892a0f9f6ee0380fa60bc", - "8000001-9000000.gz:md5,48166dd4518ec21f597b6acca73809bb", - "9000001-10000000.gz:md5,3e416c856f40f36ec0ed3f42a93b2267" + "1-1000000.gz:md5,cadcba92b0999210dd8d832505d2e4c4", + "10000001-11000000.gz:md5,998a75dd927d10d45f8eebeef5fc7a75", + "1000001-2000000.gz:md5,a5cb3adb1ec9f40eed6a355d1492ba9b", + "11000001-12000000.gz:md5,46e6917f51093e28cce061774b9ed158", + "12000001-13000000.gz:md5,0adffacf8482d6c224df27104f65c9d6", + "13000001-14000000.gz:md5,aee759d812fc900a980ab0c4c5bd0273", + "14000001-15000000.gz:md5,f65537a3f76c40e63b6deb0b6cdb09dc", + 
"15000001-16000000.gz:md5,379f092ad1afa888da1fc13e80535def", + "2000001-3000000.gz:md5,86839741524579fd089498d6bee44dff", + "3000001-4000000.gz:md5,509b28af3920427e951f00b6973b5df4", + "4000001-5000000.gz:md5,f606e69cf59b0bdf2b61653608d955a6", + "5000001-6000000.gz:md5,a14ce1e21856e4a77ed63c67cbdfb26a", + "6000001-7000000.gz:md5,e1a895d6e8b352182b53ed1d0ce6e24e", + "7000001-8000000.gz:md5,ddf91b60f636d26b68b6bab3520b6b32", + "8000001-9000000.gz:md5,57482b996f89e92bbd0196efa4915cd3", + "9000001-10000000.gz:md5,43b5d89f84236b49b384d7f37f928129" ], [ - "1-1000000.gz:md5,06b83c3bd2c651c5a8a96f0865d54a53", - "10000001-11000000.gz:md5,79b3348f860370d1697e6d4de128fca3", - "1000001-2000000.gz:md5,f98e31f3e25c0a419ebeee5b17527b00", - "11000001-12000000.gz:md5,5f23214fdef1f7637f6046dc751155ed", - "12000001-13000000.gz:md5,9a4422905679e543a92d47142b1acba1", - "13000001-14000000.gz:md5,c5db99e7d56f2115f8da8fe3af83314a", - "14000001-15000000.gz:md5,66b65af3732c0495dc74f3071203ac2f", - "15000001-16000000.gz:md5,d4c30dc42925cc92dc594d4145544f33", - "2000001-3000000.gz:md5,ff9b3fd1235468c738e9201e2fa98e08", - "3000001-4000000.gz:md5,c649308c7d3b4891a8c6f95e583f3a08", - "4000001-5000000.gz:md5,c53d2fa6695248d0a725ef70325aae91", - "5000001-6000000.gz:md5,5481fb4b60ebd97256f5d52c42aee0bd", - "6000001-7000000.gz:md5,94b82e096bacb091e0ef55bcd08b8ab8", - "7000001-8000000.gz:md5,83f571dfaf8b891bf27208466e6f7d8c", - "8000001-9000000.gz:md5,4f07e30c7d772544bc6e99bac371b97b", - "9000001-10000000.gz:md5,f1439765f6428ae99516d95dc5df3926" + "1-1000000.gz:md5,d18811781848f70baef0b0348190d7ce", + "10000001-11000000.gz:md5,19011165abc56233ea0c5b0e6938d9c9", + "1000001-2000000.gz:md5,5e720fa191f3c9ac799b6a071bcc4332", + "11000001-12000000.gz:md5,b19c46fb00ca13a2a31128bd1829ddf5", + "12000001-13000000.gz:md5,54354b0870ca96641c51ed63382da007", + "13000001-14000000.gz:md5,6954fdc223f58eb406e602752ab7d139", + "14000001-15000000.gz:md5,929275a1cfea883999dddc20931a2e72", + 
"15000001-16000000.gz:md5,5f5b783a589a1fd80cc565e6f339c540", + "2000001-3000000.gz:md5,54e476e0e9f4a5d973ee710fd824abc7", + "3000001-4000000.gz:md5,d78d4a63165429fdb3a61b7cdbd3c43a", + "4000001-5000000.gz:md5,983f8efcebb7f62d7e7b1b3c0573d43e", + "5000001-6000000.gz:md5,e2cd03ed5b67b8ee123e4c4958508fe4", + "6000001-7000000.gz:md5,d04bc9335ba39ace20bce936e3a5cdeb", + "7000001-8000000.gz:md5,9354b26a9ba94aa5bc30f537c22382fb", + "8000001-9000000.gz:md5,b227c6ef81ab72d211d25dc4f44813b9", + "9000001-10000000.gz:md5,a6d7f29edd7c22139403a11cac989b7a" ], [ - "1-1000000.gz:md5,dab3bbb82e7ecc5430319b7723b88436", - "10000001-11000000.gz:md5,a1af0b4ce9c2ff301ac0a935a4189c58", - "1000001-2000000.gz:md5,8a70e4d08b14a4cf19b03a0556a6cae8", - "11000001-12000000.gz:md5,e866bb880cd79e612dc2081635368017", - "12000001-13000000.gz:md5,1b5be2ca310afd8289561331d19ddf07", - "13000001-14000000.gz:md5,907d2bb3f6b9b75fca9e40f1220c0cb4", - "2000001-3000000.gz:md5,783bcc5957ca4853853c5cda0418dbcd", - "3000001-4000000.gz:md5,cb2df81caa4a20215bb793ca2a792e01", - "4000001-5000000.gz:md5,2046030a187c0a86c9be02531aea0ed7", - "5000001-6000000.gz:md5,395a977401540eb90607b31ecc786a0f", - "6000001-7000000.gz:md5,e6a97128bc38649fcfa5dcb0032a570e", - "7000001-8000000.gz:md5,08804df16d4cdeb5a733d2d6b07b058e", - "8000001-9000000.gz:md5,bba084d260c12613403d144073105d9b", - "9000001-10000000.gz:md5,c0775c413018ed7964f3042112fe4e26" + "1-1000000.gz:md5,2117acb322a117a9c5db85c072575331", + "10000001-11000000.gz:md5,646c9582b56eb12ddbb1dd35b25c3670", + "1000001-2000000.gz:md5,ee433e4e5e37b2d008c43e1af4be0f8d", + "11000001-12000000.gz:md5,962fd6e52046484b3b123f9380ed64e9", + "12000001-13000000.gz:md5,1abf2d695c829eb2c88e0d3dbc739a1c", + "13000001-14000000.gz:md5,a6e03bf867f5cc694174a230f1b13a6b", + "2000001-3000000.gz:md5,a5b250aa9e3ee8cecc23bea0e2fa19a1", + "3000001-4000000.gz:md5,1390a6d2a28a4861b282d36d0fb85660", + "4000001-5000000.gz:md5,4bc7106bb2661aea28613c31935a5c8f", + 
"5000001-6000000.gz:md5,7317d6fbb3c77d7cdd31e781afab8f7d", + "6000001-7000000.gz:md5,1a3b6fa586e570c16b4833e34b28751e", + "7000001-8000000.gz:md5,b7bcb06393682f621403afdf19bf87b4", + "8000001-9000000.gz:md5,0011675a8567d394da54a52480b35786", + "9000001-10000000.gz:md5,e4fa88e4ec57ed0c71fd21090d8aa17a" ], [ - "1-1000000.gz:md5,710a2c1ad83c3c9751a0f152cd98f02c", - "10000001-11000000.gz:md5,ac93a92e62043bddaa59973e26dac8e0", - "1000001-2000000.gz:md5,232ee5ab6c7295007ffc760f361e4c07", - "11000001-12000000.gz:md5,07e49ac7b78fbc29cd920b11a4d21afc", - "12000001-13000000.gz:md5,7849822cf3df022e9f273fb6a928aa8b", - "13000001-14000000.gz:md5,e3d3f0ee264694c72b3b989a542c5694", - "14000001-15000000.gz:md5,e5771b6c2fefe9f62b23c71ab873f94f", - "15000001-16000000.gz:md5,fa02114035f63a504c48005c498f8ec3", - "16000001-17000000.gz:md5,ef0fd03281afc2e636a64fd61df8c4dc", - "17000001-18000000.gz:md5,4483a6d74a07b1101dccee71a22213ed", - "2000001-3000000.gz:md5,f164cbfdc8bc12efd7c26df3935fe190", - "3000001-4000000.gz:md5,ff05a42801004994a4f30f081bc8945a", - "4000001-5000000.gz:md5,b63f51d54dc3cb4b214b54527dfa4234", - "5000001-6000000.gz:md5,6945f59a1fd50f1dfa8a2f1e55fecc12", - "6000001-7000000.gz:md5,3f27a3cb19ece2a9e87da6fe6476faaa", - "7000001-8000000.gz:md5,05bffc6e8af7a80fdd6cbc53e5512d0c", - "8000001-9000000.gz:md5,8a3594ee1eb15d7aa8edeb325e485dce", - "9000001-10000000.gz:md5,338bce56200690d450d5fcac65a91be7" + "1-1000000.gz:md5,a47af22d33275652036ddf7161699c7c", + "10000001-11000000.gz:md5,7fc129e7edbaa5be87306de417c2ef28", + "1000001-2000000.gz:md5,cbc12c339741df5ad06bf9a946be6c93", + "11000001-12000000.gz:md5,d1cc5e20e3d3402debdc102087a5407f", + "12000001-13000000.gz:md5,42c69c8e86d28151e9a8b1787dbee125", + "13000001-14000000.gz:md5,c7459d1789a833e8a898ebdbc607e7d8", + "14000001-15000000.gz:md5,5806b20108f56d9eeabcdd4f8450dca3", + "15000001-16000000.gz:md5,78e859f70026a05be43d48b9b272f287", + "16000001-17000000.gz:md5,539db7fc976bee4b6031f8dcb6a4641d", + 
"17000001-18000000.gz:md5,f3ea55e7552dc36734d6e8ba67d1e4c2", + "2000001-3000000.gz:md5,539013ecfdcd06eb653445f857265322", + "3000001-4000000.gz:md5,beb9701b402bd5ddc46a4da6e531f783", + "4000001-5000000.gz:md5,3f46efb2635850cc6c3d8ae51727a400", + "5000001-6000000.gz:md5,e11549bca12c5e2a7a208a997fda1c68", + "6000001-7000000.gz:md5,c0f3546c6859dc1a5fe9ff7f015ecd7e", + "7000001-8000000.gz:md5,344b72822f647819f4ee6b5afa9d7701", + "8000001-9000000.gz:md5,1c06d285ff5c53f89f073212343902b7", + "9000001-10000000.gz:md5,79140e754039c6d6fc6eeecddcf2aa8e" ], [ - "1-1000000.gz:md5,779fda1352c0b1d635aa752c185e6ce2" + "1-1000000.gz:md5,40ef48190d3269cd4112450bc717b1ef" ], [ - "1-1000000.gz:md5,267b3134411641d12bb6efcfd5e9d48d", - "10000001-11000000.gz:md5,4f7473abf622b57ca3c8d6de098759f7", - "1000001-2000000.gz:md5,5587d56269638b9cc88bdb7ae5dacc58", - "11000001-12000000.gz:md5,3da928f2caf09b9e3df58f9d1be2c541", - "12000001-13000000.gz:md5,eae3125082e1674f40fcfb8bb7da23e3", - "13000001-14000000.gz:md5,6a91eccbe0cf05737e2d2971d5618876", - "14000001-15000000.gz:md5,60c9b08ad4f674c0394a7c16dee2713e", - "15000001-16000000.gz:md5,b36c052923f3d3e3cc8db9c2852e1e16", - "16000001-17000000.gz:md5,0d0d7a8735aadda492912d032fd8733b", - "17000001-18000000.gz:md5,cf57648ef4cbc3325cff87e6b4b89653", - "18000001-19000000.gz:md5,5c649205050bfcbb4414db329659dadc", - "19000001-20000000.gz:md5,48838329ef3e8c26dd8b1ba82f925704", - "20000001-21000000.gz:md5,c7e1643d2880881fe5d44f718b8e6755", - "2000001-3000000.gz:md5,94849146caeca44e256aec58f1a914b8", - "3000001-4000000.gz:md5,4601fbb22fda2cbc4ed397707f8f7afd", - "4000001-5000000.gz:md5,7c8617d40b6d2e9a37802691f64f775e", - "5000001-6000000.gz:md5,0e81ba81f807f8429351e46dd8385e0e", - "6000001-7000000.gz:md5,6e86fccda1dc539e291bd5768ffd0dad", - "7000001-8000000.gz:md5,f0bf0306012d738fc71f7a26d7af2d81", - "8000001-9000000.gz:md5,0f3da0d0c714760bc2c39bc6696b74d1", - "9000001-10000000.gz:md5,4e0e2ba92378f6e1f261d5e59d8e3d9f" + 
"1-1000000.gz:md5,1a8739457c429931923ed77596a9ee54", + "10000001-11000000.gz:md5,316fa1d06fc1878b6a5995f4aee3e49d", + "1000001-2000000.gz:md5,3926c03a091850c909bd0ccfc7133c0b", + "11000001-12000000.gz:md5,29ca11d2f05051cc439a0d24a9db134c", + "12000001-13000000.gz:md5,a46f648554e91999652019516c933754", + "13000001-14000000.gz:md5,167b126d1c690a0e7e25fc5ccd09fb7c", + "14000001-15000000.gz:md5,645554c896133c476c3083302371bcf8", + "15000001-16000000.gz:md5,60fc48d9a7aff6286fc6630c46bcfebc", + "16000001-17000000.gz:md5,07e1750d1c95a61e96774d2cf3da4d89", + "17000001-18000000.gz:md5,59d084309f6a975ec1066a828b5845ba", + "18000001-19000000.gz:md5,868c12d305dbd4d04399ec7848804328", + "19000001-20000000.gz:md5,6de03a00061f6a88dcbbb8ed5fc0b8dc", + "20000001-21000000.gz:md5,732b956f13da9ef01f9de3355d12e28b", + "2000001-3000000.gz:md5,7c7528266c523cad419ea25e75d9566e", + "3000001-4000000.gz:md5,bb2283c0cfb0e4601fc535a4d51e6f2d", + "4000001-5000000.gz:md5,64c7f28f554414a88c886b0bcadb3c39", + "5000001-6000000.gz:md5,58e7106fe577a8b5e5c698445b4f0c33", + "6000001-7000000.gz:md5,2e309d12cf1c1c6276585f457ceeacc2", + "7000001-8000000.gz:md5,08cb0600f7806608f0103187a6c9c64e", + "8000001-9000000.gz:md5,869333c2615f714860d17d794640d4ad", + "9000001-10000000.gz:md5,b85cc861c6a3b30cf6f06c8af136b383" ], [ - "1-1000000.gz:md5,83a0a200cb053b1f28e41fe62068d49a", - "10000001-11000000.gz:md5,2f84058256242378b7d14ef526ed42ea", - "1000001-2000000.gz:md5,c655f70a1d16eef55a5003cdb63434ab", - "11000001-12000000.gz:md5,6d5c34d7a61fa4764c546d1b46a5c90d", - "12000001-13000000.gz:md5,84fceee383bb28edb8d749c744a10932", - "13000001-14000000.gz:md5,aadd01464ca36c813a831f5c2016ba02", - "14000001-15000000.gz:md5,94a5325ca01192d5eea218b30f933ec4", - "15000001-16000000.gz:md5,de84954b08d570585a448d8831c12e6d", - "16000001-17000000.gz:md5,be4954afe2236d62226307f8c9f95820", - "17000001-18000000.gz:md5,8fdacfb47fc5728eb27b22bbb8c9b301", - "2000001-3000000.gz:md5,83552c17d88c3986c56c681b0b49bb97", - 
"3000001-4000000.gz:md5,deb3bc27c8d431d60fc89a6fe49bcbc6", - "4000001-5000000.gz:md5,98de5bbb694c73f7ffde16fb92069117", - "5000001-6000000.gz:md5,7c5a116261bf41309b18c22b0cba5f52", - "6000001-7000000.gz:md5,fb0d2dc71bd0c9263ff23825d8a4ef64", - "7000001-8000000.gz:md5,2375dcd7787e7ca5d26442cea0ff6710", - "8000001-9000000.gz:md5,979f986c27b91a62873e639e3ebeae43", - "9000001-10000000.gz:md5,b80f6906a724e4b0d6c21dd4c77663fd" + "1-1000000.gz:md5,c8d97b084c159c3cb5be1fff4637dfce", + "10000001-11000000.gz:md5,f441f2af06fd4973749dfbfbef40fe1b", + "1000001-2000000.gz:md5,c42a1526a836cfacefb67e9217f648aa", + "11000001-12000000.gz:md5,264421c249c696b45c92e2611285fee7", + "12000001-13000000.gz:md5,e673d1fdbe7dc0d09bea3d11a5797d6a", + "13000001-14000000.gz:md5,88f4f84e63b362f1b4f800c48b37e82c", + "14000001-15000000.gz:md5,26282f2b305ed82fb9f8875e97361105", + "15000001-16000000.gz:md5,30b9132c2610d42919ba231d1adbef2a", + "16000001-17000000.gz:md5,3d0e975ccd1ae4e92bf1d9d915ed293f", + "17000001-18000000.gz:md5,7db5b3819da3df1e47fe757dc9c6f2ba", + "2000001-3000000.gz:md5,55f6130a8d5872bdc9f8eed231ad0f65", + "3000001-4000000.gz:md5,402b826dbf6993c207ad15483a44182b", + "4000001-5000000.gz:md5,43cf926d43db25af5724fb5077edfee1", + "5000001-6000000.gz:md5,f40276dbea3f6f9a75f9301d1253eb09", + "6000001-7000000.gz:md5,df0d2d38060d4e7c606072ae814b1f38", + "7000001-8000000.gz:md5,c4117cc51255c0a91c51ff43403f00f7", + "8000001-9000000.gz:md5,59a4ebadca27041634c58652c544c8dd", + "9000001-10000000.gz:md5,c54510616273a4d1bfa9d525dbbbca40" ], - "chr_synonyms.txt:md5,8a6fce00cc7817ec727c49b7954f10bc", - "info.txt:md5,33ccb74a030a9a345051628c337cb8af" + "chr_synonyms.txt:md5,d390f0bcc6fec9786bc66b75f2d4390b", + "info.txt:md5,249c88c7a71464e048cca0c4b2a21198" ] ] ] ] ], "1": [ - "versions.yml:md5,954fd177c394ba167d575a6aac47390b" + "versions.yml:md5,25f0fd61e1a90ecec5427a9400ad6bc9" ], "cache": [ [ { - "id": "111_WBcel235" + "id": "113_WBcel235" }, [ [ [ [ - 
"1-1000000.gz:md5,4da54db2f781d08975630811fd831585", - "10000001-11000000.gz:md5,7bee73e51d08f5e6de1796614105c5de", - "1000001-2000000.gz:md5,e8beff9020e261d78988c37e804cc89a", - "11000001-12000000.gz:md5,1a31b2dcf58822e837cd91b7a74a8b4f", - "12000001-13000000.gz:md5,34522be2ee5bd3cf51a9b151c877fe35", - "13000001-14000000.gz:md5,4e5a7b604f8aea2ad9de79b75ed89a6b", - "14000001-15000000.gz:md5,65146be110ea58b64ab8518bcbfbde9d", - "15000001-16000000.gz:md5,a39fdb7b0b056c0254574353351722eb", - "2000001-3000000.gz:md5,b72a04531477615dc4d2c530d09e60df", - "3000001-4000000.gz:md5,50dae46f370e1468c8f8f2c97cc75f0b", - "4000001-5000000.gz:md5,e58e124073689212e5311fbd8ccce415", - "5000001-6000000.gz:md5,db27434dc8be9557f97aa09a95126836", - "6000001-7000000.gz:md5,a5131e3ee41b329eb55fb3849ddb1471", - "7000001-8000000.gz:md5,61e1fbe1a82892a0f9f6ee0380fa60bc", - "8000001-9000000.gz:md5,48166dd4518ec21f597b6acca73809bb", - "9000001-10000000.gz:md5,3e416c856f40f36ec0ed3f42a93b2267" + "1-1000000.gz:md5,cadcba92b0999210dd8d832505d2e4c4", + "10000001-11000000.gz:md5,998a75dd927d10d45f8eebeef5fc7a75", + "1000001-2000000.gz:md5,a5cb3adb1ec9f40eed6a355d1492ba9b", + "11000001-12000000.gz:md5,46e6917f51093e28cce061774b9ed158", + "12000001-13000000.gz:md5,0adffacf8482d6c224df27104f65c9d6", + "13000001-14000000.gz:md5,aee759d812fc900a980ab0c4c5bd0273", + "14000001-15000000.gz:md5,f65537a3f76c40e63b6deb0b6cdb09dc", + "15000001-16000000.gz:md5,379f092ad1afa888da1fc13e80535def", + "2000001-3000000.gz:md5,86839741524579fd089498d6bee44dff", + "3000001-4000000.gz:md5,509b28af3920427e951f00b6973b5df4", + "4000001-5000000.gz:md5,f606e69cf59b0bdf2b61653608d955a6", + "5000001-6000000.gz:md5,a14ce1e21856e4a77ed63c67cbdfb26a", + "6000001-7000000.gz:md5,e1a895d6e8b352182b53ed1d0ce6e24e", + "7000001-8000000.gz:md5,ddf91b60f636d26b68b6bab3520b6b32", + "8000001-9000000.gz:md5,57482b996f89e92bbd0196efa4915cd3", + "9000001-10000000.gz:md5,43b5d89f84236b49b384d7f37f928129" ], [ - 
"1-1000000.gz:md5,06b83c3bd2c651c5a8a96f0865d54a53", - "10000001-11000000.gz:md5,79b3348f860370d1697e6d4de128fca3", - "1000001-2000000.gz:md5,f98e31f3e25c0a419ebeee5b17527b00", - "11000001-12000000.gz:md5,5f23214fdef1f7637f6046dc751155ed", - "12000001-13000000.gz:md5,9a4422905679e543a92d47142b1acba1", - "13000001-14000000.gz:md5,c5db99e7d56f2115f8da8fe3af83314a", - "14000001-15000000.gz:md5,66b65af3732c0495dc74f3071203ac2f", - "15000001-16000000.gz:md5,d4c30dc42925cc92dc594d4145544f33", - "2000001-3000000.gz:md5,ff9b3fd1235468c738e9201e2fa98e08", - "3000001-4000000.gz:md5,c649308c7d3b4891a8c6f95e583f3a08", - "4000001-5000000.gz:md5,c53d2fa6695248d0a725ef70325aae91", - "5000001-6000000.gz:md5,5481fb4b60ebd97256f5d52c42aee0bd", - "6000001-7000000.gz:md5,94b82e096bacb091e0ef55bcd08b8ab8", - "7000001-8000000.gz:md5,83f571dfaf8b891bf27208466e6f7d8c", - "8000001-9000000.gz:md5,4f07e30c7d772544bc6e99bac371b97b", - "9000001-10000000.gz:md5,f1439765f6428ae99516d95dc5df3926" + "1-1000000.gz:md5,d18811781848f70baef0b0348190d7ce", + "10000001-11000000.gz:md5,19011165abc56233ea0c5b0e6938d9c9", + "1000001-2000000.gz:md5,5e720fa191f3c9ac799b6a071bcc4332", + "11000001-12000000.gz:md5,b19c46fb00ca13a2a31128bd1829ddf5", + "12000001-13000000.gz:md5,54354b0870ca96641c51ed63382da007", + "13000001-14000000.gz:md5,6954fdc223f58eb406e602752ab7d139", + "14000001-15000000.gz:md5,929275a1cfea883999dddc20931a2e72", + "15000001-16000000.gz:md5,5f5b783a589a1fd80cc565e6f339c540", + "2000001-3000000.gz:md5,54e476e0e9f4a5d973ee710fd824abc7", + "3000001-4000000.gz:md5,d78d4a63165429fdb3a61b7cdbd3c43a", + "4000001-5000000.gz:md5,983f8efcebb7f62d7e7b1b3c0573d43e", + "5000001-6000000.gz:md5,e2cd03ed5b67b8ee123e4c4958508fe4", + "6000001-7000000.gz:md5,d04bc9335ba39ace20bce936e3a5cdeb", + "7000001-8000000.gz:md5,9354b26a9ba94aa5bc30f537c22382fb", + "8000001-9000000.gz:md5,b227c6ef81ab72d211d25dc4f44813b9", + "9000001-10000000.gz:md5,a6d7f29edd7c22139403a11cac989b7a" ], [ - 
"1-1000000.gz:md5,dab3bbb82e7ecc5430319b7723b88436", - "10000001-11000000.gz:md5,a1af0b4ce9c2ff301ac0a935a4189c58", - "1000001-2000000.gz:md5,8a70e4d08b14a4cf19b03a0556a6cae8", - "11000001-12000000.gz:md5,e866bb880cd79e612dc2081635368017", - "12000001-13000000.gz:md5,1b5be2ca310afd8289561331d19ddf07", - "13000001-14000000.gz:md5,907d2bb3f6b9b75fca9e40f1220c0cb4", - "2000001-3000000.gz:md5,783bcc5957ca4853853c5cda0418dbcd", - "3000001-4000000.gz:md5,cb2df81caa4a20215bb793ca2a792e01", - "4000001-5000000.gz:md5,2046030a187c0a86c9be02531aea0ed7", - "5000001-6000000.gz:md5,395a977401540eb90607b31ecc786a0f", - "6000001-7000000.gz:md5,e6a97128bc38649fcfa5dcb0032a570e", - "7000001-8000000.gz:md5,08804df16d4cdeb5a733d2d6b07b058e", - "8000001-9000000.gz:md5,bba084d260c12613403d144073105d9b", - "9000001-10000000.gz:md5,c0775c413018ed7964f3042112fe4e26" + "1-1000000.gz:md5,2117acb322a117a9c5db85c072575331", + "10000001-11000000.gz:md5,646c9582b56eb12ddbb1dd35b25c3670", + "1000001-2000000.gz:md5,ee433e4e5e37b2d008c43e1af4be0f8d", + "11000001-12000000.gz:md5,962fd6e52046484b3b123f9380ed64e9", + "12000001-13000000.gz:md5,1abf2d695c829eb2c88e0d3dbc739a1c", + "13000001-14000000.gz:md5,a6e03bf867f5cc694174a230f1b13a6b", + "2000001-3000000.gz:md5,a5b250aa9e3ee8cecc23bea0e2fa19a1", + "3000001-4000000.gz:md5,1390a6d2a28a4861b282d36d0fb85660", + "4000001-5000000.gz:md5,4bc7106bb2661aea28613c31935a5c8f", + "5000001-6000000.gz:md5,7317d6fbb3c77d7cdd31e781afab8f7d", + "6000001-7000000.gz:md5,1a3b6fa586e570c16b4833e34b28751e", + "7000001-8000000.gz:md5,b7bcb06393682f621403afdf19bf87b4", + "8000001-9000000.gz:md5,0011675a8567d394da54a52480b35786", + "9000001-10000000.gz:md5,e4fa88e4ec57ed0c71fd21090d8aa17a" ], [ - "1-1000000.gz:md5,710a2c1ad83c3c9751a0f152cd98f02c", - "10000001-11000000.gz:md5,ac93a92e62043bddaa59973e26dac8e0", - "1000001-2000000.gz:md5,232ee5ab6c7295007ffc760f361e4c07", - "11000001-12000000.gz:md5,07e49ac7b78fbc29cd920b11a4d21afc", - 
"12000001-13000000.gz:md5,7849822cf3df022e9f273fb6a928aa8b", - "13000001-14000000.gz:md5,e3d3f0ee264694c72b3b989a542c5694", - "14000001-15000000.gz:md5,e5771b6c2fefe9f62b23c71ab873f94f", - "15000001-16000000.gz:md5,fa02114035f63a504c48005c498f8ec3", - "16000001-17000000.gz:md5,ef0fd03281afc2e636a64fd61df8c4dc", - "17000001-18000000.gz:md5,4483a6d74a07b1101dccee71a22213ed", - "2000001-3000000.gz:md5,f164cbfdc8bc12efd7c26df3935fe190", - "3000001-4000000.gz:md5,ff05a42801004994a4f30f081bc8945a", - "4000001-5000000.gz:md5,b63f51d54dc3cb4b214b54527dfa4234", - "5000001-6000000.gz:md5,6945f59a1fd50f1dfa8a2f1e55fecc12", - "6000001-7000000.gz:md5,3f27a3cb19ece2a9e87da6fe6476faaa", - "7000001-8000000.gz:md5,05bffc6e8af7a80fdd6cbc53e5512d0c", - "8000001-9000000.gz:md5,8a3594ee1eb15d7aa8edeb325e485dce", - "9000001-10000000.gz:md5,338bce56200690d450d5fcac65a91be7" + "1-1000000.gz:md5,a47af22d33275652036ddf7161699c7c", + "10000001-11000000.gz:md5,7fc129e7edbaa5be87306de417c2ef28", + "1000001-2000000.gz:md5,cbc12c339741df5ad06bf9a946be6c93", + "11000001-12000000.gz:md5,d1cc5e20e3d3402debdc102087a5407f", + "12000001-13000000.gz:md5,42c69c8e86d28151e9a8b1787dbee125", + "13000001-14000000.gz:md5,c7459d1789a833e8a898ebdbc607e7d8", + "14000001-15000000.gz:md5,5806b20108f56d9eeabcdd4f8450dca3", + "15000001-16000000.gz:md5,78e859f70026a05be43d48b9b272f287", + "16000001-17000000.gz:md5,539db7fc976bee4b6031f8dcb6a4641d", + "17000001-18000000.gz:md5,f3ea55e7552dc36734d6e8ba67d1e4c2", + "2000001-3000000.gz:md5,539013ecfdcd06eb653445f857265322", + "3000001-4000000.gz:md5,beb9701b402bd5ddc46a4da6e531f783", + "4000001-5000000.gz:md5,3f46efb2635850cc6c3d8ae51727a400", + "5000001-6000000.gz:md5,e11549bca12c5e2a7a208a997fda1c68", + "6000001-7000000.gz:md5,c0f3546c6859dc1a5fe9ff7f015ecd7e", + "7000001-8000000.gz:md5,344b72822f647819f4ee6b5afa9d7701", + "8000001-9000000.gz:md5,1c06d285ff5c53f89f073212343902b7", + "9000001-10000000.gz:md5,79140e754039c6d6fc6eeecddcf2aa8e" ], [ - 
"1-1000000.gz:md5,779fda1352c0b1d635aa752c185e6ce2" + "1-1000000.gz:md5,40ef48190d3269cd4112450bc717b1ef" ], [ - "1-1000000.gz:md5,267b3134411641d12bb6efcfd5e9d48d", - "10000001-11000000.gz:md5,4f7473abf622b57ca3c8d6de098759f7", - "1000001-2000000.gz:md5,5587d56269638b9cc88bdb7ae5dacc58", - "11000001-12000000.gz:md5,3da928f2caf09b9e3df58f9d1be2c541", - "12000001-13000000.gz:md5,eae3125082e1674f40fcfb8bb7da23e3", - "13000001-14000000.gz:md5,6a91eccbe0cf05737e2d2971d5618876", - "14000001-15000000.gz:md5,60c9b08ad4f674c0394a7c16dee2713e", - "15000001-16000000.gz:md5,b36c052923f3d3e3cc8db9c2852e1e16", - "16000001-17000000.gz:md5,0d0d7a8735aadda492912d032fd8733b", - "17000001-18000000.gz:md5,cf57648ef4cbc3325cff87e6b4b89653", - "18000001-19000000.gz:md5,5c649205050bfcbb4414db329659dadc", - "19000001-20000000.gz:md5,48838329ef3e8c26dd8b1ba82f925704", - "20000001-21000000.gz:md5,c7e1643d2880881fe5d44f718b8e6755", - "2000001-3000000.gz:md5,94849146caeca44e256aec58f1a914b8", - "3000001-4000000.gz:md5,4601fbb22fda2cbc4ed397707f8f7afd", - "4000001-5000000.gz:md5,7c8617d40b6d2e9a37802691f64f775e", - "5000001-6000000.gz:md5,0e81ba81f807f8429351e46dd8385e0e", - "6000001-7000000.gz:md5,6e86fccda1dc539e291bd5768ffd0dad", - "7000001-8000000.gz:md5,f0bf0306012d738fc71f7a26d7af2d81", - "8000001-9000000.gz:md5,0f3da0d0c714760bc2c39bc6696b74d1", - "9000001-10000000.gz:md5,4e0e2ba92378f6e1f261d5e59d8e3d9f" + "1-1000000.gz:md5,1a8739457c429931923ed77596a9ee54", + "10000001-11000000.gz:md5,316fa1d06fc1878b6a5995f4aee3e49d", + "1000001-2000000.gz:md5,3926c03a091850c909bd0ccfc7133c0b", + "11000001-12000000.gz:md5,29ca11d2f05051cc439a0d24a9db134c", + "12000001-13000000.gz:md5,a46f648554e91999652019516c933754", + "13000001-14000000.gz:md5,167b126d1c690a0e7e25fc5ccd09fb7c", + "14000001-15000000.gz:md5,645554c896133c476c3083302371bcf8", + "15000001-16000000.gz:md5,60fc48d9a7aff6286fc6630c46bcfebc", + "16000001-17000000.gz:md5,07e1750d1c95a61e96774d2cf3da4d89", + 
"17000001-18000000.gz:md5,59d084309f6a975ec1066a828b5845ba", + "18000001-19000000.gz:md5,868c12d305dbd4d04399ec7848804328", + "19000001-20000000.gz:md5,6de03a00061f6a88dcbbb8ed5fc0b8dc", + "20000001-21000000.gz:md5,732b956f13da9ef01f9de3355d12e28b", + "2000001-3000000.gz:md5,7c7528266c523cad419ea25e75d9566e", + "3000001-4000000.gz:md5,bb2283c0cfb0e4601fc535a4d51e6f2d", + "4000001-5000000.gz:md5,64c7f28f554414a88c886b0bcadb3c39", + "5000001-6000000.gz:md5,58e7106fe577a8b5e5c698445b4f0c33", + "6000001-7000000.gz:md5,2e309d12cf1c1c6276585f457ceeacc2", + "7000001-8000000.gz:md5,08cb0600f7806608f0103187a6c9c64e", + "8000001-9000000.gz:md5,869333c2615f714860d17d794640d4ad", + "9000001-10000000.gz:md5,b85cc861c6a3b30cf6f06c8af136b383" ], [ - "1-1000000.gz:md5,83a0a200cb053b1f28e41fe62068d49a", - "10000001-11000000.gz:md5,2f84058256242378b7d14ef526ed42ea", - "1000001-2000000.gz:md5,c655f70a1d16eef55a5003cdb63434ab", - "11000001-12000000.gz:md5,6d5c34d7a61fa4764c546d1b46a5c90d", - "12000001-13000000.gz:md5,84fceee383bb28edb8d749c744a10932", - "13000001-14000000.gz:md5,aadd01464ca36c813a831f5c2016ba02", - "14000001-15000000.gz:md5,94a5325ca01192d5eea218b30f933ec4", - "15000001-16000000.gz:md5,de84954b08d570585a448d8831c12e6d", - "16000001-17000000.gz:md5,be4954afe2236d62226307f8c9f95820", - "17000001-18000000.gz:md5,8fdacfb47fc5728eb27b22bbb8c9b301", - "2000001-3000000.gz:md5,83552c17d88c3986c56c681b0b49bb97", - "3000001-4000000.gz:md5,deb3bc27c8d431d60fc89a6fe49bcbc6", - "4000001-5000000.gz:md5,98de5bbb694c73f7ffde16fb92069117", - "5000001-6000000.gz:md5,7c5a116261bf41309b18c22b0cba5f52", - "6000001-7000000.gz:md5,fb0d2dc71bd0c9263ff23825d8a4ef64", - "7000001-8000000.gz:md5,2375dcd7787e7ca5d26442cea0ff6710", - "8000001-9000000.gz:md5,979f986c27b91a62873e639e3ebeae43", - "9000001-10000000.gz:md5,b80f6906a724e4b0d6c21dd4c77663fd" + "1-1000000.gz:md5,c8d97b084c159c3cb5be1fff4637dfce", + "10000001-11000000.gz:md5,f441f2af06fd4973749dfbfbef40fe1b", + 
"1000001-2000000.gz:md5,c42a1526a836cfacefb67e9217f648aa", + "11000001-12000000.gz:md5,264421c249c696b45c92e2611285fee7", + "12000001-13000000.gz:md5,e673d1fdbe7dc0d09bea3d11a5797d6a", + "13000001-14000000.gz:md5,88f4f84e63b362f1b4f800c48b37e82c", + "14000001-15000000.gz:md5,26282f2b305ed82fb9f8875e97361105", + "15000001-16000000.gz:md5,30b9132c2610d42919ba231d1adbef2a", + "16000001-17000000.gz:md5,3d0e975ccd1ae4e92bf1d9d915ed293f", + "17000001-18000000.gz:md5,7db5b3819da3df1e47fe757dc9c6f2ba", + "2000001-3000000.gz:md5,55f6130a8d5872bdc9f8eed231ad0f65", + "3000001-4000000.gz:md5,402b826dbf6993c207ad15483a44182b", + "4000001-5000000.gz:md5,43cf926d43db25af5724fb5077edfee1", + "5000001-6000000.gz:md5,f40276dbea3f6f9a75f9301d1253eb09", + "6000001-7000000.gz:md5,df0d2d38060d4e7c606072ae814b1f38", + "7000001-8000000.gz:md5,c4117cc51255c0a91c51ff43403f00f7", + "8000001-9000000.gz:md5,59a4ebadca27041634c58652c544c8dd", + "9000001-10000000.gz:md5,c54510616273a4d1bfa9d525dbbbca40" ], - "chr_synonyms.txt:md5,8a6fce00cc7817ec727c49b7954f10bc", - "info.txt:md5,33ccb74a030a9a345051628c337cb8af" + "chr_synonyms.txt:md5,d390f0bcc6fec9786bc66b75f2d4390b", + "info.txt:md5,249c88c7a71464e048cca0c4b2a21198" ] ] ] ] ], "versions": [ - "versions.yml:md5,954fd177c394ba167d575a6aac47390b" + "versions.yml:md5,25f0fd61e1a90ecec5427a9400ad6bc9" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.03.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-04-15T17:20:01.921038" + "timestamp": "2024-10-21T09:09:48.574969389" }, "celegans - download - stub": { "content": [ @@ -288,7 +288,7 @@ "0": [ [ { - "id": "111_WBcel235" + "id": "113_WBcel235" }, [ @@ -296,12 +296,12 @@ ] ], "1": [ - "versions.yml:md5,954fd177c394ba167d575a6aac47390b" + "versions.yml:md5,25f0fd61e1a90ecec5427a9400ad6bc9" ], "cache": [ [ { - "id": "111_WBcel235" + "id": "113_WBcel235" }, [ @@ -309,14 +309,14 @@ ] ], "versions": [ - "versions.yml:md5,954fd177c394ba167d575a6aac47390b" + 
"versions.yml:md5,25f0fd61e1a90ecec5427a9400ad6bc9" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.03.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-04-15T18:09:54.909036" + "timestamp": "2024-10-21T09:10:03.728940123" } } \ No newline at end of file diff --git a/modules/nf-core/ensemblvep/download/tests/nextflow.config b/modules/nf-core/ensemblvep/download/tests/nextflow.config index 882bce41a0..0a4ae1a6a8 100644 --- a/modules/nf-core/ensemblvep/download/tests/nextflow.config +++ b/modules/nf-core/ensemblvep/download/tests/nextflow.config @@ -1,5 +1,5 @@ params { - vep_cache_version = "111" + vep_cache_version = "113" vep_genome = "WBcel235" vep_species = "caenorhabditis_elegans" } diff --git a/modules/nf-core/ensemblvep/vep/environment.yml b/modules/nf-core/ensemblvep/vep/environment.yml index 91457c0508..3d36eb17c0 100644 --- a/modules/nf-core/ensemblvep/vep/environment.yml +++ b/modules/nf-core/ensemblvep/vep/environment.yml @@ -1,7 +1,5 @@ -name: ensemblvep_vep channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::ensembl-vep=111.0 + - bioconda::ensembl-vep=113.0 diff --git a/modules/nf-core/ensemblvep/vep/main.nf b/modules/nf-core/ensemblvep/vep/main.nf index e82471aa1d..7d2c82ff0a 100644 --- a/modules/nf-core/ensemblvep/vep/main.nf +++ b/modules/nf-core/ensemblvep/vep/main.nf @@ -4,8 +4,8 @@ process ENSEMBLVEP_VEP { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/ensembl-vep:111.0--pl5321h2a3209d_0' : - 'biocontainers/ensembl-vep:111.0--pl5321h2a3209d_0' }" + 'https://depot.galaxyproject.org/singularity/ensembl-vep:113.0--pl5321h2a3209d_0' : + 'biocontainers/ensembl-vep:113.0--pl5321h2a3209d_0' }" input: tuple val(meta), path(vcf), path(custom_extra_files) @@ -20,7 +20,7 @@ process ENSEMBLVEP_VEP { tuple val(meta), path("*.vcf.gz") , optional:true, emit: vcf tuple val(meta), path("*.tab.gz") , optional:true, emit: tab tuple val(meta), path("*.json.gz") , optional:true, emit: json - path "*.summary.html" , optional:true, emit: report + path "*.html" , optional:true, emit: report path "versions.yml" , emit: versions when: @@ -60,7 +60,7 @@ process ENSEMBLVEP_VEP { echo "" | gzip > ${prefix}.vcf.gz echo "" | gzip > ${prefix}.tab.gz echo "" | gzip > ${prefix}.json.gz - touch ${prefix}.summary.html + touch ${prefix}_summary.html cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/ensemblvep/vep/meta.yml b/modules/nf-core/ensemblvep/vep/meta.yml index d8ff8d1443..9288a93849 100644 --- a/modules/nf-core/ensemblvep/vep/meta.yml +++ b/modules/nf-core/ensemblvep/vep/meta.yml @@ -1,5 +1,6 @@ name: ensemblvep_vep -description: Ensembl Variant Effect Predictor (VEP). The output-file-format is controlled through `task.ext.args`. +description: Ensembl Variant Effect Predictor (VEP). The output-file-format is controlled + through `task.ext.args`. keywords: - annotation - vcf @@ -13,75 +14,96 @@ tools: homepage: https://www.ensembl.org/info/docs/tools/vep/index.html documentation: https://www.ensembl.org/info/docs/tools/vep/script/index.html licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - vcf: - type: file - description: | - vcf to annotate - - custom_extra_files: - type: file - description: | - extra sample-specific files to be used with the `--custom` flag to be configured with ext.args - (optional) - - genome: - type: string - description: | - which genome to annotate with - - species: - type: string - description: | - which species to annotate with - - cache_version: - type: integer - description: | - which version of the cache to annotate with - - cache: - type: file - description: | - path to VEP cache (optional) - - meta2: - type: map - description: | - Groovy Map containing fasta reference information - e.g. [ id:'test' ] - - fasta: - type: file - description: | - reference FASTA file (optional) - pattern: "*.{fasta,fa}" - - extra_files: - type: file - description: | - path to file(s) needed for plugins (optional) + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: | + vcf to annotate + - custom_extra_files: + type: file + description: | + extra sample-specific files to be used with the `--custom` flag to be configured with ext.args + (optional) + - - genome: + type: string + description: | + which genome to annotate with + - - species: + type: string + description: | + which species to annotate with + - - cache_version: + type: integer + description: | + which version of the cache to annotate with + - - cache: + type: file + description: | + path to VEP cache (optional) + - - meta2: + type: map + description: | + Groovy Map containing fasta reference information + e.g. 
[ id:'test' ] + - fasta: + type: file + description: | + reference FASTA file (optional) + pattern: "*.{fasta,fa}" + - - extra_files: + type: file + description: | + path to file(s) needed for plugins (optional) output: - vcf: - type: file - description: | - annotated vcf (optional) - pattern: "*.ann.vcf.gz" + - meta: + type: file + description: | + annotated vcf (optional) + pattern: "*.ann.vcf.gz" + - "*.vcf.gz": + type: file + description: | + annotated vcf (optional) + pattern: "*.ann.vcf.gz" - tab: - type: file - description: | - tab file with annotated variants (optional) - pattern: "*.ann.tab.gz" + - meta: + type: file + description: | + tab file with annotated variants (optional) + pattern: "*.ann.tab.gz" + - "*.tab.gz": + type: file + description: | + tab file with annotated variants (optional) + pattern: "*.ann.tab.gz" - json: - type: file - description: | - json file with annotated variants (optional) - pattern: "*.ann.json.gz" + - meta: + type: file + description: | + json file with annotated variants (optional) + pattern: "*.ann.json.gz" + - "*.json.gz": + type: file + description: | + json file with annotated variants (optional) + pattern: "*.ann.json.gz" - report: - type: file - description: VEP report file - pattern: "*.html" + - "*.html": + type: file + description: VEP report file + pattern: "*.html" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@maxulysse" - "@matthdsm" diff --git a/modules/nf-core/ensemblvep/vep/tests/main.nf.test b/modules/nf-core/ensemblvep/vep/tests/main.nf.test index 4aff84a3c1..3e8c0b5379 100644 --- a/modules/nf-core/ensemblvep/vep/tests/main.nf.test +++ b/modules/nf-core/ensemblvep/vep/tests/main.nf.test @@ -21,7 +21,7 @@ nextflow_process { process { """ input[0] = Channel.of([ - [id:"111_WBcel235"], + [id:"113_WBcel235"], params.vep_genome, 
params.vep_species, params.vep_cache_version @@ -72,7 +72,7 @@ nextflow_process { process { """ input[0] = Channel.of([ - [id:"111_WBcel235"], + [id:"113_WBcel235"], params.vep_genome, params.vep_species, params.vep_cache_version @@ -107,7 +107,7 @@ nextflow_process { assertAll( { assert process.success }, { assert snapshot(process.out.versions).match() }, - { assert path(process.out.tab.get(0).get(1)).linesGzip.contains("## ENSEMBL VARIANT EFFECT PREDICTOR v111.0") } + { assert path(process.out.tab.get(0).get(1)).linesGzip.contains("## ENSEMBL VARIANT EFFECT PREDICTOR v113.0") } ) } } diff --git a/modules/nf-core/ensemblvep/vep/tests/main.nf.test.snap b/modules/nf-core/ensemblvep/vep/tests/main.nf.test.snap index f937b29949..1df94276a4 100644 --- a/modules/nf-core/ensemblvep/vep/tests/main.nf.test.snap +++ b/modules/nf-core/ensemblvep/vep/tests/main.nf.test.snap @@ -2,25 +2,25 @@ "test_ensemblvep_vep_fasta_tab_gz": { "content": [ [ - "versions.yml:md5,bd2ba1b4741a7d0a224160b50859f4ba" + "versions.yml:md5,4fbfeb73f0d4b4aa039f17be8ba9e1f2" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.03.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-04-15T17:35:20.694114" + "timestamp": "2024-10-21T09:12:23.474703494" }, "test_ensemblvep_vep_fasta_vcf": { "content": [ [ - "versions.yml:md5,bd2ba1b4741a7d0a224160b50859f4ba" + "versions.yml:md5,4fbfeb73f0d4b4aa039f17be8ba9e1f2" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.03.0" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-04-15T17:34:41.093843" + "timestamp": "2024-10-21T09:11:54.343590485" } } \ No newline at end of file diff --git a/modules/nf-core/ensemblvep/vep/tests/nextflow.config b/modules/nf-core/ensemblvep/vep/tests/nextflow.config index 882bce41a0..0a4ae1a6a8 100644 --- a/modules/nf-core/ensemblvep/vep/tests/nextflow.config +++ b/modules/nf-core/ensemblvep/vep/tests/nextflow.config @@ -1,5 +1,5 @@ params { - vep_cache_version = "111" + vep_cache_version = 
"113" vep_genome = "WBcel235" vep_species = "caenorhabditis_elegans" } diff --git a/modules/nf-core/fastp/environment.yml b/modules/nf-core/fastp/environment.yml index 70389e664c..26d4aca5dd 100644 --- a/modules/nf-core/fastp/environment.yml +++ b/modules/nf-core/fastp/environment.yml @@ -1,7 +1,5 @@ -name: fastp channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::fastp=0.23.4 diff --git a/modules/nf-core/fastp/main.nf b/modules/nf-core/fastp/main.nf index 4fc19b7443..e1b9f56560 100644 --- a/modules/nf-core/fastp/main.nf +++ b/modules/nf-core/fastp/main.nf @@ -10,6 +10,7 @@ process FASTP { input: tuple val(meta), path(reads) path adapter_fasta + val discard_trimmed_pass val save_trimmed_fail val save_merged @@ -18,9 +19,9 @@ process FASTP { tuple val(meta), path('*.json') , emit: json tuple val(meta), path('*.html') , emit: html tuple val(meta), path('*.log') , emit: log - path "versions.yml" , emit: versions tuple val(meta), path('*.fail.fastq.gz') , optional:true, emit: reads_fail tuple val(meta), path('*.merged.fastq.gz'), optional:true, emit: reads_merged + path "versions.yml" , emit: versions when: task.ext.when == null || task.ext.when @@ -30,6 +31,8 @@ process FASTP { def prefix = task.ext.prefix ?: "${meta.id}" def adapter_list = adapter_fasta ? "--adapter_fasta ${adapter_fasta}" : "" def fail_fastq = save_trimmed_fail && meta.single_end ? "--failed_out ${prefix}.fail.fastq.gz" : save_trimmed_fail && !meta.single_end ? "--failed_out ${prefix}.paired.fail.fastq.gz --unpaired1 ${prefix}_1.fail.fastq.gz --unpaired2 ${prefix}_2.fail.fastq.gz" : '' + def out_fq1 = discard_trimmed_pass ?: ( meta.single_end ? "--out1 ${prefix}.fastp.fastq.gz" : "--out1 ${prefix}_1.fastp.fastq.gz" ) + def out_fq2 = discard_trimmed_pass ?: "--out2 ${prefix}_2.fastp.fastq.gz" // Added soft-links to original fastqs for consistent naming in MultiQC // Use single ended for interleaved. Add --interleaved_in in config. 
if ( task.ext.args?.contains('--interleaved_in') ) { @@ -59,7 +62,7 @@ process FASTP { fastp \\ --in1 ${prefix}.fastq.gz \\ - --out1 ${prefix}.fastp.fastq.gz \\ + $out_fq1 \\ --thread $task.cpus \\ --json ${prefix}.fastp.json \\ --html ${prefix}.fastp.html \\ @@ -81,8 +84,8 @@ process FASTP { fastp \\ --in1 ${prefix}_1.fastq.gz \\ --in2 ${prefix}_2.fastq.gz \\ - --out1 ${prefix}_1.fastp.fastq.gz \\ - --out2 ${prefix}_2.fastp.fastq.gz \\ + $out_fq1 \\ + $out_fq2 \\ --json ${prefix}.fastp.json \\ --html ${prefix}.fastp.html \\ $adapter_list \\ @@ -103,14 +106,16 @@ process FASTP { stub: def prefix = task.ext.prefix ?: "${meta.id}" def is_single_output = task.ext.args?.contains('--interleaved_in') || meta.single_end - def touch_reads = is_single_output ? "${prefix}.fastp.fastq.gz" : "${prefix}_1.fastp.fastq.gz ${prefix}_2.fastp.fastq.gz" - def touch_merged = (!is_single_output && save_merged) ? "touch ${prefix}.merged.fastq.gz" : "" + def touch_reads = (discard_trimmed_pass) ? "" : (is_single_output) ? "echo '' | gzip > ${prefix}.fastp.fastq.gz" : "echo '' | gzip > ${prefix}_1.fastp.fastq.gz ; echo '' | gzip > ${prefix}_2.fastp.fastq.gz" + def touch_merged = (!is_single_output && save_merged) ? "echo '' | gzip > ${prefix}.merged.fastq.gz" : "" + def touch_fail_fastq = (!save_trimmed_fail) ? "" : meta.single_end ? 
"echo '' | gzip > ${prefix}.fail.fastq.gz" : "echo '' | gzip > ${prefix}.paired.fail.fastq.gz ; echo '' | gzip > ${prefix}_1.fail.fastq.gz ; echo '' | gzip > ${prefix}_2.fail.fastq.gz" """ - touch $touch_reads + $touch_reads + $touch_fail_fastq + $touch_merged touch "${prefix}.fastp.json" touch "${prefix}.fastp.html" touch "${prefix}.fastp.log" - $touch_merged cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/fastp/meta.yml b/modules/nf-core/fastp/meta.yml index c22a16abd9..159404d08d 100644 --- a/modules/nf-core/fastp/meta.yml +++ b/modules/nf-core/fastp/meta.yml @@ -11,62 +11,100 @@ tools: documentation: https://github.com/OpenGene/fastp doi: 10.1093/bioinformatics/bty560 licence: ["MIT"] + identifier: biotools:fastp input: - - meta: - type: map - description: | - Groovy Map containing sample information. Use 'single_end: true' to specify single ended or interleaved FASTQs. Use 'single_end: false' for paired-end reads. - e.g. [ id:'test', single_end:false ] - - reads: - type: file - description: | - List of input FastQ files of size 1 and 2 for single-end and paired-end data, - respectively. If you wish to run interleaved paired-end data, supply as single-end data - but with `--interleaved_in` in your `modules.conf`'s `ext.args` for the module. - - adapter_fasta: - type: file - description: File in FASTA format containing possible adapters to remove. - pattern: "*.{fasta,fna,fas,fa}" - - save_trimmed_fail: - type: boolean - description: Specify true to save files that failed to pass trimming thresholds ending in `*.fail.fastq.gz` - - save_merged: - type: boolean - description: Specify true to save all merged reads to the a file ending in `*.merged.fastq.gz` + - - meta: + type: map + description: | + Groovy Map containing sample information. Use 'single_end: true' to specify single ended or interleaved FASTQs. Use 'single_end: false' for paired-end reads. + e.g. 
[ id:'test', single_end:false ] + - reads: + type: file + description: | + List of input FastQ files of size 1 and 2 for single-end and paired-end data, + respectively. If you wish to run interleaved paired-end data, supply as single-end data + but with `--interleaved_in` in your `modules.conf`'s `ext.args` for the module. + - - adapter_fasta: + type: file + description: File in FASTA format containing possible adapters to remove. + pattern: "*.{fasta,fna,fas,fa}" + - - discard_trimmed_pass: + type: boolean + description: Specify true to not write any reads that pass trimming thresholds. + | This can be used to use fastp for the output report only. + - - save_trimmed_fail: + type: boolean + description: Specify true to save files that failed to pass trimming thresholds + ending in `*.fail.fastq.gz` + - - save_merged: + type: boolean + description: Specify true to save all merged reads to a file ending in `*.merged.fastq.gz` output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - reads: - type: file - description: The trimmed/modified/unmerged fastq reads - pattern: "*fastp.fastq.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.fastp.fastq.gz": + type: file + description: The trimmed/modified/unmerged fastq reads + pattern: "*fastp.fastq.gz" - json: - type: file - description: Results in JSON format - pattern: "*.json" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.json": + type: file + description: Results in JSON format + pattern: "*.json" - html: - type: file - description: Results in HTML format - pattern: "*.html" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.html": + type: file + description: Results in HTML format + pattern: "*.html" - log: - type: file - description: fastq log file - pattern: "*.log" - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.log": + type: file + description: fastq log file + pattern: "*.log" - reads_fail: - type: file - description: Reads the failed the preprocessing - pattern: "*fail.fastq.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.fail.fastq.gz": + type: file + description: Reads the failed the preprocessing + pattern: "*fail.fastq.gz" - reads_merged: - type: file - description: Reads that were successfully merged - pattern: "*.{merged.fastq.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.merged.fastq.gz": + type: file + description: Reads that were successfully merged + pattern: "*.{merged.fastq.gz}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@kevinmenden" diff --git a/modules/nf-core/fastp/tests/main.nf.test b/modules/nf-core/fastp/tests/main.nf.test index 6f1f489785..30dbb8aabf 100644 --- a/modules/nf-core/fastp/tests/main.nf.test +++ b/modules/nf-core/fastp/tests/main.nf.test @@ -10,221 +10,290 @@ nextflow_process { test("test_fastp_single_end") { when { - params { - outdir = "$outputDir" - } + process { """ - adapter_fasta = [] - save_trimmed_fail = false - save_merged = false - input[0] = Channel.of([ [ id:'test', single_end:true ], [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = false + input[3] = false + input[4] = false """ } } then { - def html_text = [ "Q20 bases:12.922000 K (92.984097%)", - "single end (151 cycles)" ] - def log_text = [ "Q20 bases: 12922(92.9841%)", - "reads passed filter: 99" ] - def read_lines = ["@ERR5069949.2151832 NS500628:121:HK3MMAFX2:2:21208:10793:15304/1", - "TCATAAACCAAAGCACTCACAGTGTCAACAATTTCAGCAGGACAACGCCGACAAGTTCCGAGGAACATGTCTGGACCTATAGTTTTCATAAGTCTACACACTGAATTGAAATATTCTGGTTCTAGTGTGCCCTTAGTTAGCAATGTGCGT", - "AAAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEE - { assert path(process.out.reads.get(0).get(1)).linesGzip.contains(read_line) } - } - }, - { html_text.each { html_part -> - { assert path(process.out.html.get(0).get(1)).getText().contains(html_part) } - } - }, - { assert snapshot(process.out.json).match("test_fastp_single_end_json") }, - { log_text.each { log_part -> - { assert 
path(process.out.log.get(0).get(1)).getText().contains(log_part) } - } - }, - { - assert snapshot( - ( - [process.out.reads[0][0].toString()] + // meta - process.out.reads.collect { file(it[1]).getName() } + - process.out.json.collect { file(it[1]).getName() } + - process.out.html.collect { file(it[1]).getName() } + - process.out.log.collect { file(it[1]).getName() } + - process.out.reads_fail.collect { file(it[1]).getName() } + - process.out.reads_merged.collect { file(it[1]).getName() } - ).sort() - ).match("test_fastp_single_end-_match") - }, - { assert snapshot(process.out.versions).match("versions_single_end") } + { assert path(process.out.html.get(0).get(1)).getText().contains("single end (151 cycles)") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("reads passed filter: 99") }, + { assert snapshot( + process.out.json, + process.out.reads, + process.out.reads_fail, + process.out.reads_merged, + process.out.versions).match() } ) } } - test("test_fastp_single_end-stub") { - - options '-stub' + test("test_fastp_paired_end") { when { - params { - outdir = "$outputDir" - } + process { """ adapter_fasta = [] + save_trimmed_pass = true save_trimmed_fail = false save_merged = false input[0] = Channel.of([ - [ id:'test', single_end:true ], - [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ] + [ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = false + input[3] = false + input[4] = false """ } } then { + assertAll( + { assert process.success }, + { assert path(process.out.html.get(0).get(1)).getText().contains("The input has little adapter 
percentage (~0.000000%), probably it's trimmed before.") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("Q30 bases: 12281(88.3716%)") }, + { assert snapshot( + process.out.json, + process.out.reads, + process.out.reads_fail, + process.out.reads_merged, + process.out.versions).match() } + ) + } + } + test("fastp test_fastp_interleaved") { + + config './nextflow.interleaved.config' + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:true ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_interleaved.fastq.gz', checkIfExists: true) ] + ]) + input[1] = [] + input[2] = false + input[3] = false + input[4] = false + """ + } + } + + then { assertAll( { assert process.success }, - { - assert snapshot( - ( - [process.out.reads[0][0].toString()] + // meta - process.out.reads.collect { file(it[1]).getName() } + - process.out.json.collect { file(it[1]).getName() } + - process.out.html.collect { file(it[1]).getName() } + - process.out.log.collect { file(it[1]).getName() } + - process.out.reads_fail.collect { file(it[1]).getName() } + - process.out.reads_merged.collect { file(it[1]).getName() } - ).sort() - ).match("test_fastp_single_end-for_stub_match") - }, - { assert snapshot(process.out.versions).match("versions_single_end_stub") } + { assert path(process.out.html.get(0).get(1)).getText().contains("paired end (151 cycles + 151 cycles)") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("reads passed filter: 162") }, + { assert process.out.reads_fail == [] }, + { assert process.out.reads_merged == [] }, + { assert snapshot( + process.out.reads, + process.out.json, + process.out.versions).match() } ) } } - test("test_fastp_paired_end") { + test("test_fastp_single_end_trim_fail") { when { - params { - outdir = "$outputDir" + + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:true ], // meta map + [ file(params.modules_testdata_base_path + 
'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ] + ]) + input[1] = [] + input[2] = false + input[3] = true + input[4] = false + """ } + } + + then { + assertAll( + { assert process.success }, + { assert path(process.out.html.get(0).get(1)).getText().contains("single end (151 cycles)") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("reads passed filter: 99") }, + { assert snapshot( + process.out.json, + process.out.reads, + process.out.reads_fail, + process.out.reads_merged, + process.out.versions).match() } + ) + } + } + + test("test_fastp_paired_end_trim_fail") { + + config './nextflow.save_failed.config' + when { process { """ - adapter_fasta = [] - save_trimmed_fail = false - save_merged = false + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true)] + ]) + input[1] = [] + input[2] = false + input[3] = true + input[4] = false + """ + } + } + then { + assertAll( + { assert process.success }, + { assert path(process.out.html.get(0).get(1)).getText().contains("The input has little adapter percentage (~0.000000%), probably it's trimmed before.") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("reads passed filter: 162") }, + { assert snapshot( + process.out.reads, + process.out.reads_fail, + process.out.reads_merged, + process.out.json, + process.out.versions).match() } + ) + } + } + + test("test_fastp_paired_end_merged") { + + when { + process { + """ input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] 
]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = false + input[3] = false + input[4] = true """ } } then { - def html_text = [ "Q20 bases:25.719000 K (93.033098%)", - "The input has little adapter percentage (~0.000000%), probably it's trimmed before."] - def log_text = [ "No adapter detected for read1", - "Q30 bases: 12281(88.3716%)"] - def json_text = ['"passed_filter_reads": 198'] - def read1_lines = ["@ERR5069949.2151832 NS500628:121:HK3MMAFX2:2:21208:10793:15304/1", - "TCATAAACCAAAGCACTCACAGTGTCAACAATTTCAGCAGGACAACGCCGACAAGTTCCGAGGAACATGTCTGGACCTATAGTTTTCATAAGTCTACACACTGAATTGAAATATTCTGGTTCTAGTGTGCCCTTAGTTAGCAATGTGCGT", - "AAAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEE - { assert path(process.out.reads.get(0).get(1).get(0)).linesGzip.contains(read1_line) } - } - }, - { read2_lines.each { read2_line -> - { assert path(process.out.reads.get(0).get(1).get(1)).linesGzip.contains(read2_line) } - } - }, - { html_text.each { html_part -> - { assert path(process.out.html.get(0).get(1)).getText().contains(html_part) } - } - }, - { json_text.each { json_part -> - { assert path(process.out.json.get(0).get(1)).getText().contains(json_part) } - } - }, - { log_text.each { log_part -> - { assert path(process.out.log.get(0).get(1)).getText().contains(log_part) } - } - }, - { - assert snapshot( - ( - [process.out.reads[0][0].toString()] + // meta - process.out.reads.collect { it[1].collect { item -> file(item).getName() } } + - process.out.json.collect { file(it[1]).getName() } + - process.out.html.collect { file(it[1]).getName() } + - process.out.log.collect { file(it[1]).getName() } + - process.out.reads_fail.collect { file(it[1]).getName() } + - process.out.reads_merged.collect { file(it[1]).getName() } - ).sort() - ).match("test_fastp_paired_end_match") - }, - { assert snapshot(process.out.versions).match("versions_paired_end") } + { assert 
path(process.out.html.get(0).get(1)).getText().contains("The input has little adapter percentage (~0.000000%), probably it's trimmed before.") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("total reads: 75") }, + { assert snapshot( + process.out.json, + process.out.reads, + process.out.reads_fail, + process.out.reads_merged, + process.out.versions).match() }, ) } } - test("test_fastp_paired_end-stub") { - - options '-stub' + test("test_fastp_paired_end_merged_adapterlist") { when { - params { - outdir = "$outputDir" + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] + ]) + input[1] = Channel.of([ file(params.modules_testdata_base_path + 'delete_me/fastp/adapters.fasta', checkIfExists: true) ]) + input[2] = false + input[3] = false + input[4] = true + """ } + } + + then { + assertAll( + { assert process.success }, + { assert path(process.out.html.get(0).get(1)).getText().contains("
") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("total bases: 13683") }, + { assert snapshot( + process.out.json, + process.out.reads, + process.out.reads_fail, + process.out.reads_merged, + process.out.versions).match() } + ) + } + } + + test("test_fastp_single_end_qc_only") { + + when { process { """ - adapter_fasta = [] - save_trimmed_fail = false - save_merged = false + input[0] = Channel.of([ + [ id:'test', single_end:true ], + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ] + ]) + input[1] = [] + input[2] = true + input[3] = false + input[4] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert path(process.out.html.get(0).get(1)).getText().contains("single end (151 cycles)") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("reads passed filter: 99") }, + { assert snapshot( + process.out.json, + process.out.reads, + process.out.reads, + process.out.reads_fail, + process.out.reads_fail, + process.out.reads_merged, + process.out.reads_merged, + process.out.versions).match() } + ) + } + } + test("test_fastp_paired_end_qc_only") { + + when { + process { + """ input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = true + input[3] = false + input[4] = false """ } } @@ -232,114 +301,99 @@ nextflow_process { then { assertAll( { assert process.success }, - { - assert snapshot( - ( - [process.out.reads[0][0].toString()] + // meta - process.out.reads.collect { it[1].collect { item -> file(item).getName() } } + - process.out.json.collect { file(it[1]).getName() } + - 
process.out.html.collect { file(it[1]).getName() } + - process.out.log.collect { file(it[1]).getName() } + - process.out.reads_fail.collect { file(it[1]).getName() } + - process.out.reads_merged.collect { file(it[1]).getName() } - ).sort() - ).match("test_fastp_paired_end-for_stub_match") - }, - { assert snapshot(process.out.versions).match("versions_paired_end-stub") } + { assert path(process.out.html.get(0).get(1)).getText().contains("The input has little adapter percentage (~0.000000%), probably it's trimmed before.") }, + { assert path(process.out.log.get(0).get(1)).getText().contains("Q30 bases: 12281(88.3716%)") }, + { assert snapshot( + process.out.json, + process.out.reads, + process.out.reads, + process.out.reads_fail, + process.out.reads_fail, + process.out.reads_merged, + process.out.reads_merged, + process.out.versions).match() } ) } } - test("fastp test_fastp_interleaved") { + test("test_fastp_single_end - stub") { + + options "-stub" - config './nextflow.interleaved.config' when { - params { - outdir = "$outputDir" + + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:true ], + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ] + ]) + input[1] = [] + input[2] = false + input[3] = false + input[4] = false + """ } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test_fastp_paired_end - stub") { + + options "-stub" + + when { + process { """ adapter_fasta = [] + save_trimmed_pass = true save_trimmed_fail = false save_merged = false input[0] = Channel.of([ - [ id:'test', single_end:true ], // meta map - [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_interleaved.fastq.gz', checkIfExists: true) ] + [ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + 
file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = false + input[3] = false + input[4] = false """ } } then { - def html_text = [ "Q20 bases:25.719000 K (93.033098%)", - "paired end (151 cycles + 151 cycles)"] - def log_text = [ "Q20 bases: 12922(92.9841%)", - "reads passed filter: 162"] - def read_lines = [ "@ERR5069949.2151832 NS500628:121:HK3MMAFX2:2:21208:10793:15304/1", - "TCATAAACCAAAGCACTCACAGTGTCAACAATTTCAGCAGGACAACGCCGACAAGTTCCGAGGAACATGTCTGGACCTATAGTTTTCATAAGTCTACACACTGAATTGAAATATTCTGGTTCTAGTGTGCCCTTAGTTAGCAATGTGCGT", - "AAAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEE - { assert path(process.out.reads.get(0).get(1)).linesGzip.contains(read_line) } - } - }, - { html_text.each { html_part -> - { assert path(process.out.html.get(0).get(1)).getText().contains(html_part) } - } - }, - { assert snapshot(process.out.json).match("fastp test_fastp_interleaved_json") }, - { log_text.each { log_part -> - { assert path(process.out.log.get(0).get(1)).getText().contains(log_part) } - } - }, - { - assert snapshot( - ( - [process.out.reads[0][0].toString()] + // meta - process.out.reads.collect { file(it[1]).getName() } + - process.out.json.collect { file(it[1]).getName() } + - process.out.html.collect { file(it[1]).getName() } + - process.out.log.collect { file(it[1]).getName() } + - process.out.reads_fail.collect { file(it[1]).getName() } + - process.out.reads_merged.collect { file(it[1]).getName() } - ).sort() - ).match("test_fastp_interleaved-_match") - }, - { assert snapshot(process.out.versions).match("versions_interleaved") } + { assert snapshot(process.out).match() } ) } } - test("fastp test_fastp_interleaved-stub") { + test("fastp - stub test_fastp_interleaved") { - options '-stub' + options "-stub" config 
'./nextflow.interleaved.config' when { - params { - outdir = "$outputDir" - } process { """ - adapter_fasta = [] - save_trimmed_fail = false - save_merged = false - input[0] = Channel.of([ [ id:'test', single_end:true ], // meta map [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_interleaved.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = false + input[3] = false + input[4] = false """ } } @@ -347,277 +401,112 @@ nextflow_process { then { assertAll( { assert process.success }, - { - assert snapshot( - ( - [process.out.reads[0][0].toString()] + // meta - process.out.reads.collect { file(it[1]).getName() } + - process.out.json.collect { file(it[1]).getName() } + - process.out.html.collect { file(it[1]).getName() } + - process.out.log.collect { file(it[1]).getName() } + - process.out.reads_fail.collect { file(it[1]).getName() } + - process.out.reads_merged.collect { file(it[1]).getName() } - ).sort() - ).match("test_fastp_interleaved-for_stub_match") - }, - { assert snapshot(process.out.versions).match("versions_interleaved-stub") } + { assert snapshot(process.out).match() } ) } } - test("test_fastp_single_end_trim_fail") { + test("test_fastp_single_end_trim_fail - stub") { + + options "-stub" when { - params { - outdir = "$outputDir" - } + process { """ - adapter_fasta = [] - save_trimmed_fail = true - save_merged = false - input[0] = Channel.of([ [ id:'test', single_end:true ], // meta map [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = false + input[3] = true + input[4] = false """ } } then { - def html_text = [ "Q20 bases:12.922000 K (92.984097%)", - "single end (151 cycles)"] - def log_text = [ "Q20 bases: 12922(92.9841%)", - "reads passed filter: 
99" ] - def read_lines = [ "@ERR5069949.2151832 NS500628:121:HK3MMAFX2:2:21208:10793:15304/1", - "TCATAAACCAAAGCACTCACAGTGTCAACAATTTCAGCAGGACAACGCCGACAAGTTCCGAGGAACATGTCTGGACCTATAGTTTTCATAAGTCTACACACTGAATTGAAATATTCTGGTTCTAGTGTGCCCTTAGTTAGCAATGTGCGT", - "AAAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEE - { assert path(process.out.reads.get(0).get(1)).linesGzip.contains(read_line) } - } - }, - { failed_read_lines.each { failed_read_line -> - { assert path(process.out.reads_fail.get(0).get(1)).linesGzip.contains(failed_read_line) } - } - }, - { html_text.each { html_part -> - { assert path(process.out.html.get(0).get(1)).getText().contains(html_part) } - } - }, - { assert snapshot(process.out.json).match("test_fastp_single_end_trim_fail_json") }, - { log_text.each { log_part -> - { assert path(process.out.log.get(0).get(1)).getText().contains(log_part) } - } - }, - { assert snapshot(process.out.versions).match("versions_single_end_trim_fail") } + { assert snapshot(process.out).match() } ) } } - test("test_fastp_paired_end_trim_fail") { + test("test_fastp_paired_end_trim_fail - stub") { + + options "-stub" config './nextflow.save_failed.config' when { - params { - outdir = "$outputDir" - } process { """ - adapter_fasta = [] - save_trimmed_fail = true - save_merged = false - input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true)] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = false + input[3] = true + input[4] = false """ } } then { - def html_text = [ "Q20 bases:25.719000 K (93.033098%)", - "The input has little adapter percentage (~0.000000%), probably it's trimmed before."] - def log_text = [ 
"No adapter detected for read1", - "Q30 bases: 12281(88.3716%)"] - def json_text = ['"passed_filter_reads": 162'] - def read1_lines = ["@ERR5069949.2151832 NS500628:121:HK3MMAFX2:2:21208:10793:15304/1", - "TCATAAACCAAAGCACTCACAGTGTCAACAATTTCAGCAGGACAACGCCGACAAGTTCCGAGGAACATGTCTGGACCTATAGTTTTCATAAGTCTACACACTGAATTGAAATATTCTGGTTCTAGTGTGCCCTTAGTTAGCAATGTGCGT", - "AAAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAEEEEE - { assert path(process.out.reads.get(0).get(1).get(0)).linesGzip.contains(read1_line) } - } - }, - { read2_lines.each { read2_line -> - { assert path(process.out.reads.get(0).get(1).get(1)).linesGzip.contains(read2_line) } - } - }, - { failed_read2_lines.each { failed_read2_line -> - { assert path(process.out.reads_fail.get(0).get(1).get(2)).linesGzip.contains(failed_read2_line) } - } - }, - { html_text.each { html_part -> - { assert path(process.out.html.get(0).get(1)).getText().contains(html_part) } - } - }, - { json_text.each { json_part -> - { assert path(process.out.json.get(0).get(1)).getText().contains(json_part) } - } - }, - { log_text.each { log_part -> - { assert path(process.out.log.get(0).get(1)).getText().contains(log_part) } - } - }, - { assert snapshot(process.out.versions).match("versions_paired_end_trim_fail") } + { assert snapshot(process.out).match() } ) } } - test("test_fastp_paired_end_merged") { + test("test_fastp_paired_end_merged - stub") { + + options "-stub" when { - params { - outdir = "$outputDir" - } process { """ - adapter_fasta = [] - save_trimmed_fail = false - save_merged = true input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + 
input[1] = [] + input[2] = false + input[3] = false + input[4] = true """ } } then { - def html_text = [ "
"] - def log_text = [ "Merged and filtered:", - "total reads: 75", - "total bases: 13683"] - def json_text = ['"merged_and_filtered": {', '"total_reads": 75', '"total_bases": 13683'] - def read1_lines = [ "@ERR5069949.1066259 NS500628:121:HK3MMAFX2:1:11312:18369:8333/1", - "CCTTATGACAGCAAGAACTGTGTATGATGATGGTGCTAGGAGAGTGTGGACACTTATGAATGTCTTGACACTCGTTTATAAAGTTTATTATGGTAATGCTTTAGATCAAGCCATTTCCATGTGGGCTCTTATAATCTCTGTTACTTC", - "AAAAAEAEEAEEEEEEEEEEEEEEEEAEEEEAEEEEEEEEAEEEEEEEEEEEEEEEEE/EAEEEEEE/6EEEEEEEEEEAEEAEEE/EE/AEEAEEEEEAEEEA/EEAAEAE - { assert path(process.out.reads.get(0).get(1).get(0)).linesGzip.contains(read1_line) } - } - }, - { read2_lines.each { read2_line -> - { assert path(process.out.reads.get(0).get(1).get(1)).linesGzip.contains(read2_line) } - } - }, - { read_merged_lines.each { read_merged_line -> - { assert path(process.out.reads_merged.get(0).get(1)).linesGzip.contains(read_merged_line) } - } - }, - { html_text.each { html_part -> - { assert path(process.out.html.get(0).get(1)).getText().contains(html_part) } - } - }, - { json_text.each { json_part -> - { assert path(process.out.json.get(0).get(1)).getText().contains(json_part) } - } - }, - { log_text.each { log_part -> - { assert path(process.out.log.get(0).get(1)).getText().contains(log_part) } - } - }, - { - assert snapshot( - ( - [process.out.reads[0][0].toString()] + // meta - process.out.reads.collect { it[1].collect { item -> file(item).getName() } } + - process.out.json.collect { file(it[1]).getName() } + - process.out.html.collect { file(it[1]).getName() } + - process.out.log.collect { file(it[1]).getName() } + - process.out.reads_fail.collect { file(it[1]).getName() } + - process.out.reads_merged.collect { file(it[1]).getName() } - ).sort() - ).match("test_fastp_paired_end_merged_match") - }, - { assert snapshot(process.out.versions).match("versions_paired_end_merged") } + { assert snapshot(process.out).match() } ) } } - test("test_fastp_paired_end_merged-stub") { + 
test("test_fastp_paired_end_merged_adapterlist - stub") { - options '-stub' + options "-stub" when { - params { - outdir = "$outputDir" - } process { """ - adapter_fasta = [] - save_trimmed_fail = false - save_merged = true - input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = Channel.of([ file(params.modules_testdata_base_path + 'delete_me/fastp/adapters.fasta', checkIfExists: true) ]) + input[2] = false + input[3] = false + input[4] = true """ } } @@ -625,101 +514,63 @@ nextflow_process { then { assertAll( { assert process.success }, - { - assert snapshot( - ( - [process.out.reads[0][0].toString()] + // meta - process.out.reads.collect { it[1].collect { item -> file(item).getName() } } + - process.out.json.collect { file(it[1]).getName() } + - process.out.html.collect { file(it[1]).getName() } + - process.out.log.collect { file(it[1]).getName() } + - process.out.reads_fail.collect { file(it[1]).getName() } + - process.out.reads_merged.collect { file(it[1]).getName() } - ).sort() - ).match("test_fastp_paired_end_merged-for_stub_match") - }, - { assert snapshot(process.out.versions).match("versions_paired_end_merged_stub") } + { assert snapshot(process.out).match() } ) } } - test("test_fastp_paired_end_merged_adapterlist") { + test("test_fastp_single_end_qc_only - stub") { + + options "-stub" when { - params { - outdir = "$outputDir" - } process { """ - adapter_fasta = Channel.of([ file(params.modules_testdata_base_path + 'delete_me/fastp/adapters.fasta', checkIfExists: true) ]) - save_trimmed_fail = false - save_merged = true + input[0] = Channel.of([ + [ id:'test', single_end:true ], + [ 
file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) ] + ]) + input[1] = [] + input[2] = true + input[3] = false + input[4] = false + """ + } + } + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test_fastp_paired_end_qc_only - stub") { + + options "-stub" + + when { + process { + """ input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] ]) - input[1] = adapter_fasta - input[2] = save_trimmed_fail - input[3] = save_merged + input[1] = [] + input[2] = true + input[3] = false + input[4] = false """ } } then { - def html_text = [ "
"] - def log_text = [ "Merged and filtered:", - "total reads: 75", - "total bases: 13683"] - def json_text = ['"merged_and_filtered": {', '"total_reads": 75', '"total_bases": 13683',"--adapter_fasta"] - def read1_lines = ["@ERR5069949.1066259 NS500628:121:HK3MMAFX2:1:11312:18369:8333/1", - "CCTTATGACAGCAAGAACTGTGTATGATGATGGTGCTAGGAGAGTGTGGACACTTATGAATGTCTTGACACTCGTTTATAAAGTTTATTATGGTAATGCTTTAGATCAAGCCATTTCCATGTGGGCTCTTATAATCTCTGTTACTTC", - "AAAAAEAEEAEEEEEEEEEEEEEEEEAEEEEAEEEEEEEEAEEEEEEEEEEEEEEEEE/EAEEEEEE/6EEEEEEEEEEAEEAEEE/EE/AEEAEEEEEAEEEA/EEAAEAE - { assert path(process.out.reads.get(0).get(1).get(0)).linesGzip.contains(read1_line) } - } - }, - { read2_lines.each { read2_line -> - { assert path(process.out.reads.get(0).get(1).get(1)).linesGzip.contains(read2_line) } - } - }, - { read_merged_lines.each { read_merged_line -> - { assert path(process.out.reads_merged.get(0).get(1)).linesGzip.contains(read_merged_line) } - } - }, - { html_text.each { html_part -> - { assert path(process.out.html.get(0).get(1)).getText().contains(html_part) } - } - }, - { json_text.each { json_part -> - { assert path(process.out.json.get(0).get(1)).getText().contains(json_part) } - } - }, - { log_text.each { log_part -> - { assert path(process.out.log.get(0).get(1)).getText().contains(log_part) } - } - }, - { assert snapshot(process.out.versions).match("versions_paired_end_merged_adapterlist") } + { assert snapshot(process.out).match() } ) } } -} +} \ No newline at end of file diff --git a/modules/nf-core/fastp/tests/main.nf.test.snap b/modules/nf-core/fastp/tests/main.nf.test.snap index 3e87628898..54be7e45f7 100644 --- a/modules/nf-core/fastp/tests/main.nf.test.snap +++ b/modules/nf-core/fastp/tests/main.nf.test.snap @@ -1,55 +1,178 @@ { - "fastp test_fastp_interleaved_json": { + "test_fastp_single_end_qc_only - stub": { "content": [ - [ - [ - { - "id": "test", - "single_end": true - }, - "test.fastp.json:md5,b24e0624df5cc0b11cd5ba21b726fb22" + { + "0": [ + + ], + "1": [ + [ + { + 
"id": "test", + "single_end": true + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + + ], + "5": [ + + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + + ], + "reads_fail": [ + + ], + "reads_merged": [ + + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] - ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-03-18T16:19:15.063001" + "timestamp": "2024-07-05T14:31:10.841098" }, - "test_fastp_paired_end_merged-for_stub_match": { + "test_fastp_paired_end": { "content": [ [ [ - "test_1.fastp.fastq.gz", - "test_2.fastp.fastq.gz" - ], - "test.fastp.html", - "test.fastp.json", - "test.fastp.log", - "test.merged.fastq.gz", - "{id=test, single_end=false}" + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,1e0f8e27e71728e2b63fc64086be95cd" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,67b2bbae47f073e05a97a9c2edce23c7", + "test_2.fastp.fastq.gz:md5,25cbdca08e2083dbd4f0502de6b62f39" + ] + ] + ], + [ + + ], + [ + + ], + [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-01-17T18:10:13.467574" + "timestamp": 
"2024-07-05T13:43:28.665779" }, - "versions_interleaved": { + "test_fastp_paired_end_merged_adapterlist": { "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,5914ca3f21ce162123a824e33e8564f6" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,54b726a55e992a869fd3fa778afe1672", + "test_2.fastp.fastq.gz:md5,29d3b33b869f7b63417b8ff07bb128ba" + ] + ] + ], + [ + + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.merged.fastq.gz:md5,c873bb1ab3fa859dcc47306465e749d5" + ] + ], [ "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T11:56:24.615634793" + "timestamp": "2024-07-05T13:44:18.210375" }, - "test_fastp_single_end_json": { + "test_fastp_single_end_qc_only": { "content": [ [ [ @@ -57,274 +180,1152 @@ "id": "test", "single_end": true }, - "test.fastp.json:md5,c852d7a6dba5819e4ac8d9673bedcacc" + "test.fastp.json:md5,5cc5f01e449309e0e689ed6f51a2294a" ] - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-03-18T16:18:43.526412" - }, - "versions_paired_end": { - "content": [ + ], + [ + + ], + [ + + ], + [ + + ], + [ + + ], + [ + + ], + [ + + ], [ "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T11:55:42.333545689" + "timestamp": "2024-07-05T13:44:27.380974" }, - "test_fastp_paired_end_match": { + "test_fastp_paired_end_trim_fail": { "content": [ [ [ - "test_1.fastp.fastq.gz", - "test_2.fastp.fastq.gz" - ], - "test.fastp.html", - "test.fastp.json", - "test.fastp.log", - "{id=test, single_end=false}" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-02-01T12:03:06.431833729" - }, - "test_fastp_interleaved-_match": { - "content": [ + { + "id": "test", + "single_end": 
false + }, + [ + "test_1.fastp.fastq.gz:md5,6ff32a64c5188b9a9192be1398c262c7", + "test_2.fastp.fastq.gz:md5,db0cb7c9977e94ac2b4b446ebd017a8a" + ] + ] + ], [ - "test.fastp.fastq.gz", - "test.fastp.html", - "test.fastp.json", - "test.fastp.log", - "{id=test, single_end=true}" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-03-18T16:19:15.111894" - }, - "test_fastp_paired_end_merged_match": { - "content": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test.paired.fail.fastq.gz:md5,409b687c734cedd7a1fec14d316e1366", + "test_1.fail.fastq.gz:md5,4f273cf3159c13f79e8ffae12f5661f6", + "test_2.fail.fastq.gz:md5,f97b9edefb5649aab661fbc9e71fc995" + ] + ] + ], + [ + + ], [ [ - "test_1.fastp.fastq.gz", - "test_2.fastp.fastq.gz" - ], - "test.fastp.html", - "test.fastp.json", - "test.fastp.log", - "test.merged.fastq.gz", - "{id=test, single_end=false}" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-02-01T12:08:44.496251446" - }, - "versions_single_end_stub": { - "content": [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,4c3268ddb50ea5b33125984776aa3519" + ] + ], [ "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T11:55:27.354051299" + "timestamp": "2024-07-05T13:43:58.749589" }, - "versions_interleaved-stub": { + "fastp - stub test_fastp_interleaved": { "content": [ - [ - "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + 
"single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + + ], + "5": [ + + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "reads_fail": [ + + ], + "reads_merged": [ + + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T11:56:46.535528418" + "timestamp": "2024-07-05T13:50:00.270029" }, - "versions_single_end_trim_fail": { + "test_fastp_single_end - stub": { "content": [ - [ - "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + + ], + "5": [ + + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": true + }, + 
"test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "reads_fail": [ + + ], + "reads_merged": [ + + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T11:59:03.724591407" + "timestamp": "2024-07-05T13:49:42.502789" }, - "test_fastp_paired_end-for_stub_match": { + "test_fastp_paired_end_merged_adapterlist - stub": { "content": [ - [ - [ - "test_1.fastp.fastq.gz", - "test_2.fastp.fastq.gz" + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] ], - "test.fastp.html", - "test.fastp.json", - "test.fastp.log", - "{id=test, single_end=false}" - ] + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + + ], + "5": [ + [ + { + "id": "test", + "single_end": false + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + 
"single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "reads_fail": [ + + ], + "reads_merged": [ + [ + { + "id": "test", + "single_end": false + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-01-17T18:07:15.398827" + "timestamp": "2024-07-05T13:54:53.458252" }, - "versions_paired_end-stub": { + "test_fastp_paired_end_merged - stub": { "content": [ - [ - "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + + ], + "5": [ + [ + { + "id": "test", + "single_end": false + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": false + }, + 
"test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "reads_fail": [ + + ], + "reads_merged": [ + [ + { + "id": "test", + "single_end": false + }, + "test.merged.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T11:56:06.50017282" + "timestamp": "2024-07-05T13:50:27.689379" }, - "versions_single_end": { + "test_fastp_paired_end_merged": { "content": [ [ - "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-02-01T11:55:07.67921647" - }, - "versions_paired_end_merged_stub": { - "content": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,b712fd68ed0322f4bec49ff2a5237fcc" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,54b726a55e992a869fd3fa778afe1672", + "test_2.fastp.fastq.gz:md5,29d3b33b869f7b63417b8ff07bb128ba" + ] + ] + ], + [ + + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.merged.fastq.gz:md5,c873bb1ab3fa859dcc47306465e749d5" + ] + ], [ "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T11:59:47.350653154" + "timestamp": "2024-07-05T13:44:08.68476" }, - "test_fastp_interleaved-for_stub_match": { + "test_fastp_paired_end - stub": { "content": [ - [ - "test.fastp.fastq.gz", - "test.fastp.html", - "test.fastp.json", - "test.fastp.log", - "{id=test, single_end=true}" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + 
"test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + + ], + "5": [ + + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "reads_fail": [ + + ], + "reads_merged": [ + + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-01-17T18:08:06.127974" + "timestamp": "2024-07-05T13:49:51.679221" }, - "versions_paired_end_trim_fail": { + "test_fastp_single_end": { "content": [ + [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,c852d7a6dba5819e4ac8d9673bedcacc" + ] + ], + [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,67b2bbae47f073e05a97a9c2edce23c7" + ] + ], + [ + + ], + [ + + ], [ "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - 
"timestamp": "2024-02-01T11:59:18.140484878" + "timestamp": "2024-07-05T13:43:18.834322" }, - "test_fastp_single_end-for_stub_match": { + "test_fastp_single_end_trim_fail - stub": { "content": [ - [ - "test.fastp.fastq.gz", - "test.fastp.html", - "test.fastp.json", - "test.fastp.log", - "{id=test, single_end=true}" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "5": [ + + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "reads_fail": [ + [ + { + "id": "test", + "single_end": true + }, + "test.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "reads_merged": [ + + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-01-17T18:06:00.244202" + "timestamp": "2024-07-05T14:05:36.898142" }, - 
"test_fastp_single_end-_match": { + "test_fastp_paired_end_trim_fail - stub": { "content": [ - [ - "test.fastp.fastq.gz", - "test.fastp.html", - "test.fastp.json", - "test.fastp.log", - "{id=test, single_end=true}" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test.paired.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_1.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "5": [ + + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fastp.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "reads_fail": [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test.paired.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_1.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + 
"test_2.fail.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "reads_merged": [ + + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-03-18T16:18:43.580336" + "timestamp": "2024-07-05T14:05:49.212847" }, - "versions_paired_end_merged_adapterlist": { + "fastp test_fastp_interleaved": { "content": [ + [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,217d62dc13a23e92513a1bd8e1bcea39" + ] + ], + [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,b24e0624df5cc0b11cd5ba21b726fb22" + ] + ], [ "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T12:05:37.845370554" + "timestamp": "2024-07-05T13:43:38.910832" }, - "versions_paired_end_merged": { + "test_fastp_single_end_trim_fail": { "content": [ + [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.json:md5,9a7ee180f000e8d00c7fb67f06293eb5" + ] + ], + [ + [ + { + "id": "test", + "single_end": true + }, + "test.fastp.fastq.gz:md5,67b2bbae47f073e05a97a9c2edce23c7" + ] + ], + [ + [ + { + "id": "test", + "single_end": true + }, + "test.fail.fastq.gz:md5,3e4aaadb66a5b8fc9b881bf39c227abd" + ] + ], + [ + + ], [ "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-01T11:59:32.860543858" + "timestamp": "2024-07-05T13:43:48.22378" }, - "test_fastp_single_end_trim_fail_json": { + "test_fastp_paired_end_qc_only": { "content": [ [ [ { "id": "test", - "single_end": true + "single_end": false }, - "test.fastp.json:md5,9a7ee180f000e8d00c7fb67f06293eb5" + "test.fastp.json:md5,623064a45912dac6f2b64e3f2e9901df" ] + ], + [ + + ], + [ + + ], + [ + + ], + [ + + ], + [ + + ], + [ + + ], + [ + 
"versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-05T13:44:36.334938" + }, + "test_fastp_paired_end_qc_only - stub": { + "content": [ + { + "0": [ + + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + + ], + "5": [ + + ], + "6": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "json": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.json:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "log": [ + [ + { + "id": "test", + "single_end": false + }, + "test.fastp.log:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "reads": [ + + ], + "reads_fail": [ + + ], + "reads_merged": [ + + ], + "versions": [ + "versions.yml:md5,48ffc994212fb1fc9f83a74fa69c9f02" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" }, - "timestamp": "2024-01-17T18:08:41.942317" + "timestamp": "2024-07-05T14:31:27.096468" } } \ No newline at end of file diff --git a/modules/nf-core/fastqc/environment.yml b/modules/nf-core/fastqc/environment.yml index 1787b38a9a..691d4c7638 100644 --- a/modules/nf-core/fastqc/environment.yml +++ b/modules/nf-core/fastqc/environment.yml @@ -1,7 +1,5 @@ -name: fastqc channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::fastqc=0.12.1 diff --git a/modules/nf-core/fastqc/main.nf b/modules/nf-core/fastqc/main.nf index d79f1c862d..d8989f4812 100644 --- a/modules/nf-core/fastqc/main.nf +++ b/modules/nf-core/fastqc/main.nf @@ -26,7 
+26,10 @@ process FASTQC { def rename_to = old_new_pairs*.join(' ').join(' ') def renamed_files = old_new_pairs.collect{ old_name, new_name -> new_name }.join(' ') - def memory_in_mb = MemoryUnit.of("${task.memory}").toUnit('MB') + // The total amount of allocated RAM by FastQC is equal to the number of threads defined (--threads) time the amount of RAM defined (--memory) + // https://github.com/s-andrews/FastQC/blob/1faeea0412093224d7f6a07f777fad60a5650795/fastqc#L211-L222 + // Dividing the task.memory by task.cpu allows to stick to requested amount of RAM in the label + def memory_in_mb = MemoryUnit.of("${task.memory}").toUnit('MB') / task.cpus // FastQC memory value allowed range (100 - 10000) def fastqc_memory = memory_in_mb > 10000 ? 10000 : (memory_in_mb < 100 ? 100 : memory_in_mb) diff --git a/modules/nf-core/fastqc/meta.yml b/modules/nf-core/fastqc/meta.yml index ee5507e06b..4827da7af2 100644 --- a/modules/nf-core/fastqc/meta.yml +++ b/modules/nf-core/fastqc/meta.yml @@ -16,35 +16,44 @@ tools: homepage: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ documentation: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/ licence: ["GPL-2.0-only"] + identifier: biotools:fastqc input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - reads: - type: file - description: | - List of input FastQ files of size 1 and 2 for single-end and paired-end data, - respectively. + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - reads: + type: file + description: | + List of input FastQ files of size 1 and 2 for single-end and paired-end data, + respectively. output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - html: - type: file - description: FastQC report - pattern: "*_{fastqc.html}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.html": + type: file + description: FastQC report + pattern: "*_{fastqc.html}" - zip: - type: file - description: FastQC report archive - pattern: "*_{fastqc.zip}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.zip": + type: file + description: FastQC report archive + pattern: "*_{fastqc.zip}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@grst" diff --git a/modules/nf-core/fastqc/tests/main.nf.test b/modules/nf-core/fastqc/tests/main.nf.test index 70edae4d99..e9d79a074e 100644 --- a/modules/nf-core/fastqc/tests/main.nf.test +++ b/modules/nf-core/fastqc/tests/main.nf.test @@ -23,17 +23,14 @@ nextflow_process { then { assertAll ( - { assert process.success }, - - // NOTE The report contains the date inside it, which means that the md5sum is stable per day, but not longer than that. So you can't md5sum it. - // looks like this:
<div id="header_filename">Mon 2 Oct 2023<br/>test.gz</div>
- // https://github.com/nf-core/modules/pull/3903#issuecomment-1743620039 - - { assert process.out.html[0][1] ==~ ".*/test_fastqc.html" }, - { assert process.out.zip[0][1] ==~ ".*/test_fastqc.zip" }, - { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, - - { assert snapshot(process.out.versions).match("fastqc_versions_single") } + { assert process.success }, + // NOTE The report contains the date inside it, which means that the md5sum is stable per day, but not longer than that. So you can't md5sum it. + // looks like this:
<div id="header_filename">Mon 2 Oct 2023<br/>test.gz</div>
+ // https://github.com/nf-core/modules/pull/3903#issuecomment-1743620039 + { assert process.out.html[0][1] ==~ ".*/test_fastqc.html" }, + { assert process.out.zip[0][1] ==~ ".*/test_fastqc.zip" }, + { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, + { assert snapshot(process.out.versions).match() } ) } } @@ -54,16 +51,14 @@ nextflow_process { then { assertAll ( - { assert process.success }, - - { assert process.out.html[0][1][0] ==~ ".*/test_1_fastqc.html" }, - { assert process.out.html[0][1][1] ==~ ".*/test_2_fastqc.html" }, - { assert process.out.zip[0][1][0] ==~ ".*/test_1_fastqc.zip" }, - { assert process.out.zip[0][1][1] ==~ ".*/test_2_fastqc.zip" }, - { assert path(process.out.html[0][1][0]).text.contains("File typeConventional base calls") }, - { assert path(process.out.html[0][1][1]).text.contains("File typeConventional base calls") }, - - { assert snapshot(process.out.versions).match("fastqc_versions_paired") } + { assert process.success }, + { assert process.out.html[0][1][0] ==~ ".*/test_1_fastqc.html" }, + { assert process.out.html[0][1][1] ==~ ".*/test_2_fastqc.html" }, + { assert process.out.zip[0][1][0] ==~ ".*/test_1_fastqc.zip" }, + { assert process.out.zip[0][1][1] ==~ ".*/test_2_fastqc.zip" }, + { assert path(process.out.html[0][1][0]).text.contains("File typeConventional base calls") }, + { assert path(process.out.html[0][1][1]).text.contains("File typeConventional base calls") }, + { assert snapshot(process.out.versions).match() } ) } } @@ -83,13 +78,11 @@ nextflow_process { then { assertAll ( - { assert process.success }, - - { assert process.out.html[0][1] ==~ ".*/test_fastqc.html" }, - { assert process.out.zip[0][1] ==~ ".*/test_fastqc.zip" }, - { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, - - { assert snapshot(process.out.versions).match("fastqc_versions_interleaved") } + { assert process.success }, + { assert process.out.html[0][1] ==~ 
".*/test_fastqc.html" }, + { assert process.out.zip[0][1] ==~ ".*/test_fastqc.zip" }, + { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, + { assert snapshot(process.out.versions).match() } ) } } @@ -109,13 +102,11 @@ nextflow_process { then { assertAll ( - { assert process.success }, - - { assert process.out.html[0][1] ==~ ".*/test_fastqc.html" }, - { assert process.out.zip[0][1] ==~ ".*/test_fastqc.zip" }, - { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, - - { assert snapshot(process.out.versions).match("fastqc_versions_bam") } + { assert process.success }, + { assert process.out.html[0][1] ==~ ".*/test_fastqc.html" }, + { assert process.out.zip[0][1] ==~ ".*/test_fastqc.zip" }, + { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, + { assert snapshot(process.out.versions).match() } ) } } @@ -138,22 +129,20 @@ nextflow_process { then { assertAll ( - { assert process.success }, - - { assert process.out.html[0][1][0] ==~ ".*/test_1_fastqc.html" }, - { assert process.out.html[0][1][1] ==~ ".*/test_2_fastqc.html" }, - { assert process.out.html[0][1][2] ==~ ".*/test_3_fastqc.html" }, - { assert process.out.html[0][1][3] ==~ ".*/test_4_fastqc.html" }, - { assert process.out.zip[0][1][0] ==~ ".*/test_1_fastqc.zip" }, - { assert process.out.zip[0][1][1] ==~ ".*/test_2_fastqc.zip" }, - { assert process.out.zip[0][1][2] ==~ ".*/test_3_fastqc.zip" }, - { assert process.out.zip[0][1][3] ==~ ".*/test_4_fastqc.zip" }, - { assert path(process.out.html[0][1][0]).text.contains("File typeConventional base calls") }, - { assert path(process.out.html[0][1][1]).text.contains("File typeConventional base calls") }, - { assert path(process.out.html[0][1][2]).text.contains("File typeConventional base calls") }, - { assert path(process.out.html[0][1][3]).text.contains("File typeConventional base calls") }, - - { assert 
snapshot(process.out.versions).match("fastqc_versions_multiple") } + { assert process.success }, + { assert process.out.html[0][1][0] ==~ ".*/test_1_fastqc.html" }, + { assert process.out.html[0][1][1] ==~ ".*/test_2_fastqc.html" }, + { assert process.out.html[0][1][2] ==~ ".*/test_3_fastqc.html" }, + { assert process.out.html[0][1][3] ==~ ".*/test_4_fastqc.html" }, + { assert process.out.zip[0][1][0] ==~ ".*/test_1_fastqc.zip" }, + { assert process.out.zip[0][1][1] ==~ ".*/test_2_fastqc.zip" }, + { assert process.out.zip[0][1][2] ==~ ".*/test_3_fastqc.zip" }, + { assert process.out.zip[0][1][3] ==~ ".*/test_4_fastqc.zip" }, + { assert path(process.out.html[0][1][0]).text.contains("File typeConventional base calls") }, + { assert path(process.out.html[0][1][1]).text.contains("File typeConventional base calls") }, + { assert path(process.out.html[0][1][2]).text.contains("File typeConventional base calls") }, + { assert path(process.out.html[0][1][3]).text.contains("File typeConventional base calls") }, + { assert snapshot(process.out.versions).match() } ) } } @@ -173,21 +162,18 @@ nextflow_process { then { assertAll ( - { assert process.success }, - - { assert process.out.html[0][1] ==~ ".*/mysample_fastqc.html" }, - { assert process.out.zip[0][1] ==~ ".*/mysample_fastqc.zip" }, - { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, - - { assert snapshot(process.out.versions).match("fastqc_versions_custom_prefix") } + { assert process.success }, + { assert process.out.html[0][1] ==~ ".*/mysample_fastqc.html" }, + { assert process.out.zip[0][1] ==~ ".*/mysample_fastqc.zip" }, + { assert path(process.out.html[0][1]).text.contains("File typeConventional base calls") }, + { assert snapshot(process.out.versions).match() } ) } } test("sarscov2 single-end [fastq] - stub") { - options "-stub" - + options "-stub" when { process { """ @@ -201,12 +187,123 @@ nextflow_process { then { assertAll ( - { assert process.success }, - { assert 
snapshot(process.out.html.collect { file(it[1]).getName() } + - process.out.zip.collect { file(it[1]).getName() } + - process.out.versions ).match("fastqc_stub") } + { assert process.success }, + { assert snapshot(process.out).match() } ) } } + test("sarscov2 paired-end [fastq] - stub") { + + options "-stub" + when { + process { + """ + input[0] = Channel.of([ + [id: 'test', single_end: false], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) ] + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("sarscov2 interleaved [fastq] - stub") { + + options "-stub" + when { + process { + """ + input[0] = Channel.of([ + [id: 'test', single_end: false], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_interleaved.fastq.gz', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("sarscov2 paired-end [bam] - stub") { + + options "-stub" + when { + process { + """ + input[0] = Channel.of([ + [id: 'test', single_end: false], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("sarscov2 multiple [fastq] - stub") { + + options "-stub" + when { + process { + """ + input[0] = Channel.of([ + [id: 'test', single_end: false], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 
'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test2_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test2_2.fastq.gz', checkIfExists: true) ] + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("sarscov2 custom_prefix - stub") { + + options "-stub" + when { + process { + """ + input[0] = Channel.of([ + [ id:'mysample', single_end:true ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } } diff --git a/modules/nf-core/fastqc/tests/main.nf.test.snap b/modules/nf-core/fastqc/tests/main.nf.test.snap index 86f7c31154..d5db3092fb 100644 --- a/modules/nf-core/fastqc/tests/main.nf.test.snap +++ b/modules/nf-core/fastqc/tests/main.nf.test.snap @@ -1,88 +1,392 @@ { - "fastqc_versions_interleaved": { + "sarscov2 custom_prefix": { "content": [ [ "versions.yml:md5,e1cc25ca8af856014824abd842e93978" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-01-31T17:40:07.293713" + "timestamp": "2024-07-22T11:02:16.374038" }, - "fastqc_stub": { + "sarscov2 single-end [fastq] - stub": { "content": [ - [ - "test.html", - "test.zip", - "versions.yml:md5,e1cc25ca8af856014824abd842e93978" - ] + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": true + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "html": [ + [ + { + "id": "test", + "single_end": true + }, + 
"test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "zip": [ + [ + { + "id": "test", + "single_end": true + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:02:24.993809" + }, + "sarscov2 custom_prefix - stub": { + "content": [ + { + "0": [ + [ + { + "id": "mysample", + "single_end": true + }, + "mysample.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "mysample", + "single_end": true + }, + "mysample.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "html": [ + [ + { + "id": "mysample", + "single_end": true + }, + "mysample.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "zip": [ + [ + { + "id": "mysample", + "single_end": true + }, + "mysample.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-01-31T17:31:01.425198" + "timestamp": "2024-07-22T11:03:10.93942" }, - "fastqc_versions_multiple": { + "sarscov2 interleaved [fastq]": { "content": [ [ "versions.yml:md5,e1cc25ca8af856014824abd842e93978" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-01-31T17:40:55.797907" + "timestamp": "2024-07-22T11:01:42.355718" }, - "fastqc_versions_bam": { + "sarscov2 paired-end [bam]": { "content": [ [ "versions.yml:md5,e1cc25ca8af856014824abd842e93978" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-01-31T17:40:26.795862" + "timestamp": "2024-07-22T11:01:53.276274" }, - "fastqc_versions_single": { + "sarscov2 multiple 
[fastq]": { "content": [ [ "versions.yml:md5,e1cc25ca8af856014824abd842e93978" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-01-31T17:39:27.043675" + "timestamp": "2024-07-22T11:02:05.527626" }, - "fastqc_versions_paired": { + "sarscov2 paired-end [fastq]": { "content": [ [ "versions.yml:md5,e1cc25ca8af856014824abd842e93978" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:01:31.188871" + }, + "sarscov2 paired-end [fastq] - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:02:34.273566" + }, + "sarscov2 multiple [fastq] - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "zip": [ + [ + { + "id": "test", + "single_end": 
false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-01-31T17:39:47.584191" + "timestamp": "2024-07-22T11:03:02.304411" }, - "fastqc_versions_custom_prefix": { + "sarscov2 single-end [fastq]": { "content": [ [ "versions.yml:md5,e1cc25ca8af856014824abd842e93978" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:01:19.095607" + }, + "sarscov2 interleaved [fastq] - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "zip": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-22T11:02:44.640184" + }, + "sarscov2 paired-end [bam] - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "html": [ + [ + { + "id": "test", + "single_end": false + }, + "test.html:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e1cc25ca8af856014824abd842e93978" + ], + "zip": [ + [ + { + "id": "test", + "single_end": false + }, + 
"test.zip:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.3" }, - "timestamp": "2024-01-31T17:41:14.576531" + "timestamp": "2024-07-22T11:02:53.550742" } } \ No newline at end of file diff --git a/modules/nf-core/fgbio/callmolecularconsensusreads/environment.yml b/modules/nf-core/fgbio/callmolecularconsensusreads/environment.yml index 1429e478ec..6f3c75600d 100644 --- a/modules/nf-core/fgbio/callmolecularconsensusreads/environment.yml +++ b/modules/nf-core/fgbio/callmolecularconsensusreads/environment.yml @@ -1,7 +1,5 @@ -name: fgbio_callmolecularconsensusreads channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::fgbio=2.0.2 + - bioconda::fgbio=2.2.1 diff --git a/modules/nf-core/fgbio/callmolecularconsensusreads/main.nf b/modules/nf-core/fgbio/callmolecularconsensusreads/main.nf index e9f209ef16..8a2cdb247e 100644 --- a/modules/nf-core/fgbio/callmolecularconsensusreads/main.nf +++ b/modules/nf-core/fgbio/callmolecularconsensusreads/main.nf @@ -4,11 +4,13 @@ process FGBIO_CALLMOLECULARCONSENSUSREADS { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/fgbio:2.0.2--hdfd78af_0' : - 'biocontainers/fgbio:2.0.2--hdfd78af_0' }" + 'https://depot.galaxyproject.org/singularity/fgbio:2.2.1--hdfd78af_0' : + 'biocontainers/fgbio:2.2.1--hdfd78af_0' }" input: - tuple val(meta), path(bam) + tuple val(meta), path(grouped_bam) + val min_reads + val min_baseq output: tuple val(meta), path("*.bam"), emit: bam @@ -19,19 +21,44 @@ process FGBIO_CALLMOLECULARCONSENSUSREADS { script: def args = task.ext.args ?: '' - def prefix = task.ext.prefix ?: "${meta.id}" + def prefix = task.ext.prefix ?: "${meta.id}_consensus_unmapped" + def mem_gb = 8 + if (!task.memory) { + log.info '[fgbio CallMolecularConsensusReads] Available memory not known - defaulting to 8GB. 
Specify process memory requirements to change this.' + } else { + mem_gb = task.memory.giga + } + if ("$grouped_bam" == "${prefix}.bam") error "Input and output names are the same, use \"task.ext.prefix\" to disambiguate!" """ fgbio \\ + -Xmx${mem_gb}g \\ --tmp-dir=. \\ + --async-io=true \\ + --compression=1 \\ CallMolecularConsensusReads \\ - --input $bam \\ + --input $grouped_bam \\ + --output ${prefix}.bam \\ + --min-reads ${min_reads} \\ + --min-input-base-quality ${min_baseq} \\ --threads ${task.cpus} \\ - $args \\ - --output ${prefix}.bam + $args; cat <<-END_VERSIONS > versions.yml "${task.process}": fgbio: \$( echo \$(fgbio --version 2>&1 | tr -d '[:cntrl:]' ) | sed -e 's/^.*Version: //;s/\\[.*\$//') END_VERSIONS """ + + stub: + prefix = task.ext.prefix ?: "${meta.id}_consensus_unmapped" + if ("$grouped_bam" == "${prefix}.bam") error "Input and output names are the same, use \"task.ext.prefix\" to disambiguate!" + """ + touch ${prefix}.bam + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + fgbio: \$( echo \$(fgbio --version 2>&1 | tr -d '[:cntrl:]' ) | sed -e 's/^.*Version: //;s/\\[.*\$//') + END_VERSIONS + """ + } diff --git a/modules/nf-core/fgbio/callmolecularconsensusreads/meta.yml b/modules/nf-core/fgbio/callmolecularconsensusreads/meta.yml index f4a6ab1bb8..846c297b19 100644 --- a/modules/nf-core/fgbio/callmolecularconsensusreads/meta.yml +++ b/modules/nf-core/fgbio/callmolecularconsensusreads/meta.yml @@ -4,39 +4,47 @@ keywords: - UMIs - consensus sequence - bam - - sam tools: - fgbio: description: Tools for working with genomic and high throughput sequencing data. homepage: https://github.com/fulcrumgenomics/fgbio documentation: http://fulcrumgenomics.github.io/fgbio/ licence: ["MIT"] + identifier: biotools:fgbio input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false, collapse:false ] - - bam: - type: file - description: | - The input SAM or BAM file. 
- pattern: "*.{bam,sam}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false, collapse:false ] + - grouped_bam: + type: file + description: | + The input SAM or BAM file, grouped by UMIs + pattern: "*.{bam,sam}" + - - min_reads: + type: integer + description: Minimum number of original reads to build each consensus read. + - - min_baseq: + type: integer + description: Ignore bases in raw reads that have Q below this value. output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - bam: - type: file - description: | - Output SAM or BAM file to write consensus reads. - pattern: "*.{bam,sam}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bam": + type: file + description: | + Output SAM or BAM file to write consensus reads. + pattern: "*.{bam,sam}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@sruthipsuresh" maintainers: diff --git a/modules/nf-core/fgbio/callmolecularconsensusreads/tests/main.nf.test b/modules/nf-core/fgbio/callmolecularconsensusreads/tests/main.nf.test new file mode 100644 index 0000000000..8a90634051 --- /dev/null +++ b/modules/nf-core/fgbio/callmolecularconsensusreads/tests/main.nf.test @@ -0,0 +1,72 @@ +nextflow_process { + + name "Test Process FGBIO_CALLMOLECULARCONSENSUSREADS" + script "../main.nf" + process "FGBIO_CALLMOLECULARCONSENSUSREADS" + + tag "modules" + tag "modules_nfcore" + tag "fgbio" + tag "fgbio/callmolecularconsensusreads" + tag "fgbio/sortbam" + + setup { + + run("FGBIO_SORTBAM") { + script "../../sortbam/main.nf" + config "./sort.config" + process { + """ + input[0] = [[ id:'homo_sapiens_genome' ], + file(params.modules_testdata_base_path 
+ 'genomics/sarscov2/illumina/bam/test.single_end.bam', checkIfExists: true) + ] + """ + } + } + } + + test("homo_sapiens - bam") { + + when { + process { + """ + input[0] = FGBIO_SORTBAM.out.bam + input[1] = 1 + input[2] = 20 + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("homo_sapiens - stub") { + + options "-stub" + + when { + process { + """ + input[0] = FGBIO_SORTBAM.out.bam + input[1] = 1 + input[2] = 20 + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/fgbio/callmolecularconsensusreads/tests/main.nf.test.snap b/modules/nf-core/fgbio/callmolecularconsensusreads/tests/main.nf.test.snap new file mode 100644 index 0000000000..088f99238c --- /dev/null +++ b/modules/nf-core/fgbio/callmolecularconsensusreads/tests/main.nf.test.snap @@ -0,0 +1,68 @@ +{ + "homo_sapiens - stub": { + "content": [ + { + "0": [ + [ + { + "id": "homo_sapiens_genome" + }, + "homo_sapiens_genome_consensus_unmapped.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,8a39cbc62685ce7afb5ae36609898bde" + ], + "bam": [ + [ + { + "id": "homo_sapiens_genome" + }, + "homo_sapiens_genome_consensus_unmapped.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,8a39cbc62685ce7afb5ae36609898bde" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.10.1" + }, + "timestamp": "2024-05-17T06:01:29.676084265" + }, + "homo_sapiens - bam": { + "content": [ + { + "0": [ + [ + { + "id": "homo_sapiens_genome" + }, + "homo_sapiens_genome_consensus_unmapped.bam:md5,f56c861f1f604ecc9894dc9182b170f8" + ] + ], + "1": [ + "versions.yml:md5,8a39cbc62685ce7afb5ae36609898bde" + ], + "bam": [ + [ + { + "id": "homo_sapiens_genome" + }, + "homo_sapiens_genome_consensus_unmapped.bam:md5,f56c861f1f604ecc9894dc9182b170f8" + ] + ], + "versions": [ + 
"versions.yml:md5,8a39cbc62685ce7afb5ae36609898bde" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.1" + }, + "timestamp": "2024-05-21T10:34:16.47828891" + } +} \ No newline at end of file diff --git a/modules/nf-core/fgbio/callmolecularconsensusreads/tests/sort.config b/modules/nf-core/fgbio/callmolecularconsensusreads/tests/sort.config new file mode 100644 index 0000000000..b205c8f210 --- /dev/null +++ b/modules/nf-core/fgbio/callmolecularconsensusreads/tests/sort.config @@ -0,0 +1,6 @@ +process { + withName: FGBIO_SORTBAM { + ext.args = '-s TemplateCoordinate' + ext.prefix = { "${meta.id}_out" } + } +} diff --git a/modules/nf-core/fgbio/callmolecularconsensusreads/tests/tags.yml b/modules/nf-core/fgbio/callmolecularconsensusreads/tests/tags.yml new file mode 100644 index 0000000000..4f9fcbad0f --- /dev/null +++ b/modules/nf-core/fgbio/callmolecularconsensusreads/tests/tags.yml @@ -0,0 +1,2 @@ +fgbio/callmolecularconsensusreads: + - "modules/nf-core/fgbio/callmolecularconsensusreads/**" diff --git a/modules/nf-core/fgbio/fastqtobam/environment.yml b/modules/nf-core/fgbio/fastqtobam/environment.yml index 4b1b9e6ecf..6f3c75600d 100644 --- a/modules/nf-core/fgbio/fastqtobam/environment.yml +++ b/modules/nf-core/fgbio/fastqtobam/environment.yml @@ -1,7 +1,5 @@ -name: fgbio_fastqtobam channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::fgbio=2.1.0 + - bioconda::fgbio=2.2.1 diff --git a/modules/nf-core/fgbio/fastqtobam/main.nf b/modules/nf-core/fgbio/fastqtobam/main.nf index f50545c442..b1c884e863 100644 --- a/modules/nf-core/fgbio/fastqtobam/main.nf +++ b/modules/nf-core/fgbio/fastqtobam/main.nf @@ -2,12 +2,10 @@ process FGBIO_FASTQTOBAM { tag "$meta.id" label 'process_low' - // WARN: Version information not provided by tool on CLI. Please update version string below when bumping container versions. 
- // --version argument gives the wrong version conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/fgbio:2.1.0--hdfd78af_0' : - 'biocontainers/fgbio:2.1.0--hdfd78af_0' }" + 'https://depot.galaxyproject.org/singularity/fgbio:2.2.1--hdfd78af_0' : + 'biocontainers/fgbio:2.2.1--hdfd78af_0' }" input: tuple val(meta), path(reads) @@ -27,11 +25,22 @@ process FGBIO_FASTQTOBAM { def sample_name = args.contains("--sample") ? "" : "--sample ${prefix}" def library_name = args.contains("--library") ? "" : "--library ${prefix}" - def VERSION = '2.1.0' // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions. - """ + def mem_gb = 8 + if (!task.memory) { + log.info '[fgbio FastqToBam] Available memory not known - defaulting to 8GB. Specify process memory requirements to change this.' + } else if (mem_gb > task.memory.giga) { + if (task.memory.giga < 2) { + mem_gb = 1 + } else { + mem_gb = task.memory.giga - 1 + } + } + """ fgbio \\ + -Xmx${mem_gb}g \\ --tmp-dir=. \\ + --async-io=true \\ FastqToBam \\ ${args} \\ --input ${reads} \\ @@ -41,7 +50,7 @@ process FGBIO_FASTQTOBAM { cat <<-END_VERSIONS > versions.yml "${task.process}": - fgbio: $VERSION + fgbio: \$( echo \$(fgbio --version 2>&1 | tr -d '[:cntrl:]' ) | sed -e 's/^.*Version: //;s/\\[.*\$//') END_VERSIONS """ @@ -50,14 +59,12 @@ process FGBIO_FASTQTOBAM { def prefix = task.ext.prefix ?: "${meta.id}" def suffix = task.ext.suffix ?: "bam" - def VERSION = '2.1.0' // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions. 
- """ touch ${prefix}.${suffix} cat <<-END_VERSIONS > versions.yml "${task.process}": - fgbio: $VERSION + fgbio: \$( echo \$(fgbio --version 2>&1 | tr -d '[:cntrl:]' ) | sed -e 's/^.*Version: //;s/\\[.*\$//') END_VERSIONS """ } diff --git a/modules/nf-core/fgbio/fastqtobam/meta.yml b/modules/nf-core/fgbio/fastqtobam/meta.yml index f26f29da29..bce76cf8c0 100644 --- a/modules/nf-core/fgbio/fastqtobam/meta.yml +++ b/modules/nf-core/fgbio/fastqtobam/meta.yml @@ -7,34 +7,49 @@ keywords: - cram tools: - fgbio: - description: A set of tools for working with genomic and high throughput sequencing data, including UMIs + description: A set of tools for working with genomic and high throughput sequencing + data, including UMIs homepage: http://fulcrumgenomics.github.io/fgbio/ documentation: http://fulcrumgenomics.github.io/fgbio/tools/latest/ tool_dev_url: https://github.com/fulcrumgenomics/fgbio licence: ["MIT"] + identifier: biotools:fgbio input: - - reads: - type: file - description: pair of reads to be converted into BAM file - pattern: "*.{fastq.gz}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - reads: + type: file + description: pair of reads to be converted into BAM file + pattern: "*.{fastq.gz}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - version: - type: file - description: File containing software version - pattern: "*.{version.yml}" - bam: - type: file - description: Unaligned, unsorted BAM file - pattern: "*.{bam}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bam": + type: file + description: Unaligned, unsorted BAM file + pattern: "*.{bam}" - cram: - type: file - description: Unaligned, unsorted CRAM file - pattern: "*.{cram}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.cram": + type: file + description: Unaligned, unsorted CRAM file + pattern: "*.{cram}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@lescai" - "@matthdsm" diff --git a/modules/nf-core/fgbio/fastqtobam/tests/main.nf.test b/modules/nf-core/fgbio/fastqtobam/tests/main.nf.test index da90b17ce4..d10a005220 100644 --- a/modules/nf-core/fgbio/fastqtobam/tests/main.nf.test +++ b/modules/nf-core/fgbio/fastqtobam/tests/main.nf.test @@ -17,8 +17,8 @@ nextflow_process { input[0] = [ [ id:'test', single_end:false ], // meta map [ - file(params.test_data['homo_sapiens']['illumina']['test_umi_1_fastq_gz'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_umi_2_fastq_gz'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/fastq/test.umi_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/fastq/test.umi_2.fastq.gz', checkIfExists: true) ] ] """ @@ -47,8 +47,8 @@ nextflow_process { input[0] = [ [ id:'test', single_end:false ], // meta map [ - file(params.test_data['homo_sapiens']['illumina']['test_umi_1_fastq_gz'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_umi_2_fastq_gz'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/fastq/test.umi_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/fastq/test.umi_2.fastq.gz', checkIfExists: true) ] ] """ @@ -77,8 +77,8 @@ nextflow_process { input[0] = [ [ id:'test', single_end:false ], // meta map [ - file(params.test_data['homo_sapiens']['illumina']['test_umi_1_fastq_gz'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_umi_2_fastq_gz'], checkIfExists: true) + file(params.modules_testdata_base_path + 
'genomics/homo_sapiens/illumina/fastq/test.umi_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/fastq/test.umi_2.fastq.gz', checkIfExists: true) ] ] """ @@ -105,7 +105,9 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:false ], // meta map - file(params.test_data['homo_sapiens']['illumina']['test_umi_1_fastq_gz'], checkIfExists: true) + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/fastq/test.umi_1.fastq.gz', checkIfExists: true) + ] ] """ } @@ -133,8 +135,8 @@ nextflow_process { input[0] = [ [ id:'test', single_end:false ], // meta map [ - file(params.test_data['homo_sapiens']['illumina']['test_umi_1_fastq_gz'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_umi_2_fastq_gz'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/fastq/test.umi_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/fastq/test.umi_2.fastq.gz', checkIfExists: true) ] ] """ @@ -163,8 +165,8 @@ nextflow_process { input[0] = [ [ id:'test', single_end:false ], // meta map [ - file(params.test_data['homo_sapiens']['illumina']['test_umi_1_fastq_gz'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_umi_2_fastq_gz'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/fastq/test.umi_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/fastq/test.umi_2.fastq.gz', checkIfExists: true) ] ] """ @@ -193,8 +195,8 @@ nextflow_process { input[0] = [ [ id:'test', single_end:false ], // meta map [ - file(params.test_data['homo_sapiens']['illumina']['test_umi_1_fastq_gz'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_umi_2_fastq_gz'], checkIfExists: true) + file(params.modules_testdata_base_path + 
'genomics/homo_sapiens/illumina/fastq/test.umi_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/fastq/test.umi_2.fastq.gz', checkIfExists: true) ] ] """ diff --git a/modules/nf-core/fgbio/fastqtobam/tests/main.nf.test.snap b/modules/nf-core/fgbio/fastqtobam/tests/main.nf.test.snap index 00831f0c5f..fe9c01cdab 100644 --- a/modules/nf-core/fgbio/fastqtobam/tests/main.nf.test.snap +++ b/modules/nf-core/fgbio/fastqtobam/tests/main.nf.test.snap @@ -6,10 +6,14 @@ ], "test.cram", [ - "versions.yml:md5,f4e3de8480e34bd985000ee28a1f2405" + "versions.yml:md5,f6c3db3a20ce5e11c96e0ddd8ccd51de" ] ], - "timestamp": "2023-11-27T14:13:21.972159804" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.1" + }, + "timestamp": "2024-05-21T10:35:16.955213727" }, "homo_sapiens - fastq1": { "content": [ @@ -18,10 +22,14 @@ ], [ - "versions.yml:md5,f4e3de8480e34bd985000ee28a1f2405" + "versions.yml:md5,f6c3db3a20ce5e11c96e0ddd8ccd51de" ] ], - "timestamp": "2023-11-27T14:13:42.872247708" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.1" + }, + "timestamp": "2024-05-21T10:35:50.320465546" }, "homo_sapiens - [fastq1, fastq2] - default": { "content": [ @@ -30,10 +38,14 @@ ], [ - "versions.yml:md5,f4e3de8480e34bd985000ee28a1f2405" + "versions.yml:md5,f6c3db3a20ce5e11c96e0ddd8ccd51de" ] ], - "timestamp": "2023-11-27T14:13:10.377114559" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.1" + }, + "timestamp": "2024-05-21T10:34:53.093519753" }, "homo_sapiens - [fastq1, fastq2] - umi": { "content": [ @@ -42,10 +54,14 @@ ], [ - "versions.yml:md5,f4e3de8480e34bd985000ee28a1f2405" + "versions.yml:md5,f6c3db3a20ce5e11c96e0ddd8ccd51de" ] ], - "timestamp": "2023-11-27T14:13:53.4971996" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.1" + }, + "timestamp": "2024-05-21T10:36:07.216840033" }, "homo_sapiens - [fastq1, fastq2] - bam": { "content": [ @@ -54,10 +70,14 @@ ], [ - "versions.yml:md5,f4e3de8480e34bd985000ee28a1f2405" + 
"versions.yml:md5,f6c3db3a20ce5e11c96e0ddd8ccd51de" ] ], - "timestamp": "2023-11-27T14:13:32.920615998" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.1" + }, + "timestamp": "2024-05-21T10:35:33.666898844" }, "homo_sapiens - [fastq1, fastq2] - custom sample": { "content": [ @@ -66,10 +86,14 @@ ], [ - "versions.yml:md5,f4e3de8480e34bd985000ee28a1f2405" + "versions.yml:md5,f6c3db3a20ce5e11c96e0ddd8ccd51de" ] ], - "timestamp": "2023-11-27T14:14:03.897969056" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.1" + }, + "timestamp": "2024-05-21T10:36:24.905643178" }, "homo_sapiens - [fastq1, fastq2] - stub": { "content": [ @@ -78,9 +102,13 @@ ], [ - "versions.yml:md5,f4e3de8480e34bd985000ee28a1f2405" + "versions.yml:md5,f6c3db3a20ce5e11c96e0ddd8ccd51de" ] ], - "timestamp": "2023-11-27T14:14:11.765938243" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.1" + }, + "timestamp": "2024-05-21T10:36:38.954087571" } } \ No newline at end of file diff --git a/modules/nf-core/fgbio/groupreadsbyumi/environment.yml b/modules/nf-core/fgbio/groupreadsbyumi/environment.yml index 58e37bf6bd..6f3c75600d 100644 --- a/modules/nf-core/fgbio/groupreadsbyumi/environment.yml +++ b/modules/nf-core/fgbio/groupreadsbyumi/environment.yml @@ -1,7 +1,5 @@ -name: fgbio_groupreadsbyumi channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::fgbio=2.0.2 + - bioconda::fgbio=2.2.1 diff --git a/modules/nf-core/fgbio/groupreadsbyumi/main.nf b/modules/nf-core/fgbio/groupreadsbyumi/main.nf index 7179290c91..da9bf80ace 100644 --- a/modules/nf-core/fgbio/groupreadsbyumi/main.nf +++ b/modules/nf-core/fgbio/groupreadsbyumi/main.nf @@ -4,35 +4,60 @@ process FGBIO_GROUPREADSBYUMI { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/fgbio:2.0.2--hdfd78af_0' : - 'biocontainers/fgbio:2.0.2--hdfd78af_0' }" + 'https://depot.galaxyproject.org/singularity/fgbio:2.2.1--hdfd78af_0' : + 'biocontainers/fgbio:2.2.1--hdfd78af_0' }" input: - tuple val(meta), path(taggedbam) + tuple val(meta), path(bam) val(strategy) output: - tuple val(meta), path("*_umi-grouped.bam") , emit: bam - tuple val(meta), path("*_umi_histogram.txt"), emit: histogram - path "versions.yml" , emit: versions + tuple val(meta), path("*.bam") , emit: bam + tuple val(meta), path("*histogram.txt"), emit: histogram + path "versions.yml" , emit: versions when: task.ext.when == null || task.ext.when script: def args = task.ext.args ?: '' - def prefix = task.ext.prefix ?: "${meta.id}" + def prefix = task.ext.prefix ?: "${meta.id}_umi-grouped" + def mem_gb = 8 + if (!task.memory) { + log.info '[fgbio FilterConsensusReads] Available memory not known - defaulting to 8GB. Specify process memory requirements to change this.' + } else if (mem_gb > task.memory.giga) { + if (task.memory.giga < 2) { + mem_gb = 1 + } else { + mem_gb = task.memory.giga - 1 + } + } - """ + if ("$bam" == "${prefix}.bam") error "Input and output names are the same, use \"task.ext.prefix\" to disambiguate!" + """ fgbio \\ + -Xmx${mem_gb}g \\ --tmp-dir=. \\ GroupReadsByUmi \\ -s $strategy \\ $args \\ - -i $taggedbam \\ - -o ${prefix}_umi-grouped.bam \\ - -f ${prefix}_umi_histogram.txt + -i $bam \\ + -o ${prefix}.bam \\ + -f ${prefix}_histogram.txt + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + fgbio: \$( echo \$(fgbio --version 2>&1 | tr -d '[:cntrl:]' ) | sed -e 's/^.*Version: //;s/\\[.*\$//') + END_VERSIONS + """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}_umi-grouped" + if ("$bam" == "${prefix}.bam") error "Input and output names are the same, use \"task.ext.prefix\" to disambiguate!" 
+ """ + touch ${prefix}.bam + touch ${prefix}_histogram.txt cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/fgbio/groupreadsbyumi/meta.yml b/modules/nf-core/fgbio/groupreadsbyumi/meta.yml index 02ca91f19f..3e525fd647 100644 --- a/modules/nf-core/fgbio/groupreadsbyumi/meta.yml +++ b/modules/nf-core/fgbio/groupreadsbyumi/meta.yml @@ -12,45 +12,56 @@ keywords: - fgbio tools: - fgbio: - description: A set of tools for working with genomic and high throughput sequencing data, including UMIs + description: A set of tools for working with genomic and high throughput sequencing + data, including UMIs homepage: http://fulcrumgenomics.github.io/fgbio/ documentation: http://fulcrumgenomics.github.io/fgbio/tools/latest/ tool_dev_url: https://github.com/fulcrumgenomics/fgbio licence: ["MIT"] + identifier: biotools:fgbio input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: | - BAM file. Note: the MQ tag is required on reads with mapped mates (!) - pattern: "*.bam" - - strategy: - type: value - description: | - Reguired argument: defines the UMI assignment strategy. - Must be chosen among: Identity, Edit, Adjacency, Paired. + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bam: + type: file + description: | + BAM file. Note: the MQ tag is required on reads with mapped mates (!) + pattern: "*.bam" + - - strategy: + type: string + enum: ["Identity", "Edit", "Adjacency", "Paired"] + description: | + Reguired argument: defines the UMI assignment strategy. + Must be chosen among: Identity, Edit, Adjacency, Paired. output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - bam: - type: file - description: UMI-grouped BAM - pattern: "*.bam" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bam": + type: file + description: UMI-grouped BAM + pattern: "*.bam" - histogram: - type: file - description: A text file containing the tag family size counts - pattern: "*.txt" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*histogram.txt": + type: file + description: A text file containing the tag family size counts + pattern: "*.txt" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@lescai" maintainers: diff --git a/modules/nf-core/fgbio/groupreadsbyumi/tests/main.nf.test b/modules/nf-core/fgbio/groupreadsbyumi/tests/main.nf.test new file mode 100644 index 0000000000..a9e8bd256c --- /dev/null +++ b/modules/nf-core/fgbio/groupreadsbyumi/tests/main.nf.test @@ -0,0 +1,60 @@ +nextflow_process { + + name "Test Process FGBIO_GROUPREADSBYUMI" + script "../main.nf" + process "FGBIO_GROUPREADSBYUMI" + + tag "modules" + tag "modules_nfcore" + tag "fgbio" + tag "fgbio/groupreadsbyumi" + + test("sarscov2 - bam") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/umi/test.paired_end.unsorted_tagged.bam', checkIfExists: true) + ] + input[1] = "Adjacency" + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - bam - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 
'genomics/homo_sapiens/illumina/bam/umi/test.paired_end.unsorted_tagged.bam', checkIfExists: true) + ] + input[1] = "Adjacency" + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/fgbio/groupreadsbyumi/tests/main.nf.test.snap b/modules/nf-core/fgbio/groupreadsbyumi/tests/main.nf.test.snap new file mode 100644 index 0000000000..dc89a622c7 --- /dev/null +++ b/modules/nf-core/fgbio/groupreadsbyumi/tests/main.nf.test.snap @@ -0,0 +1,108 @@ +{ + "sarscov2 - bam - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test_umi-grouped.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test_umi-grouped_histogram.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,e293eed3614f921114b3bd5b0e1ada10" + ], + "bam": [ + [ + { + "id": "test", + "single_end": false + }, + "test_umi-grouped.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "histogram": [ + [ + { + "id": "test", + "single_end": false + }, + "test_umi-grouped_histogram.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e293eed3614f921114b3bd5b0e1ada10" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.1" + }, + "timestamp": "2024-05-21T10:48:29.108067677" + }, + "sarscov2 - bam": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test_umi-grouped.bam:md5,35bfc992c30d8e3e50816159fa58cb11" + ] + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test_umi-grouped_histogram.txt:md5,9a0c622b65209afbce0840e2affff983" + ] + ], + "2": [ + "versions.yml:md5,e293eed3614f921114b3bd5b0e1ada10" + ], + "bam": [ + [ + { + "id": "test", + "single_end": false + }, + "test_umi-grouped.bam:md5,35bfc992c30d8e3e50816159fa58cb11" + ] + ], + "histogram": [ + [ + { + "id": "test", + "single_end": false + }, 
+ "test_umi-grouped_histogram.txt:md5,9a0c622b65209afbce0840e2affff983" + ] + ], + "versions": [ + "versions.yml:md5,e293eed3614f921114b3bd5b0e1ada10" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.1" + }, + "timestamp": "2024-05-21T10:48:14.269677258" + } +} \ No newline at end of file diff --git a/modules/nf-core/fgbio/groupreadsbyumi/tests/tags.yml b/modules/nf-core/fgbio/groupreadsbyumi/tests/tags.yml new file mode 100644 index 0000000000..83146c4414 --- /dev/null +++ b/modules/nf-core/fgbio/groupreadsbyumi/tests/tags.yml @@ -0,0 +1,2 @@ +fgbio/groupreadsbyumi: + - "modules/nf-core/fgbio/groupreadsbyumi/**" diff --git a/modules/nf-core/freebayes/environment.yml b/modules/nf-core/freebayes/environment.yml index 6846080a2f..3f59369680 100644 --- a/modules/nf-core/freebayes/environment.yml +++ b/modules/nf-core/freebayes/environment.yml @@ -1,7 +1,5 @@ -name: freebayes channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::freebayes=1.3.6 diff --git a/modules/nf-core/freebayes/meta.yml b/modules/nf-core/freebayes/meta.yml index 1803b2b319..45fc61d5bd 100644 --- a/modules/nf-core/freebayes/meta.yml +++ b/modules/nf-core/freebayes/meta.yml @@ -16,95 +16,101 @@ tools: tool_dev_url: https://github.com/freebayes/freebayes doi: "10.48550/arXiv.1207.3907" licence: ["MIT"] + identifier: biotools:freebayes input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input_1: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - input_1_index: - type: file - description: BAM/CRAM/SAM index file - pattern: "*.{bai,crai}" - - input_2: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - input_2_index: - type: file - description: BAM/CRAM/SAM index file - pattern: "*.{bai,crai}" - - target_bed: - type: file - description: Optional - Limit analysis to targets listed in this BED-format FILE. 
- pattern: "*.bed" - - ref_meta: - type: map - description: | - Groovy Map containing reference information. - e.g. [ id:'test_reference' ] - - fasta: - type: file - description: reference fasta file - pattern: ".{fa,fa.gz,fasta,fasta.gz}" - - ref_idx_meta: - type: map - description: | - Groovy Map containing reference information. - e.g. [ id:'test_reference' ] - - fasta_fai: - type: file - description: reference fasta file index - pattern: "*.{fa,fasta}.fai" - - samples_meta: - type: map - description: | - Groovy Map containing meta information for the samples file. - e.g. [ id:'test_samples' ] - - samples: - type: file - description: Optional - Limit analysis to samples listed (one per line) in the FILE. - pattern: "*.txt" - - populations_meta: - type: map - description: | - Groovy Map containing meta information for the populations file. - e.g. [ id:'test_populations' ] - - populations: - type: file - description: Optional - Each line of FILE should list a sample and a population which it is part of. - pattern: "*.txt" - - cnv_meta: - type: map - description: | - Groovy Map containing meta information for the cnv file. - e.g. [ id:'test_cnv' ] - - cnv: - type: file - description: | - A copy number map BED file, which has either a sample-level ploidy: - sample_name copy_number - or a region-specific format: - seq_name start end sample_name copy_number - pattern: "*.bed" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - input_1: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - input_1_index: + type: file + description: BAM/CRAM/SAM index file + pattern: "*.{bai,crai}" + - input_2: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - input_2_index: + type: file + description: BAM/CRAM/SAM index file + pattern: "*.{bai,crai}" + - target_bed: + type: file + description: Optional - Limit analysis to targets listed in this BED-format + FILE. + pattern: "*.bed" + - - ref_meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test_reference' ] + - fasta: + type: file + description: reference fasta file + pattern: ".{fa,fa.gz,fasta,fasta.gz}" + - - ref_idx_meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test_reference' ] + - fasta_fai: + type: file + description: reference fasta file index + pattern: "*.{fa,fasta}.fai" + - - samples_meta: + type: map + description: | + Groovy Map containing meta information for the samples file. + e.g. [ id:'test_samples' ] + - samples: + type: file + description: Optional - Limit analysis to samples listed (one per line) in the + FILE. + pattern: "*.txt" + - - populations_meta: + type: map + description: | + Groovy Map containing meta information for the populations file. + e.g. [ id:'test_populations' ] + - populations: + type: file + description: Optional - Each line of FILE should list a sample and a population + which it is part of. + pattern: "*.txt" + - - cnv_meta: + type: map + description: | + Groovy Map containing meta information for the cnv file. + e.g. 
[ id:'test_cnv' ] + - cnv: + type: file + description: | + A copy number map BED file, which has either a sample-level ploidy: + sample_name copy_number + or a region-specific format: + seq_name start end sample_name copy_number + pattern: "*.bed" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software version - pattern: "versions.yml" - vcf: - type: file - description: Compressed VCF file - pattern: "*.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.vcf.gz": + type: file + description: Compressed VCF file + pattern: "*.vcf.gz" + - versions: + - versions.yml: + type: file + description: File containing software version + pattern: "versions.yml" authors: - "@maxibor" - "@FriederikeHanssen" diff --git a/modules/nf-core/freebayes/tests/main.nf.test b/modules/nf-core/freebayes/tests/main.nf.test index bee25a8e78..eb2d7a8055 100644 --- a/modules/nf-core/freebayes/tests/main.nf.test +++ b/modules/nf-core/freebayes/tests/main.nf.test @@ -15,14 +15,14 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:false ], // meta map - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam_bai'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), [], [], [] ] - input[1] = [ [ id: 'test_fasta' ], file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) ] - input[2] = [ [ id: 'test_fai' ], file(params.test_data['sarscov2']['genome']['genome_fasta_fai'], checkIfExists: true) ] + input[1] 
= [ [ id: 'test_fasta' ], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) ] + input[2] = [ [ id: 'test_fai' ], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true) ] input[3] = [ [], [] ] input[4] = [ [], [] ] input[5] = [ [], [] ] @@ -33,10 +33,12 @@ nextflow_process { then { assertAll( { assert process.success }, - // { assert snapshot(process.out).match() }, // Output VCF includes a timestamp, so snapshot not consistent past a day. - { assert snapshot(file(process.out.vcf.get(0).get(1)).name).match("test.vcf.gz") }, - { assert path(process.out.vcf.get(0).get(1)).linesGzip.toString().contains('MT192765.1\t10214\t.\tATTTAC\tATTAC\t29.8242') }, - { assert snapshot(process.out.versions).match() }, + { assert snapshot( + file(process.out.vcf[0][1]).name, + path(process.out.vcf[0][1]).linesGzip[2..10], + process.out.versions + ).match() + } ) } @@ -49,14 +51,14 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:false ], // meta map - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam_bai'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), [], [], - file(params.test_data['sarscov2']['genome']['test_bed'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test.bed', checkIfExists: true), ] - input[1] = [ [ id: 'fasta' ], file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) ] - input[2] = [ [ id: 'fai' ], file(params.test_data['sarscov2']['genome']['genome_fasta_fai'], checkIfExists: true) ] + input[1] = [ [ id: 'fasta' ], 
file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) ] + input[2] = [ [ id: 'fai' ], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true) ] input[3] = [ [], [] ] input[4] = [ [], [] ] input[5] = [ [], [] ] @@ -67,9 +69,12 @@ nextflow_process { then { assertAll( { assert process.success }, - // { assert snapshot(process.out).match() }, // Output VCF includes a timestamp, so snapshot not consistent past a day. - { assert snapshot(file(process.out.vcf.get(0).get(1)).name).match("test.vcf.gz") }, - { assert snapshot(process.out.versions).match() }, + { assert snapshot( + file(process.out.vcf[0][1]).name, + path(process.out.vcf[0][1]).linesGzip[2..10], + process.out.versions + ).match() + } ) } @@ -82,14 +87,14 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:false ], // meta map - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_cram'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_cram_crai'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), [], [], [], ] - input[1] = [ [ id: 'fasta' ], file(params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true) ] - input[2] = [ [ id: 'fai' ], file(params.test_data['homo_sapiens']['genome']['genome_fasta_fai'], checkIfExists: true) ] + input[1] = [ [ id: 'fasta' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) ] + input[2] = [ [ id: 'fai' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true) ] input[3] = [ [], [] ] input[4] = [ [], [] ] input[5] = [ [], [] ] @@ 
-100,10 +105,12 @@ nextflow_process { then { assertAll( { assert process.success }, - // { assert snapshot(process.out).match() }, // Output VCF includes a timestamp, so snapshot not consistent past a day. - { assert snapshot(file(process.out.vcf.get(0).get(1)).name).match("test.vcf.gz") }, - { assert path(process.out.vcf.get(0).get(1)).linesGzip.toString().contains("chr22\t1982\t.\tA\tG\t459.724") }, - { assert snapshot(process.out.versions).match() }, + { assert snapshot( + file(process.out.vcf[0][1]).name, + path(process.out.vcf[0][1]).linesGzip[2..10], + process.out.versions + ).match() + } ) } @@ -116,14 +123,14 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:false ], // meta map - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam_bai'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_sorted_bam_bai'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.sorted.bam.bai', checkIfExists: true), [], ] - input[1] = [ [ id: 'fasta' ], file(params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true) ] - input[2] = [ [ id: 'fai' ], file(params.test_data['homo_sapiens']['genome']['genome_fasta_fai'], checkIfExists: true) ] + input[1] = [ [ id: 'fasta' ], file(params.modules_testdata_base_path + 
'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) ] + input[2] = [ [ id: 'fai' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true) ] input[3] = [ [], [] ] input[4] = [ [], [] ] input[5] = [ [], [] ] @@ -134,10 +141,12 @@ nextflow_process { then { assertAll( { assert process.success }, - // { assert snapshot(process.out).match() }, // Output VCF includes a timestamp, so snapshot not consistent past a day. - { assert snapshot(file(process.out.vcf.get(0).get(1)).name).match("test.vcf.gz") }, - { assert path(process.out.vcf.get(0).get(1)).linesGzip.toString().contains("chr22\t1982\t.\tA\tG\t670.615") }, - { assert snapshot(process.out.versions).match() }, + { assert snapshot( + file(process.out.vcf[0][1]).name, + path(process.out.vcf[0][1]).linesGzip[2..10], + process.out.versions + ).match() + } ) } @@ -150,14 +159,14 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:false ], // meta map - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_cram'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_cram_crai'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_sorted_cram'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_sorted_cram_crai'], checkIfExists: true), - file(params.test_data['homo_sapiens']['genome']['genome_bed'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test2.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 
'genomics/homo_sapiens/illumina/cram/test2.paired_end.sorted.cram.crai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true), ] - input[1] = [ [ id: 'fasta' ], file(params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true) ] - input[2] = [ [ id: 'fai' ], file(params.test_data['homo_sapiens']['genome']['genome_fasta_fai'], checkIfExists: true) ] + input[1] = [ [ id: 'fasta' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) ] + input[2] = [ [ id: 'fai' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true) ] input[3] = [ [], [] ] input[4] = [ [], [] ] input[5] = [ [], [] ] @@ -168,10 +177,12 @@ nextflow_process { then { assertAll( { assert process.success }, - // { assert snapshot(process.out).match() }, // Output VCF includes a timestamp, so snapshot not consistent past a day. 
- { assert snapshot(file(process.out.vcf.get(0).get(1)).name).match("test.vcf.gz") }, - { assert path(process.out.vcf.get(0).get(1)).linesGzip.toString().contains("chr22\t1982\t.\tA\tG\t670.615") }, - { assert snapshot(process.out.versions).match() }, + { assert snapshot( + file(process.out.vcf[0][1]).name, + path(process.out.vcf[0][1]).linesGzip[2..10], + process.out.versions + ).match() + } ) } diff --git a/modules/nf-core/freebayes/tests/main.nf.test.snap b/modules/nf-core/freebayes/tests/main.nf.test.snap index 9760a680e8..f9f25a2ee2 100644 --- a/modules/nf-core/freebayes/tests/main.nf.test.snap +++ b/modules/nf-core/freebayes/tests/main.nf.test.snap @@ -1,48 +1,122 @@ { "sarscov2 - [ cram, crai, cram, crai, bed ] - fasta - fai": { "content": [ + "test.vcf.gz", + [ + "##source=freeBayes v1.3.6", + "##reference=genome.fasta", + "##contig=", + "##phasing=none", + "##commandline=\"freebayes -f genome.fasta --target genome.bed test.paired_end.sorted.cram test2.paired_end.sorted.cram\"", + "##INFO=", + "##INFO=", + "##INFO=", + "##INFO=" + ], [ "versions.yml:md5,4d24a735eabf2f037ab935511a2bc99c" ] ], - "timestamp": "2023-12-13T12:20:01.263906" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-14T16:16:26.462458" }, "sarscov2 - [ bam, bai ] - fasta - fai": { "content": [ + "test.vcf.gz", + [ + "##source=freeBayes v1.3.6", + "##reference=genome.fasta", + "##contig=", + "##phasing=none", + "##commandline=\"freebayes -f genome.fasta test.paired_end.sorted.bam\"", + "##INFO=", + "##INFO=", + "##INFO=", + "##INFO=" + ], [ "versions.yml:md5,4d24a735eabf2f037ab935511a2bc99c" ] ], - "timestamp": "2023-12-13T12:19:37.06375" - }, - "test.vcf.gz": { - "content": [ - "test.vcf.gz" - ], - "timestamp": "2023-12-13T12:19:37.050165" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-14T16:16:02.547974" }, "sarscov2 - [ cram, crai ] - fasta - fai": { "content": [ + "test.vcf.gz", + [ + "##source=freeBayes 
v1.3.6", + "##reference=genome.fasta", + "##contig=", + "##phasing=none", + "##commandline=\"freebayes -f genome.fasta test.paired_end.sorted.cram\"", + "##INFO=", + "##INFO=", + "##INFO=", + "##INFO=" + ], [ "versions.yml:md5,4d24a735eabf2f037ab935511a2bc99c" ] ], - "timestamp": "2023-12-13T12:19:48.797103" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-14T16:16:13.594195" }, "sarscov2 - [ bam, bai, bed ] - fasta - fai": { "content": [ + "test.vcf.gz", + [ + "##source=freeBayes v1.3.6", + "##reference=genome.fasta", + "##contig=", + "##phasing=none", + "##commandline=\"freebayes -f genome.fasta --target test.bed test.paired_end.sorted.bam\"", + "##INFO=", + "##INFO=", + "##INFO=", + "##INFO=" + ], [ "versions.yml:md5,4d24a735eabf2f037ab935511a2bc99c" ] ], - "timestamp": "2023-12-13T12:19:43.147912" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-14T16:16:07.514526" }, "sarscov2 - [ bam, bai, bam, bai ] - fasta - fai": { "content": [ + "test.vcf.gz", + [ + "##source=freeBayes v1.3.6", + "##reference=genome.fasta", + "##contig=", + "##phasing=none", + "##commandline=\"freebayes -f genome.fasta test.paired_end.sorted.bam test2.paired_end.sorted.bam\"", + "##INFO=", + "##INFO=", + "##INFO=", + "##INFO=" + ], [ "versions.yml:md5,4d24a735eabf2f037ab935511a2bc99c" ] ], - "timestamp": "2023-12-13T12:19:55.186773" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-14T16:16:20.085987" } } \ No newline at end of file diff --git a/modules/nf-core/gatk4/applybqsr/environment.yml b/modules/nf-core/gatk4/applybqsr/environment.yml index 80c811e6c1..55993f440c 100644 --- a/modules/nf-core/gatk4/applybqsr/environment.yml +++ b/modules/nf-core/gatk4/applybqsr/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_applybqsr channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/applybqsr/main.nf 
b/modules/nf-core/gatk4/applybqsr/main.nf index 78db9d7f00..4e91c311b0 100644 --- a/modules/nf-core/gatk4/applybqsr/main.nf +++ b/modules/nf-core/gatk4/applybqsr/main.nf @@ -48,4 +48,17 @@ process GATK4_APPLYBQSR { gatk4: \$(echo \$(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*\$//') END_VERSIONS """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + def input_extension = input.getExtension() + def output_extension = input_extension == 'bam' ? 'bam' : 'cram' + """ + touch ${prefix}.${output_extension} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + gatk4: \$(echo \$(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*\$//') + END_VERSIONS + """ } diff --git a/modules/nf-core/gatk4/applybqsr/meta.yml b/modules/nf-core/gatk4/applybqsr/meta.yml index ab9efea3f4..65d9c9e9a8 100644 --- a/modules/nf-core/gatk4/applybqsr/meta.yml +++ b/modules/nf-core/gatk4/applybqsr/meta.yml @@ -16,56 +16,65 @@ tools: documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM file from alignment - pattern: "*.{bam,cram}" - - input_index: - type: file - description: BAI/CRAI file from alignment - pattern: "*.{bai,crai}" - - bqsr_table: - type: file - description: Recalibration table from gatk4_baserecalibrator - - intervals: - type: file - description: Bed file with the genomic regions included in the library (optional) - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - fai: - type: file - description: Index of reference fasta file - pattern: "*.fasta.fai" - - dict: - type: file - description: GATK sequence dictionary - pattern: "*.dict" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM file from alignment + pattern: "*.{bam,cram}" + - input_index: + type: file + description: BAI/CRAI file from alignment + pattern: "*.{bai,crai}" + - bqsr_table: + type: file + description: Recalibration table from gatk4_baserecalibrator + - intervals: + type: file + description: Bed file with the genomic regions included in the library (optional) + - - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - fai: + type: file + description: Index of reference fasta file + pattern: "*.fasta.fai" + - - dict: + type: file + description: GATK sequence dictionary + pattern: "*.dict" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - bam: - type: file - description: Recalibrated BAM file - pattern: "*.{bam}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bam": + type: file + description: Recalibrated BAM file + pattern: "*.{bam}" - cram: - type: file - description: Recalibrated CRAM file - pattern: "*.{cram}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.cram": + type: file + description: Recalibrated CRAM file + pattern: "*.{cram}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@yocra3" - "@FriederikeHanssen" diff --git a/modules/nf-core/gatk4/applybqsr/tests/main.nf.test b/modules/nf-core/gatk4/applybqsr/tests/main.nf.test index 3d9c6204f7..acb41ce1e4 100644 --- a/modules/nf-core/gatk4/applybqsr/tests/main.nf.test +++ b/modules/nf-core/gatk4/applybqsr/tests/main.nf.test @@ -92,4 +92,63 @@ nextflow_process { } } + test("sarscov2 - cram - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/test.baserecalibrator.table', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true) + ] + input[1] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) + input[2] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true) + input[3] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.dict', checkIfExists: true) + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() }, + ) + } + } + + test("sarscov2 - bam - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 
'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/gatk/test.baserecalibrator.table', checkIfExists: true), + [] + ] + input[1] = file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + input[2] = file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true) + input[3] = file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.dict', checkIfExists: true) + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + } diff --git a/modules/nf-core/gatk4/applybqsr/tests/main.nf.test.snap b/modules/nf-core/gatk4/applybqsr/tests/main.nf.test.snap index a387039d6a..19b37d0636 100644 --- a/modules/nf-core/gatk4/applybqsr/tests/main.nf.test.snap +++ b/modules/nf-core/gatk4/applybqsr/tests/main.nf.test.snap @@ -1,4 +1,43 @@ { + "sarscov2 - bam - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + + ], + "2": [ + "versions.yml:md5,bb2a060a0280c812fba3c74b1707b350" + ], + "bam": [ + [ + { + "id": "test" + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "cram": [ + + ], + "versions": [ + "versions.yml:md5,bb2a060a0280c812fba3c74b1707b350" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-20T10:25:00.314573" + }, "sarscov2 - bam - intervals": { "content": [ { @@ -60,6 +99,45 @@ }, "timestamp": "2023-12-09T03:10:46.70859771" }, + "sarscov2 - cram - stub": { + "content": [ + { + "0": [ + + ], + "1": [ + [ + { + "id": "test" + }, + "test.cram:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,bb2a060a0280c812fba3c74b1707b350" + ], + "bam": [ + + ], + "cram": [ + [ + { + "id": "test" + }, + 
"test.cram:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,bb2a060a0280c812fba3c74b1707b350" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-20T10:24:52.761169" + }, "sarscov2 - bam": { "content": [ { diff --git a/modules/nf-core/gatk4/applyvqsr/environment.yml b/modules/nf-core/gatk4/applyvqsr/environment.yml index c043cd632a..55993f440c 100644 --- a/modules/nf-core/gatk4/applyvqsr/environment.yml +++ b/modules/nf-core/gatk4/applyvqsr/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_applyvqsr channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/applyvqsr/meta.yml b/modules/nf-core/gatk4/applyvqsr/meta.yml index de5d6d067a..ceedff621e 100644 --- a/modules/nf-core/gatk4/applyvqsr/meta.yml +++ b/modules/nf-core/gatk4/applyvqsr/meta.yml @@ -19,57 +19,72 @@ tools: documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test'] - - vcf: - type: file - description: VCF file to be recalibrated, this should be the same file as used for the first stage VariantRecalibrator. - pattern: "*.vcf" - - vcf_tbi: - type: file - description: tabix index for the input vcf file. - pattern: "*.vcf.tbi" - - recal: - type: file - description: Recalibration file produced when the input vcf was run through VariantRecalibrator in stage 1. - pattern: "*.recal" - - recal_index: - type: file - description: Index file for the recalibration file. - pattern: ".recal.idx" - - tranches: - type: file - description: Tranches file produced when the input vcf was run through VariantRecalibrator in stage 1. 
- pattern: ".tranches" - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - fai: - type: file - description: Index of reference fasta file - pattern: "*.fasta.fai" - - dict: - type: file - description: GATK sequence dictionary - pattern: "*.dict" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - vcf: + type: file + description: VCF file to be recalibrated, this should be the same file as used + for the first stage VariantRecalibrator. + pattern: "*.vcf" + - vcf_tbi: + type: file + description: tabix index for the input vcf file. + pattern: "*.vcf.tbi" + - recal: + type: file + description: Recalibration file produced when the input vcf was run through + VariantRecalibrator in stage 1. + pattern: "*.recal" + - recal_index: + type: file + description: Index file for the recalibration file. + pattern: ".recal.idx" + - tranches: + type: file + description: Tranches file produced when the input vcf was run through VariantRecalibrator + in stage 1. + pattern: ".tranches" + - - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - fai: + type: file + description: Index of reference fasta file + pattern: "*.fasta.fai" + - - dict: + type: file + description: GATK sequence dictionary + pattern: "*.dict" output: - vcf: - type: file - description: compressed vcf file containing the recalibrated variants. - pattern: "*.vcf.gz" + - meta: + type: file + description: compressed vcf file containing the recalibrated variants. + pattern: "*.vcf.gz" + - "*.vcf.gz": + type: file + description: compressed vcf file containing the recalibrated variants. + pattern: "*.vcf.gz" - tbi: - type: file - description: Index of recalibrated vcf file. - pattern: "*vcf.gz.tbi" + - meta: + type: file + description: Index of recalibrated vcf file. + pattern: "*vcf.gz.tbi" + - "*.tbi": + type: file + description: Index of recalibrated vcf file. 
+ pattern: "*vcf.gz.tbi" - versions: - type: file - description: File containing software versions. - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions. + pattern: "versions.yml" authors: - "@GCJMackenzie" maintainers: diff --git a/modules/nf-core/gatk4/baserecalibrator/environment.yml b/modules/nf-core/gatk4/baserecalibrator/environment.yml index 365e5c6319..55993f440c 100644 --- a/modules/nf-core/gatk4/baserecalibrator/environment.yml +++ b/modules/nf-core/gatk4/baserecalibrator/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_baserecalibrator channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/baserecalibrator/meta.yml b/modules/nf-core/gatk4/baserecalibrator/meta.yml index 8252b8c290..876b796039 100644 --- a/modules/nf-core/gatk4/baserecalibrator/meta.yml +++ b/modules/nf-core/gatk4/baserecalibrator/meta.yml @@ -16,57 +16,60 @@ tools: documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM file from alignment - pattern: "*.{bam,cram}" - - input_index: - type: file - description: BAI/CRAI file from alignment - pattern: "*.{bai,crai}" - - intervals: - type: file - description: Bed file with the genomic regions included in the library (optional) - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - fai: - type: file - description: Index of reference fasta file - pattern: "*.fasta.fai" - - dict: - type: file - description: GATK sequence dictionary - pattern: "*.dict" - - known_sites: - type: file - description: VCF files with known sites for indels / snps (optional) - pattern: "*.vcf.gz" - - known_sites_tbi: - type: file - description: Tabix index of the known_sites (optional) - pattern: "*.vcf.gz.tbi" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM file from alignment + pattern: "*.{bam,cram}" + - input_index: + type: file + description: BAI/CRAI file from alignment + pattern: "*.{bai,crai}" + - intervals: + type: file + description: Bed file with the genomic regions included in the library (optional) + - - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - fai: + type: file + description: Index of reference fasta file + pattern: "*.fasta.fai" + - - dict: + type: file + description: GATK sequence dictionary + pattern: "*.dict" + - - known_sites: + type: file + description: VCF files with known sites for indels / snps (optional) + pattern: "*.vcf.gz" + - - known_sites_tbi: + type: file + description: Tabix index of the known_sites (optional) + pattern: "*.vcf.gz.tbi" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - table: - type: file - description: Recalibration table from BaseRecalibrator - pattern: "*.{table}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.table": + type: file + description: Recalibration table from BaseRecalibrator + pattern: "*.{table}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@yocra3" - "@FriederikeHanssen" diff --git a/modules/nf-core/gatk4/calculatecontamination/environment.yml b/modules/nf-core/gatk4/calculatecontamination/environment.yml index 5ec9c48293..55993f440c 100644 --- a/modules/nf-core/gatk4/calculatecontamination/environment.yml +++ b/modules/nf-core/gatk4/calculatecontamination/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_calculatecontamination channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/calculatecontamination/meta.yml b/modules/nf-core/gatk4/calculatecontamination/meta.yml index b0ffe814c5..ee90a48252 100644 --- a/modules/nf-core/gatk4/calculatecontamination/meta.yml +++ b/modules/nf-core/gatk4/calculatecontamination/meta.yml @@ -17,33 +17,50 @@ tools: documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test' ] - - pileup: - type: file - description: File containing the pileups summary table of a tumor sample to be used to calculate contamination. - pattern: "*.pileups.table" - - matched: - type: file - description: File containing the pileups summary table of a normal sample that matches with the tumor sample specified in pileup argument. 
This is an optional input. - pattern: "*.pileups.table" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test' ] + - pileup: + type: file + description: File containing the pileups summary table of a tumor sample to + be used to calculate contamination. + pattern: "*.pileups.table" + - matched: + type: file + description: File containing the pileups summary table of a normal sample that + matches with the tumor sample specified in pileup argument. This is an optional + input. + pattern: "*.pileups.table" output: - contamination: - type: file - description: File containing the contamination table. - pattern: "*.contamination.table" + - meta: + type: file + description: File containing the contamination table. + pattern: "*.contamination.table" + - "*.contamination.table": + type: file + description: File containing the contamination table. + pattern: "*.contamination.table" - segmentation: - type: file - description: output table containing segmentation of tumor minor allele fractions (optional) - pattern: "*.segmentation.table" + - meta: + type: file + description: output table containing segmentation of tumor minor allele fractions + (optional) + pattern: "*.segmentation.table" + - "*.segmentation.table": + type: file + description: output table containing segmentation of tumor minor allele fractions + (optional) + pattern: "*.segmentation.table" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@GCJMackenzie" - "@maxulysse" diff --git a/modules/nf-core/gatk4/cnnscorevariants/meta.yml b/modules/nf-core/gatk4/cnnscorevariants/meta.yml index 8a9d0f51c2..b55c9d9995 100644 --- a/modules/nf-core/gatk4/cnnscorevariants/meta.yml +++ b/modules/nf-core/gatk4/cnnscorevariants/meta.yml @@ -14,65 +14,74 @@ tools: documentation: 
https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: VCF file - pattern: "*.vcf.gz" - - tbi: - type: file - description: VCF index file - pattern: "*.vcf.gz.tbi" - - aligned_input: - type: file - description: BAM/CRAM file from alignment (optional) - pattern: "*.{bam,cram}" - - intervals: - type: file - description: Bed file with the genomic regions included in the library (optional) - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - fai: - type: file - description: Index of reference fasta file - pattern: "*.fasta.fai" - - dict: - type: file - description: GATK sequence dictionary - pattern: "*.dict" - - architecture: - type: file - description: Neural Net architecture configuration json file (optional) - pattern: "*.json" - - weights: - type: file - description: Keras model HD5 file with neural net weights. (optional) - pattern: "*.hd5" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - vcf: + type: file + description: VCF file + pattern: "*.vcf.gz" + - tbi: + type: file + description: VCF index file + pattern: "*.vcf.gz.tbi" + - aligned_input: + type: file + description: BAM/CRAM file from alignment (optional) + pattern: "*.{bam,cram}" + - intervals: + type: file + description: Bed file with the genomic regions included in the library (optional) + - - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - fai: + type: file + description: Index of reference fasta file + pattern: "*.fasta.fai" + - - dict: + type: file + description: GATK sequence dictionary + pattern: "*.dict" + - - architecture: + type: file + description: Neural Net architecture configuration json file (optional) + pattern: "*.json" + - - weights: + type: file + description: Keras model HD5 file with neural net weights. (optional) + pattern: "*.hd5" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - vcf: - type: file - description: Annotated VCF file - pattern: "*.vcf" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*cnn.vcf.gz": + type: file + description: Annotated VCF file + pattern: "*.vcf" - tbi: - type: file - description: VCF index file - pattern: "*.vcf.gz.tbi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*cnn.vcf.gz.tbi": + type: file + description: VCF index file + pattern: "*.vcf.gz.tbi" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" maintainers: diff --git a/modules/nf-core/gatk4/createsequencedictionary/environment.yml b/modules/nf-core/gatk4/createsequencedictionary/environment.yml index 78822ad03f..55993f440c 100644 --- a/modules/nf-core/gatk4/createsequencedictionary/environment.yml +++ b/modules/nf-core/gatk4/createsequencedictionary/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_createsequencedictionary channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/createsequencedictionary/meta.yml b/modules/nf-core/gatk4/createsequencedictionary/meta.yml index f9d70be098..7b5156bb3d 100644 --- a/modules/nf-core/gatk4/createsequencedictionary/meta.yml +++ b/modules/nf-core/gatk4/createsequencedictionary/meta.yml @@ -15,25 +15,32 @@ tools: documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fasta: - type: file - description: Input fasta file - pattern: "*.{fasta,fa}" + - - meta: + type: map + description: | + Groovy Map containing reference information + e.g. 
[ id:'genome' ] + - fasta: + type: file + description: Input fasta file + pattern: "*.{fasta,fa}" output: - dict: - type: file - description: gatk dictionary file - pattern: "*.{dict}" + - meta: + type: file + description: gatk dictionary file + pattern: "*.{dict}" + - "*.dict": + type: file + description: gatk dictionary file + pattern: "*.{dict}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@maxulysse" - "@ramprasadn" diff --git a/modules/nf-core/gatk4/estimatelibrarycomplexity/environment.yml b/modules/nf-core/gatk4/estimatelibrarycomplexity/environment.yml index 5fdd85af80..55993f440c 100644 --- a/modules/nf-core/gatk4/estimatelibrarycomplexity/environment.yml +++ b/modules/nf-core/gatk4/estimatelibrarycomplexity/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_estimatelibrarycomplexity channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/estimatelibrarycomplexity/meta.yml b/modules/nf-core/gatk4/estimatelibrarycomplexity/meta.yml index 2d5bddf6c9..4fb06a3a2e 100644 --- a/modules/nf-core/gatk4/estimatelibrarycomplexity/meta.yml +++ b/modules/nf-core/gatk4/estimatelibrarycomplexity/meta.yml @@ -13,42 +13,45 @@ tools: tool_dev_url: https://github.com/broadinstitute/gatk doi: "10.1158/1538-7445.AM2017-3590" licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - fai: - type: file - description: Index of reference fasta file - pattern: "fasta.fai" - - dict: - type: file - description: GATK sequence dictionary - pattern: "*.dict" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - fai: + type: file + description: Index of reference fasta file + pattern: "fasta.fai" + - - dict: + type: file + description: GATK sequence dictionary + pattern: "*.dict" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - metrics: - type: file - description: File containing metrics on the input files - pattern: "*.{metrics}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.metrics": + type: file + description: File containing metrics on the input files + pattern: "*.{metrics}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" - "@maxulysse" diff --git a/modules/nf-core/gatk4/filtermutectcalls/environment.yml b/modules/nf-core/gatk4/filtermutectcalls/environment.yml index 7494d84dbc..55993f440c 100644 --- a/modules/nf-core/gatk4/filtermutectcalls/environment.yml +++ b/modules/nf-core/gatk4/filtermutectcalls/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_filtermutectcalls channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/filtermutectcalls/meta.yml b/modules/nf-core/gatk4/filtermutectcalls/meta.yml index 736c838625..9287277eb7 100644 --- a/modules/nf-core/gatk4/filtermutectcalls/meta.yml +++ b/modules/nf-core/gatk4/filtermutectcalls/meta.yml @@ -16,83 +16,103 @@ tools: homepage: https://gatk.broadinstitute.org/hc/en-us documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test' ] - - vcf: - type: file - description: compressed vcf file of mutect2calls - pattern: "*.vcf.gz" - - vcf_tbi: - type: file - description: Tabix index of vcf file - pattern: "*vcf.gz.tbi" - - stats: - type: file - description: Stats file that pairs with output vcf file - pattern: "*vcf.gz.stats" - - orientationbias: - type: file - description: files containing artifact priors for input vcf. Optional input. - pattern: "*.artifact-prior.tar.gz" - - segmentation: - type: file - description: tables containing segmentation information for input vcf. Optional input. 
- pattern: "*.segmentation.table" - - table: - type: file - description: table(s) containing contamination data for input vcf. Optional input, takes priority over estimate. - pattern: "*.contamination.table" - - estimate: - type: float - description: estimation of contamination value as a double. Optional input, will only be used if table is not specified. - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - meta3: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fai: - type: file - description: Index of reference fasta file - pattern: "*.fasta.fai" - - meta4: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - dict: - type: file - description: GATK sequence dictionary - pattern: "*.dict" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test' ] + - vcf: + type: file + description: compressed vcf file of mutect2calls + pattern: "*.vcf.gz" + - vcf_tbi: + type: file + description: Tabix index of vcf file + pattern: "*vcf.gz.tbi" + - stats: + type: file + description: Stats file that pairs with output vcf file + pattern: "*vcf.gz.stats" + - orientationbias: + type: file + description: files containing artifact priors for input vcf. Optional input. + pattern: "*.artifact-prior.tar.gz" + - segmentation: + type: file + description: tables containing segmentation information for input vcf. Optional + input. + pattern: "*.segmentation.table" + - table: + type: file + description: table(s) containing contamination data for input vcf. Optional + input, takes priority over estimate. + pattern: "*.contamination.table" + - estimate: + type: float + description: estimation of contamination value as a double. Optional input, + will only be used if table is not specified. 
+ - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fai: + type: file + description: Index of reference fasta file + pattern: "*.fasta.fai" + - - meta4: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - dict: + type: file + description: GATK sequence dictionary + pattern: "*.dict" output: - vcf: - type: file - description: file containing filtered mutect2 calls. - pattern: "*.vcf.gz" + - meta: + type: file + description: file containing filtered mutect2 calls. + pattern: "*.vcf.gz" + - "*.vcf.gz": + type: file + description: file containing filtered mutect2 calls. + pattern: "*.vcf.gz" - tbi: - type: file - description: tbi file that pairs with vcf. - pattern: "*.vcf.gz.tbi" + - meta: + type: file + description: tbi file that pairs with vcf. + pattern: "*.vcf.gz.tbi" + - "*.vcf.gz.tbi": + type: file + description: tbi file that pairs with vcf. + pattern: "*.vcf.gz.tbi" - stats: - type: file - description: file containing statistics of the filtermutectcalls run. - pattern: "*.filteringStats.tsv" + - meta: + type: file + description: file containing statistics of the filtermutectcalls run. + pattern: "*.filteringStats.tsv" + - "*.filteringStats.tsv": + type: file + description: file containing statistics of the filtermutectcalls run. 
+ pattern: "*.filteringStats.tsv" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@GCJMackenzie" - "@maxulysse" diff --git a/modules/nf-core/gatk4/filtervarianttranches/environment.yml b/modules/nf-core/gatk4/filtervarianttranches/environment.yml index 9763cf1eff..55993f440c 100644 --- a/modules/nf-core/gatk4/filtervarianttranches/environment.yml +++ b/modules/nf-core/gatk4/filtervarianttranches/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_filtervarianttranches channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/filtervarianttranches/meta.yml b/modules/nf-core/gatk4/filtervarianttranches/meta.yml index 9346d2b4a4..398bbb07c1 100644 --- a/modules/nf-core/gatk4/filtervarianttranches/meta.yml +++ b/modules/nf-core/gatk4/filtervarianttranches/meta.yml @@ -14,58 +14,72 @@ tools: documentation: https://gatk.broadinstitute.org/hc/en-us/articles/360051308071-FilterVariantTranches doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: a VCF file containing variants, must have info key:CNN_2D - pattern: "*.vcf.gz" - - tbi: - type: file - description: tbi file matching with -vcf - pattern: "*.vcf.gz.tbi" - - resources: - type: list - description: resource A VCF containing known SNP and or INDEL sites. Can be supplied as many times as necessary - pattern: "*.vcf.gz" - - resources_index: - type: list - description: Index of resource VCF containing known SNP and or INDEL sites. 
Can be supplied as many times as necessary - pattern: "*.vcf.gz" - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - fai: - type: file - description: Index of reference fasta file - pattern: "fasta.fai" - - dict: - type: file - description: GATK sequence dictionary - pattern: ".dict" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: a VCF file containing variants, must have info key:CNN_2D + pattern: "*.vcf.gz" + - tbi: + type: file + description: tbi file matching with -vcf + pattern: "*.vcf.gz.tbi" + - intervals: + type: file + description: Intervals + - - resources: + type: list + description: resource A VCF containing known SNP and or INDEL sites. Can be + supplied as many times as necessary + pattern: "*.vcf.gz" + - - resources_index: + type: list + description: Index of resource VCF containing known SNP and or INDEL sites. + Can be supplied as many times as necessary + pattern: "*.vcf.gz" + - - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - fai: + type: file + description: Index of reference fasta file + pattern: "fasta.fai" + - - dict: + type: file + description: GATK sequence dictionary + pattern: ".dict" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - vcf: - type: file - description: VCF file - pattern: "*.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.vcf.gz": + type: file + description: VCF file + pattern: "*.vcf.gz" - tbi: - type: file - description: VCF index file - pattern: "*.vcf.gz.tbi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.vcf.gz.tbi": + type: file + description: VCF index file + pattern: "*.vcf.gz.tbi" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" maintainers: diff --git a/modules/nf-core/gatk4/gatherbqsrreports/environment.yml b/modules/nf-core/gatk4/gatherbqsrreports/environment.yml index 4248a29812..55993f440c 100644 --- a/modules/nf-core/gatk4/gatherbqsrreports/environment.yml +++ b/modules/nf-core/gatk4/gatherbqsrreports/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_gatherbqsrreports channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/gatherbqsrreports/meta.yml b/modules/nf-core/gatk4/gatherbqsrreports/meta.yml index b9f5bf5f8b..587175b3a4 100644 --- a/modules/nf-core/gatk4/gatherbqsrreports/meta.yml +++ b/modules/nf-core/gatk4/gatherbqsrreports/meta.yml @@ -13,30 +13,33 @@ tools: tool_dev_url: https://github.com/broadinstitute/gatk doi: "10.1158/1538-7445.AM2017-3590" licence: ["BSD-3-clause"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - table: - type: file - description: File(s) containing BQSR table(s) - pattern: "*.table" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - table: + type: file + description: File(s) containing BQSR table(s) + pattern: "*.table" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - table: - type: file - description: File containing joined BQSR table - pattern: "*.table" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.table": + type: file + description: File containing joined BQSR table + pattern: "*.table" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" maintainers: diff --git a/modules/nf-core/gatk4/gatherbqsrreports/tests/main.nf.test b/modules/nf-core/gatk4/gatherbqsrreports/tests/main.nf.test new file mode 100644 index 0000000000..173b149b54 --- /dev/null +++ b/modules/nf-core/gatk4/gatherbqsrreports/tests/main.nf.test @@ -0,0 +1,60 @@ + +nextflow_process { + + name "Test Process GATK4_GATHERBQSRREPORTS" + script "../main.nf" + process "GATK4_GATHERBQSRREPORTS" + + tag "modules" + tag "modules_nfcore" + tag "gatk4" + tag "gatk4/gatherbqsrreports" + + test("test-gatk4-gatherbqsrreports") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/test.baserecalibrator.table', checkIfExists: true) + ] + + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test-gatk4-gatherbqsrreports-multiple") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/test.baserecalibrator.table', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/test2.baserecalibrator.table', checkIfExists: true) + ] + ] + + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + +} diff --git a/modules/nf-core/gatk4/gatherbqsrreports/tests/main.nf.test.snap b/modules/nf-core/gatk4/gatherbqsrreports/tests/main.nf.test.snap new file mode 100644 index 
0000000000..bc5d4bd133 --- /dev/null +++ b/modules/nf-core/gatk4/gatherbqsrreports/tests/main.nf.test.snap @@ -0,0 +1,72 @@ +{ + "test-gatk4-gatherbqsrreports-multiple": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.table:md5,0c1257eececf95db8ca378272d0f21f9" + ] + ], + "1": [ + "versions.yml:md5,413fc0014d5dc41ab67d65f59f61a4a0" + ], + "table": [ + [ + { + "id": "test", + "single_end": false + }, + "test.table:md5,0c1257eececf95db8ca378272d0f21f9" + ] + ], + "versions": [ + "versions.yml:md5,413fc0014d5dc41ab67d65f59f61a4a0" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-26T12:22:34.490694" + }, + "test-gatk4-gatherbqsrreports": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.table:md5,9603b69fdc3b5090de2e0dd78bfcc4bf" + ] + ], + "1": [ + "versions.yml:md5,413fc0014d5dc41ab67d65f59f61a4a0" + ], + "table": [ + [ + { + "id": "test", + "single_end": false + }, + "test.table:md5,9603b69fdc3b5090de2e0dd78bfcc4bf" + ] + ], + "versions": [ + "versions.yml:md5,413fc0014d5dc41ab67d65f59f61a4a0" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-26T12:22:10.552951" + } +} \ No newline at end of file diff --git a/modules/nf-core/gatk4/gatherpileupsummaries/environment.yml b/modules/nf-core/gatk4/gatherpileupsummaries/environment.yml index 217387f9c0..55993f440c 100644 --- a/modules/nf-core/gatk4/gatherpileupsummaries/environment.yml +++ b/modules/nf-core/gatk4/gatherpileupsummaries/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_gatherpileupsummaries channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/gatherpileupsummaries/main.nf b/modules/nf-core/gatk4/gatherpileupsummaries/main.nf index 561e9bb8b8..bcafd544b4 100644 --- a/modules/nf-core/gatk4/gatherpileupsummaries/main.nf +++ 
b/modules/nf-core/gatk4/gatherpileupsummaries/main.nf @@ -14,7 +14,7 @@ process GATK4_GATHERPILEUPSUMMARIES { output: tuple val(meta), path("*.pileups.table"), emit: table - path "versions.yml" , emit: versions + path "versions.yml" , emit: versions when: task.ext.when == null || task.ext.when @@ -44,4 +44,15 @@ process GATK4_GATHERPILEUPSUMMARIES { gatk4: \$(echo \$(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*\$//') END_VERSIONS """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + """ + touch ${prefix}.pileups.table + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + gatk4: \$(echo \$(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*\$//') + END_VERSIONS + """ } diff --git a/modules/nf-core/gatk4/gatherpileupsummaries/meta.yml b/modules/nf-core/gatk4/gatherpileupsummaries/meta.yml index 35381a3b51..d8b29d2100 100644 --- a/modules/nf-core/gatk4/gatherpileupsummaries/meta.yml +++ b/modules/nf-core/gatk4/gatherpileupsummaries/meta.yml @@ -12,30 +12,36 @@ tools: tool_dev_url: https://github.com/broadinstitute/gatk doi: "10.1158/1538-7445.AM2017-3590" licence: ["BSD-3-clause"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - pileup: - type: file - description: Pileup files from gatk4/getpileupsummaries - pattern: "*.pileups.table" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - pileup: + type: file + description: Pileup files from gatk4/getpileupsummaries + pattern: "*.pileups.table" + - - dict: + type: file + description: dictionary output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - table: - type: file - description: pileup summaries table file - pattern: "*.pileups.table" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.pileups.table": + type: file + description: pileup summaries table file + pattern: "*.pileups.table" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" - "@maxulysse" diff --git a/modules/nf-core/gatk4/gatherpileupsummaries/tests/main.nf.test b/modules/nf-core/gatk4/gatherpileupsummaries/tests/main.nf.test new file mode 100644 index 0000000000..f33c6a0d94 --- /dev/null +++ b/modules/nf-core/gatk4/gatherpileupsummaries/tests/main.nf.test @@ -0,0 +1,62 @@ + +nextflow_process { + + name "Test Process GATK4_GATHERPILEUPSUMMARIES" + script "../main.nf" + process "GATK4_GATHERPILEUPSUMMARIES" + config "./nextflow.config" + + tag "modules" + tag "modules_nfcore" + tag "gatk4" + tag "gatk4/gatherpileupsummaries" + + test("test-gatk4-gatherpileupsummaries") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/test.pileups.table', checkIfExists: true) ] + ] + input[1] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.dict', checkIfExists: true) + + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test-gatk4-gatherpileupsummaries - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/test.pileups.table', checkIfExists: true) ] + ] + input[1] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.dict', checkIfExists: true) + + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert 
snapshot(process.out).match() } + ) + } + } + +} diff --git a/modules/nf-core/gatk4/gatherpileupsummaries/tests/main.nf.test.snap b/modules/nf-core/gatk4/gatherpileupsummaries/tests/main.nf.test.snap new file mode 100644 index 0000000000..fd9f258344 --- /dev/null +++ b/modules/nf-core/gatk4/gatherpileupsummaries/tests/main.nf.test.snap @@ -0,0 +1,72 @@ +{ + "test-gatk4-gatherpileupsummaries - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.out.pileups.table:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,d3772ab0d5963a88a2748fd83af76c02" + ], + "table": [ + [ + { + "id": "test", + "single_end": false + }, + "test.out.pileups.table:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,d3772ab0d5963a88a2748fd83af76c02" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-20T10:44:42.759098" + }, + "test-gatk4-gatherpileupsummaries": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.out.pileups.table:md5,8e0ca6f66e112bd2f7ec1d31a2d62469" + ] + ], + "1": [ + "versions.yml:md5,d3772ab0d5963a88a2748fd83af76c02" + ], + "table": [ + [ + { + "id": "test", + "single_end": false + }, + "test.out.pileups.table:md5,8e0ca6f66e112bd2f7ec1d31a2d62469" + ] + ], + "versions": [ + "versions.yml:md5,d3772ab0d5963a88a2748fd83af76c02" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-26T12:18:40.835226" + } +} \ No newline at end of file diff --git a/modules/nf-core/gatk4/gatherpileupsummaries/tests/nextflow.config b/modules/nf-core/gatk4/gatherpileupsummaries/tests/nextflow.config new file mode 100644 index 0000000000..2b49a6fa39 --- /dev/null +++ b/modules/nf-core/gatk4/gatherpileupsummaries/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: 'GATK4_GATHERPILEUPSUMMARIES' { + ext.prefix = { "${meta.id}.out" } + } +} diff --git 
a/modules/nf-core/gatk4/genomicsdbimport/environment.yml b/modules/nf-core/gatk4/genomicsdbimport/environment.yml index a3a13636c6..55993f440c 100644 --- a/modules/nf-core/gatk4/genomicsdbimport/environment.yml +++ b/modules/nf-core/gatk4/genomicsdbimport/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_genomicsdbimport channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/genomicsdbimport/meta.yml b/modules/nf-core/gatk4/genomicsdbimport/meta.yml index 11e565b104..174ae2eb0a 100644 --- a/modules/nf-core/gatk4/genomicsdbimport/meta.yml +++ b/modules/nf-core/gatk4/genomicsdbimport/meta.yml @@ -1,5 +1,6 @@ name: gatk4_genomicsdbimport -description: merge GVCFs from multiple samples. For use in joint genotyping or somatic panel of normal creation. +description: merge GVCFs from multiple samples. For use in joint genotyping or somatic + panel of normal creation. keywords: - gatk4 - genomicsdb @@ -15,61 +16,99 @@ tools: homepage: https://gatk.broadinstitute.org/hc/en-us documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test'] - - vcf: - type: list - description: either a list of vcf files to be used to create or update a genomicsdb, or a file that contains a map to vcf files to be used. - pattern: "*.vcf.gz" - - tbi: - type: list - description: list of tbi files that match with the input vcf files - pattern: "*.vcf.gz_tbi" - - wspace: - type: file - description: path to an existing genomicsdb to be used in update db mode or get intervals mode. This WILL NOT specify name of a new genomicsdb in create db mode. 
- pattern: "/path/to/existing/gendb" - - interval_file: - type: file - description: file containing the intervals to be used when creating the genomicsdb - pattern: "*.interval_list" - - interval_value: - type: string - description: if an intervals file has not been spcified, the value enetered here will be used as an interval via the "-L" argument - pattern: "example: chr1:1000-10000" - - run_intlist: - type: boolean - description: Specify whether to run get interval list mode, this option cannot be specified at the same time as run_updatewspace. - pattern: "true/false" - - run_updatewspace: - type: boolean - description: Specify whether to run update genomicsdb mode, this option takes priority over run_intlist. - pattern: "true/false" - - input_map: - type: boolean - description: Specify whether the vcf input is providing a list of vcf file(s) or a single file containing a map of paths to vcf files to be used to create or update a genomicsdb. - pattern: "*.sample_map" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - vcf: + type: list + description: either a list of vcf files to be used to create or update a genomicsdb, + or a file that contains a map to vcf files to be used. + pattern: "*.vcf.gz" + - tbi: + type: list + description: list of tbi files that match with the input vcf files + pattern: "*.vcf.gz_tbi" + - interval_file: + type: file + description: file containing the intervals to be used when creating the genomicsdb + pattern: "*.interval_list" + - interval_value: + type: string + description: if an intervals file has not been spcified, the value enetered + here will be used as an interval via the "-L" argument + pattern: "example: chr1:1000-10000" + - wspace: + type: file + description: path to an existing genomicsdb to be used in update db mode or + get intervals mode. This WILL NOT specify name of a new genomicsdb in create + db mode. 
+ pattern: "/path/to/existing/gendb" + - - run_intlist: + type: boolean + description: Specify whether to run get interval list mode, this option cannot + be specified at the same time as run_updatewspace. + pattern: "true/false" + - - run_updatewspace: + type: boolean + description: Specify whether to run update genomicsdb mode, this option takes + priority over run_intlist. + pattern: "true/false" + - - input_map: + type: boolean + description: Specify whether the vcf input is providing a list of vcf file(s) + or a single file containing a map of paths to vcf files to be used to create + or update a genomicsdb. + pattern: "*.sample_map" output: - genomicsdb: - type: directory - description: Directory containing the files that compose the genomicsdb workspace, this is only output for create mode, as update changes an existing db - pattern: "*/$prefix" + - meta: + type: directory + description: Directory containing the files that compose the genomicsdb workspace, + this is only output for create mode, as update changes an existing db + pattern: "*/$prefix" + - $prefix: + type: directory + description: Directory containing the files that compose the genomicsdb workspace, + this is only output for create mode, as update changes an existing db + pattern: "*/$prefix" - updatedb: - type: directory - description: Directory containing the files that compose the updated genomicsdb workspace, this is only output for update mode, and should be the same path as the input wspace. - pattern: "same/path/as/wspace" + - meta: + type: directory + description: Directory containing the files that compose the updated genomicsdb + workspace, this is only output for update mode, and should be the same path + as the input wspace. + pattern: "same/path/as/wspace" + - $updated_db: + type: directory + description: Directory containing the files that compose the updated genomicsdb + workspace, this is only output for update mode, and should be the same path + as the input wspace. 
+ pattern: "same/path/as/wspace" - intervallist: - type: file - description: File containing the intervals used to generate the genomicsdb, only created by get intervals mode. - pattern: "*.interval_list" + - meta: + type: file + description: File containing the intervals used to generate the genomicsdb, + only created by get intervals mode. + pattern: "*.interval_list" + - "*.interval_list": + type: file + description: File containing the intervals used to generate the genomicsdb, + only created by get intervals mode. + pattern: "*.interval_list" + - list: + type: file + description: File containing the intervals used to generate the genomicsdb, + only created by get intervals mode. + pattern: "*.interval_list" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@GCJMackenzie" maintainers: diff --git a/modules/nf-core/gatk4/genomicsdbimport/tests/main.nf.test b/modules/nf-core/gatk4/genomicsdbimport/tests/main.nf.test index 9c207b3074..5fef5dd254 100644 --- a/modules/nf-core/gatk4/genomicsdbimport/tests/main.nf.test +++ b/modules/nf-core/gatk4/genomicsdbimport/tests/main.nf.test @@ -17,11 +17,11 @@ nextflow_process { """ // [meta, vcf, tbi, interval, interval_value, workspace ] input[0] = [ [ id:'test'], - file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true) , - file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true) , - file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.interval_list', checkIfExists: true) , - [] , - [] ] + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true) , + file(params.modules_testdata_base_path + 
'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true) , + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.interval_list', checkIfExists: true) , + [] , + [] ] // run_intlist input[1] = false // run_updatewspace @@ -36,12 +36,14 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot(file(process.out.genomicsdb.get(0).get(1)).list().sort()).match() } //{ assert snapshot(file(process.out.updatedb.get(0).get(1)).list().sort()).match() } //{ assert snapshot(process.out.intervallist.get(0).get(1)).match() } + { assert snapshot( + file(process.out.genomicsdb.get(0).get(1)).list().sort(), + process.out.versions + ).match() } ) } - } test("test_gatk4_genomicsdbimport_get_intervalslist") { @@ -76,10 +78,12 @@ nextflow_process { { assert process.success }, //{ assert snapshot(file(process.out.genomicsdb.get(0).get(1)).list().sort()).match() } //{ assert snapshot(file(process.out.updatedb.get(0).get(1)).list().sort()).match() } - { assert snapshot(process.out.intervallist.get(0).get(1)).match() } + { assert snapshot( + process.out.intervallist.get(0).get(1), + process.out.versions + ).match() } ) } - } test("test_gatk4_genomicsdbimport_update_genomicsdb") { @@ -113,11 +117,13 @@ nextflow_process { assertAll( { assert process.success }, //{ assert snapshot(file(process.out.genomicsdb.get(0).get(1)).list().sort()).match() } - { assert snapshot(file(process.out.updatedb.get(0).get(1)).list().sort()).match() } //{ assert snapshot(process.out.intervallist.get(0).get(1)).match() } + { assert snapshot( + file(process.out.updatedb.get(0).get(1)).list().sort(), + process.out.versions + ).match() } ) } - } test("test_gatk4_genomicsdbimport_stub") { @@ -129,11 +135,11 @@ nextflow_process { """ // [meta, vcf, tbi, interval, interval_value, workspace ] input[0] = [ [ id:'test'], - file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true) 
, - file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true) , - file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.interval_list', checkIfExists: true) , - [] , - [] ] + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true) , + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true) , + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.interval_list', checkIfExists: true) , + [] , + [] ] // run_intlist input[1] = false // run_updatewspace @@ -147,9 +153,8 @@ nextflow_process { then { assertAll( { assert process.success }, + { assert snapshot(process.out).match()} ) } - } - } diff --git a/modules/nf-core/gatk4/genomicsdbimport/tests/main.nf.test.snap b/modules/nf-core/gatk4/genomicsdbimport/tests/main.nf.test.snap index a633bbdc16..55ced0d880 100644 --- a/modules/nf-core/gatk4/genomicsdbimport/tests/main.nf.test.snap +++ b/modules/nf-core/gatk4/genomicsdbimport/tests/main.nf.test.snap @@ -1,40 +1,98 @@ { "test_gatk4_genomicsdbimport_get_intervalslist": { "content": [ - "test.interval_list:md5,4c85812ac15fc1cd29711a851d23c0bf" + "test.interval_list:md5,4c85812ac15fc1cd29711a851d23c0bf", + [ + "versions.yml:md5,c1233a04213021aa66599a36e0fb28cc" + ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "24.01.0" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-28T17:55:03.846241" + "timestamp": "2024-07-09T10:42:51.836379" }, "test_gatk4_genomicsdbimport_create_genomicsdb": { "content": [ - "__tiledb_workspace.tdb", - "callset.json", - "chr22$1$40001", - "vcfheader.vcf", - "vidmap.json" + [ + "__tiledb_workspace.tdb", + "callset.json", + "chr22$1$40001", + "vcfheader.vcf", + "vidmap.json" + ], + [ + "versions.yml:md5,c1233a04213021aa66599a36e0fb28cc" + ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + 
"nextflow": "24.04.2" }, - "timestamp": "2024-02-14T11:22:10.11423157" + "timestamp": "2024-07-09T10:42:36.846239" }, "test_gatk4_genomicsdbimport_update_genomicsdb": { "content": [ - "__tiledb_workspace.tdb", - "callset.json", - "chr22$1$40001", - "vcfheader.vcf", - "vidmap.json" + [ + "__tiledb_workspace.tdb", + "callset.json", + "chr22$1$40001", + "vcfheader.vcf", + "vidmap.json" + ], + [ + "versions.yml:md5,c1233a04213021aa66599a36e0fb28cc" + ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.2" }, - "timestamp": "2024-02-14T12:46:42.403794676" + "timestamp": "2024-07-09T10:43:09.00769" + }, + "test_gatk4_genomicsdbimport_stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,c1233a04213021aa66599a36e0fb28cc" + ], + "genomicsdb": [ + [ + { + "id": "test" + }, + "test:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "intervallist": [ + + ], + "updatedb": [ + + ], + "versions": [ + "versions.yml:md5,c1233a04213021aa66599a36e0fb28cc" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-09T10:43:20.921712" } } \ No newline at end of file diff --git a/modules/nf-core/gatk4/genotypegvcfs/environment.yml b/modules/nf-core/gatk4/genotypegvcfs/environment.yml index 6e1b7c04a0..55993f440c 100644 --- a/modules/nf-core/gatk4/genotypegvcfs/environment.yml +++ b/modules/nf-core/gatk4/genotypegvcfs/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_genotypegvcfs channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/genotypegvcfs/main.nf b/modules/nf-core/gatk4/genotypegvcfs/main.nf index 3a9fbb4e07..f180f74975 100644 --- a/modules/nf-core/gatk4/genotypegvcfs/main.nf +++ b/modules/nf-core/gatk4/genotypegvcfs/main.nf @@ -8,12 +8,12 @@ process GATK4_GENOTYPEGVCFS { 'biocontainers/gatk4:4.5.0.0--py36hdfd78af_0' }" 
input: - tuple val(meta), path(gvcf), path(gvcf_index), path(intervals), path(intervals_index) - path fasta - path fai - path dict - path dbsnp - path dbsnp_tbi + tuple val(meta), path(input), path(gvcf_index), path(intervals), path(intervals_index) + tuple val(meta2), path(fasta) + tuple val(meta3), path(fai) + tuple val(meta4), path(dict) + tuple val(meta5), path(dbsnp) + tuple val(meta6), path(dbsnp_tbi) output: tuple val(meta), path("*.vcf.gz"), emit: vcf @@ -26,7 +26,7 @@ process GATK4_GENOTYPEGVCFS { script: def args = task.ext.args ?: '' def prefix = task.ext.prefix ?: "${meta.id}" - def gvcf_command = gvcf.name.endsWith(".vcf") || gvcf.name.endsWith(".vcf.gz") ? "$gvcf" : "gendb://$gvcf" + def input_command = input.name.endsWith(".vcf") || input.name.endsWith(".vcf.gz") ? "$input" : "gendb://$input" def dbsnp_command = dbsnp ? "--dbsnp $dbsnp" : "" def interval_command = intervals ? "--intervals $intervals" : "" @@ -39,7 +39,7 @@ process GATK4_GENOTYPEGVCFS { """ gatk --java-options "-Xmx${avail_mem}M -XX:-UsePerfData" \\ GenotypeGVCFs \\ - --variant $gvcf_command \\ + --variant $input_command \\ --output ${prefix}.vcf.gz \\ --reference $fasta \\ $interval_command \\ @@ -57,7 +57,7 @@ process GATK4_GENOTYPEGVCFS { def prefix = task.ext.prefix ?: "${meta.id}" """ - touch ${prefix}.vcf.gz + echo | gzip > ${prefix}.vcf.gz touch ${prefix}.vcf.gz.tbi cat <<-END_VERSIONS > versions.yml diff --git a/modules/nf-core/gatk4/genotypegvcfs/meta.yml b/modules/nf-core/gatk4/genotypegvcfs/meta.yml index 8f1e377eb9..0c1fe491fe 100644 --- a/modules/nf-core/gatk4/genotypegvcfs/meta.yml +++ b/modules/nf-core/gatk4/genotypegvcfs/meta.yml @@ -14,66 +14,101 @@ tools: tool_dev_url: https://github.com/broadinstitute/gatk doi: "10.1158/1538-7445.AM2017-3590" licence: ["BSD-3-clause"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - gvcf: - type: file - description: | - gVCF(.gz) file or to a GenomicsDB - pattern: "*.{vcf,vcf.gz}" - - gvcf_index: - type: file - description: | - index of gvcf file, or empty when providing GenomicsDB - pattern: "*.{idx,tbi}" - - intervals: - type: file - description: Interval file with the genomic regions included in the library (optional) - - intervals_index: - type: file - description: Interval index file (optional) - - fasta: - type: file - description: Reference fasta file - pattern: "*.fasta" - - fai: - type: file - description: Reference fasta index file - pattern: "*.fai" - - dict: - type: file - description: Reference fasta sequence dict file - pattern: "*.dict" - - dbsnp: - type: file - description: dbSNP VCF file - pattern: "*.vcf.gz" - - dbsnp_tbi: - type: file - description: dbSNP VCF index file - pattern: "*.tbi" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: | + gVCF(.gz) file or a GenomicsDB + pattern: "*.{vcf,vcf.gz}" + - gvcf_index: + type: file + description: | + index of gvcf file, or empty when providing GenomicsDB + pattern: "*.{idx,tbi}" + - intervals: + type: file + description: Interval file with the genomic regions included in the library + (optional) + - intervals_index: + type: file + description: Interval index file (optional) + - - meta2: + type: map + description: | + Groovy Map containing fasta information + e.g. [ id:'test' ] + - fasta: + type: file + description: Reference fasta file + pattern: "*.fasta" + - - meta3: + type: map + description: | + Groovy Map containing fai information + e.g. [ id:'test' ] + - fai: + type: file + description: Reference fasta index file + pattern: "*.fai" + - - meta4: + type: map + description: | + Groovy Map containing dict information + e.g. 
[ id:'test' ] + - dict: + type: file + description: Reference fasta sequence dict file + pattern: "*.dict" + - - meta5: + type: map + description: | + Groovy Map containing dbsnp information + e.g. [ id:'test' ] + - dbsnp: + type: file + description: dbSNP VCF file + pattern: "*.vcf.gz" + - - meta6: + type: map + description: | + Groovy Map containing dbsnp tbi information + e.g. [ id:'test' ] + - dbsnp_tbi: + type: file + description: dbSNP VCF index file + pattern: "*.tbi" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - vcf: - type: file - description: Genotyped VCF file - pattern: "*.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.vcf.gz": + type: file + description: Genotyped VCF file + pattern: "*.vcf.gz" - tbi: - type: file - description: Tbi index for VCF file - pattern: "*.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.tbi": + type: file + description: Tbi index for VCF file + pattern: "*.vcf.gz" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@santiagorevale" - "@maxulysse" diff --git a/modules/nf-core/gatk4/genotypegvcfs/tests/main.nf.test b/modules/nf-core/gatk4/genotypegvcfs/tests/main.nf.test new file mode 100644 index 0000000000..25bc2d3806 --- /dev/null +++ b/modules/nf-core/gatk4/genotypegvcfs/tests/main.nf.test @@ -0,0 +1,285 @@ +nextflow_process { + + name "Test Process GATK4_GENOTYPEGVCFS" + script "../main.nf" + process "GATK4_GENOTYPEGVCFS" + + tag "modules" + tag "modules_nfcore" + tag "gatk4" + tag "gatk4/genotypegvcfs" + tag "untar" + + setup { + run("UNTAR") { + script "../../../untar/main.nf" + process { + """ + input[0] = [ + [id:"test"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/test_genomicsdb.tar.gz', checkIfExists: true) + ] + """ + } + } + } + + test("homo_sapiens - [gvcf, idx, [], []], fasta, fai, dict, [], []") { + + when { + process { + """ + input[0] = [ + [id:"test"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.idx', checkIfExists: true), + [], + [] + ] + input[1] = [ + [id:"fasta"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists:true) + ] + input[2] = [ + [id:"fai"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists:true) + ] + input[3] = [ + [id:"dict"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.dict', checkIfExists:true) + ] + input[4] = [[],[]] + input[5] = [[],[]] + """ + } + } + + then { 
+ assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf.collect { [it[0], path(it[1]).vcf.variantsMD5] }, + process.out.tbi.collect { [it[0], file(it[1]).name] }, + process.out.versions + ).match() } + ) + } + + } + + test("homo_sapiens - [gvcf_gz, tbi, [], []], fasta, fai, dict, dbsnp, dbsnp_tbi") { + + when { + process { + """ + input[0] = [ + [id:"test"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true), + [], + [] + ] + input[1] = [ + [id:"fasta"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists:true) + ] + input[2] = [ + [id:"fai"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists:true) + ] + input[3] = [ + [id:"dict"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.dict', checkIfExists:true) + ] + input[4] = [ + [id:"dbsnp"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.vcf.gz', checkIfExists:true) + ] + input[5] = [ + [id:"dbsnp"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.vcf.gz.tbi', checkIfExists:true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf.collect { [it[0], path(it[1]).vcf.variantsMD5] }, + process.out.tbi.collect { [it[0], file(it[1]).name] }, + process.out.versions + ).match() } + ) + } + + } + + test("homo_sapiens - [gvcf_gz, tbi, bed, []], fasta, fai, dict, [], []") { + + when { + process { + """ + input[0] = [ + [id:"test"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 
'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists:true), + [] + ] + input[1] = [ + [id:"fasta"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists:true) + ] + input[2] = [ + [id:"fai"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists:true) + ] + input[3] = [ + [id:"dict"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.dict', checkIfExists:true) + ] + input[4] = [[],[]] + input[5] = [[],[]] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf.collect { [it[0], path(it[1]).vcf.variantsMD5] }, + process.out.tbi.collect { [it[0], file(it[1]).name] }, + process.out.versions + ).match() } + ) + } + + } + + test("homo_sapiens - [gendb, [], [], []], fasta, fai, dict, [], []") { + + when { + process { + """ + input[0] = UNTAR.out.untar.map { meta, gendb -> [ meta, gendb, [], [], []] } + input[1] = [ + [id:"fasta"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists:true) + ] + input[2] = [ + [id:"fai"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists:true) + ] + input[3] = [ + [id:"dict"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.dict', checkIfExists:true) + ] + input[4] = [[],[]] + input[5] = [[],[]] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf.collect { [it[0], path(it[1]).vcf.variantsMD5] }, + process.out.tbi.collect { [it[0], file(it[1]).name] }, + process.out.versions + ).match() } + ) + } + + } + + test("homo_sapiens - [gendb, bed, [], []], fasta, fai, dict, [], []") { + + when { + process { + """ + input[0] = UNTAR.out.untar.map { 
meta, gendb -> + [ + meta, + gendb, + [], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists:true), + [] + ] + } + input[1] = [ + [id:"fasta"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists:true) + ] + input[2] = [ + [id:"fai"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists:true) + ] + input[3] = [ + [id:"dict"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.dict', checkIfExists:true) + ] + input[4] = [[],[]] + input[5] = [[],[]] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.vcf.collect { [it[0], path(it[1]).vcf.variantsMD5] }, + process.out.tbi.collect { [it[0], file(it[1]).name] }, + process.out.versions + ).match() } + ) + } + + } + + test("homo_sapiens - [gvcf, idx, [], []], fasta, fai, dict, [], [] - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [id:"test"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.idx', checkIfExists: true), + [], + [] + ] + input[1] = [ + [id:"fasta"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists:true) + ] + input[2] = [ + [id:"fai"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists:true) + ] + input[3] = [ + [id:"dict"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.dict', checkIfExists:true) + ] + input[4] = [[],[]] + input[5] = [[],[]] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/gatk4/genotypegvcfs/tests/main.nf.test.snap 
b/modules/nf-core/gatk4/genotypegvcfs/tests/main.nf.test.snap new file mode 100644 index 0000000000..1621618e7a --- /dev/null +++ b/modules/nf-core/gatk4/genotypegvcfs/tests/main.nf.test.snap @@ -0,0 +1,191 @@ +{ + "homo_sapiens - [gendb, [], [], []], fasta, fai, dict, [], []": { + "content": [ + [ + [ + { + "id": "test" + }, + "1ab95fbc5ec55b208f3001572bec54fa" + ] + ], + [ + [ + { + "id": "test" + }, + "test.vcf.gz.tbi" + ] + ], + [ + "versions.yml:md5,3c16cbf71737813609ad10d901d92ab3" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-04T14:27:24.926097884" + }, + "homo_sapiens - [gvcf, idx, [], []], fasta, fai, dict, [], []": { + "content": [ + [ + [ + { + "id": "test" + }, + "1ab95fbc5ec55b208f3001572bec54fa" + ] + ], + [ + [ + { + "id": "test" + }, + "test.vcf.gz.tbi" + ] + ], + [ + "versions.yml:md5,3c16cbf71737813609ad10d901d92ab3" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-04T14:26:24.426228557" + }, + "homo_sapiens - [gvcf_gz, tbi, bed, []], fasta, fai, dict, [], []": { + "content": [ + [ + [ + { + "id": "test" + }, + "1ab95fbc5ec55b208f3001572bec54fa" + ] + ], + [ + [ + { + "id": "test" + }, + "test.vcf.gz.tbi" + ] + ], + [ + "versions.yml:md5,3c16cbf71737813609ad10d901d92ab3" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-04T14:27:04.179308513" + }, + "homo_sapiens - [gvcf_gz, tbi, [], []], fasta, fai, dict, dbsnp, dbsnp_tbi": { + "content": [ + [ + [ + { + "id": "test" + }, + "9b7d476515e07e5486633c42abd86cc" + ] + ], + [ + [ + { + "id": "test" + }, + "test.vcf.gz.tbi" + ] + ], + [ + "versions.yml:md5,3c16cbf71737813609ad10d901d92ab3" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-04T14:26:43.9088684" + }, + "homo_sapiens - [gvcf, idx, [], []], fasta, fai, dict, [], [] - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + 
"test.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test" + }, + "test.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,3c16cbf71737813609ad10d901d92ab3" + ], + "tbi": [ + [ + { + "id": "test" + }, + "test.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "vcf": [ + [ + { + "id": "test" + }, + "test.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,3c16cbf71737813609ad10d901d92ab3" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-04T14:19:57.615552867" + }, + "homo_sapiens - [gendb, bed, [], []], fasta, fai, dict, [], []": { + "content": [ + [ + [ + { + "id": "test" + }, + "1ab95fbc5ec55b208f3001572bec54fa" + ] + ], + [ + [ + { + "id": "test" + }, + "test.vcf.gz.tbi" + ] + ], + [ + "versions.yml:md5,3c16cbf71737813609ad10d901d92ab3" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-04T14:27:46.189794941" + } +} \ No newline at end of file diff --git a/modules/nf-core/gatk4/getpileupsummaries/environment.yml b/modules/nf-core/gatk4/getpileupsummaries/environment.yml index b99a28c177..55993f440c 100644 --- a/modules/nf-core/gatk4/getpileupsummaries/environment.yml +++ b/modules/nf-core/gatk4/getpileupsummaries/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_getpileupsummaries channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/getpileupsummaries/meta.yml b/modules/nf-core/gatk4/getpileupsummaries/meta.yml index fab3c1435e..86b851e13a 100644 --- a/modules/nf-core/gatk4/getpileupsummaries/meta.yml +++ b/modules/nf-core/gatk4/getpileupsummaries/meta.yml @@ -16,68 +16,78 @@ tools: documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy 
Map containing sample information - e.g. [ id:'test' ] - - input: - type: file - description: BAM/CRAM file to be summarised. - pattern: "*.{bam,cram}" - - input_index: - type: file - description: BAM/CRAM file index. - pattern: "*.{bai,crai}" - - intervals: - type: file - description: File containing specified sites to be used for the summary. If this option is not specified, variants file is used instead automatically. - pattern: "*.interval_list" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - meta3: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fai: - type: file - description: Index of reference fasta file - pattern: "*.fasta.fai" - - meta4: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - dict: - type: file - description: GATK sequence dictionary - pattern: "*.dict" - - variants: - type: file - description: Population vcf of germline sequencing, containing allele fractions. Is also used as sites file if no separate sites file is specified. - pattern: "*.vcf.gz" - - variants_tbi: - type: file - description: Index file for the germline resource. - pattern: "*.vcf.gz.tbi" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test' ] + - input: + type: file + description: BAM/CRAM file to be summarised. + pattern: "*.{bam,cram}" + - index: + type: file + description: Index file for the input BAM/CRAM file. + pattern: "*.{bam.bai,cram.crai}" + - intervals: + type: file + description: File containing specified sites to be used for the summary. If + this option is not specified, variants file is used instead automatically. + pattern: "*.interval_list" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. 
[ id:'genome' ] + - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fai: + type: file + description: Index of reference fasta file + pattern: "*.fasta.fai" + - - meta4: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - dict: + type: file + description: GATK sequence dictionary + pattern: "*.dict" + - - variants: + type: file + description: Population vcf of germline sequencing, containing allele fractions. + Is also used as sites file if no separate sites file is specified. + pattern: "*.vcf.gz" + - - variants_tbi: + type: file + description: Index file for the germline resource. + pattern: "*.vcf.gz.tbi" output: - - pileup: - type: file - description: File containing the pileup summary table. - pattern: "*.pileups.table" + - table: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test' ] + - "*.pileups.table": + type: file + description: Table containing read counts for each site. 
+ pattern: "*.pileups.table" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@GCJMackenzie" maintainers: diff --git a/modules/nf-core/gatk4/haplotypecaller/environment.yml b/modules/nf-core/gatk4/haplotypecaller/environment.yml index d4e8d36026..55993f440c 100644 --- a/modules/nf-core/gatk4/haplotypecaller/environment.yml +++ b/modules/nf-core/gatk4/haplotypecaller/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_haplotypecaller channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/haplotypecaller/main.nf b/modules/nf-core/gatk4/haplotypecaller/main.nf index 3043ee07ab..b2aff48969 100644 --- a/modules/nf-core/gatk4/haplotypecaller/main.nf +++ b/modules/nf-core/gatk4/haplotypecaller/main.nf @@ -44,6 +44,7 @@ process GATK4_HAPLOTYPECALLER { --input $input \\ --output ${prefix}.vcf.gz \\ --reference $fasta \\ + --native-pair-hmm-threads ${task.cpus} \\ $dbsnp_command \\ $interval_command \\ $dragstr_command \\ diff --git a/modules/nf-core/gatk4/haplotypecaller/meta.yml b/modules/nf-core/gatk4/haplotypecaller/meta.yml index 703b99a098..9d4a05e914 100644 --- a/modules/nf-core/gatk4/haplotypecaller/meta.yml +++ b/modules/nf-core/gatk4/haplotypecaller/meta.yml @@ -14,92 +14,108 @@ tools: documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM file from alignment - pattern: "*.{bam,cram}" - - input_index: - type: file - description: BAI/CRAI file from alignment - pattern: "*.{bai,crai}" - - intervals: - type: file - description: Bed file with the genomic regions included in the library (optional) - - dragstr_model: - type: file - description: Text file containing the DragSTR model of the used BAM/CRAM file (optional) - pattern: "*.txt" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'test_reference' ] - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - meta3: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'test_reference' ] - - fai: - type: file - description: Index of reference fasta file - pattern: "fasta.fai" - - meta4: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'test_reference' ] - - dict: - type: file - description: GATK sequence dictionary - pattern: "*.dict" - - meta5: - type: map - description: | - Groovy Map containing dbsnp information - e.g. [ id:'test_dbsnp' ] - - dbsnp: - type: file - description: VCF file containing known sites (optional) - - meta6: - type: map - description: | - Groovy Map containing dbsnp information - e.g. [ id:'test_dbsnp' ] - - dbsnp_tbi: - type: file - description: VCF index of dbsnp (optional) + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM file from alignment + pattern: "*.{bam,cram}" + - input_index: + type: file + description: BAI/CRAI file from alignment + pattern: "*.{bai,crai}" + - intervals: + type: file + description: Bed file with the genomic regions included in the library (optional) + - dragstr_model: + type: file + description: Text file containing the DragSTR model of the used BAM/CRAM file + (optional) + pattern: "*.txt" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test_reference' ] + - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test_reference' ] + - fai: + type: file + description: Index of reference fasta file + pattern: "fasta.fai" + - - meta4: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test_reference' ] + - dict: + type: file + description: GATK sequence dictionary + pattern: "*.dict" + - - meta5: + type: map + description: | + Groovy Map containing dbsnp information + e.g. [ id:'test_dbsnp' ] + - dbsnp: + type: file + description: VCF file containing known sites (optional) + - - meta6: + type: map + description: | + Groovy Map containing dbsnp information + e.g. [ id:'test_dbsnp' ] + - dbsnp_tbi: + type: file + description: VCF index of dbsnp (optional) output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - vcf: - type: file - description: Compressed VCF file - pattern: "*.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.vcf.gz": + type: file + description: Compressed VCF file + pattern: "*.vcf.gz" - tbi: - type: file - description: Index of VCF file - pattern: "*.vcf.gz.tbi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tbi": + type: file + description: Index of VCF file + pattern: "*.vcf.gz.tbi" - bam: - type: file - description: Assembled haplotypes and locally realigned reads - pattern: "*.realigned.bam" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.realigned.bam": + type: file + description: Assembled haplotypes and locally realigned reads + pattern: "*.realigned.bam" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@suzannejin" - "@FriederikeHanssen" diff --git a/modules/nf-core/gatk4/haplotypecaller/tests/main.nf.test b/modules/nf-core/gatk4/haplotypecaller/tests/main.nf.test index a124bff530..18d35f498c 100644 --- a/modules/nf-core/gatk4/haplotypecaller/tests/main.nf.test +++ b/modules/nf-core/gatk4/haplotypecaller/tests/main.nf.test @@ -17,14 +17,14 @@ nextflow_process { """ input[0] = [ [ id:'test_bam' ], // meta map - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam_bai'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), [], [] ] - input[1] = [ [ id:'test_fa' ], file(params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true) ] - input[2] = [ [ id:'test_fai' ], 
file(params.test_data['homo_sapiens']['genome']['genome_fasta_fai'], checkIfExists: true) ] - input[3] = [ [ id:'test_dict' ], file(params.test_data['homo_sapiens']['genome']['genome_dict'], checkIfExists: true) ] + input[1] = [ [ id:'test_fa' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) ] + input[2] = [ [ id:'test_fai' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true) ] + input[3] = [ [ id:'test_dict' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.dict', checkIfExists: true) ] input[4] = [ [], [] ] input[5] = [ [], [] ] @@ -35,9 +35,12 @@ nextflow_process { then { assertAll( { assert process.success }, - // { assert snapshot(process.out).match() }, // Unstable hashes - { assert snapshot(file(process.out.vcf.get(0).get(1)).name).match("gatk_hc_vcf_bam_input") }, - { assert snapshot(file(process.out.tbi.get(0).get(1)).name).match("gatk_hc_vcf_tbi_bam_input") }, + { assert snapshot( + file(process.out.vcf[0][1]).name, + file(process.out.tbi[0][1]).name, + process.out.versions + ).match() + } ) } @@ -50,14 +53,14 @@ nextflow_process { """ input[0] = [ [ id:'test_cram' ], // meta map - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_cram'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_cram_crai'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), [], [] ] - input[1] = [ [ id:'test_fa' ], file(params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true) ] - input[2] = [ [ id:'test_fai' ], file(params.test_data['homo_sapiens']['genome']['genome_fasta_fai'], checkIfExists: true) 
] - input[3] = [ [ id:'test_dict' ], file(params.test_data['homo_sapiens']['genome']['genome_dict'], checkIfExists: true) ] + input[1] = [ [ id:'test_fa' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) ] + input[2] = [ [ id:'test_fai' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true) ] + input[3] = [ [ id:'test_dict' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.dict', checkIfExists: true) ] input[4] = [ [], [] ] input[5] = [ [], [] ] """ @@ -67,9 +70,12 @@ nextflow_process { then { assertAll( { assert process.success }, - // { assert snapshot(process.out).match() }, // Unstable hashes - { assert snapshot(file(process.out.vcf.get(0).get(1)).name).match("gatk_hc_vcf_cram_input") }, - { assert snapshot(file(process.out.tbi.get(0).get(1)).name).match("gatk_hc_vcf_tbi_cram_input") }, + { assert snapshot( + file(process.out.vcf[0][1]).name, + file(process.out.tbi[0][1]).name, + process.out.versions + ).match() + } ) } @@ -82,16 +88,16 @@ nextflow_process { """ input[0] = [ [ id:'test_cram_sites' ], // meta map - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_cram'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_cram_crai'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), [], [] ] - input[1] = [ [ id:'test_fa' ], file(params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true) ] - input[2] = [ [ id:'test_fai' ], file(params.test_data['homo_sapiens']['genome']['genome_fasta_fai'], checkIfExists: true) ] - input[3] = [ [ id:'test_dict' ], 
file(params.test_data['homo_sapiens']['genome']['genome_dict'], checkIfExists: true) ] - input[4] = [ [ id:'test_sites' ], file(params.test_data['homo_sapiens']['genome']['dbsnp_146_hg38_vcf_gz'], checkIfExists: true) ] - input[5] = [ [ id:'test_sites_tbi' ], file(params.test_data['homo_sapiens']['genome']['dbsnp_146_hg38_vcf_gz_tbi'], checkIfExists: true) ] + input[1] = [ [ id:'test_fa' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) ] + input[2] = [ [ id:'test_fai' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true) ] + input[3] = [ [ id:'test_dict' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.dict', checkIfExists: true) ] + input[4] = [ [ id:'test_sites' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.vcf.gz', checkIfExists: true) ] + input[5] = [ [ id:'test_sites_tbi' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.vcf.gz.tbi', checkIfExists: true) ] """ } } @@ -99,9 +105,12 @@ nextflow_process { then { assertAll( { assert process.success }, - // { assert snapshot(process.out).match() }, // Unstable hashes - { assert snapshot(file(process.out.vcf.get(0).get(1)).name).match("gatk_hc_vcf_cram_input_with_sites") }, - { assert snapshot(file(process.out.tbi.get(0).get(1)).name).match("gatk_hc_vcf_tbi_cram_input_with_sites") }, + { assert snapshot( + file(process.out.vcf[0][1]).name, + file(process.out.tbi[0][1]).name, + process.out.versions + ).match() + } ) } @@ -114,16 +123,16 @@ nextflow_process { """ input[0] = [ [ id:'test_cram_sites_dragstr' ], // meta map - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_cram'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_cram_crai'], checkIfExists: true), + file(params.modules_testdata_base_path + 
'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), [], - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_dragstrmodel'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/test_paired_end_sorted_dragstrmodel.txt', checkIfExists: true) ] - input[1] = [ [ id:'test_fa' ], file(params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true) ] - input[2] = [ [ id:'test_fai' ], file(params.test_data['homo_sapiens']['genome']['genome_fasta_fai'], checkIfExists: true) ] - input[3] = [ [ id:'test_dict' ], file(params.test_data['homo_sapiens']['genome']['genome_dict'], checkIfExists: true) ] - input[4] = [ [ id:'test_sites' ], file(params.test_data['homo_sapiens']['genome']['dbsnp_146_hg38_vcf_gz'], checkIfExists: true) ] - input[5] = [ [ id:'test_sites_tbi' ], file(params.test_data['homo_sapiens']['genome']['dbsnp_146_hg38_vcf_gz_tbi'], checkIfExists: true) ] + input[1] = [ [ id:'test_fa' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) ] + input[2] = [ [ id:'test_fai' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true) ] + input[3] = [ [ id:'test_dict' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.dict', checkIfExists: true) ] + input[4] = [ [ id:'test_sites' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.vcf.gz', checkIfExists: true) ] + input[5] = [ [ id:'test_sites_tbi' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.vcf.gz.tbi', checkIfExists: true) ] """ } } @@ -131,9 +140,12 @@ nextflow_process { then { assertAll( { assert process.success }, - // { assert 
snapshot(process.out).match() }, // Unstable hashes - { assert snapshot(file(process.out.vcf.get(0).get(1)).name).match("gatk_hc_vcf_cram_dragstr_input_with_sites") }, - { assert snapshot(file(process.out.tbi.get(0).get(1)).name).match("gatk_hc_vcf_tbi_cram_dragstr_input_with_sites") }, + { assert snapshot( + file(process.out.vcf[0][1]).name, + file(process.out.tbi[0][1]).name, + process.out.versions + ).match() + } ) } diff --git a/modules/nf-core/gatk4/haplotypecaller/tests/main.nf.test.snap b/modules/nf-core/gatk4/haplotypecaller/tests/main.nf.test.snap index 375025ee3c..0203fcfcf4 100644 --- a/modules/nf-core/gatk4/haplotypecaller/tests/main.nf.test.snap +++ b/modules/nf-core/gatk4/haplotypecaller/tests/main.nf.test.snap @@ -1,82 +1,58 @@ { - "gatk_hc_vcf_cram_dragstr_input_with_sites": { + "homo_sapiens - [cram, crai] - fasta - fai - dict": { "content": [ - "test_cram_sites_dragstr.vcf.gz" + "test_cram.vcf.gz", + "test_cram.vcf.gz.tbi", + [ + "versions.yml:md5,05431a0ab28c85412c8b3582e863a7ab" + ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.4" }, - "timestamp": "2024-02-20T13:24:45.142682" + "timestamp": "2024-08-14T09:36:54.158605" }, - "gatk_hc_vcf_bam_input": { + "homo_sapiens - [cram, crai] - fasta - fai - dict - sites - sites_tbi": { "content": [ - "test_bam.vcf.gz" + "test_cram_sites.vcf.gz", + "test_cram_sites.vcf.gz.tbi", + [ + "versions.yml:md5,05431a0ab28c85412c8b3582e863a7ab" + ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.4" }, - "timestamp": "2024-02-20T13:23:19.203837" + "timestamp": "2024-08-14T09:37:13.77024" }, - "gatk_hc_vcf_cram_input": { + "homo_sapiens - [bam, bai] - fasta - fai - dict": { "content": [ - "test_cram.vcf.gz" + "test_bam.vcf.gz", + "test_bam.vcf.gz.tbi", + [ + "versions.yml:md5,05431a0ab28c85412c8b3582e863a7ab" + ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.4" }, - "timestamp": "2024-02-20T13:23:48.434615" + "timestamp": 
"2024-08-14T09:36:34.77631" }, - "gatk_hc_vcf_cram_input_with_sites": { + "homo_sapiens - [cram, crai, dragstr_model] - fasta - fai - dict - sites - sites_tbi": { "content": [ - "test_cram_sites.vcf.gz" + "test_cram_sites_dragstr.vcf.gz", + "test_cram_sites_dragstr.vcf.gz.tbi", + [ + "versions.yml:md5,05431a0ab28c85412c8b3582e863a7ab" + ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.4" }, - "timestamp": "2024-02-20T13:24:17.147745" - }, - "gatk_hc_vcf_tbi_bam_input": { - "content": [ - "test_bam.vcf.gz.tbi" - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-02-20T13:23:19.23048" - }, - "gatk_hc_vcf_tbi_cram_input": { - "content": [ - "test_cram.vcf.gz.tbi" - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-02-20T13:23:48.45958" - }, - "gatk_hc_vcf_tbi_cram_dragstr_input_with_sites": { - "content": [ - "test_cram_sites_dragstr.vcf.gz.tbi" - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-02-20T13:24:45.154818" - }, - "gatk_hc_vcf_tbi_cram_input_with_sites": { - "content": [ - "test_cram_sites.vcf.gz.tbi" - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" - }, - "timestamp": "2024-02-20T13:24:17.158138" + "timestamp": "2024-08-14T09:37:32.967085" } } \ No newline at end of file diff --git a/modules/nf-core/gatk4/intervallisttobed/environment.yml b/modules/nf-core/gatk4/intervallisttobed/environment.yml index d4d2eba24c..55993f440c 100644 --- a/modules/nf-core/gatk4/intervallisttobed/environment.yml +++ b/modules/nf-core/gatk4/intervallisttobed/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_intervallisttobed channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/intervallisttobed/gatk4-intervallisttobed.diff b/modules/nf-core/gatk4/intervallisttobed/gatk4-intervallisttobed.diff new file mode 100644 index 0000000000..03086949c6 --- /dev/null +++ 
b/modules/nf-core/gatk4/intervallisttobed/gatk4-intervallisttobed.diff @@ -0,0 +1,27 @@ +Changes in module 'nf-core/gatk4/intervallisttobed' +'modules/nf-core/gatk4/intervallisttobed/environment.yml' is unchanged +'modules/nf-core/gatk4/intervallisttobed/meta.yml' is unchanged +Changes in 'gatk4/intervallisttobed/main.nf': +--- modules/nf-core/gatk4/intervallisttobed/main.nf ++++ modules/nf-core/gatk4/intervallisttobed/main.nf +@@ -40,4 +40,18 @@ + gatk4: \$(echo \$(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*\$//') + END_VERSIONS + """ ++ ++ stub: ++ def prefix = task.ext.prefix ?: "${meta.id}.cram" ++ def metrics = task.ext.metrics ?: "${prefix}.metrics" ++ def prefix_basename = prefix.substring(0, prefix.lastIndexOf(".")) ++ ++ """ ++ touch ${prefix}.bed ++ ++ cat <<-END_VERSIONS > versions.yml ++ "${task.process}": ++ gawk: \$(awk -Wversion | sed '1!d; s/.*Awk //; s/,.*//') ++ END_VERSIONS ++ """ + } + +************************************************************ diff --git a/modules/nf-core/gatk4/intervallisttobed/main.nf b/modules/nf-core/gatk4/intervallisttobed/main.nf index 2f6893c0ca..743bb3413a 100644 --- a/modules/nf-core/gatk4/intervallisttobed/main.nf +++ b/modules/nf-core/gatk4/intervallisttobed/main.nf @@ -40,4 +40,18 @@ process GATK4_INTERVALLISTTOBED { gatk4: \$(echo \$(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*\$//') END_VERSIONS """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}.cram" + def metrics = task.ext.metrics ?: "${prefix}.metrics" + def prefix_basename = prefix.substring(0, prefix.lastIndexOf(".")) + + """ + touch ${prefix}.bed + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + gawk: \$(awk -Wversion | sed '1!d; s/.*Awk //; s/,.*//') + END_VERSIONS + """ } diff --git a/modules/nf-core/gatk4/intervallisttobed/meta.yml b/modules/nf-core/gatk4/intervallisttobed/meta.yml index 28d264dfef..0779fa1822 100644 --- a/modules/nf-core/gatk4/intervallisttobed/meta.yml +++ 
b/modules/nf-core/gatk4/intervallisttobed/meta.yml @@ -13,30 +13,32 @@ tools: tool_dev_url: https://github.com/broadinstitute/gatk doi: "10.1158/1538-7445.AM2017-3590" licence: ["BSD-3-clause"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - interval: - type: file - description: Interval list - pattern: "*.{interval,interval_list}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - intervals: + type: file + description: IntervalList file output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - bed: - type: file - description: BED file - pattern: "*.bed" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bed": + type: file + description: BED file + pattern: "*.bed" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" maintainers: diff --git a/modules/nf-core/gatk4/learnreadorientationmodel/environment.yml b/modules/nf-core/gatk4/learnreadorientationmodel/environment.yml index a4c2a764dd..55993f440c 100644 --- a/modules/nf-core/gatk4/learnreadorientationmodel/environment.yml +++ b/modules/nf-core/gatk4/learnreadorientationmodel/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_learnreadorientationmodel channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/learnreadorientationmodel/meta.yml b/modules/nf-core/gatk4/learnreadorientationmodel/meta.yml index 4b73a51adb..fde7829c8d 100644 --- a/modules/nf-core/gatk4/learnreadorientationmodel/meta.yml +++ 
b/modules/nf-core/gatk4/learnreadorientationmodel/meta.yml @@ -16,25 +16,32 @@ tools: documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test' ] - - f1r2: - type: list - description: list of f1r2 files to be used as input. - pattern: "*.f1r2.tar.gz" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test' ] + - f1r2: + type: list + description: list of f1r2 files to be used as input. + pattern: "*.f1r2.tar.gz" output: - artifactprior: - type: file - description: file containing artifact-priors to be used by filtermutectcalls - pattern: "*.tar.gz" + - meta: + type: file + description: file containing artifact-priors to be used by filtermutectcalls + pattern: "*.tar.gz" + - "*.tar.gz": + type: file + description: file containing artifact-priors to be used by filtermutectcalls + pattern: "*.tar.gz" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@GCJMackenzie" maintainers: diff --git a/modules/nf-core/gatk4/learnreadorientationmodel/tests/main.nf.test b/modules/nf-core/gatk4/learnreadorientationmodel/tests/main.nf.test new file mode 100644 index 0000000000..bffe02e4f6 --- /dev/null +++ b/modules/nf-core/gatk4/learnreadorientationmodel/tests/main.nf.test @@ -0,0 +1,38 @@ + +nextflow_process { + + name "Test Process GATK4_LEARNREADORIENTATIONMODEL" + script "../main.nf" + process "GATK4_LEARNREADORIENTATIONMODEL" + config "./nextflow.config" + + tag "modules" + tag "modules_nfcore" + tag "gatk4" + tag "gatk4/learnreadorientationmodel" + + test("test-gatk4-learnreadorientationmodel") { + + when { + process { + """ + input[0] = [ [ id:'test' ], // meta map 
+ [file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/paired_mutect2_calls/test_test2_paired_mutect2_calls.f1r2.tar.gz', checkIfExists: true)] ] + + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + path(process.out.artifactprior[0][1]).linesGzip[3..7], + process.out.versions + ).match() + } + ) + } + } + +} diff --git a/modules/nf-core/gatk4/learnreadorientationmodel/tests/main.nf.test.snap b/modules/nf-core/gatk4/learnreadorientationmodel/tests/main.nf.test.snap new file mode 100644 index 0000000000..b829bd9c4a --- /dev/null +++ b/modules/nf-core/gatk4/learnreadorientationmodel/tests/main.nf.test.snap @@ -0,0 +1,21 @@ +{ + "test-gatk4-learnreadorientationmodel": { + "content": [ + [ + "CTT\tAAG\t2.7114986684474486E-6\t3.2076972826656866E-5\t2.6085822355549755E-6\t0.0\t2.6371799896540086E-6\t3.3869355267901446E-6\t2.6085822355549755E-6\t0.0\t0.9995881552107633\t4.6590850211691583E-5\t2.8848017683240004E-4\t3.0744010710100574E-5\t38334\t116", + "GTT\tAAC\t0.1\t0.1\t0.1\t0.0\t0.1\t0.1\t0.1\t0.0\t0.1\t0.1\t0.1\t0.1\t0\t0", + "TAT\tATA\t0.0\t5.548307163536064E-6\t5.144357865084592E-6\t6.205892051757818E-6\t0.0\t5.907388162200423E-6\t5.176730417709638E-6\t6.083872804985981E-6\t0.9924019419831304\t3.946972069386949E-5\t0.007516612822150651\t7.908925559851714E-6\t19439\t95", + "AAA\tTTT\t0.0\t1.7634470563520664E-6\t2.8327478284981175E-6\t1.8084237600021914E-6\t0.0\t1.7692606885284446E-6\t2.263339968296726E-6\t1.8660094002474611E-6\t0.9990845693211764\t1.8004690536795885E-5\t8.701192700921183E-4\t1.5003489492700572E-5\t56708\t130", + "CAA\tTTG\t0.0\t3.4445551925564533E-6\t3.435193155024585E-6\t5.139879646597498E-6\t0.0\t3.4674461103560476E-6\t3.428570449688764E-6\t6.343168047383713E-6\t0.9945238954147358\t1.3629167993931722E-4\t0.0052793454402581125\t3.5208652465150646E-5\t29263\t197" + ], + [ + "versions.yml:md5,88928ff140a0967e574e66944fd2a2f2" + ] + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": 
"24.04.4" + }, + "timestamp": "2024-08-26T12:16:08.296564" + } +} \ No newline at end of file diff --git a/modules/nf-core/gatk4/learnreadorientationmodel/tests/nextflow.config b/modules/nf-core/gatk4/learnreadorientationmodel/tests/nextflow.config new file mode 100644 index 0000000000..79e4f67df3 --- /dev/null +++ b/modules/nf-core/gatk4/learnreadorientationmodel/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: GATK4_LEARNREADORIENTATIONMODEL { + ext.prefix = { "${meta.id}.artifact-prior" } + } +} diff --git a/modules/nf-core/gatk4/markduplicates/environment.yml b/modules/nf-core/gatk4/markduplicates/environment.yml index 7362eea6f2..3c73c17e43 100644 --- a/modules/nf-core/gatk4/markduplicates/environment.yml +++ b/modules/nf-core/gatk4/markduplicates/environment.yml @@ -1,9 +1,8 @@ -name: gatk4_markduplicates channels: - conda-forge - bioconda - - defaults + dependencies: - bioconda::gatk4=4.5.0.0 - - bioconda::samtools=1.19.2 - bioconda::htslib=1.19.1 + - bioconda::samtools=1.19.2 diff --git a/modules/nf-core/gatk4/markduplicates/meta.yml b/modules/nf-core/gatk4/markduplicates/meta.yml index b0f09d4b84..4772c5f39a 100644 --- a/modules/nf-core/gatk4/markduplicates/meta.yml +++ b/modules/nf-core/gatk4/markduplicates/meta.yml @@ -1,5 +1,6 @@ name: gatk4_markduplicates -description: This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA. +description: This tool locates and tags duplicate reads in a BAM or SAM file, where + duplicate reads are defined as originating from a single fragment of DNA. keywords: - bam - gatk4 @@ -7,60 +8,90 @@ keywords: - sort tools: - gatk4: - description: Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size. 
+ description: Developed in the Data Sciences Platform at the Broad Institute, the + toolkit offers a wide variety of tools with a primary focus on variant discovery + and genotyping. Its powerful processing engine and high-performance computing + features make it capable of taking on projects of any size. homepage: https://gatk.broadinstitute.org/hc/en-us documentation: https://gatk.broadinstitute.org/hc/en-us/articles/360037052812-MarkDuplicates-Picard- tool_dev_url: https://github.com/broadinstitute/gatk doi: 10.1158/1538-7445.AM2017-3590 licence: ["MIT"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: Sorted BAM file - pattern: "*.{bam}" - - fasta: - type: file - description: Fasta file - pattern: "*.{fasta}" - - fasta_fai: - type: file - description: Fasta index file - pattern: "*.{fai}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bam: + type: file + description: Sorted BAM file + pattern: "*.{bam}" + - - fasta: + type: file + description: Fasta file + pattern: "*.{fasta}" + - - fasta_fai: + type: file + description: Fasta index file + pattern: "*.{fai}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - - bam: - type: file - description: Marked duplicates BAM file - pattern: "*.{bam}" - cram: - type: file - description: Marked duplicates CRAM file - pattern: "*.{cram}" - - bai: - type: file - description: BAM index file - pattern: "*.{bam.bai}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*cram": + type: file + description: Marked duplicates CRAM file + pattern: "*.{cram}" + - bam: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*bam": + type: file + description: Marked duplicates BAM file + pattern: "*.{bam}" - crai: - type: file - description: CRAM index file - pattern: "*.{cram.crai}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.crai": + type: file + description: CRAM index file + pattern: "*.{cram.crai}" + - bai: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bai": + type: file + description: BAM index file + pattern: "*.{bam.bai}" - metrics: - type: file - description: Duplicate metrics file generated by GATK - pattern: "*.{metrics.txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.metrics": + type: file + description: Duplicate metrics file generated by GATK + pattern: "*.{metrics.txt}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@ajodeh-juma" - "@FriederikeHanssen" diff --git a/modules/nf-core/gatk4/mergemutectstats/environment.yml b/modules/nf-core/gatk4/mergemutectstats/environment.yml index 756d408301..55993f440c 100644 --- a/modules/nf-core/gatk4/mergemutectstats/environment.yml +++ b/modules/nf-core/gatk4/mergemutectstats/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_mergemutectstats channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/mergemutectstats/meta.yml b/modules/nf-core/gatk4/mergemutectstats/meta.yml index 1269525657..09c8a54720 100644 --- a/modules/nf-core/gatk4/mergemutectstats/meta.yml +++ b/modules/nf-core/gatk4/mergemutectstats/meta.yml @@ -13,30 +13,33 @@ tools: tool_dev_url: https://github.com/broadinstitute/gatk doi: "10.1158/1538-7445.AM2017-3590" licence: ["BSD-3-clause"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - stats: - type: file - description: Stats file - pattern: "*.{stats}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - stats: + type: file + description: Stats file + pattern: "*.{stats}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - stats: - type: file - description: Stats file - pattern: "*.vcf.gz.stats" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.vcf.gz.stats": + type: file + description: Stats file + pattern: "*.vcf.gz.stats" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" maintainers: diff --git a/modules/nf-core/gatk4/mergevcfs/environment.yml b/modules/nf-core/gatk4/mergevcfs/environment.yml index efd9faa222..55993f440c 100644 --- a/modules/nf-core/gatk4/mergevcfs/environment.yml +++ b/modules/nf-core/gatk4/mergevcfs/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_mergevcfs channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/mergevcfs/meta.yml b/modules/nf-core/gatk4/mergevcfs/meta.yml index 996053fcc6..b4f61d780d 100644 --- a/modules/nf-core/gatk4/mergevcfs/meta.yml +++ b/modules/nf-core/gatk4/mergevcfs/meta.yml @@ -14,38 +14,50 @@ tools: documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test'] - - vcf: - type: list - description: Two or more VCF files - pattern: "*.{vcf,vcf.gz}" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome'] - - dict: - type: file - description: Optional Sequence Dictionary as input - pattern: "*.dict" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - vcf: + type: list + description: Two or more VCF files + pattern: "*.{vcf,vcf.gz}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. 
[ id:'genome'] + - dict: + type: file + description: Optional Sequence Dictionary as input + pattern: "*.dict" output: - vcf: - type: file - description: merged vcf file - pattern: "*.vcf.gz" + - meta: + type: file + description: merged vcf file + pattern: "*.vcf.gz" + - "*.vcf.gz": + type: file + description: merged vcf file + pattern: "*.vcf.gz" - tbi: - type: file - description: index files for the merged vcf files - pattern: "*.tbi" + - meta: + type: file + description: index files for the merged vcf files + pattern: "*.tbi" + - "*.tbi": + type: file + description: index files for the merged vcf files + pattern: "*.tbi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@kevinmenden" maintainers: diff --git a/modules/nf-core/gatk4/mutect2/environment.yml b/modules/nf-core/gatk4/mutect2/environment.yml index 86f4bfae98..55993f440c 100644 --- a/modules/nf-core/gatk4/mutect2/environment.yml +++ b/modules/nf-core/gatk4/mutect2/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_mutect2 channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/mutect2/meta.yml b/modules/nf-core/gatk4/mutect2/meta.yml index 21c928ed96..27fd63a243 100644 --- a/modules/nf-core/gatk4/mutect2/meta.yml +++ b/modules/nf-core/gatk4/mutect2/meta.yml @@ -17,88 +17,113 @@ tools: documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test'] - - input: - type: list - description: list of BAM files, also able to take CRAM as an input - pattern: "*.{bam/cram}" - - input_index: - type: list - description: list of BAM file indexes, also able to take CRAM indexes as an input - pattern: "*.{bam.bai/cram.crai}" - - intervals: - type: file - description: Specify region the tools is run on. - pattern: ".{bed,interval_list}" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - meta3: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fai: - type: file - description: Index of reference fasta file - pattern: "*.fasta.fai" - - meta4: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - dict: - type: file - description: GATK sequence dictionary - pattern: "*.dict" - - germline_resource: - type: file - description: Population vcf of germline sequencing, containing allele fractions. - pattern: "*.vcf.gz" - - germline_resource_tbi: - type: file - description: Index file for the germline resource. - pattern: "*.vcf.gz.tbi" - - panel_of_normals: - type: file - description: vcf file to be used as a panel of normals. - pattern: "*.vcf.gz" - - panel_of_normals_tbi: - type: file - description: Index for the panel of normals. - pattern: "*.vcf.gz.tbi" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - input: + type: list + description: list of BAM files, also able to take CRAM as an input + pattern: "*.{bam/cram}" + - input_index: + type: list + description: list of BAM file indexes, also able to take CRAM indexes as an + input + pattern: "*.{bam.bai/cram.crai}" + - intervals: + type: file + description: Specify region the tools is run on. 
+ pattern: ".{bed,interval_list}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fai: + type: file + description: Index of reference fasta file + pattern: "*.fasta.fai" + - - meta4: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - dict: + type: file + description: GATK sequence dictionary + pattern: "*.dict" + - - germline_resource: + type: file + description: Population vcf of germline sequencing, containing allele fractions. + pattern: "*.vcf.gz" + - - germline_resource_tbi: + type: file + description: Index file for the germline resource. + pattern: "*.vcf.gz.tbi" + - - panel_of_normals: + type: file + description: vcf file to be used as a panel of normals. + pattern: "*.vcf.gz" + - - panel_of_normals_tbi: + type: file + description: Index for the panel of normals. 
+ pattern: "*.vcf.gz.tbi" output: - vcf: - type: file - description: compressed vcf file - pattern: "*.vcf.gz" + - meta: + type: file + description: compressed vcf file + pattern: "*.vcf.gz" + - "*.vcf.gz": + type: file + description: compressed vcf file + pattern: "*.vcf.gz" - tbi: - type: file - description: Index of vcf file - pattern: "*vcf.gz.tbi" + - meta: + type: file + description: Index of vcf file + pattern: "*vcf.gz.tbi" + - "*.tbi": + type: file + description: Index of vcf file + pattern: "*vcf.gz.tbi" - stats: - type: file - description: Stats file that pairs with output vcf file - pattern: "*vcf.gz.stats" + - meta: + type: file + description: Stats file that pairs with output vcf file + pattern: "*vcf.gz.stats" + - "*.stats": + type: file + description: Stats file that pairs with output vcf file + pattern: "*vcf.gz.stats" - f1r2: - type: file - description: file containing information to be passed to LearnReadOrientationModel (only outputted when tumor_normal_pair mode is run) - pattern: "*.f1r2.tar.gz" + - meta: + type: file + description: file containing information to be passed to LearnReadOrientationModel + (only outputted when tumor_normal_pair mode is run) + pattern: "*.f1r2.tar.gz" + - "*.f1r2.tar.gz": + type: file + description: file containing information to be passed to LearnReadOrientationModel + (only outputted when tumor_normal_pair mode is run) + pattern: "*.f1r2.tar.gz" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@GCJMackenzie" - "@ramprasadn" diff --git a/modules/nf-core/gatk4/mutect2/tests/main.nf.test b/modules/nf-core/gatk4/mutect2/tests/main.nf.test index d247ee3571..aea8d22694 100644 --- a/modules/nf-core/gatk4/mutect2/tests/main.nf.test +++ b/modules/nf-core/gatk4/mutect2/tests/main.nf.test @@ -21,31 +21,31 @@ nextflow_process { tumor_id:'tumour' ], [ - 
file(params.test_data['homo_sapiens']['illumina']['test_paired_end_recalibrated_sorted_bam'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_recalibrated_sorted_bam'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.recalibrated.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam', checkIfExists: true) ], [ - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_recalibrated_sorted_bam_bai'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_recalibrated_sorted_bam_bai'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.recalibrated.sorted.bam.bai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam.bai', checkIfExists: true) ], [] ] input[1] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_fasta'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) ] input[2] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_fasta_fai'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai', checkIfExists: true) ] input[3] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_dict'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.dict', checkIfExists: true) ] - input[4] = file(params.test_data['homo_sapiens']['genome']['gnomad_r2_1_1_21_vcf_gz'], checkIfExists: true) - input[5] = 
file(params.test_data['homo_sapiens']['genome']['gnomad_r2_1_1_21_vcf_gz_tbi'], checkIfExists: true) - input[6] = file(params.test_data['homo_sapiens']['genome']['mills_and_1000g_indels_21_vcf_gz'], checkIfExists: true) - input[7] = file(params.test_data['homo_sapiens']['genome']['mills_and_1000g_indels_21_vcf_gz_tbi'], checkIfExists: true) + input[4] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/gnomAD.r2.1.1.vcf.gz', checkIfExists: true) + input[5] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/gnomAD.r2.1.1.vcf.gz.tbi', checkIfExists: true) + input[6] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/mills_and_1000G.indels.hg38.vcf.gz', checkIfExists: true) + input[7] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/mills_and_1000G.indels.hg38.vcf.gz.tbi', checkIfExists: true) """ } } @@ -78,31 +78,31 @@ nextflow_process { tumor_id:'tumour' ], [ - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_recalibrated_sorted_bam'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_recalibrated_sorted_bam'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.recalibrated.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam', checkIfExists: true) ], [ - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_recalibrated_sorted_bam_bai'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_recalibrated_sorted_bam_bai'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.recalibrated.sorted.bam.bai', checkIfExists: true), + 
file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam.bai', checkIfExists: true) ], [] ] input[1] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_fasta'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) ] input[2] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_fasta_fai'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai', checkIfExists: true) ] input[3] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_dict'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.dict', checkIfExists: true) ] - input[4] = file(params.test_data['homo_sapiens']['genome']['gnomad_r2_1_1_21_vcf_gz'], checkIfExists: true) - input[5] = file(params.test_data['homo_sapiens']['genome']['gnomad_r2_1_1_21_vcf_gz_tbi'], checkIfExists: true) - input[6] = file(params.test_data['homo_sapiens']['genome']['mills_and_1000g_indels_21_vcf_gz'], checkIfExists: true) - input[7] = file(params.test_data['homo_sapiens']['genome']['mills_and_1000g_indels_21_vcf_gz_tbi'], checkIfExists: true) + input[4] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/gnomAD.r2.1.1.vcf.gz', checkIfExists: true) + input[5] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/gnomAD.r2.1.1.vcf.gz.tbi', checkIfExists: true) + input[6] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/mills_and_1000G.indels.hg38.vcf.gz', checkIfExists: true) + input[7] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/mills_and_1000G.indels.hg38.vcf.gz.tbi', 
checkIfExists: true) """ } } @@ -127,26 +127,26 @@ nextflow_process { """ input[0] = [ [ id:'test'], - [ file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_recalibrated_sorted_bam'], checkIfExists: true)], - [ file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_recalibrated_sorted_bam_bai'], checkIfExists: true)], + [ file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam', checkIfExists: true)], + [ file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam.bai', checkIfExists: true)], [] ] input[1] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_fasta'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) ] input[2] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_fasta_fai'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai', checkIfExists: true) ] input[3] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_dict'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.dict', checkIfExists: true) ] - input[4] = file(params.test_data['homo_sapiens']['genome']['gnomad_r2_1_1_21_vcf_gz'], checkIfExists: true) - input[5] = file(params.test_data['homo_sapiens']['genome']['gnomad_r2_1_1_21_vcf_gz_tbi'], checkIfExists: true) - input[6] = file(params.test_data['homo_sapiens']['genome']['mills_and_1000g_indels_21_vcf_gz'], checkIfExists: true) - input[7] = file(params.test_data['homo_sapiens']['genome']['mills_and_1000g_indels_21_vcf_gz_tbi'], checkIfExists: true) + input[4] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/gnomAD.r2.1.1.vcf.gz', 
checkIfExists: true) + input[5] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/gnomAD.r2.1.1.vcf.gz.tbi', checkIfExists: true) + input[6] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/mills_and_1000G.indels.hg38.vcf.gz', checkIfExists: true) + input[7] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/mills_and_1000G.indels.hg38.vcf.gz.tbi', checkIfExists: true) """ } } @@ -171,26 +171,26 @@ nextflow_process { """ input[0] = [ [ id:'test'], - [ file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_recalibrated_sorted_cram'], checkIfExists: true)], - [ file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_recalibrated_sorted_cram_crai'], checkIfExists: true)], + [ file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test2.paired_end.recalibrated.sorted.cram', checkIfExists: true)], + [ file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test2.paired_end.recalibrated.sorted.cram.crai', checkIfExists: true)], [] ] input[1] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_fasta'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) ] input[2] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_fasta_fai'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai', checkIfExists: true) ] input[3] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_dict'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.dict', checkIfExists: true) ] - input[4] = file(params.test_data['homo_sapiens']['genome']['gnomad_r2_1_1_21_vcf_gz'], 
checkIfExists: true) - input[5] = file(params.test_data['homo_sapiens']['genome']['gnomad_r2_1_1_21_vcf_gz_tbi'], checkIfExists: true) - input[6] = file(params.test_data['homo_sapiens']['genome']['mills_and_1000g_indels_21_vcf_gz'], checkIfExists: true) - input[7] = file(params.test_data['homo_sapiens']['genome']['mills_and_1000g_indels_21_vcf_gz_tbi'], checkIfExists: true) + input[4] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/gnomAD.r2.1.1.vcf.gz', checkIfExists: true) + input[5] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/gnomAD.r2.1.1.vcf.gz.tbi', checkIfExists: true) + input[6] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/mills_and_1000G.indels.hg38.vcf.gz', checkIfExists: true) + input[7] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/mills_and_1000G.indels.hg38.vcf.gz.tbi', checkIfExists: true) """ } } @@ -216,21 +216,21 @@ nextflow_process { """ input[0] = [ [ id:'test'], - [ file(params.test_data['homo_sapiens']['illumina']['test_paired_end_recalibrated_sorted_bam'], checkIfExists: true)], - [ file(params.test_data['homo_sapiens']['illumina']['test_paired_end_recalibrated_sorted_bam_bai'], checkIfExists: true)], + [ file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.recalibrated.sorted.bam', checkIfExists: true)], + [ file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.recalibrated.sorted.bam.bai', checkIfExists: true)], [] ] input[1] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_fasta'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) ] input[2] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_fasta_fai'], 
checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai', checkIfExists: true) ] input[3] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_dict'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.dict', checkIfExists: true) ] input[4] = [] input[5] = [] @@ -261,21 +261,21 @@ nextflow_process { """ input[0] = [ [ id:'test'], - [ file(params.test_data['homo_sapiens']['illumina']['mitochon_standin_recalibrated_sorted_bam'], checkIfExists: true)], - [ file(params.test_data['homo_sapiens']['illumina']['mitochon_standin_recalibrated_sorted_bam_bai'], checkIfExists: true)], - [ file(params.test_data['homo_sapiens']['genome']['genome_bed'], checkIfExists: true)] + [ file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/mitochon_standin.recalibrated.sorted.bam', checkIfExists: true)], + [ file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/mitochon_standin.recalibrated.sorted.bam.bai', checkIfExists: true)], + [ file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true)] ] input[1] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) ] input[2] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_fasta_fai'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true) ] input[3] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_dict'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.dict', checkIfExists: true) ] input[4] = [] input[5] = [] @@ -312,31 +312,31 @@ 
nextflow_process { tumor_id:'tumour' ], [ - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_recalibrated_sorted_bam'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_recalibrated_sorted_bam'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.recalibrated.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam', checkIfExists: true) ], [ - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_recalibrated_sorted_bam_bai'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test2_paired_end_recalibrated_sorted_bam_bai'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.recalibrated.sorted.bam.bai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.recalibrated.sorted.bam.bai', checkIfExists: true) ], [] ] input[1] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_fasta'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) ] input[2] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_fasta_fai'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai', checkIfExists: true) ] input[3] = [ [ id:'genome' ], - file(params.test_data['homo_sapiens']['genome']['genome_21_dict'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.dict', checkIfExists: true) ] - input[4] = file(params.test_data['homo_sapiens']['genome']['gnomad_r2_1_1_21_vcf_gz'], checkIfExists: true) - input[5] = 
file(params.test_data['homo_sapiens']['genome']['gnomad_r2_1_1_21_vcf_gz_tbi'], checkIfExists: true) - input[6] = file(params.test_data['homo_sapiens']['genome']['mills_and_1000g_indels_21_vcf_gz'], checkIfExists: true) - input[7] = file(params.test_data['homo_sapiens']['genome']['mills_and_1000g_indels_21_vcf_gz_tbi'], checkIfExists: true) + input[4] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/gnomAD.r2.1.1.vcf.gz', checkIfExists: true) + input[5] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/gnomAD.r2.1.1.vcf.gz.tbi', checkIfExists: true) + input[6] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/mills_and_1000G.indels.hg38.vcf.gz', checkIfExists: true) + input[7] = file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/germlineresources/mills_and_1000G.indels.hg38.vcf.gz.tbi', checkIfExists: true) """ } } diff --git a/modules/nf-core/gatk4/variantrecalibrator/environment.yml b/modules/nf-core/gatk4/variantrecalibrator/environment.yml index 95b744c46b..55993f440c 100644 --- a/modules/nf-core/gatk4/variantrecalibrator/environment.yml +++ b/modules/nf-core/gatk4/variantrecalibrator/environment.yml @@ -1,7 +1,5 @@ -name: gatk4_variantrecalibrator channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4=4.5.0.0 diff --git a/modules/nf-core/gatk4/variantrecalibrator/meta.yml b/modules/nf-core/gatk4/variantrecalibrator/meta.yml index 39a415b61c..72fcfd601c 100644 --- a/modules/nf-core/gatk4/variantrecalibrator/meta.yml +++ b/modules/nf-core/gatk4/variantrecalibrator/meta.yml @@ -18,64 +18,92 @@ tools: homepage: https://gatk.broadinstitute.org/hc/en-us documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - 
e.g. [ id:'test' ] - - vcf: - type: file - description: input vcf file containing the variants to be recalibrated - pattern: "*.vcf.gz" - - tbi: - type: file - description: tbi file matching with -vcf - pattern: "*.vcf.gz.tbi" - - resource_vcf: - type: file - description: all resource vcf files that are used with the corresponding '--resource' label - pattern: "*.vcf.gz" - - resource_tbi: - type: file - description: all resource tbi files that are used with the corresponding '--resource' label - pattern: "*.vcf.gz.tbi" - - labels: - type: string - description: necessary arguments for GATK VariantRecalibrator. Specified to directly match the resources provided. More information can be found at https://gatk.broadinstitute.org/hc/en-us/articles/5358906115227-VariantRecalibrator - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - fai: - type: file - description: Index of reference fasta file - pattern: "fasta.fai" - - dict: - type: file - description: GATK sequence dictionary - pattern: "*.dict" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test' ] + - vcf: + type: file + description: input vcf file containing the variants to be recalibrated + pattern: "*.vcf.gz" + - tbi: + type: file + description: tbi file matching with -vcf + pattern: "*.vcf.gz.tbi" + - - resource_vcf: + type: file + description: all resource vcf files that are used with the corresponding '--resource' + label + pattern: "*.vcf.gz" + - - resource_tbi: + type: file + description: all resource tbi files that are used with the corresponding '--resource' + label + pattern: "*.vcf.gz.tbi" + - - labels: + type: string + description: necessary arguments for GATK VariantRecalibrator. Specified to + directly match the resources provided. 
More information can be found at + https://gatk.broadinstitute.org/hc/en-us/articles/5358906115227-VariantRecalibrator + - - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - fai: + type: file + description: Index of reference fasta file + pattern: "fasta.fai" + - - dict: + type: file + description: GATK sequence dictionary + pattern: "*.dict" output: - recal: - type: file - description: Output recal file used by ApplyVQSR - pattern: "*.recal" + - meta: + type: file + description: Output recal file used by ApplyVQSR + pattern: "*.recal" + - "*.recal": + type: file + description: Output recal file used by ApplyVQSR + pattern: "*.recal" - idx: - type: file - description: Index file for the recal output file - pattern: "*.idx" + - meta: + type: file + description: Index file for the recal output file + pattern: "*.idx" + - "*.idx": + type: file + description: Index file for the recal output file + pattern: "*.idx" - tranches: - type: file - description: Output tranches file used by ApplyVQSR - pattern: "*.tranches" + - meta: + type: file + description: Output tranches file used by ApplyVQSR + pattern: "*.tranches" + - "*.tranches": + type: file + description: Output tranches file used by ApplyVQSR + pattern: "*.tranches" - plots: - type: file - description: Optional output rscript file to aid in visualization of the input data and learned model. - pattern: "*plots.R" - - version: - type: file - description: File containing software versions - pattern: "*.versions.yml" + - meta: + type: file + description: Optional output rscript file to aid in visualization of the input + data and learned model. + pattern: "*plots.R" + - "*plots.R": + type: file + description: Optional output rscript file to aid in visualization of the input + data and learned model. 
+ pattern: "*plots.R" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@GCJMackenzie" - "@nickhsmith" diff --git a/modules/nf-core/gatk4spark/applybqsr/environment.yml b/modules/nf-core/gatk4spark/applybqsr/environment.yml index de07029794..14075a574a 100644 --- a/modules/nf-core/gatk4spark/applybqsr/environment.yml +++ b/modules/nf-core/gatk4spark/applybqsr/environment.yml @@ -1,7 +1,5 @@ -name: gatk4spark_applybqsr channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4-spark=4.5.0.0 diff --git a/modules/nf-core/gatk4spark/applybqsr/meta.yml b/modules/nf-core/gatk4spark/applybqsr/meta.yml index 4904568d2e..609af2f450 100644 --- a/modules/nf-core/gatk4spark/applybqsr/meta.yml +++ b/modules/nf-core/gatk4spark/applybqsr/meta.yml @@ -16,56 +16,65 @@ tools: documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM file from alignment - pattern: "*.{bam,cram}" - - input_index: - type: file - description: BAI/CRAI file from alignment - pattern: "*.{bai,crai}" - - bqsr_table: - type: file - description: Recalibration table from gatk4_baserecalibrator - - intervals: - type: file - description: Bed file with the genomic regions included in the library (optional) - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - fai: - type: file - description: Index of reference fasta file - pattern: "*.fasta.fai" - - dict: - type: file - description: GATK sequence dictionary - pattern: "*.dict" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM file from alignment + pattern: "*.{bam,cram}" + - input_index: + type: file + description: BAI/CRAI file from alignment + pattern: "*.{bai,crai}" + - bqsr_table: + type: file + description: Recalibration table from gatk4_baserecalibrator + - intervals: + type: file + description: Bed file with the genomic regions included in the library (optional) + - - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - fai: + type: file + description: Index of reference fasta file + pattern: "*.fasta.fai" + - - dict: + type: file + description: GATK sequence dictionary + pattern: "*.dict" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - bam: - type: file - description: Recalibrated BAM file - pattern: "*.{bam}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bam": + type: file + description: Recalibrated BAM file + pattern: "*.{bam}" - cram: - type: file - description: Recalibrated CRAM file - pattern: "*.{cram}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.cram": + type: file + description: Recalibrated CRAM file + pattern: "*.{cram}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@yocra3" - "@FriederikeHanssen" diff --git a/modules/nf-core/gatk4spark/baserecalibrator/environment.yml b/modules/nf-core/gatk4spark/baserecalibrator/environment.yml index 84615886c1..14075a574a 100644 --- a/modules/nf-core/gatk4spark/baserecalibrator/environment.yml +++ b/modules/nf-core/gatk4spark/baserecalibrator/environment.yml @@ -1,7 +1,5 @@ -name: gatk4spark_baserecalibrator channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4-spark=4.5.0.0 diff --git a/modules/nf-core/gatk4spark/baserecalibrator/meta.yml b/modules/nf-core/gatk4spark/baserecalibrator/meta.yml index dd334a225f..abb0e1a65e 100644 --- a/modules/nf-core/gatk4spark/baserecalibrator/meta.yml +++ b/modules/nf-core/gatk4spark/baserecalibrator/meta.yml @@ -16,57 +16,60 @@ tools: documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM file from alignment - pattern: "*.{bam,cram}" - - input_index: - type: file - description: BAI/CRAI file from alignment - pattern: "*.{bai,crai}" - - intervals: - type: file - description: Bed file with the genomic regions included in the library (optional) - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - fai: - type: file - description: Index of reference fasta file - pattern: "*.fasta.fai" - - dict: - type: file - description: GATK sequence dictionary - pattern: "*.dict" - - known_sites: - type: file - description: VCF files with known sites for indels / snps (optional) - pattern: "*.vcf.gz" - - known_sites_tbi: - type: file - description: Tabix index of the known_sites (optional) - pattern: "*.vcf.gz.tbi" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM file from alignment + pattern: "*.{bam,cram}" + - input_index: + type: file + description: BAI/CRAI file from alignment + pattern: "*.{bai,crai}" + - intervals: + type: file + description: Bed file with the genomic regions included in the library (optional) + - - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - fai: + type: file + description: Index of reference fasta file + pattern: "*.fasta.fai" + - - dict: + type: file + description: GATK sequence dictionary + pattern: "*.dict" + - - known_sites: + type: file + description: VCF files with known sites for indels / snps (optional) + pattern: "*.vcf.gz" + - - known_sites_tbi: + type: file + description: Tabix index of the known_sites (optional) + pattern: "*.vcf.gz.tbi" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - table: - type: file - description: Recalibration table from BaseRecalibrator - pattern: "*.{table}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.table": + type: file + description: Recalibration table from BaseRecalibrator + pattern: "*.{table}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@yocra3" - "@FriederikeHanssen" diff --git a/modules/nf-core/gatk4spark/markduplicates/environment.yml b/modules/nf-core/gatk4spark/markduplicates/environment.yml index 94dd4a3c68..14075a574a 100644 --- a/modules/nf-core/gatk4spark/markduplicates/environment.yml +++ b/modules/nf-core/gatk4spark/markduplicates/environment.yml @@ -1,7 +1,5 @@ -name: gatk4spark_markduplicates channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::gatk4-spark=4.5.0.0 diff --git a/modules/nf-core/gatk4spark/markduplicates/meta.yml b/modules/nf-core/gatk4spark/markduplicates/meta.yml index 016a215b25..fc8dee3dff 100644 --- a/modules/nf-core/gatk4spark/markduplicates/meta.yml +++ b/modules/nf-core/gatk4spark/markduplicates/meta.yml @@ -1,5 +1,6 @@ name: gatk4spark_markduplicates -description: This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA. +description: This tool locates and tags duplicate reads in a BAM or SAM file, where + duplicate reads are defined as originating from a single fragment of DNA. keywords: - bam - gatk4spark @@ -7,52 +8,74 @@ keywords: - sort tools: - gatk4: - description: Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. 
Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size. + description: Developed in the Data Sciences Platform at the Broad Institute, the + toolkit offers a wide variety of tools with a primary focus on variant discovery + and genotyping. Its powerful processing engine and high-performance computing + features make it capable of taking on projects of any size. homepage: https://gatk.broadinstitute.org/hc/en-us documentation: https://gatk.broadinstitute.org/hc/en-us/articles/360037052812-MarkDuplicates-Picard- tool_dev_url: https://github.com/broadinstitute/gatk doi: 10.1158/1538-7445.AM2017-3590 licence: ["MIT"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: Sorted BAM file - pattern: "*.{bam}" - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - fai: - type: file - description: Index of reference fasta file - pattern: "*.fasta.fai" - - dict: - type: file - description: GATK sequence dictionary - pattern: "*.dict" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bam: + type: file + description: Sorted BAM file + pattern: "*.{bam}" + - - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - fasta_fai: + type: file + description: Index of reference fasta file + pattern: "*.fai" + - - dict: + type: file + description: GATK sequence dictionary + pattern: "*.dict" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - output: - type: file - description: Marked duplicates BAM/CRAM file - pattern: "*.{bam,cram}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}: + type: file + description: Marked duplicates BAM/CRAM file + pattern: "*.{bam,cram}" - bam_index: - type: file - description: Optional BAM index file - pattern: "*.bai" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.bai: + type: file + description: Optional BAM index file + pattern: "*.bai" + - metrics: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.metrics": + type: file + description: Metrics file + pattern: "*.metrics" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@ajodeh-juma" - "@FriederikeHanssen" diff --git a/modules/nf-core/gawk/environment.yml b/modules/nf-core/gawk/environment.yml index 3d98a08b0c..315f6dc67e 100644 --- a/modules/nf-core/gawk/environment.yml +++ b/modules/nf-core/gawk/environment.yml @@ -1,7 +1,5 @@ -name: gawk channels: - conda-forge - bioconda - - defaults dependencies: - conda-forge::gawk=5.3.0 diff --git a/modules/nf-core/gawk/main.nf b/modules/nf-core/gawk/main.nf index ca4689297c..7514246eeb 100644 --- a/modules/nf-core/gawk/main.nf +++ b/modules/nf-core/gawk/main.nf @@ -8,7 +8,7 @@ process GAWK { 'biocontainers/gawk:5.3.0' }" input: - tuple val(meta), path(input) + tuple val(meta), path(input, arity: '0..*') path(program_file) output: @@ -22,15 +22,19 @@ process GAWK { def args = task.ext.args ?: '' // args is used for the main arguments of the tool def args2 = task.ext.args2 ?: '' // args2 is used to specify a program 
when no program file has been given prefix = task.ext.prefix ?: "${meta.id}" - suffix = task.ext.suffix ?: "${input.getExtension()}" + suffix = task.ext.suffix ?: "${input.collect{ it.getExtension()}.get(0)}" // use the first extension of the input files - program = program_file ? "-f ${program_file}" : "${args2}" + program = program_file ? "-f ${program_file}" : "${args2}" + lst_gz = input.collect{ it.getExtension().endsWith("gz") } + unzip = lst_gz.contains(false) ? "" : "find ${input} -exec zcat {} \\; | \\" + input_cmd = unzip ? "" : "${input}" """ + ${unzip} awk \\ ${args} \\ ${program} \\ - ${input} \\ + ${input_cmd} \\ > ${prefix}.${suffix} cat <<-END_VERSIONS > versions.yml diff --git a/modules/nf-core/gawk/meta.yml b/modules/nf-core/gawk/meta.yml index 2b6033b0b5..2da41405de 100644 --- a/modules/nf-core/gawk/meta.yml +++ b/modules/nf-core/gawk/meta.yml @@ -16,34 +16,41 @@ tools: documentation: "https://www.gnu.org/software/gawk/manual/" tool_dev_url: "https://www.gnu.org/prep/ftp.html" licence: ["GPL v3"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: The input file - Specify the logic that needs to be executed on this file on the `ext.args2` or in the program file - pattern: "*" - - program_file: - type: file - description: Optional file containing logic for awk to execute. If you don't wish to use a file, you can use `ext.args2` to specify the logic. - pattern: "*" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: The input file - Specify the logic that needs to be executed on + this file on the `ext.args2` or in the program file. + If the files have a `.gz` extension, they will be unzipped using `zcat`. 
+ pattern: "*" + - - program_file: + type: file + description: Optional file containing logic for awk to execute. If you don't + wish to use a file, you can use `ext.args2` to specify the logic. + pattern: "*" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - output: - type: file - description: The output file - specify the name of this file using `ext.prefix` and the extension using `ext.suffix` - pattern: "*" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.${suffix}: + type: file + description: The output file - specify the name of this file using `ext.prefix` + and the extension using `ext.suffix` + pattern: "*" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@nvnieuwk" maintainers: diff --git a/modules/nf-core/gawk/tests/main.nf.test b/modules/nf-core/gawk/tests/main.nf.test index fce82ca95a..5952e9a293 100644 --- a/modules/nf-core/gawk/tests/main.nf.test +++ b/modules/nf-core/gawk/tests/main.nf.test @@ -8,7 +8,7 @@ nextflow_process { tag "modules_nfcore" tag "gawk" - test("convert fasta to bed") { + test("Convert fasta to bed") { config "./nextflow.config" when { @@ -31,7 +31,7 @@ nextflow_process { } } - test("convert fasta to bed with program file") { + test("Convert fasta to bed with program file") { config "./nextflow_with_program_file.config" when { @@ -53,4 +53,52 @@ nextflow_process { ) } } + + test("Extract first column from multiple files") { + config "./nextflow_with_program_file.config" + tag "test" + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + [file(params.modules_testdata_base_path + 'generic/txt/hello.txt', checkIfExists: true), + file(params.modules_testdata_base_path + 
'generic/txt/species_names.txt', checkIfExists: true)] + ] + input[1] = Channel.of('BEGIN {FS=" "}; {print \$1}').collectFile(name:"program.txt") + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("Unzip files before processing") { + config "./nextflow_with_program_file.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + [file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/vcf/NA12878_chrM.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/vcf/NA24385_sv.vcf.gz', checkIfExists: true)] + ] + input[1] = Channel.of('/^#CHROM/ { print \$1, \$10 }').collectFile(name:"column_header.txt") + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } } \ No newline at end of file diff --git a/modules/nf-core/gawk/tests/main.nf.test.snap b/modules/nf-core/gawk/tests/main.nf.test.snap index 4f3a759c62..d396f738b6 100644 --- a/modules/nf-core/gawk/tests/main.nf.test.snap +++ b/modules/nf-core/gawk/tests/main.nf.test.snap @@ -1,5 +1,5 @@ { - "convert fasta to bed with program file": { + "Convert fasta to bed": { "content": [ { "0": [ @@ -28,11 +28,11 @@ ], "meta": { "nf-test": "0.8.4", - "nextflow": "24.03.0" + "nextflow": "24.04.4" }, - "timestamp": "2024-05-17T15:20:02.495430346" + "timestamp": "2024-10-19T13:14:02.347809811" }, - "convert fasta to bed": { + "Convert fasta to bed with program file": { "content": [ { "0": [ @@ -61,8 +61,74 @@ ], "meta": { "nf-test": "0.8.4", - "nextflow": "24.03.0" + "nextflow": "24.04.4" }, - "timestamp": "2024-05-17T15:19:53.291809648" + "timestamp": "2024-10-19T13:14:11.894616209" + }, + "Extract first column from multiple files": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.bed:md5,566c51674bd643227bb2d83e0963376d" + ] + ], + "1": [ + 
"versions.yml:md5,842acc9870dc8ac280954047cb2aa23a" + ], + "output": [ + [ + { + "id": "test" + }, + "test.bed:md5,566c51674bd643227bb2d83e0963376d" + ] + ], + "versions": [ + "versions.yml:md5,842acc9870dc8ac280954047cb2aa23a" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-19T22:04:47.729300129" + }, + "Unzip files before processing": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.bed:md5,1e31ebd4a060aab5433bbbd9ab24e403" + ] + ], + "1": [ + "versions.yml:md5,842acc9870dc8ac280954047cb2aa23a" + ], + "output": [ + [ + { + "id": "test" + }, + "test.bed:md5,1e31ebd4a060aab5433bbbd9ab24e403" + ] + ], + "versions": [ + "versions.yml:md5,842acc9870dc8ac280954047cb2aa23a" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-19T22:08:19.533527657" } -} \ No newline at end of file +} diff --git a/modules/nf-core/goleft/indexcov/environment.yml b/modules/nf-core/goleft/indexcov/environment.yml new file mode 100644 index 0000000000..813146929c --- /dev/null +++ b/modules/nf-core/goleft/indexcov/environment.yml @@ -0,0 +1,6 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::goleft=0.2.4 + - bioconda::htslib=1.12 diff --git a/modules/nf-core/goleft/indexcov/main.nf b/modules/nf-core/goleft/indexcov/main.nf new file mode 100644 index 0000000000..5d0ed5dfb0 --- /dev/null +++ b/modules/nf-core/goleft/indexcov/main.nf @@ -0,0 +1,65 @@ +process GOLEFT_INDEXCOV { + tag "${meta.id}" + label 'process_single' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/goleft:0.2.4--h9ee0642_1': + 'biocontainers/goleft:0.2.4--h9ee0642_1' }" + + input: + tuple val(meta), path(bams), path(indexes) + tuple val(meta2), path(fai) + + output: + tuple val(meta), path("${prefix}/*") , emit: output + tuple val(meta), path("${prefix}/*ped") , emit: ped , optional: true + tuple val(meta), path("${prefix}/*bed.gz") , emit: bed , optional: true + tuple val(meta), path("${prefix}/*bed.gz.tbi"), emit: bed_index , optional: true + tuple val(meta), path("${prefix}/*roc") , emit: roc , optional: true + tuple val(meta), path("${prefix}/*html") , emit: html, optional: true + tuple val(meta), path("${prefix}/*png") , emit: png , optional: true + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + prefix = task.ext.prefix ?: "${meta.id}" + // indexcov uses BAM files or CRAI + def input_files = bams.findAll{it.name.endsWith(".bam")} + indexes.findAll{it.name.endsWith(".crai")} + def extranormalize = input_files.any{it.name.endsWith(".crai")} ? 
" --extranormalize " : "" + """ + goleft indexcov \\ + --fai ${fai} \\ + --directory ${prefix} \\ + ${extranormalize} \\ + $args \\ + ${input_files.join(" ")} + + if [ -f "${prefix}/${prefix}-indexcov.bed.gz" ] ; then + tabix -p bed "${prefix}/${prefix}-indexcov.bed.gz" + fi + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + goleft: \$(goleft --version 2>&1 | head -n 1 | sed 's/^.*goleft Version: //') + tabix: \$(echo \$(tabix -h 2>&1) | sed 's/^.*Version: //; s/ .*\$//') + END_VERSIONS + """ + stub: + def args = task.ext.args ?: '' + prefix = task.ext.prefix ?: "${meta.id}" + """ + mkdir "${prefix}" + echo "" | gzip > "${prefix}/${prefix}-indexcov.bed.gz" + touch "${prefix}/${prefix}-indexcov.bed.gz.tbi" + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + goleft: \$(goleft --version 2>&1 | head -n 1 | sed 's/^.*goleft Version: //') + tabix: \$(echo \$(tabix -h 2>&1) | sed 's/^.*Version: //; s/ .*\$//') + END_VERSIONS + """ +} diff --git a/modules/nf-core/goleft/indexcov/meta.yml b/modules/nf-core/goleft/indexcov/meta.yml new file mode 100644 index 0000000000..1619caf32d --- /dev/null +++ b/modules/nf-core/goleft/indexcov/meta.yml @@ -0,0 +1,122 @@ +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/yaml-schema.json +name: "goleft_indexcov" +description: Quickly estimate coverage from a whole-genome bam or cram index. A bam + index has 16KB resolution so that's what this gives, but it provides what appears + to be a high-quality coverage estimate in seconds per genome. 
+keywords: + - coverage + - cnv + - genomics + - depth +tools: + - "goleft": + description: "goleft is a collection of bioinformatics tools distributed under + MIT license in a single static binary" + homepage: "https://github.com/brentp/goleft" + documentation: "https://github.com/brentp/goleft" + tool_dev_url: "https://github.com/brentp/goleft" + doi: "10.1093/gigascience/gix090" + licence: ["MIT"] + identifier: "" +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false] + - bams: + type: file + description: Sorted BAM/CRAM/SAM files + pattern: "*.{bam,cram,sam}" + - indexes: + type: file + description: BAI/CRAI files + pattern: "*.{bai,crai}" + - - meta2: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false] + - fai: + type: file + description: FASTA index + pattern: "*.{fai}" +output: + - output: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*: + type: file + description: Files generated by indexcov + - ped: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*ped: + type: file + description: ped files + pattern: "*ped" + - bed: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*bed.gz: + type: file + description: bed files + pattern: "*bed.gz" + - bed_index: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*bed.gz.tbi: + type: file + description: bed index files + pattern: "*bed.gz.tbi" + - roc: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}/*roc: + type: file + description: roc files + pattern: "*roc" + - html: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*html: + type: file + description: html files + pattern: "*html" + - png: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}/*png: + type: file + description: png files + pattern: "*png" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@lindenb" +maintainers: + - "@lindenb" diff --git a/modules/nf-core/goleft/indexcov/tests/main.nf.test b/modules/nf-core/goleft/indexcov/tests/main.nf.test new file mode 100644 index 0000000000..1296c644cd --- /dev/null +++ b/modules/nf-core/goleft/indexcov/tests/main.nf.test @@ -0,0 +1,131 @@ +nextflow_process { + + name "Test Process GOLEFT_INDEXCOV" + script "../main.nf" + process "GOLEFT_INDEXCOV" + + tag "modules" + tag "modules_nfcore" + tag "goleft" + tag "goleft/indexcov" + + test("sarscov2 - bam") { + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test' ], // meta map + [ + file(params.modules_testdata_base_path + "genomics/sarscov2/illumina/bam/test.single_end.sorted.bam", checkIfExists: true), + file(params.modules_testdata_base_path + "genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam", checkIfExists: true) + ], + [ + file(params.modules_testdata_base_path + "genomics/sarscov2/illumina/bam/test.single_end.sorted.bam.bai", checkIfExists: true), + file(params.modules_testdata_base_path + "genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai", checkIfExists: true) + ], + ]) + + input[1] = Channel.of( + [ + [:], + file(params.modules_testdata_base_path + "genomics/sarscov2/genome/genome.fasta.fai", checkIfExists: true) + ] + ) + """ + } + } + + then { + assertAll( + { assert 
process.success }, + { assert snapshot( + process.out.ped, + process.out.bed, + file(process.out.bed_index[0][1]).name, + process.out.roc, + process.out.html, + process.out.png, + process.out.versions + ).match() } + ) + } + + } + + + test("sarscov2 - crai") { + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test' ], // meta map + [ + file(params.modules_testdata_base_path + "genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram", checkIfExists: true), + file(params.modules_testdata_base_path + "genomics/homo_sapiens/illumina/cram/test.paired_end.recalibrated.sorted.cram", checkIfExists: true) + ], + [ + file(params.modules_testdata_base_path + "genomics/homo_sapiens/illumina/cram/test.paired_end.markduplicates.sorted.cram.crai", checkIfExists: true), + file(params.modules_testdata_base_path + "genomics/homo_sapiens/illumina/cram/test.paired_end.recalibrated.sorted.cram.crai", checkIfExists: true) + ] + ]) + + input[1] = Channel.of( + [ + [:], + file(params.modules_testdata_base_path + "genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai", checkIfExists: true) + ] + ) + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.ped, + process.out.bed, + file(process.out.bed_index[0][1]).name, + process.out.roc, + process.out.html, + process.out.png, + process.out.versions + ).match() } + ) + } + + } + + test("sarscov2 - stub") { + + options "-stub" + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test' ], // meta map + [], + [] + ]) + + input[1] = Channel.of([ + [:], + [] + ]) + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/goleft/indexcov/tests/main.nf.test.snap b/modules/nf-core/goleft/indexcov/tests/main.nf.test.snap new file mode 100644 index 0000000000..1c79232db0 --- /dev/null +++ b/modules/nf-core/goleft/indexcov/tests/main.nf.test.snap @@ -0,0 +1,205 @@ +{ + 
"sarscov2 - crai": { + "content": [ + [ + [ + { + "id": "test" + }, + "test-indexcov.ped:md5,8737714b6ea160e06d5282391f89f791" + ] + ], + [ + [ + { + "id": "test" + }, + "test-indexcov.bed.gz:md5,04aa3637cffca5d99316df7741c06589" + ] + ], + "test-indexcov.bed.gz.tbi", + [ + [ + { + "id": "test" + }, + "test-indexcov.roc:md5,548b76fdf16e97768b0c9b8ecbfd5bef" + ] + ], + [ + [ + { + "id": "test" + }, + [ + "index.html:md5,41840ede180b20cdf6074c431269929e", + "test-indexcov-depth-chr21.html:md5,4c839b03f2f41e3fdca5642903c35008", + "test-indexcov-roc-chr21.html:md5,f84b547328a23196f16f71d093eb7450" + ] + ] + ], + [ + [ + { + "id": "test" + }, + [ + "test-indexcov-depth-chr21.png:md5,1999b0bf1cd0680f6d107d438e7257cf", + "test-indexcov-roc-chr21.png:md5,41f1460535b255fff053da59fcccf698" + ] + ] + ], + [ + "versions.yml:md5,f9c06c1c05a2a31854b4e04e449a24c5" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-22T06:40:17.142801459" + }, + "sarscov2 - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + [ + "test-indexcov.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test-indexcov.bed.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test" + }, + "test-indexcov.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "3": [ + [ + { + "id": "test" + }, + "test-indexcov.bed.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + + ], + "5": [ + + ], + "6": [ + + ], + "7": [ + "versions.yml:md5,f9c06c1c05a2a31854b4e04e449a24c5" + ], + "bed": [ + [ + { + "id": "test" + }, + "test-indexcov.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "bed_index": [ + [ + { + "id": "test" + }, + "test-indexcov.bed.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "html": [ + + ], + "output": [ + [ + { + "id": "test" + }, + [ + "test-indexcov.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test-indexcov.bed.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], 
+ "ped": [ + + ], + "png": [ + + ], + "roc": [ + + ], + "versions": [ + "versions.yml:md5,f9c06c1c05a2a31854b4e04e449a24c5" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-22T06:44:59.203730744" + }, + "sarscov2 - bam": { + "content": [ + [ + [ + { + "id": "test" + }, + "test-indexcov.ped:md5,da2bd9882474d2f00f8ad2ab20b140c9" + ] + ], + [ + [ + { + "id": "test" + }, + "test-indexcov.bed.gz:md5,eab7a78287e261d600c06def12a33029" + ] + ], + "test-indexcov.bed.gz.tbi", + [ + [ + { + "id": "test" + }, + "test-indexcov.roc:md5,3f460308bb86203d1ada71b7c84d995d" + ] + ], + [ + [ + { + "id": "test" + }, + "index.html:md5,d1cc28023cd827446e0f9c905c94fe3e" + ] + ], + [ + + ], + [ + "versions.yml:md5,f9c06c1c05a2a31854b4e04e449a24c5" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-22T06:39:48.470187823" + } +} \ No newline at end of file diff --git a/modules/nf-core/goleft/indexcov/tests/tags.yml b/modules/nf-core/goleft/indexcov/tests/tags.yml new file mode 100644 index 0000000000..c27c4b9d5e --- /dev/null +++ b/modules/nf-core/goleft/indexcov/tests/tags.yml @@ -0,0 +1,2 @@ +goleft/indexcov: + - "modules/nf-core/goleft/indexcov/**" diff --git a/modules/nf-core/lofreq/callparallel/environment.yml b/modules/nf-core/lofreq/callparallel/environment.yml new file mode 100644 index 0000000000..011ce6cbda --- /dev/null +++ b/modules/nf-core/lofreq/callparallel/environment.yml @@ -0,0 +1,5 @@ +channels: + - conda-forge + - bioconda +dependencies: + - bioconda::lofreq=2.1.5 diff --git a/modules/nf-core/lofreq/callparallel/main.nf b/modules/nf-core/lofreq/callparallel/main.nf new file mode 100644 index 0000000000..93f9a3dfb1 --- /dev/null +++ b/modules/nf-core/lofreq/callparallel/main.nf @@ -0,0 +1,70 @@ +process LOFREQ_CALLPARALLEL { + tag "$meta.id" + label 'process_high' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && 
!task.ext.singularity_pull_docker_container ? + 'https://depot.galaxyproject.org/singularity/lofreq:2.1.5--py38h588ecb2_4' : + 'biocontainers/lofreq:2.1.5--py38h588ecb2_4' }" + + input: + tuple val(meta) , path(bam), path(bai), path(intervals) + tuple val(meta2), path(fasta) + tuple val(meta3), path(fai) + + output: + tuple val(meta), path("*.vcf.gz") , emit: vcf + tuple val(meta), path("*.vcf.gz.tbi"), emit: tbi + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def options_intervals = intervals ? "-l ${intervals}" : "" + + def alignment_cram = bam.Extension == "cram" ? true : false + def alignment_bam = bam.Extension == "bam" ? true : false + def alignment_out = alignment_cram ? bam.BaseName + ".bam" : "${bam}" + + def samtools_cram_convert = '' + samtools_cram_convert += alignment_cram ? " samtools view -T ${fasta} ${bam} -@ $task.cpus -o ${alignment_out}\n" : '' + samtools_cram_convert += alignment_cram ? " samtools index ${alignment_out}\n" : '' + + def samtools_cram_remove = '' + samtools_cram_remove += alignment_cram ? " rm ${alignment_out}\n" : '' + samtools_cram_remove += alignment_cram ? 
" rm ${alignment_out}.bai\n " : '' + """ + $samtools_cram_convert + + lofreq \\ + call-parallel \\ + --pp-threads $task.cpus \\ + $args \\ + $options_intervals \\ + -f $fasta \\ + -o ${prefix}.vcf.gz \\ + $alignment_out + + $samtools_cram_remove + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + lofreq: \$(echo \$(lofreq version 2>&1) | sed 's/^version: //; s/ *commit.*\$//') + END_VERSIONS + """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + """ + echo "" | gzip > ${prefix}.vcf.gz + echo "" | gzip > ${prefix}.vcf.gz.tbi + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + lofreq: \$(echo \$(lofreq version 2>&1) | sed 's/^version: //; s/ *commit.*\$//') + END_VERSIONS + """ +} diff --git a/modules/nf-core/lofreq/callparallel/meta.yml b/modules/nf-core/lofreq/callparallel/meta.yml new file mode 100644 index 0000000000..25a33e85c4 --- /dev/null +++ b/modules/nf-core/lofreq/callparallel/meta.yml @@ -0,0 +1,87 @@ +name: lofreq_callparallel +description: It predicts variants using multiple processors +keywords: + - variant calling + - low frequency variant calling + - call + - variants +tools: + - lofreq: + description: Lofreq is a fast and sensitive variant-caller for inferring SNVs + and indels from next-generation sequencing data. It's call-parallel programme + predicts variants using multiple processors + homepage: https://csb5.github.io/lofreq/ + documentation: https://csb5.github.io/lofreq/ + doi: "10.1093/nar/gks918" + licence: ["MIT"] + identifier: biotools:lofreq +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test' ] + - bam: + type: file + description: Tumor sample sorted BAM file + pattern: "*.{bam}" + - bai: + type: file + description: BAM index file + pattern: "*.{bam.bai}" + - intervals: + type: file + description: BED file containing target regions for variant calling + pattern: "*.{bed}" + - - meta2: + type: map + description: | + Groovy Map containing sample information about the reference fasta + e.g. [ id:'reference' ] + - fasta: + type: file + description: Reference genome FASTA file + pattern: "*.{fasta}" + - - meta3: + type: map + description: | + Groovy Map containing sample information about the reference fasta fai + e.g. [ id:'reference' ] + - fai: + type: file + description: Reference genome FASTA index file + pattern: "*.{fai}" +output: + - vcf: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.vcf.gz": + type: file + description: Predicted variants file + pattern: "*.{vcf}" + - tbi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.vcf.gz.tbi": + type: file + description: Index of vcf file + pattern: "*.{vcf.gz.tbi}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@kaurravneet4123" + - "@bjohnnyd" +maintainers: + - "@kaurravneet4123" + - "@bjohnnyd" + - "@nevinwu" + - "@AitorPeseta" diff --git a/modules/nf-core/lofreq/callparallel/tests/main.nf.test b/modules/nf-core/lofreq/callparallel/tests/main.nf.test new file mode 100644 index 0000000000..31f199208b --- /dev/null +++ b/modules/nf-core/lofreq/callparallel/tests/main.nf.test @@ -0,0 +1,109 @@ +nextflow_process { + + name "Test Process LOFREQ_CALLPARALLEL" + script "../main.nf" + process "LOFREQ_CALLPARALLEL" + + tag "modules" + tag "modules_nfcore" + tag "lofreq" + tag "lofreq/callparallel" + + test("sarscov2 - bam") { + + when { + process { + """ + input[0] = [ [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), + [] + ] + input[1] = [ [ id:'fasta' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + input[2] = [ [ id:'fai' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.versions, + process.out.tbi, + path(process.out.vcf[0][1]).vcf.summary + ).match() }, + ) + } + + } + + test("sarscov2 - bam - bed") { + + when { + process { + """ + input[0] = [ [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path 
+ 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test.bed', checkIfExists: true) + ] + input[1] = [ [ id:'fasta' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + input[2] = [ [ id:'fai' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.versions, + process.out.tbi, + path(process.out.vcf[0][1]).vcf.summary + ).match() }, + ) + } + + } + + test("sarscov2 - bam - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), + [] + ] + input[1] = [ [ id:'fasta' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + input[2] = [ [ id:'fai' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/lofreq/callparallel/tests/main.nf.test.snap b/modules/nf-core/lofreq/callparallel/tests/main.nf.test.snap new file mode 100644 index 0000000000..11587f4ace --- /dev/null +++ b/modules/nf-core/lofreq/callparallel/tests/main.nf.test.snap @@ -0,0 +1,93 @@ +{ + "sarscov2 - bam - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test" 
+ }, + "test.vcf.gz.tbi:md5,1a60c330fb42841e8dcf3cd507a70bfc" + ] + ], + "2": [ + "versions.yml:md5,56d45e0015add277b2689f071a4fe3e4" + ], + "tbi": [ + [ + { + "id": "test" + }, + "test.vcf.gz.tbi:md5,1a60c330fb42841e8dcf3cd507a70bfc" + ] + ], + "vcf": [ + [ + { + "id": "test" + }, + "test.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,56d45e0015add277b2689f071a4fe3e4" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.04.0" + }, + "timestamp": "2024-08-28T12:01:24.268196316" + }, + "sarscov2 - bam": { + "content": [ + [ + "versions.yml:md5,56d45e0015add277b2689f071a4fe3e4" + ], + [ + [ + { + "id": "test" + }, + "test.vcf.gz.tbi:md5,4cb176febbc8c26d717a6c6e67b9c905" + ] + ], + "VcfFile [chromosomes=[], sampleCount=0, variantCount=0, phased=true, phasedAutodetect=true]" + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.04.0" + }, + "timestamp": "2024-08-28T14:14:55.381365088" + }, + "sarscov2 - bam - bed": { + "content": [ + [ + "versions.yml:md5,56d45e0015add277b2689f071a4fe3e4" + ], + [ + [ + { + "id": "test" + }, + "test.vcf.gz.tbi:md5,4cb176febbc8c26d717a6c6e67b9c905" + ] + ], + "VcfFile [chromosomes=[], sampleCount=0, variantCount=0, phased=true, phasedAutodetect=true]" + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.04.0" + }, + "timestamp": "2024-08-28T14:15:18.221515296" + } +} \ No newline at end of file diff --git a/modules/nf-core/lofreq/callparallel/tests/tags.yml b/modules/nf-core/lofreq/callparallel/tests/tags.yml new file mode 100644 index 0000000000..14c36bc274 --- /dev/null +++ b/modules/nf-core/lofreq/callparallel/tests/tags.yml @@ -0,0 +1,2 @@ +lofreq/callparallel: + - "modules/nf-core/lofreq/callparallel/**" diff --git a/modules/nf-core/manta/germline/environment.yml b/modules/nf-core/manta/germline/environment.yml index 4a63d3084b..fe5ade5068 100644 --- a/modules/nf-core/manta/germline/environment.yml +++ b/modules/nf-core/manta/germline/environment.yml @@ -1,7 +1,5 @@ 
-name: manta_germline channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::manta=1.6.0 diff --git a/modules/nf-core/manta/germline/meta.yml b/modules/nf-core/manta/germline/meta.yml index 72ed15f8bc..4072ab8e56 100644 --- a/modules/nf-core/manta/germline/meta.yml +++ b/modules/nf-core/manta/germline/meta.yml @@ -1,5 +1,7 @@ name: manta_germline -description: Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs. +description: Manta calls structural variants (SVs) and indels from mapped paired-end + sequencing reads. It is optimized for analysis of germline variation in small sets + of individuals and somatic variation in tumor/normal sample pairs. keywords: - somatic - wgs @@ -16,84 +18,117 @@ tools: tool_dev_url: https://github.com/Illumina/manta doi: "10.1093/bioinformatics/btv710" licence: ["GPL v3"] + identifier: biotools:manta_sv input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM/SAM file. For joint calling use a list of files. - pattern: "*.{bam,cram,sam}" - - index: - type: file - description: BAM/CRAM/SAM index file. For joint calling use a list of files. - pattern: "*.{bai,crai,sai}" - - target_bed: - type: file - description: BED file containing target regions for variant calling - pattern: "*.{bed}" - - target_bed_tbi: - type: file - description: Index for BED file containing target regions for variant calling - pattern: "*.{bed.tbi}" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fasta: - type: file - description: Genome reference FASTA file - pattern: "*.{fa,fasta}" - - meta3: - type: map - description: | - Groovy Map containing reference information - e.g. 
[ id:'genome' ] - - fai: - type: file - description: Genome reference FASTA index file - pattern: "*.{fa.fai,fasta.fai}" - - config: - type: file - description: Manta configuration file - pattern: "*.{ini,conf,config}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM/SAM file. For joint calling use a list of files. + pattern: "*.{bam,cram,sam}" + - index: + type: file + description: BAM/CRAM/SAM index file. For joint calling use a list of files. + pattern: "*.{bai,crai,sai}" + - target_bed: + type: file + description: BED file containing target regions for variant calling + pattern: "*.{bed}" + - target_bed_tbi: + type: file + description: Index for BED file containing target regions for variant calling + pattern: "*.{bed.tbi}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fasta: + type: file + description: Genome reference FASTA file + pattern: "*.{fa,fasta}" + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fai: + type: file + description: Genome reference FASTA index file + pattern: "*.{fa.fai,fasta.fai}" + - - config: + type: file + description: Manta configuration file + pattern: "*.{ini,conf,config}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - candidate_small_indels_vcf: - type: file - description: Gzipped VCF file containing variants - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*candidate_small_indels.vcf.gz": + type: file + description: Gzipped VCF file containing variants + pattern: "*.{vcf.gz}" - candidate_small_indels_vcf_tbi: - type: file - description: Index for gzipped VCF file containing variants - pattern: "*.{vcf.gz.tbi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*candidate_small_indels.vcf.gz.tbi": + type: file + description: Index for gzipped VCF file containing variants + pattern: "*.{vcf.gz.tbi}" - candidate_sv_vcf: - type: file - description: Gzipped VCF file containing variants - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*candidate_sv.vcf.gz": + type: file + description: Gzipped VCF file containing variants + pattern: "*.{vcf.gz}" - candidate_sv_vcf_tbi: - type: file - description: Index for gzipped VCF file containing variants - pattern: "*.{vcf.gz.tbi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*candidate_sv.vcf.gz.tbi": + type: file + description: Index for gzipped VCF file containing variants + pattern: "*.{vcf.gz.tbi}" - diploid_sv_vcf: - type: file - description: Gzipped VCF file containing variants - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*diploid_sv.vcf.gz": + type: file + description: Gzipped VCF file containing variants + pattern: "*.{vcf.gz}" - diploid_sv_vcf_tbi: - type: file - description: Index for gzipped VCF file containing variants - pattern: "*.{vcf.gz.tbi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*diploid_sv.vcf.gz.tbi": + type: file + description: Index for gzipped VCF file containing variants + pattern: "*.{vcf.gz.tbi}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@maxulysse" - "@ramprasadn" diff --git a/modules/nf-core/manta/somatic/environment.yml b/modules/nf-core/manta/somatic/environment.yml index aac8827dfc..fe5ade5068 100644 --- a/modules/nf-core/manta/somatic/environment.yml +++ b/modules/nf-core/manta/somatic/environment.yml @@ -1,7 +1,5 @@ -name: manta_somatic channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::manta=1.6.0 diff --git a/modules/nf-core/manta/somatic/meta.yml b/modules/nf-core/manta/somatic/meta.yml index e658edaaa4..11f1fc1ca0 100644 --- a/modules/nf-core/manta/somatic/meta.yml +++ b/modules/nf-core/manta/somatic/meta.yml @@ -1,5 +1,7 @@ name: manta_somatic -description: Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs. +description: Manta calls structural variants (SVs) and indels from mapped paired-end + sequencing reads. It is optimized for analysis of germline variation in small sets + of individuals and somatic variation in tumor/normal sample pairs. keywords: - somatic - wgs @@ -16,100 +18,145 @@ tools: tool_dev_url: https://github.com/Illumina/manta doi: "10.1093/bioinformatics/btv710" licence: ["GPL v3"] + identifier: biotools:manta_sv input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - input_normal: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - input_index_normal: - type: file - description: BAM/CRAM/SAM index file - pattern: "*.{bai,crai,sai}" - - input_tumor: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - input_index_tumor: - type: file - description: BAM/CRAM/SAM index file - pattern: "*.{bai,crai,sai}" - - target_bed: - type: file - description: BED file containing target regions for variant calling - pattern: "*.{bed}" - - target_bed_tbi: - type: file - description: Index for BED file containing target regions for variant calling - pattern: "*.{bed.tbi}" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fasta: - type: file - description: Genome reference FASTA file - pattern: "*.{fa,fasta}" - - meta3: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fai: - type: file - description: Genome reference FASTA index file - pattern: "*.{fa.fai,fasta.fai}" - - config: - type: file - description: Manta configuration file - pattern: "*.{ini,conf,config}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - input_normal: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - input_index_normal: + type: file + description: BAM/CRAM/SAM index file + pattern: "*.{bai,crai,sai}" + - input_tumor: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - input_index_tumor: + type: file + description: BAM/CRAM/SAM index file + pattern: "*.{bai,crai,sai}" + - target_bed: + type: file + description: BED file containing target regions for variant calling + pattern: "*.{bed}" + - target_bed_tbi: + type: file + description: Index for BED file containing target regions for variant calling + pattern: "*.{bed.tbi}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fasta: + type: file + description: Genome reference FASTA file + pattern: "*.{fa,fasta}" + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fai: + type: file + description: Genome reference FASTA index file + pattern: "*.{fa.fai,fasta.fai}" + - - config: + type: file + description: Manta configuration file + pattern: "*.{ini,conf,config}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - candidate_small_indels_vcf: - type: file - description: Gzipped VCF file containing variants - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.candidate_small_indels.vcf.gz": + type: file + description: Gzipped VCF file containing variants + pattern: "*.{vcf.gz}" - candidate_small_indels_vcf_tbi: - type: file - description: Index for gzipped VCF file containing variants - pattern: "*.{vcf.gz.tbi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.candidate_small_indels.vcf.gz.tbi": + type: file + description: Index for gzipped VCF file containing variants + pattern: "*.{vcf.gz.tbi}" - candidate_sv_vcf: - type: file - description: Gzipped VCF file containing variants - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.candidate_sv.vcf.gz": + type: file + description: Gzipped VCF file containing variants + pattern: "*.{vcf.gz}" - candidate_sv_vcf_tbi: - type: file - description: Index for gzipped VCF file containing variants - pattern: "*.{vcf.gz.tbi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.candidate_sv.vcf.gz.tbi": + type: file + description: Index for gzipped VCF file containing variants + pattern: "*.{vcf.gz.tbi}" - diploid_sv_vcf: - type: file - description: Gzipped VCF file containing variants - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.diploid_sv.vcf.gz": + type: file + description: Gzipped VCF file containing variants + pattern: "*.{vcf.gz}" - diploid_sv_vcf_tbi: - type: file - description: Index for gzipped VCF file containing variants - pattern: "*.{vcf.gz.tbi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.diploid_sv.vcf.gz.tbi": + type: file + description: Index for gzipped VCF file containing variants + pattern: "*.{vcf.gz.tbi}" - somatic_sv_vcf: - type: file - description: Gzipped VCF file containing variants - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.somatic_sv.vcf.gz": + type: file + description: Gzipped VCF file containing variants + pattern: "*.{vcf.gz}" - somatic_sv_vcf_tbi: - type: file - description: Index for gzipped VCF file containing variants - pattern: "*.{vcf.gz.tbi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.somatic_sv.vcf.gz.tbi": + type: file + description: Index for gzipped VCF file containing variants + pattern: "*.{vcf.gz.tbi}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" - "@nvnieuwk" diff --git a/modules/nf-core/manta/tumoronly/environment.yml b/modules/nf-core/manta/tumoronly/environment.yml index cf5db361e0..fe5ade5068 100644 --- a/modules/nf-core/manta/tumoronly/environment.yml +++ b/modules/nf-core/manta/tumoronly/environment.yml @@ -1,7 +1,5 @@ -name: manta_tumoronly channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::manta=1.6.0 diff --git a/modules/nf-core/manta/tumoronly/meta.yml b/modules/nf-core/manta/tumoronly/meta.yml index 63556c59b4..6f629b24dc 100644 --- a/modules/nf-core/manta/tumoronly/meta.yml +++ b/modules/nf-core/manta/tumoronly/meta.yml @@ -1,5 +1,7 @@ name: manta_tumoronly -description: Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs. +description: Manta calls structural variants (SVs) and indels from mapped paired-end + sequencing reads. It is optimized for analysis of germline variation in small sets + of individuals and somatic variation in tumor/normal sample pairs. 
keywords: - somatic - wgs @@ -16,84 +18,117 @@ tools: tool_dev_url: https://github.com/Illumina/manta doi: "10.1093/bioinformatics/btv710" licence: ["GPL v3"] + identifier: biotools:manta_sv input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - input_index: - type: file - description: BAM/CRAM/SAM index file - pattern: "*.{bai,crai,sai}" - - target_bed: - type: file - description: BED file containing target regions for variant calling - pattern: "*.{bed}" - - target_bed_tbi: - type: file - description: Index for BED file containing target regions for variant calling - pattern: "*.{bed.tbi}" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fasta: - type: file - description: Genome reference FASTA file - pattern: "*.{fa,fasta}" - - meta3: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fai: - type: file - description: Genome reference FASTA index file - pattern: "*.{fa.fai,fasta.fai}" - - config: - type: file - description: Manta configuration file - pattern: "*.{ini,conf,config}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - input_index: + type: file + description: BAM/CRAM/SAM index file + pattern: "*.{bai,crai,sai}" + - target_bed: + type: file + description: BED file containing target regions for variant calling + pattern: "*.{bed}" + - target_bed_tbi: + type: file + description: Index for BED file containing target regions for variant calling + pattern: "*.{bed.tbi}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. 
[ id:'genome' ] + - fasta: + type: file + description: Genome reference FASTA file + pattern: "*.{fa,fasta}" + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fai: + type: file + description: Genome reference FASTA index file + pattern: "*.{fa.fai,fasta.fai}" + - - config: + type: file + description: Manta configuration file + pattern: "*.{ini,conf,config}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - candidate_small_indels_vcf: - type: file - description: Gzipped VCF file containing variants - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*candidate_small_indels.vcf.gz": + type: file + description: Gzipped VCF file containing variants + pattern: "*.{vcf.gz}" - candidate_small_indels_vcf_tbi: - type: file - description: Index for gzipped VCF file containing variants - pattern: "*.{vcf.gz.tbi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*candidate_small_indels.vcf.gz.tbi": + type: file + description: Index for gzipped VCF file containing variants + pattern: "*.{vcf.gz.tbi}" - candidate_sv_vcf: - type: file - description: Gzipped VCF file containing variants - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*candidate_sv.vcf.gz": + type: file + description: Gzipped VCF file containing variants + pattern: "*.{vcf.gz}" - candidate_sv_vcf_tbi: - type: file - description: Index for gzipped VCF file containing variants - pattern: "*.{vcf.gz.tbi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*candidate_sv.vcf.gz.tbi": + type: file + description: Index for gzipped VCF file containing variants + pattern: "*.{vcf.gz.tbi}" - tumor_sv_vcf: - type: file - description: Gzipped VCF file containing variants - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*tumor_sv.vcf.gz": + type: file + description: Gzipped VCF file containing variants + pattern: "*.{vcf.gz}" - tumor_sv_vcf_tbi: - type: file - description: Index for gzipped VCF file containing variants - pattern: "*.{vcf.gz.tbi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*tumor_sv.vcf.gz.tbi": + type: file + description: Index for gzipped VCF file containing variants + pattern: "*.{vcf.gz.tbi}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@maxulysse" - "@nvnieuwk" diff --git a/modules/nf-core/mosdepth/environment.yml b/modules/nf-core/mosdepth/environment.yml index bcb9d64a7a..e937987380 100644 --- a/modules/nf-core/mosdepth/environment.yml +++ b/modules/nf-core/mosdepth/environment.yml @@ -1,8 +1,6 @@ -name: mosdepth channels: - conda-forge - bioconda - - defaults dependencies: # renovate: datasource=conda depName=bioconda/mosdepth - mosdepth=0.3.8 diff --git a/modules/nf-core/mosdepth/meta.yml b/modules/nf-core/mosdepth/meta.yml index 9caaf2cdbc..dc783c9006 100644 --- a/modules/nf-core/mosdepth/meta.yml +++ b/modules/nf-core/mosdepth/meta.yml @@ -12,91 +12,161 @@ tools: documentation: https://github.com/brentp/mosdepth doi: 10.1093/bioinformatics/btx699 licence: ["MIT"] + identifier: biotools:mosdepth input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - bam: - type: file - description: Input BAM/CRAM file - pattern: "*.{bam,cram}" - - bai: - type: file - description: Index for BAM/CRAM file - pattern: "*.{bai,crai}" - - bed: - type: file - description: BED file with intersected intervals - pattern: "*.{bed}" - - meta2: - type: map - description: | - Groovy Map containing bed information - e.g. [ id:'test' ] - - fasta: - type: file - description: Reference genome FASTA file - pattern: "*.{fa,fasta}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bam: + type: file + description: Input BAM/CRAM file + pattern: "*.{bam,cram}" + - bai: + type: file + description: Index for BAM/CRAM file + pattern: "*.{bai,crai}" + - bed: + type: file + description: BED file with intersected intervals + pattern: "*.{bed}" + - - meta2: + type: map + description: | + Groovy Map containing bed information + e.g. [ id:'test' ] + - fasta: + type: file + description: Reference genome FASTA file + pattern: "*.{fa,fasta}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - global_txt: - type: file - description: Text file with global cumulative coverage distribution - pattern: "*.{global.dist.txt}" - - regions_txt: - type: file - description: Text file with region cumulative coverage distribution - pattern: "*.{region.dist.txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.global.dist.txt": + type: file + description: Text file with global cumulative coverage distribution + pattern: "*.{global.dist.txt}" - summary_txt: - type: file - description: Text file with summary mean depths per chromosome and regions - pattern: "*.{summary.txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.summary.txt": + type: file + description: Text file with summary mean depths per chromosome and regions + pattern: "*.{summary.txt}" + - regions_txt: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.region.dist.txt": + type: file + description: Text file with region cumulative coverage distribution + pattern: "*.{region.dist.txt}" + - per_base_d4: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.per-base.d4": + type: file + description: D4 file with per-base coverage + pattern: "*.{per-base.d4}" - per_base_bed: - type: file - description: BED file with per-base coverage - pattern: "*.{per-base.bed.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.per-base.bed.gz": + type: file + description: BED file with per-base coverage + pattern: "*.{per-base.bed.gz}" - per_base_csi: - type: file - description: Index file for BED file with per-base coverage - pattern: "*.{per-base.bed.gz.csi}" - - per_base_d4: - type: file - description: D4 file with per-base coverage - pattern: "*.{per-base.d4}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.per-base.bed.gz.csi": + type: file + description: Index file for BED file with per-base coverage + pattern: "*.{per-base.bed.gz.csi}" - regions_bed: - type: file - description: BED file with per-region coverage - pattern: "*.{regions.bed.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.regions.bed.gz": + type: file + description: BED file with per-region coverage + pattern: "*.{regions.bed.gz}" - regions_csi: - type: file - description: Index file for BED file with per-region coverage - pattern: "*.{regions.bed.gz.csi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.regions.bed.gz.csi": + type: file + description: Index file for BED file with per-region coverage + pattern: "*.{regions.bed.gz.csi}" - quantized_bed: - type: file - description: BED file with binned coverage - pattern: "*.{quantized.bed.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.quantized.bed.gz": + type: file + description: BED file with binned coverage + pattern: "*.{quantized.bed.gz}" - quantized_csi: - type: file - description: Index file for BED file with binned coverage - pattern: "*.{quantized.bed.gz.csi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.quantized.bed.gz.csi": + type: file + description: Index file for BED file with binned coverage + pattern: "*.{quantized.bed.gz.csi}" - thresholds_bed: - type: file - description: BED file with the number of bases in each region that are covered at or above each threshold - pattern: "*.{thresholds.bed.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.thresholds.bed.gz": + type: file + description: BED file with the number of bases in each region that are covered + at or above each threshold + pattern: "*.{thresholds.bed.gz}" - thresholds_csi: - type: file - description: Index file for BED file with threshold coverage - pattern: "*.{thresholds.bed.gz.csi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.thresholds.bed.gz.csi": + type: file + description: Index file for BED file with threshold coverage + pattern: "*.{thresholds.bed.gz.csi}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/mosdepth/tests/main.nf.test b/modules/nf-core/mosdepth/tests/main.nf.test index 21eebc1fa5..0b3c860d32 100644 --- a/modules/nf-core/mosdepth/tests/main.nf.test +++ b/modules/nf-core/mosdepth/tests/main.nf.test @@ -15,8 +15,8 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:true ], - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam_bai'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), [] ] input[1] = [[],[]] @@ -40,9 +40,9 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:true ], - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam_bai'], checkIfExists: true), - file(params.test_data['homo_sapiens']['genome']['genome_bed'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', 
checkIfExists: true) ] input[1] = [[],[]] """ @@ -65,13 +65,13 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:true ], - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_cram'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_cram_crai'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), [] ] input[1] = [ [ id:'test' ], - file(params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) ] """ } @@ -93,13 +93,13 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:true ], - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_cram'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_cram_crai'], checkIfExists: true), - file(params.test_data['homo_sapiens']['genome']['genome_bed'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true) ] input[1] = [ [ id:'test' ], - file(params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) ] """ } @@ -122,8 +122,8 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:true ], - 
file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam_bai'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), [] ] input[1] = [[],[]] @@ -148,8 +148,8 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:true ], - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam_bai'], checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), [] ] input[1] = [[],[]] @@ -174,9 +174,9 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:true ], - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam_bai'], checkIfExists: true), - file(params.test_data['homo_sapiens']['genome']['genome_bed'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true) ] input[1] = [[],[]] """ @@ -200,9 +200,9 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:true ], - 
file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam_bai'], checkIfExists: true), - file(params.test_data['homo_sapiens']['genome']['genome_bed'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true) ] input[1] = [[],[]] """ @@ -225,9 +225,9 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:true ], - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_bam_bai'], checkIfExists: true), - file(params.test_data['homo_sapiens']['genome']['genome_bed'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true) ] input[1] = [[],[]] """ diff --git a/modules/nf-core/msisensorpro/msisomatic/environment.yml b/modules/nf-core/msisensorpro/msisomatic/environment.yml index 147a9d6b85..f67b9b733e 100644 --- a/modules/nf-core/msisensorpro/msisomatic/environment.yml +++ b/modules/nf-core/msisensorpro/msisomatic/environment.yml @@ -1,7 +1,5 @@ -name: msisensorpro_msisomatic channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::msisensor-pro=1.2.0 diff --git a/modules/nf-core/msisensorpro/msisomatic/meta.yml 
b/modules/nf-core/msisensorpro/msisomatic/meta.yml index a6dda66ff2..48daa6cc64 100644 --- a/modules/nf-core/msisensorpro/msisomatic/meta.yml +++ b/modules/nf-core/msisensorpro/msisomatic/meta.yml @@ -1,5 +1,7 @@ name: msisensorpro_msisomatic -description: MSIsensor-pro evaluates Microsatellite Instability (MSI) for cancer patients with next generation sequencing data. It accepts the whole genome sequencing, whole exome sequencing and target region (panel) sequencing data as input +description: MSIsensor-pro evaluates Microsatellite Instability (MSI) for cancer patients + with next generation sequencing data. It accepts the whole genome sequencing, whole + exome sequencing and target region (panel) sequencing data as input keywords: - micro-satellite-scan - msisensor-pro @@ -7,72 +9,91 @@ keywords: - somatic tools: - msisensorpro: - description: Microsatellite Instability (MSI) detection using high-throughput sequencing data. + description: Microsatellite Instability (MSI) detection using high-throughput + sequencing data. homepage: https://github.com/xjtu-omics/msisensor-pro documentation: https://github.com/xjtu-omics/msisensor-pro/wiki tool_dev_url: https://github.com/xjtu-omics/msisensor-pro doi: "10.1016/j.gpb.2020.02.001" licence: ["Custom Licence"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - normal: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - normal_index: - type: file - description: BAM/CRAM/SAM index file - pattern: "*.{bai,crai,sai}" - - tumor: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - tumor_index: - type: file - description: BAM/CRAM/SAM index file - pattern: "*.{bai,crai,sai}" - - intervals: - type: file - description: bed file containing interval information, optional - pattern: "*.{bed}" - - fasta: - type: file - description: Reference genome - pattern: "*.{fasta}" - - msisensor_scan: - type: file - description: Output from msisensor-pro/scan, conaining list of msi regions - pattern: "*.list" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - normal: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - normal_index: + type: file + description: BAM/CRAM/SAM index file + pattern: "*.{bai,crai,sai}" + - tumor: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - tumor_index: + type: file + description: BAM/CRAM/SAM index file + pattern: "*.{bai,crai,sai}" + - intervals: + type: file + description: bed file containing interval information, optional + pattern: "*.{bed}" + - - fasta: + type: file + description: Reference genome + pattern: "*.{fasta}" + - - msisensor_scan: + type: file + description: Output from msisensor-pro/scan, containing list of msi regions + pattern: "*.list" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - output_report: - type: file - description: File containing final report with all detected microsatellites, unstable somatic microsatellites, msi score + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}: + type: file + description: File containing final report with all detected microsatellites, + unstable somatic microsatellites, msi score - output_dis: - type: file - description: File containing distribution results + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}_dis: + type: file + description: File containing distribution results - output_germline: - type: file - description: File containing germline results + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}_germline: + type: file + description: File containing germline results - output_somatic: - type: file - description: File containing somatic results + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}_somatic: + type: file + description: File containing somatic results - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - - list: - type: file - description: File containing microsatellite list - pattern: "*.{list}" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" maintainers: diff --git a/modules/nf-core/msisensorpro/scan/environment.yml b/modules/nf-core/msisensorpro/scan/environment.yml index 377c28a61b..f67b9b733e 100644 --- a/modules/nf-core/msisensorpro/scan/environment.yml +++ b/modules/nf-core/msisensorpro/scan/environment.yml @@ -1,7 +1,5 @@ -name: msisensorpro_scan channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::msisensor-pro=1.2.0 diff --git a/modules/nf-core/msisensorpro/scan/main.nf b/modules/nf-core/msisensorpro/scan/main.nf index 9c7dce2596..265e6a132a 100644 --- a/modules/nf-core/msisensorpro/scan/main.nf +++ 
b/modules/nf-core/msisensorpro/scan/main.nf @@ -32,4 +32,15 @@ process MSISENSORPRO_SCAN { msisensor-pro: \$(msisensor-pro 2>&1 | sed -nE 's/Version:\\sv([0-9]\\.[0-9])/\\1/ p') END_VERSIONS """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + """ + touch ${prefix}.msisensor_scan.list + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + msisensor-pro: \$(msisensor-pro 2>&1 | sed -nE 's/Version:\\sv([0-9]\\.[0-9])/\\1/ p') + END_VERSIONS + """ } diff --git a/modules/nf-core/msisensorpro/scan/meta.yml b/modules/nf-core/msisensorpro/scan/meta.yml index aec743ede5..ac9674b838 100644 --- a/modules/nf-core/msisensorpro/scan/meta.yml +++ b/modules/nf-core/msisensorpro/scan/meta.yml @@ -1,41 +1,47 @@ name: msisensorpro_scan -description: MSIsensor-pro evaluates Microsatellite Instability (MSI) for cancer patients with next generation sequencing data. It accepts the whole genome sequencing, whole exome sequencing and target region (panel) sequencing data as input +description: MSIsensor-pro evaluates Microsatellite Instability (MSI) for cancer patients + with next generation sequencing data. It accepts the whole genome sequencing, whole + exome sequencing and target region (panel) sequencing data as input keywords: - micro-satellite-scan - msisensor-pro - scan tools: - msisensorpro: - description: Microsatellite Instability (MSI) detection using high-throughput sequencing data. + description: Microsatellite Instability (MSI) detection using high-throughput + sequencing data. homepage: https://github.com/xjtu-omics/msisensor-pro documentation: https://github.com/xjtu-omics/msisensor-pro/wiki tool_dev_url: https://github.com/xjtu-omics/msisensor-pro doi: "10.1016/j.gpb.2020.02.001" licence: ["Custom Licence"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - fasta: - type: file - description: Reference genome - pattern: "*.{fasta}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: Reference genome + pattern: "*.{fasta}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - list: - type: file - description: File containing microsatellite list - pattern: "*.{list}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.list": + type: file + description: File containing microsatellite list + pattern: "*.{list}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" maintainers: diff --git a/modules/nf-core/msisensorpro/scan/tests/main.nf.test b/modules/nf-core/msisensorpro/scan/tests/main.nf.test new file mode 100644 index 0000000000..38334ac221 --- /dev/null +++ b/modules/nf-core/msisensorpro/scan/tests/main.nf.test @@ -0,0 +1,58 @@ + +nextflow_process { + + name "Test Process MSISENSORPRO_SCAN" + script "../main.nf" + process "MSISENSORPRO_SCAN" + + tag "modules" + tag "modules_nfcore" + tag "msisensorpro" + tag "msisensorpro/scan" + + test("test-msisensorpro-scan") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) + ] + + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test-msisensorpro-scan-stub") { + options '-stub' + + when { + process { + """ + input[0] = [ + [ id:'test', 
single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) + ] + + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + +} diff --git a/modules/nf-core/msisensorpro/scan/tests/main.nf.test.snap b/modules/nf-core/msisensorpro/scan/tests/main.nf.test.snap new file mode 100644 index 0000000000..f7ea66fc84 --- /dev/null +++ b/modules/nf-core/msisensorpro/scan/tests/main.nf.test.snap @@ -0,0 +1,72 @@ +{ + "test-msisensorpro-scan-stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.msisensor_scan.list:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,e99820cdb69a600f5919ee1d7d5d1c3f" + ], + "list": [ + [ + { + "id": "test", + "single_end": false + }, + "test.msisensor_scan.list:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,e99820cdb69a600f5919ee1d7d5d1c3f" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-05T16:44:21.450285" + }, + "test-msisensorpro-scan": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.msisensor_scan.list:md5,309d41b136993db24a9f3dade877753b" + ] + ], + "1": [ + "versions.yml:md5,e99820cdb69a600f5919ee1d7d5d1c3f" + ], + "list": [ + [ + { + "id": "test", + "single_end": false + }, + "test.msisensor_scan.list:md5,309d41b136993db24a9f3dade877753b" + ] + ], + "versions": [ + "versions.yml:md5,e99820cdb69a600f5919ee1d7d5d1c3f" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-05T16:44:09.684249" + } +} \ No newline at end of file diff --git a/modules/nf-core/multiqc/environment.yml b/modules/nf-core/multiqc/environment.yml index ca39fb67e2..6f5b867b76 100644 --- a/modules/nf-core/multiqc/environment.yml +++ b/modules/nf-core/multiqc/environment.yml @@ 
-1,7 +1,5 @@ -name: multiqc channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::multiqc=1.21 + - bioconda::multiqc=1.25.1 diff --git a/modules/nf-core/multiqc/main.nf b/modules/nf-core/multiqc/main.nf index 47ac352f94..cc0643e1d5 100644 --- a/modules/nf-core/multiqc/main.nf +++ b/modules/nf-core/multiqc/main.nf @@ -3,14 +3,16 @@ process MULTIQC { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/multiqc:1.21--pyhdfd78af_0' : - 'biocontainers/multiqc:1.21--pyhdfd78af_0' }" + 'https://depot.galaxyproject.org/singularity/multiqc:1.25.1--pyhdfd78af_0' : + 'biocontainers/multiqc:1.25.1--pyhdfd78af_0' }" input: path multiqc_files, stageAs: "?/*" path(multiqc_config) path(extra_multiqc_config) path(multiqc_logo) + path(replace_names) + path(sample_names) output: path "*multiqc_report.html", emit: report @@ -23,16 +25,22 @@ process MULTIQC { script: def args = task.ext.args ?: '' + def prefix = task.ext.prefix ? "--filename ${task.ext.prefix}.html" : '' def config = multiqc_config ? "--config $multiqc_config" : '' def extra_config = extra_multiqc_config ? "--config $extra_multiqc_config" : '' - def logo = multiqc_logo ? /--cl-config 'custom_logo: "${multiqc_logo}"'/ : '' + def logo = multiqc_logo ? "--cl-config 'custom_logo: \"${multiqc_logo}\"'" : '' + def replace = replace_names ? "--replace-names ${replace_names}" : '' + def samples = sample_names ? "--sample-names ${sample_names}" : '' """ multiqc \\ --force \\ $args \\ $config \\ + $prefix \\ $extra_config \\ $logo \\ + $replace \\ + $samples \\ . 
cat <<-END_VERSIONS > versions.yml @@ -44,7 +52,7 @@ process MULTIQC { stub: """ mkdir multiqc_data - touch multiqc_plots + mkdir multiqc_plots touch multiqc_report.html cat <<-END_VERSIONS > versions.yml diff --git a/modules/nf-core/multiqc/meta.yml b/modules/nf-core/multiqc/meta.yml index 45a9bc35e1..b16c187923 100644 --- a/modules/nf-core/multiqc/meta.yml +++ b/modules/nf-core/multiqc/meta.yml @@ -1,5 +1,6 @@ name: multiqc -description: Aggregate results from bioinformatics analyses across many samples into a single report +description: Aggregate results from bioinformatics analyses across many samples into + a single report keywords: - QC - bioinformatics tools @@ -12,40 +13,59 @@ tools: homepage: https://multiqc.info/ documentation: https://multiqc.info/docs/ licence: ["GPL-3.0-or-later"] + identifier: biotools:multiqc input: - - multiqc_files: - type: file - description: | - List of reports / files recognised by MultiQC, for example the html and zip output of FastQC - - multiqc_config: - type: file - description: Optional config yml for MultiQC - pattern: "*.{yml,yaml}" - - extra_multiqc_config: - type: file - description: Second optional config yml for MultiQC. Will override common sections in multiqc_config. - pattern: "*.{yml,yaml}" - - multiqc_logo: - type: file - description: Optional logo file for MultiQC - pattern: "*.{png}" + - - multiqc_files: + type: file + description: | + List of reports / files recognised by MultiQC, for example the html and zip output of FastQC + - - multiqc_config: + type: file + description: Optional config yml for MultiQC + pattern: "*.{yml,yaml}" + - - extra_multiqc_config: + type: file + description: Second optional config yml for MultiQC. Will override common sections + in multiqc_config. + pattern: "*.{yml,yaml}" + - - multiqc_logo: + type: file + description: Optional logo file for MultiQC + pattern: "*.{png}" + - - replace_names: + type: file + description: | + Optional two-column sample renaming file. 
First column a set of + patterns, second column a set of corresponding replacements. Passed via + MultiQC's `--replace-names` option. + pattern: "*.{tsv}" + - - sample_names: + type: file + description: | + Optional TSV file with headers, passed to the MultiQC --sample_names + argument. + pattern: "*.{tsv}" output: - report: - type: file - description: MultiQC report file - pattern: "multiqc_report.html" + - "*multiqc_report.html": + type: file + description: MultiQC report file + pattern: "multiqc_report.html" - data: - type: directory - description: MultiQC data dir - pattern: "multiqc_data" + - "*_data": + type: directory + description: MultiQC data dir + pattern: "multiqc_data" - plots: - type: file - description: Plots created by MultiQC - pattern: "*_data" + - "*_plots": + type: file + description: Plots created by MultiQC + pattern: "*_data" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@abhi18av" - "@bunop" diff --git a/modules/nf-core/multiqc/tests/main.nf.test b/modules/nf-core/multiqc/tests/main.nf.test index f1c4242ef2..33316a7ddb 100644 --- a/modules/nf-core/multiqc/tests/main.nf.test +++ b/modules/nf-core/multiqc/tests/main.nf.test @@ -8,6 +8,8 @@ nextflow_process { tag "modules_nfcore" tag "multiqc" + config "./nextflow.config" + test("sarscov2 single-end [fastqc]") { when { @@ -17,6 +19,8 @@ nextflow_process { input[1] = [] input[2] = [] input[3] = [] + input[4] = [] + input[5] = [] """ } } @@ -41,6 +45,8 @@ nextflow_process { input[1] = Channel.of(file("https://github.com/nf-core/tools/raw/dev/nf_core/pipeline-template/assets/multiqc_config.yml", checkIfExists: true)) input[2] = [] input[3] = [] + input[4] = [] + input[5] = [] """ } } @@ -66,6 +72,8 @@ nextflow_process { input[1] = [] input[2] = [] input[3] = [] + input[4] = [] + input[5] = [] """ } } diff --git 
a/modules/nf-core/multiqc/tests/main.nf.test.snap b/modules/nf-core/multiqc/tests/main.nf.test.snap index bfebd80298..2fcbb5ff7d 100644 --- a/modules/nf-core/multiqc/tests/main.nf.test.snap +++ b/modules/nf-core/multiqc/tests/main.nf.test.snap @@ -2,14 +2,14 @@ "multiqc_versions_single": { "content": [ [ - "versions.yml:md5,21f35ee29416b9b3073c28733efe4b7d" + "versions.yml:md5,41f391dcedce7f93ca188f3a3ffa0916" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-29T08:48:55.657331" + "timestamp": "2024-10-02T17:51:46.317523" }, "multiqc_stub": { "content": [ @@ -17,25 +17,25 @@ "multiqc_report.html", "multiqc_data", "multiqc_plots", - "versions.yml:md5,21f35ee29416b9b3073c28733efe4b7d" + "versions.yml:md5,41f391dcedce7f93ca188f3a3ffa0916" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-29T08:49:49.071937" + "timestamp": "2024-10-02T17:52:20.680978" }, "multiqc_versions_config": { "content": [ [ - "versions.yml:md5,21f35ee29416b9b3073c28733efe4b7d" + "versions.yml:md5,41f391dcedce7f93ca188f3a3ffa0916" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-29T08:49:25.457567" + "timestamp": "2024-10-02T17:52:09.185842" } } \ No newline at end of file diff --git a/modules/nf-core/multiqc/tests/nextflow.config b/modules/nf-core/multiqc/tests/nextflow.config new file mode 100644 index 0000000000..c537a6a3e7 --- /dev/null +++ b/modules/nf-core/multiqc/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: 'MULTIQC' { + ext.prefix = null + } +} diff --git a/modules/nf-core/ngscheckmate/ncm/environment.yml b/modules/nf-core/ngscheckmate/ncm/environment.yml index bf185fc23e..117b4e5783 100644 --- a/modules/nf-core/ngscheckmate/ncm/environment.yml +++ b/modules/nf-core/ngscheckmate/ncm/environment.yml @@ -1,7 +1,5 @@ -name: 
ngscheckmate_ncm channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::ngscheckmate=1.0.1 diff --git a/modules/nf-core/ngscheckmate/ncm/meta.yml b/modules/nf-core/ngscheckmate/ncm/meta.yml index 0defad0064..06c131d6b0 100644 --- a/modules/nf-core/ngscheckmate/ncm/meta.yml +++ b/modules/nf-core/ngscheckmate/ncm/meta.yml @@ -1,70 +1,105 @@ name: ngscheckmate_ncm -description: Determining whether sequencing data comes from the same individual by using SNP matching. Designed for humans on vcf or bam files. +description: Determining whether sequencing data comes from the same individual by + using SNP matching. Designed for humans on vcf or bam files. keywords: - ngscheckmate - matching - snp tools: - ngscheckmate: - description: NGSCheckMate is a software package for identifying next generation sequencing (NGS) data files from the same individual, including matching between DNA and RNA. + description: NGSCheckMate is a software package for identifying next generation + sequencing (NGS) data files from the same individual, including matching between + DNA and RNA. homepage: https://github.com/parklab/NGSCheckMate documentation: https://github.com/parklab/NGSCheckMate tool_dev_url: https://github.com/parklab/NGSCheckMate doi: "10.1093/nar/gkx193" licence: ["MIT"] + identifier: biotools:ngscheckmate input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test'] - - files: - type: file - description: VCF or BAM files for each sample, in a merged channel (possibly gzipped). BAM files require an index too. - pattern: "*.{vcf,vcf.gz,bam,bai}" - - meta2: - type: map - description: | - Groovy Map containing SNP information - e.g. [ id:'test' ] - - snp_bed: - type: file - description: BED file containing the SNPs to analyse - pattern: "*.{bed}" - - meta3: - type: map - description: | - Groovy Map containing reference fasta index information - e.g. 
[ id:'test' ] - - fasta: - type: file - description: fasta file for the genome, only used in the bam mode - pattern: "*.{bed}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - files: + type: file + description: VCF or BAM files for each sample, in a merged channel (possibly + gzipped). BAM files require an index too. + pattern: "*.{vcf,vcf.gz,bam,bai}" + - - meta2: + type: map + description: | + Groovy Map containing SNP information + e.g. [ id:'test' ] + - snp_bed: + type: file + description: BED file containing the SNPs to analyse + pattern: "*.{bed}" + - - meta3: + type: map + description: | + Groovy Map containing reference fasta index information + e.g. [ id:'test' ] + - fasta: + type: file + description: fasta file for the genome, only used in the bam mode + pattern: "*.{bed}" output: - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - - pdf: - type: file - description: A pdf containing a dendrogram showing how the samples match up - pattern: "*.{pdf}" - corr_matrix: - type: file - description: A text file containing the correlation matrix between each sample - pattern: "*corr_matrix.txt" + - meta: + type: file + description: A text file containing the correlation matrix between each sample + pattern: "*corr_matrix.txt" + - "*_corr_matrix.txt": + type: file + description: A text file containing the correlation matrix between each sample + pattern: "*corr_matrix.txt" - matched: - type: file - description: A txt file containing only the samples that match with each other - pattern: "*matched.txt" + - meta: + type: file + description: A txt file containing only the samples that match with each other + pattern: "*matched.txt" + - "*_matched.txt": + type: file + description: A txt file containing only the samples that match with each other + pattern: "*matched.txt" - all: - type: file - description: A txt file containing all the sample comparisons, 
whether they match or not - pattern: "*all.txt" + - meta: + type: file + description: A txt file containing all the sample comparisons, whether they + match or not + pattern: "*all.txt" + - "*_all.txt": + type: file + description: A txt file containing all the sample comparisons, whether they + match or not + pattern: "*all.txt" + - pdf: + - meta: + type: file + description: A pdf containing a dendrogram showing how the samples match up + pattern: "*.{pdf}" + - "*.pdf": + type: file + description: A pdf containing a dendrogram showing how the samples match up + pattern: "*.{pdf}" - vcf: - type: file - description: If ran in bam mode, vcf files for each sample giving the SNP calls used - pattern: "*.vcf" + - meta: + type: file + description: If ran in bam mode, vcf files for each sample giving the SNP calls + used + pattern: "*.vcf" + - "*.vcf": + type: file + description: If ran in bam mode, vcf files for each sample giving the SNP calls + used + pattern: "*.vcf" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@sppearce" maintainers: diff --git a/modules/nf-core/ngscheckmate/ncm/tests/main.nf.test b/modules/nf-core/ngscheckmate/ncm/tests/main.nf.test index 9dafbdf6df..1263be3583 100644 --- a/modules/nf-core/ngscheckmate/ncm/tests/main.nf.test +++ b/modules/nf-core/ngscheckmate/ncm/tests/main.nf.test @@ -19,8 +19,8 @@ nextflow_process { process { """ input[0] = [ [ id:'test' ], - file(params.test_data['sarscov2']['genome']['test_bed'], checkIfExists: true) - ] + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test.bed', checkIfExists: true) + ] """ } } @@ -30,13 +30,13 @@ nextflow_process { process { """ input[0] = [ - [ id:'test1' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['sarscov2']['genome']['test_bed'], checkIfExists: true) - ] + [ id:'test1' ], // meta map + 
file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test.bed', checkIfExists: true) + ] input[1] = [ [ id:'sarscov2' ], - file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) - ] + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] input[2] = false """ } @@ -47,13 +47,13 @@ nextflow_process { process { """ input[0] = [ - [ id:'test2' ], // meta map - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam'], checkIfExists: true), - file(params.test_data['sarscov2']['genome']['test_bed'], checkIfExists: true) - ] + [ id:'test2' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test.bed', checkIfExists: true) + ] input[1] = [ [ id:'sarscov2' ], - file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) - ] + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] input[2] = false """ } @@ -68,16 +68,17 @@ nextflow_process { process { """ input[0] = [ [ id: 'combined_bams' ], - [file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam'] , checkIfExists: true ), - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam_bai'] , checkIfExists: true ), - file(params.test_data['sarscov2']['illumina']['test_paired_end_methylated_sorted_bam'] , checkIfExists: true ), - file(params.test_data['sarscov2']['illumina']['test_paired_end_methylated_sorted_bam_bai'], checkIfExists: true ) - ] - ] + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 
'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.methylated.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.methylated.sorted.bam.bai', checkIfExists: true) + ] + ] input[1] = BEDTOOLS_MAKEWINDOWS.out.bed input[2] = [ [ id:'sarscov2' ], - file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) - ] + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] """ } } @@ -86,9 +87,9 @@ nextflow_process { assertAll( { assert process.success }, { assert snapshot( - process.out.corr_matrix, - process.out.matched, - process.out.all, + file(process.out.corr_matrix[0][1]).name, // Content md5 not stable under conda + file(process.out.matched[0][1]).name, // Content md5 not stable under conda + file(process.out.all[0][1]).name, // Content md5 not stable under conda process.out.versions ).match() } ) @@ -104,8 +105,8 @@ nextflow_process { input[0] = BCFTOOLS_MPILEUP1.out.vcf.combine(BCFTOOLS_MPILEUP2.out.vcf.map{it[1]}).map{meta, one, two -> [meta, [one, two]]}.map{meta, stuff -> [meta, stuff.flatten()]} input[1] = BEDTOOLS_MAKEWINDOWS.out.bed input[2] = [ [ id:'sarscov2' ], - file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) - ] + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] """ } } @@ -133,16 +134,17 @@ nextflow_process { process { """ input[0] = [ [ id: 'combined_bams' ], - [file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam'] , checkIfExists: true ), - file(params.test_data['sarscov2']['illumina']['test_paired_end_sorted_bam_bai'] , checkIfExists: true ), - file(params.test_data['sarscov2']['illumina']['test_paired_end_methylated_sorted_bam'] , checkIfExists: true ), - 
file(params.test_data['sarscov2']['illumina']['test_paired_end_methylated_sorted_bam_bai'], checkIfExists: true ) - ] - ] + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.methylated.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.methylated.sorted.bam.bai', checkIfExists: true) + ] + ] input[1] = BEDTOOLS_MAKEWINDOWS.out.bed input[2] = [ [ id:'sarscov2' ], - file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) - ] + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] """ } } @@ -166,8 +168,8 @@ nextflow_process { input[0] = BCFTOOLS_MPILEUP1.out.vcf.combine(BCFTOOLS_MPILEUP2.out.vcf.map{it[1]}) input[1] = BEDTOOLS_MAKEWINDOWS.out.bed input[2] = [ [ id:'sarscov2' ], - file(params.test_data['sarscov2']['genome']['genome_fasta'], checkIfExists: true) - ] + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] """ } } diff --git a/modules/nf-core/ngscheckmate/ncm/tests/main.nf.test.snap b/modules/nf-core/ngscheckmate/ncm/tests/main.nf.test.snap index 46a98ccc98..e78cecc8f5 100644 --- a/modules/nf-core/ngscheckmate/ncm/tests/main.nf.test.snap +++ b/modules/nf-core/ngscheckmate/ncm/tests/main.nf.test.snap @@ -125,39 +125,18 @@ }, "sarscov2 - bam": { "content": [ - [ - [ - { - "id": "combined_bams" - }, - "combined_bams_output_corr_matrix.txt:md5,b8bfd203232680b746ac91ccb290b5e3" - ] - ], - [ - [ - { - "id": "combined_bams" - }, - "combined_bams_matched.txt:md5,14d0b35765e127aab0ffa5ea5406b4ab" - ] - ], - [ - [ - { - "id": "combined_bams" - }, - 
"combined_bams_all.txt:md5,14d0b35765e127aab0ffa5ea5406b4ab" - ] - ], + "combined_bams_output_corr_matrix.txt", + "combined_bams_matched.txt", + "combined_bams_all.txt", [ "versions.yml:md5,7ac92d9cbf4fc44b3253832f3a8b2a80" ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "24.04.1" + "nextflow": "24.04.3" }, - "timestamp": "2024-05-22T16:04:35.032277154" + "timestamp": "2024-08-06T11:05:56.136156" }, "sarscov2 - vcf - stub": { "content": [ diff --git a/modules/nf-core/samblaster/environment.yml b/modules/nf-core/samblaster/environment.yml index ac83824150..fc8cd9e515 100644 --- a/modules/nf-core/samblaster/environment.yml +++ b/modules/nf-core/samblaster/environment.yml @@ -1,9 +1,6 @@ -name: samblaster - channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::htslib=1.19.1 diff --git a/modules/nf-core/samblaster/meta.yml b/modules/nf-core/samblaster/meta.yml index 5c1e5a9700..5faf3a6c3a 100644 --- a/modules/nf-core/samblaster/meta.yml +++ b/modules/nf-core/samblaster/meta.yml @@ -23,30 +23,33 @@ tools: tool_dev_url: https://github.com/GregoryFaust/samblaster doi: "10.1093/bioinformatics/btu314" licence: ["MIT"] + identifier: biotools:samblaster input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: BAM file - pattern: "*.bam" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bam: + type: file + description: BAM file + pattern: "*.bam" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - bam: - type: file - description: Tagged or filtered BAM file - pattern: "*.bam" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.bam": + type: file + description: Tagged or filtered BAM file + pattern: "*.bam" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@lescai" maintainers: diff --git a/modules/nf-core/samblaster/tests/main.nf.test b/modules/nf-core/samblaster/tests/main.nf.test index 0179430715..20d505bbef 100644 --- a/modules/nf-core/samblaster/tests/main.nf.test +++ b/modules/nf-core/samblaster/tests/main.nf.test @@ -15,7 +15,7 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:false ], // meta map - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_umi_unsorted_bam'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/umi/test.paired_end.umi_unsorted.bam', checkIfExists: true) ] """ } @@ -24,7 +24,12 @@ nextflow_process { then { assertAll( { assert process.success }, - { assert snapshot(process.out).match() } + { assert snapshot( + bam(process.out.bam[0][1]).getHeaderMD5(), + bam(process.out.bam[0][1]).getReadsMD5(), + process.out.versions + ).match() + } ) } @@ -39,7 +44,7 @@ nextflow_process { """ input[0] = [ [ id:'test', single_end:false ], // meta map - file(params.test_data['homo_sapiens']['illumina']['test_paired_end_umi_unsorted_bam'], checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/umi/test.paired_end.umi_unsorted.bam', checkIfExists: true) ] """ } diff --git a/modules/nf-core/samblaster/tests/main.nf.test.snap b/modules/nf-core/samblaster/tests/main.nf.test.snap index 917c8f1fb1..1a1481a1e3 100644 --- a/modules/nf-core/samblaster/tests/main.nf.test.snap +++ b/modules/nf-core/samblaster/tests/main.nf.test.snap @@ -36,37 +36,16 @@ }, "homo_sapiens-test_paired_end_umi_unsorted_bam": { "content": [ - { - "0": [ - [ - { - "id": "test", - "single_end": false - }, - "test.bam:md5,634a6bd541478e970f0a4c279f399889" - ] - ], - "1": [ 
- "versions.yml:md5,8a70467f2dfc2e0d8e81787223d2fc77" - ], - "bam": [ - [ - { - "id": "test", - "single_end": false - }, - "test.bam:md5,634a6bd541478e970f0a4c279f399889" - ] - ], - "versions": [ - "versions.yml:md5,8a70467f2dfc2e0d8e81787223d2fc77" - ] - } + "e21efac7a4734b9b16f7210901cf02af", + "c1b74864f32583faf7d9bcd82217ff4c", + [ + "versions.yml:md5,8a70467f2dfc2e0d8e81787223d2fc77" + ] ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.3" }, - "timestamp": "2024-02-26T14:04:38.118875" + "timestamp": "2024-07-29T20:15:04.264504" } } \ No newline at end of file diff --git a/modules/nf-core/samtools/bam2fq/environment.yml b/modules/nf-core/samtools/bam2fq/environment.yml index 9c98946adc..62054fc97a 100644 --- a/modules/nf-core/samtools/bam2fq/environment.yml +++ b/modules/nf-core/samtools/bam2fq/environment.yml @@ -1,8 +1,8 @@ -name: samtools_bam2fq +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::samtools=1.20 - - bioconda::htslib=1.20 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git a/modules/nf-core/samtools/bam2fq/main.nf b/modules/nf-core/samtools/bam2fq/main.nf index a34a13a984..1d3049e565 100644 --- a/modules/nf-core/samtools/bam2fq/main.nf +++ b/modules/nf-core/samtools/bam2fq/main.nf @@ -4,8 +4,8 @@ process SAMTOOLS_BAM2FQ { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/samtools:1.20--h50ea8bc_0' : - 'biocontainers/samtools:1.20--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta), path(inputbam) diff --git a/modules/nf-core/samtools/bam2fq/meta.yml b/modules/nf-core/samtools/bam2fq/meta.yml index 7769046b54..b17ed608d0 100644 --- a/modules/nf-core/samtools/bam2fq/meta.yml +++ b/modules/nf-core/samtools/bam2fq/meta.yml @@ -11,40 +11,43 @@ tools: description: Tools for dealing with SAM, BAM and CRAM files documentation: http://www.htslib.org/doc/1.1/samtools.html licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - inputbam: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - split: - type: boolean - description: | - TRUE/FALSE value to indicate if reads should be separated into - /1, /2 and if present other, or singleton. - Note: choosing TRUE will generate 4 different files. - Choosing FALSE will produce a single file, which will be interleaved in case - the input contains paired reads. + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - inputbam: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - - split: + type: boolean + description: | + TRUE/FALSE value to indicate if reads should be separated into + /1, /2 and if present other, or singleton. + Note: choosing TRUE will generate 4 different files. + Choosing FALSE will produce a single file, which will be interleaved in case + the input contains paired reads. output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - reads: - type: file - description: | - FASTQ files, which will be either a group of 4 files (read_1, read_2, other and singleton) - or a single interleaved .fq.gz file if the user chooses not to split the reads. - pattern: "*.fq.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.fq.gz": + type: file + description: | + FASTQ files, which will be either a group of 4 files (read_1, read_2, other and singleton) + or a single interleaved .fq.gz file if the user chooses not to split the reads. + pattern: "*.fq.gz" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@lescai" maintainers: diff --git a/modules/nf-core/samtools/bam2fq/tests/main.nf.test.snap b/modules/nf-core/samtools/bam2fq/tests/main.nf.test.snap index aa0f8c34d1..17ea1d4f90 100644 --- a/modules/nf-core/samtools/bam2fq/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/bam2fq/tests/main.nf.test.snap @@ -51,25 +51,25 @@ "bam_versions": { "content": [ [ - "versions.yml:md5,90c1cf8971540ef05e330db5a560195c" + "versions.yml:md5,afea8d3a8a729d71eac7d7e034012aee" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:43:52.415053401" + "timestamp": "2024-09-16T07:46:07.695175929" }, "bam_split_versions": { "content": [ [ - "versions.yml:md5,90c1cf8971540ef05e330db5a560195c" + "versions.yml:md5,afea8d3a8a729d71eac7d7e034012aee" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:43:57.740193087" + "timestamp": "2024-09-16T07:46:17.91793415" } } \ No newline at end of file diff --git a/modules/nf-core/samtools/collatefastq/environment.yml 
b/modules/nf-core/samtools/collatefastq/environment.yml index 93734c5670..62054fc97a 100644 --- a/modules/nf-core/samtools/collatefastq/environment.yml +++ b/modules/nf-core/samtools/collatefastq/environment.yml @@ -1,8 +1,8 @@ -name: samtools_collatefastq +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::samtools=1.20 - - bioconda::htslib=1.20 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git a/modules/nf-core/samtools/collatefastq/main.nf b/modules/nf-core/samtools/collatefastq/main.nf index a75f56e8c6..8b70ebd345 100644 --- a/modules/nf-core/samtools/collatefastq/main.nf +++ b/modules/nf-core/samtools/collatefastq/main.nf @@ -4,8 +4,8 @@ process SAMTOOLS_COLLATEFASTQ { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/samtools:1.20--h50ea8bc_0' : - 'biocontainers/samtools:1.20--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta), path(input) @@ -13,11 +13,11 @@ process SAMTOOLS_COLLATEFASTQ { val(interleave) output: - tuple val(meta), path("*_{1,2}.fq.gz") , optional:true, emit: fastq - tuple val(meta), path("*_interleaved.fq.gz") , optional:true, emit: fastq_interleaved - tuple val(meta), path("*_other.fq.gz") , emit: fastq_other - tuple val(meta), path("*_singleton.fq.gz") , optional:true, emit: fastq_singleton - path "versions.yml" , emit: versions + tuple val(meta), path("*_{1,2}.fq.gz") , optional:true, emit: fastq + tuple val(meta), path("*_interleaved.fq") , optional:true, emit: fastq_interleaved + tuple val(meta), path("*_other.fq.gz") , emit: fastq_other + tuple val(meta), path("*_singleton.fq.gz") , optional:true, emit: fastq_singleton + path 
"versions.yml" , emit: versions when: task.ext.when == null || task.ext.when @@ -27,7 +27,7 @@ process SAMTOOLS_COLLATEFASTQ { def args2 = task.ext.args2 ?: '' def prefix = task.ext.prefix ?: "${meta.id}" def reference = fasta ? "--reference ${fasta}" : "" - def output = (interleave && ! meta.single_end) ? "> ${prefix}_interleaved.fq.gz" : + def output = (interleave && ! meta.single_end) ? "> ${prefix}_interleaved.fq" : meta.single_end ? "-1 ${prefix}_1.fq.gz -s ${prefix}_singleton.fq.gz" : "-1 ${prefix}_1.fq.gz -2 ${prefix}_2.fq.gz -s ${prefix}_singleton.fq.gz" @@ -52,4 +52,25 @@ process SAMTOOLS_COLLATEFASTQ { samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') END_VERSIONS """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + def empty = "echo '' | gzip " + def singletoncommand = "${empty}> ${prefix}_singleton.fq.gz" + def interleavecommand = interleave && !meta.single_end ? "${empty}> ${prefix}_interleaved.fq.gz" : "" + def output1command = !interleave ? "${empty}> ${prefix}_1.fq.gz" : "" + def output2command = !interleave && !meta.single_end ? "${empty}> ${prefix}_2.fq.gz" : "" + + """ + ${output1command} + ${output2command} + ${interleavecommand} + ${singletoncommand} + ${empty}> ${prefix}_other.fq.gz + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + END_VERSIONS + """ } diff --git a/modules/nf-core/samtools/collatefastq/meta.yml b/modules/nf-core/samtools/collatefastq/meta.yml index 898cdbdad7..5bc912496e 100644 --- a/modules/nf-core/samtools/collatefastq/meta.yml +++ b/modules/nf-core/samtools/collatefastq/meta.yml @@ -11,61 +11,90 @@ tools: description: Tools for dealing with SAM, BAM and CRAM files documentation: http://www.htslib.org/doc/1.1/samtools.html licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'test' ] - - fasta: - type: file - description: Reference genome fasta file - pattern: "*.{fasta,fa}" - - interleave: - type: boolean - description: | - If true, the output is a single interleaved paired-end FASTQ - If false, the output split paired-end FASTQ - default: false + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test' ] + - fasta: + type: file + description: Reference genome fasta file + pattern: "*.{fasta,fa}" + - - interleave: + type: boolean + description: | + If true, the output is a single interleaved paired-end FASTQ + If false, the output split paired-end FASTQ + default: false output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - fastq: - type: file - description: | - R1 and R2 FASTQ files - pattern: "*_{1,2}.fq.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + pattern: "*_{1,2}.fq.gz" + - "*_{1,2}.fq.gz": + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + pattern: "*_{1,2}.fq.gz" - fastq_interleaved: - type: file - description: | - Interleaved paired end FASTQ files - pattern: "*_interleaved.fq.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + pattern: "*_interleaved.fq.gz" + - "*_interleaved.fq": + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + pattern: "*_interleaved.fq.gz" - fastq_other: - type: file - description: | - FASTQ files with reads where the READ1 and READ2 FLAG bits set are either both set or both unset. - pattern: "*_other.fq.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + pattern: "*_other.fq.gz" + - "*_other.fq.gz": + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + pattern: "*_other.fq.gz" - fastq_singleton: - type: file - description: | - FASTQ files with singleton reads. - pattern: "*_singleton.fq.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + pattern: "*_singleton.fq.gz" + - "*_singleton.fq.gz": + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + pattern: "*_singleton.fq.gz" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@lescai" - "@maxulysse" diff --git a/modules/nf-core/samtools/collatefastq/tests/main.nf.test b/modules/nf-core/samtools/collatefastq/tests/main.nf.test new file mode 100644 index 0000000000..bc66ebf6e3 --- /dev/null +++ b/modules/nf-core/samtools/collatefastq/tests/main.nf.test @@ -0,0 +1,242 @@ +nextflow_process { + + name "Test Process SAMTOOLS_COLLATEFASTQ" + script "../main.nf" + process "SAMTOOLS_COLLATEFASTQ" + + tag "modules" + tag "modules_nfcore" + tag "samtools" + tag "samtools/collatefastq" + + test("human - bam - paired_end") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true) + ] + input[1] = [ + [ id:'test' ], 
// meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) + ] + input[2] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out.fastq, + process.out.versions).match() } + ) + } + + } + + test("human - bam - single_end") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:true ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.single_end.bam', checkIfExists: true) + ] + input[1] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + input[2] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out.fastq_other, + process.out.versions).match() } + ) + } + + } + + test("human - cram") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true) + ] + input[1] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) + ] + input[2] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out.fastq, + process.out.versions).match() } + ) + } + + } + + test("human - bam - paired_end - interleaved") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true) + ] + input[1] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) + ] + input[2] = true + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert 
snapshot(process.out.fastq_interleaved, + process.out.fastq_singleton, + process.out.fastq, + process.out.versions).match() } + ) + } + + } + + test("human - bam - paired_end -stub") { + options "-stub" + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true) + ] + input[1] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) + ] + input[2] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out.fastq, + process.out.versions).match() } + ) + } + + } + + test("human - bam - single_end - stub") { + options "-stub" + when { + process { + """ + input[0] = [ + [ id:'test', single_end:true ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.single_end.bam', checkIfExists: true) + ] + input[1] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + input[2] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out.fastq_other, + process.out.versions).match() } + ) + } + + } + + test("human - cram - stub") { + options "-stub" + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true) + ] + input[1] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) + ] + input[2] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out.fastq, + process.out.versions).match() } + ) + } + + } + + test("human - bam - 
paired_end - interleaved - stub") { + options "-stub" + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true) + ] + input[1] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) + ] + input[2] = true + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out.fastq_interleaved, + process.out.fastq_singleton, + process.out.fastq, + process.out.versions).match() } + ) + } + + } + + + +} diff --git a/modules/nf-core/samtools/collatefastq/tests/main.nf.test.snap b/modules/nf-core/samtools/collatefastq/tests/main.nf.test.snap new file mode 100644 index 0000000000..26f5293699 --- /dev/null +++ b/modules/nf-core/samtools/collatefastq/tests/main.nf.test.snap @@ -0,0 +1,194 @@ +{ + "human - bam - paired_end - interleaved - stub": { + "content": [ + [ + + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test_singleton.fq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + [ + + ], + [ + "versions.yml:md5,6359bf4b480f99e14e14b9e7dcb2fd5a" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-16T07:50:35.483719031" + }, + "human - bam - single_end": { + "content": [ + [ + [ + { + "id": "test", + "single_end": true + }, + "test_other.fq.gz:md5,a6c101a06b5c9d5f8b91c0acd4ac5045" + ] + ], + [ + "versions.yml:md5,6359bf4b480f99e14e14b9e7dcb2fd5a" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-16T07:49:31.667042764" + }, + "human - bam - paired_end -stub": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + [ + 
"versions.yml:md5,6359bf4b480f99e14e14b9e7dcb2fd5a" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-16T07:50:04.607920717" + }, + "human - cram - stub": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test_2.fq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + [ + "versions.yml:md5,6359bf4b480f99e14e14b9e7dcb2fd5a" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-16T07:50:25.354607106" + }, + "human - cram": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fq.gz:md5,1cf671980643af6c1148ae5e8e94e350", + "test_2.fq.gz:md5,38c1e9829115f9025f95435c5a4373d3" + ] + ] + ], + [ + "versions.yml:md5,6359bf4b480f99e14e14b9e7dcb2fd5a" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-16T07:49:43.12646213" + }, + "human - bam - paired_end - interleaved": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test_interleaved.fq:md5,4f2b93d492f0442fa89b02532c9b3530" + ] + ], + [ + + ], + [ + + ], + [ + "versions.yml:md5,6359bf4b480f99e14e14b9e7dcb2fd5a" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-16T07:49:53.799565712" + }, + "human - bam - single_end - stub": { + "content": [ + [ + [ + { + "id": "test", + "single_end": true + }, + "test_other.fq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + [ + "versions.yml:md5,6359bf4b480f99e14e14b9e7dcb2fd5a" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-16T07:50:15.487972669" + }, + "human - bam - paired_end": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + [ + "test_1.fq.gz:md5,1cf671980643af6c1148ae5e8e94e350", + "test_2.fq.gz:md5,38c1e9829115f9025f95435c5a4373d3" + ] + ] + ], + [ + 
"versions.yml:md5,6359bf4b480f99e14e14b9e7dcb2fd5a" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-16T07:49:19.805942524" + } +} \ No newline at end of file diff --git a/modules/nf-core/samtools/collatefastq/tests/tags.yml b/modules/nf-core/samtools/collatefastq/tests/tags.yml new file mode 100644 index 0000000000..67380776a2 --- /dev/null +++ b/modules/nf-core/samtools/collatefastq/tests/tags.yml @@ -0,0 +1,2 @@ +samtools/collatefastq: + - "modules/nf-core/samtools/collatefastq/**" diff --git a/modules/nf-core/samtools/convert/environment.yml b/modules/nf-core/samtools/convert/environment.yml index 7a95ca614a..62054fc97a 100644 --- a/modules/nf-core/samtools/convert/environment.yml +++ b/modules/nf-core/samtools/convert/environment.yml @@ -1,8 +1,8 @@ -name: samtools_convert +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::samtools=1.20 - - bioconda::htslib=1.20 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git a/modules/nf-core/samtools/convert/main.nf b/modules/nf-core/samtools/convert/main.nf index 03b7b5259d..cf9253d10f 100644 --- a/modules/nf-core/samtools/convert/main.nf +++ b/modules/nf-core/samtools/convert/main.nf @@ -4,8 +4,8 @@ process SAMTOOLS_CONVERT { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/samtools:1.20--h50ea8bc_0' : - 'biocontainers/samtools:1.20--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta), path(input), path(index) diff --git a/modules/nf-core/samtools/convert/meta.yml b/modules/nf-core/samtools/convert/meta.yml index 558289715c..d5bfa161ba 100644 --- a/modules/nf-core/samtools/convert/meta.yml +++ b/modules/nf-core/samtools/convert/meta.yml @@ -15,50 +15,85 @@ tools: documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM file - pattern: "*.{bam,cram}" - - index: - type: file - description: BAM/CRAM index file - pattern: "*.{bai,crai}" - - fasta: - type: file - description: Reference file to create the CRAM file - pattern: "*.{fasta,fa}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM file + pattern: "*.{bam,cram}" + - index: + type: file + description: BAM/CRAM index file + pattern: "*.{bai,crai}" + - - meta2: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: Reference file to create the CRAM file + pattern: "*.{fasta,fa}" + - - meta3: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fai: + type: file + description: Reference index file to create the CRAM file + pattern: "*.{fai}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - bam: - type: file - description: filtered/converted BAM file - pattern: "*{.bam}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bam": + type: file + description: filtered/converted BAM file + pattern: "*{.bam}" - cram: - type: file - description: filtered/converted CRAM file - pattern: "*{cram}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.cram": + type: file + description: filtered/converted CRAM file + pattern: "*{cram}" - bai: - type: file - description: filtered/converted BAM index - pattern: "*{.bai}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bai": + type: file + description: filtered/converted BAM index + pattern: "*{.bai}" - crai: - type: file - description: filtered/converted CRAM index - pattern: "*{.crai}" - - version: - type: file - description: File containing software version - pattern: "*.{version.txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.crai": + type: file + description: filtered/converted CRAM index + pattern: "*{.crai}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" - "@maxulysse" diff --git a/modules/nf-core/samtools/convert/tests/main.nf.test.snap b/modules/nf-core/samtools/convert/tests/main.nf.test.snap index 513629022b..a021254e8b 100644 --- a/modules/nf-core/samtools/convert/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/convert/tests/main.nf.test.snap @@ -22,26 +22,26 @@ "cram_to_bam_versions": { "content": [ [ - "versions.yml:md5,b1040cd80ce16abb9b2c2902b62d5fcd" + "versions.yml:md5,5bc6eb42ab2a1ea6661f8ee998467ad6" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:44:34.618037105" + "timestamp": "2024-09-16T07:52:35.516411351" }, "bam_to_cram_versions": { "content": [ [ - "versions.yml:md5,b1040cd80ce16abb9b2c2902b62d5fcd" + "versions.yml:md5,5bc6eb42ab2a1ea6661f8ee998467ad6" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:44:29.165839679" + "timestamp": "2024-09-16T07:52:24.694454205" }, "stub": { "content": [ @@ -71,7 +71,7 @@ ] ], "4": [ - "versions.yml:md5,b1040cd80ce16abb9b2c2902b62d5fcd" + "versions.yml:md5,5bc6eb42ab2a1ea6661f8ee998467ad6" ], "bai": [ @@ -98,15 +98,15 @@ ] ], "versions": [ - "versions.yml:md5,b1040cd80ce16abb9b2c2902b62d5fcd" + "versions.yml:md5,5bc6eb42ab2a1ea6661f8ee998467ad6" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:44:40.258233921" + "timestamp": "2024-09-16T07:52:45.799885099" }, "bam_to_cram_index": { "content": [ diff --git a/modules/nf-core/samtools/faidx/environment.yml b/modules/nf-core/samtools/faidx/environment.yml index 
f8450fa566..62054fc97a 100644 --- a/modules/nf-core/samtools/faidx/environment.yml +++ b/modules/nf-core/samtools/faidx/environment.yml @@ -1,10 +1,8 @@ -name: samtools_faidx - +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults - dependencies: - - bioconda::htslib=1.20 - - bioconda::samtools=1.20 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git a/modules/nf-core/samtools/faidx/main.nf b/modules/nf-core/samtools/faidx/main.nf index bdcdbc954d..28c0a81cf7 100644 --- a/modules/nf-core/samtools/faidx/main.nf +++ b/modules/nf-core/samtools/faidx/main.nf @@ -4,8 +4,8 @@ process SAMTOOLS_FAIDX { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/samtools:1.20--h50ea8bc_0' : - 'biocontainers/samtools:1.20--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta), path(fasta) diff --git a/modules/nf-core/samtools/faidx/meta.yml b/modules/nf-core/samtools/faidx/meta.yml index f3c25de20f..6721b2cb84 100644 --- a/modules/nf-core/samtools/faidx/meta.yml +++ b/modules/nf-core/samtools/faidx/meta.yml @@ -14,47 +14,62 @@ tools: documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'test' ] - - fasta: - type: file - description: FASTA file - pattern: "*.{fa,fasta}" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. 
[ id:'test' ] - - fai: - type: file - description: FASTA index file - pattern: "*.{fai}" + - - meta: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test' ] + - fasta: + type: file + description: FASTA file + pattern: "*.{fa,fasta}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test' ] + - fai: + type: file + description: FASTA index file + pattern: "*.{fai}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - fa: - type: file - description: FASTA file - pattern: "*.{fa}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.{fa,fasta}": + type: file + description: FASTA file + pattern: "*.{fa}" - fai: - type: file - description: FASTA index file - pattern: "*.{fai}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.fai": + type: file + description: FASTA index file + pattern: "*.{fai}" - gzi: - type: file - description: Optional gzip index file for compressed inputs - pattern: "*.gzi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.gzi": + type: file + description: Optional gzip index file for compressed inputs + pattern: "*.gzi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@ewels" diff --git a/modules/nf-core/samtools/faidx/tests/main.nf.test.snap b/modules/nf-core/samtools/faidx/tests/main.nf.test.snap index 3223b72bc6..1bbb3ec2b0 100644 --- a/modules/nf-core/samtools/faidx/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/faidx/tests/main.nf.test.snap @@ -18,7 +18,7 @@ ], "3": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ], "fa": [ @@ -36,15 +36,15 @@ ], "versions": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:42:14.779784761" + "timestamp": "2024-09-16T07:57:47.450887871" }, "test_samtools_faidx_bgzip": { "content": [ @@ -71,7 +71,7 @@ ] ], "3": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ], "fa": [ @@ -95,15 +95,15 @@ ] ], "versions": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:42:20.256633877" + "timestamp": "2024-09-16T07:58:04.804905659" }, "test_samtools_faidx_fasta": { "content": [ @@ -124,7 +124,7 @@ ], "3": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ], "fa": [ [ @@ -142,15 +142,15 @@ ], "versions": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" 
+ "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:42:25.632577273" + "timestamp": "2024-09-16T07:58:23.831268154" }, "test_samtools_faidx_stub_fasta": { "content": [ @@ -171,7 +171,7 @@ ], "3": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ], "fa": [ [ @@ -189,15 +189,15 @@ ], "versions": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:42:31.058424849" + "timestamp": "2024-09-16T07:58:35.600243706" }, "test_samtools_faidx_stub_fai": { "content": [ @@ -218,7 +218,7 @@ ], "3": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ], "fa": [ @@ -236,14 +236,14 @@ ], "versions": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:42:36.479929617" + "timestamp": "2024-09-16T07:58:54.705460167" } } \ No newline at end of file diff --git a/modules/nf-core/samtools/index/environment.yml b/modules/nf-core/samtools/index/environment.yml index 260d516be9..62054fc97a 100644 --- a/modules/nf-core/samtools/index/environment.yml +++ b/modules/nf-core/samtools/index/environment.yml @@ -1,8 +1,8 @@ -name: samtools_index +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::samtools=1.20 - - bioconda::htslib=1.20 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git 
a/modules/nf-core/samtools/index/main.nf b/modules/nf-core/samtools/index/main.nf index b523c21b43..311756102d 100644 --- a/modules/nf-core/samtools/index/main.nf +++ b/modules/nf-core/samtools/index/main.nf @@ -4,8 +4,8 @@ process SAMTOOLS_INDEX { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/samtools:1.20--h50ea8bc_0' : - 'biocontainers/samtools:1.20--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta), path(input) @@ -35,10 +35,11 @@ process SAMTOOLS_INDEX { """ stub: + def args = task.ext.args ?: '' + def extension = file(input).getExtension() == 'cram' ? + "crai" : args.contains("-c") ? "csi" : "bai" """ - touch ${input}.bai - touch ${input}.crai - touch ${input}.csi + touch ${input}.${extension} cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/samtools/index/meta.yml b/modules/nf-core/samtools/index/meta.yml index 01a4ee03eb..db8df0d505 100644 --- a/modules/nf-core/samtools/index/meta.yml +++ b/modules/nf-core/samtools/index/meta.yml @@ -15,38 +15,52 @@ tools: documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: input file output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - bai: - type: file - description: BAM/CRAM/SAM index file - pattern: "*.{bai,crai,sai}" - - crai: - type: file - description: BAM/CRAM/SAM index file - pattern: "*.{bai,crai,sai}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bai": + type: file + description: BAM/CRAM/SAM index file + pattern: "*.{bai,crai,sai}" - csi: - type: file - description: CSI index file - pattern: "*.{csi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.csi": + type: file + description: CSI index file + pattern: "*.{csi}" + - crai: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.crai": + type: file + description: BAM/CRAM/SAM index file + pattern: "*.{bai,crai,sai}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@ewels" diff --git a/modules/nf-core/samtools/index/tests/main.nf.test b/modules/nf-core/samtools/index/tests/main.nf.test index bb7756d1ca..ca34fb5cd4 100644 --- a/modules/nf-core/samtools/index/tests/main.nf.test +++ b/modules/nf-core/samtools/index/tests/main.nf.test @@ -9,11 +9,7 @@ nextflow_process { tag "samtools/index" test("bai") { - when { - params { - outdir = "$outputDir" - } process { """ input[0] = Channel.of([ @@ -27,18 +23,13 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert snapshot(process.out.bai).match("bai") }, - { assert snapshot(process.out.versions).match("bai_versions") } + { assert snapshot(process.out).match() } ) } } test("crai") { - when { - params { - outdir = "$outputDir" - } process { """ input[0] = Channel.of([ @@ -52,20 +43,83 @@ nextflow_process { then { assertAll ( { 
assert process.success }, - { assert snapshot(process.out.crai).match("crai") }, - { assert snapshot(process.out.versions).match("crai_versions") } + { assert snapshot(process.out).match() } ) } } test("csi") { - config "./csi.nextflow.config" when { - params { - outdir = "$outputDir" + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot( + file(process.out.csi[0][1]).name, + process.out.versions + ).match() } + ) + } + } + + test("bai - stub") { + options "-stub" + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true) + ]) + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("crai - stub") { + options "-stub" + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.recalibrated.sorted.cram', checkIfExists: true) + ]) + """ } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("csi - stub") { + options "-stub" + config "./csi.nextflow.config" + + when { process { """ input[0] = Channel.of([ @@ -79,8 +133,7 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert path(process.out.csi.get(0).get(1)).exists() }, - { assert snapshot(process.out.versions).match("csi_versions") } + { assert snapshot(process.out).match() } ) } } diff --git a/modules/nf-core/samtools/index/tests/main.nf.test.snap 
b/modules/nf-core/samtools/index/tests/main.nf.test.snap index 52756e85c6..72d65e81af 100644 --- a/modules/nf-core/samtools/index/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/index/tests/main.nf.test.snap @@ -1,74 +1,250 @@ { - "crai_versions": { + "csi - stub": { "content": [ - [ - "versions.yml:md5,802c9776d9c5e95314e888cf18e96d77" - ] + { + "0": [ + + ], + "1": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.sorted.bam.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ], + "bai": [ + + ], + "crai": [ + + ], + "csi": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.sorted.bam.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:42:04.203740976" + "timestamp": "2024-09-16T08:21:25.261127166" }, - "csi_versions": { + "crai - stub": { "content": [ - [ - "versions.yml:md5,802c9776d9c5e95314e888cf18e96d77" - ] + { + "0": [ + + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.recalibrated.sorted.cram.crai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ], + "bai": [ + + ], + "crai": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.recalibrated.sorted.cram.crai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "csi": [ + + ], + "versions": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:42:09.57475878" + "timestamp": "2024-09-16T08:21:12.653194876" }, - "crai": { + "bai - stub": { "content": [ - [ - [ - { - "id": "test", - "single_end": 
false - }, - "test.paired_end.recalibrated.sorted.cram.crai:md5,14bc3bd5c89cacc8f4541f9062429029" + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.sorted.bam.bai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ], + "bai": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.sorted.bam.bai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "crai": [ + + ], + "csi": [ + + ], + "versions": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" ] - ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.04.3" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-12T18:41:38.446424" + "timestamp": "2024-09-16T08:21:01.854932651" }, - "bai": { + "csi": { "content": [ + "test.paired_end.sorted.bam.csi", [ - [ - { - "id": "test", - "single_end": false - }, - "test.paired_end.sorted.bam.bai:md5,704c10dd1326482448ca3073fdebc2f4" - ] + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.04.3" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-02-12T18:40:46.579747" + "timestamp": "2024-09-16T08:20:51.485364222" }, - "bai_versions": { + "crai": { "content": [ - [ - "versions.yml:md5,802c9776d9c5e95314e888cf18e96d77" - ] + { + "0": [ + + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.recalibrated.sorted.cram.crai:md5,14bc3bd5c89cacc8f4541f9062429029" + ] + ], + "3": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ], + "bai": [ + + ], + "crai": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.recalibrated.sorted.cram.crai:md5,14bc3bd5c89cacc8f4541f9062429029" + ] + ], + "csi": [ + + ], + "versions": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": 
"2024-09-16T08:20:40.518873972" + }, + "bai": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.sorted.bam.bai:md5,704c10dd1326482448ca3073fdebc2f4" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ], + "bai": [ + [ + { + "id": "test", + "single_end": false + }, + "test.paired_end.sorted.bam.bai:md5,704c10dd1326482448ca3073fdebc2f4" + ] + ], + "crai": [ + + ], + "csi": [ + + ], + "versions": [ + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" + ] + } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:41:57.929287369" + "timestamp": "2024-09-16T08:20:21.184050361" } } \ No newline at end of file diff --git a/modules/nf-core/samtools/merge/environment.yml b/modules/nf-core/samtools/merge/environment.yml index cd366d6de3..62054fc97a 100644 --- a/modules/nf-core/samtools/merge/environment.yml +++ b/modules/nf-core/samtools/merge/environment.yml @@ -1,8 +1,8 @@ -name: samtools_merge +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::samtools=1.20 - - bioconda::htslib=1.20 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git a/modules/nf-core/samtools/merge/main.nf b/modules/nf-core/samtools/merge/main.nf index 693b1d80f4..34da4c7c87 100644 --- a/modules/nf-core/samtools/merge/main.nf +++ b/modules/nf-core/samtools/merge/main.nf @@ -4,8 +4,8 @@ process SAMTOOLS_MERGE { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/samtools:1.20--h50ea8bc_0' : - 'biocontainers/samtools:1.20--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta), path(input_files, stageAs: "?/*") diff --git a/modules/nf-core/samtools/merge/meta.yml b/modules/nf-core/samtools/merge/meta.yml index 2e8f3dbbb5..235aa21945 100644 --- a/modules/nf-core/samtools/merge/meta.yml +++ b/modules/nf-core/samtools/merge/meta.yml @@ -15,60 +15,81 @@ tools: documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input_files: - type: file - description: BAM/CRAM file - pattern: "*.{bam,cram,sam}" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fasta: - type: file - description: Reference file the CRAM was created with (optional) - pattern: "*.{fasta,fa}" - - meta3: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fai: - type: file - description: Index of the reference file the CRAM was created with (optional) - pattern: "*.fai" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input_files: + type: file + description: BAM/CRAM file + pattern: "*.{bam,cram,sam}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fasta: + type: file + description: Reference file the CRAM was created with (optional) + pattern: "*.{fasta,fa}" + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. 
[ id:'genome' ] + - fai: + type: file + description: Index of the reference file the CRAM was created with (optional) + pattern: "*.fai" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - bam: - type: file - description: BAM file - pattern: "*.{bam}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.bam: + type: file + description: BAM file + pattern: "*.{bam}" - cram: - type: file - description: CRAM file - pattern: "*.{cram}" - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.cram: + type: file + description: CRAM file + pattern: "*.{cram}" - csi: - type: file - description: BAM index file (optional) - pattern: "*.csi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.csi": + type: file + description: BAM index file (optional) + pattern: "*.csi" - crai: - type: file - description: CRAM index file (optional) - pattern: "*.crai" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.crai": + type: file + description: CRAM index file (optional) + pattern: "*.crai" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@yuukiiwa " diff --git a/modules/nf-core/samtools/merge/tests/main.nf.test.snap b/modules/nf-core/samtools/merge/tests/main.nf.test.snap index 17bc846fa2..0a41e01af6 100644 --- a/modules/nf-core/samtools/merge/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/merge/tests/main.nf.test.snap @@ -80,14 +80,14 @@ "bam_versions": { "content": [ [ - "versions.yml:md5,84dab54b9812780df48f5cecef690c34" + "versions.yml:md5,d51d18a97513e370e43f0c891c51dfc4" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:46:35.851936597" + "timestamp": "2024-09-16T09:16:30.476887194" }, "bams_csi": { "content": [ @@ -124,14 +124,14 @@ "bams_stub_versions": { "content": [ [ - "versions.yml:md5,84dab54b9812780df48f5cecef690c34" + "versions.yml:md5,d51d18a97513e370e43f0c891c51dfc4" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:46:41.405707643" + "timestamp": "2024-09-16T09:16:52.203823961" }, "bam_cram": { "content": [ @@ -158,14 +158,14 @@ "bams_versions": { "content": [ [ - "versions.yml:md5,84dab54b9812780df48f5cecef690c34" + "versions.yml:md5,d51d18a97513e370e43f0c891c51dfc4" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:45:51.695689923" + "timestamp": "2024-09-16T08:29:57.524363148" }, "crams_bam": { "content": [ @@ -182,14 +182,14 @@ "crams_versions": { "content": [ [ - "versions.yml:md5,84dab54b9812780df48f5cecef690c34" + "versions.yml:md5,d51d18a97513e370e43f0c891c51dfc4" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": 
"0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:46:30.185392319" + "timestamp": "2024-09-16T09:16:06.977096207" }, "bam_csi": { "content": [ diff --git a/modules/nf-core/samtools/mpileup/environment.yml b/modules/nf-core/samtools/mpileup/environment.yml index add717dafc..62054fc97a 100644 --- a/modules/nf-core/samtools/mpileup/environment.yml +++ b/modules/nf-core/samtools/mpileup/environment.yml @@ -1,8 +1,8 @@ -name: samtools_mpileup +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::samtools=1.20 - - bioconda::htslib=1.20 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git a/modules/nf-core/samtools/mpileup/main.nf b/modules/nf-core/samtools/mpileup/main.nf index b51f96d369..925c8244e0 100644 --- a/modules/nf-core/samtools/mpileup/main.nf +++ b/modules/nf-core/samtools/mpileup/main.nf @@ -4,15 +4,16 @@ process SAMTOOLS_MPILEUP { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/samtools:1.20--h50ea8bc_0' : - 'biocontainers/samtools:1.20--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" + input: tuple val(meta), path(input), path(intervals) path fasta output: tuple val(meta), path("*.mpileup.gz"), emit: mpileup - path "versions.yml" , emit: versions + path "versions.yml" , emit: versions when: task.ext.when == null || task.ext.when @@ -20,15 +21,29 @@ process SAMTOOLS_MPILEUP { script: def args = task.ext.args ?: '' def prefix = task.ext.prefix ?: "${meta.id}" - def intervals = intervals ? "-l ${intervals}" : "" + def intervals_arg = intervals ? 
"-l ${intervals}" : "" """ samtools mpileup \\ --fasta-ref $fasta \\ --output ${prefix}.mpileup \\ $args \\ - $intervals \\ + $intervals_arg \\ $input bgzip ${prefix}.mpileup + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + END_VERSIONS + """ + + stub: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def intervals_arg = intervals ? "-l ${intervals}" : "" + """ + touch ${prefix}.mpileup.gz + cat <<-END_VERSIONS > versions.yml "${task.process}": samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') diff --git a/modules/nf-core/samtools/mpileup/meta.yml b/modules/nf-core/samtools/mpileup/meta.yml index 13038fbc9b..0655e08727 100644 --- a/modules/nf-core/samtools/mpileup/meta.yml +++ b/modules/nf-core/samtools/mpileup/meta.yml @@ -15,38 +15,41 @@ tools: documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - fasta: - type: file - description: FASTA reference file - pattern: "*.{fasta,fa}" - - intervals: - type: file - description: Interval FILE - pattern: "*.bed" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - intervals: + type: file + description: Interval FILE + pattern: "*.bed" + - - fasta: + type: file + description: FASTA reference file + pattern: "*.{fasta,fa}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - mpileup: - type: file - description: mpileup file - pattern: "*.{mpileup}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.mpileup.gz": + type: file + description: mpileup file + pattern: "*.{mpileup}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@joseespinosa" diff --git a/modules/nf-core/samtools/mpileup/tests/main.nf.test.snap b/modules/nf-core/samtools/mpileup/tests/main.nf.test.snap index 76b2086b40..fb6b6f1b7d 100644 --- a/modules/nf-core/samtools/mpileup/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/mpileup/tests/main.nf.test.snap @@ -2,26 +2,26 @@ "bam_bed_versions": { "content": [ [ - "versions.yml:md5,449485eab74a465dc9023760be2c12a1" + "versions.yml:md5,0ad5d578348ec8f358197302530db925" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:43:41.535946155" + "timestamp": "2024-09-16T09:20:04.969588285" }, "bam_bed_fasta_versions": { "content": [ [ - "versions.yml:md5,449485eab74a465dc9023760be2c12a1" + "versions.yml:md5,0ad5d578348ec8f358197302530db925" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:43:46.787790118" + "timestamp": "2024-09-16T09:20:30.887599345" }, "bam_bed_mpileup": { "content": [ diff --git a/modules/nf-core/samtools/stats/environment.yml b/modules/nf-core/samtools/stats/environment.yml index 1cc83bd954..62054fc97a 100644 --- a/modules/nf-core/samtools/stats/environment.yml +++ b/modules/nf-core/samtools/stats/environment.yml @@ -1,8 +1,8 @@ -name: samtools_stats +--- +# yaml-language-server: 
$schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::samtools=1.20 - - bioconda::htslib=1.20 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git a/modules/nf-core/samtools/stats/main.nf b/modules/nf-core/samtools/stats/main.nf index 982bc28e7f..493525a9e0 100644 --- a/modules/nf-core/samtools/stats/main.nf +++ b/modules/nf-core/samtools/stats/main.nf @@ -4,8 +4,8 @@ process SAMTOOLS_STATS { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/samtools:1.20--h50ea8bc_0' : - 'biocontainers/samtools:1.20--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta), path(input), path(input_index) diff --git a/modules/nf-core/samtools/stats/meta.yml b/modules/nf-core/samtools/stats/meta.yml index 735ff8122a..77b020f76e 100644 --- a/modules/nf-core/samtools/stats/meta.yml +++ b/modules/nf-core/samtools/stats/meta.yml @@ -16,43 +16,46 @@ tools: documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM file from alignment - pattern: "*.{bam,cram}" - - input_index: - type: file - description: BAI/CRAI file from alignment - pattern: "*.{bai,crai}" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fasta: - type: file - description: Reference file the CRAM was created with (optional) - pattern: "*.{fasta,fa}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM file from alignment + pattern: "*.{bam,cram}" + - input_index: + type: file + description: BAI/CRAI file from alignment + pattern: "*.{bai,crai}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fasta: + type: file + description: Reference file the CRAM was created with (optional) + pattern: "*.{fasta,fa}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - stats: - type: file - description: File containing samtools stats output - pattern: "*.{stats}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.stats": + type: file + description: File containing samtools stats output + pattern: "*.{stats}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@FriederikeHanssen" diff --git a/modules/nf-core/samtools/stats/tests/main.nf.test b/modules/nf-core/samtools/stats/tests/main.nf.test index e3d5cb14ce..5bc8930959 100644 --- a/modules/nf-core/samtools/stats/tests/main.nf.test +++ b/modules/nf-core/samtools/stats/tests/main.nf.test @@ -3,6 +3,7 @@ nextflow_process { name "Test Process SAMTOOLS_STATS" script "../main.nf" process "SAMTOOLS_STATS" + tag "modules" tag "modules_nfcore" tag "samtools" @@ -11,9 +12,6 @@ nextflow_process { test("bam") { when { - params { - outdir = "$outputDir" - } process { """ input[0] = Channel.of([ @@ -37,9 +35,59 @@ nextflow_process { test("cram") { when { - params { - outdir = "$outputDir" + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 
'genomics/homo_sapiens/illumina/cram/test.paired_end.recalibrated.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.recalibrated.sorted.cram.crai', checkIfExists: true) + ]) + input[1] = Channel.of([ + [ id:'genome' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) + ]) + """ } + } + + then { + assertAll( + {assert process.success}, + {assert snapshot(process.out).match()} + ) + } + } + + test("bam - stub") { + + options "-stub" + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true) + ]) + input[1] = [[],[]] + """ + } + } + + then { + assertAll( + {assert process.success}, + {assert snapshot(process.out).match()} + ) + } + } + + test("cram - stub") { + + options "-stub" + + when { process { """ input[0] = Channel.of([ @@ -49,7 +97,7 @@ nextflow_process { ]) input[1] = Channel.of([ [ id:'genome' ], // meta map - file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta', checkIfExists: true) ]) """ } diff --git a/modules/nf-core/samtools/stats/tests/main.nf.test.snap b/modules/nf-core/samtools/stats/tests/main.nf.test.snap index 2747fd6c61..df507be7a3 100644 --- a/modules/nf-core/samtools/stats/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/stats/tests/main.nf.test.snap @@ -8,11 +8,11 @@ "id": "test", "single_end": false }, - "test.stats:md5,c9d39b38c22de2057fc2f89949090975" + "test.stats:md5,a27fe55e49a341f92379bb20a65c6a06" ] ], "1": [ - 
"versions.yml:md5,b3b70b126f867fdbb7dcea5e36e49d4a" + "versions.yml:md5,15b91d8c0e0440332e0fe4df80957043" ], "stats": [ [ @@ -20,19 +20,89 @@ "id": "test", "single_end": false }, - "test.stats:md5,c9d39b38c22de2057fc2f89949090975" + "test.stats:md5,a27fe55e49a341f92379bb20a65c6a06" ] ], "versions": [ - "versions.yml:md5,b3b70b126f867fdbb7dcea5e36e49d4a" + "versions.yml:md5,15b91d8c0e0440332e0fe4df80957043" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:45:24.403941966" + "timestamp": "2024-09-16T09:29:16.767396182" + }, + "bam - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,15b91d8c0e0440332e0fe4df80957043" + ], + "stats": [ + [ + { + "id": "test", + "single_end": false + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,15b91d8c0e0440332e0fe4df80957043" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-16T09:29:29.721580274" + }, + "cram - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,15b91d8c0e0440332e0fe4df80957043" + ], + "stats": [ + [ + { + "id": "test", + "single_end": false + }, + "test.stats:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,15b91d8c0e0440332e0fe4df80957043" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-16T09:29:53.567964304" }, "bam": { "content": [ @@ -43,11 +113,11 @@ "id": "test", "single_end": false }, - "test.stats:md5,d522a1fa016b259d6a55620ae53dcd63" + "test.stats:md5,d53a2584376d78942839e9933a34d11b" ] ], "1": [ - "versions.yml:md5,b3b70b126f867fdbb7dcea5e36e49d4a" + 
"versions.yml:md5,15b91d8c0e0440332e0fe4df80957043" ], "stats": [ [ @@ -55,18 +125,18 @@ "id": "test", "single_end": false }, - "test.stats:md5,d522a1fa016b259d6a55620ae53dcd63" + "test.stats:md5,d53a2584376d78942839e9933a34d11b" ] ], "versions": [ - "versions.yml:md5,b3b70b126f867fdbb7dcea5e36e49d4a" + "versions.yml:md5,15b91d8c0e0440332e0fe4df80957043" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:45:06.711251947" + "timestamp": "2024-09-16T09:28:50.73610604" } } \ No newline at end of file diff --git a/modules/nf-core/samtools/view/environment.yml b/modules/nf-core/samtools/view/environment.yml index 150c377771..62054fc97a 100644 --- a/modules/nf-core/samtools/view/environment.yml +++ b/modules/nf-core/samtools/view/environment.yml @@ -1,8 +1,8 @@ -name: samtools_view +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::samtools=1.20 - - bioconda::htslib=1.20 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git a/modules/nf-core/samtools/view/main.nf b/modules/nf-core/samtools/view/main.nf index 38df857604..37e05cec88 100644 --- a/modules/nf-core/samtools/view/main.nf +++ b/modules/nf-core/samtools/view/main.nf @@ -4,8 +4,8 @@ process SAMTOOLS_VIEW { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/samtools:1.20--h50ea8bc_0' : - 'biocontainers/samtools:1.20--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta), path(input), path(index) @@ -13,13 +13,15 @@ process SAMTOOLS_VIEW { path qname output: - tuple val(meta), path("*.bam"), emit: bam, optional: true - tuple val(meta), path("*.cram"), emit: cram, optional: true - tuple val(meta), path("*.sam"), emit: sam, optional: true - tuple val(meta), path("*.bai"), emit: bai, optional: true - tuple val(meta), path("*.csi"), emit: csi, optional: true - tuple val(meta), path("*.crai"), emit: crai, optional: true - path "versions.yml", emit: versions + tuple val(meta), path("${prefix}.bam"), emit: bam, optional: true + tuple val(meta), path("${prefix}.cram"), emit: cram, optional: true + tuple val(meta), path("${prefix}.sam"), emit: sam, optional: true + tuple val(meta), path("${prefix}.${file_type}.bai"), emit: bai, optional: true + tuple val(meta), path("${prefix}.${file_type}.csi"), emit: csi, optional: true + tuple val(meta), path("${prefix}.${file_type}.crai"), emit: crai, optional: true + tuple val(meta), path("${prefix}.unselected.${file_type}"), emit: unselected, optional: true + tuple val(meta), path("${prefix}.unselected.${file_type}.{bai,csi,crsi}"), emit: unselected_index, optional: true + path "versions.yml", emit: versions when: task.ext.when == null || task.ext.when @@ -27,13 +29,13 @@ process SAMTOOLS_VIEW { script: def args = task.ext.args ?: '' def args2 = task.ext.args2 ?: '' - def prefix = task.ext.prefix ?: "${meta.id}" + prefix = task.ext.prefix ?: "${meta.id}" def reference = fasta ? "--reference ${fasta}" : "" - def readnames = qname ? "--qname-file ${qname}": "" - def file_type = args.contains("--output-fmt sam") ? "sam" : - args.contains("--output-fmt bam") ? "bam" : - args.contains("--output-fmt cram") ? 
"cram" : - input.getExtension() + file_type = args.contains("--output-fmt sam") ? "sam" : + args.contains("--output-fmt bam") ? "bam" : + args.contains("--output-fmt cram") ? "cram" : + input.getExtension() + readnames = qname ? "--qname-file ${qname} --output-unselected ${prefix}.unselected.${file_type}": "" if ("$input" == "${prefix}.${file_type}") error "Input and output names are the same, use \"task.ext.prefix\" to disambiguate!" """ samtools \\ @@ -54,14 +56,14 @@ process SAMTOOLS_VIEW { stub: def args = task.ext.args ?: '' - def prefix = task.ext.prefix ?: "${meta.id}" - def file_type = args.contains("--output-fmt sam") ? "sam" : - args.contains("--output-fmt bam") ? "bam" : - args.contains("--output-fmt cram") ? "cram" : - input.getExtension() + prefix = task.ext.prefix ?: "${meta.id}" + file_type = args.contains("--output-fmt sam") ? "sam" : + args.contains("--output-fmt bam") ? "bam" : + args.contains("--output-fmt cram") ? "cram" : + input.getExtension() if ("$input" == "${prefix}.${file_type}") error "Input and output names are the same, use \"task.ext.prefix\" to disambiguate!" - def index = args.contains("--write-index") ? "touch ${prefix}.csi" : "" + def index = args.contains("--write-index") ? "touch ${prefix}.${file_type}.csi" : "" """ touch ${prefix}.${file_type} diff --git a/modules/nf-core/samtools/view/meta.yml b/modules/nf-core/samtools/view/meta.yml index 3dadafae75..caa7b0150d 100644 --- a/modules/nf-core/samtools/view/meta.yml +++ b/modules/nf-core/samtools/view/meta.yml @@ -15,68 +15,120 @@ tools: documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - index: - type: file - description: BAM.BAI/BAM.CSI/CRAM.CRAI file (optional) - pattern: "*.{.bai,.csi,.crai}" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'test' ] - - fasta: - type: file - description: Reference file the CRAM was created with (optional) - pattern: "*.{fasta,fa}" - - qname: - type: file - description: Optional file with read names to output only select alignments - pattern: "*.{txt,list}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - index: + type: file + description: BAM.BAI/BAM.CSI/CRAM.CRAI file (optional) + pattern: "*.{.bai,.csi,.crai}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test' ] + - fasta: + type: file + description: Reference file the CRAM was created with (optional) + pattern: "*.{fasta,fa}" + - - qname: + type: file + description: Optional file with read names to output only select alignments + pattern: "*.{txt,list}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - bam: - type: file - description: optional filtered/converted BAM file - pattern: "*.{bam}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.bam: + type: file + description: optional filtered/converted BAM file + pattern: "*.{bam}" - cram: - type: file - description: optional filtered/converted CRAM file - pattern: "*.{cram}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}.cram: + type: file + description: optional filtered/converted CRAM file + pattern: "*.{cram}" - sam: - type: file - description: optional filtered/converted SAM file - pattern: "*.{sam}" - # bai, csi, and crai are created with `--write-index` + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.sam: + type: file + description: optional filtered/converted SAM file + pattern: "*.{sam}" - bai: - type: file - description: optional BAM file index - pattern: "*.{bai}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.${file_type}.bai: + type: file + description: optional BAM file index + pattern: "*.{bai}" - csi: - type: file - description: optional tabix BAM file index - pattern: "*.{csi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.${file_type}.csi: + type: file + description: optional tabix BAM file index + pattern: "*.{csi}" - crai: - type: file - description: optional CRAM file index - pattern: "*.{crai}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.${file_type}.crai: + type: file + description: optional CRAM file index + pattern: "*.{crai}" + - unselected: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.unselected.${file_type}: + type: file + description: optional file with unselected alignments + pattern: "*.unselected.{bam,cram,sam}" + - unselected_index: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}.unselected.${file_type}.{bai,csi,crsi}: + type: file + description: index for the "unselected" file + pattern: "*.unselected.{bai,csi,crai}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@joseespinosa" diff --git a/modules/nf-core/samtools/view/tests/main.nf.test b/modules/nf-core/samtools/view/tests/main.nf.test index 45a0defbae..37b81a9163 100644 --- a/modules/nf-core/samtools/view/tests/main.nf.test +++ b/modules/nf-core/samtools/view/tests/main.nf.test @@ -172,6 +172,8 @@ nextflow_process { { assert snapshot(process.out.crai).match("cram_to_bam_index_qname_crai") }, { assert snapshot(process.out.cram).match("cram_to_bam_index_qname_cram") }, { assert snapshot(process.out.sam).match("cram_to_bam_index_qname_sam") }, + { assert snapshot(file(process.out.unselected[0][1]).name).match("cram_to_bam_index_qname_unselected") }, + { assert snapshot(file(process.out.unselected_index[0][1]).name).match("cram_to_bam_index_qname_unselected_csi") }, { assert snapshot(process.out.versions).match("cram_to_bam_index_qname_versions") } ) } diff --git a/modules/nf-core/samtools/view/tests/main.nf.test.snap b/modules/nf-core/samtools/view/tests/main.nf.test.snap index eb0c577c90..63849b037b 100644 --- a/modules/nf-core/samtools/view/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/view/tests/main.nf.test.snap @@ -56,14 +56,14 @@ "bam_stub_versions": { "content": [ [ - "versions.yml:md5,6cd41a9a3b4a95271ec011ea990a2838" + "versions.yml:md5,176db5ec46b965219604bcdbb3ef9e07" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:43:20.390692583" + "timestamp": "2024-09-16T09:26:24.461775464" }, "cram_to_bam_index_cram": { "content": [ @@ -169,6 +169,16 @@ }, 
"timestamp": "2024-02-12T19:37:56.490286" }, + "cram_to_bam_index_qname_unselected_csi": { + "content": [ + "test.unselected.bam.csi" + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.04.3" + }, + "timestamp": "2024-02-12T19:38:23.328458" + }, "bam_csi": { "content": [ [ @@ -208,14 +218,14 @@ "cram_to_bam_index_qname_versions": { "content": [ [ - "versions.yml:md5,6cd41a9a3b4a95271ec011ea990a2838" + "versions.yml:md5,176db5ec46b965219604bcdbb3ef9e07" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:43:15.007493874" + "timestamp": "2024-09-16T09:25:51.953436682" }, "cram_to_bam_bam": { "content": [ @@ -240,14 +250,14 @@ "cram_to_bam_index_versions": { "content": [ [ - "versions.yml:md5,6cd41a9a3b4a95271ec011ea990a2838" + "versions.yml:md5,176db5ec46b965219604bcdbb3ef9e07" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:43:09.472376824" + "timestamp": "2024-09-16T09:25:14.475388399" }, "cram_to_bam_bai": { "content": [ @@ -264,14 +274,14 @@ "cram_to_bam_versions": { "content": [ [ - "versions.yml:md5,6cd41a9a3b4a95271ec011ea990a2838" + "versions.yml:md5,176db5ec46b965219604bcdbb3ef9e07" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:43:04.080050906" + "timestamp": "2024-09-16T09:24:49.673441798" }, "cram_bam": { "content": [ @@ -355,17 +365,37 @@ }, "timestamp": "2024-02-12T19:38:23.322874" }, + "cram_to_bam_index_qname_unselected": { + "content": [ + "test.unselected.bam" + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.04.3" + }, + "timestamp": "2024-02-12T19:38:23.322874" + }, + "cram_to_bam_index_qname_unselected_csi": { + "content": [ + "test.unselected.bam.csi" + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "23.04.3" + }, + "timestamp": "2024-02-12T19:38:23.328458" + }, 
"bam_versions": { "content": [ [ - "versions.yml:md5,6cd41a9a3b4a95271ec011ea990a2838" + "versions.yml:md5,176db5ec46b965219604bcdbb3ef9e07" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:42:52.978954857" + "timestamp": "2024-09-16T09:23:27.151650338" }, "cram_to_bam_index_qname_cram": { "content": [ @@ -430,14 +460,24 @@ "cram_versions": { "content": [ [ - "versions.yml:md5,6cd41a9a3b4a95271ec011ea990a2838" + "versions.yml:md5,176db5ec46b965219604bcdbb3ef9e07" ] ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-16T09:24:12.95416913" + }, + "cram_to_bam_index_qname_unselected": { + "content": [ + "test.unselected.bam" + ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "23.04.3" }, - "timestamp": "2024-05-28T15:42:58.400776109" + "timestamp": "2024-02-12T19:38:23.322874" }, "bam_sam": { "content": [ @@ -477,7 +517,7 @@ }, "bam_stub_csi": { "content": [ - "test.csi" + "test.bam.csi" ], "meta": { "nf-test": "0.8.4", diff --git a/modules/nf-core/sentieon/applyvarcal/environment.yml b/modules/nf-core/sentieon/applyvarcal/environment.yml index 0af79bedcd..d7abf668ea 100644 --- a/modules/nf-core/sentieon/applyvarcal/environment.yml +++ b/modules/nf-core/sentieon/applyvarcal/environment.yml @@ -1,7 +1,5 @@ -name: sentieon_applyvarcal channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::sentieon=202308.02 + - bioconda::sentieon=202308.03 diff --git a/modules/nf-core/sentieon/applyvarcal/main.nf b/modules/nf-core/sentieon/applyvarcal/main.nf index 6e5e863c51..724912d689 100644 --- a/modules/nf-core/sentieon/applyvarcal/main.nf +++ b/modules/nf-core/sentieon/applyvarcal/main.nf @@ -3,12 +3,10 @@ process SENTIEON_APPLYVARCAL { label 'process_low' label 'sentieon' - secret 'SENTIEON_LICENSE_BASE64' - conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && 
!task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/sentieon:202308.02--h43eeafb_0' : - 'biocontainers/sentieon:202308.02--h43eeafb_0' }" + 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/a6/a64461f38d76bebea8e21441079e76e663e1168b0c59dafee6ee58440ad8c8ac/data' : + 'community.wave.seqera.io/library/sentieon:202308.03--59589f002351c221' }" input: tuple val(meta), path(vcf), path(vcf_tbi), path(recal), path(recal_index), path(tranches) @@ -24,38 +22,13 @@ process SENTIEON_APPLYVARCAL { task.ext.when == null || task.ext.when script: - // The following code sets LD_LIBRARY_PATH in the script-section when the module is run by Singularity. - // That turned out to be one way of overcoming the following issue with the Singularity-Sentieon-containers from galaxy, Sentieon (LD_LIBRARY_PATH) and the way Nextflow runs Singularity-containers. - // The galaxy container uses a runscript which is responsible for setting LD_PRELOAD properly. Nextflow executes singularity containers using `singularity exec`, which avoids the run script, leading to the LD_LIBRARY_PATH/libstdc++.so.6 error. - if (workflow.containerEngine in ['singularity','apptainer']) { - fix_ld_library_path = 'LD_LIBRARY_PATH=/usr/local/lib/:\$LD_LIBRARY_PATH;export LD_LIBRARY_PATH' - } else { - fix_ld_library_path = '' - } - def args = task.ext.args ?: '' def prefix = task.ext.prefix ?: "${meta.id}" - def sentieon_auth_mech_base64 = task.ext.sentieon_auth_mech_base64 ?: '' - def sentieon_auth_data_base64 = task.ext.sentieon_auth_data_base64 ?: '' - + def sentieonLicense = secrets.SENTIEON_LICENSE_BASE64 ? + "export SENTIEON_LICENSE=\$(mktemp);echo -e \"${secrets.SENTIEON_LICENSE_BASE64}\" | base64 -d > \$SENTIEON_LICENSE; " : + "" """ - if [ "\${#SENTIEON_LICENSE_BASE64}" -lt "1500" ]; then # If the string SENTIEON_LICENSE_BASE64 is short, then it is an encrypted url. 
- export SENTIEON_LICENSE=\$(echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d) - else # Localhost license file - # The license file is stored as a nextflow variable like, for instance, this: - # nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat | base64 -w 0) - export SENTIEON_LICENSE=\$(mktemp) - echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d > \$SENTIEON_LICENSE - fi - - if [ ${sentieon_auth_mech_base64} ] && [ ${sentieon_auth_data_base64} ]; then - # If sentieon_auth_mech_base64 and sentieon_auth_data_base64 are non-empty strings, then Sentieon is mostly likely being run with some test-license. - export SENTIEON_AUTH_MECH=\$(echo -n "${sentieon_auth_mech_base64}" | base64 -d) - export SENTIEON_AUTH_DATA=\$(echo -n "${sentieon_auth_data_base64}" | base64 -d) - echo "Decoded and exported Sentieon test-license system environment variables" - fi - - $fix_ld_library_path + $sentieonLicense sentieon driver -r ${fasta} --algo ApplyVarCal \\ -v $vcf \\ @@ -71,19 +44,8 @@ process SENTIEON_APPLYVARCAL { """ stub: - // The following code sets LD_LIBRARY_PATH in the script-section when the module is run by Singularity. - // That turned out to be one way of overcoming the following issue with the Singularity-Sentieon-containers from galaxy, Sentieon (LD_LIBRARY_PATH) and the way Nextflow runs Singularity-containers. - // The galaxy container uses a runscript which is responsible for setting LD_PRELOAD properly. Nextflow executes singularity containers using `singularity exec`, which avoids the run script, leading to the LD_LIBRARY_PATH/libstdc++.so.6 error. 
- if (workflow.containerEngine in ['singularity','apptainer']) { - fix_ld_library_path = 'LD_LIBRARY_PATH=/usr/local/lib/:\$LD_LIBRARY_PATH;export LD_LIBRARY_PATH' - } else { - fix_ld_library_path = '' - } - def prefix = task.ext.prefix ?: "${meta.id}" """ - $fix_ld_library_path - touch ${prefix}.vcf.gz touch ${prefix}.vcf.gz.tbi diff --git a/modules/nf-core/sentieon/applyvarcal/meta.yml b/modules/nf-core/sentieon/applyvarcal/meta.yml index da92ce3436..e8505a2ae5 100644 --- a/modules/nf-core/sentieon/applyvarcal/meta.yml +++ b/modules/nf-core/sentieon/applyvarcal/meta.yml @@ -17,68 +17,80 @@ tools: Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system. homepage: https://www.sentieon.com/ documentation: https://www.sentieon.com/ + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test'] - - vcf: - type: file - description: VCF file to be recalibrated, this should be the same file as used for the first stage VariantRecalibrator. - pattern: "*.vcf" - - vcf_tbi: - type: file - description: tabix index for the input vcf file. - pattern: "*.vcf.tbi" - - recal: - type: file - description: Recalibration file produced when the input vcf was run through VariantRecalibrator in stage 1. - pattern: "*.recal" - - recal_index: - type: file - description: Index file for the recalibration file. - pattern: ".recal.idx" - - tranches: - type: file - description: Tranches file produced when the input vcf was run through VariantRecalibrator in stage 1. - pattern: ".tranches" - - meta2: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test'] - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - meta3: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test'] - - fai: - type: file - description: Index of reference fasta file - pattern: "*.fasta.fai" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - vcf: + type: file + description: VCF file to be recalibrated, this should be the same file as used + for the first stage VariantRecalibrator. + pattern: "*.vcf" + - vcf_tbi: + type: file + description: tabix index for the input vcf file. + pattern: "*.vcf.tbi" + - recal: + type: file + description: Recalibration file produced when the input vcf was run through + VariantRecalibrator in stage 1. + pattern: "*.recal" + - recal_index: + type: file + description: Index file for the recalibration file. + pattern: ".recal.idx" + - tranches: + type: file + description: Tranches file produced when the input vcf was run through VariantRecalibrator + in stage 1. + pattern: ".tranches" + - - meta2: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - meta3: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - fai: + type: file + description: Index of reference fasta file + pattern: "*.fasta.fai" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test'] - vcf: - type: file - description: compressed vcf file containing the recalibrated variants. - pattern: "*.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - "*.vcf.gz": + type: file + description: compressed vcf file containing the recalibrated variants. + pattern: "*.vcf.gz" - tbi: - type: file - description: Index of recalibrated vcf file. - pattern: "*vcf.gz.tbi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test'] + - "*.tbi": + type: file + description: Index of recalibrated vcf file. + pattern: "*vcf.gz.tbi" - versions: - type: file - description: File containing software versions. - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions. + pattern: "versions.yml" authors: - "@assp8200" maintainers: diff --git a/modules/nf-core/sentieon/bwamem/environment.yml b/modules/nf-core/sentieon/bwamem/environment.yml index f03db6f8a3..d7abf668ea 100644 --- a/modules/nf-core/sentieon/bwamem/environment.yml +++ b/modules/nf-core/sentieon/bwamem/environment.yml @@ -1,7 +1,5 @@ -name: sentieon_bwamem channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::sentieon=202308.02 + - bioconda::sentieon=202308.03 diff --git a/modules/nf-core/sentieon/bwamem/main.nf b/modules/nf-core/sentieon/bwamem/main.nf index 62693851c3..c038a857bf 100644 --- a/modules/nf-core/sentieon/bwamem/main.nf +++ b/modules/nf-core/sentieon/bwamem/main.nf @@ -3,12 +3,10 @@ process SENTIEON_BWAMEM { label 'process_high' label 'sentieon' - secret 'SENTIEON_LICENSE_BASE64' - conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/sentieon:202308.02--h43eeafb_0' : - 'biocontainers/sentieon:202308.02--h43eeafb_0' }" + 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/a6/a64461f38d76bebea8e21441079e76e663e1168b0c59dafee6ee58440ad8c8ac/data' : + 'community.wave.seqera.io/library/sentieon:202308.03--59589f002351c221' }" input: tuple val(meta), path(reads) @@ -17,45 +15,22 @@ process SENTIEON_BWAMEM { tuple val(meta4), path(fasta_fai) output: - tuple val(meta), path("*.bam"), path("*.bai"), emit: bam_and_bai + tuple val(meta), path("${prefix}"), path("${prefix}.{bai,crai}"), emit: bam_and_bai path "versions.yml" , emit: versions when: task.ext.when == null || task.ext.when script: - // The following code sets LD_LIBRARY_PATH in the script-section when the module is run by Singularity. - // That turned out to be one way of overcoming the following issue with the Singularity-Sentieon-containers from galaxy, Sentieon (LD_LIBRARY_PATH) and the way Nextflow runs Singularity-containers. - // The galaxy container uses a runscript which is responsible for setting LD_PRELOAD properly. Nextflow executes singularity containers using `singularity exec`, which avoids the run script, leading to the LD_LIBRARY_PATH/libstdc++.so.6 error. - if (workflow.containerEngine in ['singularity','apptainer']) { - fix_ld_library_path = 'LD_LIBRARY_PATH=/usr/local/lib/:\$LD_LIBRARY_PATH;export LD_LIBRARY_PATH' - } else { - fix_ld_library_path = '' - } - def args = task.ext.args ?: '' - def prefix = task.ext.prefix ?: "${meta.id}" - def sentieon_auth_mech_base64 = task.ext.sentieon_auth_mech_base64 ?: '' - def sentieon_auth_data_base64 = task.ext.sentieon_auth_data_base64 ?: '' + prefix = task.ext.prefix ?: "${meta.id}.bam" + def sentieonLicense = secrets.SENTIEON_LICENSE_BASE64 ? 
+ "export SENTIEON_LICENSE=\$(mktemp);echo -e \"${secrets.SENTIEON_LICENSE_BASE64}\" | base64 -d > \$SENTIEON_LICENSE; " : + "" """ - if [ "\${#SENTIEON_LICENSE_BASE64}" -lt "1500" ]; then # If the string SENTIEON_LICENSE_BASE64 is short, then it is an encrypted url. - export SENTIEON_LICENSE=\$(echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d) - else # Localhost license file - # The license file is stored as a nextflow variable like, for instance, this: - # nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat | base64 -w 0) - export SENTIEON_LICENSE=\$(mktemp) - echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d > \$SENTIEON_LICENSE - fi - - if [ ${sentieon_auth_mech_base64} ] && [ ${sentieon_auth_data_base64} ]; then - # If sentieon_auth_mech_base64 and sentieon_auth_data_base64 are non-empty strings, then Sentieon is mostly likely being run with some test-license. - export SENTIEON_AUTH_MECH=\$(echo -n "${sentieon_auth_mech_base64}" | base64 -d) - export SENTIEON_AUTH_DATA=\$(echo -n "${sentieon_auth_data_base64}" | base64 -d) - echo "Decoded and exported Sentieon test-license system environment variables" - fi - - $fix_ld_library_path + $sentieonLicense + export bwt_max_mem="${(task.memory * 0.9).toGiga()}G" INDEX=`find -L ./ -name "*.amb" | sed 's/.amb//'` @@ -64,7 +39,12 @@ process SENTIEON_BWAMEM { -t $task.cpus \\ \$INDEX \\ $reads \\ - | sentieon util sort -r $fasta -t $task.cpus -o ${prefix}.bam --sam2bam - + | sentieon util sort -r $fasta -t $task.cpus -o ${prefix} --sam2bam - + + # Delete *.bai file if prefix ends with .cram + if [[ "${prefix}" == *.cram ]]; then + rm -f "${prefix}.bai" + fi cat <<-END_VERSIONS > versions.yml "${task.process}": @@ -74,21 +54,12 @@ process SENTIEON_BWAMEM { """ stub: - // The following code sets LD_LIBRARY_PATH in the script-section when the module is run by Singularity. 
- // That turned out to be one way of overcoming the following issue with the Singularity-Sentieon-containers from galaxy, Sentieon (LD_LIBRARY_PATH) and the way Nextflow runs Singularity-containers. - // The galaxy container uses a runscript which is responsible for setting LD_PRELOAD properly. Nextflow executes singularity containers using `singularity exec`, which avoids the run script, leading to the LD_LIBRARY_PATH/libstdc++.so.6 error. - if (workflow.containerEngine in ['singularity','apptainer']) { - fix_ld_library_path = 'LD_LIBRARY_PATH=/usr/local/lib/:\$LD_LIBRARY_PATH;export LD_LIBRARY_PATH' - } else { - fix_ld_library_path = '' - } + prefix = task.ext.prefix ?: "${meta.id}.bam" + index = prefix.tokenize('.')[-1] == "bam" ? "bai" : "crai" - def prefix = task.ext.prefix ?: "${meta.id}" """ - $fix_ld_library_path - - touch ${prefix}.bam - touch ${prefix}.bam.bai + touch ${prefix} + touch ${prefix}.${index} cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/sentieon/bwamem/meta.yml b/modules/nf-core/sentieon/bwamem/meta.yml index 0859a923ca..b27dd4877f 100644 --- a/modules/nf-core/sentieon/bwamem/meta.yml +++ b/modules/nf-core/sentieon/bwamem/meta.yml @@ -15,61 +15,65 @@ tools: Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system. homepage: https://www.sentieon.com/ documentation: https://www.sentieon.com/ + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing reference information. - e.g. [ id:'test', single_end:false ] - - reads: - type: file - description: Genome fastq files (single-end or paired-end) - - meta2: - type: map - description: | - Groovy Map containing reference information. - e.g. 
[ id:'test', single_end:false ] - - index: - type: file - description: BWA genome index files - pattern: "*.{amb,ann,bwt,pac,sa}" - - meta3: - type: map - description: | - Groovy Map containing reference information. - e.g. [ id:'test', single_end:false ] - - fasta: - type: file - description: Genome fasta file - pattern: "*.{fa,fasta}" - - meta4: - type: map - description: | - Groovy Map containing reference information. - e.g. [ id:'test', single_end:false ] - - fasta_fai: - type: file - description: The index of the FASTA reference. - pattern: "*.fai" + - - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - reads: + type: file + description: Genome fastq files (single-end or paired-end) + - - meta2: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - index: + type: file + description: BWA genome index files + pattern: "*.{amb,ann,bwt,pac,sa}" + - - meta3: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: Genome fasta file + pattern: "*.{fa,fasta}" + - - meta4: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - fasta_fai: + type: file + description: The index of the FASTA reference. + pattern: "*.fai" output: - - meta: - type: map - description: | - Groovy Map containing reference information. - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: BAM file. - pattern: "*.bam" - - bai: - type: file - description: BAI file - pattern: "*.bai" + - bam_and_bai: + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - ${prefix}: + type: file + description: BAM file with corresponding index. 
+ pattern: "*.{bam,bai}" + - ${prefix}.{bai,crai}: + type: file + description: BAM file with corresponding index. + pattern: "*.{bam,bai}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@asp8200" maintainers: - "@asp8200" + - "@DonFreed" diff --git a/modules/nf-core/sentieon/bwamem/tests/main.nf.test b/modules/nf-core/sentieon/bwamem/tests/main.nf.test new file mode 100644 index 0000000000..074722882c --- /dev/null +++ b/modules/nf-core/sentieon/bwamem/tests/main.nf.test @@ -0,0 +1,262 @@ +nextflow_process { + + name "Test Process SENTIEON_BWAMEM" + tag "modules_nfcore" + tag "modules" + tag "sentieon" + tag "bwamem" + tag "bwaindex" + tag "sentieon/bwaindex" + tag "sentieon/bwamem" + + script "../main.nf" + process "SENTIEON_BWAMEM" + + test("Single-End") { + config "./nextflow.config" + + setup { + run("SENTIEON_BWAINDEX") { + script "../../bwaindex/main.nf" + process { + """ + input[0] = [ + [id: 'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:true ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + ] + input[1] = SENTIEON_BWAINDEX.out.index + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true)] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("Single-End Output CRAM") { + config "./nextflow_out_cram.config" + + setup { + run("SENTIEON_BWAINDEX") { + script 
"../../bwaindex/main.nf" + process { + """ + input[0] = [ + [id: 'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:true ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + ] + input[1] = SENTIEON_BWAINDEX.out.index + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true)] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("Paired-End") { + config "./nextflow.config" + + setup { + run("SENTIEON_BWAINDEX") { + script "../../bwaindex/main.nf" + process { + """ + input[0] = [ + [id: 'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ] + input[1] = SENTIEON_BWAINDEX.out.index + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true)] + """ + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 
'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ] + input[1] = SENTIEON_BWAINDEX.out.index + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true)] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("Single-End - stub") { + config "./nextflow.config" + options "-stub" + + setup { + run("SENTIEON_BWAINDEX") { + script "../../bwaindex/main.nf" + process { + """ + input[0] = [ + [id: 'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:true ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true) + ] + ] + input[1] = SENTIEON_BWAINDEX.out.index + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true)] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("Paired-End - stub") { + config "./nextflow.config" + options "-stub" + + setup { + run("SENTIEON_BWAINDEX") { + script "../../bwaindex/main.nf" + process { + """ + input[0] = [ + [id: 'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true) + ] + """ + } + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', 
single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ] + input[1] = SENTIEON_BWAINDEX.out.index + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true)] + """ + } + } + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true) + ] + ] + input[1] = SENTIEON_BWAINDEX.out.index + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[3] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true)] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + +} diff --git a/modules/nf-core/sentieon/bwamem/tests/main.nf.test.snap b/modules/nf-core/sentieon/bwamem/tests/main.nf.test.snap new file mode 100644 index 0000000000..77070ccc45 --- /dev/null +++ b/modules/nf-core/sentieon/bwamem/tests/main.nf.test.snap @@ -0,0 +1,187 @@ +{ + "Single-End": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.bam:md5,7b62831cb67d6d4a8e33b3cb788dfb1b", + "test.bam.bai:md5,6fc1dff58fab0491ecfa48f016041a18" + ] + ], + "1": [ + "versions.yml:md5,755d24c7416c1408313ec93814cef759" + ], + "bam_and_bai": [ + [ + { + "id": "test", + "single_end": true + }, + 
"test.bam:md5,7b62831cb67d6d4a8e33b3cb788dfb1b", + "test.bam.bai:md5,6fc1dff58fab0491ecfa48f016041a18" + ] + ], + "versions": [ + "versions.yml:md5,755d24c7416c1408313ec93814cef759" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:25:33.275731212" + }, + "Paired-End - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.bam.bai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,755d24c7416c1408313ec93814cef759" + ], + "bam_and_bai": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.bam.bai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,755d24c7416c1408313ec93814cef759" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:26:16.58588651" + }, + "Paired-End": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam:md5,b0c8606d660dbe50a34cf80a376bb268", + "test.bam.bai:md5,be4ad85790468042f7fc01ca2e36a919" + ] + ], + "1": [ + "versions.yml:md5,755d24c7416c1408313ec93814cef759" + ], + "bam_and_bai": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bam:md5,b0c8606d660dbe50a34cf80a376bb268", + "test.bam.bai:md5,be4ad85790468042f7fc01ca2e36a919" + ] + ], + "versions": [ + "versions.yml:md5,755d24c7416c1408313ec93814cef759" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:25:55.068934639" + }, + "Single-End - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.bam.bai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,755d24c7416c1408313ec93814cef759" + ], + "bam_and_bai": [ + [ + { + "id": "test", + "single_end": true + }, + 
"test.bam:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.bam.bai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,755d24c7416c1408313ec93814cef759" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:26:05.151760076" + }, + "Single-End Output CRAM": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": true + }, + "test.cram:md5,817cf0847ae0c89062e2ee4be312101a", + "test.cram.crai:md5,60f801c550a18982e55207adb31ec351" + ] + ], + "1": [ + "versions.yml:md5,755d24c7416c1408313ec93814cef759" + ], + "bam_and_bai": [ + [ + { + "id": "test", + "single_end": true + }, + "test.cram:md5,817cf0847ae0c89062e2ee4be312101a", + "test.cram.crai:md5,60f801c550a18982e55207adb31ec351" + ] + ], + "versions": [ + "versions.yml:md5,755d24c7416c1408313ec93814cef759" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:25:44.360755915" + } +} \ No newline at end of file diff --git a/modules/nf-core/sentieon/bwamem/tests/nextflow.config b/modules/nf-core/sentieon/bwamem/tests/nextflow.config new file mode 100644 index 0000000000..717fb52c7a --- /dev/null +++ b/modules/nf-core/sentieon/bwamem/tests/nextflow.config @@ -0,0 +1,15 @@ +env { + // NOTE This is how nf-core/sarek users will use Sentieon in real world use + SENTIEON_LICENSE = "$SENTIEON_LICSRVR_IP" + // NOTE This should only happen in GitHub actions or nf-core MegaTests + SENTIEON_AUTH_MECH = "$SENTIEON_AUTH_MECH" + SENTIEON_AUTH_DATA = secrets.SENTIEON_AUTH_DATA + // NOTE This is how nf-core/sarek users will test out Sentieon in Sarek with a license file + // nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat | base64 -w 0) +} + +process { + withName: 'SENTIEON_BWAMEM' { + ext.args = "-R \"@RG\\tID:sample_lane\\tPU:lane\\tSM:patient_sample\\tLB:sample\\tDS:fasta\\tPL:seqplatform\"" + } +} diff --git a/modules/nf-core/sentieon/bwamem/tests/nextflow_out_cram.config 
b/modules/nf-core/sentieon/bwamem/tests/nextflow_out_cram.config new file mode 100644 index 0000000000..07ae63d98e --- /dev/null +++ b/modules/nf-core/sentieon/bwamem/tests/nextflow_out_cram.config @@ -0,0 +1,16 @@ +env { + // NOTE This is how nf-core/sarek users will use Sentieon in real world use + SENTIEON_LICENSE = "$SENTIEON_LICSRVR_IP" + // NOTE This should only happen in GitHub actions or nf-core MegaTests + SENTIEON_AUTH_MECH = "$SENTIEON_AUTH_MECH" + SENTIEON_AUTH_DATA = secrets.SENTIEON_AUTH_DATA + // NOTE This is how nf-core/sarek users will test out Sentieon in Sarek with a license file + // nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat | base64 -w 0) +} + +process { + withName: 'SENTIEON_BWAMEM' { + ext.args = "-R \"@RG\\tID:sample_lane\\tPU:lane\\tSM:patient_sample\\tLB:sample\\tDS:fasta\\tPL:seqplatform\"" + ext.prefix = { "${meta.id}.cram" } + } +} diff --git a/modules/nf-core/sentieon/bwamem/tests/tags.yml b/modules/nf-core/sentieon/bwamem/tests/tags.yml new file mode 100644 index 0000000000..fbc2bb3cc2 --- /dev/null +++ b/modules/nf-core/sentieon/bwamem/tests/tags.yml @@ -0,0 +1,2 @@ +sentieon/bwamem: + - "modules/nf-core/sentieon/bwamem/**" diff --git a/modules/nf-core/sentieon/dedup/environment.yml b/modules/nf-core/sentieon/dedup/environment.yml index e29cfff3e4..d7abf668ea 100644 --- a/modules/nf-core/sentieon/dedup/environment.yml +++ b/modules/nf-core/sentieon/dedup/environment.yml @@ -1,7 +1,5 @@ -name: sentieon_dedup channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::sentieon=202308.02 + - bioconda::sentieon=202308.03 diff --git a/modules/nf-core/sentieon/dedup/main.nf b/modules/nf-core/sentieon/dedup/main.nf index 5f19ab56d8..5735df7340 100644 --- a/modules/nf-core/sentieon/dedup/main.nf +++ b/modules/nf-core/sentieon/dedup/main.nf @@ -3,12 +3,10 @@ process SENTIEON_DEDUP { label 'process_medium' label 'sentieon' - secret 'SENTIEON_LICENSE_BASE64' - conda "${moduleDir}/environment.yml" container "${ 
workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/sentieon:202308.02--h43eeafb_0' : - 'biocontainers/sentieon:202308.02--h43eeafb_0' }" + 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/a6/a64461f38d76bebea8e21441079e76e663e1168b0c59dafee6ee58440ad8c8ac/data' : + 'community.wave.seqera.io/library/sentieon:202308.03--59589f002351c221' }" input: tuple val(meta), path(bam), path(bai) @@ -29,47 +27,23 @@ process SENTIEON_DEDUP { task.ext.when == null || task.ext.when script: - // The following code sets LD_LIBRARY_PATH in the script-section when the module is run by Singularity. - // That turned out to be one way of overcoming the following issue with the Singularity-Sentieon-containers from galaxy, Sentieon (LD_LIBRARY_PATH) and the way Nextflow runs Singularity-containers. - // The galaxy container uses a runscript which is responsible for setting LD_PRELOAD properly. Nextflow executes singularity containers using `singularity exec`, which avoids the run script, leading to the LD_LIBRARY_PATH/libstdc++.so.6 error. - if (workflow.containerEngine in ['singularity','apptainer']) { - fix_ld_library_path = 'LD_LIBRARY_PATH=/usr/local/lib/:\$LD_LIBRARY_PATH;export LD_LIBRARY_PATH' - } else { - fix_ld_library_path = '' - } - def args = task.ext.args ?: '' def args2 = task.ext.args2 ?: '' def args3 = task.ext.args3 ?: '' def args4 = task.ext.args4 ?: '' - def prefix = task.ext.prefix ?: "${meta.id}" - def suffix = task.ext.suffix ?: ".cram" // The suffix should be either ".cram" or ".bam". 
- def metrics = task.ext.metrics ?: "${prefix}${suffix}.metrics" - def sentieon_auth_mech_base64 = task.ext.sentieon_auth_mech_base64 ?: '' - def sentieon_auth_data_base64 = task.ext.sentieon_auth_data_base64 ?: '' + def prefix = task.ext.prefix ?: "${meta.id}.cram" + def metrics = task.ext.metrics ?: "${prefix}.metrics" def input_list = bam.collect{"-i $it"}.join(' ') - + def prefix_basename = prefix.substring(0, prefix.lastIndexOf(".")) + def sentieonLicense = secrets.SENTIEON_LICENSE_BASE64 ? + "export SENTIEON_LICENSE=\$(mktemp);echo -e \"${secrets.SENTIEON_LICENSE_BASE64}\" | base64 -d > \$SENTIEON_LICENSE; " : + "" """ - if [ "\${#SENTIEON_LICENSE_BASE64}" -lt "1500" ]; then # If the string SENTIEON_LICENSE_BASE64 is short, then it is an encrypted url. - export SENTIEON_LICENSE=\$(echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d) - else # Localhost license file - # The license file is stored as a nextflow variable like, for instance, this: - # nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat | base64 -w 0) - export SENTIEON_LICENSE=\$(mktemp) - echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d > \$SENTIEON_LICENSE - fi + $sentieonLicense - if [ ${sentieon_auth_mech_base64} ] && [ ${sentieon_auth_data_base64} ]; then - # If sentieon_auth_mech_base64 and sentieon_auth_data_base64 are non-empty strings, then Sentieon is mostly likely being run with some test-license. 
- export SENTIEON_AUTH_MECH=\$(echo -n "${sentieon_auth_mech_base64}" | base64 -d) - export SENTIEON_AUTH_DATA=\$(echo -n "${sentieon_auth_data_base64}" | base64 -d) - echo "Decoded and exported Sentieon test-license system environment variables" - fi + sentieon driver $args -t $task.cpus $input_list -r ${fasta} --algo LocusCollector $args2 --fun score_info ${prefix_basename}.score + sentieon driver $args3 -t $task.cpus $input_list -r ${fasta} --algo Dedup $args4 --score_info ${prefix_basename}.score --metrics ${metrics} ${prefix} - $fix_ld_library_path - - sentieon driver $args $input_list -r ${fasta} --algo LocusCollector $args2 --fun score_info ${prefix}.score - sentieon driver $args3 -t $task.cpus $input_list -r ${fasta} --algo Dedup $args4 --score_info ${prefix}.score --metrics ${metrics} ${prefix}${suffix} # This following tsv-file is produced in order to get a proper tsv-file with Dedup-metrics for importing in MultiQC as "custom content". # It should be removed once MultiQC has a module for displaying Dedup-metrics. head -3 ${metrics} > ${metrics}.multiqc.tsv @@ -81,25 +55,17 @@ process SENTIEON_DEDUP { """ stub: - // The following code sets LD_LIBRARY_PATH in the script-section when the module is run by Singularity. - // That turned out to be one way of overcoming the following issue with the Singularity-Sentieon-containers from galaxy, Sentieon (LD_LIBRARY_PATH) and the way Nextflow runs Singularity-containers. - // The galaxy container uses a runscript which is responsible for setting LD_PRELOAD properly. Nextflow executes singularity containers using `singularity exec`, which avoids the run script, leading to the LD_LIBRARY_PATH/libstdc++.so.6 error. 
- if (workflow.containerEngine in ['singularity','apptainer']) { - fix_ld_library_path = 'LD_LIBRARY_PATH=/usr/local/lib/:\$LD_LIBRARY_PATH;export LD_LIBRARY_PATH' - } else { - fix_ld_library_path = '' - } - def prefix = task.ext.prefix ?: "${meta.id}" - def suffix = task.ext.suffix ?: ".cram" // The suffix should be either ".cram" or ".bam". - def metrics = task.ext.metrics ?: "${prefix}${suffix}.metrics" - """ - $fix_ld_library_path + def prefix = task.ext.prefix ?: "${meta.id}.cram" + def metrics = task.ext.metrics ?: "${prefix}.metrics" + def prefix_basename = prefix.substring(0, prefix.lastIndexOf(".")) - touch "${prefix}${suffix}" - touch "${prefix}${suffix}\$(echo ${suffix} | sed 's/m\$/i/')" + """ + touch "${prefix}" + touch "${prefix}.crai" + touch "${prefix}.bai" touch "${metrics}" touch "${metrics}.multiqc.tsv" - touch "${prefix}.score" + touch "${prefix_basename}.score" cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/sentieon/dedup/meta.yml b/modules/nf-core/sentieon/dedup/meta.yml index 0efbb96c22..003d74befe 100644 --- a/modules/nf-core/sentieon/dedup/meta.yml +++ b/modules/nf-core/sentieon/dedup/meta.yml @@ -1,5 +1,7 @@ name: sentieon_dedup -description: Runs the sentieon tool LocusCollector followed by Dedup. LocusCollector collects read information that is used by Dedup which in turn marks or removes duplicate reads. +description: Runs the sentieon tool LocusCollector followed by Dedup. LocusCollector + collects read information that is used by Dedup which in turn marks or removes duplicate + reads. keywords: - mem - dedup @@ -14,76 +16,116 @@ tools: Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system. homepage: https://www.sentieon.com/ documentation: https://www.sentieon.com/ + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing reference information. - e.g. 
[ id:'test', single_end:false ] - - bam: - type: file - description: BAM file. - pattern: "*.bam" - - bai: - type: file - description: BAI file - pattern: "*.bai" - - meta2: - type: map - description: | - Groovy Map containing reference information. - e.g. [ id:'test', single_end:false ] - - fasta: - type: file - description: Genome fasta file - pattern: "*.{fa,fasta}" - - meta3: - type: map - description: | - Groovy Map containing reference information. - e.g. [ id:'test', single_end:false ] - - fasta_fai: - type: file - description: The index of the FASTA reference. - pattern: "*.fai" + - - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - bam: + type: file + description: BAM file. + pattern: "*.bam" + - bai: + type: file + description: BAI file + pattern: "*.bai" + - - meta2: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: Genome fasta file + pattern: "*.{fa,fasta}" + - - meta3: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - fasta_fai: + type: file + description: The index of the FASTA reference. + pattern: "*.fai" output: - - meta: - type: map - description: | - Groovy Map containing reference information. - e.g. [ id:'test', single_end:false ] - cram: - type: file - description: CRAM file - pattern: "*.cram" + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - "*.cram": + type: file + description: CRAM file + pattern: "*.cram" - crai: - type: file - description: CRAM index file - pattern: "*.crai" + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. 
[ id:'test', single_end:false ] + - "*.crai": + type: file + description: CRAM index file + pattern: "*.crai" - bam: - type: file - description: BAM file. - pattern: "*.bam" + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - "*.bam": + type: file + description: BAM file. + pattern: "*.bam" - bai: - type: file - description: BAI file - pattern: "*.bai" + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - "*.bai": + type: file + description: BAI file + pattern: "*.bai" - score: - type: file - description: The score file indicates which reads LocusCollector finds are likely duplicates. - pattern: "*.score" + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - "*.score": + type: file + description: The score file indicates which reads LocusCollector finds are likely + duplicates. + pattern: "*.score" - metrics: - type: file - description: Output file containing Dedup metrics incl. histogram data. - pattern: "*.metrics" + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - "*.metrics": + type: file + description: Output file containing Dedup metrics incl. histogram data. + pattern: "*.metrics" - metrics_multiqc_tsv: - type: file - description: Output tsv-file containing Dedup metrics excl. histogram data. - pattern: "*.metrics.multiqc.tsv" + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - "*.metrics.multiqc.tsv": + type: file + description: Output tsv-file containing Dedup metrics excl. histogram data. 
+ pattern: "*.metrics.multiqc.tsv" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@asp8200" maintainers: diff --git a/modules/nf-core/sentieon/dedup/tests/main.nf.test b/modules/nf-core/sentieon/dedup/tests/main.nf.test new file mode 100644 index 0000000000..c842a4a00c --- /dev/null +++ b/modules/nf-core/sentieon/dedup/tests/main.nf.test @@ -0,0 +1,90 @@ +nextflow_process { + + name "Test Process SENTIEON_DEDUP" + tag "modules_nfcore" + tag "modules" + tag "sentieon" + tag "dedup" + tag "sentieon/dedup" + + script "../main.nf" + process "SENTIEON_DEDUP" + + test("Test marking duplicates") { + config "./nextflow.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true) ], + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true) ] + ] + input[1] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true)] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("Test removing duplicates") { + config "./nextflow_rmdup.config" + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true) ], + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true) ] + ] + input[1] = [[id: 
'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true)] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("Test stub") { + config "./nextflow.config" + options "-stub" + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.single_end.sorted.bam', checkIfExists: true) ], + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.single_end.sorted.bam.bai', checkIfExists: true) ] + ] + input[1] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta.fai', checkIfExists: true)] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } +} diff --git a/modules/nf-core/sentieon/dedup/tests/main.nf.test.snap b/modules/nf-core/sentieon/dedup/tests/main.nf.test.snap new file mode 100644 index 0000000000..26117a7cdf --- /dev/null +++ b/modules/nf-core/sentieon/dedup/tests/main.nf.test.snap @@ -0,0 +1,359 @@ +{ + "Test marking duplicates": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.cram:md5,e46e97256846338e1cff32d862105491" + ] + ], + "1": [ + [ + { + "id": "test" + }, + "test.cram.crai:md5,4b7b2152b33c5334f9477cc3650f8c91" + ] + ], + "2": [ + + ], + "3": [ + [ + { + "id": "test" + }, + "test.cram.bai:md5,889503338dc569b24e44e5e3aec815ea" + ] + ], + "4": [ + [ + { + "id": "test" + }, + "test.score:md5,835f05ecc5d3ef5d4e31ba7f831d9a8b" + ] + ], + "5": [ + [ + { + "id": "test" + }, + 
"test.cram.metrics:md5,208f7c5fa2f489cfaaffbce116fed0bc" + ] + ], + "6": [ + [ + { + "id": "test" + }, + "test.cram.metrics.multiqc.tsv:md5,208f7c5fa2f489cfaaffbce116fed0bc" + ] + ], + "7": [ + "versions.yml:md5,763463853476be96846b6da5aecfacf4" + ], + "bai": [ + [ + { + "id": "test" + }, + "test.cram.bai:md5,889503338dc569b24e44e5e3aec815ea" + ] + ], + "bam": [ + + ], + "crai": [ + [ + { + "id": "test" + }, + "test.cram.crai:md5,4b7b2152b33c5334f9477cc3650f8c91" + ] + ], + "cram": [ + [ + { + "id": "test" + }, + "test.cram:md5,e46e97256846338e1cff32d862105491" + ] + ], + "metrics": [ + [ + { + "id": "test" + }, + "test.cram.metrics:md5,208f7c5fa2f489cfaaffbce116fed0bc" + ] + ], + "metrics_multiqc_tsv": [ + [ + { + "id": "test" + }, + "test.cram.metrics.multiqc.tsv:md5,208f7c5fa2f489cfaaffbce116fed0bc" + ] + ], + "score": [ + [ + { + "id": "test" + }, + "test.score:md5,835f05ecc5d3ef5d4e31ba7f831d9a8b" + ] + ], + "versions": [ + "versions.yml:md5,763463853476be96846b6da5aecfacf4" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:28:10.570152622" + }, + "Test removing duplicates": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.cram:md5,8075d3e7c66d36fdbb81270eefc996d4" + ] + ], + "1": [ + [ + { + "id": "test" + }, + "test.cram.crai:md5,c617398ead281c1339d78d5df0d606e9" + ] + ], + "2": [ + + ], + "3": [ + [ + { + "id": "test" + }, + "test.cram.bai:md5,a1ea729eca4732ca3a5dee946a70fbc8" + ] + ], + "4": [ + [ + { + "id": "test" + }, + "test.score:md5,835f05ecc5d3ef5d4e31ba7f831d9a8b" + ] + ], + "5": [ + [ + { + "id": "test" + }, + "test.cram.metrics:md5,2a41239de0275a8321f4658286d97d65" + ] + ], + "6": [ + [ + { + "id": "test" + }, + "test.cram.metrics.multiqc.tsv:md5,2a41239de0275a8321f4658286d97d65" + ] + ], + "7": [ + "versions.yml:md5,763463853476be96846b6da5aecfacf4" + ], + "bai": [ + [ + { + "id": "test" + }, + "test.cram.bai:md5,a1ea729eca4732ca3a5dee946a70fbc8" + ] + ], + "bam": [ + + ], 
+ "crai": [ + [ + { + "id": "test" + }, + "test.cram.crai:md5,c617398ead281c1339d78d5df0d606e9" + ] + ], + "cram": [ + [ + { + "id": "test" + }, + "test.cram:md5,8075d3e7c66d36fdbb81270eefc996d4" + ] + ], + "metrics": [ + [ + { + "id": "test" + }, + "test.cram.metrics:md5,2a41239de0275a8321f4658286d97d65" + ] + ], + "metrics_multiqc_tsv": [ + [ + { + "id": "test" + }, + "test.cram.metrics.multiqc.tsv:md5,2a41239de0275a8321f4658286d97d65" + ] + ], + "score": [ + [ + { + "id": "test" + }, + "test.score:md5,835f05ecc5d3ef5d4e31ba7f831d9a8b" + ] + ], + "versions": [ + "versions.yml:md5,763463853476be96846b6da5aecfacf4" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:28:19.377946074" + }, + "Test stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.cram:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + [ + { + "id": "test" + }, + "test.cram.crai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + + ], + "3": [ + [ + { + "id": "test" + }, + "test.cram.bai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + [ + { + "id": "test" + }, + "test.score:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "5": [ + [ + { + "id": "test" + }, + "test.cram.metrics:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "6": [ + [ + { + "id": "test" + }, + "test.cram.metrics.multiqc.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "7": [ + "versions.yml:md5,763463853476be96846b6da5aecfacf4" + ], + "bai": [ + [ + { + "id": "test" + }, + "test.cram.bai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "bam": [ + + ], + "crai": [ + [ + { + "id": "test" + }, + "test.cram.crai:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "cram": [ + [ + { + "id": "test" + }, + "test.cram:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "metrics": [ + [ + { + "id": "test" + }, + "test.cram.metrics:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "metrics_multiqc_tsv": [ + [ + { + "id": "test" + }, + 
"test.cram.metrics.multiqc.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "score": [ + [ + { + "id": "test" + }, + "test.score:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,763463853476be96846b6da5aecfacf4" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:28:28.792696026" + } +} \ No newline at end of file diff --git a/modules/nf-core/sentieon/dedup/tests/nextflow.config b/modules/nf-core/sentieon/dedup/tests/nextflow.config new file mode 100644 index 0000000000..09a068ee62 --- /dev/null +++ b/modules/nf-core/sentieon/dedup/tests/nextflow.config @@ -0,0 +1,9 @@ +env { + // NOTE This is how nf-core/sarek users will use Sentieon in real world use + SENTIEON_LICENSE = "$SENTIEON_LICSRVR_IP" + // NOTE This should only happen in GitHub actions or nf-core MegaTests + SENTIEON_AUTH_MECH = "$SENTIEON_AUTH_MECH" + SENTIEON_AUTH_DATA = secrets.SENTIEON_AUTH_DATA + // NOTE This is how nf-core/sarek users will test out Sentieon in Sarek with a license file + // nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat | base64 -w 0) +} diff --git a/modules/nf-core/sentieon/dedup/tests/nextflow_rmdup.config b/modules/nf-core/sentieon/dedup/tests/nextflow_rmdup.config new file mode 100644 index 0000000000..21e7b945d2 --- /dev/null +++ b/modules/nf-core/sentieon/dedup/tests/nextflow_rmdup.config @@ -0,0 +1,15 @@ +env { + // NOTE This is how nf-core/sarek users will use Sentieon in real world use + SENTIEON_LICENSE = "$SENTIEON_LICSRVR_IP" + // NOTE This should only happen in GitHub actions or nf-core MegaTests + SENTIEON_AUTH_MECH = "$SENTIEON_AUTH_MECH" + SENTIEON_AUTH_DATA = secrets.SENTIEON_AUTH_DATA + // NOTE This is how nf-core/sarek users will test out Sentieon in Sarek with a license file + // nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat | base64 -w 0) +} + +process { + withName: 'SENTIEON_DEDUP' { + ext.args4 = '--rmdup' + } +} diff --git 
a/modules/nf-core/sentieon/dnamodelapply/environment.yml b/modules/nf-core/sentieon/dnamodelapply/environment.yml index a2f8819313..d7abf668ea 100644 --- a/modules/nf-core/sentieon/dnamodelapply/environment.yml +++ b/modules/nf-core/sentieon/dnamodelapply/environment.yml @@ -1,7 +1,5 @@ -name: sentieon_dnamodelapply channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::sentieon=202308.02 + - bioconda::sentieon=202308.03 diff --git a/modules/nf-core/sentieon/dnamodelapply/main.nf b/modules/nf-core/sentieon/dnamodelapply/main.nf index 9a0c70dc22..85fd601b39 100644 --- a/modules/nf-core/sentieon/dnamodelapply/main.nf +++ b/modules/nf-core/sentieon/dnamodelapply/main.nf @@ -3,12 +3,10 @@ process SENTIEON_DNAMODELAPPLY { label 'process_high' label 'sentieon' - secret 'SENTIEON_LICENSE_BASE64' - conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/sentieon:202308.02--h43eeafb_0' : - 'biocontainers/sentieon:202308.02--h43eeafb_0' }" + 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/a6/a64461f38d76bebea8e21441079e76e663e1168b0c59dafee6ee58440ad8c8ac/data' : + 'community.wave.seqera.io/library/sentieon:202308.03--59589f002351c221' }" input: tuple val(meta), path(vcf), path(idx) @@ -25,38 +23,13 @@ process SENTIEON_DNAMODELAPPLY { task.ext.when == null || task.ext.when script: - // The following code sets LD_LIBRARY_PATH in the script-section when the module is run by Singularity. - // That turned out to be one way of overcoming the following issue with the Singularity-Sentieon-containers from galaxy, Sentieon (LD_LIBRARY_PATH) and the way Nextflow runs Singularity-containers. - // The galaxy container uses a runscript which is responsible for setting LD_PRELOAD properly. 
Nextflow executes singularity containers using `singularity exec`, which avoids the run script, leading to the LD_LIBRARY_PATH/libstdc++.so.6 error. - if (workflow.containerEngine in ['singularity','apptainer']) { - fix_ld_library_path = 'LD_LIBRARY_PATH=/usr/local/lib/:\$LD_LIBRARY_PATH;export LD_LIBRARY_PATH' - } else { - fix_ld_library_path = '' - } - def args = task.ext.args ?: '' def prefix = task.ext.prefix ?: "${meta.id}" - def sentieon_auth_mech_base64 = task.ext.sentieon_auth_mech_base64 ?: '' - def sentieon_auth_data_base64 = task.ext.sentieon_auth_data_base64 ?: '' - + def sentieonLicense = secrets.SENTIEON_LICENSE_BASE64 ? + "export SENTIEON_LICENSE=\$(mktemp);echo -e \"${secrets.SENTIEON_LICENSE_BASE64}\" | base64 -d > \$SENTIEON_LICENSE; " : + "" """ - if [ "\${#SENTIEON_LICENSE_BASE64}" -lt "1500" ]; then # If the string SENTIEON_LICENSE_BASE64 is short, then it is an encrypted url. - export SENTIEON_LICENSE=\$(echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d) - else # Localhost license file - # The license file is stored as a nextflow variable like, for instance, this: - # nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat | base64 -w 0) - export SENTIEON_LICENSE=\$(mktemp) - echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d > \$SENTIEON_LICENSE - fi - - if [ ${sentieon_auth_mech_base64} ] && [ ${sentieon_auth_data_base64} ]; then - # If sentieon_auth_mech_base64 and sentieon_auth_data_base64 are non-empty strings, then Sentieon is mostly likely being run with some test-license. 
- export SENTIEON_AUTH_MECH=\$(echo -n "${sentieon_auth_mech_base64}" | base64 -d) - export SENTIEON_AUTH_DATA=\$(echo -n "${sentieon_auth_data_base64}" | base64 -d) - echo "Decoded and exported Sentieon test-license system environment variables" - fi - - $fix_ld_library_path + $sentieonLicense sentieon driver \\ -t $task.cpus \\ @@ -74,20 +47,8 @@ process SENTIEON_DNAMODELAPPLY { """ stub: - // The following code sets LD_LIBRARY_PATH in the script-section when the module is run by Singularity. - // That turned out to be one way of overcoming the following issue with the Singularity-Sentieon-containers from galaxy, Sentieon (LD_LIBRARY_PATH) and the way Nextflow runs Singularity-containers. - // The galaxy container uses a runscript which is responsible for setting LD_PRELOAD properly. Nextflow executes singularity containers using `singularity exec`, which avoids the run script, leading to the LD_LIBRARY_PATH/libstdc++.so.6 error. - if (workflow.containerEngine in ['singularity','apptainer']) { - fix_ld_library_path = 'LD_LIBRARY_PATH=/usr/local/lib/:\$LD_LIBRARY_PATH;export LD_LIBRARY_PATH' - } else { - fix_ld_library_path = '' - } - def prefix = task.ext.prefix ?: "${meta.id}" - """ - $fix_ld_library_path - touch ${prefix}.vcf.gz touch ${prefix}.vcf.gz.tbi diff --git a/modules/nf-core/sentieon/dnamodelapply/meta.yml b/modules/nf-core/sentieon/dnamodelapply/meta.yml index 2507654577..2505aff74e 100644 --- a/modules/nf-core/sentieon/dnamodelapply/meta.yml +++ b/modules/nf-core/sentieon/dnamodelapply/meta.yml @@ -12,65 +12,74 @@ tools: Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system. homepage: https://www.sentieon.com/ documentation: https://www.sentieon.com/ + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
`[ id:'test', single_end:false ]` - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. `[ id:'test' ]` - - meta3: - type: map - description: | - Groovy Map containing reference information - e.g. `[ id:'test' ]` - - meta4: - type: map - description: | - Groovy Map containing reference information - e.g. `[ id:'test' ]` - - vcf: - type: file - description: INPUT VCF file - pattern: "*.{vcf,vcf.gz}" - - idx: - type: file - description: Index of the input VCF file - pattern: "*.{tbi}" - - fasta: - type: file - description: Genome fasta file - pattern: "*.{fa,fasta}" - - fai: - type: file - description: Index of the genome fasta file - pattern: "*.fai" - - ml_model: - type: file - description: machine learning model file - pattern: "*.model" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test', single_end:false ]` + - vcf: + type: file + description: INPUT VCF file + pattern: "*.{vcf,vcf.gz}" + - idx: + type: file + description: Index of the input VCF file + pattern: "*.{tbi}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. `[ id:'test' ]` + - fasta: + type: file + description: Genome fasta file + pattern: "*.{fa,fasta}" + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. `[ id:'test' ]` + - fai: + type: file + description: Index of the genome fasta file + pattern: "*.fai" + - - meta4: + type: map + description: | + Groovy Map containing reference information + e.g. `[ id:'test' ]` + - ml_model: + type: file + description: machine learning model file + pattern: "*.model" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
`[ id:'test', single_end:false ]` - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - vcf: - type: file - description: INPUT VCF file - pattern: "*.{vcf,vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test', single_end:false ]` + - "*.vcf.gz": + type: file + description: INPUT VCF file + pattern: "*.{vcf,vcf.gz}" - index: - type: file - description: Index of the input VCF file - pattern: "*.{tbi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test', single_end:false ]` + - "*.vcf.gz.tbi": + type: file + description: Index of the input VCF file + pattern: "*.{tbi}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@ramprasadn" maintainers: diff --git a/modules/nf-core/sentieon/dnascope/environment.yml b/modules/nf-core/sentieon/dnascope/environment.yml index e6da2dde3e..d7abf668ea 100644 --- a/modules/nf-core/sentieon/dnascope/environment.yml +++ b/modules/nf-core/sentieon/dnascope/environment.yml @@ -1,7 +1,5 @@ -name: sentieon_dnascope channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::sentieon=202308.02 + - bioconda::sentieon=202308.03 diff --git a/modules/nf-core/sentieon/dnascope/main.nf b/modules/nf-core/sentieon/dnascope/main.nf index 0671307ba0..bdeb62521a 100644 --- a/modules/nf-core/sentieon/dnascope/main.nf +++ b/modules/nf-core/sentieon/dnascope/main.nf @@ -3,12 +3,10 @@ process SENTIEON_DNASCOPE { label 'process_high' label 'sentieon' - secret 'SENTIEON_LICENSE_BASE64' - conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/sentieon:202308.02--h43eeafb_0' : - 'biocontainers/sentieon:202308.02--h43eeafb_0' }" + 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/a6/a64461f38d76bebea8e21441079e76e663e1168b0c59dafee6ee58440ad8c8ac/data' : + 'community.wave.seqera.io/library/sentieon:202308.03--59589f002351c221' }" input: tuple val(meta), path(bam), path(bai), path(intervals) @@ -32,15 +30,6 @@ process SENTIEON_DNASCOPE { task.ext.when == null || task.ext.when script: - // The following code sets LD_LIBRARY_PATH in the script-section when the module is run by Singularity. - // That turned out to be one way of overcoming the following issue with the Singularity-Sentieon-containers from galaxy, Sentieon (LD_LIBRARY_PATH) and the way Nextflow runs Singularity-containers. - // The galaxy container uses a runscript which is responsible for setting LD_PRELOAD properly. Nextflow executes singularity containers using `singularity exec`, which avoids the run script, leading to the LD_LIBRARY_PATH/libstdc++.so.6 error. - if (workflow.containerEngine in ['singularity','apptainer']) { - fix_ld_library_path = 'LD_LIBRARY_PATH=/usr/local/lib/:\$LD_LIBRARY_PATH;export LD_LIBRARY_PATH' - } else { - fix_ld_library_path = '' - } - def args = task.ext.args ?: '' // options for the driver def args2 = task.ext.args2 ?: '' // options for the vcf generation def args3 = task.ext.args3 ?: '' // options for the gvcf generation @@ -49,8 +38,6 @@ process SENTIEON_DNASCOPE { def model_cmd = ml_model ? " --model ${ml_model}" : '' def pcr_indel_model_cmd = pcr_indel_model ? 
" --pcr_indel_model ${pcr_indel_model}" : '' def prefix = task.ext.prefix ?: "${meta.id}" - def sentieon_auth_mech_base64 = task.ext.sentieon_auth_mech_base64 ?: '' - def sentieon_auth_data_base64 = task.ext.sentieon_auth_data_base64 ?: '' def vcf_cmd = "" def gvcf_cmd = "" def base_cmd = '--algo DNAscope ' + dbsnp_cmd + ' ' @@ -63,24 +50,11 @@ process SENTIEON_DNASCOPE { gvcf_cmd = base_cmd + args3 + ' ' + model_cmd + pcr_indel_model_cmd + ' --emit_mode gvcf ' + prefix + '.g.vcf.gz' } + def sentieonLicense = secrets.SENTIEON_LICENSE_BASE64 ? + "export SENTIEON_LICENSE=\$(mktemp);echo -e \"${secrets.SENTIEON_LICENSE_BASE64}\" | base64 -d > \$SENTIEON_LICENSE; " : + "" """ - if [ "\${#SENTIEON_LICENSE_BASE64}" -lt "1500" ]; then # If the string SENTIEON_LICENSE_BASE64 is short, then it is an encrypted url. - export SENTIEON_LICENSE=\$(echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d) - else # Localhost license file - # The license file is stored as a nextflow variable like, for instance, this: - # nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat | base64 -w 0) - export SENTIEON_LICENSE=\$(mktemp) - echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d > \$SENTIEON_LICENSE - fi - - if [ ${sentieon_auth_mech_base64} ] && [ ${sentieon_auth_data_base64} ]; then - # If sentieon_auth_mech_base64 and sentieon_auth_data_base64 are non-empty strings, then Sentieon is mostly likely being run with some test-license. - export SENTIEON_AUTH_MECH=\$(echo -n "${sentieon_auth_mech_base64}" | base64 -d) - export SENTIEON_AUTH_DATA=\$(echo -n "${sentieon_auth_data_base64}" | base64 -d) - echo "Decoded and exported Sentieon test-license system environment variables" - fi - - $fix_ld_library_path + $sentieonLicense sentieon driver $args -r $fasta -t $task.cpus -i $bam $interval $vcf_cmd $gvcf_cmd @@ -91,20 +65,8 @@ process SENTIEON_DNASCOPE { """ stub: - // The following code sets LD_LIBRARY_PATH in the script-section when the module is run by Singularity. 
- // That turned out to be one way of overcoming the following issue with the Singularity-Sentieon-containers from galaxy, Sentieon (LD_LIBRARY_PATH) and the way Nextflow runs Singularity-containers. - // The galaxy container uses a runscript which is responsible for setting LD_PRELOAD properly. Nextflow executes singularity containers using `singularity exec`, which avoids the run script, leading to the LD_LIBRARY_PATH/libstdc++.so.6 error. - if (workflow.containerEngine in ['singularity','apptainer']) { - fix_ld_library_path = 'LD_LIBRARY_PATH=/usr/local/lib/:\$LD_LIBRARY_PATH;export LD_LIBRARY_PATH' - } else { - fix_ld_library_path = '' - } - def prefix = task.ext.prefix ?: "${meta.id}" - """ - $fix_ld_library_path - touch ${prefix}.unfiltered.vcf.gz touch ${prefix}.unfiltered.vcf.gz.tbi touch ${prefix}.g.vcf.gz diff --git a/modules/nf-core/sentieon/dnascope/meta.yml b/modules/nf-core/sentieon/dnascope/meta.yml index 6b61cee828..e3e0eba8c8 100644 --- a/modules/nf-core/sentieon/dnascope/meta.yml +++ b/modules/nf-core/sentieon/dnascope/meta.yml @@ -1,5 +1,6 @@ name: sentieon_dnascope -description: DNAscope algorithm performs an improved version of Haplotype variant calling. +description: DNAscope algorithm performs an improved version of Haplotype variant + calling. keywords: - dnascope - sentieon @@ -11,109 +12,127 @@ tools: Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system. homepage: https://www.sentieon.com/ documentation: https://www.sentieon.com/ + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information. - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: BAM file. 
- pattern: "*.bam" - - bai: - type: file - description: BAI file - pattern: "*.bai" - - intervals: - type: file - description: bed or interval_list file containing interval in the reference that will be used in the analysis - pattern: "*.{bed,interval_list}" - - meta2: - type: map - description: | - Groovy Map containing meta information for fasta. - - fasta: - type: file - description: Genome fasta file - pattern: "*.{fa,fasta}" - - meta3: - type: map - description: | - Groovy Map containing meta information for fasta index. - - fai: - type: file - description: Index of the genome fasta file - pattern: "*.fai" - - meta4: - type: map - description: | - Groovy Map containing meta information for dbsnp. - - dbsnp: - type: file - description: Single Nucleotide Polymorphism database (dbSNP) file - pattern: "*.vcf.gz" - - meta5: - type: map - description: | - Groovy Map containing meta information for dbsnp_tbi. - - dbsnp_tbi: - type: file - description: Index of the Single Nucleotide Polymorphism database (dbSNP) file - pattern: "*.vcf.gz.tbi" - - meta6: - type: map - description: | - Groovy Map containing meta information for machine learning model for Dnascope. - - ml_model: - type: file - description: machine learning model file - pattern: "*.model" - - ml_model: - type: file - description: machine learning model file - pattern: "*.model" - - pcr_indel_model: - type: string - description: | - Controls the option pcr_indel_model for Dnascope. - The possible options are "NONE" (used for PCR free samples), and "HOSTILE", "AGGRESSIVE" and "CONSERVATIVE". - See Sentieons documentation for further explanation. - - emit_vcf: - type: string - description: | - Controls the vcf output from Dnascope. - Possible options are "all", "confident" and "variant". - See Sentieons documentation for further explanation. 
- - emit_gvcf: - type: boolean - description: If true, the haplotyper will output a gvcf + - - meta: + type: map + description: | + Groovy Map containing sample information. + e.g. [ id:'test', single_end:false ] + - bam: + type: file + description: BAM file. + pattern: "*.bam" + - bai: + type: file + description: BAI file + pattern: "*.bai" + - intervals: + type: file + description: bed or interval_list file containing interval in the reference + that will be used in the analysis + pattern: "*.{bed,interval_list}" + - - meta2: + type: map + description: | + Groovy Map containing meta information for fasta. + - fasta: + type: file + description: Genome fasta file + pattern: "*.{fa,fasta}" + - - meta3: + type: map + description: | + Groovy Map containing meta information for fasta index. + - fai: + type: file + description: Index of the genome fasta file + pattern: "*.fai" + - - meta4: + type: map + description: | + Groovy Map containing meta information for dbsnp. + - dbsnp: + type: file + description: Single Nucleotide Polymorphism database (dbSNP) file + pattern: "*.vcf.gz" + - - meta5: + type: map + description: | + Groovy Map containing meta information for dbsnp_tbi. + - dbsnp_tbi: + type: file + description: Index of the Single Nucleotide Polymorphism database (dbSNP) file + pattern: "*.vcf.gz.tbi" + - - meta6: + type: map + description: | + Groovy Map containing meta information for machine learning model for Dnascope. + - ml_model: + type: file + description: machine learning model file + pattern: "*.model" + - - pcr_indel_model: + type: string + description: | + Controls the option pcr_indel_model for Dnascope. + The possible options are "NONE" (used for PCR free samples), and "HOSTILE", "AGGRESSIVE" and "CONSERVATIVE". + See Sentieons documentation for further explanation. + - - emit_vcf: + type: string + description: | + Controls the vcf output from Dnascope. + Possible options are "all", "confident" and "variant". 
+ See Sentieons documentation for further explanation. + - - emit_gvcf: + type: boolean + description: If true, the haplotyper will output a gvcf output: - - meta: - type: map - description: | - Groovy Map containing reference information. - e.g. [ id:'test', single_end:false ] - vcf: - type: file - description: Compressed VCF file - pattern: "*.unfiltered.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - "*.unfiltered.vcf.gz": + type: file + description: Compressed VCF file + pattern: "*.unfiltered.vcf.gz" - vcf_tbi: - type: file - description: Index of VCF file - pattern: "*.unfiltered.vcf.gz.tbi" + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - "*.unfiltered.vcf.gz.tbi": + type: file + description: Index of VCF file + pattern: "*.unfiltered.vcf.gz.tbi" - gvcf: - type: file - description: Compressed GVCF file - pattern: "*.g.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - "*.g.vcf.gz": + type: file + description: Compressed GVCF file + pattern: "*.g.vcf.gz" - gvcf_tbi: - type: file - description: Index of GVCF file - pattern: "*.g.vcf.gz.tbi" + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. 
[ id:'test', single_end:false ] + - "*.g.vcf.gz.tbi": + type: file + description: Index of GVCF file + pattern: "*.g.vcf.gz.tbi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@ramprasadn" maintainers: diff --git a/modules/nf-core/sentieon/gvcftyper/environment.yml b/modules/nf-core/sentieon/gvcftyper/environment.yml index 732e2ca846..d7abf668ea 100644 --- a/modules/nf-core/sentieon/gvcftyper/environment.yml +++ b/modules/nf-core/sentieon/gvcftyper/environment.yml @@ -1,7 +1,5 @@ -name: sentieon_gvcftyper channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::sentieon=202308.02 + - bioconda::sentieon=202308.03 diff --git a/modules/nf-core/sentieon/gvcftyper/main.nf b/modules/nf-core/sentieon/gvcftyper/main.nf index 7539214507..6817c6dbe2 100644 --- a/modules/nf-core/sentieon/gvcftyper/main.nf +++ b/modules/nf-core/sentieon/gvcftyper/main.nf @@ -3,19 +3,17 @@ process SENTIEON_GVCFTYPER { label 'process_high' label 'sentieon' - secret 'SENTIEON_LICENSE_BASE64' - conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/sentieon:202308.02--h43eeafb_0' : - 'biocontainers/sentieon:202308.02--h43eeafb_0' }" + 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/a6/a64461f38d76bebea8e21441079e76e663e1168b0c59dafee6ee58440ad8c8ac/data' : + 'community.wave.seqera.io/library/sentieon:202308.03--59589f002351c221' }" input: tuple val(meta), path(gvcfs), path(tbis), path(intervals) - path fasta - path fai - path dbsnp - path dbsnp_tbi + tuple val(meta1), path(fasta) + tuple val(meta2), path(fai) + tuple val(meta3), path(dbsnp) + tuple val(meta4), path(dbsnp_tbi) output: tuple val(meta), path("*.vcf.gz") , emit: vcf_gz @@ -26,39 +24,14 @@ process SENTIEON_GVCFTYPER { task.ext.when == null || task.ext.when script: - // The following code sets LD_LIBRARY_PATH in the script-section when the module is run by Singularity. - // That turned out to be one way of overcoming the following issue with the Singularity-Sentieon-containers from galaxy, Sentieon (LD_LIBRARY_PATH) and the way Nextflow runs Singularity-containers. - // The galaxy container uses a runscript which is responsible for setting LD_PRELOAD properly. Nextflow executes singularity containers using `singularity exec`, which avoids the run script, leading to the LD_LIBRARY_PATH/libstdc++.so.6 error. - if (workflow.containerEngine in ['singularity','apptainer']) { - fix_ld_library_path = 'LD_LIBRARY_PATH=/usr/local/lib/:\$LD_LIBRARY_PATH;export LD_LIBRARY_PATH' - } else { - fix_ld_library_path = '' - } - def prefix = task.ext.prefix ?: "${meta.id}" - def sentieon_auth_mech_base64 = task.ext.sentieon_auth_mech_base64 ?: '' - def sentieon_auth_data_base64 = task.ext.sentieon_auth_data_base64 ?: '' def gvcfs_input = '-v ' + gvcfs.join(' -v ') def dbsnp_cmd = dbsnp ? "--dbsnp $dbsnp" : "" - + def sentieonLicense = secrets.SENTIEON_LICENSE_BASE64 ? 
+ "export SENTIEON_LICENSE=\$(mktemp);echo -e \"${secrets.SENTIEON_LICENSE_BASE64}\" | base64 -d > \$SENTIEON_LICENSE; " : + "" """ - if [ "\${#SENTIEON_LICENSE_BASE64}" -lt "1500" ]; then # If the string SENTIEON_LICENSE_BASE64 is short, then it is an encrypted url. - export SENTIEON_LICENSE=\$(echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d) - else # Localhost license file - # The license file is stored as a nextflow variable like, for instance, this: - # nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat | base64 -w 0) - export SENTIEON_LICENSE=\$(mktemp) - echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d > \$SENTIEON_LICENSE - fi - - if [ ${sentieon_auth_mech_base64} ] && [ ${sentieon_auth_data_base64} ]; then - # If sentieon_auth_mech_base64 and sentieon_auth_data_base64 are non-empty strings, then Sentieon is mostly likely being run with some test-license. - export SENTIEON_AUTH_MECH=\$(echo -n "${sentieon_auth_mech_base64}" | base64 -d) - export SENTIEON_AUTH_DATA=\$(echo -n "${sentieon_auth_data_base64}" | base64 -d) - echo "Decoded and exported Sentieon test-license system environment variables" - fi - - $fix_ld_library_path + $sentieonLicense sentieon driver -r ${fasta} --algo GVCFtyper ${gvcfs_input} ${dbsnp_cmd} ${prefix}.vcf.gz @@ -69,21 +42,9 @@ process SENTIEON_GVCFTYPER { """ stub: - // The following code sets LD_LIBRARY_PATH in the script-section when the module is run by Singularity. - // That turned out to be one way of overcoming the following issue with the Singularity-Sentieon-containers from galaxy, Sentieon (LD_LIBRARY_PATH) and the way Nextflow runs Singularity-containers. - // The galaxy container uses a runscript which is responsible for setting LD_PRELOAD properly. Nextflow executes singularity containers using `singularity exec`, which avoids the run script, leading to the LD_LIBRARY_PATH/libstdc++.so.6 error. 
- if (workflow.containerEngine in ['singularity','apptainer']) { - fix_ld_library_path = 'LD_LIBRARY_PATH=/usr/local/lib/:\$LD_LIBRARY_PATH;export LD_LIBRARY_PATH' - } else { - fix_ld_library_path = '' - } - def prefix = task.ext.prefix ?: "${meta.id}" - """ - $fix_ld_library_path - - touch ${prefix}.vcf.gz + echo "" | gzip >${prefix}.vcf.gz touch ${prefix}.vcf.gz.tbi cat <<-END_VERSIONS > versions.yml diff --git a/modules/nf-core/sentieon/gvcftyper/meta.yml b/modules/nf-core/sentieon/gvcftyper/meta.yml index 5a83eb0308..f022553711 100644 --- a/modules/nf-core/sentieon/gvcftyper/meta.yml +++ b/modules/nf-core/sentieon/gvcftyper/meta.yml @@ -12,59 +12,89 @@ tools: Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system. homepage: https://www.sentieon.com/ documentation: https://www.sentieon.com/ + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - gvcfs: - type: file - description: | - gVCF(.gz) file - pattern: "*.{vcf,vcf.gz}" - - tbis: - type: file - description: | - index of gvcf file - pattern: "*.tbi" - - intervals: - type: file - description: Interval file with the genomic regions included in the library (optional) - - fasta: - type: file - description: Reference fasta file - pattern: "*.fasta" - - fai: - type: file - description: Reference fasta index file - pattern: "*.fai" - - dbsnp: - type: file - description: dbSNP VCF file - pattern: "*.vcf.gz" - - dbsnp_tbi: - type: file - description: dbSNP VCF index file - pattern: "*.tbi" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - gvcfs: + type: file + description: | + gVCF(.gz) file + pattern: "*.{vcf,vcf.gz}" + - tbis: + type: file + description: | + index of gvcf file + pattern: "*.tbi" + - intervals: + type: file + description: Interval file with the genomic regions included in the library + (optional) + - - meta1: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: Reference fasta file + pattern: "*.fasta" + - - meta2: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fai: + type: file + description: Reference fasta index file + pattern: "*.fai" + - - meta3: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - dbsnp: + type: file + description: dbSNP VCF file + pattern: "*.vcf.gz" + - - meta4: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - dbsnp_tbi: + type: file + description: dbSNP VCF index file + pattern: "*.tbi" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: Genotyped VCF file - pattern: "*.vcf.gz" - - tbi: - type: file - description: Tbi index for VCF file - pattern: "*.vcf.gz" + - vcf_gz: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.vcf.gz": + type: file + description: VCF file + pattern: "*.vcf.gz" + - vcf_gz_tbi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.vcf.gz.tbi": + type: file + description: VCF index file + pattern: "*.vcf.gz.tbi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@asp8200" maintainers: diff --git a/modules/nf-core/sentieon/gvcftyper/tests/main.nf.test b/modules/nf-core/sentieon/gvcftyper/tests/main.nf.test new file mode 100644 index 0000000000..0f66feb681 --- /dev/null +++ b/modules/nf-core/sentieon/gvcftyper/tests/main.nf.test @@ -0,0 +1,212 @@ +nextflow_process { + + name "Test Process SENTIEON_GVCFTYPER" + script "../main.nf" + process "SENTIEON_GVCFTYPER" + config "./nextflow.config" + + tag "modules" + tag "modules_nfcore" + tag "sentieon" + tag "sentieon/gvcftyper" + + test("sentieon gvcftyper vcf") { + + when { + process { + """ + input[0] = [ [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics//homo_sapiens/illumina/gvcf/test.genome.vcf.idx', checkIfExists: true), + [] + ] + + input[1] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true)] + input[3] = [[:], []] + input[4] = [[:], []] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.versions, + file(process.out.vcf_gz_tbi.get(0).get(1)).name, + path(process.out.vcf_gz[0][1]).vcf.variantsMD5 + ).match() + } + + ) + } + + } + + test("sentieon gvcftyper vcf.gz") { + + when { + process { + """ + input[0] = [ [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', 
checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics//homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true), + [] + ] + + input[1] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true)] + input[3] = [[:], []] + input[4] = [[:], []] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.versions, + file(process.out.vcf_gz_tbi.get(0).get(1)).name, + path(process.out.vcf_gz[0][1]).vcf.variantsMD5 + ).match() + } + + ) + } + + } + + test("sentieon gvcftyper dbsnp") { + + when { + process { + """ + input[0] = [ [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics//homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true), + [] + ] + + input[1] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true)] + input[3] = [[id: 'test'], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.vcf.gz', checkIfExists: true)] + input[4] = [[id: 'test'], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.vcf.gz.tbi', checkIfExists: true)] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.versions, + file(process.out.vcf_gz_tbi.get(0).get(1)).name, + path(process.out.vcf_gz[0][1]).vcf.variantsMD5 + ).match() + } + + ) + } + + } + + test("sentieon gvcftyper intervals") { + + when { + process { + """ + 
input[0] = [ [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics//homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true) + ] + + input[1] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true)] + input[3] = [[:], []] + input[4] = [[:], []] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.versions, + file(process.out.vcf_gz_tbi.get(0).get(1)).name, + path(process.out.vcf_gz[0][1]).vcf.variantsMD5 + ).match() + } + + ) + } + + } + + test("sentieon gvcftyper dbsnp intervals") { + + when { + process { + """ + input[0] = [ [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gvcf/test.genome.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics//homo_sapiens/illumina/gvcf/test.genome.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true) + ] + + input[1] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true)] + input[3] = [[id: 'test'], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.vcf.gz', checkIfExists: true)] + input[4] = [[id: 'test'], file(params.modules_testdata_base_path + 
'genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.vcf.gz.tbi', checkIfExists: true)] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.versions, + file(process.out.vcf_gz_tbi.get(0).get(1)).name, + path(process.out.vcf_gz[0][1]).vcf.variantsMD5 + ).match() + } + + ) + } + + } + + test("sentieon gvcftyper - stub") { + + options "-stub" + when { + process { + """ + input[0] = [ [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), + [] // no intervals + ] + input[1] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true)] + input[3] = [[:],[]] + input[4] = [[:],[]] + """ + } + } + + then { + + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + + } + } + +} diff --git a/modules/nf-core/sentieon/gvcftyper/tests/main.nf.test.snap b/modules/nf-core/sentieon/gvcftyper/tests/main.nf.test.snap new file mode 100644 index 0000000000..627b62fd87 --- /dev/null +++ b/modules/nf-core/sentieon/gvcftyper/tests/main.nf.test.snap @@ -0,0 +1,121 @@ +{ + "sentieon gvcftyper dbsnp": { + "content": [ + [ + "versions.yml:md5,03a2696e8be5117cccfe48a9bfd8c68a" + ], + "test.genotyped.vcf.gz.tbi", + "21606383c760bf676d4c1f747b97d118" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:29:01.102534193" + }, + "sentieon gvcftyper dbsnp intervals": { + "content": [ + [ + "versions.yml:md5,03a2696e8be5117cccfe48a9bfd8c68a" + ], + "test.genotyped.vcf.gz.tbi", + "21606383c760bf676d4c1f747b97d118" + ], + "meta": { + 
"nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:29:20.933217951" + }, + "sentieon gvcftyper vcf.gz": { + "content": [ + [ + "versions.yml:md5,03a2696e8be5117cccfe48a9bfd8c68a" + ], + "test.genotyped.vcf.gz.tbi", + "d13216836f1452e200b215b796606671" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:28:50.937002394" + }, + "sentieon gvcftyper intervals": { + "content": [ + [ + "versions.yml:md5,03a2696e8be5117cccfe48a9bfd8c68a" + ], + "test.genotyped.vcf.gz.tbi", + "d13216836f1452e200b215b796606671" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:29:11.029924476" + }, + "sentieon gvcftyper - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.genotyped.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test" + }, + "test.genotyped.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + "versions.yml:md5,03a2696e8be5117cccfe48a9bfd8c68a" + ], + "vcf_gz": [ + [ + { + "id": "test" + }, + "test.genotyped.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "vcf_gz_tbi": [ + [ + { + "id": "test" + }, + "test.genotyped.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,03a2696e8be5117cccfe48a9bfd8c68a" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:29:30.788262037" + }, + "sentieon gvcftyper vcf": { + "content": [ + [ + "versions.yml:md5,03a2696e8be5117cccfe48a9bfd8c68a" + ], + "test.genotyped.vcf.gz.tbi", + "d13216836f1452e200b215b796606671" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:28:41.276698125" + } +} \ No newline at end of file diff --git a/modules/nf-core/sentieon/gvcftyper/tests/nextflow.config b/modules/nf-core/sentieon/gvcftyper/tests/nextflow.config new file mode 100644 index 0000000000..1561a7c4d2 --- /dev/null 
+++ b/modules/nf-core/sentieon/gvcftyper/tests/nextflow.config @@ -0,0 +1,15 @@ +env { + // NOTE This is how pipeline users will use Sentieon in real world use + SENTIEON_LICENSE = "$SENTIEON_LICSRVR_IP" + // NOTE This should only happen in GitHub actions or nf-core MegaTests + SENTIEON_AUTH_MECH = "$SENTIEON_AUTH_MECH" + SENTIEON_AUTH_DATA = secrets.SENTIEON_AUTH_DATA + // NOTE This is how pipeline users will test out Sentieon with a license file + // nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat | base64 -w 0) +} + +process { + withName: SENTIEON_GVCFTYPER { + ext.prefix = { "${meta.id}.genotyped" } + } +} diff --git a/modules/nf-core/sentieon/haplotyper/environment.yml b/modules/nf-core/sentieon/haplotyper/environment.yml index 89108f8e8b..d7abf668ea 100644 --- a/modules/nf-core/sentieon/haplotyper/environment.yml +++ b/modules/nf-core/sentieon/haplotyper/environment.yml @@ -1,7 +1,5 @@ -name: sentieon_haplotyper channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::sentieon=202308.02 + - bioconda::sentieon=202308.03 diff --git a/modules/nf-core/sentieon/haplotyper/main.nf b/modules/nf-core/sentieon/haplotyper/main.nf index 16eb775744..a04b342caf 100644 --- a/modules/nf-core/sentieon/haplotyper/main.nf +++ b/modules/nf-core/sentieon/haplotyper/main.nf @@ -3,19 +3,17 @@ process SENTIEON_HAPLOTYPER { label 'process_medium' label 'sentieon' - secret 'SENTIEON_LICENSE_BASE64' - conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/sentieon:202308.02--h43eeafb_0' : - 'biocontainers/sentieon:202308.02--h43eeafb_0' }" + 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/a6/a64461f38d76bebea8e21441079e76e663e1168b0c59dafee6ee58440ad8c8ac/data' : + 'community.wave.seqera.io/library/sentieon:202308.03--59589f002351c221' }" input: - tuple val(meta), path(input), path(input_index), path(intervals) - path fasta - path fai - path dbsnp - path dbsnp_tbi + tuple val(meta), path(input), path(input_index), path(intervals), path(recal_table) + tuple val(meta1), path(fasta) + tuple val(meta2), path(fai) + tuple val(meta3), path(dbsnp) + tuple val(meta4), path(dbsnp_tbi) val(emit_vcf) val(emit_gvcf) @@ -30,55 +28,46 @@ process SENTIEON_HAPLOTYPER { task.ext.when == null || task.ext.when script: - // The following code sets LD_LIBRARY_PATH in the script-section when the module is run by Singularity. - // That turned out to be one way of overcoming the following issue with the Singularity-Sentieon-containers from galaxy, Sentieon (LD_LIBRARY_PATH) and the way Nextflow runs Singularity-containers. - // The galaxy container uses a runscript which is responsible for setting LD_PRELOAD properly. Nextflow executes singularity containers using `singularity exec`, which avoids the run script, leading to the LD_LIBRARY_PATH/libstdc++.so.6 error. - if (workflow.containerEngine in ['singularity','apptainer']) { - fix_ld_library_path = 'LD_LIBRARY_PATH=/usr/local/lib/:\$LD_LIBRARY_PATH;export LD_LIBRARY_PATH' - } else { - fix_ld_library_path = '' - } - def args = task.ext.args ?: '' // options for the driver def args2 = task.ext.args2 ?: '' // options for the vcf generation def args3 = task.ext.args3 ?: '' // options for the gvcf generation def prefix = task.ext.prefix ?: "${meta.id}" - def dbsnp_command = dbsnp ? "-d $dbsnp " : "" - def interval_command = intervals ? 
"--interval $intervals" : "" - def sentieon_auth_mech_base64 = task.ext.sentieon_auth_mech_base64 ?: '' - def sentieon_auth_data_base64 = task.ext.sentieon_auth_data_base64 ?: '' - def vcf_cmd = "" - def gvcf_cmd = "" + def input_list = input instanceof List ? input.collect{"-i $it"}.join(' ') : "-i $input" + def dbsnp_command = dbsnp ? "-d $dbsnp " : "" + def interval_command = intervals ? "--interval $intervals" : "" + def recal_table_command = recal_table ? "-q $recal_table" : "" def base_cmd = '--algo Haplotyper ' + dbsnp_command - if (emit_vcf) { // emit_vcf can be the empty string, 'variant', 'confident' or 'all' but NOT 'gvcf' - vcf_cmd = base_cmd + args2 + ' --emit_mode ' + emit_vcf + ' ' + prefix + '.unfiltered.vcf.gz' - } - - if (emit_gvcf) { // emit_gvcf can be either true or false - gvcf_cmd = base_cmd + args3 + ' --emit_mode gvcf ' + prefix + '.g.vcf.gz' - } - - """ - if [ "\${#SENTIEON_LICENSE_BASE64}" -lt "1500" ]; then # If the string SENTIEON_LICENSE_BASE64 is short, then it is an encrypted url. - export SENTIEON_LICENSE=\$(echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d) - else # Localhost license file - # The license file is stored as a nextflow variable like, for instance, this: - # nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat | base64 -w 0) - export SENTIEON_LICENSE=\$(mktemp) - echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d > \$SENTIEON_LICENSE - fi + // The Sentieon --algo Haplotyper can create a VCF or gVCF but not both + // Luckily, we can run it twice while reading the BAM once, therefore we construct the two separate commands + // and run them twice while using the sentieon driver once. This allows us to create both types of VCF indels + // one process - if [ ${sentieon_auth_mech_base64} ] && [ ${sentieon_auth_data_base64} ]; then - # If sentieon_auth_mech_base64 and sentieon_auth_data_base64 are non-empty strings, then Sentieon is mostly likely being run with some test-license. 
- export SENTIEON_AUTH_MECH=\$(echo -n "${sentieon_auth_mech_base64}" | base64 -d) - export SENTIEON_AUTH_DATA=\$(echo -n "${sentieon_auth_data_base64}" | base64 -d) - echo "Decoded and exported Sentieon test-license system environment variables" - fi + // Create VCF command to export a VCF + def vcf_cmd = emit_vcf ? + base_cmd + args2 + ' --emit_mode ' + emit_vcf + ' ' + prefix + '.unfiltered.vcf.gz' : + "" - $fix_ld_library_path + // Create a gVCF command to export a gVCF + def gvcf_cmd = emit_gvcf ? + gvcf_cmd = base_cmd + args3 + ' --emit_mode gvcf ' + prefix + '.g.vcf.gz' : + "" - sentieon driver $args -r $fasta -t $task.cpus -i $input $interval_command $vcf_cmd $gvcf_cmd + def sentieonLicense = secrets.SENTIEON_LICENSE_BASE64 ? + "export SENTIEON_LICENSE=\$(mktemp);echo -e \"${secrets.SENTIEON_LICENSE_BASE64}\" | base64 -d > \$SENTIEON_LICENSE; " : + "" + """ + $sentieonLicense + + sentieon driver \\ + $args \\ + -r $fasta \\ + -t $task.cpus \\ + $interval_command \\ + ${input_list} \\ + $recal_table_command \\ + $vcf_cmd \\ + $gvcf_cmd cat <<-END_VERSIONS > versions.yml "${task.process}": @@ -87,22 +76,11 @@ process SENTIEON_HAPLOTYPER { """ stub: - // The following code sets LD_LIBRARY_PATH in the script-section when the module is run by Singularity. - // That turned out to be one way of overcoming the following issue with the Singularity-Sentieon-containers from galaxy, Sentieon (LD_LIBRARY_PATH) and the way Nextflow runs Singularity-containers. - // The galaxy container uses a runscript which is responsible for setting LD_PRELOAD properly. Nextflow executes singularity containers using `singularity exec`, which avoids the run script, leading to the LD_LIBRARY_PATH/libstdc++.so.6 error. 
- if (workflow.containerEngine in ['singularity','apptainer']) { - fix_ld_library_path = 'LD_LIBRARY_PATH=/usr/local/lib/:\$LD_LIBRARY_PATH;export LD_LIBRARY_PATH' - } else { - fix_ld_library_path = '' - } - def prefix = task.ext.prefix ?: "${meta.id}" """ - $fix_ld_library_path - - touch ${prefix}.unfiltered.vcf.gz + echo "" | gzip > ${prefix}.unfiltered.vcf.gz touch ${prefix}.unfiltered.vcf.gz.tbi - touch ${prefix}.g.vcf.gz + echo "" | gzip > ${prefix}.g.vcf.gz touch ${prefix}.g.vcf.gz.tbi cat <<-END_VERSIONS > versions.yml diff --git a/modules/nf-core/sentieon/haplotyper/meta.yml b/modules/nf-core/sentieon/haplotyper/meta.yml index c248db3fca..ee0e6152f0 100644 --- a/modules/nf-core/sentieon/haplotyper/meta.yml +++ b/modules/nf-core/sentieon/haplotyper/meta.yml @@ -11,73 +11,117 @@ tools: Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system. homepage: https://www.sentieon.com/ documentation: https://www.sentieon.com/ + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing reference information. - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM file from alignment - pattern: "*.{bam,cram}" - - input_index: - type: file - description: BAI/CRAI file from alignment - pattern: "*.{bai,crai}" - - intervals: - type: file - description: Bed file with the genomic regions included in the library (optional) - - fasta: - type: file - description: Genome fasta file - pattern: "*.{fa,fasta}" - - fai: - type: file - description: The index of the FASTA reference. - pattern: "*.fai" - - dbsnp: - type: file - description: VCF file containing known sites (optional) - - dbsnp_tbi: - type: file - description: VCF index of dbsnp (optional) - - emit_vcf: - type: string - description: | - Controls the vcf output from the haplotyper. 
- If emit_vcf is set to "all" then the haplotyper will output a vcf generated by the haplotyper in emit-mode "all". - If emit_vcf is set to "confident" then the haplotyper will output a vcf generated by the haplotyper in emit-mode "confident". - If emit_vcf is set to "variant" then the haplotyper will output a vcf generated by the haplotyper in emit_mode "confident". - - emit_gvcf: - type: boolean - description: If true, the haplotyper will output a gvcf + - - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM file from alignment + pattern: "*.{bam,cram}" + - input_index: + type: file + description: BAI/CRAI file from alignment + pattern: "*.{bai,crai}" + - intervals: + type: file + description: Bed file with the genomic regions included in the library (optional) + - recal_table: + type: file + description: Recalibration table from sentieon/qualcal (optional) + - - meta1: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: Genome fasta file + pattern: "*.{fa,fasta}" + - - meta2: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fai: + type: file + description: The index of the FASTA reference. + pattern: "*.fai" + - - meta3: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - dbsnp: + type: file + description: VCF file containing known sites (optional) + - - meta4: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - dbsnp_tbi: + type: file + description: VCF index of dbsnp (optional) + - - emit_vcf: + type: string + description: | + Controls the vcf output from the haplotyper. 
+ If emit_vcf is set to "all" then the haplotyper will output a vcf generated by the haplotyper in emit-mode "all". + If emit_vcf is set to "confident" then the haplotyper will output a vcf generated by the haplotyper in emit-mode "confident". + If emit_vcf is set to "variant" then the haplotyper will output a vcf generated by the haplotyper in emit_mode "confident". + - - emit_gvcf: + type: boolean + description: If true, the haplotyper will output a gvcf output: - - meta: - type: map - description: | - Groovy Map containing reference information. - e.g. [ id:'test', single_end:false ] - vcf: - type: file - description: Compressed VCF file - pattern: "*.unfiltered.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - "*.unfiltered.vcf.gz": + type: file + description: Compressed VCF file + pattern: "*.unfiltered.vcf.gz" - vcf_tbi: - type: file - description: Index of VCF file - pattern: "*.unfiltered.vcf.gz.tbi" + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - "*.unfiltered.vcf.gz.tbi": + type: file + description: Index of VCF file + pattern: "*.unfiltered.vcf.gz.tbi" - gvcf: - type: file - description: Compressed GVCF file - pattern: "*.g.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. [ id:'test', single_end:false ] + - "*.g.vcf.gz": + type: file + description: Compressed GVCF file + pattern: "*.g.vcf.gz" - gvcf_tbi: - type: file - description: Index of GVCF file - pattern: "*.g.vcf.gz.tbi" + - meta: + type: map + description: | + Groovy Map containing reference information. + e.g. 
[ id:'test', single_end:false ] + - "*.g.vcf.gz.tbi": + type: file + description: Index of GVCF file + pattern: "*.g.vcf.gz.tbi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@asp8200" maintainers: diff --git a/modules/nf-core/sentieon/haplotyper/tests/main.nf.test b/modules/nf-core/sentieon/haplotyper/tests/main.nf.test new file mode 100644 index 0000000000..c06ed17597 --- /dev/null +++ b/modules/nf-core/sentieon/haplotyper/tests/main.nf.test @@ -0,0 +1,327 @@ +nextflow_process { + + name "Test Process SENTIEON_HAPLOTYPER" + script "../main.nf" + process "SENTIEON_HAPLOTYPER" + config "./nextflow.config" + + tag "modules" + tag "modules_nfcore" + tag "sentieon" + tag "sentieon/haplotyper" + tag "sentieon/qualcal" + + test("Sentieon Haplotyper VCF") { + + when { + process { + """ + input[0] = [ [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), + [], // no intervals + [] // no recal table + ] + input[1] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true)] + input[3] = [[:],[]] + input[4] = [[:],[]] + input[5] = 'variant' + input[6] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.versions, + file(process.out.vcf_tbi.get(0).get(1)).name, + path(process.out.vcf[0][1]).vcf.variantsMD5 + ).match() + } + ) + } + + } + + test("Sentieon Haplotyper GVCF") { + + when { + process { + """ + input[0] 
= [ [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), + [], // no intervals + [] // no recal table + ] + input[1] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true)] + input[3] = [[:],[]] + input[4] = [[:],[]] + input[5] = '' + input[6] = true + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.versions, + file(process.out.gvcf_tbi.get(0).get(1)).name, + path(process.out.gvcf[0][1]).vcf.variantsMD5 + ).match() + } + ) + } + + } + + test("Sentieon Haplotyper BOTH") { + + when { + process { + """ + input[0] = [ [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), + [], // no intervals + [] // no recal table + ] + input[1] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true)] + input[3] = [[:],[]] + input[4] = [[:],[]] + input[5] = 'variant' + input[6] = true + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.versions, + file(process.out.gvcf_tbi.get(0).get(1)).name, + path(process.out.gvcf[0][1]).vcf.variantsMD5, + file(process.out.vcf_tbi.get(0).get(1)).name, + 
path(process.out.vcf[0][1]).vcf.variantsMD5 + ).match() + } + ) + } + + } + + test("Sentieon Haplotyper Intervals BOTH") { + + when { + process { + """ + input[0] = [ [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true), + [] + ] + input[1] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true)] + input[3] = [[:],[]] + input[4] = [[:],[]] + input[5] = 'variant' + input[6] = true + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.versions, + file(process.out.gvcf_tbi.get(0).get(1)).name, + path(process.out.gvcf[0][1]).vcf.variantsMD5, + file(process.out.vcf_tbi.get(0).get(1)).name, + path(process.out.vcf[0][1]).vcf.variantsMD5 + ).match() + } + ) + } + + } + + test("Sentieon Haplotyper DBSNP BOTH") { + + when { + process { + """ + input[0] = [ [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true), + [] + ] + input[1] = [[id: 'test'], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'], file(params.modules_testdata_base_path + 
'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true)] + input[3] = [[id: 'test'], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.vcf.gz', checkIfExists: true)] + input[4] = [[id: 'test'], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.vcf.gz.tbi', checkIfExists: true)] + input[5] = 'variant' + input[6] = true + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.versions, + file(process.out.gvcf_tbi.get(0).get(1)).name, + path(process.out.gvcf[0][1]).vcf.variantsMD5, + file(process.out.vcf_tbi.get(0).get(1)).name, + path(process.out.vcf[0][1]).vcf.variantsMD5 + ).match() + } + ) + } + } + + test("Sentieon Haplotyper Recalibration") { + + setup { + run("SENTIEON_QUALCAL") { + script "../../qualcal/main.nf" + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true) + ] + input[1] = [ [ id:'fasta' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) ] + input[2] = [ [ id:'fasta' ], file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true) ] + + input[3] = [ [ id:'knownSites' ],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.vcf.gz', checkIfExists: true) ] + input[4] = [ [ id:'knownSites' ],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.vcf.gz.tbi', checkIfExists: true) ] + input[5] = [[:],[]] + input[6] = false + """ + } + } + } + + when { + process { + """ + recal_table = SENTIEON_QUALCAL.out.table + bam = Channel.of([ [ id:'test' ], // meta map + 
file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), + [] // no intervals + + ]) + input[0] = bam.join(recal_table) + input[1] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true)] + input[3] = [[:],[]] + input[4] = [[:],[]] + input[5] = 'variant' + input[6] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.versions, + file(process.out.vcf_tbi.get(0).get(1)).name, + path(process.out.vcf[0][1]).vcf.variantsMD5 + ).match() + } + ) + } + + } + + test("Sentieon Haplotyper multiple CRAMs") { + + when { + process { + """ + input[0] = [ + [ id:'test' ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.sorted.bam', checkIfExists: true), + ], + [ + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test2.paired_end.sorted.bam.bai', checkIfExists: true), + ], + [], // no intervals + [] // no recal table + ] + input[1] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true)] + input[3] = [[:],[]] + input[4] = [[:],[]] + input[5] = 'variant' + 
input[6] = false + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.versions, + file(process.out.vcf_tbi.get(0).get(1)).name, + path(process.out.vcf[0][1]).vcf.variantsMD5 + ).match() + } + ) + } + + } + + test("Sentieon Haplotyper - stub") { + + options "-stub" + when { + process { + """ + input[0] = [ [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai', checkIfExists: true), + [], // no intervals + [] // no recal table + ] + input[1] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true)] + input[2] = [[id: 'test'],file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true)] + input[3] = [[:],[]] + input[4] = [[:],[]] + input[5] = 'variant' + input[6] = true + """ + } + } + + then { + + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + + } + } + +} diff --git a/modules/nf-core/sentieon/haplotyper/tests/main.nf.test.snap b/modules/nf-core/sentieon/haplotyper/tests/main.nf.test.snap new file mode 100644 index 0000000000..0527f0fcbe --- /dev/null +++ b/modules/nf-core/sentieon/haplotyper/tests/main.nf.test.snap @@ -0,0 +1,187 @@ +{ + "Sentieon Haplotyper VCF": { + "content": [ + [ + "versions.yml:md5,1a7b41acc44d0724c8dca247e6323877" + ], + "test.unfiltered.vcf.gz.tbi", + "cea0045051da7877b38a1e25df812a91" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:29:42.675527558" + }, + "Sentieon Haplotyper Recalibration": { + "content": [ + [ + "versions.yml:md5,1a7b41acc44d0724c8dca247e6323877" + ], + "test.unfiltered.vcf.gz.tbi", + "10faa3b669c49826098e09784d8a4716" + ], + "meta": { + "nf-test": 
"0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:30:38.708688756" + }, + "Sentieon Haplotyper GVCF": { + "content": [ + [ + "versions.yml:md5,1a7b41acc44d0724c8dca247e6323877" + ], + "test.g.vcf.gz.tbi", + "338fc3c37b208d6595948576833eb665" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:29:53.99302993" + }, + "Sentieon Haplotyper BOTH": { + "content": [ + [ + "versions.yml:md5,1a7b41acc44d0724c8dca247e6323877" + ], + "test.g.vcf.gz.tbi", + "338fc3c37b208d6595948576833eb665", + "test.unfiltered.vcf.gz.tbi", + "cea0045051da7877b38a1e25df812a91" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:30:03.323463525" + }, + "Sentieon Haplotyper DBSNP BOTH": { + "content": [ + [ + "versions.yml:md5,1a7b41acc44d0724c8dca247e6323877" + ], + "test.g.vcf.gz.tbi", + "228556b7921205f023fec51098feeb97", + "test.unfiltered.vcf.gz.tbi", + "cc1f3d4bd615f3640e7fd103cc39d2f8" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:30:25.917634004" + }, + "Sentieon Haplotyper Intervals BOTH": { + "content": [ + [ + "versions.yml:md5,1a7b41acc44d0724c8dca247e6323877" + ], + "test.g.vcf.gz.tbi", + "338fc3c37b208d6595948576833eb665", + "test.unfiltered.vcf.gz.tbi", + "cea0045051da7877b38a1e25df812a91" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:30:14.249175276" + }, + "Sentieon Haplotyper - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.unfiltered.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test" + }, + "test.unfiltered.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "2": [ + [ + { + "id": "test" + }, + "test.g.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "3": [ + [ + { + "id": "test" + }, + "test.g.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + 
"versions.yml:md5,1a7b41acc44d0724c8dca247e6323877" + ], + "gvcf": [ + [ + { + "id": "test" + }, + "test.g.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "gvcf_tbi": [ + [ + { + "id": "test" + }, + "test.g.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "vcf": [ + [ + { + "id": "test" + }, + "test.unfiltered.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "vcf_tbi": [ + [ + { + "id": "test" + }, + "test.unfiltered.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,1a7b41acc44d0724c8dca247e6323877" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:30:56.435076872" + }, + "Sentieon Haplotyper multiple CRAMs": { + "content": [ + [ + "versions.yml:md5,1a7b41acc44d0724c8dca247e6323877" + ], + "test.unfiltered.vcf.gz.tbi", + "b5d6e09e336438e38f7bf5531799e3a" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-02T10:30:49.266709749" + } +} \ No newline at end of file diff --git a/modules/nf-core/sentieon/haplotyper/tests/nextflow.config b/modules/nf-core/sentieon/haplotyper/tests/nextflow.config new file mode 100644 index 0000000000..78f19cfb26 --- /dev/null +++ b/modules/nf-core/sentieon/haplotyper/tests/nextflow.config @@ -0,0 +1,16 @@ +env { + // NOTE This is how pipeline users will use Sentieon in real world use + SENTIEON_LICENSE = "$SENTIEON_LICSRVR_IP" + // NOTE This should only happen in GitHub actions or nf-core MegaTests + SENTIEON_AUTH_MECH = "$SENTIEON_AUTH_MECH" + SENTIEON_AUTH_DATA = secrets.SENTIEON_AUTH_DATA + // NOTE This is how pipeline users will test out Sentieon with a license file + // nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat | base64 -w 0) +} + +process { + withName: 'SENTIEON_HAPLOTYPER' { + ext.args2 = "--genotype_model multinomial" + ext.args3 = "--genotype_model multinomial" + } +} diff --git a/modules/nf-core/sentieon/haplotyper/tests/tags.yml 
b/modules/nf-core/sentieon/haplotyper/tests/tags.yml new file mode 100644 index 0000000000..3178c146c0 --- /dev/null +++ b/modules/nf-core/sentieon/haplotyper/tests/tags.yml @@ -0,0 +1,2 @@ +sentieon/haplotyper: + - "modules/nf-core/sentieon/haplotyper/**" diff --git a/modules/nf-core/sentieon/varcal/environment.yml b/modules/nf-core/sentieon/varcal/environment.yml index 481da2ce84..d7abf668ea 100644 --- a/modules/nf-core/sentieon/varcal/environment.yml +++ b/modules/nf-core/sentieon/varcal/environment.yml @@ -1,7 +1,5 @@ -name: sentieon_varcal channels: - conda-forge - bioconda - - defaults dependencies: - - bioconda::sentieon=202308.02 + - bioconda::sentieon=202308.03 diff --git a/modules/nf-core/sentieon/varcal/main.nf b/modules/nf-core/sentieon/varcal/main.nf index aa8847f839..d78eacb44a 100644 --- a/modules/nf-core/sentieon/varcal/main.nf +++ b/modules/nf-core/sentieon/varcal/main.nf @@ -3,12 +3,10 @@ process SENTIEON_VARCAL { label 'process_low' label 'sentieon' - secret 'SENTIEON_LICENSE_BASE64' - conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/sentieon:202308.02--h43eeafb_0' : - 'biocontainers/sentieon:202308.02--h43eeafb_0' }" + 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/a6/a64461f38d76bebea8e21441079e76e663e1168b0c59dafee6ee58440ad8c8ac/data' : + 'community.wave.seqera.io/library/sentieon:202308.03--59589f002351c221' }" input: tuple val(meta), path(vcf), path(tbi) // input vcf and tbi of variants to recalibrate @@ -29,15 +27,6 @@ process SENTIEON_VARCAL { task.ext.when == null || task.ext.when script: - // The following code sets LD_LIBRARY_PATH in the script-section when the module is run by Singularity. 
- // That turned out to be one way of overcoming the following issue with the Singularity-Sentieon-containers from galaxy, Sentieon (LD_LIBRARY_PATH) and the way Nextflow runs Singularity-containers. - // The galaxy container uses a runscript which is responsible for setting LD_PRELOAD properly. Nextflow executes singularity containers using `singularity exec`, which avoids the run script, leading to the LD_LIBRARY_PATH/libstdc++.so.6 error. - if (workflow.containerEngine in ['singularity','apptainer']) { - fix_ld_library_path = 'LD_LIBRARY_PATH=/usr/local/lib/:\$LD_LIBRARY_PATH;export LD_LIBRARY_PATH' - } else { - fix_ld_library_path = '' - } - def args = task.ext.args ?: '' def prefix = task.ext.prefix ?: "${meta.id}" def reference_command = fasta ? "--reference $fasta " : '' @@ -56,28 +45,11 @@ process SENTIEON_VARCAL { labels_command += "--resource ${items[1]} --resource_param ${items[0]} " } } - - def sentieon_auth_mech_base64 = task.ext.sentieon_auth_mech_base64 ?: '' - def sentieon_auth_data_base64 = task.ext.sentieon_auth_data_base64 ?: '' - + def sentieonLicense = secrets.SENTIEON_LICENSE_BASE64 ? + "export SENTIEON_LICENSE=\$(mktemp);echo -e \"${secrets.SENTIEON_LICENSE_BASE64}\" | base64 -d > \$SENTIEON_LICENSE; " : + "" """ - if [ "\${#SENTIEON_LICENSE_BASE64}" -lt "1500" ]; then # If the string SENTIEON_LICENSE_BASE64 is short, then it is an encrypted url. - export SENTIEON_LICENSE=\$(echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d) - else # Localhost license file - # The license file is stored as a nextflow variable like, for instance, this: - # nextflow secrets set SENTIEON_LICENSE_BASE64 \$(cat | base64 -w 0) - export SENTIEON_LICENSE=\$(mktemp) - echo -e "\$SENTIEON_LICENSE_BASE64" | base64 -d > \$SENTIEON_LICENSE - fi - - if [ ${sentieon_auth_mech_base64} ] && [ ${sentieon_auth_data_base64} ]; then - # If sentieon_auth_mech_base64 and sentieon_auth_data_base64 are non-empty strings, then Sentieon is mostly likely being run with some test-license. 
- export SENTIEON_AUTH_MECH=\$(echo -n "${sentieon_auth_mech_base64}" | base64 -d) - export SENTIEON_AUTH_DATA=\$(echo -n "${sentieon_auth_data_base64}" | base64 -d) - echo "Decoded and exported Sentieon test-license system environment variables" - fi - - $fix_ld_library_path + $sentieonLicense sentieon driver -r ${fasta} --algo VarCal \\ -v $vcf \\ @@ -93,19 +65,8 @@ process SENTIEON_VARCAL { """ stub: - // The following code sets LD_LIBRARY_PATH in the script-section when the module is run by Singularity. - // That turned out to be one way of overcoming the following issue with the Singularity-Sentieon-containers from galaxy, Sentieon (LD_LIBRARY_PATH) and the way Nextflow runs Singularity-containers. - // The galaxy container uses a runscript which is responsible for setting LD_PRELOAD properly. Nextflow executes singularity containers using `singularity exec`, which avoids the run script, leading to the LD_LIBRARY_PATH/libstdc++.so.6 error. - if (workflow.containerEngine in ['singularity','apptainer']) { - fix_ld_library_path = 'LD_LIBRARY_PATH=/usr/local/lib/:\$LD_LIBRARY_PATH;export LD_LIBRARY_PATH' - } else { - fix_ld_library_path = '' - } - def prefix = task.ext.prefix ?: "${meta.id}" """ - $fix_ld_library_path - touch ${prefix}.recal touch ${prefix}.idx touch ${prefix}.tranches diff --git a/modules/nf-core/sentieon/varcal/meta.yml b/modules/nf-core/sentieon/varcal/meta.yml index cad7ee106f..4661dc92d8 100644 --- a/modules/nf-core/sentieon/varcal/meta.yml +++ b/modules/nf-core/sentieon/varcal/meta.yml @@ -14,60 +14,87 @@ tools: Our software improves upon BWA, STAR, Minimap2, GATK, HaplotypeCaller, Mutect, and Mutect2 based pipelines and is deployable on any generic-CPU-based computing system. homepage: https://www.sentieon.com/ documentation: https://www.sentieon.com/ + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test' ] - - vcf: - type: file - description: input vcf file containing the variants to be recalibrated - pattern: "*.vcf.gz" - - tbi: - type: file - description: tbi file matching with -vcf - pattern: "*.vcf.gz.tbi" - - resource_vcf: - type: file - description: all resource vcf files that are used with the corresponding '--resource' label - pattern: "*.vcf.gz" - - resource_tbi: - type: file - description: all resource tbi files that are used with the corresponding '--resource' label - pattern: "*.vcf.gz.tbi" - - labels: - type: string - description: necessary arguments for Sentieon's VarCal. Specified to directly match the resources provided. More information can be found at https://support.sentieon.com/manual/usages/general/#varcal-algorithm - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - fai: - type: file - description: Index of reference fasta file - pattern: "fasta.fai" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test' ] + - vcf: + type: file + description: input vcf file containing the variants to be recalibrated + pattern: "*.vcf.gz" + - tbi: + type: file + description: tbi file matching with -vcf + pattern: "*.vcf.gz.tbi" + - - resource_vcf: + type: file + description: all resource vcf files that are used with the corresponding '--resource' + label + pattern: "*.vcf.gz" + - - resource_tbi: + type: file + description: all resource tbi files that are used with the corresponding '--resource' + label + pattern: "*.vcf.gz.tbi" + - - labels: + type: string + description: necessary arguments for Sentieon's VarCal. Specified to directly + match the resources provided. 
More information can be found at https://support.sentieon.com/manual/usages/general/#varcal-algorithm + - - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - fai: + type: file + description: Index of reference fasta file + pattern: "fasta.fai" output: - recal: - type: file - description: Output recal file used by ApplyVQSR - pattern: "*.recal" + - meta: + type: file + description: Output recal file used by ApplyVQSR + pattern: "*.recal" + - "*.recal": + type: file + description: Output recal file used by ApplyVQSR + pattern: "*.recal" - idx: - type: file - description: Index file for the recal output file - pattern: "*.idx" + - meta: + type: file + description: Index file for the recal output file + pattern: "*.idx" + - "*.idx": + type: file + description: Index file for the recal output file + pattern: "*.idx" - tranches: - type: file - description: Output tranches file used by ApplyVQSR - pattern: "*.tranches" + - meta: + type: file + description: Output tranches file used by ApplyVQSR + pattern: "*.tranches" + - "*.tranches": + type: file + description: Output tranches file used by ApplyVQSR + pattern: "*.tranches" - plots: - type: file - description: Optional output rscript file to aid in visualization of the input data and learned model. - pattern: "*plots.R" - - version: - type: file - description: File containing software versions - pattern: "*.versions.yml" + - meta: + type: file + description: Optional output rscript file to aid in visualization of the input + data and learned model. + pattern: "*plots.R" + - "*plots.R": + type: file + description: Optional output rscript file to aid in visualization of the input + data and learned model. 
+ pattern: "*plots.R" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@asp8200" maintainers: diff --git a/modules/nf-core/snpeff/download/environment.yml b/modules/nf-core/snpeff/download/environment.yml index 62f3d5aad6..f2ad925161 100644 --- a/modules/nf-core/snpeff/download/environment.yml +++ b/modules/nf-core/snpeff/download/environment.yml @@ -1,7 +1,5 @@ -name: snpeff_download channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::snpeff=5.1 diff --git a/modules/nf-core/snpeff/download/main.nf b/modules/nf-core/snpeff/download/main.nf index f1fc4cc395..e42259337d 100644 --- a/modules/nf-core/snpeff/download/main.nf +++ b/modules/nf-core/snpeff/download/main.nf @@ -8,7 +8,7 @@ process SNPEFF_DOWNLOAD { 'biocontainers/snpeff:5.1--hdfd78af_2' }" input: - tuple val(meta), val(genome), val(cache_version) + tuple val(meta), val(snpeff_db) output: tuple val(meta), path('snpeff_cache'), emit: cache @@ -28,7 +28,7 @@ process SNPEFF_DOWNLOAD { """ snpEff \\ -Xmx${avail_mem}M \\ - download ${genome}.${cache_version} \\ + download ${snpeff_db} \\ -dataDir \${PWD}/snpeff_cache \\ ${args} @@ -41,7 +41,10 @@ process SNPEFF_DOWNLOAD { stub: """ - mkdir ${genome}.${cache_version} + mkdir -p snpeff_cache/${snpeff_db} + + touch snpeff_cache/${snpeff_db}/sequence.I.bin + touch snpeff_cache/${snpeff_db}/sequence.bin cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/snpeff/download/meta.yml b/modules/nf-core/snpeff/download/meta.yml index f367c69664..a3211fc7c9 100644 --- a/modules/nf-core/snpeff/download/meta.yml +++ b/modules/nf-core/snpeff/download/meta.yml @@ -14,29 +14,35 @@ tools: homepage: https://pcingola.github.io/SnpEff/ documentation: https://pcingola.github.io/SnpEff/se_introduction/ licence: ["MIT"] + identifier: biotools:snpeff input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - vcf: - type: file - description: | - vcf to annotate - - db: - type: string - description: | - which db to annotate with + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - genome: + type: file + description: Reference genome in FASTA format + pattern: "*.{fasta,fna,fa}" + - cache_version: + type: string + description: Version of the snpEff cache to download output: - cache: - type: file - description: | - snpEff cache + - meta: + type: file + description: | + snpEff cache + - snpeff_cache: + type: file + description: | + snpEff cache - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@maxulysse" maintainers: diff --git a/modules/nf-core/snpeff/download/tests/main.nf.test b/modules/nf-core/snpeff/download/tests/main.nf.test new file mode 100644 index 0000000000..ef547c6f2e --- /dev/null +++ b/modules/nf-core/snpeff/download/tests/main.nf.test @@ -0,0 +1,51 @@ + +nextflow_process { + + name "Test Process SNPEFF_DOWNLOAD" + script "../main.nf" + process "SNPEFF_DOWNLOAD" + + tag "modules" + tag "modules_nfcore" + tag "snpeff" + tag "snpeff/download" + + test("test-snpeff-download") { + + when { + process { + """ + input[0] = [ [ id:"WBcel235.105" ], "WBcel235.105" ] + + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("test-snpeff-download-stub") { + options '-stub' + when { + process { + """ + input[0] = [ [ id:"WBcel235.105" ], "WBcel235.105" ] + + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + +} diff --git a/modules/nf-core/snpeff/download/tests/main.nf.test.snap b/modules/nf-core/snpeff/download/tests/main.nf.test.snap new file mode 
100644 index 0000000000..5bccdd8ad3 --- /dev/null +++ b/modules/nf-core/snpeff/download/tests/main.nf.test.snap @@ -0,0 +1,100 @@ +{ + "test-snpeff-download-stub": { + "content": [ + { + "0": [ + [ + { + "id": "WBcel235.105" + }, + [ + [ + "sequence.I.bin:md5,d41d8cd98f00b204e9800998ecf8427e", + "sequence.bin:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ] + ], + "1": [ + "versions.yml:md5,5fc7ed9f548eccf5fac9fdefc12ef56e" + ], + "cache": [ + [ + { + "id": "WBcel235.105" + }, + [ + [ + "sequence.I.bin:md5,d41d8cd98f00b204e9800998ecf8427e", + "sequence.bin:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ] + ], + "versions": [ + "versions.yml:md5,5fc7ed9f548eccf5fac9fdefc12ef56e" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-19T12:48:45.183665736" + }, + "test-snpeff-download": { + "content": [ + { + "0": [ + [ + { + "id": "WBcel235.105" + }, + [ + [ + "sequence.I.bin:md5,2fd1694bd91cf7952cbad8cfed161e53", + "sequence.II.bin:md5,bacedbdea89508e108223767fa260a4c", + "sequence.III.bin:md5,444118a9fb9d0a03c37e86094d8e52a9", + "sequence.IV.bin:md5,ff756628faa0b71cd65495668c3d82b5", + "sequence.V.bin:md5,d6ad5476162ac45829f719dd4ee3f4e7", + "sequence.X.bin:md5,b79bec6cc8f96b8373dac56bab5d0a6c", + "sequence.bin:md5,ec2bc2ae81755ab90fcf1848bc7ce41f", + "snpEffectPredictor.bin:md5,1d99251d0405f0a42913ed8b5b2c2fa7" + ] + ] + ] + ], + "1": [ + "versions.yml:md5,5fc7ed9f548eccf5fac9fdefc12ef56e" + ], + "cache": [ + [ + { + "id": "WBcel235.105" + }, + [ + [ + "sequence.I.bin:md5,2fd1694bd91cf7952cbad8cfed161e53", + "sequence.II.bin:md5,bacedbdea89508e108223767fa260a4c", + "sequence.III.bin:md5,444118a9fb9d0a03c37e86094d8e52a9", + "sequence.IV.bin:md5,ff756628faa0b71cd65495668c3d82b5", + "sequence.V.bin:md5,d6ad5476162ac45829f719dd4ee3f4e7", + "sequence.X.bin:md5,b79bec6cc8f96b8373dac56bab5d0a6c", + "sequence.bin:md5,ec2bc2ae81755ab90fcf1848bc7ce41f", + "snpEffectPredictor.bin:md5,1d99251d0405f0a42913ed8b5b2c2fa7" + ] + ] + 
] + ], + "versions": [ + "versions.yml:md5,5fc7ed9f548eccf5fac9fdefc12ef56e" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-08-29T14:27:47.123555" + } +} \ No newline at end of file diff --git a/modules/nf-core/snpeff/snpeff/environment.yml b/modules/nf-core/snpeff/snpeff/environment.yml index b492e6a88e..f2ad925161 100644 --- a/modules/nf-core/snpeff/snpeff/environment.yml +++ b/modules/nf-core/snpeff/snpeff/environment.yml @@ -1,7 +1,5 @@ -name: snpeff_snpeff channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::snpeff=5.1 diff --git a/modules/nf-core/snpeff/snpeff/meta.yml b/modules/nf-core/snpeff/snpeff/meta.yml index 7559c3de08..ef3d495ae7 100644 --- a/modules/nf-core/snpeff/snpeff/meta.yml +++ b/modules/nf-core/snpeff/snpeff/meta.yml @@ -14,46 +14,76 @@ tools: homepage: https://pcingola.github.io/SnpEff/ documentation: https://pcingola.github.io/SnpEff/se_introduction/ licence: ["MIT"] + identifier: biotools:snpeff input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: | - vcf to annotate - - db: - type: string - description: | - which db to annotate with - - cache: - type: file - description: | - path to snpEff cache (optional) + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: | + vcf to annotate + - - db: + type: string + description: | + which db to annotate with + - - meta2: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - cache: + type: file + description: | + path to snpEff cache (optional) output: - vcf: - type: file - description: | - annotated vcf - pattern: "*.ann.vcf" + - meta: + type: file + description: | + annotated vcf + pattern: "*.ann.vcf" + - "*.ann.vcf": + type: file + description: | + annotated vcf + pattern: "*.ann.vcf" - report: - type: file - description: snpEff report csv file - pattern: "*.csv" + - meta: + type: file + description: snpEff report csv file + pattern: "*.csv" + - "*.csv": + type: file + description: snpEff report csv file + pattern: "*.csv" - summary_html: - type: file - description: snpEff summary statistics in html file - pattern: "*.html" + - meta: + type: file + description: snpEff summary statistics in html file + pattern: "*.html" + - "*.html": + type: file + description: snpEff summary statistics in html file + pattern: "*.html" - genes_txt: - type: file - description: txt (tab separated) file having counts of the number of variants affecting each transcript and gene - pattern: "*.genes.txt" + - meta: + type: file + description: txt (tab separated) file having counts of the number of variants + affecting each transcript and gene + pattern: "*.genes.txt" + - "*.genes.txt": + type: file + description: txt (tab separated) file having counts of the number of variants + affecting each transcript and gene + pattern: "*.genes.txt" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@maxulysse" maintainers: diff --git a/modules/nf-core/snpeff/snpeff/tests/main.nf.test b/modules/nf-core/snpeff/snpeff/tests/main.nf.test index dd37f275ad..2be0b7d72c 100644 --- a/modules/nf-core/snpeff/snpeff/tests/main.nf.test +++ b/modules/nf-core/snpeff/snpeff/tests/main.nf.test @@ -6,6 +6,7 @@ nextflow_process { config "./nextflow.config" tag "modules" tag 
"modules_nfcore" + tag "modules_snpeff" tag "snpeff" tag "snpeff/download" tag "snpeff/snpeff" @@ -17,7 +18,7 @@ nextflow_process { script "../../download/main.nf" process { """ - input[0] = Channel.of([[id:params.snpeff_genome + '.' + params.snpeff_cache_version], params.snpeff_genome, params.snpeff_cache_version]) + input[0] = Channel.of([[id:params.snpeff_db], params.snpeff_db]) """ } } @@ -30,7 +31,7 @@ nextflow_process { [ id:'test' ], // meta map file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) ]) - input[1] = params.snpeff_genome + '.' + params.snpeff_cache_version + input[1] = params.snpeff_db input[2] = SNPEFF_DOWNLOAD.out.cache """ } diff --git a/modules/nf-core/snpeff/snpeff/tests/nextflow.config b/modules/nf-core/snpeff/snpeff/tests/nextflow.config index d31ebf6b7c..a950a0475d 100644 --- a/modules/nf-core/snpeff/snpeff/tests/nextflow.config +++ b/modules/nf-core/snpeff/snpeff/tests/nextflow.config @@ -1,4 +1,3 @@ params { - snpeff_cache_version = "105" - snpeff_genome = "WBcel235" + snpeff_db = "WBcel235.105" } diff --git a/modules/nf-core/spring/decompress/environment.yml b/modules/nf-core/spring/decompress/environment.yml index d960ee714a..abeb16b095 100644 --- a/modules/nf-core/spring/decompress/environment.yml +++ b/modules/nf-core/spring/decompress/environment.yml @@ -1,7 +1,5 @@ -name: spring_decompress channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::spring=1.1.1 diff --git a/modules/nf-core/spring/decompress/main.nf b/modules/nf-core/spring/decompress/main.nf index 4cf7829917..86ced26906 100644 --- a/modules/nf-core/spring/decompress/main.nf +++ b/modules/nf-core/spring/decompress/main.nf @@ -38,4 +38,17 @@ process SPRING_DECOMPRESS { spring: ${VERSION} END_VERSIONS """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + def VERSION = '1.1.1' // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions. 
+ def output = write_one_fastq_gz ? "echo '' | gzip > ${prefix}.fastq.gz" : "echo '' | gzip > ${prefix}_R1.fastq.gz; echo '' | gzip > ${prefix}_R2.fastq.gz" + """ + ${output} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + spring: ${VERSION} + END_VERSIONS + """ } diff --git a/modules/nf-core/spring/decompress/meta.yml b/modules/nf-core/spring/decompress/meta.yml index a3449b4fb3..72b72b75da 100644 --- a/modules/nf-core/spring/decompress/meta.yml +++ b/modules/nf-core/spring/decompress/meta.yml @@ -6,41 +6,45 @@ keywords: - lossless tools: - "spring": - description: "SPRING is a compression tool for Fastq files (containing up to 4.29 Billion reads)" + description: "SPRING is a compression tool for Fastq files (containing up to 4.29 + Billion reads)" homepage: "https://github.com/shubhamchandak94/Spring" documentation: "https://github.com/shubhamchandak94/Spring/blob/master/README.md" tool_dev_url: "https://github.com/shubhamchandak94/Spring" doi: "10.1093/bioinformatics/bty1015" licence: ["Free for non-commercial use"] + identifier: biotools:spring input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - spring: - type: file - description: Spring file to decompress. - pattern: "*.{spring}" - - write_one_fastq_gz: - type: boolean - description: | - Controls whether spring should write one fastq.gz file with reads from both directions or two fastq.gz files with reads from distinct directions - pattern: "true or false" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - spring: + type: file + description: Spring file to decompress. 
+ pattern: "*.{spring}" + - - write_one_fastq_gz: + type: boolean + description: | + Controls whether spring should write one fastq.gz file with reads from both directions or two fastq.gz files with reads from distinct directions + pattern: "true or false" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - fastq: - type: file - description: Decompressed FASTQ file(s). - pattern: "*.{fastq.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.fastq.gz": + type: file + description: Decompressed FASTQ file(s). + pattern: "*.{fastq.gz}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@xec-cm" maintainers: diff --git a/modules/nf-core/spring/decompress/test/main.nf.test b/modules/nf-core/spring/decompress/test/main.nf.test new file mode 100644 index 0000000000..9428a86bcf --- /dev/null +++ b/modules/nf-core/spring/decompress/test/main.nf.test @@ -0,0 +1,155 @@ +nextflow_process { + + name "Test Process SPRING_DECOMPRESS" + tag "modules_nfcore" + tag "modules" + tag "spring" + tag "spring/decompress" + script "../main.nf" + process "SPRING_DECOMPRESS" + + + + test("Write-One-File") { + + setup { + run("SPRING_COMPRESS") { + script "../../compress/main.nf" + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + '/genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + [] + ] + """ + } + } + } + + when { + process { + """ + input[0] = SPRING_COMPRESS.out.spring + input[1] = true // write_one_fastq_gz + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("Write-Two-Files") { + + setup { + 
run("SPRING_COMPRESS") { + script "../../compress/main.nf" + process { + """ + input[0] = [ + [ id:'test2' ], // meta map + file(params.modules_testdata_base_path + '/genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + '/genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true), + ] + """ + } + } + } + + when { + process { + """ + input[0] = SPRING_COMPRESS.out.spring + input[1] = false // write_one_fastq_gz + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("Write-One-File-stub") { + + options "-stub" + + setup { + run("SPRING_COMPRESS") { + options "-stub" + script "../../compress/main.nf" + process { + """ + input[0] = [ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + '/genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + [] + ] + """ + } + } + } + + when { + process { + """ + input[0] = SPRING_COMPRESS.out.spring + input[1] = true // write_one_fastq_gz + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("Write-Two-Files-stub") { + + options "-stub" + + setup { + run("SPRING_COMPRESS") { + options "-stub" + script "../../compress/main.nf" + process { + """ + input[0] = [ + [ id:'test2' ], // meta map + file(params.modules_testdata_base_path + '/genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + '/genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true), + ] + """ + } + } + } + + when { + process { + """ + input[0] = SPRING_COMPRESS.out.spring + input[1] = false // write_one_fastq_gz + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + +} \ No newline at end of file diff --git a/modules/nf-core/spring/decompress/test/main.nf.test.snap 
b/modules/nf-core/spring/decompress/test/main.nf.test.snap new file mode 100644 index 0000000000..7dcadbab61 --- /dev/null +++ b/modules/nf-core/spring/decompress/test/main.nf.test.snap @@ -0,0 +1,218 @@ +{ + "Write-One-File stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "/home/ramprasad.neethiraj/nextflow/modules/.nf-test/tests/2a6cfab794852e23e6324eb4955668b2/work/42/aee6c82c1ca502c3b02339f597188b/test.fastq.gz" + ] + ], + "1": [ + "versions.yml:md5,4711df5941f1464e3693d24dd29c705b" + ], + "fastq": [ + [ + { + "id": "test" + }, + "/home/ramprasad.neethiraj/nextflow/modules/.nf-test/tests/2a6cfab794852e23e6324eb4955668b2/work/42/aee6c82c1ca502c3b02339f597188b/test.fastq.gz" + ] + ], + "versions": [ + "versions.yml:md5,4711df5941f1464e3693d24dd29c705b" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-08T10:03:50.626223289" + }, + "Write-Two-Files stub": { + "content": [ + { + "0": [ + [ + { + "id": "test2" + }, + [ + "/home/ramprasad.neethiraj/nextflow/modules/.nf-test/tests/528557b5a81e4bffb57c38b19c7aa351/work/74/fc5d116d011bcd47d6f7de8d42ac34/test2_R1.fastq.gz", + "/home/ramprasad.neethiraj/nextflow/modules/.nf-test/tests/528557b5a81e4bffb57c38b19c7aa351/work/74/fc5d116d011bcd47d6f7de8d42ac34/test2_R2.fastq.gz" + ] + ] + ], + "1": [ + "versions.yml:md5,4711df5941f1464e3693d24dd29c705b" + ], + "fastq": [ + [ + { + "id": "test2" + }, + [ + "/home/ramprasad.neethiraj/nextflow/modules/.nf-test/tests/528557b5a81e4bffb57c38b19c7aa351/work/74/fc5d116d011bcd47d6f7de8d42ac34/test2_R1.fastq.gz", + "/home/ramprasad.neethiraj/nextflow/modules/.nf-test/tests/528557b5a81e4bffb57c38b19c7aa351/work/74/fc5d116d011bcd47d6f7de8d42ac34/test2_R2.fastq.gz" + ] + ] + ], + "versions": [ + "versions.yml:md5,4711df5941f1464e3693d24dd29c705b" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-08T10:03:57.417015606" + }, + "Write-Two-Files": { + "content": [ + { + "0": [ 
+ [ + { + "id": "test2" + }, + [ + "test2_R1.fastq.gz:md5,4161df271f9bfcd25d5845a1e220dbec", + "test2_R2.fastq.gz:md5,2ebae722295ea66d84075a3b042e2b42" + ] + ] + ], + "1": [ + "versions.yml:md5,4711df5941f1464e3693d24dd29c705b" + ], + "fastq": [ + [ + { + "id": "test2" + }, + [ + "test2_R1.fastq.gz:md5,4161df271f9bfcd25d5845a1e220dbec", + "test2_R2.fastq.gz:md5,2ebae722295ea66d84075a3b042e2b42" + ] + ] + ], + "versions": [ + "versions.yml:md5,4711df5941f1464e3693d24dd29c705b" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-21T13:41:46.090761471" + }, + "Write-One-File": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.fastq.gz:md5,4161df271f9bfcd25d5845a1e220dbec" + ] + ], + "1": [ + "versions.yml:md5,4711df5941f1464e3693d24dd29c705b" + ], + "fastq": [ + [ + { + "id": "test" + }, + "test.fastq.gz:md5,4161df271f9bfcd25d5845a1e220dbec" + ] + ], + "versions": [ + "versions.yml:md5,4711df5941f1464e3693d24dd29c705b" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-21T13:02:07.466039653" + }, + "Write-One-File-stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + "versions.yml:md5,4711df5941f1464e3693d24dd29c705b" + ], + "fastq": [ + [ + { + "id": "test" + }, + "test.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,4711df5941f1464e3693d24dd29c705b" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-08T13:55:53.594615215" + }, + "Write-Two-Files-stub": { + "content": [ + { + "0": [ + [ + { + "id": "test2" + }, + [ + "test2_R1.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test2_R2.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "1": [ + "versions.yml:md5,4711df5941f1464e3693d24dd29c705b" + ], + "fastq": [ + [ + { + "id": "test2" + }, + [ + 
"test2_R1.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test2_R2.fastq.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + ], + "versions": [ + "versions.yml:md5,4711df5941f1464e3693d24dd29c705b" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-08T13:56:01.212228909" + } +} \ No newline at end of file diff --git a/modules/nf-core/spring/decompress/test/nextflow.config b/modules/nf-core/spring/decompress/test/nextflow.config new file mode 100644 index 0000000000..50f50a7a35 --- /dev/null +++ b/modules/nf-core/spring/decompress/test/nextflow.config @@ -0,0 +1,5 @@ +process { + + publishDir = { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" } + +} \ No newline at end of file diff --git a/modules/nf-core/spring/decompress/test/tags.yml b/modules/nf-core/spring/decompress/test/tags.yml new file mode 100644 index 0000000000..1fe70aec91 --- /dev/null +++ b/modules/nf-core/spring/decompress/test/tags.yml @@ -0,0 +1,3 @@ +spring/decompress: + - modules/nf-core/spring/compress/** + - modules/nf-core/spring/decompress/** diff --git a/modules/nf-core/strelka/germline/environment.yml b/modules/nf-core/strelka/germline/environment.yml index 23bd165b21..052c6baa5f 100644 --- a/modules/nf-core/strelka/germline/environment.yml +++ b/modules/nf-core/strelka/germline/environment.yml @@ -1,7 +1,5 @@ -name: strelka_germline channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::strelka=2.9.10 diff --git a/modules/nf-core/strelka/germline/meta.yml b/modules/nf-core/strelka/germline/meta.yml index 9a597ef01f..5536dd8a56 100644 --- a/modules/nf-core/strelka/germline/meta.yml +++ b/modules/nf-core/strelka/germline/meta.yml @@ -1,5 +1,6 @@ name: strelka_germline -description: Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation +description: Strelka2 is a fast and accurate small variant caller optimized for analysis + of germline 
variation keywords: - variantcalling - germline @@ -8,68 +9,90 @@ keywords: - variants tools: - strelka: - description: Strelka calls somatic and germline small variants from mapped sequencing reads + description: Strelka calls somatic and germline small variants from mapped sequencing + reads homepage: https://github.com/Illumina/strelka documentation: https://github.com/Illumina/strelka/blob/v2.9.x/docs/userGuide/README.md tool_dev_url: https://github.com/Illumina/strelka doi: 10.1038/s41592-018-0051-x licence: ["GPL v3"] + identifier: biotools:strelka input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test'] - - input: - type: file - description: BAM/CRAM file - pattern: "*.{bam,cram}" - - input_index: - type: file - description: BAM/CRAI index file - pattern: "*.{bai,crai}" - - fasta: - type: file - description: Genome reference FASTA file - pattern: "*.{fa,fasta}" - - fai: - type: file - description: Genome reference FASTA index file - pattern: "*.{fa.fai,fasta.fai}" - - target_bed: - type: file - description: BED file containing target regions for variant calling - pattern: "*.{bed}" - - target_bed_index: - type: file - description: Index for BED file containing target regions for variant calling - pattern: "*.{bed.tbi}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test'] + - input: + type: file + description: BAM/CRAM file + pattern: "*.{bam,cram}" + - input_index: + type: file + description: BAM/CRAI index file + pattern: "*.{bai,crai}" + - target_bed: + type: file + description: BED file containing target regions for variant calling + pattern: "*.{bed}" + - target_bed_index: + type: file + description: Index for BED file containing target regions for variant calling + pattern: "*.{bed.tbi}" + - - fasta: + type: file + description: Genome reference FASTA file + pattern: "*.{fa,fasta}" + - - fai: + type: file + description: Genome reference FASTA index file + pattern: "*.{fa.fai,fasta.fai}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test'] - vcf: - type: file - description: gzipped germline variant file - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - "*variants.vcf.gz": + type: file + description: gzipped germline variant file + pattern: "*.{vcf.gz}" - vcf_tbi: - type: file - description: index file for the vcf file - pattern: "*.vcf.gz.tbi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - "*variants.vcf.gz.tbi": + type: file + description: index file for the vcf file + pattern: "*.vcf.gz.tbi" - genome_vcf: - type: file - description: variant records and compressed non-variant blocks - pattern: "*_genome.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - "*genome.vcf.gz": + type: file + description: variant records and compressed non-variant blocks + pattern: "*_genome.vcf.gz" - genome_vcf_tbi: - type: file - description: index file for the genome_vcf file - pattern: "*_genome.vcf.gz.tbi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test'] + - "*genome.vcf.gz.tbi": + type: file + description: index file for the genome_vcf file + pattern: "*_genome.vcf.gz.tbi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@arontommi" maintainers: diff --git a/modules/nf-core/strelka/germline/tests/main.nf.test b/modules/nf-core/strelka/germline/tests/main.nf.test index d662016898..e2be9e1830 100644 --- a/modules/nf-core/strelka/germline/tests/main.nf.test +++ b/modules/nf-core/strelka/germline/tests/main.nf.test @@ -30,7 +30,7 @@ nextflow_process { { assert process.success }, { assert path(process.out.vcf.get(0).get(1)).linesGzip.contains("##fileformat=VCFv4.1") }, { assert path(process.out.genome_vcf.get(0).get(1)).linesGzip.contains("##fileformat=VCFv4.1") }, - { assert snapshot(process.out.version).match("version") } + { assert snapshot(process.out.versions).match() } ) } @@ -58,7 +58,7 @@ nextflow_process { { assert process.success }, { assert path(process.out.vcf.get(0).get(1)).linesGzip.contains("##fileformat=VCFv4.1") }, { assert path(process.out.genome_vcf.get(0).get(1)).linesGzip.contains("##fileformat=VCFv4.1") }, - { assert snapshot(process.out.version).match("target_version") } + { assert snapshot(process.out.versions).match() } ) } diff --git a/modules/nf-core/strelka/germline/tests/main.nf.test.snap b/modules/nf-core/strelka/germline/tests/main.nf.test.snap index 2604707f81..2085fdbfaa 100644 --- a/modules/nf-core/strelka/germline/tests/main.nf.test.snap +++ b/modules/nf-core/strelka/germline/tests/main.nf.test.snap @@ -80,20 +80,28 @@ }, "timestamp": "2024-03-20T16:07:30.81195" }, - "version": { - "content": null, + "human - cram": { + "content": [ + [ + "versions.yml:md5,5f72393fd2ab4358e3f0ad16d1937f65" + ] + ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.0" + "nextflow": "24.04.4" }, - "timestamp": 
"2024-03-20T16:05:34.583702" + "timestamp": "2024-08-28T16:50:28.337512" }, - "target_version": { - "content": null, + "human - cram - target": { + "content": [ + [ + "versions.yml:md5,5f72393fd2ab4358e3f0ad16d1937f65" + ] + ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.0" + "nextflow": "24.04.4" }, - "timestamp": "2024-03-20T16:11:37.54108" + "timestamp": "2024-08-28T16:51:35.930772" } -} \ No newline at end of file +} diff --git a/modules/nf-core/strelka/germline/tests/tags.yml b/modules/nf-core/strelka/germline/tests/tags.yml deleted file mode 100644 index 4a72ab31a4..0000000000 --- a/modules/nf-core/strelka/germline/tests/tags.yml +++ /dev/null @@ -1,2 +0,0 @@ -strelka/germline: - - "modules/nf-core/strelka/germline/**" diff --git a/modules/nf-core/strelka/somatic/environment.yml b/modules/nf-core/strelka/somatic/environment.yml index ecbc865ec9..052c6baa5f 100644 --- a/modules/nf-core/strelka/somatic/environment.yml +++ b/modules/nf-core/strelka/somatic/environment.yml @@ -1,7 +1,5 @@ -name: strelka_somatic channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::strelka=2.9.10 diff --git a/modules/nf-core/strelka/somatic/meta.yml b/modules/nf-core/strelka/somatic/meta.yml index 428bcb3f84..6f2caaa3e7 100644 --- a/modules/nf-core/strelka/somatic/meta.yml +++ b/modules/nf-core/strelka/somatic/meta.yml @@ -1,5 +1,7 @@ name: strelka_somatic -description: Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation in small cohorts and somatic variation in tumor/normal sample pairs +description: Strelka2 is a fast and accurate small variant caller optimized for analysis + of germline variation in small cohorts and somatic variation in tumor/normal sample + pairs keywords: - variant calling - germline @@ -8,84 +10,106 @@ keywords: - variants tools: - strelka: - description: Strelka calls somatic and germline small variants from mapped sequencing reads + description: Strelka calls somatic and germline 
small variants from mapped sequencing + reads homepage: https://github.com/Illumina/strelka documentation: https://github.com/Illumina/strelka/blob/v2.9.x/docs/userGuide/README.md tool_dev_url: https://github.com/Illumina/strelka doi: 10.1038/s41592-018-0051-x licence: ["GPL v3"] + identifier: biotools:strelka input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input_normal: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - input_index_normal: - type: file - description: BAM/CRAM/SAM index file - pattern: "*.{bai,crai,sai}" - - input_tumor: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - input_index_tumor: - type: file - description: BAM/CRAM/SAM index file - pattern: "*.{bai,crai,sai}" - - manta_candidate_small_indels: - type: file - description: VCF.gz file - pattern: "*.{vcf.gz}" - - manta_candidate_small_indels_tbi: - type: file - description: VCF.gz index file - pattern: "*.tbi" - - fasta: - type: file - description: Genome reference FASTA file - pattern: "*.{fa,fasta}" - - fai: - type: file - description: Genome reference FASTA index file - pattern: "*.{fa.fai,fasta.fai}" - - target_bed: - type: file - description: BED file containing target regions for variant calling - pattern: "*.{bed}" - - target_bed_index: - type: file - description: Index for BED file containing target regions for variant calling - pattern: "*.{bed.tbi}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - input_normal: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - input_index_normal: + type: file + description: BAM/CRAM/SAM index file + pattern: "*.{bai,crai,sai}" + - input_tumor: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - input_index_tumor: + type: file + description: BAM/CRAM/SAM index file + pattern: "*.{bai,crai,sai}" + - manta_candidate_small_indels: + type: file + description: VCF.gz file + pattern: "*.{vcf.gz}" + - manta_candidate_small_indels_tbi: + type: file + description: VCF.gz index file + pattern: "*.tbi" + - target_bed: + type: file + description: BED file containing target regions for variant calling + pattern: "*.{bed}" + - target_bed_index: + type: file + description: Index for BED file containing target regions for variant calling + pattern: "*.{bed.tbi}" + - - fasta: + type: file + description: Genome reference FASTA file + pattern: "*.{fa,fasta}" + - - fai: + type: file + description: Genome reference FASTA index file + pattern: "*.{fa.fai,fasta.fai}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - vcf_indels: - type: file - description: Gzipped VCF file containing variants - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.somatic_indels.vcf.gz": + type: file + description: Gzipped VCF file containing variants + pattern: "*.{vcf.gz}" - vcf_indels_tbi: - type: file - description: Index for gzipped VCF file containing variants - pattern: "*.{vcf.gz.tbi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.somatic_indels.vcf.gz.tbi": + type: file + description: Index for gzipped VCF file containing variants + pattern: "*.{vcf.gz.tbi}" - vcf_snvs: - type: file - description: Gzipped VCF file containing variants - pattern: "*.{vcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.somatic_snvs.vcf.gz": + type: file + description: Gzipped VCF file containing variants + pattern: "*.{vcf.gz}" - vcf_snvs_tbi: - type: file - description: Index for gzipped VCF file containing variants - pattern: "*.{vcf.gz.tbi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.somatic_snvs.vcf.gz.tbi": + type: file + description: Index for gzipped VCF file containing variants + pattern: "*.{vcf.gz.tbi}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" maintainers: diff --git a/modules/nf-core/svdb/merge/environment.yml b/modules/nf-core/svdb/merge/environment.yml index e6fec08877..ab87ec70a1 100644 --- a/modules/nf-core/svdb/merge/environment.yml +++ b/modules/nf-core/svdb/merge/environment.yml @@ -1,10 +1,7 @@ -name: svdb_merge channels: - conda-forge - bioconda - - defaults + dependencies: - - svdb=2.8.1 - # renovate: datasource=conda depName=bioconda/samtools - - samtools=1.19.2 - - htslib=1.19.1 + - bcftools=1.21 + - svdb=2.8.2 diff --git a/modules/nf-core/svdb/merge/main.nf b/modules/nf-core/svdb/merge/main.nf index c24a9a7c38..5b19a29931 100644 --- a/modules/nf-core/svdb/merge/main.nf +++ b/modules/nf-core/svdb/merge/main.nf @@ -3,57 +3,102 @@ process SVDB_MERGE { label 'process_medium' conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && 
!task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/mulled-v2-c8daa8f9d69d3c5a1a4ff08283a166c18edb0000:511069f65a53621c5503e5cfee319aa3c735abfa-0': - 'biocontainers/mulled-v2-c8daa8f9d69d3c5a1a4ff08283a166c18edb0000:511069f65a53621c5503e5cfee319aa3c735abfa-0' }" + 'https://depot.galaxyproject.org/singularity/mulled-v2-375a758a4ca8c128fb9d38047a68a9f4322d2acd:b3615e06ef17566f2988a215ce9e10808c1d08bf-0': + 'biocontainers/mulled-v2-375a758a4ca8c128fb9d38047a68a9f4322d2acd:b3615e06ef17566f2988a215ce9e10808c1d08bf-0' }" input: tuple val(meta), path(vcfs) - val (priority) + val(input_priority) + val(sort_inputs) output: - tuple val(meta), path("*.vcf.gz"), emit: vcf - path "versions.yml" , emit: versions + tuple val(meta), path("*.{vcf,vcf.gz,bcf,bcf.gz}"), emit: vcf + tuple val(meta), path("*.tbi") , emit: tbi, optional: true + tuple val(meta), path("*.csi") , emit: csi, optional: true + path "versions.yml" , emit: versions when: task.ext.when == null || task.ext.when script: def args = task.ext.args ?: '' + def args2 = task.ext.args2 ?: '' def prefix = task.ext.prefix ?: "${meta.id}" - def input = "${vcfs.join(" ")}" - def prio = "" - if(priority) { - prio = "--priority ${priority.join(',')}" + + // Ensure priority list matches the number of VCFs if priority is provided + if (input_priority && vcfs.collect().size() != input_priority.collect().size()) { + error "If priority is used, one tag per VCF is needed" + } + + def input = "" + def prio = "" + if (input_priority) { + if (vcfs.collect().size() > 1 && sort_inputs) { + // make vcf-prioprity pairs and sort on VCF name, so priority is also sorted the same + def pairs = vcfs.indices.collect { [vcfs[it], input_priority[it]] } + pairs = pairs.sort { a, b -> a[0].name <=> b[0].name } + vcfs = pairs.collect { it[0] } + priority = pairs.collect { it[1] } + } else { + priority = input_priority + } + + // Build inputs + prio = "--priority ${input_priority.join(',')}" input = "" - for 
(int index = 0; index < vcfs.size(); index++) { - input += " ${vcfs[index]}:${priority[index]}" + for (int index = 0; index < vcfs.collect().size(); index++) { + input += "${vcfs[index]}:${priority[index]} " } + + } else { + // if there's no priority input just sort the vcfs by name if possible + input = (vcfs.collect().size() > 1 && sort_inputs) ? vcfs.sort { it.name } : vcfs } + + def extension = args2.contains("--output-type b") || args2.contains("-Ob") ? "bcf.gz" : + args2.contains("--output-type u") || args2.contains("-Ou") ? "bcf" : + args2.contains("--output-type z") || args2.contains("-Oz") ? "vcf.gz" : + args2.contains("--output-type v") || args2.contains("-Ov") ? "vcf" : + "vcf" """ svdb \\ --merge \\ $args \\ $prio \\ - --vcf $input \\ - > ${prefix}.vcf - bgzip ${prefix}.vcf + --vcf $input |\\ + bcftools view \\ + --threads ${task.cpus} \\ + --output ${prefix}.${extension} cat <<-END_VERSIONS > versions.yml "${task.process}": svdb: \$( echo \$(svdb) | head -1 | sed 's/usage: SVDB-\\([0-9]\\.[0-9]\\.[0-9]\\).*/\\1/' ) - samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//') END_VERSIONS """ stub: def prefix = task.ext.prefix ?: "${meta.id}" + def args2 = task.ext.args2 ?: '' + def extension = args2.contains("--output-type b") || args2.contains("-Ob") ? "bcf.gz" : + args2.contains("--output-type u") || args2.contains("-Ou") ? "bcf" : + args2.contains("--output-type z") || args2.contains("-Oz") ? "vcf.gz" : + args2.contains("--output-type v") || args2.contains("-Ov") ? "vcf" : + "vcf" + def index = args2.contains("--write-index=tbi") || args2.contains("-W=tbi") ? "tbi" : + args2.contains("--write-index=csi") || args2.contains("-W=csi") ? "csi" : + args2.contains("--write-index") || args2.contains("-W") ? "csi" : + "" + def create_cmd = extension.endsWith(".gz") ? 
"echo '' | gzip >" : "touch" + def create_index = extension.endsWith(".gz") && index.matches("csi|tbi") ? "touch ${prefix}.${extension}.${index}" : "" """ - touch ${prefix}.vcf.gz + ${create_cmd} ${prefix}.${extension} + ${create_index} cat <<-END_VERSIONS > versions.yml "${task.process}": svdb: \$( echo \$(svdb) | head -1 | sed 's/usage: SVDB-\\([0-9]\\.[0-9]\\.[0-9]\\).*/\\1/' ) - samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//') + bcftools: \$(bcftools --version 2>&1 | head -n1 | sed 's/^.*bcftools //; s/ .*\$//') END_VERSIONS """ } diff --git a/modules/nf-core/svdb/merge/meta.yml b/modules/nf-core/svdb/merge/meta.yml index e53e61fe74..c34a9cb157 100644 --- a/modules/nf-core/svdb/merge/meta.yml +++ b/modules/nf-core/svdb/merge/meta.yml @@ -10,34 +10,68 @@ tools: homepage: https://github.com/J35P312/SVDB documentation: https://github.com/J35P312/SVDB/blob/master/README.md licence: ["MIT"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test' ] - - priority: - type: list - description: prioritise the input vcf files according to this list, e.g ['tiddit','cnvnator'] - - vcfs: - type: list - description: Two or more VCF files. Order of files should correspond to the order of tags used for priority. - pattern: "*.{vcf,vcf.gz}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test' ] + - vcfs: + type: list + description: | + One or more VCF files. The order and number of files should correspond to + the order and number of tags in the `priority` input channel. + pattern: "*.{vcf,vcf.gz}" + - - input_priority: + type: list + description: | + Prioritize the input VCF files according to this list, + e.g ['tiddit','cnvnator']. The order and number of tags should correspond to + the order and number of VCFs in the `vcfs` input channel. 
+ - - sort_inputs: + type: boolean + description: | + Should the input files be sorted by name. The priority tag will be sorted + together with it's corresponding VCF file. output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test' ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - vcf: - type: file - description: merged VCF file - pattern: "*.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.{vcf,vcf.gz,bcf,bcf.gz}": + type: file + description: VCF output file + pattern: "*.{vcf,vcf.gz,bcf,bcf.gz}" + - tbi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tbi": + type: file + description: Alternative VCF file index + pattern: "*.tbi" + - csi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.csi": + type: file + description: Default VCF file index + pattern: "*.csi" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@ramprasadn" maintainers: - "@ramprasadn" + - "@fellen31" diff --git a/modules/nf-core/svdb/merge/tests/main.nf.test b/modules/nf-core/svdb/merge/tests/main.nf.test index 42f7c57067..6a79d7a09a 100644 --- a/modules/nf-core/svdb/merge/tests/main.nf.test +++ b/modules/nf-core/svdb/merge/tests/main.nf.test @@ -2,23 +2,108 @@ nextflow_process { name "Test Process SVDB_MERGE" script "modules/nf-core/svdb/merge/main.nf" + config "./nextflow.config" process "SVDB_MERGE" tag "modules" tag "modules_nfcore" tag "svdb" tag "svdb/merge" - test("test_svdb_merge") { + test("1 sample, [], []") { when { process { """ input[0] = Channel.of([ [ id:'test' ], // meta map - [file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf', checkIfExists: true) ] + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) ]) - input[1] = [ 'tiddit', 'cnvnator'] + input[1] = [] + input[2] = [] + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert path(process.out.vcf.get(0).get(1)).linesGzip[3].contains("--vcf test.vcf") }, // SVDB command line + { assert snapshot( + path(process.out.vcf.get(0).get(1)).vcf.summary, + path(process.out.vcf.get(0).get(1)).vcf.variantsMD5, + process.out.versions + ).match() } + ) + } + } + + test("1 sample, [], true") { + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ]) + input[1] = [] + input[2] = [] + """ + } + } + + then { + assertAll ( + { assert process.success }, + { 
assert path(process.out.vcf.get(0).get(1)).linesGzip[3].contains("--vcf test.vcf") }, // SVDB command line + { assert snapshot( + path(process.out.vcf.get(0).get(1)).vcf.summary, + path(process.out.vcf.get(0).get(1)).vcf.variantsMD5, + process.out.versions + ).match() } + ) + } + } + + test("1 sample, ['tiddit'], []") { + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ]) + input[1] = ['tiddit'] + input[2] = [] + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert path(process.out.vcf.get(0).get(1)).linesGzip[3].contains("--priority tiddit --vcf test.vcf:tiddit") }, // SVDB command line + { assert snapshot( + path(process.out.vcf.get(0).get(1)).vcf.summary, + path(process.out.vcf.get(0).get(1)).vcf.variantsMD5, + process.out.versions + ).match() } + ) + } + } + + test("1 sample, ['tiddit'], true") { + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ]) + input[1] = ['tiddit'] + input[2] = true """ } } @@ -26,23 +111,61 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert path(process.out.vcf.get(0).get(1)).linesGzip.contains("##fileformat=VCFv4.1") } + { assert path(process.out.vcf.get(0).get(1)).linesGzip[3].contains("--priority tiddit --vcf test.vcf:tiddit") }, // SVDB command line + { assert snapshot( + path(process.out.vcf.get(0).get(1)).vcf.summary, + path(process.out.vcf.get(0).get(1)).vcf.variantsMD5, + process.out.versions + ).match() } ) } + } + + test("2 samples, [], []") { + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test' ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf', checkIfExists: true), + file(params.modules_testdata_base_path + 
'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + ]) + input[1] = [] + input[2] = [] + """ + } + } + then { + assertAll ( + { assert process.success }, + { assert path(process.out.vcf.get(0).get(1)).linesGzip[3].contains("--vcf test2.vcf test.vcf") }, // SVDB command line + { assert snapshot( + path(process.out.vcf.get(0).get(1)).vcf.summary, + path(process.out.vcf.get(0).get(1)).vcf.variantsMD5, + process.out.versions + ).match() } + ) + } } - test("test_svdb_merge_noprio") { + test("2 samples, [], true") { when { process { """ input[0] = Channel.of([ [ id:'test' ], // meta map - [file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf', checkIfExists: true) ] + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] ]) input[1] = [] + input[2] = true """ } } @@ -50,10 +173,188 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert path(process.out.vcf.get(0).get(1)).linesGzip.contains("##fileformat=VCFv4.1") } + { assert path(process.out.vcf.get(0).get(1)).linesGzip[3].contains("--vcf test.vcf test2.vcf") }, // SVDB command line + { assert snapshot( + path(process.out.vcf.get(0).get(1)).vcf.summary, + path(process.out.vcf.get(0).get(1)).vcf.variantsMD5, + process.out.versions + ).match() } ) } + } + + test("2 samples, ['tiddit', 'cnvnator'], []") { + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test' ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + ]) + input[1] = ['tiddit', 'cnvnator'] + input[2] = [] + """ + } + } + + 
then { + assertAll ( + { assert process.success }, + { assert path(process.out.vcf.get(0).get(1)).linesGzip[3].contains("--priority tiddit,cnvnator --vcf test2.vcf:tiddit test.vcf:cnvnator") }, // SVDB command line + { assert snapshot( + path(process.out.vcf.get(0).get(1)).vcf.summary, + path(process.out.vcf.get(0).get(1)).vcf.variantsMD5, + process.out.versions + ).match() } + ) + } + } + + test("2 samples, ['tiddit', 'cnvnator'], true") { + when { + process { + """ + input[0] = Channel.of([ + [ id:'test' ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + ]) + input[1] = ['tiddit', 'cnvnator'] + input[2] = true + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert path(process.out.vcf.get(0).get(1)).linesGzip[3].contains("--priority tiddit,cnvnator --vcf test.vcf:cnvnator test2.vcf:tiddit") }, // SVDB command line + { assert snapshot( + path(process.out.vcf.get(0).get(1)).vcf.summary, + path(process.out.vcf.get(0).get(1)).vcf.variantsMD5, + process.out.versions + ).match() } + ) + } + } + + test("2 samples, [], [] - stub") { + + options "-stub" + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test' ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + ]) + input[1] = [] + input[2] = [] + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("2 samples, [], true - stub") { + + options "-stub" + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test' ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf', checkIfExists: 
true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + ]) + input[1] = [] + input[2] = true + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("2 samples, ['tiddit', 'cnvnator'], [] - stub") { + + options "-stub" + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test' ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + ]) + input[1] = ['tiddit', 'cnvnator'] + input[2] = [] + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("2 samples, ['tiddit', 'cnvnator'], true - stub") { + + options "-stub" + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test' ], // meta map + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + ] + ]) + input[1] = ['tiddit', 'cnvnator'] + input[2] = true + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } } } diff --git a/modules/nf-core/svdb/merge/tests/main.nf.test.snap b/modules/nf-core/svdb/merge/tests/main.nf.test.snap new file mode 100644 index 0000000000..e86662e533 --- /dev/null +++ b/modules/nf-core/svdb/merge/tests/main.nf.test.snap @@ -0,0 +1,294 @@ +{ + "1 sample, [], []": { + "content": [ + "VcfFile [chromosomes=[MT192765.1], sampleCount=1, variantCount=9, phased=false, phasedAutodetect=false]", + "60fb4cab2aa891bebef8ffdbd0e41bc3", + [ + "versions.yml:md5,bf8271626d334b2a827f94a2daacadd0" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + 
"timestamp": "2024-10-24T09:00:25.9277471" + }, + "2 samples, ['tiddit', 'cnvnator'], true - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "merged.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,bf8271626d334b2a827f94a2daacadd0" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "merged.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,bf8271626d334b2a827f94a2daacadd0" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-24T09:05:49.325618245" + }, + "2 samples, ['tiddit', 'cnvnator'], []": { + "content": [ + "VcfFile [chromosomes=[MT192765.1], sampleCount=2, variantCount=9, phased=false, phasedAutodetect=false]", + "254e56e4fc8356d68424828438da66e3", + [ + "versions.yml:md5,bf8271626d334b2a827f94a2daacadd0" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-24T09:02:27.964808463" + }, + "2 samples, [], []": { + "content": [ + "VcfFile [chromosomes=[MT192765.1], sampleCount=2, variantCount=9, phased=false, phasedAutodetect=false]", + "7ad648266e57d405b5b01aaea4613d1c", + [ + "versions.yml:md5,bf8271626d334b2a827f94a2daacadd0" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-24T09:02:11.013532413" + }, + "2 samples, ['tiddit', 'cnvnator'], true": { + "content": [ + "VcfFile [chromosomes=[MT192765.1], sampleCount=2, variantCount=9, phased=false, phasedAutodetect=false]", + "254e56e4fc8356d68424828438da66e3", + [ + "versions.yml:md5,bf8271626d334b2a827f94a2daacadd0" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-24T09:02:35.956320871" + }, + "1 sample, ['tiddit'], []": { + "content": [ + "VcfFile [chromosomes=[MT192765.1], sampleCount=1, variantCount=9, phased=false, phasedAutodetect=false]", + 
"9dd588cd870672b78192f48ad440b5d", + [ + "versions.yml:md5,bf8271626d334b2a827f94a2daacadd0" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-24T09:00:42.064583463" + }, + "1 sample, [], true": { + "content": [ + "VcfFile [chromosomes=[MT192765.1], sampleCount=1, variantCount=9, phased=false, phasedAutodetect=false]", + "60fb4cab2aa891bebef8ffdbd0e41bc3", + [ + "versions.yml:md5,bf8271626d334b2a827f94a2daacadd0" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-24T09:00:33.88572601" + }, + "1 sample, ['tiddit'], true": { + "content": [ + "VcfFile [chromosomes=[MT192765.1], sampleCount=1, variantCount=9, phased=false, phasedAutodetect=false]", + "9dd588cd870672b78192f48ad440b5d", + [ + "versions.yml:md5,bf8271626d334b2a827f94a2daacadd0" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-24T09:00:50.18149857" + }, + "2 samples, [], true": { + "content": [ + "VcfFile [chromosomes=[MT192765.1], sampleCount=2, variantCount=9, phased=false, phasedAutodetect=false]", + "de0a3b56cdee89e4c9cd4fbb4ad3391d", + [ + "versions.yml:md5,bf8271626d334b2a827f94a2daacadd0" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-24T09:02:19.556799178" + }, + "2 samples, ['tiddit', 'cnvnator'], [] - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "merged.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,bf8271626d334b2a827f94a2daacadd0" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "merged.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,bf8271626d334b2a827f94a2daacadd0" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-24T09:05:40.427970257" + }, + "2 samples, [], [] - stub": { + "content": [ + { + "0": 
[ + [ + { + "id": "test" + }, + "merged.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,bf8271626d334b2a827f94a2daacadd0" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "merged.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,bf8271626d334b2a827f94a2daacadd0" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-24T09:05:24.34471465" + }, + "2 samples, [], true - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "merged.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,bf8271626d334b2a827f94a2daacadd0" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test" + }, + "merged.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,bf8271626d334b2a827f94a2daacadd0" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-24T09:05:32.529261733" + } +} \ No newline at end of file diff --git a/modules/nf-core/svdb/merge/tests/nextflow.config b/modules/nf-core/svdb/merge/tests/nextflow.config new file mode 100644 index 0000000000..c267037ca0 --- /dev/null +++ b/modules/nf-core/svdb/merge/tests/nextflow.config @@ -0,0 +1,6 @@ +process { + withName: 'SVDB_MERGE' { + ext.prefix = "merged" + ext.args2 = '--output-type z --no-version' + } +} diff --git a/modules/nf-core/tabix/bgziptabix/environment.yml b/modules/nf-core/tabix/bgziptabix/environment.yml index c4235872e3..017c259da1 100644 --- a/modules/nf-core/tabix/bgziptabix/environment.yml +++ b/modules/nf-core/tabix/bgziptabix/environment.yml @@ -1,8 +1,7 @@ -name: tabix_bgziptabix channels: - conda-forge - bioconda - - defaults + dependencies: + - bioconda::htslib=1.20 - bioconda::tabix=1.11 - - bioconda::htslib=1.19.1 diff --git 
a/modules/nf-core/tabix/bgziptabix/main.nf b/modules/nf-core/tabix/bgziptabix/main.nf index bcdcf2a689..22f37a7739 100644 --- a/modules/nf-core/tabix/bgziptabix/main.nf +++ b/modules/nf-core/tabix/bgziptabix/main.nf @@ -4,8 +4,8 @@ process TABIX_BGZIPTABIX { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/htslib:1.19.1--h81da01d_1' : - 'biocontainers/htslib:1.19.1--h81da01d_1' }" + 'https://depot.galaxyproject.org/singularity/htslib:1.20--h5efdd21_2' : + 'biocontainers/htslib:1.20--h5efdd21_2' }" input: tuple val(meta), path(input) @@ -24,7 +24,7 @@ process TABIX_BGZIPTABIX { def prefix = task.ext.prefix ?: "${meta.id}" """ bgzip --threads ${task.cpus} -c $args $input > ${prefix}.${input.getExtension()}.gz - tabix $args2 ${prefix}.${input.getExtension()}.gz + tabix --threads ${task.cpus} $args2 ${prefix}.${input.getExtension()}.gz cat <<-END_VERSIONS > versions.yml "${task.process}": @@ -34,10 +34,11 @@ process TABIX_BGZIPTABIX { stub: def prefix = task.ext.prefix ?: "${meta.id}" + def args2 = task.ext.args2 ?: '' + def index = args2.contains("-C ") || args2.contains("--csi") ? 
"csi" : "tbi" """ echo "" | gzip > ${prefix}.${input.getExtension()}.gz - touch ${prefix}.${input.getExtension()}.gz.tbi - touch ${prefix}.${input.getExtension()}.gz.csi + touch ${prefix}.${input.getExtension()}.gz.${index} cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/tabix/bgziptabix/meta.yml b/modules/nf-core/tabix/bgziptabix/meta.yml index 438aba4d18..806fbc121f 100644 --- a/modules/nf-core/tabix/bgziptabix/meta.yml +++ b/modules/nf-core/tabix/bgziptabix/meta.yml @@ -13,38 +13,50 @@ tools: documentation: https://www.htslib.org/doc/tabix.1.html doi: 10.1093/bioinformatics/btq671 licence: ["MIT"] + identifier: biotools:tabix input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - tab: - type: file - description: TAB-delimited genome position file - pattern: "*.{bed,gff,sam,vcf}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: Sorted tab-delimited genome file output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - gz: - type: file - description: Output compressed file - pattern: "*.{gz}" - - tbi: - type: file - description: tabix index file - pattern: "*.{gz.tbi}" - - csi: - type: file - description: tabix alternate index file - pattern: "*.{gz.csi}" + - gz_tbi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.gz": + type: file + description: bgzipped tab-delimited genome file + pattern: "*.gz" + - "*.tbi": + type: file + description: tabix index file + pattern: "*.tbi" + - gz_csi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.gz": + type: file + description: bgzipped tab-delimited genome file + pattern: "*.gz" + - "*.csi": + type: file + description: csi index file + pattern: "*.csi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@maxulysse" - "@DLBPointon" diff --git a/modules/nf-core/tabix/bgziptabix/tests/main.nf.test b/modules/nf-core/tabix/bgziptabix/tests/main.nf.test index 87ea2c84f9..4d4130dc07 100644 --- a/modules/nf-core/tabix/bgziptabix/tests/main.nf.test +++ b/modules/nf-core/tabix/bgziptabix/tests/main.nf.test @@ -17,7 +17,7 @@ nextflow_process { """ input[0] = [ [ id:'tbi_test' ], - [ file(params.test_data['sarscov2']['genome']['test_bed'], checkIfExists: true) ] + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test.bed', checkIfExists: true) ] ] """ } @@ -43,7 +43,7 @@ nextflow_process { """ input[0] = [ [ id:'csi_test' ], - [ file(params.test_data['sarscov2']['genome']['test_bed'], checkIfExists: true) ] + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test.bed', checkIfExists: true) ] ] """ } @@ -72,7 +72,7 @@ nextflow_process { """ input[0] = [ [ id:'test' ], - [ file(params.test_data['sarscov2']['genome']['test_bed'], checkIfExists: true) ] + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test.bed', checkIfExists: true) ] ] """ } @@ -91,4 +91,33 @@ nextflow_process { } + test("sarscov2_bed_tbi_stub") { + config "./tabix_tbi.config" + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test.bed', checkIfExists: true) ] + ] + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() }, + { assert snapshot( + 
file(process.out.gz_tbi[0][1]).name + ).match("tbi_stub") + } + ) + } + + } + } diff --git a/modules/nf-core/tabix/bgziptabix/tests/main.nf.test.snap b/modules/nf-core/tabix/bgziptabix/tests/main.nf.test.snap index fcecb2e492..fb87799b20 100644 --- a/modules/nf-core/tabix/bgziptabix/tests/main.nf.test.snap +++ b/modules/nf-core/tabix/bgziptabix/tests/main.nf.test.snap @@ -8,14 +8,14 @@ "id": "tbi_test" }, "tbi_test.bed.gz:md5,fe4053cf4de3aebbdfc3be2efb125a74", - "tbi_test.bed.gz.tbi:md5,24908545311cf2b7c803c41d716872c4" + "tbi_test.bed.gz.tbi:md5,ca06caf88b1e3c67d5fcba0a1460b52c" ] ], "1": [ ], "2": [ - "versions.yml:md5,b4765e4d896ce4a4cdd6c896d12555fc" + "versions.yml:md5,736e7c3b16a3ac525253e5b5f5d8fdfa" ], "gz_csi": [ @@ -26,15 +26,19 @@ "id": "tbi_test" }, "tbi_test.bed.gz:md5,fe4053cf4de3aebbdfc3be2efb125a74", - "tbi_test.bed.gz.tbi:md5,24908545311cf2b7c803c41d716872c4" + "tbi_test.bed.gz.tbi:md5,ca06caf88b1e3c67d5fcba0a1460b52c" ] ], "versions": [ - "versions.yml:md5,b4765e4d896ce4a4cdd6c896d12555fc" + "versions.yml:md5,736e7c3b16a3ac525253e5b5f5d8fdfa" ] } ], - "timestamp": "2024-02-19T14:50:51.513838" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-19T11:29:16.053817543" }, "sarscov2_bed_csi": { "content": [ @@ -48,11 +52,11 @@ "id": "csi_test" }, "csi_test.bed.gz:md5,fe4053cf4de3aebbdfc3be2efb125a74", - "csi_test.bed.gz.csi:md5,e06165ddd34640783728cb07f2558b43" + "csi_test.bed.gz.csi:md5,c9c0377de58fdc89672bb3005a0d69f5" ] ], "2": [ - "versions.yml:md5,b4765e4d896ce4a4cdd6c896d12555fc" + "versions.yml:md5,736e7c3b16a3ac525253e5b5f5d8fdfa" ], "gz_csi": [ [ @@ -60,48 +64,109 @@ "id": "csi_test" }, "csi_test.bed.gz:md5,fe4053cf4de3aebbdfc3be2efb125a74", - "csi_test.bed.gz.csi:md5,e06165ddd34640783728cb07f2558b43" + "csi_test.bed.gz.csi:md5,c9c0377de58fdc89672bb3005a0d69f5" ] ], "gz_tbi": [ ], "versions": [ - "versions.yml:md5,b4765e4d896ce4a4cdd6c896d12555fc" + "versions.yml:md5,736e7c3b16a3ac525253e5b5f5d8fdfa" ] } ], 
- "timestamp": "2024-02-19T14:51:00.513777" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-19T11:29:27.667745444" }, "csi_test": { "content": [ "csi_test.bed.gz" ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, "timestamp": "2024-02-19T14:51:00.548801" }, + "sarscov2_bed_tbi_stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test.bed.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + + ], + "2": [ + "versions.yml:md5,736e7c3b16a3ac525253e5b5f5d8fdfa" + ], + "gz_csi": [ + + ], + "gz_tbi": [ + [ + { + "id": "test" + }, + "test.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test.bed.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,736e7c3b16a3ac525253e5b5f5d8fdfa" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-25T14:45:18.533169949" + }, "csi_stub": { "content": [ "test.bed.gz" ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, "timestamp": "2024-02-19T14:51:09.218454" }, + "tbi_stub": { + "content": [ + "test.bed.gz" + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-25T14:45:18.550930179" + }, "tbi_test": { "content": [ "tbi_test.bed.gz" ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, "timestamp": "2024-02-19T14:50:51.579654" }, "sarscov2_bed_csi_stub": { "content": [ { "0": [ - [ - { - "id": "test" - }, - "test.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", - "test.bed.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" - ] + ], "1": [ [ @@ -113,7 +178,7 @@ ] ], "2": [ - "versions.yml:md5,b4765e4d896ce4a4cdd6c896d12555fc" + "versions.yml:md5,736e7c3b16a3ac525253e5b5f5d8fdfa" ], "gz_csi": [ [ @@ -125,19 +190,17 @@ ] ], "gz_tbi": [ - [ - { - "id": "test" - }, - "test.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", - "test.bed.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" 
- ] + ], "versions": [ - "versions.yml:md5,b4765e4d896ce4a4cdd6c896d12555fc" + "versions.yml:md5,736e7c3b16a3ac525253e5b5f5d8fdfa" ] } ], - "timestamp": "2024-02-19T14:51:09.164254" + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-25T14:44:19.786135972" } } \ No newline at end of file diff --git a/modules/nf-core/tabix/tabix/environment.yml b/modules/nf-core/tabix/tabix/environment.yml index 76b45e16c8..017c259da1 100644 --- a/modules/nf-core/tabix/tabix/environment.yml +++ b/modules/nf-core/tabix/tabix/environment.yml @@ -1,8 +1,7 @@ -name: tabix_tabix channels: - conda-forge - bioconda - - defaults + dependencies: + - bioconda::htslib=1.20 - bioconda::tabix=1.11 - - bioconda::htslib=1.19.1 diff --git a/modules/nf-core/tabix/tabix/main.nf b/modules/nf-core/tabix/tabix/main.nf index 1737141d7f..13acd670ea 100644 --- a/modules/nf-core/tabix/tabix/main.nf +++ b/modules/nf-core/tabix/tabix/main.nf @@ -4,8 +4,8 @@ process TABIX_TABIX { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/htslib:1.19.1--h81da01d_1' : - 'biocontainers/htslib:1.19.1--h81da01d_1' }" + 'https://depot.galaxyproject.org/singularity/htslib:1.20--h5efdd21_2' : + 'biocontainers/htslib:1.20--h5efdd21_2' }" input: tuple val(meta), path(tab) @@ -21,7 +21,10 @@ process TABIX_TABIX { script: def args = task.ext.args ?: '' """ - tabix $args $tab + tabix \\ + --threads $task.cpus \\ + $args \\ + $tab cat <<-END_VERSIONS > versions.yml "${task.process}": @@ -33,8 +36,8 @@ process TABIX_TABIX { """ touch ${tab}.tbi touch ${tab}.csi - cat <<-END_VERSIONS > versions.yml + cat <<-END_VERSIONS > versions.yml "${task.process}": tabix: \$(echo \$(tabix -h 2>&1) | sed 's/^.*Version: //; s/ .*\$//') END_VERSIONS diff --git a/modules/nf-core/tabix/tabix/meta.yml b/modules/nf-core/tabix/tabix/meta.yml index ae5b4f439f..7864832d93 100644 --- a/modules/nf-core/tabix/tabix/meta.yml +++ b/modules/nf-core/tabix/tabix/meta.yml @@ -11,34 +11,43 @@ tools: documentation: https://www.htslib.org/doc/tabix.1.html doi: 10.1093/bioinformatics/btq671 licence: ["MIT"] + identifier: biotools:tabix input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - tab: - type: file - description: TAB-delimited genome position file compressed with bgzip - pattern: "*.{bed.gz,gff.gz,sam.gz,vcf.gz}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - tab: + type: file + description: TAB-delimited genome position file compressed with bgzip + pattern: "*.{bed.gz,gff.gz,sam.gz,vcf.gz}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - tbi: - type: file - description: tabix index file - pattern: "*.{tbi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.tbi": + type: file + description: tabix index file + pattern: "*.{tbi}" - csi: - type: file - description: coordinate sorted index file - pattern: "*.{csi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.csi": + type: file + description: coordinate sorted index file + pattern: "*.{csi}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/tabix/tabix/tests/main.nf.test b/modules/nf-core/tabix/tabix/tests/main.nf.test index 3a150c708f..102b0d7bf3 100644 --- a/modules/nf-core/tabix/tabix/tests/main.nf.test +++ b/modules/nf-core/tabix/tabix/tests/main.nf.test @@ -16,7 +16,7 @@ nextflow_process { """ input[0] = [ [ id:'tbi_bed' ], - [ file(params.test_data['sarscov2']['genome']['test_bed_gz'], checkIfExists: true) ] + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test.bed.gz', checkIfExists: true) ] ] """ } @@ -25,11 +25,10 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert snapshot(process.out).match() }, { assert snapshot( - file(process.out.tbi[0][1]).name - ).match("tbi_bed") - } + process.out, + file(process.out.tbi[0][1]).name + ).match() } ) } } @@ -41,7 +40,7 @@ nextflow_process { """ input[0] = [ [ id:'tbi_gff' ], - [ file(params.test_data['sarscov2']['genome']['genome_gff3_gz'], checkIfExists: true) ] + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.gff3.gz', checkIfExists: true) ] ] """ } @@ -50,11 +49,9 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert snapshot(process.out).match() }, { assert snapshot( - file(process.out.tbi[0][1]).name - ).match("tbi_gff") - } + process.out, + 
file(process.out.tbi[0][1]).name).match() } ) } @@ -67,7 +64,7 @@ nextflow_process { """ input[0] = [ [ id:'tbi_vcf' ], - [ file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true) ] + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true) ] ] """ } @@ -76,11 +73,10 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert snapshot(process.out).match() }, { assert snapshot( - file(process.out.tbi[0][1]).name - ).match("tbi_vcf") - } + process.out, + file(process.out.tbi[0][1]).name + ).match() } ) } @@ -93,7 +89,7 @@ nextflow_process { """ input[0] = [ [ id:'vcf_csi' ], - [ file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true) ] + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true) ] ] """ } @@ -102,11 +98,10 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert snapshot(process.out).match() }, { assert snapshot( - file(process.out.csi[0][1]).name - ).match("vcf_csi") - } + process.out, + file(process.out.csi[0][1]).name + ).match() } ) } @@ -120,7 +115,7 @@ nextflow_process { """ input[0] = [ [ id:'vcf_csi_stub' ], - [ file(params.test_data['sarscov2']['illumina']['test_vcf_gz'], checkIfExists: true) ] + [ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true) ] ] """ } @@ -129,11 +124,10 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert snapshot(process.out).match() }, { assert snapshot( - file(process.out.csi[0][1]).name - ).match("vcf_csi_stub") - } + process.out, + file(process.out.csi[0][1]).name + ).match() } ) } diff --git a/modules/nf-core/tabix/tabix/tests/main.nf.test.snap b/modules/nf-core/tabix/tabix/tests/main.nf.test.snap index 034e38b688..c2b9ed0b80 100644 --- a/modules/nf-core/tabix/tabix/tests/main.nf.test.snap +++ 
b/modules/nf-core/tabix/tabix/tests/main.nf.test.snap @@ -1,16 +1,4 @@ { - "vcf_csi_stub": { - "content": [ - "test.vcf.gz.csi" - ], - "timestamp": "2024-03-04T14:51:59.788002" - }, - "tbi_gff": { - "content": [ - "genome.gff3.gz.tbi" - ], - "timestamp": "2024-02-19T14:53:37.420216" - }, "sarscov2_gff_tbi": { "content": [ { @@ -19,14 +7,14 @@ { "id": "tbi_gff" }, - "genome.gff3.gz.tbi:md5,53fc683fd217aae47ef10d23c52a9178" + "genome.gff3.gz.tbi:md5,f79a67d95a98076e04fbe0455d825926" ] ], "1": [ ], "2": [ - "versions.yml:md5,f4feeda7fdd4b567102f7f8e5d7037a3" + "versions.yml:md5,07064637fb8a217174052be8e40234e2" ], "csi": [ @@ -36,15 +24,20 @@ { "id": "tbi_gff" }, - "genome.gff3.gz.tbi:md5,53fc683fd217aae47ef10d23c52a9178" + "genome.gff3.gz.tbi:md5,f79a67d95a98076e04fbe0455d825926" ] ], "versions": [ - "versions.yml:md5,f4feeda7fdd4b567102f7f8e5d7037a3" + "versions.yml:md5,07064637fb8a217174052be8e40234e2" ] - } + }, + "genome.gff3.gz.tbi" ], - "timestamp": "2024-02-19T14:53:37.388157" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-19T12:06:25.653807564" }, "sarscov2_bedgz_tbi": { "content": [ @@ -54,14 +47,14 @@ { "id": "tbi_bed" }, - "test.bed.gz.tbi:md5,0f17d85e7f0a042b2aa367b70df224f8" + "test.bed.gz.tbi:md5,9a761d51cc81835fd1199201fdbcdd5d" ] ], "1": [ ], "2": [ - "versions.yml:md5,f4feeda7fdd4b567102f7f8e5d7037a3" + "versions.yml:md5,07064637fb8a217174052be8e40234e2" ], "csi": [ @@ -71,27 +64,20 @@ { "id": "tbi_bed" }, - "test.bed.gz.tbi:md5,0f17d85e7f0a042b2aa367b70df224f8" + "test.bed.gz.tbi:md5,9a761d51cc81835fd1199201fdbcdd5d" ] ], "versions": [ - "versions.yml:md5,f4feeda7fdd4b567102f7f8e5d7037a3" + "versions.yml:md5,07064637fb8a217174052be8e40234e2" ] - } - ], - "timestamp": "2024-02-19T14:53:28.879408" - }, - "tbi_vcf": { - "content": [ - "test.vcf.gz.tbi" - ], - "timestamp": "2024-02-19T14:53:46.402522" - }, - "vcf_csi": { - "content": [ - "test.vcf.gz.csi" + }, + "test.bed.gz.tbi" ], - "timestamp": 
"2024-02-19T14:53:54.921189" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-19T12:06:09.754082161" }, "sarscov2_vcf_tbi": { "content": [ @@ -101,14 +87,14 @@ { "id": "tbi_vcf" }, - "test.vcf.gz.tbi:md5,897f3f378a811b90e6dee56ce08d2bcf" + "test.vcf.gz.tbi:md5,d22e5b84e4fcd18792179f72e6da702e" ] ], "1": [ ], "2": [ - "versions.yml:md5,f4feeda7fdd4b567102f7f8e5d7037a3" + "versions.yml:md5,07064637fb8a217174052be8e40234e2" ], "csi": [ @@ -118,15 +104,20 @@ { "id": "tbi_vcf" }, - "test.vcf.gz.tbi:md5,897f3f378a811b90e6dee56ce08d2bcf" + "test.vcf.gz.tbi:md5,d22e5b84e4fcd18792179f72e6da702e" ] ], "versions": [ - "versions.yml:md5,f4feeda7fdd4b567102f7f8e5d7037a3" + "versions.yml:md5,07064637fb8a217174052be8e40234e2" ] - } + }, + "test.vcf.gz.tbi" ], - "timestamp": "2024-02-19T14:53:46.370358" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-19T12:06:40.042648294" }, "sarscov2_vcf_csi_stub": { "content": [ @@ -148,7 +139,7 @@ ] ], "2": [ - "versions.yml:md5,3d45df6d80883bad358631069a2940fd" + "versions.yml:md5,07064637fb8a217174052be8e40234e2" ], "csi": [ [ @@ -167,11 +158,16 @@ ] ], "versions": [ - "versions.yml:md5,3d45df6d80883bad358631069a2940fd" + "versions.yml:md5,07064637fb8a217174052be8e40234e2" ] - } + }, + "test.vcf.gz.csi" ], - "timestamp": "2024-03-04T14:51:59.766184" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-19T12:07:08.700367261" }, "sarscov2_vcf_csi": { "content": [ @@ -184,34 +180,33 @@ { "id": "vcf_csi" }, - "test.vcf.gz.csi:md5,0731ad6f40104d2bbb1a2cc478ef8f03" + "test.vcf.gz.csi:md5,04b41c1efd9ab3c6b1e008a286e27d2b" ] ], "2": [ - "versions.yml:md5,f4feeda7fdd4b567102f7f8e5d7037a3" + "versions.yml:md5,07064637fb8a217174052be8e40234e2" ], "csi": [ [ { "id": "vcf_csi" }, - "test.vcf.gz.csi:md5,0731ad6f40104d2bbb1a2cc478ef8f03" + "test.vcf.gz.csi:md5,04b41c1efd9ab3c6b1e008a286e27d2b" ] ], "tbi": [ ], "versions": [ - 
"versions.yml:md5,f4feeda7fdd4b567102f7f8e5d7037a3" + "versions.yml:md5,07064637fb8a217174052be8e40234e2" ] - } - ], - "timestamp": "2024-02-19T14:53:54.886876" - }, - "tbi_bed": { - "content": [ - "test.bed.gz.tbi" + }, + "test.vcf.gz.csi" ], - "timestamp": "2024-02-19T14:53:28.947628" + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-07-19T12:06:55.362067748" } } \ No newline at end of file diff --git a/modules/nf-core/tiddit/sv/environment.yml b/modules/nf-core/tiddit/sv/environment.yml index d0367f1717..2fd01cfd4b 100644 --- a/modules/nf-core/tiddit/sv/environment.yml +++ b/modules/nf-core/tiddit/sv/environment.yml @@ -1,7 +1,5 @@ -name: tiddit_sv channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::tiddit=3.6.1 diff --git a/modules/nf-core/tiddit/sv/meta.yml b/modules/nf-core/tiddit/sv/meta.yml index bfcbc4e3fd..21527baf13 100644 --- a/modules/nf-core/tiddit/sv/meta.yml +++ b/modules/nf-core/tiddit/sv/meta.yml @@ -11,56 +11,65 @@ tools: documentation: https://github.com/SciLifeLab/TIDDIT/blob/master/README.md doi: 10.12688/f1000research.11168.1 licence: ["GPL-3.0-or-later"] + identifier: biotools:tiddit input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM file - pattern: "*.{bam,cram}" - - input_index: - type: file - description: BAM/CRAM index file - pattern: "*.{bai,crai}" - - meta2: - type: map - description: | - Groovy Map containing sample information - e.g. `[ id:'test_fasta']` - - fasta: - type: file - description: Input FASTA file - pattern: "*.{fasta,fa}" - - meta3: - type: map - description: | - Groovy Map containing sample information from bwa index - e.g. 
`[ id:'test_bwa-index' ]` - - bwa_index: - type: file - description: BWA genome index files - pattern: "Directory containing BWA index *.{amb,ann,bwt,pac,sa}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM file + pattern: "*.{bam,cram}" + - input_index: + type: file + description: BAM/CRAM index file + pattern: "*.{bai,crai}" + - - meta2: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'test_fasta']` + - fasta: + type: file + description: Input FASTA file + pattern: "*.{fasta,fa}" + - - meta3: + type: map + description: | + Groovy Map containing sample information from bwa index + e.g. `[ id:'test_bwa-index' ]` + - bwa_index: + type: file + description: BWA genome index files + pattern: "Directory containing BWA index *.{amb,ann,bwt,pac,sa}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - vcf: - type: file - description: vcf - pattern: "*.{vcf}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.vcf": + type: file + description: vcf + pattern: "*.{vcf}" - ploidy: - type: file - description: tab - pattern: "*.{ploidies.tab}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.ploidies.tab": + type: file + description: tab + pattern: "*.{ploidies.tab}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@maxulysse" maintainers: diff --git a/modules/nf-core/untar/environment.yml b/modules/nf-core/untar/environment.yml index 0c9cbb101d..c7794856d8 100644 --- a/modules/nf-core/untar/environment.yml +++ b/modules/nf-core/untar/environment.yml @@ -1,11 +1,7 @@ -name: untar - channels: - conda-forge - bioconda - - defaults - dependencies: - conda-forge::grep=3.11 - - conda-forge::sed=4.7 + - conda-forge::sed=4.8 - conda-forge::tar=1.34 diff --git a/modules/nf-core/untar/main.nf b/modules/nf-core/untar/main.nf index 8a75bb957d..9bd8f55461 100644 --- a/modules/nf-core/untar/main.nf +++ b/modules/nf-core/untar/main.nf @@ -4,8 +4,8 @@ process UNTAR { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/ubuntu:20.04' : - 'nf-core/ubuntu:20.04' }" + 'https://depot.galaxyproject.org/singularity/ubuntu:22.04' : + 'nf-core/ubuntu:22.04' }" input: tuple val(meta), path(archive) @@ -52,8 +52,29 @@ process UNTAR { stub: prefix = task.ext.prefix ?: ( meta.id ? 
"${meta.id}" : archive.toString().replaceFirst(/\.[^\.]+(.gz)?$/, "")) """ - mkdir $prefix - touch ${prefix}/file.txt + mkdir ${prefix} + ## Dry-run untaring the archive to get the files and place all in prefix + if [[ \$(tar -taf ${archive} | grep -o -P "^.*?\\/" | uniq | wc -l) -eq 1 ]]; then + for i in `tar -tf ${archive}`; + do + if [[ \$(echo "\${i}" | grep -E "/\$") == "" ]]; + then + touch \${i} + else + mkdir -p \${i} + fi + done + else + for i in `tar -tf ${archive}`; + do + if [[ \$(echo "\${i}" | grep -E "/\$") == "" ]]; + then + touch ${prefix}/\${i} + else + mkdir -p ${prefix}/\${i} + fi + done + fi cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/untar/meta.yml b/modules/nf-core/untar/meta.yml index a9a2110f55..290346b3fa 100644 --- a/modules/nf-core/untar/meta.yml +++ b/modules/nf-core/untar/meta.yml @@ -10,30 +10,33 @@ tools: Extract tar.gz files. documentation: https://www.gnu.org/software/tar/manual/ licence: ["GPL-3.0-or-later"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - archive: - type: file - description: File to be untar - pattern: "*.{tar}.{gz}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - archive: + type: file + description: File to be untar + pattern: "*.{tar}.{gz}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - untar: - type: directory - description: Directory containing contents of archive - pattern: "*/" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - $prefix: + type: directory + description: Directory containing contents of archive + pattern: "*/" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/untar/tests/main.nf.test b/modules/nf-core/untar/tests/main.nf.test index 2a7c97bf81..c957517aaa 100644 --- a/modules/nf-core/untar/tests/main.nf.test +++ b/modules/nf-core/untar/tests/main.nf.test @@ -6,6 +6,7 @@ nextflow_process { tag "modules" tag "modules_nfcore" tag "untar" + test("test_untar") { when { @@ -19,10 +20,9 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert snapshot(process.out.untar).match("test_untar") }, + { assert snapshot(process.out).match() }, ) } - } test("test_untar_onlyfiles") { @@ -38,10 +38,48 @@ nextflow_process { then { assertAll ( { assert process.success }, - { assert snapshot(process.out.untar).match("test_untar_onlyfiles") }, + { assert snapshot(process.out).match() }, ) } + } + + test("test_untar - stub") { + + options "-stub" + when { + process { + """ + input[0] = [ [], file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/db/kraken2.tar.gz', checkIfExists: true) ] + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() }, + ) + } } + test("test_untar_onlyfiles - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ [], file(params.modules_testdata_base_path + 'generic/tar/hello.tar.gz', checkIfExists: true) ] + """ + } + } + + then { + assertAll ( + { assert process.success }, + { assert snapshot(process.out).match() }, + ) + } + } } diff --git a/modules/nf-core/untar/tests/main.nf.test.snap b/modules/nf-core/untar/tests/main.nf.test.snap index 64550292f3..ceb91b7925 100644 --- 
a/modules/nf-core/untar/tests/main.nf.test.snap +++ b/modules/nf-core/untar/tests/main.nf.test.snap @@ -1,42 +1,158 @@ { "test_untar_onlyfiles": { "content": [ - [ - [ + { + "0": [ [ - - ], + [ + + ], + [ + "hello.txt:md5,e59ff97941044f85df5297e1c302d260" + ] + ] + ], + "1": [ + "versions.yml:md5,6063247258c56fd271d076bb04dd7536" + ], + "untar": [ + [ + [ + + ], + [ + "hello.txt:md5,e59ff97941044f85df5297e1c302d260" + ] + ] + ], + "versions": [ + "versions.yml:md5,6063247258c56fd271d076bb04dd7536" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-10T12:04:28.231047" + }, + "test_untar_onlyfiles - stub": { + "content": [ + { + "0": [ + [ + [ + + ], + [ + "hello.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "1": [ + "versions.yml:md5,6063247258c56fd271d076bb04dd7536" + ], + "untar": [ [ - "hello.txt:md5,e59ff97941044f85df5297e1c302d260" + [ + + ], + [ + "hello.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ] + ], + "versions": [ + "versions.yml:md5,6063247258c56fd271d076bb04dd7536" ] - ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.3" }, - "timestamp": "2024-02-28T11:49:41.320643" + "timestamp": "2024-07-10T12:04:45.773103" + }, + "test_untar - stub": { + "content": [ + { + "0": [ + [ + [ + + ], + [ + "hash.k2d:md5,d41d8cd98f00b204e9800998ecf8427e", + "opts.k2d:md5,d41d8cd98f00b204e9800998ecf8427e", + "taxo.k2d:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "1": [ + "versions.yml:md5,6063247258c56fd271d076bb04dd7536" + ], + "untar": [ + [ + [ + + ], + [ + "hash.k2d:md5,d41d8cd98f00b204e9800998ecf8427e", + "opts.k2d:md5,d41d8cd98f00b204e9800998ecf8427e", + "taxo.k2d:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "versions": [ + "versions.yml:md5,6063247258c56fd271d076bb04dd7536" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.3" + }, + "timestamp": "2024-07-10T12:04:36.777441" }, "test_untar": { "content": [ - [ - [ + { + "0": [ [ - - ], + [ 
+ + ], + [ + "hash.k2d:md5,8b8598468f54a7087c203ad0190555d9", + "opts.k2d:md5,a033d00cf6759407010b21700938f543", + "taxo.k2d:md5,094d5891cdccf2f1468088855c214b2c" + ] + ] + ], + "1": [ + "versions.yml:md5,6063247258c56fd271d076bb04dd7536" + ], + "untar": [ [ - "hash.k2d:md5,8b8598468f54a7087c203ad0190555d9", - "opts.k2d:md5,a033d00cf6759407010b21700938f543", - "taxo.k2d:md5,094d5891cdccf2f1468088855c214b2c" + [ + + ], + [ + "hash.k2d:md5,8b8598468f54a7087c203ad0190555d9", + "opts.k2d:md5,a033d00cf6759407010b21700938f543", + "taxo.k2d:md5,094d5891cdccf2f1468088855c214b2c" + ] ] + ], + "versions": [ + "versions.yml:md5,6063247258c56fd271d076bb04dd7536" ] - ] + } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nextflow": "24.04.3" }, - "timestamp": "2024-02-28T11:49:33.795172" + "timestamp": "2024-07-10T12:04:19.377674" } } \ No newline at end of file diff --git a/modules/nf-core/unzip/environment.yml b/modules/nf-core/unzip/environment.yml index d3a535f170..e93c649f44 100644 --- a/modules/nf-core/unzip/environment.yml +++ b/modules/nf-core/unzip/environment.yml @@ -1,7 +1,5 @@ -name: unzip channels: - conda-forge - bioconda - - defaults dependencies: - conda-forge::p7zip=16.02 diff --git a/modules/nf-core/unzip/main.nf b/modules/nf-core/unzip/main.nf index 08cfc3c406..a0c02109cd 100644 --- a/modules/nf-core/unzip/main.nf +++ b/modules/nf-core/unzip/main.nf @@ -20,7 +20,6 @@ process UNZIP { script: def args = task.ext.args ?: '' if ( archive instanceof List && archive.name.size > 1 ) { error "[UNZIP] error: 7za only accepts a single archive as input. Please check module input." } - prefix = task.ext.prefix ?: ( meta.id ? "${meta.id}" : archive.baseName) """ 7za \\ @@ -34,4 +33,17 @@ process UNZIP { 7za: \$(echo \$(7za --help) | sed 's/.*p7zip Version //; s/(.*//') END_VERSIONS """ + + stub: + def args = task.ext.args ?: '' + if ( archive instanceof List && archive.name.size > 1 ) { error "[UNZIP] error: 7za only accepts a single archive as input. 
Please check module input." } + prefix = task.ext.prefix ?: ( meta.id ? "${meta.id}" : archive.baseName) + """ + mkdir "${prefix}" + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + 7za: \$(echo \$(7za --help) | sed 's/.*p7zip Version //; s/(.*//') + END_VERSIONS + """ } diff --git a/modules/nf-core/unzip/meta.yml b/modules/nf-core/unzip/meta.yml index e8e377e2af..426fccb76b 100644 --- a/modules/nf-core/unzip/meta.yml +++ b/modules/nf-core/unzip/meta.yml @@ -7,35 +7,39 @@ keywords: - archiving tools: - unzip: - description: p7zip is a quick port of 7z.exe and 7za.exe (command line version of 7zip, see www.7-zip.org) for Unix. + description: p7zip is a quick port of 7z.exe and 7za.exe (command line version + of 7zip, see www.7-zip.org) for Unix. homepage: https://sourceforge.net/projects/p7zip/ documentation: https://sourceforge.net/projects/p7zip/ tool_dev_url: https://sourceforge.net/projects/p7zip" licence: ["LGPL-2.1-or-later"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - archive: - type: file - description: ZIP file - pattern: "*.zip" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - archive: + type: file + description: ZIP file + pattern: "*.zip" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - unzipped_archive: - type: directory - description: Directory contents of the unzipped archive - pattern: "${archive.baseName}/" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}/: + type: directory + description: Directory contents of the unzipped archive + pattern: "${archive.baseName}/" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@jfy133" maintainers: diff --git a/modules/nf-core/unzip/tests/main.nf.test b/modules/nf-core/unzip/tests/main.nf.test new file mode 100644 index 0000000000..238b68d8ba --- /dev/null +++ b/modules/nf-core/unzip/tests/main.nf.test @@ -0,0 +1,54 @@ +nextflow_process { + + name "Test Process UNZIP" + script "../main.nf" + process "UNZIP" + + tag "modules" + tag "modules_nfcore" + tag "unzip" + + test("generic [tar] [tar_gz]") { + + when { + process { + """ + input[0] = [ + [ id: 'hello' ], + file(params.modules_testdata_base_path + 'generic/tar/hello.tar.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } + + test("generic [tar] [tar_gz] stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id: 'hello' ], + file(params.modules_testdata_base_path + 'generic/tar/hello.tar.gz', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + } +} diff --git a/modules/nf-core/unzip/tests/main.nf.test.snap b/modules/nf-core/unzip/tests/main.nf.test.snap new file mode 100644 index 0000000000..cdd2ab1641 --- /dev/null +++ b/modules/nf-core/unzip/tests/main.nf.test.snap @@ -0,0 +1,76 @@ +{ + "generic [tar] [tar_gz] stub": { + "content": [ + { + "0": [ + [ + { + "id": "hello" + }, + [ + + ] + ] + ], + "1": [ + "versions.yml:md5,52c55ce814e8bc9edc5a6c625ed794b8" + ], + "unzipped_archive": [ + [ + { + "id": "hello" + }, + [ + + ] + ] + ], + "versions": [ + "versions.yml:md5,52c55ce814e8bc9edc5a6c625ed794b8" + 
] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-30T19:16:37.11550986" + }, + "generic [tar] [tar_gz]": { + "content": [ + { + "0": [ + [ + { + "id": "hello" + }, + [ + "hello.tar:md5,80c66db79a773bc87b3346035ff9593e" + ] + ] + ], + "1": [ + "versions.yml:md5,52c55ce814e8bc9edc5a6c625ed794b8" + ], + "unzipped_archive": [ + [ + { + "id": "hello" + }, + [ + "hello.tar:md5,80c66db79a773bc87b3346035ff9593e" + ] + ] + ], + "versions": [ + "versions.yml:md5,52c55ce814e8bc9edc5a6c625ed794b8" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-06-30T19:16:25.120242571" + } +} \ No newline at end of file diff --git a/modules/nf-core/unzip/tests/tags.yml b/modules/nf-core/unzip/tests/tags.yml new file mode 100644 index 0000000000..7f5647e120 --- /dev/null +++ b/modules/nf-core/unzip/tests/tags.yml @@ -0,0 +1,2 @@ +unzip: + - "modules/nf-core/unzip/**" diff --git a/modules/nf-core/vcftools/environment.yml b/modules/nf-core/vcftools/environment.yml index 503449e833..7dcc752b86 100644 --- a/modules/nf-core/vcftools/environment.yml +++ b/modules/nf-core/vcftools/environment.yml @@ -1,7 +1,5 @@ -name: vcftools channels: - conda-forge - bioconda - - defaults dependencies: - bioconda::vcftools=0.1.16 diff --git a/modules/nf-core/vcftools/main.nf b/modules/nf-core/vcftools/main.nf index 475ef58f7a..24e0fc3b0e 100644 --- a/modules/nf-core/vcftools/main.nf +++ b/modules/nf-core/vcftools/main.nf @@ -126,7 +126,71 @@ process VCFTOOLS { """ stub: + def prefix = task.ext.prefix ?: "${meta.id}" """ + touch ${prefix}.vcf + touch ${prefix}.bcf + touch ${prefix}.frq + touch ${prefix}.frq.count + touch ${prefix}.idepth + touch ${prefix}.ldepth + touch ${prefix}.ldepth.mean + touch ${prefix}.gdepth + touch ${prefix}.hap.ld + touch ${prefix}.geno.ld + touch ${prefix}.geno.chisq + touch ${prefix}.list.hap.ld + touch ${prefix}.list.geno.ld + touch ${prefix}.interchrom.hap.ld + touch 
${prefix}.interchrom.geno.ld + touch ${prefix}.TsTv + touch ${prefix}.TsTv.summary + touch ${prefix}.TsTv.count + touch ${prefix}.TsTv.qual + touch ${prefix}.FILTER.summary + touch ${prefix}.sites.pi + touch ${prefix}.windowed.pi + touch ${prefix}.weir.fst + touch ${prefix}.het + touch ${prefix}.hwe + touch ${prefix}.Tajima.D + touch ${prefix}.ifreqburden + touch ${prefix}.LROH + touch ${prefix}.relatedness + touch ${prefix}.relatedness2 + touch ${prefix}.lqual + touch ${prefix}.imiss + touch ${prefix}.lmiss + touch ${prefix}.snpden + touch ${prefix}.kept.sites + touch ${prefix}.removed.sites + touch ${prefix}.singletons + touch ${prefix}.indel.hist + touch ${prefix}.hapcount + touch ${prefix}.mendel + touch ${prefix}.FORMAT + touch ${prefix}.INFO + touch ${prefix}.012 + touch ${prefix}.012.indv + touch ${prefix}.012.pos + touch ${prefix}.impute.hap + touch ${prefix}.impute.hap.legend + touch ${prefix}.impute.hap.indv + touch ${prefix}.ldhat.sites + touch ${prefix}.ldhat.locs + touch ${prefix}.BEAGLE.GL + touch ${prefix}.BEAGLE.PL + touch ${prefix}.ped + touch ${prefix}.map + touch ${prefix}.tped + touch ${prefix}.tfam + touch ${prefix}.diff.sites_in_files + touch ${prefix}.diff.indv_in_files + touch ${prefix}.diff.sites + touch ${prefix}.diff.indv + touch ${prefix}.diff.discordance.matrix + touch ${prefix}.diff.switch + cat <<-END_VERSIONS > versions.yml "${task.process}": vcftools: \$(echo \$(vcftools --version 2>&1) | sed 's/^.*VCFtools (//;s/).*//') diff --git a/modules/nf-core/vcftools/meta.yml b/modules/nf-core/vcftools/meta.yml index 09ad5908ab..b4c564ecaf 100644 --- a/modules/nf-core/vcftools/meta.yml +++ b/modules/nf-core/vcftools/meta.yml @@ -6,287 +6,681 @@ keywords: - sort tools: - vcftools: - description: A set of tools written in Perl and C++ for working with VCF files. 
This package only contains the C++ libraries whereas the package perl-vcftools-vcf contains the perl libraries + description: A set of tools written in Perl and C++ for working with VCF files. + This package only contains the C++ libraries whereas the package perl-vcftools-vcf + contains the perl libraries homepage: http://vcftools.sourceforge.net/ documentation: http://vcftools.sourceforge.net/man_latest.html licence: ["LGPL"] + identifier: biotools:vcftools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - variant_file: - type: file - description: variant input file which can be vcf, vcf.gz, or bcf format. - - bed: - type: file - description: bed file which can be used with different arguments in vcftools (optional) - - diff_variant_file: - type: file - description: secondary variant file which can be used with the 'diff' suite of tools (optional) + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - variant_file: + type: file + description: variant input file which can be vcf, vcf.gz, or bcf format. + - - bed: + type: file + description: bed file which can be used with different arguments in vcftools + (optional) + - - diff_variant_file: + type: file + description: secondary variant file which can be used with the 'diff' suite + of tools (optional) output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - vcf: - type: file - description: vcf file (optional) - pattern: "*.vcf" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.vcf": + type: file + description: vcf file (optional) + pattern: "*.vcf" - bcf: - type: file - description: bcf file (optional) - pattern: "*.bcf" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bcf": + type: file + description: bcf file (optional) + pattern: "*.bcf" - frq: - type: file - description: Allele frequency for each site (optional) - pattern: "*.frq" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.frq": + type: file + description: Allele frequency for each site (optional) + pattern: "*.frq" - frq_count: - type: file - description: Allele counts for each site (optional) - pattern: "*.frq.count" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.frq.count": + type: file + description: Allele counts for each site (optional) + pattern: "*.frq.count" - idepth: - type: file - description: mean depth per individual (optional) - pattern: "*.idepth" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.idepth": + type: file + description: mean depth per individual (optional) + pattern: "*.idepth" - ldepth: - type: file - description: depth per site summed across individuals (optional) - pattern: "*.ildepth" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.ldepth": + type: file + description: depth per site summed across individuals (optional) + pattern: "*.ildepth" - ldepth_mean: - type: file - description: mean depth per site calculated across individuals (optional) - pattern: "*.ldepth.mean" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.ldepth.mean": + type: file + description: mean depth per site calculated across individuals (optional) + pattern: "*.ldepth.mean" - gdepth: - type: file - description: depth for each genotype in vcf file (optional) - pattern: "*.gdepth" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.gdepth": + type: file + description: depth for each genotype in vcf file (optional) + pattern: "*.gdepth" - hap_ld: - type: file - description: r2, D, and D’ statistics using phased haplotypes (optional) - pattern: "*.hap.ld" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.hap.ld": + type: file + description: r2, D, and D’ statistics using phased haplotypes (optional) + pattern: "*.hap.ld" - geno_ld: - type: file - description: squared correlation coefficient between genotypes encoded as 0, 1 and 2 to represent the number of non-reference alleles in each individual (optional) - pattern: "*.geno.ld" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.geno.ld": + type: file + description: squared correlation coefficient between genotypes encoded as 0, + 1 and 2 to represent the number of non-reference alleles in each individual + (optional) + pattern: "*.geno.ld" - geno_chisq: - type: file - description: test for genotype independence via the chi-squared statistic (optional) - pattern: "*.geno.chisq" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.geno.chisq": + type: file + description: test for genotype independence via the chi-squared statistic (optional) + pattern: "*.geno.chisq" - list_hap_ld: - type: file - description: r2 statistics of the sites contained in the provided input file verses all other sites (optional) - pattern: "*.list.hap.ld" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.list.hap.ld": + type: file + description: r2 statistics of the sites contained in the provided input file + verses all other sites (optional) + pattern: "*.list.hap.ld" - list_geno_ld: - type: file - description: r2 statistics of the sites contained in the provided input file verses all other sites (optional) - pattern: "*.list.geno.ld" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.list.geno.ld": + type: file + description: r2 statistics of the sites contained in the provided input file + verses all other sites (optional) + pattern: "*.list.geno.ld" - interchrom_hap_ld: - type: file - description: r2 statistics for sites (haplotypes) on different chromosomes (optional) - pattern: "*.interchrom.hap.ld" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.interchrom.hap.ld": + type: file + description: r2 statistics for sites (haplotypes) on different chromosomes (optional) + pattern: "*.interchrom.hap.ld" - interchrom_geno_ld: - type: file - description: r2 statistics for sites (genotypes) on different chromosomes (optional) - pattern: "*.interchrom.geno.ld" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.interchrom.geno.ld": + type: file + description: r2 statistics for sites (genotypes) on different chromosomes (optional) + pattern: "*.interchrom.geno.ld" - tstv: - type: file - description: Transition / Transversion ratio in bins of size defined in options (optional) - pattern: "*.TsTv" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.TsTv": + type: file + description: Transition / Transversion ratio in bins of size defined in options + (optional) + pattern: "*.TsTv" - tstv_summary: - type: file - description: Summary of all Transitions and Transversions (optional) - pattern: "*.TsTv.summary" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.TsTv.summary": + type: file + description: Summary of all Transitions and Transversions (optional) + pattern: "*.TsTv.summary" - tstv_count: - type: file - description: Transition / Transversion ratio as a function of alternative allele count (optional) - pattern: "*.TsTv.count" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.TsTv.count": + type: file + description: Transition / Transversion ratio as a function of alternative allele + count (optional) + pattern: "*.TsTv.count" - tstv_qual: - type: file - description: Transition / Transversion ratio as a function of SNP quality threshold (optional) - pattern: "*.TsTv.qual" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.TsTv.qual": + type: file + description: Transition / Transversion ratio as a function of SNP quality threshold + (optional) + pattern: "*.TsTv.qual" - filter_summary: - type: file - description: Summary of the number of SNPs and Ts/Tv ratio for each FILTER category (optional) - pattern: "*.FILTER.summary" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.FILTER.summary": + type: file + description: Summary of the number of SNPs and Ts/Tv ratio for each FILTER category + (optional) + pattern: "*.FILTER.summary" - sites_pi: - type: file - description: Nucleotide divergency on a per-site basis (optional) - pattern: "*.sites.pi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.sites.pi": + type: file + description: Nucleotide divergency on a per-site basis (optional) + pattern: "*.sites.pi" - windowed_pi: - type: file - description: Nucleotide diversity in windows, with window size determined by options (optional) - pattern: "*windowed.pi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.windowed.pi": + type: file + description: Nucleotide diversity in windows, with window size determined by + options (optional) + pattern: "*windowed.pi" - weir_fst: - type: file - description: Fst estimate from Weir and Cockerham’s 1984 paper (optional) - pattern: "*.weir.fst" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.weir.fst": + type: file + description: Fst estimate from Weir and Cockerham’s 1984 paper (optional) + pattern: "*.weir.fst" - heterozygosity: - type: file - description: Heterozygosity on a per-individual basis (optional) - pattern: "*.het" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.het": + type: file + description: Heterozygosity on a per-individual basis (optional) + pattern: "*.het" - hwe: - type: file - description: Contains the Observed numbers of Homozygotes and Heterozygotes and the corresponding Expected numbers under HWE (optional) - pattern: "*.hwe" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.hwe": + type: file + description: Contains the Observed numbers of Homozygotes and Heterozygotes + and the corresponding Expected numbers under HWE (optional) + pattern: "*.hwe" - tajima_d: - type: file - description: Tajima’s D statistic in bins with size of the specified number in options (optional) - pattern: "*.Tajima.D" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.Tajima.D": + type: file + description: Tajima’s D statistic in bins with size of the specified number + in options (optional) + pattern: "*.Tajima.D" - freq_burden: - type: file - description: Number of variants within each individual of a specific frequency in options (optional) - pattern: "*.ifreqburden" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.ifreqburden": + type: file + description: Number of variants within each individual of a specific frequency + in options (optional) + pattern: "*.ifreqburden" - lroh: - type: file - description: Long Runs of Homozygosity (optional) - pattern: "*.LROH" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.LROH": + type: file + description: Long Runs of Homozygosity (optional) + pattern: "*.LROH" - relatedness: - type: file - description: Relatedness statistic based on the method of Yang et al, Nature Genetics 2010 (doi:10.1038/ng.608) (optional) - pattern: "*.relatedness" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.relatedness": + type: file + description: Relatedness statistic based on the method of Yang et al, Nature + Genetics 2010 (doi:10.1038/ng.608) (optional) + pattern: "*.relatedness" - relatedness2: - type: file - description: Relatedness statistic based on the method of Manichaikul et al., BIOINFORMATICS 2010 (doi:10.1093/bioinformatics/btq559) (optional) - pattern: "*.relatedness2" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.relatedness2": + type: file + description: Relatedness statistic based on the method of Manichaikul et al., + BIOINFORMATICS 2010 (doi:10.1093/bioinformatics/btq559) (optional) + pattern: "*.relatedness2" - lqual: - type: file - description: per-site SNP quality (optional) - pattern: "*.lqual" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.lqual": + type: file + description: per-site SNP quality (optional) + pattern: "*.lqual" - missing_individual: - type: file - description: Missingness on a per-individual basis (optional) - pattern: "*.imiss" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.imiss": + type: file + description: Missingness on a per-individual basis (optional) + pattern: "*.imiss" - missing_site: - type: file - description: Missingness on a per-site basis (optional) - pattern: "*.lmiss" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.lmiss": + type: file + description: Missingness on a per-site basis (optional) + pattern: "*.lmiss" - snp_density: - type: file - description: Number and density of SNPs in bins of size defined by option (optional) - pattern: "*.snpden" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.snpden": + type: file + description: Number and density of SNPs in bins of size defined by option (optional) + pattern: "*.snpden" - kept_sites: - type: file - description: All sites that have been kept after filtering (optional) - pattern: "*.kept.sites" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.kept.sites": + type: file + description: All sites that have been kept after filtering (optional) + pattern: "*.kept.sites" - removed_sites: - type: file - description: All sites that have been removed after filtering (optional) - pattern: "*.removed.sites" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.removed.sites": + type: file + description: All sites that have been removed after filtering (optional) + pattern: "*.removed.sites" - singeltons: - type: file - description: Location of singletons, and the individual they occur in (optional) - pattern: "*.singeltons" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.singletons": + type: file + description: Location of singletons, and the individual they occur in (optional) + pattern: "*.singeltons" - indel_hist: - type: file - description: Histogram file of the length of all indels (including SNPs) (optional) - pattern: "*.indel_hist" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.indel.hist": + type: file + description: Histogram file of the length of all indels (including SNPs) (optional) + pattern: "*.indel_hist" - hapcount: - type: file - description: Unique haplotypes within user specified bins (optional) - pattern: "*.hapcount" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.hapcount": + type: file + description: Unique haplotypes within user specified bins (optional) + pattern: "*.hapcount" - mendel: - type: file - description: Mendel errors identified in trios (optional) - pattern: "*.mendel" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.mendel": + type: file + description: Mendel errors identified in trios (optional) + pattern: "*.mendel" - format: - type: file - description: Extracted information from the genotype fields in the VCF file relating to a specfied FORMAT identifier (optional) - pattern: "*.FORMAT" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.FORMAT": + type: file + description: Extracted information from the genotype fields in the VCF file + relating to a specfied FORMAT identifier (optional) + pattern: "*.FORMAT" - info: - type: file - description: Extracted information from the INFO field in the VCF file (optional) - pattern: "*.INFO" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.INFO": + type: file + description: Extracted information from the INFO field in the VCF file (optional) + pattern: "*.INFO" - genotypes_matrix: - type: file - description: | - Genotypes output as large matrix. - Genotypes of each individual on a separate line. - Genotypes are represented as 0, 1 and 2, where the number represent that number of non-reference alleles. - Missing genotypes are represented by -1 (optional) - pattern: "*.012" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.012": + type: file + description: | + Genotypes output as large matrix. + Genotypes of each individual on a separate line. + Genotypes are represented as 0, 1 and 2, where the number represent that number of non-reference alleles. + Missing genotypes are represented by -1 (optional) + pattern: "*.012" - genotypes_matrix_individual: - type: file - description: Details the individuals included in the main genotypes_matrix file (optional) - pattern: "*.012.indv" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.012.indv": + type: file + description: Details the individuals included in the main genotypes_matrix file + (optional) + pattern: "*.012.indv" - genotypes_matrix_position: - type: file - description: Details the site locations included in the main genotypes_matrix file (optional) - pattern: "*.012.pos" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.012.pos": + type: file + description: Details the site locations included in the main genotypes_matrix + file (optional) + pattern: "*.012.pos" - impute_hap: - type: file - description: Phased haplotypes in IMPUTE reference-panel format (optional) - pattern: "*.impute.hap" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.impute.hap": + type: file + description: Phased haplotypes in IMPUTE reference-panel format (optional) + pattern: "*.impute.hap" - impute_hap_legend: - type: file - description: Impute haplotype legend file (optional) - pattern: "*.impute.hap.legend" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.impute.hap.legend": + type: file + description: Impute haplotype legend file (optional) + pattern: "*.impute.hap.legend" - impute_hap_indv: - type: file - description: Impute haplotype individuals file (optional) - pattern: "*.impute.hap.indv" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.impute.hap.indv": + type: file + description: Impute haplotype individuals file (optional) + pattern: "*.impute.hap.indv" - ldhat_sites: - type: file - description: Output data in LDhat format, sites (optional) - pattern: "*.ldhat.sites" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.ldhat.sites": + type: file + description: Output data in LDhat format, sites (optional) + pattern: "*.ldhat.sites" - ldhat_locs: - type: file - description: output data in LDhat format, locations (optional) - pattern: "*.ldhat.locs" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.ldhat.locs": + type: file + description: output data in LDhat format, locations (optional) + pattern: "*.ldhat.locs" - beagle_gl: - type: file - description: Genotype likelihoods for biallelic sites (optional) - pattern: "*.BEAGLE.GL" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.BEAGLE.GL": + type: file + description: Genotype likelihoods for biallelic sites (optional) + pattern: "*.BEAGLE.GL" - beagle_pl: - type: file - description: Genotype likelihoods for biallelic sites (optional) - pattern: "*.BEAGLE.PL" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.BEAGLE.PL": + type: file + description: Genotype likelihoods for biallelic sites (optional) + pattern: "*.BEAGLE.PL" - ped: - type: file - description: output the genotype data in PLINK PED format (optional) - pattern: "*.ped" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.ped": + type: file + description: output the genotype data in PLINK PED format (optional) + pattern: "*.ped" - map_: - type: file - description: output the genotype data in PLINK PED format (optional) - pattern: "*.map" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.map": + type: file + description: output the genotype data in PLINK PED format (optional) + pattern: "*.map" - tped: - type: file - description: output the genotype data in PLINK PED format (optional) - pattern: "*.tped" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tped": + type: file + description: output the genotype data in PLINK PED format (optional) + pattern: "*.tped" - tfam: - type: file - description: output the genotype data in PLINK PED format (optional) - pattern: "*.tfam" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tfam": + type: file + description: output the genotype data in PLINK PED format (optional) + pattern: "*.tfam" - diff_sites_in_files: - type: file - description: Sites that are common / unique to each file specified in optional inputs (optional) - pattern: "*.diff.sites.in.files" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.diff.sites_in_files": + type: file + description: Sites that are common / unique to each file specified in optional + inputs (optional) + pattern: "*.diff.sites.in.files" - diff_indv_in_files: - type: file - description: Individuals that are common / unique to each file specified in optional inputs (optional) - pattern: "*.diff.indv.in.files" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.diff.indv_in_files": + type: file + description: Individuals that are common / unique to each file specified in + optional inputs (optional) + pattern: "*.diff.indv.in.files" - diff_sites: - type: file - description: Discordance on a site by site basis, specified in optional inputs (optional) - pattern: "*.diff.sites" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.diff.sites": + type: file + description: Discordance on a site by site basis, specified in optional inputs + (optional) + pattern: "*.diff.sites" - diff_indv: - type: file - description: Discordance on a individual by individual basis, specified in optional inputs (optional) - pattern: "*.diff.indv" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.diff.indv": + type: file + description: Discordance on a individual by individual basis, specified in optional + inputs (optional) + pattern: "*.diff.indv" - diff_discd_matrix: - type: file - description: Discordance matrix between files specified in optional inputs (optional) - pattern: "*.diff.discordance.matrix" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.diff.discordance.matrix": + type: file + description: Discordance matrix between files specified in optional inputs (optional) + pattern: "*.diff.discordance.matrix" - diff_switch_error: - type: file - description: Switch errors found between sites (optional) - pattern: "*.diff.switch" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.diff.switch": + type: file + description: Switch errors found between sites (optional) + pattern: "*.diff.switch" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@Mark-S-Hill" maintainers: diff --git a/modules/nf-core/vcftools/tests/main.nf.test.snap b/modules/nf-core/vcftools/tests/main.nf.test.snap index 77669aadc4..e17865541f 100644 --- a/modules/nf-core/vcftools/tests/main.nf.test.snap +++ b/modules/nf-core/vcftools/tests/main.nf.test.snap @@ -1599,389 +1599,1025 @@ "content": [ { "0": [ - + [ + { + "id": "test" + }, + "test.vcf:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "1": [ - + [ + { + "id": "test" + }, + "test.bcf:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "10": [ - + [ + { + "id": "test" + }, + "test.geno.chisq:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "11": [ - + [ + { + "id": "test" + }, + "test.list.hap.ld:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "12": [ - + [ + { + "id": "test" + }, + "test.list.geno.ld:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "13": [ - + [ + { + "id": "test" + }, + "test.interchrom.hap.ld:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "14": [ - + [ + { + "id": "test" + }, + "test.interchrom.geno.ld:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "15": [ - + [ + { + "id": "test" + }, + "test.TsTv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "16": [ - + [ + { + "id": "test" + }, + "test.TsTv.summary:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "17": [ - + [ + { + "id": "test" + }, + "test.TsTv.count:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "18": [ - + [ + { + "id": "test" + }, + "test.TsTv.qual:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "19": [ - + [ + { + "id": "test" + }, + "test.FILTER.summary:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "2": [ - + [ + { + "id": "test" + }, + "test.frq:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "20": [ - + [ + { + "id": "test" + }, + 
"test.sites.pi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "21": [ - + [ + { + "id": "test" + }, + "test.windowed.pi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "22": [ - + [ + { + "id": "test" + }, + "test.weir.fst:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "23": [ - + [ + { + "id": "test" + }, + "test.het:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "24": [ - + [ + { + "id": "test" + }, + "test.hwe:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "25": [ - + [ + { + "id": "test" + }, + "test.Tajima.D:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "26": [ - + [ + { + "id": "test" + }, + "test.ifreqburden:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "27": [ - + [ + { + "id": "test" + }, + "test.LROH:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "28": [ - + [ + { + "id": "test" + }, + "test.relatedness:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "29": [ - + [ + { + "id": "test" + }, + "test.relatedness2:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "3": [ - + [ + { + "id": "test" + }, + "test.frq.count:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "30": [ - + [ + { + "id": "test" + }, + "test.lqual:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "31": [ - + [ + { + "id": "test" + }, + "test.imiss:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "32": [ - + [ + { + "id": "test" + }, + "test.lmiss:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "33": [ - + [ + { + "id": "test" + }, + "test.snpden:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "34": [ - + [ + { + "id": "test" + }, + "test.kept.sites:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "35": [ - + [ + { + "id": "test" + }, + "test.removed.sites:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "36": [ - + [ + { + "id": "test" + }, + "test.singletons:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "37": [ - + [ + { + "id": "test" + }, + "test.indel.hist:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "38": [ - + [ + { + "id": "test" + }, + "test.hapcount:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "39": [ - + [ + { + "id": "test" + }, + 
"test.mendel:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "4": [ - + [ + { + "id": "test" + }, + "test.idepth:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "40": [ - + [ + { + "id": "test" + }, + "test.FORMAT:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "41": [ - + [ + { + "id": "test" + }, + "test.INFO:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "42": [ - + [ + { + "id": "test" + }, + "test.012:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "43": [ - + [ + { + "id": "test" + }, + "test.012.indv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "44": [ - + [ + { + "id": "test" + }, + "test.012.pos:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "45": [ - + [ + { + "id": "test" + }, + "test.impute.hap:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "46": [ - + [ + { + "id": "test" + }, + "test.impute.hap.legend:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "47": [ - + [ + { + "id": "test" + }, + "test.impute.hap.indv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "48": [ - + [ + { + "id": "test" + }, + "test.ldhat.sites:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "49": [ - + [ + { + "id": "test" + }, + "test.ldhat.locs:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "5": [ - + [ + { + "id": "test" + }, + "test.ldepth:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "50": [ - + [ + { + "id": "test" + }, + "test.BEAGLE.GL:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "51": [ - + [ + { + "id": "test" + }, + "test.BEAGLE.PL:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "52": [ - + [ + { + "id": "test" + }, + "test.ped:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "53": [ - + [ + { + "id": "test" + }, + "test.map:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "54": [ - + [ + { + "id": "test" + }, + "test.tped:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "55": [ - + [ + { + "id": "test" + }, + "test.tfam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "56": [ - + [ + { + "id": "test" + }, + "test.diff.sites_in_files:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "57": [ - + [ + { + "id": "test" + }, + 
"test.diff.indv_in_files:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "58": [ - + [ + { + "id": "test" + }, + "test.diff.sites:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "59": [ - + [ + { + "id": "test" + }, + "test.diff.indv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "6": [ - + [ + { + "id": "test" + }, + "test.ldepth.mean:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "60": [ - + [ + { + "id": "test" + }, + "test.diff.discordance.matrix:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "61": [ - + [ + { + "id": "test" + }, + "test.diff.switch:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "62": [ "versions.yml:md5,577abe71f1ed8b94c633e71dc2cfc491" ], "7": [ - + [ + { + "id": "test" + }, + "test.gdepth:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "8": [ - + [ + { + "id": "test" + }, + [ + "test.hap.ld:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.interchrom.hap.ld:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.list.hap.ld:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] ], "9": [ - + [ + { + "id": "test" + }, + [ + "test.geno.ld:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.interchrom.geno.ld:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.list.geno.ld:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] ], "bcf": [ - + [ + { + "id": "test" + }, + "test.bcf:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "beagle_gl": [ - + [ + { + "id": "test" + }, + "test.BEAGLE.GL:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "beagle_pl": [ - + [ + { + "id": "test" + }, + "test.BEAGLE.PL:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "diff_discd_matrix": [ - + [ + { + "id": "test" + }, + "test.diff.discordance.matrix:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "diff_indv": [ - + [ + { + "id": "test" + }, + "test.diff.indv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "diff_indv_in_files": [ - + [ + { + "id": "test" + }, + "test.diff.indv_in_files:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "diff_sites": [ - + [ + { + "id": "test" + }, + "test.diff.sites:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], 
"diff_sites_in_files": [ - + [ + { + "id": "test" + }, + "test.diff.sites_in_files:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "diff_switch_error": [ - + [ + { + "id": "test" + }, + "test.diff.switch:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "filter_summary": [ - + [ + { + "id": "test" + }, + "test.FILTER.summary:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "format": [ - + [ + { + "id": "test" + }, + "test.FORMAT:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "freq_burden": [ - + [ + { + "id": "test" + }, + "test.ifreqburden:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "frq": [ - + [ + { + "id": "test" + }, + "test.frq:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "frq_count": [ - + [ + { + "id": "test" + }, + "test.frq.count:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "gdepth": [ - + [ + { + "id": "test" + }, + "test.gdepth:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "geno_chisq": [ - + [ + { + "id": "test" + }, + "test.geno.chisq:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "geno_ld": [ - + [ + { + "id": "test" + }, + [ + "test.geno.ld:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.interchrom.geno.ld:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.list.geno.ld:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] ], "genotypes_matrix": [ - + [ + { + "id": "test" + }, + "test.012:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "genotypes_matrix_individual": [ - + [ + { + "id": "test" + }, + "test.012.indv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "genotypes_matrix_position": [ - + [ + { + "id": "test" + }, + "test.012.pos:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "hap_ld": [ - + [ + { + "id": "test" + }, + [ + "test.hap.ld:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.interchrom.hap.ld:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.list.hap.ld:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] ], "hapcount": [ - + [ + { + "id": "test" + }, + "test.hapcount:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "heterozygosity": [ - + [ + { + "id": "test" + }, + 
"test.het:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "hwe": [ - + [ + { + "id": "test" + }, + "test.hwe:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "idepth": [ - + [ + { + "id": "test" + }, + "test.idepth:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "impute_hap": [ - + [ + { + "id": "test" + }, + "test.impute.hap:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "impute_hap_indv": [ - + [ + { + "id": "test" + }, + "test.impute.hap.indv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "impute_hap_legend": [ - + [ + { + "id": "test" + }, + "test.impute.hap.legend:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "indel_hist": [ - + [ + { + "id": "test" + }, + "test.indel.hist:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "info": [ - + [ + { + "id": "test" + }, + "test.INFO:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "interchrom_geno_ld": [ - + [ + { + "id": "test" + }, + "test.interchrom.geno.ld:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "interchrom_hap_ld": [ - + [ + { + "id": "test" + }, + "test.interchrom.hap.ld:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "kept_sites": [ - + [ + { + "id": "test" + }, + "test.kept.sites:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "ldepth": [ - + [ + { + "id": "test" + }, + "test.ldepth:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "ldepth_mean": [ - + [ + { + "id": "test" + }, + "test.ldepth.mean:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "ldhat_locs": [ - + [ + { + "id": "test" + }, + "test.ldhat.locs:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "ldhat_sites": [ - + [ + { + "id": "test" + }, + "test.ldhat.sites:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "list_geno_ld": [ - + [ + { + "id": "test" + }, + "test.list.geno.ld:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "list_hap_ld": [ - + [ + { + "id": "test" + }, + "test.list.hap.ld:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "lqual": [ - + [ + { + "id": "test" + }, + "test.lqual:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "lroh": [ - + [ + { + "id": "test" + }, + 
"test.LROH:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "map_": [ - + [ + { + "id": "test" + }, + "test.map:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "mendel": [ - + [ + { + "id": "test" + }, + "test.mendel:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "missing_individual": [ - + [ + { + "id": "test" + }, + "test.imiss:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "missing_site": [ - + [ + { + "id": "test" + }, + "test.lmiss:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "ped": [ - + [ + { + "id": "test" + }, + "test.ped:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "relatedness": [ - + [ + { + "id": "test" + }, + "test.relatedness:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "relatedness2": [ - + [ + { + "id": "test" + }, + "test.relatedness2:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "removed_sites": [ - + [ + { + "id": "test" + }, + "test.removed.sites:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "singeltons": [ - + [ + { + "id": "test" + }, + "test.singletons:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "sites_pi": [ - + [ + { + "id": "test" + }, + "test.sites.pi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "snp_density": [ - + [ + { + "id": "test" + }, + "test.snpden:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "tajima_d": [ - + [ + { + "id": "test" + }, + "test.Tajima.D:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "tfam": [ - + [ + { + "id": "test" + }, + "test.tfam:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "tped": [ - + [ + { + "id": "test" + }, + "test.tped:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "tstv": [ - + [ + { + "id": "test" + }, + "test.TsTv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "tstv_count": [ - + [ + { + "id": "test" + }, + "test.TsTv.count:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "tstv_qual": [ - + [ + { + "id": "test" + }, + "test.TsTv.qual:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "tstv_summary": [ - + [ + { + "id": "test" + }, + "test.TsTv.summary:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "vcf": [ - + [ + { + "id": 
"test" + }, + "test.vcf:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "versions": [ "versions.yml:md5,577abe71f1ed8b94c633e71dc2cfc491" ], "weir_fst": [ - + [ + { + "id": "test" + }, + "test.weir.fst:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ], "windowed_pi": [ - + [ + { + "id": "test" + }, + "test.windowed.pi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] ] } ], "meta": { "nf-test": "0.8.4", - "nextflow": "23.10.0" + "nextflow": "24.04.2" }, - "timestamp": "2024-03-22T13:07:31.866838" + "timestamp": "2024-06-24T13:34:42.814188" } } \ No newline at end of file diff --git a/nextflow.config b/nextflow.config index f9fa756dbb..1268aab63a 100644 --- a/nextflow.config +++ b/nextflow.config @@ -10,6 +10,7 @@ params { // Workflow flags: // Mandatory arguments + // Input options input = null // No default input input_restart = null // No default automatic input step = 'mapping' // Starts with mapping @@ -38,6 +39,7 @@ params { three_prime_clip_r1 = 0 three_prime_clip_r2 = 0 trim_nextseq = 0 + length_required = 15 // Default in FastP save_trimmed = false save_split_fastqs = false @@ -115,6 +117,8 @@ params { monochrome_logs = false hook_url = null help = false + help_full = false + show_hidden = false version = false pipelines_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/' @@ -128,37 +132,13 @@ params { test_data_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/sarek3' modules_testdata_base_path = null - // Max resource options - // Defaults only, expecting to be overwritten - max_memory = '128.GB' - max_cpus = 16 - max_time = '240.h' - // Schema validation default options - validationFailUnrecognisedParams = false - validationLenientMode = true - validationSchemaIgnoreParams = 'cf_ploidy,snpeff_db,vep_cache_version,genomes,igenomes_base' - validationShowHiddenParams = false - validate_params = true + validate_params = true } // Load base.config by default for all pipelines includeConfig 'conf/base.config' -// Load nf-core custom 
profiles from different Institutions -try { - includeConfig "${params.custom_config_base}/nfcore_custom.config" -} catch (Exception e) { - System.err.println("WARNING: Could not load nf-core/config profiles: ${params.custom_config_base}/nfcore_custom.config") -} - -// Load nf-core/sarek custom profiles from different institutions. -try { - includeConfig "${params.custom_config_base}/pipeline/sarek.config" -} catch (Exception e) { - System.err.println("WARNING: Could not load nf-core/config/sarek profiles: ${params.custom_config_base}/pipeline/sarek.config") -} - profiles { debug { dumpHashes = true @@ -173,7 +153,7 @@ profiles { podman.enabled = false shifter.enabled = false charliecloud.enabled = false - conda.channels = ['conda-forge', 'bioconda', 'defaults'] + conda.channels = ['conda-forge', 'bioconda'] apptainer.enabled = false } mamba { @@ -316,6 +296,7 @@ profiles { targeted { includeConfig 'conf/test/targeted.config' } tools { includeConfig 'conf/test/tools.config' } tools_germline { includeConfig 'conf/test/tools_germline.config' } + tools_germline_deepvariant { includeConfig 'conf/test/tools_germline_deepvariant.config' } tools_somatic { includeConfig 'conf/test/tools_somatic.config' } tools_somatic_ascat { includeConfig 'conf/test/tools_somatic_ascat.config' } tools_tumoronly { includeConfig 'conf/test/tools_tumoronly.config' } @@ -325,26 +306,23 @@ profiles { variantcalling_channels { includeConfig 'conf/test/variantcalling_channels.config' } } -// Set default registry for Apptainer, Docker, Podman and Singularity independent of -profile -// Will not be used unless Apptainer / Docker / Podman / Singularity are enabled -// Set to your registry if you have a mirror of containers -apptainer.registry = 'quay.io' -docker.registry = 'quay.io' -podman.registry = 'quay.io' -singularity.registry = 'quay.io' +// Load nf-core custom profiles from different Institutions +includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? 
"${params.custom_config_base}/nfcore_custom.config" : "/dev/null" -// Nextflow plugins -plugins { - id 'nf-validation@1.1.3' // Validation of pipeline parameters and creation of an input channel from a sample sheet - id 'nf-prov@1.2.2' // Provenance reports for pipeline runs -} +// Load nf-core/sarek custom profiles from different institutions. +includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? "${params.custom_config_base}/pipeline/sarek.config" : "/dev/null" + +// Set default registry for Apptainer, Docker, Podman, Charliecloud and Singularity independent of -profile +// Will not be used unless Apptainer / Docker / Podman / Charliecloud / Singularity are enabled +// Set to your registry if you have a mirror of containers +apptainer.registry = 'quay.io' +docker.registry = 'quay.io' +podman.registry = 'quay.io' +singularity.registry = 'quay.io' +charliecloud.registry = 'quay.io' // Load igenomes.config if required -if (!params.igenomes_ignore) { - includeConfig 'conf/igenomes.config' -} else { - params.genomes = [:] -} +includeConfig !params.igenomes_ignore ? 'conf/igenomes.config' : 'conf/igenomes_ignored.config' // Export these variables to prevent local Python/R libraries from conflicting with those in the container // The JULIA depot path has been adjusted to a fixed path `/usr/local/share/julia` that needs to be used for packages in the container. @@ -357,8 +335,15 @@ env { JULIA_DEPOT_PATH = "/usr/local/share/julia" } -// Capture exit codes from upstream processes when piping -process.shell = ['/bin/bash', '-euo', 'pipefail'] +// Set bash options +process.shell = """\ +bash + +set -e # Exit if a tool returns a non-zero status/exit code +set -u # Treat unset variables and parameters as an error +set -o pipefail # Returns the status of the last command to exit with a non-zero status or zero if all successfully execute +set -C # No clobber - prevent output redirection from overwriting files. 
+""" // Disable process selector warnings by default. Use debug profile to enable warnings. nextflow.enable.configProcessNamesValidation = false @@ -395,11 +380,56 @@ manifest { homePage = 'https://github.com/nf-core/sarek' description = """An open-source analysis pipeline to detect germline or somatic variants from whole genome or targeted sequencing""" mainScript = 'main.nf' - nextflowVersion = '!>=23.04.0' - version = '3.4.4' + nextflowVersion = '!>=24.04.2' + version = '3.5.0' doi = '10.12688/f1000research.16665.2, 10.1093/nargab/lqae031, 10.5281/zenodo.3476425' } +// Nextflow plugins +plugins { + id 'nf-schema@2.1.1' // Validation of pipeline parameters and creation of an input channel from a sample sheet + id 'nf-prov@1.2.2' // Provenance reports for pipeline runs +} + +validation { + defaultIgnoreParams = ["genomes"] + lenientMode = true + help { + enabled = true + command = "nextflow run $manifest.name -profile --input samplesheet.csv --outdir " + fullParameter = "help_full" + showHiddenParameter = "show_hidden" + beforeText = """ +-\033[2m----------------------------------------------------\033[0m- + \033[0;32m,--.\033[0;30m/\033[0;32m,-.\033[0m +\033[0;34m ___ __ __ __ ___ \033[0;32m/,-._.--~\'\033[0m +\033[0;34m |\\ | |__ __ / ` / \\ |__) |__ \033[0;33m} {\033[0m +\033[0;34m | \\| | \\__, \\__/ | \\ |___ \033[0;32m\\`-._,-`-,\033[0m + \033[0;32m`._,._,\'\033[0m +\033[0;37m ____\033[0m +\033[0;37m .´ _ `.\033[0m +\033[0;37m / \033[0;32m|\\\033[0m`-_ \\\033[0m \033[0;34m __ __ ___ \033[0m +\033[0;37m | \033[0;32m| \\\033[0m `-|\033[0m \033[0;34m|__` /\\ |__) |__ |__/\033[0m +\033[0;37m \\ \033[0;32m| \\\033[0m /\033[0m \033[0;34m.__| /¯¯\\ | \\ |___ | \\\033[0m +\033[0;37m `\033[0;32m|\033[0m____\033[0;32m\\\033[0m´\033[0m + +\033[0;35m ${manifest.name} ${manifest.version}\033[0m +-\033[2m----------------------------------------------------\033[0m- +""" + afterText = """${manifest.doi ? 
"\n* The pipeline\n" : ""}${manifest.doi.tokenize(",").collect { " https://doi.org/${it.trim().replace('https://doi.org/','')}"}.join("\n")}${manifest.doi ? "\n" : ""} +* The nf-core framework + https://doi.org/10.1038/s41587-020-0439-x + +* Software dependencies + https://github.com/${manifest.name}/blob/master/CITATIONS.md +""" + } + summary { + beforeText = validation.help.beforeText + afterText = validation.help.afterText + } +} + // Load modules.config for DSL2 module specific options includeConfig 'conf/modules/modules.config' @@ -430,6 +460,7 @@ includeConfig 'conf/modules/controlfreec.config' includeConfig 'conf/modules/deepvariant.config' includeConfig 'conf/modules/freebayes.config' includeConfig 'conf/modules/haplotypecaller.config' +includeConfig 'conf/modules/indexcov.config' includeConfig 'conf/modules/joint_germline.config' includeConfig 'conf/modules/manta.config' includeConfig 'conf/modules/mpileup.config' @@ -442,39 +473,8 @@ includeConfig 'conf/modules/sentieon_haplotyper_joint_germline.config' includeConfig 'conf/modules/strelka.config' includeConfig 'conf/modules/tiddit.config' includeConfig 'conf/modules/post_variant_calling.config' +includeConfig 'conf/modules/lofreq.config' //annotate includeConfig 'conf/modules/annotate.config' -// Function to ensure that resource requirements don't go beyond -// a maximum limit -def check_max(obj, type) { - if (type == 'memory') { - try { - if (obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1) - return params.max_memory as nextflow.util.MemoryUnit - else - return obj - } catch (all) { - println " ### ERROR ### Max memory '${params.max_memory}' is not valid! Using default value: $obj" - return obj - } - } else if (type == 'time') { - try { - if (obj.compareTo(params.max_time as nextflow.util.Duration) == 1) - return params.max_time as nextflow.util.Duration - else - return obj - } catch (all) { - println " ### ERROR ### Max time '${params.max_time}' is not valid! 
Using default value: $obj" - return obj - } - } else if (type == 'cpus') { - try { - return Math.min( obj, params.max_cpus as int ) - } catch (all) { - println " ### ERROR ### Max cpus '${params.max_cpus}' is not valid! Using default value: $obj" - return obj - } - } -} diff --git a/nextflow_schema.json b/nextflow_schema.json index 1611d58f40..5cdf35d555 100644 --- a/nextflow_schema.json +++ b/nextflow_schema.json @@ -1,10 +1,10 @@ { - "$schema": "http://json-schema.org/draft-07/schema", + "$schema": "https://json-schema.org/draft/2020-12/schema", "$id": "https://raw.githubusercontent.com/nf-core/sarek/master/nextflow_schema.json", "title": "nf-core/sarek pipeline parameters", "description": "An open-source analysis pipeline to detect germline or somatic variants from whole genome or targeted sequencing", "type": "object", - "definitions": { + "$defs": { "input_output_options": { "title": "Input/output options", "type": "object", @@ -82,18 +82,6 @@ "description": "Specify how many reads each split of a FastQ file contains. Set 0 to turn off splitting at all.", "help_text": "Use the the tool FastP to split FASTQ file by number of reads. This parallelizes across fastq file shards speeding up mapping. Note although the minimum value is 250 reads, if you have fewer than 250 reads a single FASTQ shard will still be created." }, - "wes": { - "type": "boolean", - "fa_icon": "fas fa-dna", - "description": "Enable when exome or panel data is provided.", - "help_text": "With this parameter flags in various tools are set for targeted sequencing data. It is recommended to enable for whole-exome and panel data analysis." 
- }, - "intervals": { - "type": "string", - "fa_icon": "fas fa-file-alt", - "help_text": "To speed up preprocessing and variant calling processes, the execution is parallelized across a reference chopped into smaller pieces.\n\nParts of preprocessing and variant calling are done by these intervals, the different resulting files are then merged.\nThis can parallelize processes, and push down wall clock time significantly.\n\nWe are aligning to the whole genome, and then run Base Quality Score Recalibration and Variant Calling on the supplied regions.\n\n**Whole Genome Sequencing:**\n\nThe (provided) intervals are chromosomes cut at their centromeres (so each chromosome arm processed separately) also additional unassigned contigs.\n\nWe are ignoring the `hs37d5` contig that contains concatenated decoy sequences.\n\nThe calling intervals can be defined using a .list or a BED file.\nA .list file contains one interval per line in the format `chromosome:start-end` (1-based coordinates).\nA BED file must be a tab-separated text file with one interval per line.\nThere must be at least three columns: chromosome, start, and end (0-based coordinates).\nAdditionally, the score column of the BED file can be used to provide an estimate of how many seconds it will take to call variants on that interval.\nThe fourth column remains unused.\n\n```\n|chr1|10000|207666|NA|47.3|\n```\nThis indicates that variant calling on the interval chr1:10001-207666 takes approximately 47.3 seconds.\n\nThe runtime estimate is used in two different ways.\nFirst, when there are multiple consecutive intervals in the file that take little time to compute, they are processed as a single job, thus reducing the number of processes that needs to be spawned.\nSecond, the jobs with largest processing time are started first, which reduces wall-clock time.\nIf no runtime is given, a time of 200000 nucleotides per second is assumed. 
See `--nucleotides_per_second` on how to customize this.\nActual figures vary from 2 nucleotides/second to 30000 nucleotides/second.\nIf you prefer, you can specify the full path to your reference genome when you run the pipeline:\n\n> **NB** If none provided, will be generated automatically from the FASTA reference\n> **NB** Use --no_intervals to disable automatic generation.\n\n**Targeted Sequencing:**\n\nThe recommended flow for targeted sequencing data is to use the workflow as it is, but also provide a `BED` file containing targets for all steps using the `--intervals` option. In addition, the parameter `--wes` should be set.\nIt is advised to pad the variant calling regions (exons or target) to some extent before submitting to the workflow.\n\nThe procedure is similar to whole genome sequencing, except that only BED file are accepted. See above for formatting description.\nAdding every exon as an interval in case of `WES` can generate >200K processes or jobs, much more forks, and similar number of directories in the Nextflow work directory. These are appropriately grouped together to reduce number of processes run in parallel (see above and `--nucleotides_per_second` for details). \nFurthermore, primers and/or baits are not 100% specific, (certainly not for MHC and KIR, etc.), quite likely there going to be reads mapping to multiple locations.\nIf you are certain that the target is unique for your genome (all the reads will certainly map to only one location), and aligning to the whole genome is an overkill, it is actually better to change the reference itself.", - "description": "Path to target bed file in case of whole exome or targeted sequencing or intervals file." - }, "nucleotides_per_second": { "type": "integer", "fa_icon": "fas fa-clock", @@ -101,18 +89,30 @@ "help_text": "Intervals are parts of the chopped up genome used to speed up preprocessing and variant calling. See `--intervals` for more info. 
\n\nChanging this parameter, changes the number of intervals that are grouped and processed together. Bed files from target sequencing can contain thousands or small intervals. Spinning up a new process for each can be quite resource intensive. Instead it can be desired to process small intervals together on larger nodes. \nIn order to make use of this parameter, no runtime estimate can be present in the bed file (column 5). ", "default": 200000 }, + "intervals": { + "type": "string", + "fa_icon": "fas fa-file-alt", + "help_text": "To speed up preprocessing and variant calling processes, the execution is parallelized across a reference chopped into smaller pieces.\n\nParts of preprocessing and variant calling are done by these intervals, the different resulting files are then merged.\nThis can parallelize processes, and push down wall clock time significantly.\n\nWe are aligning to the whole genome, and then run Base Quality Score Recalibration and Variant Calling on the supplied regions.\n\n**Whole Genome Sequencing:**\n\nThe (provided) intervals are chromosomes cut at their centromeres (so each chromosome arm processed separately) also additional unassigned contigs.\n\nWe are ignoring the `hs37d5` contig that contains concatenated decoy sequences.\n\nThe calling intervals can be defined using a .list or a BED file.\nA .list file contains one interval per line in the format `chromosome:start-end` (1-based coordinates).\nA BED file must be a tab-separated text file with one interval per line.\nThere must be at least three columns: chromosome, start, and end (0-based coordinates).\nAdditionally, the score column of the BED file can be used to provide an estimate of how many seconds it will take to call variants on that interval.\nThe fourth column remains unused.\n\n```\n|chr1|10000|207666|NA|47.3|\n```\nThis indicates that variant calling on the interval chr1:10001-207666 takes approximately 47.3 seconds.\n\nThe runtime estimate is used in two different 
ways.\nFirst, when there are multiple consecutive intervals in the file that take little time to compute, they are processed as a single job, thus reducing the number of processes that needs to be spawned.\nSecond, the jobs with largest processing time are started first, which reduces wall-clock time.\nIf no runtime is given, a time of 200000 nucleotides per second is assumed. See `--nucleotides_per_second` on how to customize this.\nActual figures vary from 2 nucleotides/second to 30000 nucleotides/second.\nIf you prefer, you can specify the full path to your reference genome when you run the pipeline:\n\n> **NB** If none provided, will be generated automatically from the FASTA reference\n> **NB** Use --no_intervals to disable automatic generation.\n\n**Targeted Sequencing:**\n\nThe recommended flow for targeted sequencing data is to use the workflow as it is, but also provide a `BED` file containing targets for all steps using the `--intervals` option. In addition, the parameter `--wes` should be set.\nIt is advised to pad the variant calling regions (exons or target) to some extent before submitting to the workflow.\n\nThe procedure is similar to whole genome sequencing, except that only BED file are accepted. See above for formatting description.\nAdding every exon as an interval in case of `WES` can generate >200K processes or jobs, much more forks, and similar number of directories in the Nextflow work directory. These are appropriately grouped together to reduce number of processes run in parallel (see above and `--nucleotides_per_second` for details). 
\nFurthermore, primers and/or baits are not 100% specific, (certainly not for MHC and KIR, etc.), quite likely there going to be reads mapping to multiple locations.\nIf you are certain that the target is unique for your genome (all the reads will certainly map to only one location), and aligning to the whole genome is an overkill, it is actually better to change the reference itself.", + "description": "Path to target bed file in case of whole exome or targeted sequencing or intervals file." + }, "no_intervals": { "type": "boolean", "fa_icon": "fas fa-ban", "description": "Disable usage of intervals.", "help_text": "Intervals are parts of the chopped up genome used to speed up preprocessing and variant calling. See `--intervals` for more info. \n\nIf `--no_intervals` is set no intervals will be taken into account for speed up or data processing." }, + "wes": { + "type": "boolean", + "fa_icon": "fas fa-dna", + "description": "Enable when exome or panel data is provided.", + "help_text": "With this parameter flags in various tools are set for targeted sequencing data. It is recommended to enable for whole-exome and panel data analysis." 
+ }, "tools": { "type": "string", "fa_icon": "fas fa-toolbox", "description": "Tools to use for duplicate marking, variant calling and/or for annotation.", - "help_text": "Multiple tools separated with commas.\n\n**Variant Calling:**\n\nGermline variant calling can currently be performed with the following variant callers:\n- SNPs/Indels: DeepVariant, FreeBayes, GATK HaplotypeCaller, mpileup, Sentieon Haplotyper, Strelka\n- Structural Variants: Manta, TIDDIT\n- Copy-number: CNVKit\n\nTumor-only somatic variant calling can currently be performed with the following variant callers:\n- SNPs/Indels: FreeBayes, mpileup, Mutect2, Strelka\n- Structural Variants: Manta, TIDDIT\n- Copy-number: CNVKit, ControlFREEC\n\nSomatic variant calling can currently only be performed with the following variant callers:\n- SNPs/Indels: FreeBayes, Mutect2, Strelka\n- Structural variants: Manta, TIDDIT\n- Copy-Number: ASCAT, CNVKit, Control-FREEC\n- Microsatellite Instability: MSIsensorpro\n\n> **NB** Mutect2 for somatic variant calling cannot be combined with `--no_intervals`\n\n**Annotation:**\n \n- snpEff, VEP, merge (both consecutively), and bcftools annotate (needs `--bcftools_annotation`).\n\n> **NB** As Sarek will use bgzip and tabix to compress and index VCF files annotated, it expects VCF files to be sorted when starting from `--step annotate`.", - "pattern": "^((ascat|bcfann|cnvkit|controlfreec|deepvariant|freebayes|haplotypecaller|sentieon_dnascope|sentieon_haplotyper|manta|merge|mpileup|msisensorpro|mutect2|ngscheckmate|sentieon_dedup|snpeff|strelka|tiddit|vep)?,?)*(? 
**NB** Mutect2 for somatic variant calling cannot be combined with `--no_intervals`\n\n**Annotation:**\n \n- snpEff, VEP, merge (both consecutively), and bcftools annotate (needs `--bcftools_annotation`).\n\n> **NB** As Sarek will use bgzip and tabix to compress and index VCF files annotated, it expects VCF files to be sorted when starting from `--step annotate`.", + "pattern": "^((ascat|bcfann|cnvkit|controlfreec|deepvariant|freebayes|haplotypecaller|lofreq|sentieon_dnascope|sentieon_haplotyper|manta|indexcov|merge|mpileup|msisensorpro|mutect2|ngscheckmate|sentieon_dedup|snpeff|strelka|tiddit|vep)?,?)*(? **NB** PON file should be bgzipped." + }, + "pon_tbi": { + "type": "string", + "fa_icon": "fas fa-file", + "description": "Index of PON panel-of-normals VCF.", + "help_text": "If none provided, will be generated automatically from the PON bgzipped VCF file." }, "sentieon_haplotyper_emit_mode": { "type": "string", @@ -389,14 +379,12 @@ "description": "Option for selecting output and emit-mode of Sentieon's Haplotyper.", "fa_icon": "fas fa-toolbox", "help_text": "The option `--sentieon_haplotyper_emit_mode` can be set to the same string values as the Haplotyper's `--emit_mode`. To output both a vcf and a gvcf, specify both a vcf-option (currently, `all`, `confident` and `variant`) and `gvcf`. For example, to obtain a vcf and gvcf one could set `--sentieon_haplotyper_emit_mode` to `variant, gvcf`.", - "hidden": true, "pattern": "^(all|confident|gvcf|variant|gvcf,all|gvcf,confident|gvcf,variant|all,gvcf|confident,gvcf|variant,gvcf)(? **NB** If none provided, will be generated automatically from the FASTA reference. Combine with `--save_reference` to save for future runs.", - "hidden": true + "help_text": "If you wish to recompute indices available on igenomes, set `--bwa false`.\n\n> **NB** If none provided, will be generated automatically from the FASTA reference. 
Combine with `--save_reference` to save for future runs.\n\nIf you use AWS iGenomes, this has already been set for you appropriately." }, "bwamem2": { "type": "string", "fa_icon": "fas fa-copy", "description": "Path to bwa-mem2 mem indices.", - "help_text": "If you use AWS iGenomes, this has already been set for you appropriately.\n\nIf you wish to recompute indices available on igenomes, set `--bwamem2 false`.\n\n> **NB** If none provided, will be generated automatically from the FASTA reference, if `--aligner bwa-mem2` is specified. Combine with `--save_reference` to save for future runs.", - "hidden": true + "help_text": "If you use AWS iGenomes, this has already been set for you appropriately.\n\nIf you wish to recompute indices available on igenomes, set `--bwamem2 false`.\n\n> **NB** If none provided, will be generated automatically from the FASTA reference, if `--aligner bwa-mem2` is specified. Combine with `--save_reference` to save for future runs." }, "chr_dir": { "type": "string", "fa_icon": "fas fa-folder-open", "description": "Path to chromosomes folder used with ControLFREEC.", - "hidden": true, "help_text": "If you use AWS iGenomes, this has already been set for you appropriately." }, "dbsnp": { "type": "string", "fa_icon": "fas fa-file", "description": "Path to dbsnp file.", - "hidden": true, "help_text": "If you use AWS iGenomes, this has already been set for you appropriately." }, "dbsnp_tbi": { "type": "string", "fa_icon": "fas fa-file", "description": "Path to dbsnp index.", - "help_text": "If you use AWS iGenomes, this has already been set for you appropriately.\n\n> **NB** If none provided, will be generated automatically from the dbsnp file. Combine with `--save_reference` to save for future runs.", - "hidden": true + "help_text": "> **NB** If none provided, will be generated automatically from the dbsnp file. Combine with `--save_reference` to save for future runs.\n\nIf you use AWS iGenomes, this has already been set for you appropriately." 
}, "dbsnp_vqsr": { "type": "string", "fa_icon": "fas fa-copy", - "description": "label string for VariantRecalibration (haplotypecaller joint variant calling)" + "description": "Label string for VariantRecalibration (haplotypecaller joint variant calling).\n\nIf you use AWS iGenomes, this has already been set for you appropriately." }, "dict": { "type": "string", "fa_icon": "fas fa-file", "description": "Path to FASTA dictionary file.", - "help_text": "If you use AWS iGenomes, this has already been set for you appropriately.\n\n> **NB** If none provided, will be generated automatically from the FASTA reference. Combine with `--save_reference` to save for future runs.", - "hidden": true + "help_text": "> **NB** If none provided, will be generated automatically from the FASTA reference. Combine with `--save_reference` to save for future runs.\n\nIf you use AWS iGenomes, this has already been set for you appropriately." }, "dragmap": { "type": "string", "fa_icon": "fas fa-copy", "description": "Path to dragmap indices.", - "help_text": "If you use AWS iGenomes, this has already been set for you appropriately.\n\nIf you wish to recompute indices available on igenomes, set `--dragmap false`.\n\n> **NB** If none provided, will be generated automatically from the FASTA reference, if `--aligner dragmap` is specified. Combine with `--save_reference` to save for future runs.", - "hidden": true + "help_text": "If you wish to recompute indices available on igenomes, set `--dragmap false`.\n\n> **NB** If none provided, will be generated automatically from the FASTA reference, if `--aligner dragmap` is specified. Combine with `--save_reference` to save for future runs.\n\nIf you use AWS iGenomes, this has already been set for you appropriately." 
}, "fasta": { "type": "string", @@ -668,174 +673,148 @@ "mimetype": "text/plain", "pattern": "^\\S+\\.fn?a(sta)?(\\.gz)?$", "description": "Path to FASTA genome file.", - "help_text": "If you use AWS iGenomes, this has already been set for you appropriately.\n\nThis parameter is *mandatory* if `--genome` is not specified.", - "fa_icon": "far fa-file-code" + "help_text": "This parameter is *mandatory* if `--genome` is not specified.\n\nIf you use AWS iGenomes, this has already been set for you appropriately.", + "fa_icon": "fas fa-file" }, "fasta_fai": { "type": "string", "fa_icon": "fas fa-file", - "help_text": "If you use AWS iGenomes, this has already been set for you appropriately.\n\n> **NB** If none provided, will be generated automatically from the FASTA reference. Combine with `--save_reference` to save for future runs.", + "format": "file-path", + "exists": true, + "mimetype": "text/plain", + "help_text": "> **NB** If none provided, will be generated automatically from the FASTA reference. Combine with `--save_reference` to save for future runs.\n\nIf you use AWS iGenomes, this has already been set for you appropriately.", "description": "Path to FASTA reference index." 
}, "germline_resource": { "type": "string", "fa_icon": "fas fa-file", + "format": "file-path", + "exists": true, + "mimetype": "text/plain", "description": "Path to GATK Mutect2 Germline Resource File.", - "help_text": "If you use AWS iGenomes, this has already been set for you appropriately.\n\nThe germline resource VCF file (bgzipped and tabixed) needed by GATK4 Mutect2 is a collection of calls that are likely present in the sample, with allele frequencies.\nThe AF info field must be present.\nYou can find a smaller, stripped gnomAD VCF file (most of the annotation is removed and only calls signed by PASS are stored) in the AWS iGenomes Annotation/GermlineResource folder.", - "hidden": true + "help_text": "The germline resource VCF file (bgzipped and tabixed) needed by GATK4 Mutect2 is a collection of calls that are likely present in the sample, with allele frequencies.\nThe AF info field must be present.\nYou can find a smaller, stripped gnomAD VCF file (most of the annotation is removed and only calls signed by PASS are stored) in the AWS iGenomes Annotation/GermlineResource folder.\n\nIf you use AWS iGenomes, this has already been set for you appropriately." }, "germline_resource_tbi": { "type": "string", "fa_icon": "fas fa-file", + "format": "file-path", + "exists": true, + "mimetype": "text/plain", "description": "Path to GATK Mutect2 Germline Resource Index.", - "help_text": "If you use AWS iGenomes, this has already been set for you appropriately.\n\n> **NB** If none provided, will be generated automatically from the Germline Resource file, if provided. Combine with `--save_reference` to save for future runs.", - "hidden": true + "help_text": "> **NB** If none provided, will be generated automatically from the Germline Resource file, if provided. Combine with `--save_reference` to save for future runs.\n\nIf you use AWS iGenomes, this has already been set for you appropriately." 
}, "known_indels": { "type": "string", "fa_icon": "fas fa-copy", + "format": "file-path", + "exists": true, + "mimetype": "text/plain", "description": "Path to known indels file.", - "hidden": true, "help_text": "If you use AWS iGenomes, this has already been set for you appropriately." }, "known_indels_tbi": { "type": "string", "fa_icon": "fas fa-copy", + "format": "file-path", + "exists": true, + "mimetype": "text/plain", "description": "Path to known indels file index.", - "help_text": "If you use AWS iGenomes, this has already been set for you appropriately.\n\n> **NB** If none provided, will be generated automatically from the known index file, if provided. Combine with `--save_reference` to save for future runs.", - "hidden": true + "help_text": "> **NB** If none provided, will be generated automatically from the known index file, if provided. Combine with `--save_reference` to save for future runs.\n\nIf you use AWS iGenomes, this has already been set for you appropriately." }, "known_indels_vqsr": { "type": "string", - "fa_icon": "fas fa-copy", - "description": "If you use AWS iGenomes, this has already been set for you appropriately.\n\n1st label string for VariantRecalibration (haplotypecaller joint variant calling)" + "fa_icon": "fas fa-book", + "description": "Label string for VariantRecalibration (haplotypecaller joint variant calling). If you use AWS iGenomes, this has already been set for you appropriately." }, "known_snps": { "type": "string", "fa_icon": "fas fa-copy", - "description": "If you use AWS iGenomes, this has already been set for you appropriately.\n\nPath to known snps file." + "format": "file-path", + "exists": true, + "mimetype": "text/plain", + "description": "Path to known snps file.", + "help_text": "If you use AWS iGenomes, this has already been set for you appropriately." 
}, "known_snps_tbi": { "type": "string", "fa_icon": "fas fa-copy", + "format": "file-path", + "exists": true, + "mimetype": "text/plain", "description": "Path to known snps file snps.", - "help_text": "If you use AWS iGenomes, this has already been set for you appropriately.\n\n> **NB** If none provided, will be generated automatically from the known index file, if provided. Combine with `--save_reference` to save for future runs." + "help_text": "> **NB** If none provided, will be generated automatically from the known index file, if provided. Combine with `--save_reference` to save for future runs.\n\nIf you use AWS iGenomes, this has already been set for you appropriately." }, "known_snps_vqsr": { "type": "string", - "fa_icon": "fas fa-copy", - "description": "If you use AWS iGenomes, this has already been set for you appropriately.\n\nlabel string for VariantRecalibration (haplotypecaller joint variant calling)" + "fa_icon": "fas fa-book", + "description": "Label string for VariantRecalibration (haplotypecaller joint variant calling).If you use AWS iGenomes, this has already been set for you appropriately." }, "mappability": { "type": "string", "fa_icon": "fas fa-file", + "format": "file-path", + "exists": true, + "mimetype": "text/plain", "description": "Path to Control-FREEC mappability file.", - "hidden": true, "help_text": "If you use AWS iGenomes, this has already been set for you appropriately." }, "ngscheckmate_bed": { "type": "string", "fa_icon": "fas fa-file", + "format": "file-path", + "exists": true, + "mimetype": "text/plain", "description": "Path to SNP bed file for sample checking with NGSCheckMate", "help_text": "If you use AWS iGenomes, this has already been set for you appropriately." 
}, - "pon": { - "type": "string", - "fa_icon": "fas fa-file", - "description": "Panel-of-normals VCF (bgzipped) for GATK Mutect2", - "help_text": "Without PON, there will be no calls with PASS in the INFO field, only an unfiltered VCF is written.\nIt is highly recommended to make your own PON, as it depends on sequencer and library preparation.\n\nThe pipeline is shipped with a panel-of-normals for `--genome GATK.GRCh38` provided by [GATK](https://gatk.broadinstitute.org/hc/en-us/articles/360035890631-Panel-of-Normals-PON-). \n\nSee [PON documentation](https://gatk.broadinstitute.org/hc/en-us/articles/360042479112-CreateSomaticPanelOfNormals-BETA)\n> **NB** PON file should be bgzipped.", - "hidden": true - }, - "pon_tbi": { + "sentieon_dnascope_model": { "type": "string", "fa_icon": "fas fa-file", - "description": "Index of PON panel-of-normals VCF.", - "help_text": "If none provided, will be generated automatically from the PON bgzipped VCF file.", - "hidden": true + "format": "file-path", + "exists": true, + "mimetype": "text/plain", + "description": "Machine learning model for Sentieon Dnascope.", + "help_text": " It is recommended to use DNAscope with a machine learning model to perform variant calling with higher accuracy by improving the candidate detection and filtering. Sentieon can provide you with a model trained using a subset of the data from the GiAB truth-set found in https://github.com/genome-in-a-bottle. In addition, Sentieon can assist you in the creation of models using your own data, which will calibrate the specifics of your sequencing and bio-informatics processing.\n\nIf you use AWS iGenomes, this has already been set for you appropriately." 
}, - "sentieon_dnascope_model": { + "snpeff_cache": { "type": "string", - "fa_icon": "fas fa-database", - "hidden": true, - "description": "Machine learning model for Sentieon Dnascope.", - "help_text": " It is recommended to use DNAscope with a machine learning model to perform variant calling with higher accuracy by improving the candidate detection and filtering. Sentieon can provide you with a model trained using a subset of the data from the GiAB truth-set found in https://github.com/genome-in-a-bottle. In addition, Sentieon can assist you in the creation of models using your own data, which will calibrate the specifics of your sequencing and bio-informatics processing." + "format": "directory-path", + "fa_icon": "fas fa-cloud-download-alt", + "default": "s3://annotation-cache/snpeff_cache/", + "description": "Path to snpEff cache.", + "help_text": "Path to snpEff cache which should contain the relevant genome and build directory in the path ${snpeff_species}.${snpeff_version}\n\nIf you use AWS iGenomes, this has already been set for you appropriately." }, "snpeff_db": { "type": "string", "fa_icon": "fas fa-database", "description": "snpEff DB version.", - "help_text": "If you use AWS iGenomes, this has already been set for you appropriately.\nThis is used to specify the database to be use to annotate with.\nAlternatively databases' names can be listed with the `snpEff databases`." + "help_text": "This is used to specify the database to be use to annotate with.\nAlternatively databases' names can be listed with the `snpEff databases`.\n\nIf you use AWS iGenomes, this has already been set for you appropriately." }, - "snpeff_genome": { + "vep_cache": { "type": "string", - "fa_icon": "fas fa-microscope", - "description": "snpEff genome.", - "help_text": "If you use AWS iGenomes, this has already been set for you appropriately.\nThis is used to specify the genome when looking for local cache, or cloud based cache." 
+ "format": "directory-path", + "fa_icon": "fas fa-cloud-download-alt", + "default": "s3://annotation-cache/vep_cache/", + "description": "Path to VEP cache.", + "help_text": "Path to VEP cache which should contain the relevant species, genome and build directories at the path ${vep_species}/${vep_genome}_${vep_cache_version}\n\nIf you use AWS iGenomes, this has already been set for you appropriately." + }, + "vep_cache_version": { + "type": "string", + "fa_icon": "fas fa-tag", + "description": "VEP cache version.", + "help_text": "Alternative cache version can be used to specify the correct Ensembl Genomes version number as these differ from the concurrent Ensembl/VEP version numbers.\n\nIf you use AWS iGenomes, this has already been set for you appropriately." }, "vep_genome": { "type": "string", "fa_icon": "fas fa-microscope", "description": "VEP genome.", - "help_text": "If you use AWS iGenomes, this has already been set for you appropriately.\nThis is used to specify the genome when looking for local cache, or cloud based cache." + "help_text": "This is used to specify the genome when looking for local cache, or cloud based cache.\n\nIf you use AWS iGenomes, this has already been set for you appropriately." }, "vep_species": { "type": "string", "fa_icon": "fas fa-microscope", "description": "VEP species.", - "help_text": "If you use AWS iGenomes, this has already been set for you appropriately.\nAlternatively species listed in Ensembl Genomes caches can be used." 
- }, - "vep_cache_version": { - "type": "string", - "fa_icon": "fas fa-tag", - "description": "VEP cache version.", - "help_text": "If you use AWS iGenomes, this has already been set for you appropriately.\nAlternatively cache version can be use to specify the correct Ensembl Genomes version number as these differ from the concurrent Ensembl/VEP version numbers" - }, - "save_reference": { - "type": "boolean", - "fa_icon": "fas fa-download", - "description": "Save built references.", - "help_text": "Set this parameter, if you wish to save all computed reference files. This is useful to avoid re-computation on future runs." - }, - "build_only_index": { - "type": "boolean", - "fa_icon": "fas fa-download", - "description": "Only built references.", - "help_text": "Set this parameter, if you wish to compute and save all computed reference files. No alignment or any other downstream steps will be performed." - }, - "download_cache": { - "type": "boolean", - "fa_icon": "fas fa-download", - "description": "Download annotation cache.", - "help_text": "Set this parameter, if you wish to download annotation cache.\nUsing this parameter will download cache even if --snpeff_cache and --vep_cache are provided." - }, - "igenomes_base": { - "type": "string", - "format": "directory-path", - "description": "Directory / URL base for iGenomes references.", - "default": "s3://ngi-igenomes/igenomes/", - "fa_icon": "fas fa-cloud-download-alt" - }, - "igenomes_ignore": { - "type": "boolean", - "description": "Do not load the iGenomes reference config.", - "fa_icon": "fas fa-ban", - "help_text": "Do not load `igenomes.config` when running the pipeline.\nYou may choose this option if you observe clashes between custom parameters and those supplied in `igenomes.config`.\n\n> **NB** You can then run `Sarek` by specifying at least a FASTA genome file." 
- }, - "vep_cache": { - "type": "string", - "format": "directory-path", - "fa_icon": "fas fa-cloud-download-alt", - "default": "s3://annotation-cache/vep_cache/", - "description": "Path to VEP cache.", - "help_text": "Path to VEP cache which should contain the relevant species, genome and build directories at the path ${vep_species}/${vep_genome}_${vep_cache_version}" - }, - "snpeff_cache": { - "type": "string", - "format": "directory-path", - "fa_icon": "fas fa-cloud-download-alt", - "default": "s3://annotation-cache/snpeff_cache/", - "description": "Path to snpEff cache.", - "help_text": "Path to snpEff cache which should contain the relevant genome and build directory in the path ${snpeff_species}.${snpeff_version}" + "help_text": "Alternatively species listed in Ensembl Genomes caches can be used.\n\nIf you use AWS iGenomes, this has already been set for you appropriately." } }, "help_text": "The pipeline config files come bundled with paths to the Illumina iGenomes reference index files.\nThe configuration is set up to use the AWS-iGenomes resource\ncf https://ewels.github.io/AWS-iGenomes/." @@ -914,41 +893,6 @@ } } }, - "max_job_request_options": { - "title": "Max job request options", - "type": "object", - "fa_icon": "fab fa-acquisitions-incorporated", - "description": "Set the top limit for requested resources for any single job.", - "help_text": "If you are running on a smaller system, a pipeline step requesting more resources than are available may cause the Nextflow to stop the run with an error. These options allow you to cap the maximum resources requested by any single job so that the pipeline will run on your system.\n\nNote that you can not _increase_ the resources requested by any job using these options. For that you will need your own configuration file. 
See [the nf-core website](https://nf-co.re/usage/configuration) for details.", - "properties": { - "max_cpus": { - "type": "integer", - "description": "Maximum number of CPUs that can be requested for any single job.", - "default": 16, - "fa_icon": "fas fa-microchip", - "hidden": true, - "help_text": "Use to set an upper-limit for the CPU requirement for each process. Should be an integer e.g. `--max_cpus 1`." - }, - "max_memory": { - "type": "string", - "description": "Maximum amount of memory that can be requested for any single job.", - "default": "128.GB", - "fa_icon": "fas fa-memory", - "pattern": "^\\d+(\\.\\d+)?\\.?\\s*(K|M|G|T)?B$", - "hidden": true, - "help_text": "Use to set an upper-limit for the memory requirement for each process. Should be a string in the format integer-unit e.g. `--max_memory '8.GB'`." - }, - "max_time": { - "type": "string", - "description": "Maximum amount of time that can be requested for any single job.", - "default": "240.h", - "fa_icon": "far fa-clock", - "pattern": "^(\\d+\\.?\\s*(s|m|h|d|day)\\s*)+$", - "hidden": true, - "help_text": "Use to set an upper-limit for the time requirement for each process. Should be a string in the format integer-unit e.g. `--max_time '2.h'`." 
- } - } - }, "generic_options": { "title": "Generic options", "type": "object", @@ -956,12 +900,6 @@ "description": "Less common options for the pipeline, typically set in a config file.", "help_text": "These options are common to all nf-core pipelines and allow you to customise some of the core preferences for how the pipeline runs.\n\nTypically these options would be set in a Nextflow config file loaded for all pipeline runs, such as `~/.nextflow/config`.", "properties": { - "help": { - "type": "boolean", - "description": "Display help text.", - "fa_icon": "fas fa-question-circle", - "hidden": true - }, "version": { "type": "boolean", "description": "Display version and exit.", @@ -1012,6 +950,13 @@ "fa_icon": "fas fa-palette", "hidden": true }, + "hook_url": { + "type": "string", + "description": "Incoming hook URL for messaging service", + "fa_icon": "fas fa-people-group", + "help_text": "Incoming hook URL for messaging service. Currently, MS Teams and Slack are supported.", + "hidden": true + }, "multiqc_title": { "type": "string", "description": "MultiQC report title. Printed as page header, used for filename if not otherwise specified.", @@ -1042,34 +987,6 @@ "fa_icon": "fas fa-check-square", "hidden": true }, - "validationShowHiddenParams": { - "type": "boolean", - "fa_icon": "far fa-eye-slash", - "description": "Show all params when using `--help`", - "hidden": true, - "help_text": "By default, parameters set as _hidden_ in the schema are not shown on the command line when a user runs with `--help`. Specifying this option will tell the pipeline to show all parameters." - }, - "validationFailUnrecognisedParams": { - "type": "boolean", - "fa_icon": "far fa-check-circle", - "description": "Validation of parameters fails when an unrecognised parameter is found.", - "hidden": true, - "help_text": "By default, when an unrecognised parameter is found, it returns a warinig." 
- }, - "validationLenientMode": { - "type": "boolean", - "fa_icon": "far fa-check-circle", - "description": "Validation of parameters in lenient more.", - "hidden": true, - "help_text": "Allows string values that are parseable as numbers or booleans. For further information see [JSONSchema docs](https://github.com/everit-org/json-schema#lenient-mode)." - }, - "hook_url": { - "type": "string", - "description": "Incoming hook URL for messaging service", - "fa_icon": "fas fa-people-group", - "help_text": "Incoming hook URL for messaging service. Currently, MS Teams and Slack are supported.", - "hidden": true - }, "pipelines_testdata_base_path": { "type": "string", "fa_icon": "far fa-check-circle", @@ -1082,34 +999,34 @@ }, "allOf": [ { - "$ref": "#/definitions/input_output_options" + "$ref": "#/$defs/input_output_options" }, { - "$ref": "#/definitions/main_options" + "$ref": "#/$defs/main_options" }, { - "$ref": "#/definitions/fastq_preprocessing" + "$ref": "#/$defs/fastq_preprocessing" }, { - "$ref": "#/definitions/preprocessing" + "$ref": "#/$defs/preprocessing" }, { - "$ref": "#/definitions/variant_calling" + "$ref": "#/$defs/variant_calling" }, { - "$ref": "#/definitions/annotation" + "$ref": "#/$defs/annotation" }, { - "$ref": "#/definitions/reference_genome_options" + "$ref": "#/$defs/general_reference_genome_options" }, { - "$ref": "#/definitions/institutional_config_options" + "$ref": "#/$defs/reference_genome_options" }, { - "$ref": "#/definitions/max_job_request_options" + "$ref": "#/$defs/institutional_config_options" }, { - "$ref": "#/definitions/generic_options" + "$ref": "#/$defs/generic_options" } ] } diff --git a/nf-test.config b/nf-test.config index c60f901961..e2b4d8261a 100644 --- a/nf-test.config +++ b/nf-test.config @@ -1,6 +1,20 @@ config { + // location for all nf-tests testsDir "." 
- workDir ".nf-test" + + // nf-test directory including temporary files for each test + workDir System.getenv("NFT_WORKDIR") ?: ".nf-test" + + // location of an optional nextflow.config file specific for executing tests configFile "conf/test.config" + + // run all test with defined profile(s) from the main nextflow.config profile "test" + triggers 'nextflow.config', 'nf-test.config', 'conf/test.config', 'conf/test_full.config' + + // Include plugins + plugins { + load "nft-bam@0.4.0" + load "nft-utils@0.0.3" + } } diff --git a/subworkflows/local/annotation_cache_initialisation/main.nf b/subworkflows/local/annotation_cache_initialisation/main.nf index f4752b7a61..572bcfc43b 100644 --- a/subworkflows/local/annotation_cache_initialisation/main.nf +++ b/subworkflows/local/annotation_cache_initialisation/main.nf @@ -12,7 +12,6 @@ workflow ANNOTATION_CACHE_INITIALISATION { take: snpeff_enabled snpeff_cache - snpeff_genome snpeff_db vep_enabled vep_cache @@ -24,8 +23,8 @@ workflow ANNOTATION_CACHE_INITIALISATION { main: if (snpeff_enabled) { - def snpeff_annotation_cache_key = (snpeff_cache == "s3://annotation-cache/snpeff_cache/") ? "${snpeff_genome}.${snpeff_db}/" : "" - def snpeff_cache_dir = "${snpeff_annotation_cache_key}${snpeff_genome}.${snpeff_db}" + def snpeff_annotation_cache_key = (snpeff_cache == "s3://annotation-cache/snpeff_cache/") ? 
"${snpeff_db}/" : "" + def snpeff_cache_dir = "${snpeff_annotation_cache_key}${snpeff_db}" def snpeff_cache_path_full = file("$snpeff_cache/$snpeff_cache_dir", type: 'dir') if ( !snpeff_cache_path_full.exists() || !snpeff_cache_path_full.isDirectory() ) { if (snpeff_cache == "s3://annotation-cache/snpeff_cache/") { @@ -35,7 +34,7 @@ workflow ANNOTATION_CACHE_INITIALISATION { } } snpeff_cache = Channel.fromPath(file("${snpeff_cache}/${snpeff_annotation_cache_key}"), checkIfExists: true).collect() - .map{ cache -> [ [ id:"${snpeff_genome}.${snpeff_db}" ], cache ] } + .map{ cache -> [ [ id:"${snpeff_db}" ], cache ] } } else snpeff_cache = [] if (vep_enabled) { diff --git a/subworkflows/local/bam_joint_calling_germline_gatk/main.nf b/subworkflows/local/bam_joint_calling_germline_gatk/main.nf index 4d030ba5c5..46357d6fbe 100644 --- a/subworkflows/local/bam_joint_calling_germline_gatk/main.nf +++ b/subworkflows/local/bam_joint_calling_germline_gatk/main.nf @@ -52,7 +52,7 @@ workflow BAM_JOINT_CALLING_GERMLINE_GATK { // Joint genotyping performed using GenotypeGVCFs // Sort vcfs called by interval within each VCF - GATK4_GENOTYPEGVCFS(genotype_input, fasta.map{ meta, fasta -> [ fasta ] }, fai.map{ meta, fai -> [ fai ] }, dict.map{ meta, dict -> [ dict ] }, dbsnp, dbsnp_tbi) + GATK4_GENOTYPEGVCFS(genotype_input, fasta, fai, dict, dbsnp.map{ it -> [ [:], it ] }, dbsnp_tbi.map{ it -> [ [:], it ] }) BCFTOOLS_SORT(GATK4_GENOTYPEGVCFS.out.vcf) gvcf_to_merge = BCFTOOLS_SORT.out.vcf.map{ meta, vcf -> [ meta.subMap('num_intervals') + [ id:'joint_variant_calling', patient:'all_samples', variantcaller:'haplotypecaller' ], vcf ]}.groupTuple() diff --git a/subworkflows/local/bam_joint_calling_germline_sentieon/main.nf b/subworkflows/local/bam_joint_calling_germline_sentieon/main.nf index 23d0d8675a..0ac47a1e1e 100644 --- a/subworkflows/local/bam_joint_calling_germline_sentieon/main.nf +++ b/subworkflows/local/bam_joint_calling_germline_sentieon/main.nf @@ -36,7 +36,7 @@ workflow 
BAM_JOINT_CALLING_GERMLINE_SENTIEON { .map{ meta, gvcf, tbi, intervals -> [ [ id:'joint_variant_calling', intervals_name:intervals.baseName, num_intervals:meta.num_intervals ], gvcf, tbi, intervals ] } .groupTuple(by:[0, 3]) - SENTIEON_GVCFTYPER(sentieon_input, fasta.map{meta, it -> [ it ]}, fai.map{meta, it -> [ it ]}, dbsnp, dbsnp_tbi) + SENTIEON_GVCFTYPER(sentieon_input, fasta, fai, dbsnp, dbsnp_tbi) BCFTOOLS_SORT(SENTIEON_GVCFTYPER.out.vcf_gz) diff --git a/subworkflows/local/bam_variant_calling_deepvariant/main.nf b/subworkflows/local/bam_variant_calling_deepvariant/main.nf index feb7c33c08..567050b505 100644 --- a/subworkflows/local/bam_variant_calling_deepvariant/main.nf +++ b/subworkflows/local/bam_variant_calling_deepvariant/main.nf @@ -4,7 +4,7 @@ // For all modules here: // A when clause condition is defined in the conf/modules.config to determine if the module should be run -include { DEEPVARIANT } from '../../../modules/nf-core/deepvariant/main' +include { DEEPVARIANT_RUNDEEPVARIANT } from '../../../modules/nf-core/deepvariant/rundeepvariant/main' include { GATK4_MERGEVCFS as MERGE_DEEPVARIANT_GVCF } from '../../../modules/nf-core/gatk4/mergevcfs/main' include { GATK4_MERGEVCFS as MERGE_DEEPVARIANT_VCF } from '../../../modules/nf-core/gatk4/mergevcfs/main' @@ -25,17 +25,17 @@ workflow BAM_VARIANT_CALLING_DEEPVARIANT { // Move num_intervals to meta map .map{ meta, cram, crai, intervals, num_intervals -> [ meta + [ num_intervals:num_intervals ], cram, crai, intervals ]} - DEEPVARIANT(cram_intervals, fasta, fasta_fai, [ [ id:'null' ], [] ]) + DEEPVARIANT_RUNDEEPVARIANT(cram_intervals, fasta, fasta_fai, [ [ id:'null' ], [] ], [ [ id:'null' ], [] ]) // Figuring out if there is one or more vcf(s) from the same sample - vcf_out = DEEPVARIANT.out.vcf.branch{ + vcf_out = DEEPVARIANT_RUNDEEPVARIANT.out.vcf.branch{ // Use meta.num_intervals to asses number of intervals intervals: it[0].num_intervals > 1 no_intervals: it[0].num_intervals <= 1 } // Figuring out if 
there is one or more gvcf(s) from the same sample - gvcf_out = DEEPVARIANT.out.gvcf.branch{ + gvcf_out = DEEPVARIANT_RUNDEEPVARIANT.out.gvcf.branch{ // Use meta.num_intervals to asses number of intervals intervals: it[0].num_intervals > 1 no_intervals: it[0].num_intervals <= 1 @@ -58,7 +58,7 @@ workflow BAM_VARIANT_CALLING_DEEPVARIANT { // add variantcaller to meta map and remove no longer necessary field: num_intervals .map{ meta, vcf -> [ meta - meta.subMap('num_intervals') + [ variantcaller:'deepvariant' ], vcf ] } - versions = versions.mix(DEEPVARIANT.out.versions) + versions = versions.mix(DEEPVARIANT_RUNDEEPVARIANT.out.versions) versions = versions.mix(MERGE_DEEPVARIANT_GVCF.out.versions) versions = versions.mix(MERGE_DEEPVARIANT_VCF.out.versions) diff --git a/subworkflows/local/bam_variant_calling_germline_all/main.nf b/subworkflows/local/bam_variant_calling_germline_all/main.nf index deb15527d7..c7666d8fe4 100644 --- a/subworkflows/local/bam_variant_calling_germline_all/main.nf +++ b/subworkflows/local/bam_variant_calling_germline_all/main.nf @@ -9,6 +9,7 @@ include { BAM_VARIANT_CALLING_DEEPVARIANT include { BAM_VARIANT_CALLING_FREEBAYES } from '../bam_variant_calling_freebayes/main' include { BAM_VARIANT_CALLING_GERMLINE_MANTA } from '../bam_variant_calling_germline_manta/main' include { BAM_VARIANT_CALLING_HAPLOTYPECALLER } from '../bam_variant_calling_haplotypecaller/main' +include { BAM_VARIANT_CALLING_INDEXCOV } from '../bam_variant_calling_indexcov/main' include { BAM_VARIANT_CALLING_SENTIEON_DNASCOPE } from '../bam_variant_calling_sentieon_dnascope/main' include { BAM_VARIANT_CALLING_SENTIEON_HAPLOTYPER } from '../bam_variant_calling_sentieon_haplotyper/main' include { BAM_VARIANT_CALLING_MPILEUP } from '../bam_variant_calling_mpileup/main' @@ -18,8 +19,6 @@ include { SENTIEON_DNAMODELAPPLY include { VCF_VARIANT_FILTERING_GATK } from '../vcf_variant_filtering_gatk/main' include { VCF_VARIANT_FILTERING_GATK as 
SENTIEON_HAPLOTYPER_VCF_VARIANT_FILTERING_GATK } from '../vcf_variant_filtering_gatk/main' - - workflow BAM_VARIANT_CALLING_GERMLINE_ALL { take: tools // Mandatory, list of tools to apply @@ -58,6 +57,7 @@ workflow BAM_VARIANT_CALLING_GERMLINE_ALL { gvcf_sentieon_dnascope = Channel.empty() gvcf_sentieon_haplotyper = Channel.empty() + out_indexcov = Channel.empty() vcf_deepvariant = Channel.empty() vcf_freebayes = Channel.empty() vcf_haplotypecaller = Channel.empty() @@ -191,6 +191,18 @@ workflow BAM_VARIANT_CALLING_GERMLINE_ALL { versions = versions.mix(BAM_VARIANT_CALLING_GERMLINE_MANTA.out.versions) } + // INDEXCOV, for WGS only + if (params.wes==false && tools.split(',').contains('indexcov')) { + BAM_VARIANT_CALLING_INDEXCOV ( + cram, + fasta, + fasta_fai + ) + + out_indexcov = BAM_VARIANT_CALLING_INDEXCOV.out.out_indexcov + versions = versions.mix(BAM_VARIANT_CALLING_INDEXCOV.out.versions) + } + // SENTIEON DNASCOPE if (tools.split(',').contains('sentieon_dnascope')) { BAM_VARIANT_CALLING_SENTIEON_DNASCOPE( @@ -356,6 +368,7 @@ workflow BAM_VARIANT_CALLING_GERMLINE_ALL { emit: gvcf_sentieon_dnascope gvcf_sentieon_haplotyper + out_indexcov vcf_all vcf_deepvariant vcf_freebayes diff --git a/subworkflows/local/bam_variant_calling_indexcov/main.nf b/subworkflows/local/bam_variant_calling_indexcov/main.nf new file mode 100644 index 0000000000..d1bc9f39a9 --- /dev/null +++ b/subworkflows/local/bam_variant_calling_indexcov/main.nf @@ -0,0 +1,44 @@ +// +// Indexcov calling +// +// For all modules here: +// A when clause condition is defined in the conf/modules.config to determine if the module should be run + +include { SAMTOOLS_REINDEX_BAM } from '../../../modules/local/samtools/reindex_bam/main' +include { GOLEFT_INDEXCOV } from '../../../modules/nf-core/goleft/indexcov/main' + +// Seems to be the consensus on upstream modules implementation too +workflow BAM_VARIANT_CALLING_INDEXCOV { + take: + cram // channel: [mandatory] [ meta, cram, crai ] + fasta // channel: 
[mandatory] [ meta, fasta ] + fasta_fai // channel: [mandatory] [ meta, fasta_fai ] + + main: + versions = Channel.empty() + + // generate a cleaner bam index without duplicate, supplementary, etc. (Small workload because the bam itself is not re-generated) + reindex_ch = SAMTOOLS_REINDEX_BAM( + cram, + fasta, + fasta_fai + ) + + versions = versions.mix(reindex_ch.versions) + + // create [ [id:directory], bams, bais ] + indexcov_input_ch = reindex_ch.output.map{[[id:"indexcov"], it[1], it[2]]}.groupTuple() + + goleft_ch = GOLEFT_INDEXCOV( + indexcov_input_ch, + fasta_fai + ) + + versions = versions.mix(goleft_ch.versions) + + + emit: + + out_indexcov = goleft_ch.output + versions +} diff --git a/subworkflows/local/bam_variant_calling_sentieon_haplotyper/main.nf b/subworkflows/local/bam_variant_calling_sentieon_haplotyper/main.nf index 1f0d59e522..5c1f1338d2 100644 --- a/subworkflows/local/bam_variant_calling_sentieon_haplotyper/main.nf +++ b/subworkflows/local/bam_variant_calling_sentieon_haplotyper/main.nf @@ -48,9 +48,9 @@ workflow BAM_VARIANT_CALLING_SENTIEON_HAPLOTYPER { emit_vcf = lst.size() > 0 ? 
lst[0] : '' SENTIEON_HAPLOTYPER( - cram_intervals_for_sentieon, - fasta.map{ meta, it -> it }, - fasta_fai.map{ meta, it -> it }, + cram_intervals_for_sentieon.map{ meta, cram, crai, intervals, num_intervals -> [ meta, cram, crai, intervals, [] ]}, + fasta, + fasta_fai, dbsnp, dbsnp_tbi, emit_vcf, diff --git a/subworkflows/local/bam_variant_calling_somatic_all/main.nf b/subworkflows/local/bam_variant_calling_somatic_all/main.nf index cdfabfc3ac..1f8e26d3c7 100644 --- a/subworkflows/local/bam_variant_calling_somatic_all/main.nf +++ b/subworkflows/local/bam_variant_calling_somatic_all/main.nf @@ -12,6 +12,7 @@ include { BAM_VARIANT_CALLING_SOMATIC_MANTA } from '../bam_variant_c include { BAM_VARIANT_CALLING_SOMATIC_MUTECT2 } from '../bam_variant_calling_somatic_mutect2/main' include { BAM_VARIANT_CALLING_SOMATIC_STRELKA } from '../bam_variant_calling_somatic_strelka/main' include { BAM_VARIANT_CALLING_SOMATIC_TIDDIT } from '../bam_variant_calling_somatic_tiddit/main' +include { BAM_VARIANT_CALLING_INDEXCOV } from '../bam_variant_calling_indexcov/main' include { MSISENSORPRO_MSISOMATIC } from '../../../modules/nf-core/msisensorpro/msisomatic/main' workflow BAM_VARIANT_CALLING_SOMATIC_ALL { @@ -53,6 +54,7 @@ workflow BAM_VARIANT_CALLING_SOMATIC_ALL { out_msisensorpro = Channel.empty() vcf_mutect2 = Channel.empty() vcf_tiddit = Channel.empty() + out_indexcov = Channel.empty() if (tools.split(',').contains('ascat')) { BAM_VARIANT_CALLING_SOMATIC_ASCAT( @@ -154,6 +156,20 @@ workflow BAM_VARIANT_CALLING_SOMATIC_ALL { versions = versions.mix(BAM_VARIANT_CALLING_SOMATIC_MANTA.out.versions) } + + // INDEXCOV, for WGS only + if (params.wes==false && tools.split(',').contains('indexcov')) { + BAM_VARIANT_CALLING_INDEXCOV ( + cram, + fasta, + fasta_fai + ) + + out_indexcov = BAM_VARIANT_CALLING_INDEXCOV.out.out_indexcov + versions = versions.mix(BAM_VARIANT_CALLING_INDEXCOV.out.versions) + } + + // STRELKA if (tools.split(',').contains('strelka')) { // Remap channel to match 
module/subworkflow @@ -232,6 +248,7 @@ workflow BAM_VARIANT_CALLING_SOMATIC_ALL { ) emit: + out_indexcov out_msisensorpro vcf_all vcf_freebayes diff --git a/subworkflows/local/bam_variant_calling_somatic_tiddit/main.nf b/subworkflows/local/bam_variant_calling_somatic_tiddit/main.nf index 259520fce1..8c17df041b 100644 --- a/subworkflows/local/bam_variant_calling_somatic_tiddit/main.nf +++ b/subworkflows/local/bam_variant_calling_somatic_tiddit/main.nf @@ -22,7 +22,7 @@ workflow BAM_VARIANT_CALLING_SOMATIC_TIDDIT { TIDDIT_NORMAL(cram_normal, fasta, bwa) TIDDIT_TUMOR(cram_tumor, fasta, bwa) - SVDB_MERGE(TIDDIT_NORMAL.out.vcf.join(TIDDIT_TUMOR.out.vcf, failOnDuplicate: true, failOnMismatch: true).map{ meta, vcf_normal, vcf_tumor -> [ meta, [vcf_normal, vcf_tumor] ] }, false) + SVDB_MERGE(TIDDIT_NORMAL.out.vcf.join(TIDDIT_TUMOR.out.vcf, failOnDuplicate: true, failOnMismatch: true).map{ meta, vcf_normal, vcf_tumor -> [ meta, [vcf_normal, vcf_tumor] ] }, false, true) vcf = SVDB_MERGE.out.vcf diff --git a/subworkflows/local/bam_variant_calling_tumor_only_all/main.nf b/subworkflows/local/bam_variant_calling_tumor_only_all/main.nf index 59b14ed898..8016391cfc 100644 --- a/subworkflows/local/bam_variant_calling_tumor_only_all/main.nf +++ b/subworkflows/local/bam_variant_calling_tumor_only_all/main.nf @@ -6,11 +6,11 @@ include { BAM_VARIANT_CALLING_CNVKIT } from '../bam_variant_calling_cnvkit/main' include { BAM_VARIANT_CALLING_FREEBAYES } from '../bam_variant_calling_freebayes/main' include { BAM_VARIANT_CALLING_MPILEUP } from '../bam_variant_calling_mpileup/main' -include { BAM_VARIANT_CALLING_SINGLE_STRELKA } from '../bam_variant_calling_single_strelka/main' include { BAM_VARIANT_CALLING_SINGLE_TIDDIT } from '../bam_variant_calling_single_tiddit/main' include { BAM_VARIANT_CALLING_TUMOR_ONLY_CONTROLFREEC } from '../bam_variant_calling_tumor_only_controlfreec/main' include { BAM_VARIANT_CALLING_TUMOR_ONLY_MANTA } from '../bam_variant_calling_tumor_only_manta/main' include { 
BAM_VARIANT_CALLING_TUMOR_ONLY_MUTECT2 } from '../bam_variant_calling_tumor_only_mutect2/main' +include { BAM_VARIANT_CALLING_TUMOR_ONLY_LOFREQ } from '../bam_variant_calling_tumor_only_lofreq/main' workflow BAM_VARIANT_CALLING_TUMOR_ONLY_ALL { take: @@ -45,8 +45,8 @@ workflow BAM_VARIANT_CALLING_TUMOR_ONLY_ALL { vcf_manta = Channel.empty() vcf_mpileup = Channel.empty() vcf_mutect2 = Channel.empty() - vcf_strelka = Channel.empty() vcf_tiddit = Channel.empty() + vcf_lofreq = Channel.empty() // MPILEUP if (tools.split(',').contains('mpileup') || tools.split(',').contains('controlfreec')) { @@ -134,6 +134,19 @@ workflow BAM_VARIANT_CALLING_TUMOR_ONLY_ALL { versions = versions.mix(BAM_VARIANT_CALLING_TUMOR_ONLY_MUTECT2.out.versions) } + //LOFREQ + if (tools.split(',').contains('lofreq')) { + BAM_VARIANT_CALLING_TUMOR_ONLY_LOFREQ( + cram, + fasta, + fasta_fai, + intervals, + dict + ) + vcf_lofreq = BAM_VARIANT_CALLING_TUMOR_ONLY_LOFREQ.out.vcf + versions = versions.mix(BAM_VARIANT_CALLING_TUMOR_ONLY_LOFREQ.out.versions) + } + // MANTA if (tools.split(',').contains('manta')) { BAM_VARIANT_CALLING_TUMOR_ONLY_MANTA( @@ -149,20 +162,6 @@ workflow BAM_VARIANT_CALLING_TUMOR_ONLY_ALL { versions = versions.mix(BAM_VARIANT_CALLING_TUMOR_ONLY_MANTA.out.versions) } - // STRELKA - if (tools.split(',').contains('strelka')) { - BAM_VARIANT_CALLING_SINGLE_STRELKA( - cram, - dict, - fasta.map{ meta, fasta -> [ fasta ] }, - fasta_fai.map{ meta, fasta_fai -> [ fasta_fai ] }, - intervals_bed_gz_tbi - ) - - vcf_strelka = BAM_VARIANT_CALLING_SINGLE_STRELKA.out.vcf - versions = versions.mix(BAM_VARIANT_CALLING_SINGLE_STRELKA.out.versions) - } - // TIDDIT if (tools.split(',').contains('tiddit')) { BAM_VARIANT_CALLING_SINGLE_TIDDIT( @@ -176,20 +175,20 @@ workflow BAM_VARIANT_CALLING_TUMOR_ONLY_ALL { vcf_all = Channel.empty().mix( vcf_freebayes, + vcf_lofreq, vcf_manta, vcf_mutect2, vcf_mpileup, - vcf_strelka, vcf_tiddit ) emit: vcf_all vcf_freebayes + vcf_lofreq vcf_manta vcf_mpileup 
vcf_mutect2 - vcf_strelka vcf_tiddit versions = versions diff --git a/subworkflows/local/bam_variant_calling_tumor_only_lofreq/main.nf b/subworkflows/local/bam_variant_calling_tumor_only_lofreq/main.nf new file mode 100644 index 0000000000..b619c1f796 --- /dev/null +++ b/subworkflows/local/bam_variant_calling_tumor_only_lofreq/main.nf @@ -0,0 +1,51 @@ +include { LOFREQ_CALLPARALLEL as LOFREQ } from '../../../modules/nf-core/lofreq/callparallel/main.nf' +include { GATK4_MERGEVCFS as MERGE_LOFREQ } from '../../../modules/nf-core/gatk4/mergevcfs/main.nf' + +workflow BAM_VARIANT_CALLING_TUMOR_ONLY_LOFREQ { + take: + input // channel: [mandatory] [ meta, tumor_cram, tumor_crai ] + fasta // channel: [mandatory] [ fasta ] + fai // channel: [mandatory] [ fasta_fai ] + intervals // channel: [mandatory] [ intervals, num_intervals ] or [ [], 0 ] + dict // channel: /path/to/reference/fasta/dictionary + + main: + versions = Channel.empty() + + // Combine cram and intervals for spread and gather strategy + input_intervals = input.combine(intervals) + // Move num_intervals to meta map + .map {meta, tumor_cram, tumor_crai, intervals, num_intervals -> [meta + [ num_intervals:num_intervals ], tumor_cram, tumor_crai, intervals]} + + LOFREQ(input_intervals, fasta, fai) // Call variants with LoFreq + + // Figuring out if there is one or more vcf(s) from the same sample + vcf_branch = LOFREQ.out.vcf.branch{ + // Use meta.num_intervals to asses number of intervals + intervals: it[0].num_intervals > 1 + no_intervals: it[0].num_intervals <= 1 + } + + // Figuring out if there is one or more tbi(s) from the same sample + tbi_branch = LOFREQ.out.tbi.branch{ + // Use meta.num_intervals to asses number of intervals + intervals: it[0].num_intervals > 1 + no_intervals: it[0].num_intervals <= 1 + } + + // Only when using intervals + vcf_to_merge = vcf_branch.intervals.map{ meta, vcf -> [ groupKey(meta, meta.num_intervals), vcf ] }.groupTuple() + + MERGE_LOFREQ(vcf_to_merge, dict) + + // Mix 
intervals and no_intervals channels together + // Remove unnecessary metadata + vcf = Channel.empty().mix(MERGE_LOFREQ.out.vcf, vcf_branch.no_intervals).map{ meta, vcf -> [ meta - meta.subMap('num_intervals') + [ variantcaller:'lofreq' ], vcf ] } + + versions = versions.mix(MERGE_LOFREQ.out.versions) + versions = versions.mix(LOFREQ.out.versions) + + emit: + vcf + versions +} diff --git a/subworkflows/local/fastq_create_umi_consensus_fgbio/main.nf b/subworkflows/local/fastq_create_umi_consensus_fgbio/main.nf index c237e64014..e683ad72c0 100644 --- a/subworkflows/local/fastq_create_umi_consensus_fgbio/main.nf +++ b/subworkflows/local/fastq_create_umi_consensus_fgbio/main.nf @@ -50,7 +50,10 @@ workflow FASTQ_CREATE_UMI_CONSENSUS_FGBIO { // Using newly created groups // To call a consensus across reads in the same group // And emit a consensus BAM file - CALLUMICONSENSUS(GROUPREADSBYUMI.out.bam) + // TODO: add params for call_min_reads and call_min_baseq + call_min_reads = 1 + call_min_baseq = 10 + CALLUMICONSENSUS(GROUPREADSBYUMI.out.bam, call_min_reads, call_min_baseq) ch_versions = ch_versions.mix(BAM2FASTQ.out.versions) ch_versions = ch_versions.mix(ALIGN_UMI.out.versions) diff --git a/subworkflows/local/samplesheet_to_channel/main.nf b/subworkflows/local/samplesheet_to_channel/main.nf index 245bfaec1a..1c0d80a1db 100644 --- a/subworkflows/local/samplesheet_to_channel/main.nf +++ b/subworkflows/local/samplesheet_to_channel/main.nf @@ -26,6 +26,8 @@ workflow SAMPLESHEET_TO_CHANNEL{ seq_center // seq_platform // skip_tools // + snpeff_cache // + snpeff_db // step // tools // umi_read_structure // @@ -278,6 +280,12 @@ Joint germline variant calling also requires intervals in order to genotype the error("Please specify --bcftools_annotations, --bcftools_annotations_tbi, and --bcftools_header_lines, when using BCFTools annotations") } + // Fails when snpeff annotation is enabled but snpeff_db is not specified + if ((snpeff_cache && tools && 
(tools.split(',').contains("snpeff") || tools.split(',').contains('merge'))) && + !snpeff_db) { + error("Please specify --snpeff_db") + } + emit: input_sample } diff --git a/subworkflows/local/samplesheet_to_channel/main.nf.test b/subworkflows/local/samplesheet_to_channel/main.nf.test deleted file mode 100644 index 49eeb2a132..0000000000 --- a/subworkflows/local/samplesheet_to_channel/main.nf.test +++ /dev/null @@ -1,34 +0,0 @@ -nextflow_workflow { - - name "Test Workflow SAMPLESHEET_TO_CHANNEL" - script "subworkflows/local/samplesheet_to_channel/main.nf" - workflow "SAMPLESHEET_TO_CHANNEL" - - test("Should run without failures") { - - when { - params { - // define parameters here. Example: - skip_tools = 'baserecalibrator' - - } - workflow { - """ - // define inputs of the workflow here. Example: - input[0] = Channel.of([['patient':'test', 'sample':'test', - 'sex':'XX', 'status':0, 'lane':'test_L1'], - file(params.test_data['sarscov2']['illumina']['test_1_fastq_gz'], checkIfExists: true), - file(params.test_data['sarscov2']['illumina']['test_2_fastq_gz'], checkIfExists: true), - [], [], [], [], [], [], []]) - """ - } - } - - then { - assert workflow.success - assert snapshot(workflow.out).match() - } - - } - -} diff --git a/subworkflows/local/samplesheet_to_channel/tests/main.nf.test b/subworkflows/local/samplesheet_to_channel/tests/main.nf.test new file mode 100644 index 0000000000..e1c8682d27 --- /dev/null +++ b/subworkflows/local/samplesheet_to_channel/tests/main.nf.test @@ -0,0 +1,62 @@ +nextflow_workflow { + + name "Test Workflow SAMPLESHEET_TO_CHANNEL" + script "../main.nf" + workflow "SAMPLESHEET_TO_CHANNEL" + + test("Should run without failures") { + when { + params { + } + workflow { + """ + // define inputs of the workflow here. 
Example: + input[0] = Channel.of([ + ['patient':'test', 'sample':'test', + 'sex':'XX', 'status':0, 'lane':'test_L1'], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/fastq/test_2.fastq.gz', checkIfExists: true), + [], [], [], [], [], [], [], [], [] + ]) + input[1] = 'bwa-mem' // aligner + input[2] = [] // ascat_alleles + input[3] = [] // ascat_loci + input[4] = [] // ascat_loci_gc + input[5] = [] // ascat_loci_rt + input[6] = [] // bcftools_annotations + input[7] = [] // bcftools_annotations_tbi + input[8] = [] // bcftools_header_lines + input[9] = false // build_only_index + input[10] = [] // dbsnp + input[11] = [] // fasta + input[12] = [] // germline_resource + input[13] = [] // intervals + input[14] = false // joint_germline + input[15] = false // joint_mutect2 + input[16] = [] // known_indels + input[17] = [] // known_snps + input[18] = false // no_intervals + input[19] = [] // pon + input[20] = 'variant' // sentieon_dnascope_emit_mode + input[21] = 'variant' // sentieon_haplotyper_emit_mode + input[22] = '' // seq_center + input[23] = 'ILLUMINA' // seq_platform + input[24] = 'baserecalibrator' // skip_tools + input[25] = [] // snpeff_cache + input[26] = 'WBcel235.105' // snpeff_db + input[27] = 'mapping' // step + input[28] = 'strelka' // tools + input[29] = [] // umi_read_structure + input[30] = false // wes + """ + } + } + + then { + assert workflow.success + assert snapshot(workflow.out).match() + } + + } + +} diff --git a/subworkflows/local/samplesheet_to_channel/main.nf.test.snap b/subworkflows/local/samplesheet_to_channel/tests/main.nf.test.snap similarity index 78% rename from subworkflows/local/samplesheet_to_channel/main.nf.test.snap rename to subworkflows/local/samplesheet_to_channel/tests/main.nf.test.snap index fa440f539b..19fcc95d66 100644 --- 
a/subworkflows/local/samplesheet_to_channel/main.nf.test.snap +++ b/subworkflows/local/samplesheet_to_channel/tests/main.nf.test.snap @@ -9,10 +9,10 @@ "sample": "test", "sex": "XX", "status": 0, + "lane": "test_L1", "id": "test-test_L1", + "data_type": "fastq_gz", "num_lanes": 1, - "read_group": "\"@RG\\tID:null.test.test_L1\\tPU:test_L1\\tSM:test_test\\tLB:test\\tDS:null\\tPL:ILLUMINA\"", - "data_type": "fastq", "size": 1 }, [ @@ -28,10 +28,10 @@ "sample": "test", "sex": "XX", "status": 0, + "lane": "test_L1", "id": "test-test_L1", + "data_type": "fastq_gz", "num_lanes": 1, - "read_group": "\"@RG\\tID:null.test.test_L1\\tPU:test_L1\\tSM:test_test\\tLB:test\\tDS:null\\tPL:ILLUMINA\"", - "data_type": "fastq", "size": 1 }, [ @@ -42,6 +42,10 @@ ] } ], - "timestamp": "2023-10-16T14:12:54.640503" + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.09.0" + }, + "timestamp": "2024-10-04T10:25:14.620549" } -} \ No newline at end of file +} diff --git a/subworkflows/local/utils_nfcore_sarek_pipeline/main.nf b/subworkflows/local/utils_nfcore_sarek_pipeline/main.nf index 23415aed48..ce568284c7 100644 --- a/subworkflows/local/utils_nfcore_sarek_pipeline/main.nf +++ b/subworkflows/local/utils_nfcore_sarek_pipeline/main.nf @@ -8,31 +8,30 @@ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ -include { paramsSummaryMap } from 'plugin/nf-validation' -include { fromSamplesheet } from 'plugin/nf-validation' +include { SAMPLESHEET_TO_CHANNEL } from '../samplesheet_to_channel' include { UTILS_NEXTFLOW_PIPELINE } from '../../nf-core/utils_nextflow_pipeline' -include { UTILS_NFVALIDATION_PLUGIN } from '../../nf-core/utils_nfvalidation_plugin' include { UTILS_NFCORE_PIPELINE } from '../../nf-core/utils_nfcore_pipeline' +include { UTILS_NFSCHEMA_PLUGIN } from '../../nf-core/utils_nfschema_plugin' include { completionEmail } from '../../nf-core/utils_nfcore_pipeline' include { completionSummary } from '../../nf-core/utils_nfcore_pipeline' include { 
dashedLine } from '../../nf-core/utils_nfcore_pipeline' include { getWorkflowVersion } from '../../nf-core/utils_nfcore_pipeline' include { imNotification } from '../../nf-core/utils_nfcore_pipeline' include { logColours } from '../../nf-core/utils_nfcore_pipeline' +include { paramsSummaryMap } from 'plugin/nf-schema' +include { samplesheetToList } from 'plugin/nf-schema' include { workflowCitation } from '../../nf-core/utils_nfcore_pipeline' -include { SAMPLESHEET_TO_CHANNEL } from '../samplesheet_to_channel' /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SUBWORKFLOW TO INITIALISE PIPELINE -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ workflow PIPELINE_INITIALISATION { take: version // boolean: Display version and exit - help // boolean: Display help text validate_params // boolean: Boolean whether to validate parameters against the schema at runtime monochrome_logs // boolean: Do not use coloured log outputs nextflow_cli_args // array: List of positional nextflow CLI args @@ -56,16 +55,10 @@ workflow PIPELINE_INITIALISATION { // // Validate parameters and generate parameter summary to stdout // - pre_help_text = nfCoreLogo(monochrome_logs) - post_help_text = '\n' + workflowCitation() + '\n' + dashedLine(monochrome_logs) - def String workflow_command = "nextflow run ${workflow.manifest.name} -profile --input samplesheet.csv --outdir " - UTILS_NFVALIDATION_PLUGIN ( - help, - workflow_command, - pre_help_text, - post_help_text, + UTILS_NFSCHEMA_PLUGIN ( + workflow, validate_params, - "nextflow_schema.json" + null ) // @@ -128,7 +121,9 @@ if (params.tools && (params.tools.split(',').contains('vep') || params.tools. 
params.input_restart = retrieveInput((!params.build_only_index && !params.input), params.step, params.outdir) - ch_from_samplesheet = params.build_only_index ? Channel.empty() : params.input ? Channel.fromSamplesheet("input") : Channel.fromSamplesheet("input_restart") + ch_from_samplesheet = params.build_only_index ? Channel.empty() : params.input ? + Channel.fromList(samplesheetToList(params.input, "$projectDir/assets/schema_input.json")) : + Channel.fromList(samplesheetToList(params.input_restart, "$projectDir/assets/schema_input.json")) SAMPLESHEET_TO_CHANNEL( ch_from_samplesheet, @@ -156,6 +151,8 @@ if (params.tools && (params.tools.split(',').contains('vep') || params.tools. params.seq_center, params.seq_platform, params.skip_tools, + params.snpeff_cache, + params.snpeff_db, params.step, params.tools, params.umi_read_structure, @@ -167,9 +164,9 @@ if (params.tools && (params.tools.split(',').contains('vep') || params.tools. } /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SUBWORKFLOW FOR PIPELINE COMPLETION -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ workflow PIPELINE_COMPLETION { @@ -184,19 +181,27 @@ workflow PIPELINE_COMPLETION { multiqc_report // string: Path to MultiQC report main: - summary_params = paramsSummaryMap(workflow, parameters_schema: "nextflow_schema.json") + def multiqc_report_list = multiqc_report.toList() + // // Completion email and summary // workflow.onComplete { if (email || email_on_fail) { - completionEmail(summary_params, email, email_on_fail, plaintext_email, outdir, monochrome_logs, multiqc_report.toList()) + completionEmail( + summary_params, + email, + email_on_fail, + plaintext_email, + outdir, + monochrome_logs, + multiqc_report_list.getVal() + ) } 
completionSummary(monochrome_logs) - if (hook_url) { imNotification(summary_params, hook_url) } @@ -208,9 +213,9 @@ workflow PIPELINE_COMPLETION { } /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ FUNCTIONS -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ // // Check and validate pipeline parameters @@ -226,7 +231,7 @@ def validateInputSamplesheet(input) { def (metas, fastqs) = input[1..2] // Check that multiple runs of the same sample are of the same datatype i.e. single-end / paired-end - def endedness_ok = metas.collect{ it.single_end }.unique().size == 1 + def endedness_ok = metas.collect{ meta -> meta.single_end }.unique().size == 1 if (!endedness_ok) { error("Please check input samplesheet -> Multiple runs of a sample must be of the same datatype i.e. single-end or paired-end: ${metas[0].id}") } @@ -247,7 +252,6 @@ def genomeExistsError() { error(error_string) } } - // // Generate methods description for MultiQC // @@ -289,8 +293,10 @@ def methodsDescriptionText(mqc_methods_yaml) { // Removing `https://doi.org/` to handle pipelines using DOIs vs DOI resolvers // Removing ` ` since the manifest.doi is a string and not a proper list def temp_doi_ref = "" - String[] manifest_doi = meta.manifest_map.doi.tokenize(",") - for (String doi_ref: manifest_doi) temp_doi_ref += "(doi: ${doi_ref.replace("https://doi.org/", "").replace(" ", "")}), " + def manifest_doi = meta.manifest_map.doi.tokenize(",") + manifest_doi.each { doi_ref -> + temp_doi_ref += "(doi: ${doi_ref.replace("https://doi.org/", "").replace(" ", "")}), " + } meta["doi_text"] = temp_doi_ref.substring(0, temp_doi_ref.length() - 2) } else meta["doi_text"] = "" meta["nodoi_text"] = meta.manifest_map.doi ? "" : "
<li>If available, make sure to update the text to include the Zenodo DOI of the pipeline version used.</li>" @@ -345,7 +351,7 @@ def retrieveInput(need_input, step, outdir) { def input = null if (!params.input && !params.build_only_index) { switch (step) { - case 'mapping': Nextflow.error("Can't start with step $step without samplesheet") + case 'mapping': error("Can't start $step step without samplesheet") break case 'markduplicates': log.warn("Using file ${outdir}/csv/mapped.csv"); input = outdir + "/csv/mapped.csv" @@ -364,7 +370,7 @@ def retrieveInput(need_input, step, outdir) { input = outdir + "/csv/variantcalled.csv" break default: log.warn("Please provide an input samplesheet to the pipeline e.g. '--input samplesheet.csv'") - Nextflow.error("Unknown step $step") + error("Unknown step $step") } } return input diff --git a/subworkflows/local/vcf_annotate_bcftools/main.nf b/subworkflows/local/vcf_annotate_bcftools/main.nf index e54c52aa7c..9f23a426ca 100644 --- a/subworkflows/local/vcf_annotate_bcftools/main.nf +++ b/subworkflows/local/vcf_annotate_bcftools/main.nf @@ -3,8 +3,7 @@ // Run BCFtools to annotate VCF files // -include { BCFTOOLS_ANNOTATE } from '../../../modules/nf-core/bcftools/annotate/main' -include { TABIX_TABIX } from '../../../modules/nf-core/tabix/tabix/main' +include { BCFTOOLS_ANNOTATE } from '../../../modules/nf-core/bcftools/annotate/main' workflow VCF_ANNOTATE_BCFTOOLS { take: @@ -17,15 +16,12 @@ workflow VCF_ANNOTATE_BCFTOOLS { main: ch_versions = Channel.empty() - BCFTOOLS_ANNOTATE(vcf, annotations, annotations_index, header_lines) - TABIX_TABIX(BCFTOOLS_ANNOTATE.out.vcf) - - ch_vcf_tbi = BCFTOOLS_ANNOTATE.out.vcf.join(TABIX_TABIX.out.tbi, failOnDuplicate: true, failOnMismatch: true) + BCFTOOLS_ANNOTATE(vcf.map{ meta, vcf -> [ meta, vcf, [] ] }, annotations, annotations_index, header_lines) + ch_vcf_tbi = BCFTOOLS_ANNOTATE.out.vcf.join(BCFTOOLS_ANNOTATE.out.tbi, failOnDuplicate: true, failOnMismatch: true) // Gather versions of all tools used ch_versions = ch_versions.mix(BCFTOOLS_ANNOTATE.out.versions) - ch_versions = 
ch_versions.mix(TABIX_TABIX.out.versions) emit: vcf_tbi = ch_vcf_tbi // channel: [ val(meta), vcf.gz, vcf.gz.tbi ] diff --git a/subworkflows/nf-core/bam_ngscheckmate/main.nf b/subworkflows/nf-core/bam_ngscheckmate/main.nf index 4dd106f327..629dbf25cd 100644 --- a/subworkflows/nf-core/bam_ngscheckmate/main.nf +++ b/subworkflows/nf-core/bam_ngscheckmate/main.nf @@ -46,4 +46,3 @@ workflow BAM_NGSCHECKMATE { versions = ch_versions // channel: [ versions.yml ] } - diff --git a/subworkflows/nf-core/utils_nextflow_pipeline/main.nf b/subworkflows/nf-core/utils_nextflow_pipeline/main.nf index ac31f28f66..0fcbf7b3f2 100644 --- a/subworkflows/nf-core/utils_nextflow_pipeline/main.nf +++ b/subworkflows/nf-core/utils_nextflow_pipeline/main.nf @@ -2,18 +2,13 @@ // Subworkflow with functionality that may be useful for any Nextflow pipeline // -import org.yaml.snakeyaml.Yaml -import groovy.json.JsonOutput -import nextflow.extension.FilesEx - /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SUBWORKFLOW DEFINITION -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ workflow UTILS_NEXTFLOW_PIPELINE { - take: print_version // boolean: print version dump_parameters // boolean: dump parameters @@ -26,7 +21,7 @@ workflow UTILS_NEXTFLOW_PIPELINE { // Print workflow version and exit on --version // if (print_version) { - log.info "${workflow.manifest.name} ${getWorkflowVersion()}" + log.info("${workflow.manifest.name} ${getWorkflowVersion()}") System.exit(0) } @@ -49,16 +44,16 @@ workflow UTILS_NEXTFLOW_PIPELINE { } /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ FUNCTIONS 
-======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ // // Generate version string // def getWorkflowVersion() { - String version_string = "" + def version_string = "" as String if (workflow.manifest.version) { def prefix_v = workflow.manifest.version[0] != 'v' ? 'v' : '' version_string += "${prefix_v}${workflow.manifest.version}" @@ -76,13 +71,13 @@ def getWorkflowVersion() { // Dump pipeline parameters to a JSON file // def dumpParametersToJSON(outdir) { - def timestamp = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss') - def filename = "params_${timestamp}.json" - def temp_pf = new File(workflow.launchDir.toString(), ".${filename}") - def jsonStr = JsonOutput.toJson(params) - temp_pf.text = JsonOutput.prettyPrint(jsonStr) + def timestamp = new java.util.Date().format('yyyy-MM-dd_HH-mm-ss') + def filename = "params_${timestamp}.json" + def temp_pf = new File(workflow.launchDir.toString(), ".${filename}") + def jsonStr = groovy.json.JsonOutput.toJson(params) + temp_pf.text = groovy.json.JsonOutput.prettyPrint(jsonStr) - FilesEx.copyTo(temp_pf.toPath(), "${outdir}/pipeline_info/params_${timestamp}.json") + nextflow.extension.FilesEx.copyTo(temp_pf.toPath(), "${outdir}/pipeline_info/params_${timestamp}.json") temp_pf.delete() } @@ -90,37 +85,40 @@ def dumpParametersToJSON(outdir) { // When running with -profile conda, warn if channels have not been set-up appropriately // def checkCondaChannels() { - Yaml parser = new Yaml() + def parser = new org.yaml.snakeyaml.Yaml() def channels = [] try { def config = parser.load("conda config --show channels".execute().text) channels = config.channels - } catch(NullPointerException | IOException e) { - log.warn "Could not verify conda channel configuration." 
- return + } + catch (NullPointerException e) { + log.warn("Could not verify conda channel configuration.") + return null + } + catch (IOException e) { + log.warn("Could not verify conda channel configuration.") + return null } // Check that all channels are present // This channel list is ordered by required channel priority. - def required_channels_in_order = ['conda-forge', 'bioconda', 'defaults'] + def required_channels_in_order = ['conda-forge', 'bioconda'] def channels_missing = ((required_channels_in_order as Set) - (channels as Set)) as Boolean // Check that they are in the right order - def channel_priority_violation = false - def n = required_channels_in_order.size() - for (int i = 0; i < n - 1; i++) { - channel_priority_violation |= !(channels.indexOf(required_channels_in_order[i]) < channels.indexOf(required_channels_in_order[i+1])) - } + def channel_priority_violation = required_channels_in_order != channels.findAll { ch -> ch in required_channels_in_order } if (channels_missing | channel_priority_violation) { - log.warn "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n" + - " There is a problem with your Conda configuration!\n\n" + - " You will need to set-up the conda-forge and bioconda channels correctly.\n" + - " Please refer to https://bioconda.github.io/\n" + - " The observed channel order is \n" + - " ${channels}\n" + - " but the following channel order is required:\n" + - " ${required_channels_in_order}\n" + - "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" + log.warn """\ + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + There is a problem with your Conda configuration! + You will need to set-up the conda-forge and bioconda channels correctly. 
+ Please refer to https://bioconda.github.io/ + The observed channel order is + ${channels} + but the following channel order is required: + ${required_channels_in_order} + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" + """.stripIndent(true) } } diff --git a/subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config b/subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config index d0a926bf6d..a09572e5bb 100644 --- a/subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config +++ b/subworkflows/nf-core/utils_nextflow_pipeline/tests/nextflow.config @@ -3,7 +3,7 @@ manifest { author = """nf-core""" homePage = 'https://127.0.0.1' description = """Dummy pipeline""" - nextflowVersion = '!>=23.04.0' + nextflowVersion = '!>=23.04.0' version = '9.9.9' doi = 'https://doi.org/10.5281/zenodo.5070524' } diff --git a/subworkflows/nf-core/utils_nfcore_pipeline/main.nf b/subworkflows/nf-core/utils_nfcore_pipeline/main.nf index 14558c3927..5cb7bafef3 100644 --- a/subworkflows/nf-core/utils_nfcore_pipeline/main.nf +++ b/subworkflows/nf-core/utils_nfcore_pipeline/main.nf @@ -2,17 +2,13 @@ // Subworkflow with utility functions specific to the nf-core pipeline template // -import org.yaml.snakeyaml.Yaml -import nextflow.extension.FilesEx - /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SUBWORKFLOW DEFINITION -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ workflow UTILS_NFCORE_PIPELINE { - take: nextflow_cli_args @@ -25,23 +21,20 @@ workflow UTILS_NFCORE_PIPELINE { } /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ FUNCTIONS 
-======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ // // Warn if a -profile or Nextflow config has not been provided to run the pipeline // def checkConfigProvided() { - valid_config = true + def valid_config = true as Boolean if (workflow.profile == 'standard' && workflow.configFiles.size() <= 1) { - log.warn "[$workflow.manifest.name] You are attempting to run the pipeline without any custom configuration!\n\n" + - "This will be dependent on your local compute environment but can be achieved via one or more of the following:\n" + - " (1) Using an existing pipeline profile e.g. `-profile docker` or `-profile singularity`\n" + - " (2) Using an existing nf-core/configs for your Institution e.g. `-profile crick` or `-profile uppmax`\n" + - " (3) Using your own local custom config e.g. `-c /path/to/your/custom.config`\n\n" + - "Please refer to the quick start section and usage docs for the pipeline.\n " + log.warn( + "[${workflow.manifest.name}] You are attempting to run the pipeline without any custom configuration!\n\n" + "This will be dependent on your local compute environment but can be achieved via one or more of the following:\n" + " (1) Using an existing pipeline profile e.g. `-profile docker` or `-profile singularity`\n" + " (2) Using an existing nf-core/configs for your Institution e.g. `-profile crick` or `-profile uppmax`\n" + " (3) Using your own local custom config e.g. 
`-c /path/to/your/custom.config`\n\n" + "Please refer to the quick start section and usage docs for the pipeline.\n " + ) valid_config = false } return valid_config @@ -52,12 +45,14 @@ def checkConfigProvided() { // def checkProfileProvided(nextflow_cli_args) { if (workflow.profile.endsWith(',')) { - error "The `-profile` option cannot end with a trailing comma, please remove it and re-run the pipeline!\n" + - "HINT: A common mistake is to provide multiple values separated by spaces e.g. `-profile test, docker`.\n" + error( + "The `-profile` option cannot end with a trailing comma, please remove it and re-run the pipeline!\n" + "HINT: A common mistake is to provide multiple values separated by spaces e.g. `-profile test, docker`.\n" + ) } if (nextflow_cli_args[0]) { - log.warn "nf-core pipelines do not accept positional arguments. The positional argument `${nextflow_cli_args[0]}` has been detected.\n" + - "HINT: A common mistake is to provide multiple values separated by spaces e.g. `-profile test, docker`.\n" + log.warn( + "nf-core pipelines do not accept positional arguments. The positional argument `${nextflow_cli_args[0]}` has been detected.\n" + "HINT: A common mistake is to provide multiple values separated by spaces e.g. 
`-profile test, docker`.\n" + ) } } @@ -66,25 +61,21 @@ def checkProfileProvided(nextflow_cli_args) { // def workflowCitation() { def temp_doi_ref = "" - String[] manifest_doi = workflow.manifest.doi.tokenize(",") - // Using a loop to handle multiple DOIs + def manifest_doi = workflow.manifest.doi.tokenize(",") + // Handling multiple DOIs // Removing `https://doi.org/` to handle pipelines using DOIs vs DOI resolvers // Removing ` ` since the manifest.doi is a string and not a proper list - for (String doi_ref: manifest_doi) temp_doi_ref += " https://doi.org/${doi_ref.replace('https://doi.org/', '').replace(' ', '')}\n" - return "If you use ${workflow.manifest.name} for your analysis please cite:\n\n" + - "* The pipeline\n" + - temp_doi_ref + "\n" + - "* The nf-core framework\n" + - " https://doi.org/10.1038/s41587-020-0439-x\n\n" + - "* Software dependencies\n" + - " https://github.com/${workflow.manifest.name}/blob/master/CITATIONS.md" + manifest_doi.each { doi_ref -> + temp_doi_ref += " https://doi.org/${doi_ref.replace('https://doi.org/', '').replace(' ', '')}\n" + } + return "If you use ${workflow.manifest.name} for your analysis please cite:\n\n" + "* The pipeline\n" + temp_doi_ref + "\n" + "* The nf-core framework\n" + " https://doi.org/10.1038/s41587-020-0439-x\n\n" + "* Software dependencies\n" + " https://github.com/${workflow.manifest.name}/blob/master/CITATIONS.md" } // // Generate workflow version string // def getWorkflowVersion() { - String version_string = "" + def version_string = "" as String if (workflow.manifest.version) { def prefix_v = workflow.manifest.version[0] != 'v' ? 
'v' : '' version_string += "${prefix_v}${workflow.manifest.version}" @@ -102,8 +93,8 @@ def getWorkflowVersion() { // Get software versions for pipeline // def processVersionsFromYAML(yaml_file) { - Yaml yaml = new Yaml() - versions = yaml.load(yaml_file).collectEntries { k, v -> [ k.tokenize(':')[-1], v ] } + def yaml = new org.yaml.snakeyaml.Yaml() + def versions = yaml.load(yaml_file).collectEntries { k, v -> [k.tokenize(':')[-1], v] } return yaml.dumpAsMap(versions).trim() } @@ -113,8 +104,8 @@ def processVersionsFromYAML(yaml_file) { def workflowVersionToYAML() { return """ Workflow: - $workflow.manifest.name: ${getWorkflowVersion()} - Nextflow: $workflow.nextflow.version + ${workflow.manifest.name}: ${getWorkflowVersion()} + Nextflow: ${workflow.nextflow.version} """.stripIndent().trim() } @@ -122,11 +113,7 @@ def workflowVersionToYAML() { // Get channel of software versions used in pipeline in YAML format // def softwareVersionsToYAML(ch_versions) { - return ch_versions - .unique() - .map { processVersionsFromYAML(it) } - .unique() - .mix(Channel.of(workflowVersionToYAML())) + return ch_versions.unique().map { version -> processVersionsFromYAML(version) }.unique().mix(Channel.of(workflowVersionToYAML())) } // @@ -134,25 +121,31 @@ def softwareVersionsToYAML(ch_versions) { // def paramsSummaryMultiqc(summary_params) { def summary_section = '' - for (group in summary_params.keySet()) { - def group_params = summary_params.get(group) // This gets the parameters of that particular group - if (group_params) { - summary_section += "

    <p style=\"font-size:110%\"><b>$group</b></p>\n" - summary_section += "    <dl class=\"dl-horizontal\">\n" - for (param in group_params.keySet()) { - summary_section += "        <dt>$param</dt><dd><samp>${group_params.get(param) ?: '<span style=\"color:#999999;\">N/A</a>'}</samp></dd>\n" + summary_params + .keySet() + .each { group -> + def group_params = summary_params.get(group) + // This gets the parameters of that particular group + if (group_params) { + summary_section += "    <p style=\"font-size:110%\"><b>${group}</b></p>\n" + summary_section += "    <dl class=\"dl-horizontal\">\n" + group_params + .keySet() + .sort() + .each { param -> + summary_section += "        <dt>${param}</dt><dd><samp>${group_params.get(param) ?: '<span style=\"color:#999999;\">N/A</a>'}</samp></dd>\n" + } + summary_section += "    </dl>\n" } - summary_section += "    </dl>
    \n" } - } - String yaml_file_text = "id: '${workflow.manifest.name.replace('/','-')}-summary'\n" - yaml_file_text += "description: ' - this information is collected when the pipeline is started.'\n" - yaml_file_text += "section_name: '${workflow.manifest.name} Workflow Summary'\n" - yaml_file_text += "section_href: 'https://github.com/${workflow.manifest.name}'\n" - yaml_file_text += "plot_type: 'html'\n" - yaml_file_text += "data: |\n" - yaml_file_text += "${summary_section}" + def yaml_file_text = "id: '${workflow.manifest.name.replace('/', '-')}-summary'\n" as String + yaml_file_text += "description: ' - this information is collected when the pipeline is started.'\n" + yaml_file_text += "section_name: '${workflow.manifest.name} Workflow Summary'\n" + yaml_file_text += "section_href: 'https://github.com/${workflow.manifest.name}'\n" + yaml_file_text += "plot_type: 'html'\n" + yaml_file_text += "data: |\n" + yaml_file_text += "${summary_section}" return yaml_file_text } @@ -161,7 +154,7 @@ def paramsSummaryMultiqc(summary_params) { // nf-core logo // def nfCoreLogo(monochrome_logs=true) { - Map colors = logColours(monochrome_logs) + def colors = logColours(monochrome_logs) as Map String.format( """\n ${dashedLine(monochrome_logs)} @@ -180,7 +173,7 @@ def nfCoreLogo(monochrome_logs=true) { // Return dashed line // def dashedLine(monochrome_logs=true) { - Map colors = logColours(monochrome_logs) + def colors = logColours(monochrome_logs) as Map return "-${colors.dim}----------------------------------------------------${colors.reset}-" } @@ -188,7 +181,7 @@ def dashedLine(monochrome_logs=true) { // ANSII colours used for terminal logging // def logColours(monochrome_logs=true) { - Map colorcodes = [:] + def colorcodes = [:] as Map // Reset / Meta colorcodes['reset'] = monochrome_logs ? '' : "\033[0m" @@ -200,54 +193,54 @@ def logColours(monochrome_logs=true) { colorcodes['hidden'] = monochrome_logs ? 
'' : "\033[8m" // Regular Colors - colorcodes['black'] = monochrome_logs ? '' : "\033[0;30m" - colorcodes['red'] = monochrome_logs ? '' : "\033[0;31m" - colorcodes['green'] = monochrome_logs ? '' : "\033[0;32m" - colorcodes['yellow'] = monochrome_logs ? '' : "\033[0;33m" - colorcodes['blue'] = monochrome_logs ? '' : "\033[0;34m" - colorcodes['purple'] = monochrome_logs ? '' : "\033[0;35m" - colorcodes['cyan'] = monochrome_logs ? '' : "\033[0;36m" - colorcodes['white'] = monochrome_logs ? '' : "\033[0;37m" + colorcodes['black'] = monochrome_logs ? '' : "\033[0;30m" + colorcodes['red'] = monochrome_logs ? '' : "\033[0;31m" + colorcodes['green'] = monochrome_logs ? '' : "\033[0;32m" + colorcodes['yellow'] = monochrome_logs ? '' : "\033[0;33m" + colorcodes['blue'] = monochrome_logs ? '' : "\033[0;34m" + colorcodes['purple'] = monochrome_logs ? '' : "\033[0;35m" + colorcodes['cyan'] = monochrome_logs ? '' : "\033[0;36m" + colorcodes['white'] = monochrome_logs ? '' : "\033[0;37m" // Bold - colorcodes['bblack'] = monochrome_logs ? '' : "\033[1;30m" - colorcodes['bred'] = monochrome_logs ? '' : "\033[1;31m" - colorcodes['bgreen'] = monochrome_logs ? '' : "\033[1;32m" - colorcodes['byellow'] = monochrome_logs ? '' : "\033[1;33m" - colorcodes['bblue'] = monochrome_logs ? '' : "\033[1;34m" - colorcodes['bpurple'] = monochrome_logs ? '' : "\033[1;35m" - colorcodes['bcyan'] = monochrome_logs ? '' : "\033[1;36m" - colorcodes['bwhite'] = monochrome_logs ? '' : "\033[1;37m" + colorcodes['bblack'] = monochrome_logs ? '' : "\033[1;30m" + colorcodes['bred'] = monochrome_logs ? '' : "\033[1;31m" + colorcodes['bgreen'] = monochrome_logs ? '' : "\033[1;32m" + colorcodes['byellow'] = monochrome_logs ? '' : "\033[1;33m" + colorcodes['bblue'] = monochrome_logs ? '' : "\033[1;34m" + colorcodes['bpurple'] = monochrome_logs ? '' : "\033[1;35m" + colorcodes['bcyan'] = monochrome_logs ? '' : "\033[1;36m" + colorcodes['bwhite'] = monochrome_logs ? 
'' : "\033[1;37m" // Underline - colorcodes['ublack'] = monochrome_logs ? '' : "\033[4;30m" - colorcodes['ured'] = monochrome_logs ? '' : "\033[4;31m" - colorcodes['ugreen'] = monochrome_logs ? '' : "\033[4;32m" - colorcodes['uyellow'] = monochrome_logs ? '' : "\033[4;33m" - colorcodes['ublue'] = monochrome_logs ? '' : "\033[4;34m" - colorcodes['upurple'] = monochrome_logs ? '' : "\033[4;35m" - colorcodes['ucyan'] = monochrome_logs ? '' : "\033[4;36m" - colorcodes['uwhite'] = monochrome_logs ? '' : "\033[4;37m" + colorcodes['ublack'] = monochrome_logs ? '' : "\033[4;30m" + colorcodes['ured'] = monochrome_logs ? '' : "\033[4;31m" + colorcodes['ugreen'] = monochrome_logs ? '' : "\033[4;32m" + colorcodes['uyellow'] = monochrome_logs ? '' : "\033[4;33m" + colorcodes['ublue'] = monochrome_logs ? '' : "\033[4;34m" + colorcodes['upurple'] = monochrome_logs ? '' : "\033[4;35m" + colorcodes['ucyan'] = monochrome_logs ? '' : "\033[4;36m" + colorcodes['uwhite'] = monochrome_logs ? '' : "\033[4;37m" // High Intensity - colorcodes['iblack'] = monochrome_logs ? '' : "\033[0;90m" - colorcodes['ired'] = monochrome_logs ? '' : "\033[0;91m" - colorcodes['igreen'] = monochrome_logs ? '' : "\033[0;92m" - colorcodes['iyellow'] = monochrome_logs ? '' : "\033[0;93m" - colorcodes['iblue'] = monochrome_logs ? '' : "\033[0;94m" - colorcodes['ipurple'] = monochrome_logs ? '' : "\033[0;95m" - colorcodes['icyan'] = monochrome_logs ? '' : "\033[0;96m" - colorcodes['iwhite'] = monochrome_logs ? '' : "\033[0;97m" + colorcodes['iblack'] = monochrome_logs ? '' : "\033[0;90m" + colorcodes['ired'] = monochrome_logs ? '' : "\033[0;91m" + colorcodes['igreen'] = monochrome_logs ? '' : "\033[0;92m" + colorcodes['iyellow'] = monochrome_logs ? '' : "\033[0;93m" + colorcodes['iblue'] = monochrome_logs ? '' : "\033[0;94m" + colorcodes['ipurple'] = monochrome_logs ? '' : "\033[0;95m" + colorcodes['icyan'] = monochrome_logs ? '' : "\033[0;96m" + colorcodes['iwhite'] = monochrome_logs ? 
'' : "\033[0;97m" // Bold High Intensity - colorcodes['biblack'] = monochrome_logs ? '' : "\033[1;90m" - colorcodes['bired'] = monochrome_logs ? '' : "\033[1;91m" - colorcodes['bigreen'] = monochrome_logs ? '' : "\033[1;92m" - colorcodes['biyellow'] = monochrome_logs ? '' : "\033[1;93m" - colorcodes['biblue'] = monochrome_logs ? '' : "\033[1;94m" - colorcodes['bipurple'] = monochrome_logs ? '' : "\033[1;95m" - colorcodes['bicyan'] = monochrome_logs ? '' : "\033[1;96m" - colorcodes['biwhite'] = monochrome_logs ? '' : "\033[1;97m" + colorcodes['biblack'] = monochrome_logs ? '' : "\033[1;90m" + colorcodes['bired'] = monochrome_logs ? '' : "\033[1;91m" + colorcodes['bigreen'] = monochrome_logs ? '' : "\033[1;92m" + colorcodes['biyellow'] = monochrome_logs ? '' : "\033[1;93m" + colorcodes['biblue'] = monochrome_logs ? '' : "\033[1;94m" + colorcodes['bipurple'] = monochrome_logs ? '' : "\033[1;95m" + colorcodes['bicyan'] = monochrome_logs ? '' : "\033[1;96m" + colorcodes['biwhite'] = monochrome_logs ? 
'' : "\033[1;97m" return colorcodes } @@ -262,14 +255,15 @@ def attachMultiqcReport(multiqc_report) { mqc_report = multiqc_report.getVal() if (mqc_report.getClass() == ArrayList && mqc_report.size() >= 1) { if (mqc_report.size() > 1) { - log.warn "[$workflow.manifest.name] Found multiple reports from process 'MULTIQC', will use only one" + log.warn("[${workflow.manifest.name}] Found multiple reports from process 'MULTIQC', will use only one") } mqc_report = mqc_report[0] } } - } catch (all) { + } + catch (Exception all) { if (multiqc_report) { - log.warn "[$workflow.manifest.name] Could not attach MultiQC report to summary email" + log.warn("[${workflow.manifest.name}] Could not attach MultiQC report to summary email") } } return mqc_report @@ -281,26 +275,35 @@ def attachMultiqcReport(multiqc_report) { def completionEmail(summary_params, email, email_on_fail, plaintext_email, outdir, monochrome_logs=true, multiqc_report=null) { // Set up the e-mail variables - def subject = "[$workflow.manifest.name] Successful: $workflow.runName" + def subject = "[${workflow.manifest.name}] Successful: ${workflow.runName}" if (!workflow.success) { - subject = "[$workflow.manifest.name] FAILED: $workflow.runName" + subject = "[${workflow.manifest.name}] FAILED: ${workflow.runName}" } def summary = [:] - for (group in summary_params.keySet()) { - summary << summary_params[group] - } + summary_params + .keySet() + .sort() + .each { group -> + summary << summary_params[group] + } def misc_fields = [:] misc_fields['Date Started'] = workflow.start misc_fields['Date Completed'] = workflow.complete misc_fields['Pipeline script file path'] = workflow.scriptFile misc_fields['Pipeline script hash ID'] = workflow.scriptId - if (workflow.repository) misc_fields['Pipeline repository Git URL'] = workflow.repository - if (workflow.commitId) misc_fields['Pipeline repository Git Commit'] = workflow.commitId - if (workflow.revision) misc_fields['Pipeline Git branch/tag'] = workflow.revision - 
misc_fields['Nextflow Version'] = workflow.nextflow.version - misc_fields['Nextflow Build'] = workflow.nextflow.build + if (workflow.repository) { + misc_fields['Pipeline repository Git URL'] = workflow.repository + } + if (workflow.commitId) { + misc_fields['Pipeline repository Git Commit'] = workflow.commitId + } + if (workflow.revision) { + misc_fields['Pipeline Git branch/tag'] = workflow.revision + } + misc_fields['Nextflow Version'] = workflow.nextflow.version + misc_fields['Nextflow Build'] = workflow.nextflow.build misc_fields['Nextflow Compile Timestamp'] = workflow.nextflow.timestamp def email_fields = [:] @@ -338,39 +341,41 @@ def completionEmail(summary_params, email, email_on_fail, plaintext_email, outdi // Render the sendmail template def max_multiqc_email_size = (params.containsKey('max_multiqc_email_size') ? params.max_multiqc_email_size : 0) as nextflow.util.MemoryUnit - def smail_fields = [ email: email_address, subject: subject, email_txt: email_txt, email_html: email_html, projectDir: "${workflow.projectDir}", mqcFile: mqc_report, mqcMaxSize: max_multiqc_email_size.toBytes() ] + def smail_fields = [email: email_address, subject: subject, email_txt: email_txt, email_html: email_html, projectDir: "${workflow.projectDir}", mqcFile: mqc_report, mqcMaxSize: max_multiqc_email_size.toBytes()] def sf = new File("${workflow.projectDir}/assets/sendmail_template.txt") def sendmail_template = engine.createTemplate(sf).make(smail_fields) def sendmail_html = sendmail_template.toString() // Send the HTML e-mail - Map colors = logColours(monochrome_logs) + def colors = logColours(monochrome_logs) as Map if (email_address) { try { - if (plaintext_email) { throw GroovyException('Send plaintext e-mail, not HTML') } + if (plaintext_email) { +new org.codehaus.groovy.GroovyException('Send plaintext e-mail, not HTML') } // Try to send HTML e-mail using sendmail def sendmail_tf = new File(workflow.launchDir.toString(), ".sendmail_tmp.html") sendmail_tf.withWriter { w 
-> w << sendmail_html } - [ 'sendmail', '-t' ].execute() << sendmail_html - log.info "-${colors.purple}[$workflow.manifest.name]${colors.green} Sent summary e-mail to $email_address (sendmail)-" - } catch (all) { + ['sendmail', '-t'].execute() << sendmail_html + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.green} Sent summary e-mail to ${email_address} (sendmail)-") + } + catch (Exception all) { // Catch failures and try with plaintext - def mail_cmd = [ 'mail', '-s', subject, '--content-type=text/html', email_address ] + def mail_cmd = ['mail', '-s', subject, '--content-type=text/html', email_address] mail_cmd.execute() << email_html - log.info "-${colors.purple}[$workflow.manifest.name]${colors.green} Sent summary e-mail to $email_address (mail)-" + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.green} Sent summary e-mail to ${email_address} (mail)-") } } // Write summary e-mail HTML to a file def output_hf = new File(workflow.launchDir.toString(), ".pipeline_report.html") output_hf.withWriter { w -> w << email_html } - FilesEx.copyTo(output_hf.toPath(), "${outdir}/pipeline_info/pipeline_report.html"); + nextflow.extension.FilesEx.copyTo(output_hf.toPath(), "${outdir}/pipeline_info/pipeline_report.html") output_hf.delete() // Write summary e-mail TXT to a file def output_tf = new File(workflow.launchDir.toString(), ".pipeline_report.txt") output_tf.withWriter { w -> w << email_txt } - FilesEx.copyTo(output_tf.toPath(), "${outdir}/pipeline_info/pipeline_report.txt"); + nextflow.extension.FilesEx.copyTo(output_tf.toPath(), "${outdir}/pipeline_info/pipeline_report.txt") output_tf.delete() } @@ -378,15 +383,17 @@ def completionEmail(summary_params, email, email_on_fail, plaintext_email, outdi // Print pipeline summary on completion // def completionSummary(monochrome_logs=true) { - Map colors = logColours(monochrome_logs) + def colors = logColours(monochrome_logs) as Map if (workflow.success) { if (workflow.stats.ignoredCount == 0) { 
- log.info "-${colors.purple}[$workflow.manifest.name]${colors.green} Pipeline completed successfully${colors.reset}-" - } else { - log.info "-${colors.purple}[$workflow.manifest.name]${colors.yellow} Pipeline completed successfully, but with errored process(es) ${colors.reset}-" + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.green} Pipeline completed successfully${colors.reset}-") + } + else { + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.yellow} Pipeline completed successfully, but with errored process(es) ${colors.reset}-") } - } else { - log.info "-${colors.purple}[$workflow.manifest.name]${colors.red} Pipeline completed with errors${colors.reset}-" + } + else { + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.red} Pipeline completed with errors${colors.reset}-") } } @@ -395,21 +402,30 @@ def completionSummary(monochrome_logs=true) { // def imNotification(summary_params, hook_url) { def summary = [:] - for (group in summary_params.keySet()) { - summary << summary_params[group] - } + summary_params + .keySet() + .sort() + .each { group -> + summary << summary_params[group] + } def misc_fields = [:] - misc_fields['start'] = workflow.start - misc_fields['complete'] = workflow.complete - misc_fields['scriptfile'] = workflow.scriptFile - misc_fields['scriptid'] = workflow.scriptId - if (workflow.repository) misc_fields['repository'] = workflow.repository - if (workflow.commitId) misc_fields['commitid'] = workflow.commitId - if (workflow.revision) misc_fields['revision'] = workflow.revision - misc_fields['nxf_version'] = workflow.nextflow.version - misc_fields['nxf_build'] = workflow.nextflow.build - misc_fields['nxf_timestamp'] = workflow.nextflow.timestamp + misc_fields['start'] = workflow.start + misc_fields['complete'] = workflow.complete + misc_fields['scriptfile'] = workflow.scriptFile + misc_fields['scriptid'] = workflow.scriptId + if (workflow.repository) { + misc_fields['repository'] = workflow.repository 
+ } + if (workflow.commitId) { + misc_fields['commitid'] = workflow.commitId + } + if (workflow.revision) { + misc_fields['revision'] = workflow.revision + } + misc_fields['nxf_version'] = workflow.nextflow.version + misc_fields['nxf_build'] = workflow.nextflow.build + misc_fields['nxf_timestamp'] = workflow.nextflow.timestamp def msg_fields = [:] msg_fields['version'] = getWorkflowVersion() @@ -434,13 +450,13 @@ def imNotification(summary_params, hook_url) { def json_message = json_template.toString() // POST - def post = new URL(hook_url).openConnection(); + def post = new URL(hook_url).openConnection() post.setRequestMethod("POST") post.setDoOutput(true) post.setRequestProperty("Content-Type", "application/json") - post.getOutputStream().write(json_message.getBytes("UTF-8")); - def postRC = post.getResponseCode(); - if (! postRC.equals(200)) { - log.warn(post.getErrorStream().getText()); + post.getOutputStream().write(json_message.getBytes("UTF-8")) + def postRC = post.getResponseCode() + if (!postRC.equals(200)) { + log.warn(post.getErrorStream().getText()) } } diff --git a/subworkflows/nf-core/utils_nfschema_plugin/main.nf b/subworkflows/nf-core/utils_nfschema_plugin/main.nf new file mode 100644 index 0000000000..4994303ea0 --- /dev/null +++ b/subworkflows/nf-core/utils_nfschema_plugin/main.nf @@ -0,0 +1,46 @@ +// +// Subworkflow that uses the nf-schema plugin to validate parameters and render the parameter summary +// + +include { paramsSummaryLog } from 'plugin/nf-schema' +include { validateParameters } from 'plugin/nf-schema' + +workflow UTILS_NFSCHEMA_PLUGIN { + + take: + input_workflow // workflow: the workflow object used by nf-schema to get metadata from the workflow + validate_params // boolean: validate the parameters + parameters_schema // string: path to the parameters JSON schema. 
+ // this has to be the same as the schema given to `validation.parametersSchema` + // when this input is empty it will automatically use the configured schema or + // "${projectDir}/nextflow_schema.json" as default. This input should not be empty + // for meta pipelines + + main: + + // + // Print parameter summary to stdout. This will display the parameters + // that differ from the default given in the JSON schema + // + if(parameters_schema) { + log.info paramsSummaryLog(input_workflow, parameters_schema:parameters_schema) + } else { + log.info paramsSummaryLog(input_workflow) + } + + // + // Validate the parameters using nextflow_schema.json or the schema + // given via the validation.parametersSchema configuration option + // + if(validate_params) { + if(parameters_schema) { + validateParameters(parameters_schema:parameters_schema) + } else { + validateParameters() + } + } + + emit: + dummy_emit = true +} + diff --git a/subworkflows/nf-core/utils_nfschema_plugin/meta.yml b/subworkflows/nf-core/utils_nfschema_plugin/meta.yml new file mode 100644 index 0000000000..f7d9f02885 --- /dev/null +++ b/subworkflows/nf-core/utils_nfschema_plugin/meta.yml @@ -0,0 +1,35 @@ +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/subworkflows/yaml-schema.json +name: "utils_nfschema_plugin" +description: Run nf-schema to validate parameters and create a summary of changed parameters +keywords: + - validation + - JSON schema + - plugin + - parameters + - summary +components: [] +input: + - input_workflow: + type: object + description: | + The workflow object of the used pipeline. + This object contains meta data used to create the params summary log + - validate_params: + type: boolean + description: Validate the parameters and error if invalid. + - parameters_schema: + type: string + description: | + Path to the parameters JSON schema. + This has to be the same as the schema given to the `validation.parametersSchema` config + option. 
When this input is empty it will automatically use the configured schema or + "${projectDir}/nextflow_schema.json" as default. The schema should not be given in this way + for meta pipelines. +output: + - dummy_emit: + type: boolean + description: Dummy emit to make nf-core subworkflows lint happy +authors: + - "@nvnieuwk" +maintainers: + - "@nvnieuwk" diff --git a/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test b/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test new file mode 100644 index 0000000000..8fb3016487 --- /dev/null +++ b/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test @@ -0,0 +1,117 @@ +nextflow_workflow { + + name "Test Subworkflow UTILS_NFSCHEMA_PLUGIN" + script "../main.nf" + workflow "UTILS_NFSCHEMA_PLUGIN" + + tag "subworkflows" + tag "subworkflows_nfcore" + tag "subworkflows/utils_nfschema_plugin" + tag "plugin/nf-schema" + + config "./nextflow.config" + + test("Should run nothing") { + + when { + + params { + test_data = '' + } + + workflow { + """ + validate_params = false + input[0] = workflow + input[1] = validate_params + input[2] = "" + """ + } + } + + then { + assertAll( + { assert workflow.success } + ) + } + } + + test("Should validate params") { + + when { + + params { + test_data = '' + outdir = null + } + + workflow { + """ + validate_params = true + input[0] = workflow + input[1] = validate_params + input[2] = "" + """ + } + } + + then { + assertAll( + { assert workflow.failed }, + { assert workflow.stdout.any { it.contains('ERROR ~ Validation of pipeline parameters failed!') } } + ) + } + } + + test("Should run nothing - custom schema") { + + when { + + params { + test_data = '' + } + + workflow { + """ + validate_params = false + input[0] = workflow + input[1] = validate_params + input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + """ + } + } + + then { + assertAll( + { assert workflow.success } + ) + } + } + + test("Should validate params - custom 
schema") { + + when { + + params { + test_data = '' + outdir = null + } + + workflow { + """ + validate_params = true + input[0] = workflow + input[1] = validate_params + input[2] = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + """ + } + } + + then { + assertAll( + { assert workflow.failed }, + { assert workflow.stdout.any { it.contains('ERROR ~ Validation of pipeline parameters failed!') } } + ) + } + } +} diff --git a/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config b/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config new file mode 100644 index 0000000000..0907ac58f0 --- /dev/null +++ b/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow.config @@ -0,0 +1,8 @@ +plugins { + id "nf-schema@2.1.0" +} + +validation { + parametersSchema = "${projectDir}/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json" + monochromeLogs = true +} \ No newline at end of file diff --git a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/nextflow_schema.json b/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json similarity index 95% rename from subworkflows/nf-core/utils_nfvalidation_plugin/tests/nextflow_schema.json rename to subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json index 7626c1c93e..331e0d2f44 100644 --- a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/nextflow_schema.json +++ b/subworkflows/nf-core/utils_nfschema_plugin/tests/nextflow_schema.json @@ -1,10 +1,10 @@ { - "$schema": "http://json-schema.org/draft-07/schema", + "$schema": "https://json-schema.org/draft/2020-12/schema", "$id": "https://raw.githubusercontent.com/./master/nextflow_schema.json", "title": ". 
pipeline parameters", "description": "", "type": "object", - "definitions": { + "$defs": { "input_output_options": { "title": "Input/output options", "type": "object", @@ -87,10 +87,10 @@ }, "allOf": [ { - "$ref": "#/definitions/input_output_options" + "$ref": "#/$defs/input_output_options" }, { - "$ref": "#/definitions/generic_options" + "$ref": "#/$defs/generic_options" } ] } diff --git a/subworkflows/nf-core/utils_nfvalidation_plugin/main.nf b/subworkflows/nf-core/utils_nfvalidation_plugin/main.nf deleted file mode 100644 index 2585b65d1b..0000000000 --- a/subworkflows/nf-core/utils_nfvalidation_plugin/main.nf +++ /dev/null @@ -1,62 +0,0 @@ -// -// Subworkflow that uses the nf-validation plugin to render help text and parameter summary -// - -/* -======================================================================================== - IMPORT NF-VALIDATION PLUGIN -======================================================================================== -*/ - -include { paramsHelp } from 'plugin/nf-validation' -include { paramsSummaryLog } from 'plugin/nf-validation' -include { validateParameters } from 'plugin/nf-validation' - -/* -======================================================================================== - SUBWORKFLOW DEFINITION -======================================================================================== -*/ - -workflow UTILS_NFVALIDATION_PLUGIN { - - take: - print_help // boolean: print help - workflow_command // string: default commmand used to run pipeline - pre_help_text // string: string to be printed before help text and summary log - post_help_text // string: string to be printed after help text and summary log - validate_params // boolean: validate parameters - schema_filename // path: JSON schema file, null to use default value - - main: - - log.debug "Using schema file: ${schema_filename}" - - // Default values for strings - pre_help_text = pre_help_text ?: '' - post_help_text = post_help_text ?: '' - workflow_command = 
workflow_command ?: '' - - // - // Print help message if needed - // - if (print_help) { - log.info pre_help_text + paramsHelp(workflow_command, parameters_schema: schema_filename) + post_help_text - System.exit(0) - } - - // - // Print parameter summary to stdout - // - log.info pre_help_text + paramsSummaryLog(workflow, parameters_schema: schema_filename) + post_help_text - - // - // Validate parameters relative to the parameter JSON schema - // - if (validate_params){ - validateParameters(parameters_schema: schema_filename) - } - - emit: - dummy_emit = true -} diff --git a/subworkflows/nf-core/utils_nfvalidation_plugin/meta.yml b/subworkflows/nf-core/utils_nfvalidation_plugin/meta.yml deleted file mode 100644 index 3d4a6b04f5..0000000000 --- a/subworkflows/nf-core/utils_nfvalidation_plugin/meta.yml +++ /dev/null @@ -1,44 +0,0 @@ -# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/subworkflows/yaml-schema.json -name: "UTILS_NFVALIDATION_PLUGIN" -description: Use nf-validation to initiate and validate a pipeline -keywords: - - utility - - pipeline - - initialise - - validation -components: [] -input: - - print_help: - type: boolean - description: | - Print help message and exit - - workflow_command: - type: string - description: | - The command to run the workflow e.g. "nextflow run main.nf" - - pre_help_text: - type: string - description: | - Text to print before the help message - - post_help_text: - type: string - description: | - Text to print after the help message - - validate_params: - type: boolean - description: | - Validate the parameters and error if invalid. - - schema_filename: - type: string - description: | - The filename of the schema to validate against. 
-output: - - dummy_emit: - type: boolean - description: | - Dummy emit to make nf-core subworkflows lint happy -authors: - - "@adamrtalbot" -maintainers: - - "@adamrtalbot" - - "@maxulysse" diff --git a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/main.nf.test b/subworkflows/nf-core/utils_nfvalidation_plugin/tests/main.nf.test deleted file mode 100644 index 5784a33f2f..0000000000 --- a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/main.nf.test +++ /dev/null @@ -1,200 +0,0 @@ -nextflow_workflow { - - name "Test Workflow UTILS_NFVALIDATION_PLUGIN" - script "../main.nf" - workflow "UTILS_NFVALIDATION_PLUGIN" - tag "subworkflows" - tag "subworkflows_nfcore" - tag "plugin/nf-validation" - tag "'plugin/nf-validation'" - tag "utils_nfvalidation_plugin" - tag "subworkflows/utils_nfvalidation_plugin" - - test("Should run nothing") { - - when { - - params { - monochrome_logs = true - test_data = '' - } - - workflow { - """ - help = false - workflow_command = null - pre_help_text = null - post_help_text = null - validate_params = false - schema_filename = "$moduleTestDir/nextflow_schema.json" - - input[0] = help - input[1] = workflow_command - input[2] = pre_help_text - input[3] = post_help_text - input[4] = validate_params - input[5] = schema_filename - """ - } - } - - then { - assertAll( - { assert workflow.success } - ) - } - } - - test("Should run help") { - - - when { - - params { - monochrome_logs = true - test_data = '' - } - workflow { - """ - help = true - workflow_command = null - pre_help_text = null - post_help_text = null - validate_params = false - schema_filename = "$moduleTestDir/nextflow_schema.json" - - input[0] = help - input[1] = workflow_command - input[2] = pre_help_text - input[3] = post_help_text - input[4] = validate_params - input[5] = schema_filename - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert workflow.exitStatus == 0 }, - { assert workflow.stdout.any { it.contains('Input/output options') } }, - { 
assert workflow.stdout.any { it.contains('--outdir') } } - ) - } - } - - test("Should run help with command") { - - when { - - params { - monochrome_logs = true - test_data = '' - } - workflow { - """ - help = true - workflow_command = "nextflow run noorg/doesntexist" - pre_help_text = null - post_help_text = null - validate_params = false - schema_filename = "$moduleTestDir/nextflow_schema.json" - - input[0] = help - input[1] = workflow_command - input[2] = pre_help_text - input[3] = post_help_text - input[4] = validate_params - input[5] = schema_filename - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert workflow.exitStatus == 0 }, - { assert workflow.stdout.any { it.contains('nextflow run noorg/doesntexist') } }, - { assert workflow.stdout.any { it.contains('Input/output options') } }, - { assert workflow.stdout.any { it.contains('--outdir') } } - ) - } - } - - test("Should run help with extra text") { - - - when { - - params { - monochrome_logs = true - test_data = '' - } - workflow { - """ - help = true - workflow_command = "nextflow run noorg/doesntexist" - pre_help_text = "pre-help-text" - post_help_text = "post-help-text" - validate_params = false - schema_filename = "$moduleTestDir/nextflow_schema.json" - - input[0] = help - input[1] = workflow_command - input[2] = pre_help_text - input[3] = post_help_text - input[4] = validate_params - input[5] = schema_filename - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert workflow.exitStatus == 0 }, - { assert workflow.stdout.any { it.contains('pre-help-text') } }, - { assert workflow.stdout.any { it.contains('nextflow run noorg/doesntexist') } }, - { assert workflow.stdout.any { it.contains('Input/output options') } }, - { assert workflow.stdout.any { it.contains('--outdir') } }, - { assert workflow.stdout.any { it.contains('post-help-text') } } - ) - } - } - - test("Should validate params") { - - when { - - params { - monochrome_logs = true - test_data = 
'' - outdir = 1 - } - workflow { - """ - help = false - workflow_command = null - pre_help_text = null - post_help_text = null - validate_params = true - schema_filename = "$moduleTestDir/nextflow_schema.json" - - input[0] = help - input[1] = workflow_command - input[2] = pre_help_text - input[3] = post_help_text - input[4] = validate_params - input[5] = schema_filename - """ - } - } - - then { - assertAll( - { assert workflow.failed }, - { assert workflow.stdout.any { it.contains('ERROR ~ ERROR: Validation of pipeline parameters failed!') } } - ) - } - } -} diff --git a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/tags.yml b/subworkflows/nf-core/utils_nfvalidation_plugin/tests/tags.yml deleted file mode 100644 index 60b1cfff49..0000000000 --- a/subworkflows/nf-core/utils_nfvalidation_plugin/tests/tags.yml +++ /dev/null @@ -1,2 +0,0 @@ -subworkflows/utils_nfvalidation_plugin: - - subworkflows/nf-core/utils_nfvalidation_plugin/** diff --git a/tests/.nftignore b/tests/.nftignore new file mode 100644 index 0000000000..c8aa4b3433 --- /dev/null +++ b/tests/.nftignore @@ -0,0 +1,28 @@ +.DS_Store +annotation/**/*.vcf.{gz,gz.tbi} +csv/*.csv +multiqc/multiqc_data/gatk-base-recalibrator-reported-empirical-plot.txt +multiqc/multiqc_data/gatk_base_recalibrator.txt +multiqc/multiqc_data/multiqc.log +multiqc/multiqc_data/multiqc_data.json +multiqc/multiqc_data/multiqc_general_stats.txt +multiqc/multiqc_data/multiqc_picard_dups.txt +multiqc/multiqc_data/multiqc_software_versions.txt +multiqc/multiqc_data/multiqc_sources.txt +multiqc/multiqc_data/picard_deduplication.txt +multiqc/multiqc_data/vcftools_tstv_by_qual.txt +multiqc/multiqc_plots/{svg,pdf,png}/*.{svg,pdf,png} +multiqc/multiqc_report.html +no_intervals.{bed,bed.gz,bed.gz.tbi} +pipeline_info/*.{html,json,txt,yml} +preprocessing/**/*.{md,recal,sorted}.{bam,bam.bai,cram,cram.crai,table} +reference/dragmap/hash_table.{cfg,cfg.bin} +reference/dragmap/hash_table_stats.txt +reports/EnsemblVEP/*/*.ann.summary.html 
+reports/fastqc/**/*_fastqc.{html,zip} +reports/markduplicates/**/*.md.cram.metrics +reports/snpeff/*/*_snpEff.csv +reports/snpeff/*/snpEff_summary.html +reports/vcftools/**/*.TsTv.qual +variant_calling/**/*.vcf.{gz,gz.tbi} +variant_calling/controlfreec/*/config.txt diff --git a/tests/aligner-bwa-mem.nf.test b/tests/aligner-bwa-mem.nf.test new file mode 100644 index 0000000000..c3c719e3a8 --- /dev/null +++ b/tests/aligner-bwa-mem.nf.test @@ -0,0 +1,81 @@ +nextflow_pipeline { + + name "Test pipeline" + script "../main.nf" + tag "pipeline" + tag "pipeline_sarek" + + test("Run with profile test | --aligner bwa-mem --save_reference | skip QC/recal/md") { + + when { + params { + aligner = 'bwa-mem' + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + save_reference = true + skip_tools = 'baserecalibrator,fastqc,markduplicates,mosdepth,multiqc,samtools' + tools = '' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable 
contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --aligner bwa-mem --save_reference --build_only_index") { + + when { + params { + aligner = 'bwa-mem' + build_only_index = true + input = false + outdir = "$outputDir" + save_reference = true + skip_tools = 'multiqc' + tools = '' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path + ).match() } + ) + } + } +} diff --git a/tests/aligner-bwa-mem.nf.test.snap b/tests/aligner-bwa-mem.nf.test.snap new file mode 100644 index 0000000000..9744bcceef --- /dev/null +++ b/tests/aligner-bwa-mem.nf.test.snap @@ -0,0 +1,113 @@ +{ + "Run with profile test | --aligner bwa-mem --save_reference | skip QC/recal/md": { + "content": [ + 10, + { + "BAM_TO_CRAM_MAPPING": { + "samtools": 1.21 + }, + "BWAMEM1_MEM": { + "bwa": "0.7.18-r1243-dirty", + "samtools": 1.2 + }, + "INDEX_MERGE_BAM": { + "samtools": 1.21 + }, + "MERGE_BAM": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/mapped.csv", + "pipeline_info", + 
"pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/mapped", + "preprocessing/mapped/test", + "preprocessing/mapped/test/test.sorted.cram", + "preprocessing/mapped/test/test.sorted.cram.crai", + "reference", + "reference/bwa", + "reference/bwa/genome.amb", + "reference/bwa/genome.ann", + "reference/bwa/genome.bwt", + "reference/bwa/genome.pac", + "reference/bwa/genome.sa", + "reference/intervals", + "reference/intervals/chr22_1-40001.bed", + "reference/intervals/chr22_1-40001.bed.gz", + "reference/intervals/genome.bed", + "reference/intervals/genome.bed.gz" + ], + [ + "genome.amb:md5,1891c1de381b3a96d4e72f590fde20c1", + "genome.ann:md5,2df4aa2d7580639fa0fcdbcad5e2e969", + "genome.bwt:md5,815eded87e4cb6b0f1daab5c4d6e30af", + "genome.pac:md5,8569fbdb2c98c6fb16dfa73d8eacb070", + "genome.sa:md5,e7cff62b919448a3a3d0fe4aaf427594", + "chr22_1-40001.bed:md5,87a15eb9c2ff20ccd5cd8735a28708f7", + "chr22_1-40001.bed.gz:md5,87a15eb9c2ff20ccd5cd8735a28708f7", + "genome.bed:md5,a87dc7d20ebca626f65cc16ff6c97a3e", + "genome.bed.gz:md5,a87dc7d20ebca626f65cc16ff6c97a3e" + ], + [ + [ + "test.sorted.cram", + "5534c350547fd253f0f2b9450362bed" + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.09.0" + }, + "timestamp": "2024-10-08T11:22:13.394114" + }, + "Run with profile test | --aligner bwa-mem --save_reference --build_only_index": { + "content": [ + 5, + { + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reference/bwa", + "reference/bwa/genome.amb", + "reference/bwa/genome.ann", + "reference/bwa/genome.bwt", + "reference/bwa/genome.pac", + "reference/bwa/genome.sa", + "reference/intervals", + "reference/intervals/chr22_1-40001.bed", + "reference/intervals/chr22_1-40001.bed.gz", + "reference/intervals/genome.bed", + "reference/intervals/genome.bed.gz" + ], + [ + "genome.amb:md5,1891c1de381b3a96d4e72f590fde20c1", + 
"genome.ann:md5,2df4aa2d7580639fa0fcdbcad5e2e969", + "genome.bwt:md5,815eded87e4cb6b0f1daab5c4d6e30af", + "genome.pac:md5,8569fbdb2c98c6fb16dfa73d8eacb070", + "genome.sa:md5,e7cff62b919448a3a3d0fe4aaf427594", + "chr22_1-40001.bed:md5,87a15eb9c2ff20ccd5cd8735a28708f7", + "chr22_1-40001.bed.gz:md5,87a15eb9c2ff20ccd5cd8735a28708f7", + "genome.bed:md5,a87dc7d20ebca626f65cc16ff6c97a3e", + "genome.bed.gz:md5,a87dc7d20ebca626f65cc16ff6c97a3e" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.09.0" + }, + "timestamp": "2024-10-08T11:23:05.203586" + } +} diff --git a/tests/aligner-bwa-mem2.nf.test b/tests/aligner-bwa-mem2.nf.test new file mode 100644 index 0000000000..204ec2711f --- /dev/null +++ b/tests/aligner-bwa-mem2.nf.test @@ -0,0 +1,81 @@ +nextflow_pipeline { + + name "Test pipeline" + script "../main.nf" + tag "pipeline" + tag "pipeline_sarek" + + test("Run with profile test | --aligner bwa-mem2 --save_reference | skip QC/recal/md") { + + when { + params { + aligner = 'bwa-mem2' + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + save_reference = true + skip_tools = 'baserecalibrator,fastqc,markduplicates,mosdepth,multiqc,samtools' + tools = '' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml 
file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --aligner bwa-mem2 --save_reference --build_only_index") { + + when { + params { + aligner = 'bwa-mem2' + build_only_index = true + input = false + outdir = "$outputDir" + save_reference = true + skip_tools = 'multiqc' + tools = '' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path + ).match() } + ) + } + } +} diff --git a/tests/aligner-bwa-mem2.nf.test.snap b/tests/aligner-bwa-mem2.nf.test.snap new file mode 100644 index 0000000000..5ab17312a0 --- /dev/null +++ b/tests/aligner-bwa-mem2.nf.test.snap @@ -0,0 +1,113 @@ +{ + "Run with profile test | --aligner bwa-mem2 --save_reference --build_only_index": { + "content": [ + 5, + { + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + 
[ + "csv", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reference/bwamem2", + "reference/bwamem2/genome.fasta.0123", + "reference/bwamem2/genome.fasta.amb", + "reference/bwamem2/genome.fasta.ann", + "reference/bwamem2/genome.fasta.bwt.2bit.64", + "reference/bwamem2/genome.fasta.pac", + "reference/intervals", + "reference/intervals/chr22_1-40001.bed", + "reference/intervals/chr22_1-40001.bed.gz", + "reference/intervals/genome.bed", + "reference/intervals/genome.bed.gz" + ], + [ + "genome.fasta.0123:md5,d73300d44f733bcdb7c988fc3ff3e3e9", + "genome.fasta.amb:md5,1891c1de381b3a96d4e72f590fde20c1", + "genome.fasta.ann:md5,2df4aa2d7580639fa0fcdbcad5e2e969", + "genome.fasta.bwt.2bit.64:md5,cd4bdf496eab05228a50c45ee43c1ed0", + "genome.fasta.pac:md5,8569fbdb2c98c6fb16dfa73d8eacb070", + "chr22_1-40001.bed:md5,87a15eb9c2ff20ccd5cd8735a28708f7", + "chr22_1-40001.bed.gz:md5,87a15eb9c2ff20ccd5cd8735a28708f7", + "genome.bed:md5,a87dc7d20ebca626f65cc16ff6c97a3e", + "genome.bed.gz:md5,a87dc7d20ebca626f65cc16ff6c97a3e" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.09.0" + }, + "timestamp": "2024-10-08T11:26:09.886689" + }, + "Run with profile test | --aligner bwa-mem2 --save_reference | skip QC/recal/md": { + "content": [ + 10, + { + "BAM_TO_CRAM_MAPPING": { + "samtools": 1.21 + }, + "BWAMEM2_MEM": { + "bwamem2": "2.2.1", + "samtools": "1.19.2" + }, + "INDEX_MERGE_BAM": { + "samtools": 1.21 + }, + "MERGE_BAM": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/mapped.csv", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/mapped", + "preprocessing/mapped/test", + "preprocessing/mapped/test/test.sorted.cram", + "preprocessing/mapped/test/test.sorted.cram.crai", + "reference", + "reference/bwamem2", + "reference/bwamem2/genome.fasta.0123", + "reference/bwamem2/genome.fasta.amb", + 
"reference/bwamem2/genome.fasta.ann", + "reference/bwamem2/genome.fasta.bwt.2bit.64", + "reference/bwamem2/genome.fasta.pac", + "reference/intervals", + "reference/intervals/chr22_1-40001.bed", + "reference/intervals/chr22_1-40001.bed.gz", + "reference/intervals/genome.bed", + "reference/intervals/genome.bed.gz" + ], + [ + "genome.fasta.0123:md5,d73300d44f733bcdb7c988fc3ff3e3e9", + "genome.fasta.amb:md5,1891c1de381b3a96d4e72f590fde20c1", + "genome.fasta.ann:md5,2df4aa2d7580639fa0fcdbcad5e2e969", + "genome.fasta.bwt.2bit.64:md5,cd4bdf496eab05228a50c45ee43c1ed0", + "genome.fasta.pac:md5,8569fbdb2c98c6fb16dfa73d8eacb070", + "chr22_1-40001.bed:md5,87a15eb9c2ff20ccd5cd8735a28708f7", + "chr22_1-40001.bed.gz:md5,87a15eb9c2ff20ccd5cd8735a28708f7", + "genome.bed:md5,a87dc7d20ebca626f65cc16ff6c97a3e", + "genome.bed.gz:md5,a87dc7d20ebca626f65cc16ff6c97a3e" + ], + [ + [ + "test.sorted.cram", + "5534c350547fd253f0f2b9450362bed" + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.09.0" + }, + "timestamp": "2024-10-08T11:25:00.196657" + } +} diff --git a/tests/aligner-dragmap.nf.test b/tests/aligner-dragmap.nf.test new file mode 100644 index 0000000000..ab11bcf143 --- /dev/null +++ b/tests/aligner-dragmap.nf.test @@ -0,0 +1,81 @@ +nextflow_pipeline { + + name "Test pipeline" + script "../main.nf" + tag "pipeline" + tag "pipeline_sarek" + + test("Run with profile test | --aligner dragmap --save_reference | skip QC/recal/md") { + + when { + params { + aligner = 'dragmap' + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + save_reference = true + skip_tools = 'baserecalibrator,fastqc,markduplicates,mosdepth,multiqc,samtools' + tools = '' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in 
${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --aligner dragmap --save_reference --build_only_index") { + + when { + params { + aligner = 'dragmap' + build_only_index = true + input = false + outdir = "$outputDir" + save_reference = true + skip_tools = 'multiqc' + tools = '' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a 
relative path + stable_name, + // All files with stable contents + stable_path + ).match() } + ) + } + } +} diff --git a/tests/aligner-dragmap.nf.test.snap b/tests/aligner-dragmap.nf.test.snap new file mode 100644 index 0000000000..3302761f2d --- /dev/null +++ b/tests/aligner-dragmap.nf.test.snap @@ -0,0 +1,120 @@ +{ + "Run with profile test | --aligner dragmap --save_reference --build_only_index": { + "content": [ + 5, + { + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reference/dragmap", + "reference/dragmap/hash_table.cfg", + "reference/dragmap/hash_table.cfg.bin", + "reference/dragmap/hash_table.cmp", + "reference/dragmap/hash_table_stats.txt", + "reference/dragmap/ref_index.bin", + "reference/dragmap/reference.bin", + "reference/dragmap/repeat_mask.bin", + "reference/dragmap/str_table.bin", + "reference/intervals", + "reference/intervals/chr22_1-40001.bed", + "reference/intervals/chr22_1-40001.bed.gz", + "reference/intervals/genome.bed", + "reference/intervals/genome.bed.gz" + ], + [ + "hash_table.cmp:md5,1caab4ffc89f81ace615a2e813295cf4", + "ref_index.bin:md5,dbb5c7d26b974e0ac338024fe4535044", + "reference.bin:md5,be67b80ee48aa96b383fd72f1ccfefea", + "repeat_mask.bin:md5,294939f1f80aa7f4a70b9b537e4c0f21", + "str_table.bin:md5,45f7818c4a10fdeed04db7a34b5f9ff1", + "chr22_1-40001.bed:md5,87a15eb9c2ff20ccd5cd8735a28708f7", + "chr22_1-40001.bed.gz:md5,87a15eb9c2ff20ccd5cd8735a28708f7", + "genome.bed:md5,a87dc7d20ebca626f65cc16ff6c97a3e", + "genome.bed.gz:md5,a87dc7d20ebca626f65cc16ff6c97a3e" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.09.0" + }, + "timestamp": "2024-10-08T11:28:56.135088" + }, + "Run with profile test | --aligner dragmap --save_reference | skip QC/recal/md": { + "content": [ + 10, + { + "BAM_TO_CRAM_MAPPING": { + "samtools": 1.21 + }, + "DRAGMAP_ALIGN": { + "dragmap": "1.2.1", + "samtools": "1.19.2", + "pigz": "2.3.4" 
+ }, + "INDEX_MERGE_BAM": { + "samtools": 1.21 + }, + "MERGE_BAM": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/mapped.csv", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/mapped", + "preprocessing/mapped/test", + "preprocessing/mapped/test/test.sorted.cram", + "preprocessing/mapped/test/test.sorted.cram.crai", + "reference", + "reference/dragmap", + "reference/dragmap/hash_table.cfg", + "reference/dragmap/hash_table.cfg.bin", + "reference/dragmap/hash_table.cmp", + "reference/dragmap/hash_table_stats.txt", + "reference/dragmap/ref_index.bin", + "reference/dragmap/reference.bin", + "reference/dragmap/repeat_mask.bin", + "reference/dragmap/str_table.bin", + "reference/intervals", + "reference/intervals/chr22_1-40001.bed", + "reference/intervals/chr22_1-40001.bed.gz", + "reference/intervals/genome.bed", + "reference/intervals/genome.bed.gz" + ], + [ + "hash_table.cmp:md5,1caab4ffc89f81ace615a2e813295cf4", + "ref_index.bin:md5,dbb5c7d26b974e0ac338024fe4535044", + "reference.bin:md5,be67b80ee48aa96b383fd72f1ccfefea", + "repeat_mask.bin:md5,294939f1f80aa7f4a70b9b537e4c0f21", + "str_table.bin:md5,45f7818c4a10fdeed04db7a34b5f9ff1", + "chr22_1-40001.bed:md5,87a15eb9c2ff20ccd5cd8735a28708f7", + "chr22_1-40001.bed.gz:md5,87a15eb9c2ff20ccd5cd8735a28708f7", + "genome.bed:md5,a87dc7d20ebca626f65cc16ff6c97a3e", + "genome.bed.gz:md5,a87dc7d20ebca626f65cc16ff6c97a3e" + ], + [ + [ + "test.sorted.cram", + "7088dc71e5390aec0dd9d778f4568297" + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.09.0" + }, + "timestamp": "2024-10-08T11:28:12.076906" + } +} diff --git a/tests/alignment_from_everything.nf.test b/tests/alignment_from_everything.nf.test new file mode 100644 index 0000000000..92f9ef56c8 --- /dev/null +++ b/tests/alignment_from_everything.nf.test @@ -0,0 +1,46 @@ +nextflow_pipeline { + + name "Test pipeline" + script "../main.nf" + tag 
"pipeline" + tag "pipeline_sarek" + profile "alignment_from_everything" + + test("Run with profile test,alignment_from_everything | --save_mapped --save_output_as_bam") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + igenomes_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data' + outdir = "$outputDir" + save_mapped = true + save_output_as_bam = true + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } +} diff --git a/tests/alignment_from_everything.nf.test.snap b/tests/alignment_from_everything.nf.test.snap new file mode 100644 index 0000000000..b82b7fbb11 --- /dev/null +++ b/tests/alignment_from_everything.nf.test.snap @@ -0,0 +1,335 @@ +{ + "Run with profile test,alignment_from_everything | --save_mapped 
--save_output_as_bam": { + "content": [ + 27, + { + "BCFTOOLS_STATS": { + "bcftools": 1.2 + }, + "BWAMEM1_MEM": { + "bwa": "0.7.18-r1243-dirty", + "samtools": 1.2 + }, + "CRAM_TO_BAM": { + "samtools": 1.21 + }, + "CRAM_TO_BAM_RECAL": { + "samtools": 1.21 + }, + "FASTQC": { + "fastqc": "0.12.1" + }, + "GATK4_APPLYBQSR": { + "gatk4": "4.5.0.0" + }, + "GATK4_BASERECALIBRATOR": { + "gatk4": "4.5.0.0" + }, + "GATK4_MARKDUPLICATES": { + "gatk4": "4.5.0.0", + "samtools": "1.19.2" + }, + "INDEX_CRAM": { + "samtools": 1.21 + }, + "INDEX_MERGE_BAM": { + "samtools": 1.21 + }, + "MERGE_BAM": { + "samtools": 1.21 + }, + "MOSDEPTH": { + "mosdepth": "0.3.8" + }, + "SAMTOOLS_STATS": { + "samtools": 1.21 + }, + "STRELKA_SINGLE": { + "strelka": "2.9.10" + }, + "VCFTOOLS_TSTV_COUNT": { + "vcftools": "0.1.16" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/mapped.csv", + "csv/markduplicates.csv", + "csv/markduplicates_no_table.csv", + "csv/recalibrated.csv", + "csv/variantcalled.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/bcftools_stats_indel-lengths.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Indels.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_SNP.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transitions.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transversions.txt", + "multiqc/multiqc_data/fastqc-status-check-heatmap.txt", + "multiqc/multiqc_data/fastqc_adapter_content_plot.txt", + "multiqc/multiqc_data/fastqc_per_base_n_content_plot.txt", + "multiqc/multiqc_data/fastqc_per_base_sequence_quality_plot.txt", + "multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Counts.txt", + "multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Percentages.txt", + "multiqc/multiqc_data/fastqc_per_sequence_quality_scores_plot.txt", + "multiqc/multiqc_data/fastqc_sequence_counts_plot.txt", + "multiqc/multiqc_data/fastqc_sequence_duplication_levels_plot.txt", + 
"multiqc/multiqc_data/fastqc_sequence_length_distribution_plot.txt", + "multiqc/multiqc_data/fastqc_top_overrepresented_sequences_table.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-reported-empirical-plot.txt", + "multiqc/multiqc_data/gatk_base_recalibrator.txt", + "multiqc/multiqc_data/mosdepth-coverage-per-contig-single.txt", + "multiqc/multiqc_data/mosdepth-cumcoverage-dist-id.txt", + "multiqc/multiqc_data/mosdepth_cov_dist.txt", + "multiqc/multiqc_data/mosdepth_cumcov_dist.txt", + "multiqc/multiqc_data/mosdepth_perchrom.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_bcftools_stats.txt", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_fastqc.txt", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_picard_dups.txt", + "multiqc/multiqc_data/multiqc_samtools_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/picard_deduplication.txt", + "multiqc/multiqc_data/picard_histogram.txt", + "multiqc/multiqc_data/picard_histogram_1.txt", + "multiqc/multiqc_data/picard_histogram_2.txt", + "multiqc/multiqc_data/samtools-stats-dp.txt", + "multiqc/multiqc_data/samtools_alignment_plot.txt", + "multiqc/multiqc_data/vcftools_tstv_by_count.txt", + "multiqc/multiqc_data/vcftools_tstv_by_qual.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_indel-lengths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Indels.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_SNP.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transitions.pdf", + 
"multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transversions.pdf", + "multiqc/multiqc_plots/pdf/fastqc-status-check-heatmap.pdf", + "multiqc/multiqc_plots/pdf/fastqc_adapter_content_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_base_n_content_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_base_sequence_quality_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_sequence_gc_content_plot_Counts.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_sequence_gc_content_plot_Percentages.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_sequence_quality_scores_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_sequence_counts_plot-cnt.pdf", + "multiqc/multiqc_plots/pdf/fastqc_sequence_counts_plot-pct.pdf", + "multiqc/multiqc_plots/pdf/fastqc_sequence_duplication_levels_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_sequence_length_distribution_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_top_overrepresented_sequences_table.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-reported-empirical-plot.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-coverage-per-contig-single-cnt.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-coverage-per-contig-single-pct.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-cumcoverage-dist-id.pdf", + "multiqc/multiqc_plots/pdf/picard_deduplication-cnt.pdf", + "multiqc/multiqc_plots/pdf/picard_deduplication-pct.pdf", + "multiqc/multiqc_plots/pdf/samtools-stats-dp.pdf", + "multiqc/multiqc_plots/pdf/samtools_alignment_plot-cnt.pdf", + "multiqc/multiqc_plots/pdf/samtools_alignment_plot-pct.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_count.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_qual.pdf", + "multiqc/multiqc_plots/png", + 
"multiqc/multiqc_plots/png/bcftools_stats_indel-lengths.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Indels.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_SNP.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transitions.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transversions.png", + "multiqc/multiqc_plots/png/fastqc-status-check-heatmap.png", + "multiqc/multiqc_plots/png/fastqc_adapter_content_plot.png", + "multiqc/multiqc_plots/png/fastqc_per_base_n_content_plot.png", + "multiqc/multiqc_plots/png/fastqc_per_base_sequence_quality_plot.png", + "multiqc/multiqc_plots/png/fastqc_per_sequence_gc_content_plot_Counts.png", + "multiqc/multiqc_plots/png/fastqc_per_sequence_gc_content_plot_Percentages.png", + "multiqc/multiqc_plots/png/fastqc_per_sequence_quality_scores_plot.png", + "multiqc/multiqc_plots/png/fastqc_sequence_counts_plot-cnt.png", + "multiqc/multiqc_plots/png/fastqc_sequence_counts_plot-pct.png", + "multiqc/multiqc_plots/png/fastqc_sequence_duplication_levels_plot.png", + "multiqc/multiqc_plots/png/fastqc_sequence_length_distribution_plot.png", + "multiqc/multiqc_plots/png/fastqc_top_overrepresented_sequences_table.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-reported-empirical-plot.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/mosdepth-coverage-per-contig-single-cnt.png", + "multiqc/multiqc_plots/png/mosdepth-coverage-per-contig-single-pct.png", + "multiqc/multiqc_plots/png/mosdepth-cumcoverage-dist-id.png", + "multiqc/multiqc_plots/png/picard_deduplication-cnt.png", + "multiqc/multiqc_plots/png/picard_deduplication-pct.png", + "multiqc/multiqc_plots/png/samtools-stats-dp.png", + "multiqc/multiqc_plots/png/samtools_alignment_plot-cnt.png", + 
"multiqc/multiqc_plots/png/samtools_alignment_plot-pct.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_count.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_qual.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/bcftools_stats_indel-lengths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Indels.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_SNP.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transitions.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transversions.svg", + "multiqc/multiqc_plots/svg/fastqc-status-check-heatmap.svg", + "multiqc/multiqc_plots/svg/fastqc_adapter_content_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_per_base_n_content_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_per_base_sequence_quality_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_per_sequence_gc_content_plot_Counts.svg", + "multiqc/multiqc_plots/svg/fastqc_per_sequence_gc_content_plot_Percentages.svg", + "multiqc/multiqc_plots/svg/fastqc_per_sequence_quality_scores_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_sequence_counts_plot-cnt.svg", + "multiqc/multiqc_plots/svg/fastqc_sequence_counts_plot-pct.svg", + "multiqc/multiqc_plots/svg/fastqc_sequence_duplication_levels_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_sequence_length_distribution_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_top_overrepresented_sequences_table.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-reported-empirical-plot.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/mosdepth-coverage-per-contig-single-cnt.svg", + "multiqc/multiqc_plots/svg/mosdepth-coverage-per-contig-single-pct.svg", + "multiqc/multiqc_plots/svg/mosdepth-cumcoverage-dist-id.svg", + 
"multiqc/multiqc_plots/svg/picard_deduplication-cnt.svg", + "multiqc/multiqc_plots/svg/picard_deduplication-pct.svg", + "multiqc/multiqc_plots/svg/samtools-stats-dp.svg", + "multiqc/multiqc_plots/svg/samtools_alignment_plot-cnt.svg", + "multiqc/multiqc_plots/svg/samtools_alignment_plot-pct.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_count.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_qual.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/mapped", + "preprocessing/mapped/test", + "preprocessing/mapped/test/test.sorted.bam", + "preprocessing/mapped/test/test.sorted.bam.bai", + "preprocessing/markduplicates", + "preprocessing/markduplicates/test", + "preprocessing/markduplicates/test/test.md.bam", + "preprocessing/markduplicates/test/test.md.bam.bai", + "preprocessing/recal_table", + "preprocessing/recal_table/test", + "preprocessing/recal_table/test/test.recal.table", + "preprocessing/recalibrated", + "preprocessing/recalibrated/test", + "preprocessing/recalibrated/test/test.recal.bam", + "preprocessing/recalibrated/test/test.recal.bam.bai", + "reference", + "reports", + "reports/bcftools", + "reports/bcftools/strelka", + "reports/bcftools/strelka/test", + "reports/bcftools/strelka/test/test.strelka.variants.bcftools_stats.txt", + "reports/fastqc", + "reports/fastqc/test-test_L1", + "reports/fastqc/test-test_L1/test-test_L1_1_fastqc.html", + "reports/fastqc/test-test_L1/test-test_L1_1_fastqc.zip", + "reports/fastqc/test-test_L1/test-test_L1_2_fastqc.html", + "reports/fastqc/test-test_L1/test-test_L1_2_fastqc.zip", + "reports/fastqc/test-test_L2", + "reports/fastqc/test-test_L2/test-test_L2_1_fastqc.html", + "reports/fastqc/test-test_L2/test-test_L2_1_fastqc.zip", + "reports/fastqc/test-test_L2/test-test_L2_2_fastqc.html", + "reports/fastqc/test-test_L2/test-test_L2_2_fastqc.zip", + "reports/markduplicates", + "reports/markduplicates/test", + 
"reports/markduplicates/test/test.md.cram.metrics", + "reports/mosdepth", + "reports/mosdepth/test", + "reports/mosdepth/test/test.md.mosdepth.global.dist.txt", + "reports/mosdepth/test/test.md.mosdepth.region.dist.txt", + "reports/mosdepth/test/test.md.mosdepth.summary.txt", + "reports/mosdepth/test/test.md.regions.bed.gz", + "reports/mosdepth/test/test.md.regions.bed.gz.csi", + "reports/mosdepth/test/test.recal.mosdepth.global.dist.txt", + "reports/mosdepth/test/test.recal.mosdepth.region.dist.txt", + "reports/mosdepth/test/test.recal.mosdepth.summary.txt", + "reports/mosdepth/test/test.recal.regions.bed.gz", + "reports/mosdepth/test/test.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/test", + "reports/samtools/test/test.md.cram.stats", + "reports/samtools/test/test.recal.cram.stats", + "reports/vcftools", + "reports/vcftools/strelka", + "reports/vcftools/strelka/test", + "reports/vcftools/strelka/test/test.strelka.variants.FILTER.summary", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.count", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.qual", + "variant_calling", + "variant_calling/strelka", + "variant_calling/strelka/test", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz.tbi", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz.tbi" + ], + [ + "bcftools_stats_indel-lengths.txt:md5,deccb75341ca46a6f09658f7fd9e348b", + "bcftools_stats_vqc_Count_Indels.txt:md5,7b2a64880b653ccf0400ed9073e290dd", + "bcftools_stats_vqc_Count_SNP.txt:md5,72e934e0e8ed9b9712105bbd66dd9ffd", + "bcftools_stats_vqc_Count_Transitions.txt:md5,72e934e0e8ed9b9712105bbd66dd9ffd", + "bcftools_stats_vqc_Count_Transversions.txt:md5,72e934e0e8ed9b9712105bbd66dd9ffd", + "fastqc-status-check-heatmap.txt:md5,a020b9689ddeb4abec16b4854fe452f1", + 
"fastqc_adapter_content_plot.txt:md5,2e1b72be741319e7fadbbb39d7e5b37d", + "fastqc_per_base_n_content_plot.txt:md5,ad3b971a6bb4e8ba6c844c8a03584eb8", + "fastqc_per_base_sequence_quality_plot.txt:md5,1bc03889d243a944253ac637d81ae10c", + "fastqc_per_sequence_gc_content_plot_Counts.txt:md5,2c42d140ce06c08dad2b58f397c23239", + "fastqc_per_sequence_gc_content_plot_Percentages.txt:md5,59e22821d350bfb97c37ffd9088f5ad9", + "fastqc_per_sequence_quality_scores_plot.txt:md5,f33615cc98bb6225f39545a415fa7c0f", + "fastqc_sequence_counts_plot.txt:md5,7f0f19a58e8e54e792a751fd04a9ae13", + "fastqc_sequence_duplication_levels_plot.txt:md5,92b02e250ff78725deb9a10d510fcecc", + "fastqc_sequence_length_distribution_plot.txt:md5,fb04dce68ec566314125bc9438211b28", + "fastqc_top_overrepresented_sequences_table.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt:md5,20b2630a7400c9c279bf8c0c66341f7d", + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt:md5,b9b943621e50c7f3e75a37871667d5ed", + "mosdepth-coverage-per-contig-single.txt:md5,264db67a99d2c90ea7b075e33c201b77", + "mosdepth-cumcoverage-dist-id.txt:md5,5235e965da7ebe3bfebb24ffa88defff", + "mosdepth_cov_dist.txt:md5,8d0d7cb485a7bffb07da17b28f827120", + "mosdepth_cumcov_dist.txt:md5,8d0d7cb485a7bffb07da17b28f827120", + "mosdepth_perchrom.txt:md5,264db67a99d2c90ea7b075e33c201b77", + "multiqc_bcftools_stats.txt:md5,103ba59d44fc60e9308e64bbd0d0e504", + "multiqc_citations.txt:md5,ace4ca89138a5f1e2be289c157c00bd9", + "multiqc_fastqc.txt:md5,bde0d0bffa62228b33fb68b7e25b6ff8", + "multiqc_samtools_stats.txt:md5,0f1e4c6c497d9a952765f9f3068ea4b9", + "picard_histogram.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "picard_histogram_1.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "picard_histogram_2.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "samtools-stats-dp.txt:md5,c94f4d3ffa3f510552f90e173fdd9f9d", + 
"samtools_alignment_plot.txt:md5,717f499a3543e7ee4c7a8454bf80aeca", + "vcftools_tstv_by_count.txt:md5,50efc5214fe2c39f21efb66a710d2ed6", + "test.strelka.variants.bcftools_stats.txt:md5,86bd4938eed920d36f3f5937102a2967", + "test.md.mosdepth.global.dist.txt:md5,b61e1acee11a6ddf7ce3232a5948a6a0", + "test.md.mosdepth.region.dist.txt:md5,1a382f98d488d2ae3df83a0d87caafc1", + "test.md.mosdepth.summary.txt:md5,839108358878ada89e1eaddf6e0541ba", + "test.md.regions.bed.gz:md5,6fdaec99e739dc0f47fe55dd64dfe93e", + "test.md.regions.bed.gz.csi:md5,5f9c60279af78e3aeafc96a8c11fb35f", + "test.recal.mosdepth.global.dist.txt:md5,b61e1acee11a6ddf7ce3232a5948a6a0", + "test.recal.mosdepth.region.dist.txt:md5,1a382f98d488d2ae3df83a0d87caafc1", + "test.recal.mosdepth.summary.txt:md5,839108358878ada89e1eaddf6e0541ba", + "test.recal.regions.bed.gz:md5,6fdaec99e739dc0f47fe55dd64dfe93e", + "test.recal.regions.bed.gz.csi:md5,5f9c60279af78e3aeafc96a8c11fb35f", + "test.md.cram.stats:md5,7d19da3fc342afe0884c944f97a578b1", + "test.recal.cram.stats:md5,820d123e746d1abdc90fd8710828082e", + "test.strelka.variants.FILTER.summary:md5,ad417bc96d31223f61170987975d8128", + "test.strelka.variants.TsTv.count:md5,fa27f678965b7cba6a92efcd039f802a" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-11-09T21:51:21.998984" + } +} diff --git a/tests/alignment_to_fastq.nf.test b/tests/alignment_to_fastq.nf.test new file mode 100644 index 0000000000..a7288ef998 --- /dev/null +++ b/tests/alignment_to_fastq.nf.test @@ -0,0 +1,46 @@ +nextflow_pipeline { + + name "Test pipeline" + script "../main.nf" + tag "pipeline" + tag "pipeline_sarek" + profile "alignment_to_fastq" + + test("Run with profile test,alignment_to_fastq") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + igenomes_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data' + outdir = "$outputDir" + 
save_mapped = true + save_output_as_bam = true + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } +} diff --git a/tests/alignment_to_fastq.nf.test.snap b/tests/alignment_to_fastq.nf.test.snap new file mode 100644 index 0000000000..53a0512f82 --- /dev/null +++ b/tests/alignment_to_fastq.nf.test.snap @@ -0,0 +1,335 @@ +{ + "Run with profile test,alignment_to_fastq": { + "content": [ + 27, + { + "BCFTOOLS_STATS": { + "bcftools": 1.2 + }, + "BWAMEM1_MEM": { + "bwa": "0.7.18-r1243-dirty", + "samtools": 1.2 + }, + "CRAM_TO_BAM": { + "samtools": 1.21 + }, + "CRAM_TO_BAM_RECAL": { + "samtools": 1.21 + }, + "FASTQC": { + "fastqc": "0.12.1" + }, + "GATK4_APPLYBQSR": { + "gatk4": "4.5.0.0" + }, + "GATK4_BASERECALIBRATOR": { + "gatk4": "4.5.0.0" + }, + "GATK4_MARKDUPLICATES": { + "gatk4": "4.5.0.0", + "samtools": "1.19.2" + }, + 
"INDEX_CRAM": { + "samtools": 1.21 + }, + "INDEX_MERGE_BAM": { + "samtools": 1.21 + }, + "MERGE_BAM": { + "samtools": 1.21 + }, + "MOSDEPTH": { + "mosdepth": "0.3.8" + }, + "SAMTOOLS_STATS": { + "samtools": 1.21 + }, + "STRELKA_SINGLE": { + "strelka": "2.9.10" + }, + "VCFTOOLS_TSTV_COUNT": { + "vcftools": "0.1.16" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/mapped.csv", + "csv/markduplicates.csv", + "csv/markduplicates_no_table.csv", + "csv/recalibrated.csv", + "csv/variantcalled.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/bcftools_stats_indel-lengths.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Indels.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_SNP.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transitions.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transversions.txt", + "multiqc/multiqc_data/fastqc-status-check-heatmap.txt", + "multiqc/multiqc_data/fastqc_adapter_content_plot.txt", + "multiqc/multiqc_data/fastqc_per_base_n_content_plot.txt", + "multiqc/multiqc_data/fastqc_per_base_sequence_quality_plot.txt", + "multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Counts.txt", + "multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Percentages.txt", + "multiqc/multiqc_data/fastqc_per_sequence_quality_scores_plot.txt", + "multiqc/multiqc_data/fastqc_sequence_counts_plot.txt", + "multiqc/multiqc_data/fastqc_sequence_duplication_levels_plot.txt", + "multiqc/multiqc_data/fastqc_sequence_length_distribution_plot.txt", + "multiqc/multiqc_data/fastqc_top_overrepresented_sequences_table.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-reported-empirical-plot.txt", + "multiqc/multiqc_data/gatk_base_recalibrator.txt", + 
"multiqc/multiqc_data/mosdepth-coverage-per-contig-single.txt", + "multiqc/multiqc_data/mosdepth-cumcoverage-dist-id.txt", + "multiqc/multiqc_data/mosdepth_cov_dist.txt", + "multiqc/multiqc_data/mosdepth_cumcov_dist.txt", + "multiqc/multiqc_data/mosdepth_perchrom.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_bcftools_stats.txt", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_fastqc.txt", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_picard_dups.txt", + "multiqc/multiqc_data/multiqc_samtools_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/picard_deduplication.txt", + "multiqc/multiqc_data/picard_histogram.txt", + "multiqc/multiqc_data/picard_histogram_1.txt", + "multiqc/multiqc_data/picard_histogram_2.txt", + "multiqc/multiqc_data/samtools-stats-dp.txt", + "multiqc/multiqc_data/samtools_alignment_plot.txt", + "multiqc/multiqc_data/vcftools_tstv_by_count.txt", + "multiqc/multiqc_data/vcftools_tstv_by_qual.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_indel-lengths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Indels.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_SNP.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transitions.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transversions.pdf", + "multiqc/multiqc_plots/pdf/fastqc-status-check-heatmap.pdf", + "multiqc/multiqc_plots/pdf/fastqc_adapter_content_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_base_n_content_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_base_sequence_quality_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_sequence_gc_content_plot_Counts.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_sequence_gc_content_plot_Percentages.pdf", + 
"multiqc/multiqc_plots/pdf/fastqc_per_sequence_quality_scores_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_sequence_counts_plot-cnt.pdf", + "multiqc/multiqc_plots/pdf/fastqc_sequence_counts_plot-pct.pdf", + "multiqc/multiqc_plots/pdf/fastqc_sequence_duplication_levels_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_sequence_length_distribution_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_top_overrepresented_sequences_table.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-reported-empirical-plot.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-coverage-per-contig-single-cnt.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-coverage-per-contig-single-pct.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-cumcoverage-dist-id.pdf", + "multiqc/multiqc_plots/pdf/picard_deduplication-cnt.pdf", + "multiqc/multiqc_plots/pdf/picard_deduplication-pct.pdf", + "multiqc/multiqc_plots/pdf/samtools-stats-dp.pdf", + "multiqc/multiqc_plots/pdf/samtools_alignment_plot-cnt.pdf", + "multiqc/multiqc_plots/pdf/samtools_alignment_plot-pct.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_count.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_qual.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/bcftools_stats_indel-lengths.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Indels.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_SNP.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transitions.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transversions.png", + "multiqc/multiqc_plots/png/fastqc-status-check-heatmap.png", + "multiqc/multiqc_plots/png/fastqc_adapter_content_plot.png", + "multiqc/multiqc_plots/png/fastqc_per_base_n_content_plot.png", + 
"multiqc/multiqc_plots/png/fastqc_per_base_sequence_quality_plot.png", + "multiqc/multiqc_plots/png/fastqc_per_sequence_gc_content_plot_Counts.png", + "multiqc/multiqc_plots/png/fastqc_per_sequence_gc_content_plot_Percentages.png", + "multiqc/multiqc_plots/png/fastqc_per_sequence_quality_scores_plot.png", + "multiqc/multiqc_plots/png/fastqc_sequence_counts_plot-cnt.png", + "multiqc/multiqc_plots/png/fastqc_sequence_counts_plot-pct.png", + "multiqc/multiqc_plots/png/fastqc_sequence_duplication_levels_plot.png", + "multiqc/multiqc_plots/png/fastqc_sequence_length_distribution_plot.png", + "multiqc/multiqc_plots/png/fastqc_top_overrepresented_sequences_table.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-reported-empirical-plot.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/mosdepth-coverage-per-contig-single-cnt.png", + "multiqc/multiqc_plots/png/mosdepth-coverage-per-contig-single-pct.png", + "multiqc/multiqc_plots/png/mosdepth-cumcoverage-dist-id.png", + "multiqc/multiqc_plots/png/picard_deduplication-cnt.png", + "multiqc/multiqc_plots/png/picard_deduplication-pct.png", + "multiqc/multiqc_plots/png/samtools-stats-dp.png", + "multiqc/multiqc_plots/png/samtools_alignment_plot-cnt.png", + "multiqc/multiqc_plots/png/samtools_alignment_plot-pct.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_count.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_qual.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/bcftools_stats_indel-lengths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Indels.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_SNP.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transitions.svg", + 
"multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transversions.svg", + "multiqc/multiqc_plots/svg/fastqc-status-check-heatmap.svg", + "multiqc/multiqc_plots/svg/fastqc_adapter_content_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_per_base_n_content_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_per_base_sequence_quality_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_per_sequence_gc_content_plot_Counts.svg", + "multiqc/multiqc_plots/svg/fastqc_per_sequence_gc_content_plot_Percentages.svg", + "multiqc/multiqc_plots/svg/fastqc_per_sequence_quality_scores_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_sequence_counts_plot-cnt.svg", + "multiqc/multiqc_plots/svg/fastqc_sequence_counts_plot-pct.svg", + "multiqc/multiqc_plots/svg/fastqc_sequence_duplication_levels_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_sequence_length_distribution_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_top_overrepresented_sequences_table.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-reported-empirical-plot.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/mosdepth-coverage-per-contig-single-cnt.svg", + "multiqc/multiqc_plots/svg/mosdepth-coverage-per-contig-single-pct.svg", + "multiqc/multiqc_plots/svg/mosdepth-cumcoverage-dist-id.svg", + "multiqc/multiqc_plots/svg/picard_deduplication-cnt.svg", + "multiqc/multiqc_plots/svg/picard_deduplication-pct.svg", + "multiqc/multiqc_plots/svg/samtools-stats-dp.svg", + "multiqc/multiqc_plots/svg/samtools_alignment_plot-cnt.svg", + "multiqc/multiqc_plots/svg/samtools_alignment_plot-pct.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_count.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_qual.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + 
"pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/mapped", + "preprocessing/mapped/test", + "preprocessing/mapped/test/test.sorted.bam", + "preprocessing/mapped/test/test.sorted.bam.bai", + "preprocessing/markduplicates", + "preprocessing/markduplicates/test", + "preprocessing/markduplicates/test/test.md.bam", + "preprocessing/markduplicates/test/test.md.bam.bai", + "preprocessing/recal_table", + "preprocessing/recal_table/test", + "preprocessing/recal_table/test/test.recal.table", + "preprocessing/recalibrated", + "preprocessing/recalibrated/test", + "preprocessing/recalibrated/test/test.recal.bam", + "preprocessing/recalibrated/test/test.recal.bam.bai", + "reference", + "reports", + "reports/bcftools", + "reports/bcftools/strelka", + "reports/bcftools/strelka/test", + "reports/bcftools/strelka/test/test.strelka.variants.bcftools_stats.txt", + "reports/fastqc", + "reports/fastqc/test-test_L1", + "reports/fastqc/test-test_L1/test-test_L1_1_fastqc.html", + "reports/fastqc/test-test_L1/test-test_L1_1_fastqc.zip", + "reports/fastqc/test-test_L1/test-test_L1_2_fastqc.html", + "reports/fastqc/test-test_L1/test-test_L1_2_fastqc.zip", + "reports/fastqc/test-test_L2", + "reports/fastqc/test-test_L2/test-test_L2_1_fastqc.html", + "reports/fastqc/test-test_L2/test-test_L2_1_fastqc.zip", + "reports/fastqc/test-test_L2/test-test_L2_2_fastqc.html", + "reports/fastqc/test-test_L2/test-test_L2_2_fastqc.zip", + "reports/markduplicates", + "reports/markduplicates/test", + "reports/markduplicates/test/test.md.cram.metrics", + "reports/mosdepth", + "reports/mosdepth/test", + "reports/mosdepth/test/test.md.mosdepth.global.dist.txt", + "reports/mosdepth/test/test.md.mosdepth.region.dist.txt", + "reports/mosdepth/test/test.md.mosdepth.summary.txt", + "reports/mosdepth/test/test.md.regions.bed.gz", + "reports/mosdepth/test/test.md.regions.bed.gz.csi", + "reports/mosdepth/test/test.recal.mosdepth.global.dist.txt", + 
"reports/mosdepth/test/test.recal.mosdepth.region.dist.txt", + "reports/mosdepth/test/test.recal.mosdepth.summary.txt", + "reports/mosdepth/test/test.recal.regions.bed.gz", + "reports/mosdepth/test/test.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/test", + "reports/samtools/test/test.md.cram.stats", + "reports/samtools/test/test.recal.cram.stats", + "reports/vcftools", + "reports/vcftools/strelka", + "reports/vcftools/strelka/test", + "reports/vcftools/strelka/test/test.strelka.variants.FILTER.summary", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.count", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.qual", + "variant_calling", + "variant_calling/strelka", + "variant_calling/strelka/test", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz.tbi", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz.tbi" + ], + [ + "bcftools_stats_indel-lengths.txt:md5,deccb75341ca46a6f09658f7fd9e348b", + "bcftools_stats_vqc_Count_Indels.txt:md5,7b2a64880b653ccf0400ed9073e290dd", + "bcftools_stats_vqc_Count_SNP.txt:md5,72e934e0e8ed9b9712105bbd66dd9ffd", + "bcftools_stats_vqc_Count_Transitions.txt:md5,72e934e0e8ed9b9712105bbd66dd9ffd", + "bcftools_stats_vqc_Count_Transversions.txt:md5,72e934e0e8ed9b9712105bbd66dd9ffd", + "fastqc-status-check-heatmap.txt:md5,a020b9689ddeb4abec16b4854fe452f1", + "fastqc_adapter_content_plot.txt:md5,2e1b72be741319e7fadbbb39d7e5b37d", + "fastqc_per_base_n_content_plot.txt:md5,ad3b971a6bb4e8ba6c844c8a03584eb8", + "fastqc_per_base_sequence_quality_plot.txt:md5,1bc03889d243a944253ac637d81ae10c", + "fastqc_per_sequence_gc_content_plot_Counts.txt:md5,2c42d140ce06c08dad2b58f397c23239", + "fastqc_per_sequence_gc_content_plot_Percentages.txt:md5,59e22821d350bfb97c37ffd9088f5ad9", + "fastqc_per_sequence_quality_scores_plot.txt:md5,f33615cc98bb6225f39545a415fa7c0f", + 
"fastqc_sequence_counts_plot.txt:md5,7f0f19a58e8e54e792a751fd04a9ae13", + "fastqc_sequence_duplication_levels_plot.txt:md5,92b02e250ff78725deb9a10d510fcecc", + "fastqc_sequence_length_distribution_plot.txt:md5,fb04dce68ec566314125bc9438211b28", + "fastqc_top_overrepresented_sequences_table.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt:md5,20b2630a7400c9c279bf8c0c66341f7d", + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt:md5,b9b943621e50c7f3e75a37871667d5ed", + "mosdepth-coverage-per-contig-single.txt:md5,264db67a99d2c90ea7b075e33c201b77", + "mosdepth-cumcoverage-dist-id.txt:md5,5235e965da7ebe3bfebb24ffa88defff", + "mosdepth_cov_dist.txt:md5,8d0d7cb485a7bffb07da17b28f827120", + "mosdepth_cumcov_dist.txt:md5,8d0d7cb485a7bffb07da17b28f827120", + "mosdepth_perchrom.txt:md5,264db67a99d2c90ea7b075e33c201b77", + "multiqc_bcftools_stats.txt:md5,103ba59d44fc60e9308e64bbd0d0e504", + "multiqc_citations.txt:md5,ace4ca89138a5f1e2be289c157c00bd9", + "multiqc_fastqc.txt:md5,bde0d0bffa62228b33fb68b7e25b6ff8", + "multiqc_samtools_stats.txt:md5,0f1e4c6c497d9a952765f9f3068ea4b9", + "picard_histogram.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "picard_histogram_1.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "picard_histogram_2.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "samtools-stats-dp.txt:md5,c94f4d3ffa3f510552f90e173fdd9f9d", + "samtools_alignment_plot.txt:md5,717f499a3543e7ee4c7a8454bf80aeca", + "vcftools_tstv_by_count.txt:md5,50efc5214fe2c39f21efb66a710d2ed6", + "test.strelka.variants.bcftools_stats.txt:md5,86bd4938eed920d36f3f5937102a2967", + "test.md.mosdepth.global.dist.txt:md5,b61e1acee11a6ddf7ce3232a5948a6a0", + "test.md.mosdepth.region.dist.txt:md5,1a382f98d488d2ae3df83a0d87caafc1", + "test.md.mosdepth.summary.txt:md5,839108358878ada89e1eaddf6e0541ba", + "test.md.regions.bed.gz:md5,6fdaec99e739dc0f47fe55dd64dfe93e", + 
"test.md.regions.bed.gz.csi:md5,5f9c60279af78e3aeafc96a8c11fb35f", + "test.recal.mosdepth.global.dist.txt:md5,b61e1acee11a6ddf7ce3232a5948a6a0", + "test.recal.mosdepth.region.dist.txt:md5,1a382f98d488d2ae3df83a0d87caafc1", + "test.recal.mosdepth.summary.txt:md5,839108358878ada89e1eaddf6e0541ba", + "test.recal.regions.bed.gz:md5,6fdaec99e739dc0f47fe55dd64dfe93e", + "test.recal.regions.bed.gz.csi:md5,5f9c60279af78e3aeafc96a8c11fb35f", + "test.md.cram.stats:md5,7d19da3fc342afe0884c944f97a578b1", + "test.recal.cram.stats:md5,820d123e746d1abdc90fd8710828082e", + "test.strelka.variants.FILTER.summary:md5,ad417bc96d31223f61170987975d8128", + "test.strelka.variants.TsTv.count:md5,fa27f678965b7cba6a92efcd039f802a" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-11-09T17:49:40.35031" + } +} diff --git a/tests/annotation_bcfann.nf.test b/tests/annotation_bcfann.nf.test new file mode 100644 index 0000000000..a2d5430350 --- /dev/null +++ b/tests/annotation_bcfann.nf.test @@ -0,0 +1,40 @@ +nextflow_pipeline { + + name "Test pipeline" + script "../main.nf" + tag "pipeline" + tag "pipeline_sarek" + + test("Run with profile test | --tools bcfann") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/vcf_single.csv" + step = 'annotate' + tools = 'bcfann' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file 
for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path + ).match() } + ) + } + } +} diff --git a/tests/annotation_bcfann.nf.test.snap b/tests/annotation_bcfann.nf.test.snap new file mode 100644 index 0000000000..0aa7d675ce --- /dev/null +++ b/tests/annotation_bcfann.nf.test.snap @@ -0,0 +1,40 @@ +{ + "Run with profile test | --tools bcfann": { + "content": [ + 2, + { + "BCFTOOLS_ANNOTATE": { + "bcftools": 1.2 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "annotation", + "annotation/test", + "annotation/test/test_BCF.ann.vcf.gz", + "annotation/test/test_BCF.ann.vcf.gz.tbi", + "csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml" + ], + [ + "multiqc_citations.txt:md5,4c806e63a283ec1b7e78cdae3a923d4f" + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-10-29T09:38:54.991004" + } +} diff --git a/tests/annotation_merge.nf.test b/tests/annotation_merge.nf.test new file mode 100644 index 0000000000..323b29eb3a --- /dev/null +++ b/tests/annotation_merge.nf.test @@ -0,0 +1,77 @@ +nextflow_pipeline { + + name "Test pipeline" + script "../main.nf" + tag "pipeline" + tag "pipeline_sarek" + + test("Run with profile test | --tools merge") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = 
"${projectDir}/tests/csv/3.0/vcf_single.csv" + step = 'annotate' + snpeff_cache = 's3://annotation-cache/snpeff_cache/' + vep_cache = 's3://annotation-cache/vep_cache/' + tools = 'merge' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path + ).match() } + ) + } + } + + test("Run with profile test | --tools merge,snpeff,vep") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/vcf_single.csv" + step = 'annotate' + snpeff_cache = 's3://annotation-cache/snpeff_cache/' + vep_cache = 's3://annotation-cache/vep_cache/' + tools = 'merge,snpeff,vep' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + 
workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path + ).match() } + ) + } + } +} diff --git a/tests/annotation_merge.nf.test.snap b/tests/annotation_merge.nf.test.snap new file mode 100644 index 0000000000..62ea3b6aca --- /dev/null +++ b/tests/annotation_merge.nf.test.snap @@ -0,0 +1,205 @@ +{ + "Run with profile test | --tools merge,snpeff,vep": { + "content": [ + 7, + { + "ENSEMBLVEP_VEP": { + "ensemblvep": 113.0 + }, + "SNPEFF_SNPEFF": { + "snpeff": "5.1d" + }, + "TABIX_BGZIPTABIX": { + "tabix": 1.2 + }, + "TABIX_TABIX": { + "tabix": 1.2 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "annotation", + "annotation/test", + "annotation/test/test_VEP.ann.vcf.gz", + "annotation/test/test_VEP.ann.vcf.gz.tbi", + "annotation/test/test_snpEff.ann.vcf.gz", + "annotation/test/test_snpEff.ann.vcf.gz.tbi", + "annotation/test/test_snpEff_VEP.ann.vcf.gz", + "annotation/test/test_snpEff_VEP.ann.vcf.gz.tbi", + "csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_snpeff.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/snpeff_effects.txt", + "multiqc/multiqc_data/snpeff_qualities.txt", + "multiqc/multiqc_data/snpeff_variant_effects_region.txt", + "multiqc/multiqc_data/vep-general-stats.txt", + "multiqc/multiqc_data/vep.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + 
"multiqc/multiqc_plots/pdf/snpeff_effects-cnt.pdf", + "multiqc/multiqc_plots/pdf/snpeff_effects-pct.pdf", + "multiqc/multiqc_plots/pdf/snpeff_qualities.pdf", + "multiqc/multiqc_plots/pdf/snpeff_variant_effects_region-cnt.pdf", + "multiqc/multiqc_plots/pdf/snpeff_variant_effects_region-log.pdf", + "multiqc/multiqc_plots/pdf/snpeff_variant_effects_region-pct-log.pdf", + "multiqc/multiqc_plots/pdf/snpeff_variant_effects_region-pct.pdf", + "multiqc/multiqc_plots/pdf/vep-general-stats.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/snpeff_effects-cnt.png", + "multiqc/multiqc_plots/png/snpeff_effects-pct.png", + "multiqc/multiqc_plots/png/snpeff_qualities.png", + "multiqc/multiqc_plots/png/snpeff_variant_effects_region-cnt.png", + "multiqc/multiqc_plots/png/snpeff_variant_effects_region-log.png", + "multiqc/multiqc_plots/png/snpeff_variant_effects_region-pct-log.png", + "multiqc/multiqc_plots/png/snpeff_variant_effects_region-pct.png", + "multiqc/multiqc_plots/png/vep-general-stats.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/snpeff_effects-cnt.svg", + "multiqc/multiqc_plots/svg/snpeff_effects-pct.svg", + "multiqc/multiqc_plots/svg/snpeff_qualities.svg", + "multiqc/multiqc_plots/svg/snpeff_variant_effects_region-cnt.svg", + "multiqc/multiqc_plots/svg/snpeff_variant_effects_region-log.svg", + "multiqc/multiqc_plots/svg/snpeff_variant_effects_region-pct-log.svg", + "multiqc/multiqc_plots/svg/snpeff_variant_effects_region-pct.svg", + "multiqc/multiqc_plots/svg/vep-general-stats.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reports", + "reports/EnsemblVEP", + "reports/EnsemblVEP/test", + "reports/EnsemblVEP/test/test_VEP.ann.summary.html", + "reports/EnsemblVEP/test/test_snpEff_VEP.ann.summary.html", + "reports/snpeff", + "reports/snpeff/test", + 
"reports/snpeff/test/snpEff_summary.html", + "reports/snpeff/test/test_snpEff.csv", + "reports/snpeff/test/test_snpEff.genes.txt" + ], + [ + "multiqc_citations.txt:md5,ebf9f49bc020eeb38546ddab3a98171e", + "multiqc_snpeff.txt:md5,03a2b1c461cb6e5cccac64033a2f6526", + "snpeff_effects.txt:md5,3c5e9a1c191b77c781dc4d033b1dd1f7", + "snpeff_qualities.txt:md5,4c059b4e8bf0a64940ad1d6e30efd3a6", + "snpeff_variant_effects_region.txt:md5,05efd324edadced17ba3cd2b7714af57", + "vep-general-stats.txt:md5,71c994ae4221384f4e22459723d29cd0", + "vep.txt:md5,20570f3e4e51407b860a31d7e1d59de0", + "test_snpEff.genes.txt:md5,130536bf0237d7f3f746d32aaa32840a" + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-10-29T10:03:16.229715" + }, + "Run with profile test | --tools merge": { + "content": [ + 5, + { + "ENSEMBLVEP_VEP": { + "ensemblvep": 113.0 + }, + "SNPEFF_SNPEFF": { + "snpeff": "5.1d" + }, + "TABIX_BGZIPTABIX": { + "tabix": 1.2 + }, + "TABIX_TABIX": { + "tabix": 1.2 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "annotation", + "annotation/test", + "annotation/test/test_snpEff_VEP.ann.vcf.gz", + "annotation/test/test_snpEff_VEP.ann.vcf.gz.tbi", + "csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_snpeff.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/snpeff_effects.txt", + "multiqc/multiqc_data/snpeff_qualities.txt", + "multiqc/multiqc_data/snpeff_variant_effects_region.txt", + "multiqc/multiqc_data/vep-general-stats.txt", + "multiqc/multiqc_data/vep.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/snpeff_effects-cnt.pdf", + 
"multiqc/multiqc_plots/pdf/snpeff_effects-pct.pdf", + "multiqc/multiqc_plots/pdf/snpeff_qualities.pdf", + "multiqc/multiqc_plots/pdf/snpeff_variant_effects_region-cnt.pdf", + "multiqc/multiqc_plots/pdf/snpeff_variant_effects_region-log.pdf", + "multiqc/multiqc_plots/pdf/snpeff_variant_effects_region-pct-log.pdf", + "multiqc/multiqc_plots/pdf/snpeff_variant_effects_region-pct.pdf", + "multiqc/multiqc_plots/pdf/vep-general-stats.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/snpeff_effects-cnt.png", + "multiqc/multiqc_plots/png/snpeff_effects-pct.png", + "multiqc/multiqc_plots/png/snpeff_qualities.png", + "multiqc/multiqc_plots/png/snpeff_variant_effects_region-cnt.png", + "multiqc/multiqc_plots/png/snpeff_variant_effects_region-log.png", + "multiqc/multiqc_plots/png/snpeff_variant_effects_region-pct-log.png", + "multiqc/multiqc_plots/png/snpeff_variant_effects_region-pct.png", + "multiqc/multiqc_plots/png/vep-general-stats.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/snpeff_effects-cnt.svg", + "multiqc/multiqc_plots/svg/snpeff_effects-pct.svg", + "multiqc/multiqc_plots/svg/snpeff_qualities.svg", + "multiqc/multiqc_plots/svg/snpeff_variant_effects_region-cnt.svg", + "multiqc/multiqc_plots/svg/snpeff_variant_effects_region-log.svg", + "multiqc/multiqc_plots/svg/snpeff_variant_effects_region-pct-log.svg", + "multiqc/multiqc_plots/svg/snpeff_variant_effects_region-pct.svg", + "multiqc/multiqc_plots/svg/vep-general-stats.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reports", + "reports/EnsemblVEP", + "reports/EnsemblVEP/test", + "reports/EnsemblVEP/test/test_snpEff_VEP.ann.summary.html", + "reports/snpeff", + "reports/snpeff/test" + ], + [ + "multiqc_citations.txt:md5,ebf9f49bc020eeb38546ddab3a98171e", + 
"multiqc_snpeff.txt:md5,03a2b1c461cb6e5cccac64033a2f6526", + "snpeff_effects.txt:md5,3c5e9a1c191b77c781dc4d033b1dd1f7", + "snpeff_qualities.txt:md5,4c059b4e8bf0a64940ad1d6e30efd3a6", + "snpeff_variant_effects_region.txt:md5,05efd324edadced17ba3cd2b7714af57", + "vep-general-stats.txt:md5,57563be109a57f6edfa427b2b2c310ba", + "vep.txt:md5,bf54f689bb0ccab5e1566e48373f768c" + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-10-29T09:56:59.385257" + } +} diff --git a/tests/annotation_snpeff.nf.test b/tests/annotation_snpeff.nf.test new file mode 100644 index 0000000000..9addb8f1d7 --- /dev/null +++ b/tests/annotation_snpeff.nf.test @@ -0,0 +1,68 @@ +nextflow_pipeline { + + name "Test pipeline" + script "../main.nf" + tag "pipeline" + tag "pipeline_sarek" + + test("Run with profile test | --tools snpeff --download_cache") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/vcf_single.csv" + step = 'annotate' + download_cache = true + tools = 'snpeff' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + 
stable_path + ).match() } + ) + } + } + + test("Fails with profile test | --tools snpeff --snpeff_db na --build_only_index") { + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/vcf_single.csv" + step = 'annotate' + snpeff_cache = 's3://annotation-cache/snpeff_cache/' + snpeff_db = "na" + input = false + build_only_index = true + tools = 'snpeff' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + assertAll( + { assert workflow.failed}, + { assert workflow.stdout.toString().contains("This path is not available within annotation-cache") } + ) + } + } +} diff --git a/tests/annotation_snpeff.nf.test.snap b/tests/annotation_snpeff.nf.test.snap new file mode 100644 index 0000000000..da7ef2d016 --- /dev/null +++ b/tests/annotation_snpeff.nf.test.snap @@ -0,0 +1,108 @@ +{ + "Run with profile test | --tools snpeff --download_cache": { + "content": [ + 4, + { + "SNPEFF_SNPEFF": { + "snpeff": "5.1d" + }, + "TABIX_BGZIPTABIX": { + "tabix": 1.2 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "annotation", + "annotation/test", + "annotation/test/test_snpEff.ann.vcf.gz", + "annotation/test/test_snpEff.ann.vcf.gz.tbi", + "cache", + "cache/snpeff_cache", + "cache/snpeff_cache/WBcel235.105", + "cache/snpeff_cache/WBcel235.105/sequence.I.bin", + "cache/snpeff_cache/WBcel235.105/sequence.II.bin", + "cache/snpeff_cache/WBcel235.105/sequence.III.bin", + "cache/snpeff_cache/WBcel235.105/sequence.IV.bin", + "cache/snpeff_cache/WBcel235.105/sequence.V.bin", + "cache/snpeff_cache/WBcel235.105/sequence.X.bin", + 
"cache/snpeff_cache/WBcel235.105/sequence.bin", + "cache/snpeff_cache/WBcel235.105/snpEffectPredictor.bin", + "cache/versions.yml", + "csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_snpeff.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/snpeff_effects.txt", + "multiqc/multiqc_data/snpeff_qualities.txt", + "multiqc/multiqc_data/snpeff_variant_effects_region.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/snpeff_effects-cnt.pdf", + "multiqc/multiqc_plots/pdf/snpeff_effects-pct.pdf", + "multiqc/multiqc_plots/pdf/snpeff_qualities.pdf", + "multiqc/multiqc_plots/pdf/snpeff_variant_effects_region-cnt.pdf", + "multiqc/multiqc_plots/pdf/snpeff_variant_effects_region-log.pdf", + "multiqc/multiqc_plots/pdf/snpeff_variant_effects_region-pct-log.pdf", + "multiqc/multiqc_plots/pdf/snpeff_variant_effects_region-pct.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/snpeff_effects-cnt.png", + "multiqc/multiqc_plots/png/snpeff_effects-pct.png", + "multiqc/multiqc_plots/png/snpeff_qualities.png", + "multiqc/multiqc_plots/png/snpeff_variant_effects_region-cnt.png", + "multiqc/multiqc_plots/png/snpeff_variant_effects_region-log.png", + "multiqc/multiqc_plots/png/snpeff_variant_effects_region-pct-log.png", + "multiqc/multiqc_plots/png/snpeff_variant_effects_region-pct.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/snpeff_effects-cnt.svg", + "multiqc/multiqc_plots/svg/snpeff_effects-pct.svg", + "multiqc/multiqc_plots/svg/snpeff_qualities.svg", + 
"multiqc/multiqc_plots/svg/snpeff_variant_effects_region-cnt.svg", + "multiqc/multiqc_plots/svg/snpeff_variant_effects_region-log.svg", + "multiqc/multiqc_plots/svg/snpeff_variant_effects_region-pct-log.svg", + "multiqc/multiqc_plots/svg/snpeff_variant_effects_region-pct.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reports", + "reports/snpeff", + "reports/snpeff/test", + "reports/snpeff/test/snpEff_summary.html", + "reports/snpeff/test/test_snpEff.csv", + "reports/snpeff/test/test_snpEff.genes.txt" + ], + [ + "sequence.I.bin:md5,2fd1694bd91cf7952cbad8cfed161e53", + "sequence.II.bin:md5,bacedbdea89508e108223767fa260a4c", + "sequence.III.bin:md5,444118a9fb9d0a03c37e86094d8e52a9", + "sequence.IV.bin:md5,ff756628faa0b71cd65495668c3d82b5", + "sequence.V.bin:md5,d6ad5476162ac45829f719dd4ee3f4e7", + "sequence.X.bin:md5,b79bec6cc8f96b8373dac56bab5d0a6c", + "sequence.bin:md5,ec2bc2ae81755ab90fcf1848bc7ce41f", + "snpEffectPredictor.bin:md5,1d99251d0405f0a42913ed8b5b2c2fa7", + "versions.yml:md5,e9698b9ccf3bb151631d1e07fbae8397", + "multiqc_citations.txt:md5,47e39f5f5f05da6bc38d13aa81fe8b6e", + "multiqc_snpeff.txt:md5,03a2b1c461cb6e5cccac64033a2f6526", + "snpeff_effects.txt:md5,3c5e9a1c191b77c781dc4d033b1dd1f7", + "snpeff_qualities.txt:md5,4c059b4e8bf0a64940ad1d6e30efd3a6", + "snpeff_variant_effects_region.txt:md5,05efd324edadced17ba3cd2b7714af57", + "test_snpEff.genes.txt:md5,130536bf0237d7f3f746d32aaa32840a" + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-10-29T10:16:00.665699" + } +} diff --git a/tests/annotation_vep.nf.test b/tests/annotation_vep.nf.test new file mode 100644 index 0000000000..80659a41b7 --- /dev/null +++ b/tests/annotation_vep.nf.test @@ -0,0 +1,68 @@ +nextflow_pipeline { + + name "Test pipeline" + script "../main.nf" + tag "pipeline" + tag "pipeline_sarek" + + test("Run with profile test | --tools vep --download_cache 
--vep_include_fasta") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/vcf_single.csv" + step = 'annotate' + download_cache = true + tools = 'vep' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path + ).match() } + ) + } + } + + test("Fails with profile test | --tools vep --vep_cache_version 1 --build_only_index") { + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/vcf_single.csv" + step = 'annotate' + vep_cache = 's3://annotation-cache/vep_cache/' + vep_cache_version = 1 + input = false + build_only_index = true + tools = 'vep' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 
'tests/.nftignore') + assertAll( + { assert workflow.failed}, + { assert workflow.stdout.toString().contains("This path is not available within annotation-cache") } + ) + } + } +} diff --git a/tests/annotation_vep.nf.test.snap b/tests/annotation_vep.nf.test.snap new file mode 100644 index 0000000000..d9d82cda3d --- /dev/null +++ b/tests/annotation_vep.nf.test.snap @@ -0,0 +1,283 @@ +{ + "Run with profile test | --tools vep --download_cache --vep_include_fasta": { + "content": [ + 4, + { + "ENSEMBLVEP_VEP": { + "ensemblvep": 113.0 + }, + "TABIX_TABIX": { + "tabix": 1.2 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "annotation", + "annotation/test", + "annotation/test/test_VEP.ann.vcf.gz", + "annotation/test/test_VEP.ann.vcf.gz.tbi", + "cache", + "cache/vep_cache", + "cache/vep_cache/caenorhabditis_elegans", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I/1-1000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I/10000001-11000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I/1000001-2000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I/11000001-12000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I/12000001-13000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I/13000001-14000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I/14000001-15000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I/15000001-16000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I/2000001-3000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I/3000001-4000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I/4000001-5000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I/5000001-6000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I/6000001-7000000.gz", + 
"cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I/7000001-8000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I/8000001-9000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/I/9000001-10000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II/1-1000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II/10000001-11000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II/1000001-2000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II/11000001-12000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II/12000001-13000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II/13000001-14000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II/14000001-15000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II/15000001-16000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II/2000001-3000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II/3000001-4000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II/4000001-5000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II/5000001-6000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II/6000001-7000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II/7000001-8000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II/8000001-9000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/II/9000001-10000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/III", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/III/1-1000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/III/10000001-11000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/III/1000001-2000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/III/11000001-12000000.gz", + 
"cache/vep_cache/caenorhabditis_elegans/113_WBcel235/III/12000001-13000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/III/13000001-14000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/III/2000001-3000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/III/3000001-4000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/III/4000001-5000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/III/5000001-6000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/III/6000001-7000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/III/7000001-8000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/III/8000001-9000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/III/9000001-10000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/1-1000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/10000001-11000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/1000001-2000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/11000001-12000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/12000001-13000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/13000001-14000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/14000001-15000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/15000001-16000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/16000001-17000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/17000001-18000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/2000001-3000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/3000001-4000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/4000001-5000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/5000001-6000000.gz", + 
"cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/6000001-7000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/7000001-8000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/8000001-9000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/IV/9000001-10000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/MtDNA", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/MtDNA/1-1000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/1-1000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/10000001-11000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/1000001-2000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/11000001-12000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/12000001-13000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/13000001-14000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/14000001-15000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/15000001-16000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/16000001-17000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/17000001-18000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/18000001-19000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/19000001-20000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/20000001-21000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/2000001-3000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/3000001-4000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/4000001-5000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/5000001-6000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/6000001-7000000.gz", + 
"cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/7000001-8000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/8000001-9000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/V/9000001-10000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/1-1000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/10000001-11000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/1000001-2000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/11000001-12000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/12000001-13000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/13000001-14000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/14000001-15000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/15000001-16000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/16000001-17000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/17000001-18000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/2000001-3000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/3000001-4000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/4000001-5000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/5000001-6000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/6000001-7000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/7000001-8000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/8000001-9000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/X/9000001-10000000.gz", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/chr_synonyms.txt", + "cache/vep_cache/caenorhabditis_elegans/113_WBcel235/info.txt", + "cache/versions.yml", + "csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/multiqc.log", + 
"multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/vep-general-stats.txt", + "multiqc/multiqc_data/vep.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/vep-general-stats.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/vep-general-stats.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/vep-general-stats.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reports", + "reports/EnsemblVEP", + "reports/EnsemblVEP/test", + "reports/EnsemblVEP/test/test_VEP.ann.summary.html" + ], + [ + "1-1000000.gz:md5,cadcba92b0999210dd8d832505d2e4c4", + "10000001-11000000.gz:md5,998a75dd927d10d45f8eebeef5fc7a75", + "1000001-2000000.gz:md5,a5cb3adb1ec9f40eed6a355d1492ba9b", + "11000001-12000000.gz:md5,46e6917f51093e28cce061774b9ed158", + "12000001-13000000.gz:md5,0adffacf8482d6c224df27104f65c9d6", + "13000001-14000000.gz:md5,aee759d812fc900a980ab0c4c5bd0273", + "14000001-15000000.gz:md5,f65537a3f76c40e63b6deb0b6cdb09dc", + "15000001-16000000.gz:md5,379f092ad1afa888da1fc13e80535def", + "2000001-3000000.gz:md5,86839741524579fd089498d6bee44dff", + "3000001-4000000.gz:md5,509b28af3920427e951f00b6973b5df4", + "4000001-5000000.gz:md5,f606e69cf59b0bdf2b61653608d955a6", + "5000001-6000000.gz:md5,a14ce1e21856e4a77ed63c67cbdfb26a", + "6000001-7000000.gz:md5,e1a895d6e8b352182b53ed1d0ce6e24e", + "7000001-8000000.gz:md5,ddf91b60f636d26b68b6bab3520b6b32", + "8000001-9000000.gz:md5,57482b996f89e92bbd0196efa4915cd3", + "9000001-10000000.gz:md5,43b5d89f84236b49b384d7f37f928129", + "1-1000000.gz:md5,d18811781848f70baef0b0348190d7ce", + "10000001-11000000.gz:md5,19011165abc56233ea0c5b0e6938d9c9", + "1000001-2000000.gz:md5,5e720fa191f3c9ac799b6a071bcc4332", + 
"11000001-12000000.gz:md5,b19c46fb00ca13a2a31128bd1829ddf5", + "12000001-13000000.gz:md5,54354b0870ca96641c51ed63382da007", + "13000001-14000000.gz:md5,6954fdc223f58eb406e602752ab7d139", + "14000001-15000000.gz:md5,929275a1cfea883999dddc20931a2e72", + "15000001-16000000.gz:md5,5f5b783a589a1fd80cc565e6f339c540", + "2000001-3000000.gz:md5,54e476e0e9f4a5d973ee710fd824abc7", + "3000001-4000000.gz:md5,d78d4a63165429fdb3a61b7cdbd3c43a", + "4000001-5000000.gz:md5,983f8efcebb7f62d7e7b1b3c0573d43e", + "5000001-6000000.gz:md5,e2cd03ed5b67b8ee123e4c4958508fe4", + "6000001-7000000.gz:md5,d04bc9335ba39ace20bce936e3a5cdeb", + "7000001-8000000.gz:md5,9354b26a9ba94aa5bc30f537c22382fb", + "8000001-9000000.gz:md5,b227c6ef81ab72d211d25dc4f44813b9", + "9000001-10000000.gz:md5,a6d7f29edd7c22139403a11cac989b7a", + "1-1000000.gz:md5,2117acb322a117a9c5db85c072575331", + "10000001-11000000.gz:md5,646c9582b56eb12ddbb1dd35b25c3670", + "1000001-2000000.gz:md5,ee433e4e5e37b2d008c43e1af4be0f8d", + "11000001-12000000.gz:md5,962fd6e52046484b3b123f9380ed64e9", + "12000001-13000000.gz:md5,1abf2d695c829eb2c88e0d3dbc739a1c", + "13000001-14000000.gz:md5,a6e03bf867f5cc694174a230f1b13a6b", + "2000001-3000000.gz:md5,a5b250aa9e3ee8cecc23bea0e2fa19a1", + "3000001-4000000.gz:md5,1390a6d2a28a4861b282d36d0fb85660", + "4000001-5000000.gz:md5,4bc7106bb2661aea28613c31935a5c8f", + "5000001-6000000.gz:md5,7317d6fbb3c77d7cdd31e781afab8f7d", + "6000001-7000000.gz:md5,1a3b6fa586e570c16b4833e34b28751e", + "7000001-8000000.gz:md5,b7bcb06393682f621403afdf19bf87b4", + "8000001-9000000.gz:md5,0011675a8567d394da54a52480b35786", + "9000001-10000000.gz:md5,e4fa88e4ec57ed0c71fd21090d8aa17a", + "1-1000000.gz:md5,a47af22d33275652036ddf7161699c7c", + "10000001-11000000.gz:md5,7fc129e7edbaa5be87306de417c2ef28", + "1000001-2000000.gz:md5,cbc12c339741df5ad06bf9a946be6c93", + "11000001-12000000.gz:md5,d1cc5e20e3d3402debdc102087a5407f", + "12000001-13000000.gz:md5,42c69c8e86d28151e9a8b1787dbee125", + 
"13000001-14000000.gz:md5,c7459d1789a833e8a898ebdbc607e7d8", + "14000001-15000000.gz:md5,5806b20108f56d9eeabcdd4f8450dca3", + "15000001-16000000.gz:md5,78e859f70026a05be43d48b9b272f287", + "16000001-17000000.gz:md5,539db7fc976bee4b6031f8dcb6a4641d", + "17000001-18000000.gz:md5,f3ea55e7552dc36734d6e8ba67d1e4c2", + "2000001-3000000.gz:md5,539013ecfdcd06eb653445f857265322", + "3000001-4000000.gz:md5,beb9701b402bd5ddc46a4da6e531f783", + "4000001-5000000.gz:md5,3f46efb2635850cc6c3d8ae51727a400", + "5000001-6000000.gz:md5,e11549bca12c5e2a7a208a997fda1c68", + "6000001-7000000.gz:md5,c0f3546c6859dc1a5fe9ff7f015ecd7e", + "7000001-8000000.gz:md5,344b72822f647819f4ee6b5afa9d7701", + "8000001-9000000.gz:md5,1c06d285ff5c53f89f073212343902b7", + "9000001-10000000.gz:md5,79140e754039c6d6fc6eeecddcf2aa8e", + "1-1000000.gz:md5,40ef48190d3269cd4112450bc717b1ef", + "1-1000000.gz:md5,1a8739457c429931923ed77596a9ee54", + "10000001-11000000.gz:md5,316fa1d06fc1878b6a5995f4aee3e49d", + "1000001-2000000.gz:md5,3926c03a091850c909bd0ccfc7133c0b", + "11000001-12000000.gz:md5,29ca11d2f05051cc439a0d24a9db134c", + "12000001-13000000.gz:md5,a46f648554e91999652019516c933754", + "13000001-14000000.gz:md5,167b126d1c690a0e7e25fc5ccd09fb7c", + "14000001-15000000.gz:md5,645554c896133c476c3083302371bcf8", + "15000001-16000000.gz:md5,60fc48d9a7aff6286fc6630c46bcfebc", + "16000001-17000000.gz:md5,07e1750d1c95a61e96774d2cf3da4d89", + "17000001-18000000.gz:md5,59d084309f6a975ec1066a828b5845ba", + "18000001-19000000.gz:md5,868c12d305dbd4d04399ec7848804328", + "19000001-20000000.gz:md5,6de03a00061f6a88dcbbb8ed5fc0b8dc", + "20000001-21000000.gz:md5,732b956f13da9ef01f9de3355d12e28b", + "2000001-3000000.gz:md5,7c7528266c523cad419ea25e75d9566e", + "3000001-4000000.gz:md5,bb2283c0cfb0e4601fc535a4d51e6f2d", + "4000001-5000000.gz:md5,64c7f28f554414a88c886b0bcadb3c39", + "5000001-6000000.gz:md5,58e7106fe577a8b5e5c698445b4f0c33", + "6000001-7000000.gz:md5,2e309d12cf1c1c6276585f457ceeacc2", + 
"7000001-8000000.gz:md5,08cb0600f7806608f0103187a6c9c64e", + "8000001-9000000.gz:md5,869333c2615f714860d17d794640d4ad", + "9000001-10000000.gz:md5,b85cc861c6a3b30cf6f06c8af136b383", + "1-1000000.gz:md5,c8d97b084c159c3cb5be1fff4637dfce", + "10000001-11000000.gz:md5,f441f2af06fd4973749dfbfbef40fe1b", + "1000001-2000000.gz:md5,c42a1526a836cfacefb67e9217f648aa", + "11000001-12000000.gz:md5,264421c249c696b45c92e2611285fee7", + "12000001-13000000.gz:md5,e673d1fdbe7dc0d09bea3d11a5797d6a", + "13000001-14000000.gz:md5,88f4f84e63b362f1b4f800c48b37e82c", + "14000001-15000000.gz:md5,26282f2b305ed82fb9f8875e97361105", + "15000001-16000000.gz:md5,30b9132c2610d42919ba231d1adbef2a", + "16000001-17000000.gz:md5,3d0e975ccd1ae4e92bf1d9d915ed293f", + "17000001-18000000.gz:md5,7db5b3819da3df1e47fe757dc9c6f2ba", + "2000001-3000000.gz:md5,55f6130a8d5872bdc9f8eed231ad0f65", + "3000001-4000000.gz:md5,402b826dbf6993c207ad15483a44182b", + "4000001-5000000.gz:md5,43cf926d43db25af5724fb5077edfee1", + "5000001-6000000.gz:md5,f40276dbea3f6f9a75f9301d1253eb09", + "6000001-7000000.gz:md5,df0d2d38060d4e7c606072ae814b1f38", + "7000001-8000000.gz:md5,c4117cc51255c0a91c51ff43403f00f7", + "8000001-9000000.gz:md5,59a4ebadca27041634c58652c544c8dd", + "9000001-10000000.gz:md5,c54510616273a4d1bfa9d525dbbbca40", + "chr_synonyms.txt:md5,d390f0bcc6fec9786bc66b75f2d4390b", + "info.txt:md5,249c88c7a71464e048cca0c4b2a21198", + "versions.yml:md5,c3ed0b3df82507df6a99335c73147606", + "multiqc_citations.txt:md5,afe6b13a9c0770828d9bc9515d6db802", + "vep-general-stats.txt:md5,2fca07ac5623e758cac3ce49d9f6e3b0", + "vep.txt:md5,60a5d57c60308aba2aa6206f903d27e9" + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-10-29T10:19:17.846465" + } +} diff --git a/tests/config/pytesttags.yml b/tests/config/pytesttags.yml index 63c96d73a9..6be50502d1 100644 --- a/tests/config/pytesttags.yml +++ b/tests/config/pytesttags.yml @@ -1,102 +1,5 @@ -# default -default: - - "**" - -# 
default_extended - -tumor_normal_pair: - - conf/modules/** - - main.nf - - modules/** - - nextflow.config - - nextflow_schema.json - - subworkflows/** - - tests/csv/3.0/fastq_pair.csv - - tests/test_tumor_normal_pair.yml - - workflows/** - -save_mapped_only: - - conf/modules/** - - main.nf - - modules/** - - nextflow.config - - nextflow_schema.json - - subworkflows/** - - tests/csv/3.0/fastq_single.csv - - tests/test_save_mapped.yml - - workflows/** - -save_output_as_bam_only: - - conf/modules/** - - main.nf - - modules/** - - nextflow.config - - nextflow_schema.json - - subworkflows/** - - tests/csv/3.0/fastq_single.csv - - tests/test_save_output_as_bam_only.yml - - workflows/** - -skip_all_qc: - - conf/modules/** - - main.nf - - modules/** - - nextflow.config - - nextflow_schema.json - - subworkflows/** - - tests/csv/3.0/fastq_single.csv - - tests/test_skip_all_qc.yml - - workflows/** - -skip_markduplicates: - - conf/modules/** - - main.nf - - modules/** - - nextflow.config - - nextflow_schema.json - - subworkflows/** - - tests/csv/3.0/fastq_single.csv - - tests/test_skip_markduplicates.yml - - workflows/** - -validation_checks: - - conf/modules/** - - main.nf - - modules/** - - nextflow.config - - nextflow_schema.json - - subworkflows/** - - tests/csv/3.0/sample_with_space.csv - - tests/test_samplesheet_validation_spaces.yml - - workflows/** - # preprocessing -## alignment_from_everything -alignment_from_everything: - - conf/modules/** - - conf/test/alignment_from_everything.config - - main.nf - - modules/** - - nextflow.config - - nextflow_schema.json - - subworkflows/** - - tests/csv/3.0/bam_and_fastq_and_spring.csv - - tests/test_alignment_from_everything.yml - - workflows/** - -## alignment_to_fastq -alignment_to_fastq: - - conf/modules/alignment_to_fastq.config - - modules/nf-core/cat/fastq/** - - modules/nf-core/mosdepth/** - - modules/nf-core/samtools/collatefastq/** - - modules/nf-core/samtools/merge/** - - modules/nf-core/samtools/view/** - - 
subworkflows/local/bam_convert_samtools/** - - tests/csv/3.0/bam_for_remapping.csv - - tests/test_alignment_to_fastq.yml - ## umi umi: - conf/modules/umi.config @@ -124,33 +27,6 @@ fastp: ## aligner -### bwamem -bwamem: - - conf/modules/aligner.config - - modules/nf-core/bwa/mem/** - - modules/nf-core/mosdepth/** - - subworkflows/local/fastq_align_bwamem_mem2_dragmap_sentieon/** - - tests/csv/3.0/fastq_single.csv - - tests/test_aligner_bwamem.yml - -### bwamem2 -bwamem2: - - conf/modules/aligner.config - - modules/nf-core/bwamem2/mem/** - - modules/nf-core/mosdepth/** - - subworkflows/local/fastq_align_bwamem_mem2_dragmap_sentieon/** - - tests/csv/3.0/fastq_single.csv - - tests/test_aligner_bwamem2.yml - -### dragmap -dragmap: - - conf/modules/aligner.config - - modules/nf-core/dragmap/align/** - - modules/nf-core/mosdepth/** - - subworkflows/local/fastq_align_bwamem_mem2_dragmap_sentieon/** - - tests/csv/3.0/fastq_single.csv - - tests/test_aligner_dragmap.yml - ### sentieon/bwamem sentieon/bwamem: - conf/modules/aligner.config @@ -159,21 +35,6 @@ sentieon/bwamem: - tests/csv/3.0/fastq_single.csv - tests/test_sentieon_aligner_bwamem.yml -## markduplicates -gatk4/markduplicates: - - conf/modules/markduplicates.config - - modules/nf-core/gatk4/markduplicates/** - - modules/nf-core/mosdepth/** - - modules/nf-core/samtools/convert/** - - modules/nf-core/samtools/index/** - - modules/nf-core/samtools/stats/** - - subworkflows/local/bam_markduplicates/** - - subworkflows/local/cram_qc_mosdepth_samtools/** - - tests/csv/3.0/mapped_single_bam.csv - - tests/csv/3.0/mapped_single_cram.csv - - tests/test_markduplicates_from_bam.yml - - tests/test_markduplicates_from_cram.yml - ## sentieon/dedup sentieon/dedup: - conf/modules/sentieon_dedup.config @@ -188,34 +49,6 @@ sentieon/dedup: - tests/test_sentieon_dedup_from_bam.yml - tests/test_sentieon_dedup_from_cram.yml -## prepare_recalibration -prepare_recalibration: - - conf/modules/prepare_recalibration.config - - 
modules/nf-core/gatk4/baserecalibrator/** - - modules/nf-core/gatk4/gatherbqsrreports/** - - modules/nf-core/mosdepth/** - - modules/nf-core/samtools/convert/** - - subworkflows/local/bam_baserecalibrator/** - - tests/csv/3.0/mapped_single_bam.csv - - tests/csv/3.0/mapped_single_cram.csv - - tests/test_prepare_recalibration_from_bam.yml - - tests/test_prepare_recalibration_from_cram.yml - -## recalibrate -recalibrate: - - conf/modules/recalibrate.config - - modules/nf-core/gatk4/applybqsr/** - - modules/nf-core/samtools/convert/** - - modules/nf-core/samtools/index/** - - modules/nf-core/samtools/merge/** - - modules/nf-core/mosdepth/** - - subworkflows/local/bam_applybqsr/** - - subworkflows/local/cram_merge_index_samtools/** - - tests/csv/3.0/prepare_recalibration_single_bam.csv - - tests/csv/3.0/prepare_recalibration_single_cram.csv - - tests/test_recalibrate_from_bam.yml - - tests/test_recalibrate_from_cram.yml - ## intervals intervals: - conf/modules/prepare_intervals.config @@ -263,27 +96,6 @@ cnvkit: - tests/csv/3.0/recalibrated_tumoronly.csv - tests/test_cnvkit.yml -## controlfreec -controlfreec: - - conf/modules/controlfreec.config - - conf/modules/mpileup.config - - modules/nf-core/cat/cat/** - - modules/nf-core/controlfreec/assesssignificance/** - - modules/nf-core/controlfreec/freec/** - - modules/nf-core/controlfreec/freec2bed/** - - modules/nf-core/controlfreec/freec2circos/** - - modules/nf-core/controlfreec/makegraph2/** - - modules/nf-core/mosdepth/** - - modules/nf-core/samtools/mpileup/** - - subworkflows/local/bam_variant_calling_mpileup/** - - subworkflows/local/bam_variant_calling_somatic_all/** - - subworkflows/local/bam_variant_calling_somatic_controlfreec/** - - subworkflows/local/bam_variant_calling_tumor_only_all/** - - subworkflows/local/bam_variant_calling_tumor_only_controlfreec/** - - tests/csv/3.0/recalibrated_somatic.csv - - tests/csv/3.0/recalibrated_tumoronly.csv - - tests/test_controlfreec.yml - ## deepvariant deepvariant: - 
conf/modules/deepvariant.config @@ -502,39 +314,15 @@ mutect2: - tests/csv/3.0/recalibrated_tumoronly.csv - tests/test_mutect2.yml -## strelka -strelka: - - conf/modules/strelka.config - - modules/nf-core/gatk4/mergevcfs/** +## lofreq +lofreq: + - conf/modules/lofreq.config - modules/nf-core/mosdepth/** - - modules/nf-core/strelka/germline/** - - modules/nf-core/strelka/somatic/** - - subworkflows/local/bam_variant_calling_germline_all/** - - subworkflows/local/bam_variant_calling_single_strelka/** - - subworkflows/local/bam_variant_calling_somatic_all/** - - subworkflows/local/bam_variant_calling_somatic_strelka/** + - modules/nf-core/lofreq/callparallel/** - subworkflows/local/bam_variant_calling_tumor_only_all/** - - tests/csv/3.0/recalibrated_germline.csv - - tests/csv/3.0/recalibrated_somatic.csv + - subworkflows/local/bam_variant_calling_tumor_only_lofreq/** - tests/csv/3.0/recalibrated_tumoronly.csv - - tests/csv/3.0/recalibrated.csv - - tests/test_strelka.yml - -## strelka_bp -strelka_bp: - - conf/modules/manta.config - - conf/modules/strelka.config - - modules/nf-core/gatk4/mergevcfs/** - - modules/nf-core/manta/somatic/** - - modules/nf-core/mosdepth/** - - modules/nf-core/strelka/somatic/** - - subworkflows/local/bam_variant_calling_germline_all/** - - subworkflows/local/bam_variant_calling_somatic_all/** - - subworkflows/local/bam_variant_calling_somatic_manta/** - - subworkflows/local/bam_variant_calling_somatic_strelka/** - - subworkflows/local/bam_variant_calling_tumor_only_all/** - - tests/csv/3.0/recalibrated_somatic.csv - - tests/test_strelka_bp.yml + - tests/test_lofreq.yml ## tiddit tiddit: @@ -553,55 +341,6 @@ tiddit: - tests/csv/3.0/recalibrated_tumoronly.csv - tests/test_tiddit.yml -# annotate - -## cache -cache: - - conf/modules/prepare_cache.config - - modules/nf-core/ensemblvep/download/** - - modules/nf-core/snpeff/download/** - - subworkflows/local/prepare_cache/** - - tests/test_annotation_cache.yml - -## merge -merge: - - 
conf/modules/annotate.config - - modules/nf-core/ensemblvep/vep/** - - modules/nf-core/snpeff/snpeff/** - - modules/nf-core/tabix/bgziptabix/** - - subworkflows/local/vcf_annotate_all/** - - subworkflows/nf-core/vcf_annotate_ensemblvep/** - - subworkflows/nf-core/vcf_annotate_snpeff/** - - tests/csv/3.0/vcf_single.csv - - tests/test_annotation_merge.yml - -## snpeff -snpeff: - - conf/modules/annotate.config - - modules/nf-core/snpeff/snpeff/** - - modules/nf-core/tabix/bgziptabix/** - - subworkflows/nf-core/vcf_annotate_snpeff/** - - tests/csv/3.0/vcf_single.csv - - tests/test_annotation_snpeff.yml - -## vep -vep: - - conf/modules/annotate.config - - modules/nf-core/ensemblvep/vep/** - - modules/nf-core/tabix/bgziptabix/** - - subworkflows/nf-core/vcf_annotate_ensemblvep/** - - tests/csv/3.0/vcf_single.csv - - tests/test_annotation_vep.yml - -## bcfann -bcfann: - - conf/modules/annotate.config - - modules/nf-core/bcftools/annotate/** - - modules/nf-core/tabix/bgziptabix/** - - subworkflows/nf-core/vcf_annotate_bcftools/** - - tests/csv/3.0/vcf_single.csv - - tests/test_annotation_bcfann.yml - # postprocessing ## concatenate germline vcfs @@ -626,7 +365,6 @@ concatenate_vcfs: - subworkflows/local/bam_variant_calling_germline_manta/** - subworkflows/local/bam_variant_calling_haplotypecaller/** - subworkflows/local/bam_variant_calling_mpileup/** - - subworkflows/local/bam_variant_calling_single_strelka/** - subworkflows/local/bam_variant_calling_single_tiddit/** - subworkflows/local/bam_variant_calling_somatic_all/** - subworkflows/local/bam_variant_calling_tumor_only_all/** diff --git a/tests/csv/3.0/mapped_joint_bam.csv b/tests/csv/3.0/mapped_joint_bam.csv index 689393be00..1dc3920b1e 100644 --- a/tests/csv/3.0/mapped_joint_bam.csv +++ b/tests/csv/3.0/mapped_joint_bam.csv @@ -1,3 +1,3 @@ -patient,sample,bam,bai 
-testN,testN,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai -testT,testT,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/bam/test2.paired_end.sorted.bam,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/bam/test2.paired_end.sorted.bam.bai +patient,status,sample,bam,bai +testN,0,testN,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai +testT,0,testT,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/bam/test2.paired_end.sorted.bam,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/bam/test2.paired_end.sorted.bam.bai diff --git a/tests/default.nf.test b/tests/default.nf.test new file mode 100644 index 0000000000..8a1e9956be --- /dev/null +++ b/tests/default.nf.test @@ -0,0 +1,59 @@ +nextflow_pipeline { + + name "Test pipeline" + script "../main.nf" + tag "pipeline" + tag "pipeline_sarek" + + test("Run with profile test") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 
'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --input tests/csv/3.0/sample_with_space.csv") { + + when { + params { + input = "${projectDir}/tests/csv/3.0/sample_with_space.csv" + outdir = "$outputDir" + } + } + + then { + assertAll( + { assert workflow.failed}, + { assert workflow.stderr.toString().contains("Sample ID must be provided and cannot contain spaces") } + ) + } + } +} diff --git a/tests/default.nf.test.snap b/tests/default.nf.test.snap new file mode 100644 index 0000000000..5bf1b26c5e --- /dev/null +++ b/tests/default.nf.test.snap @@ -0,0 +1,325 @@ +{ + "Run with profile test": { + "content": [ + 23, + { + "BCFTOOLS_STATS": { + "bcftools": 1.2 + }, + "BWAMEM1_MEM": { + "bwa": "0.7.18-r1243-dirty", + "samtools": 1.2 + }, + "FASTQC": { + "fastqc": "0.12.1" + }, + "GATK4_APPLYBQSR": { + "gatk4": "4.5.0.0" + }, + "GATK4_BASERECALIBRATOR": { + "gatk4": "4.5.0.0" + }, + "GATK4_MARKDUPLICATES": { + "gatk4": "4.5.0.0", + "samtools": "1.19.2" + }, + "INDEX_CRAM": { + "samtools": 1.21 + }, + "MOSDEPTH": { + "mosdepth": "0.3.8" + }, + "SAMTOOLS_STATS": { + "samtools": 1.21 + }, + "STRELKA_SINGLE": { + "strelka": "2.9.10" + }, + 
"VCFTOOLS_TSTV_COUNT": { + "vcftools": "0.1.16" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/markduplicates.csv", + "csv/markduplicates_no_table.csv", + "csv/recalibrated.csv", + "csv/variantcalled.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/bcftools_stats_indel-lengths.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Indels.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_SNP.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transitions.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transversions.txt", + "multiqc/multiqc_data/fastqc-status-check-heatmap.txt", + "multiqc/multiqc_data/fastqc_adapter_content_plot.txt", + "multiqc/multiqc_data/fastqc_per_base_n_content_plot.txt", + "multiqc/multiqc_data/fastqc_per_base_sequence_quality_plot.txt", + "multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Counts.txt", + "multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Percentages.txt", + "multiqc/multiqc_data/fastqc_per_sequence_quality_scores_plot.txt", + "multiqc/multiqc_data/fastqc_sequence_counts_plot.txt", + "multiqc/multiqc_data/fastqc_sequence_duplication_levels_plot.txt", + "multiqc/multiqc_data/fastqc_sequence_length_distribution_plot.txt", + "multiqc/multiqc_data/fastqc_top_overrepresented_sequences_table.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-reported-empirical-plot.txt", + "multiqc/multiqc_data/gatk_base_recalibrator.txt", + "multiqc/multiqc_data/mosdepth-coverage-per-contig-single.txt", + "multiqc/multiqc_data/mosdepth-cumcoverage-dist-id.txt", + "multiqc/multiqc_data/mosdepth_cov_dist.txt", + "multiqc/multiqc_data/mosdepth_cumcov_dist.txt", + "multiqc/multiqc_data/mosdepth_perchrom.txt", + "multiqc/multiqc_data/multiqc.log", + 
"multiqc/multiqc_data/multiqc_bcftools_stats.txt", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_fastqc.txt", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_picard_dups.txt", + "multiqc/multiqc_data/multiqc_samtools_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/picard_deduplication.txt", + "multiqc/multiqc_data/picard_histogram.txt", + "multiqc/multiqc_data/picard_histogram_1.txt", + "multiqc/multiqc_data/picard_histogram_2.txt", + "multiqc/multiqc_data/samtools-stats-dp.txt", + "multiqc/multiqc_data/samtools_alignment_plot.txt", + "multiqc/multiqc_data/vcftools_tstv_by_count.txt", + "multiqc/multiqc_data/vcftools_tstv_by_qual.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_indel-lengths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Indels.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_SNP.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transitions.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transversions.pdf", + "multiqc/multiqc_plots/pdf/fastqc-status-check-heatmap.pdf", + "multiqc/multiqc_plots/pdf/fastqc_adapter_content_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_base_n_content_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_base_sequence_quality_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_sequence_gc_content_plot_Counts.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_sequence_gc_content_plot_Percentages.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_sequence_quality_scores_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_sequence_counts_plot-cnt.pdf", + "multiqc/multiqc_plots/pdf/fastqc_sequence_counts_plot-pct.pdf", + "multiqc/multiqc_plots/pdf/fastqc_sequence_duplication_levels_plot.pdf", + 
"multiqc/multiqc_plots/pdf/fastqc_sequence_length_distribution_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_top_overrepresented_sequences_table.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-reported-empirical-plot.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-coverage-per-contig-single-cnt.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-coverage-per-contig-single-pct.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-cumcoverage-dist-id.pdf", + "multiqc/multiqc_plots/pdf/picard_deduplication-cnt.pdf", + "multiqc/multiqc_plots/pdf/picard_deduplication-pct.pdf", + "multiqc/multiqc_plots/pdf/samtools-stats-dp.pdf", + "multiqc/multiqc_plots/pdf/samtools_alignment_plot-cnt.pdf", + "multiqc/multiqc_plots/pdf/samtools_alignment_plot-pct.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_count.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_qual.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/bcftools_stats_indel-lengths.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Indels.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_SNP.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transitions.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transversions.png", + "multiqc/multiqc_plots/png/fastqc-status-check-heatmap.png", + "multiqc/multiqc_plots/png/fastqc_adapter_content_plot.png", + "multiqc/multiqc_plots/png/fastqc_per_base_n_content_plot.png", + "multiqc/multiqc_plots/png/fastqc_per_base_sequence_quality_plot.png", + "multiqc/multiqc_plots/png/fastqc_per_sequence_gc_content_plot_Counts.png", + "multiqc/multiqc_plots/png/fastqc_per_sequence_gc_content_plot_Percentages.png", + "multiqc/multiqc_plots/png/fastqc_per_sequence_quality_scores_plot.png", + 
"multiqc/multiqc_plots/png/fastqc_sequence_counts_plot-cnt.png", + "multiqc/multiqc_plots/png/fastqc_sequence_counts_plot-pct.png", + "multiqc/multiqc_plots/png/fastqc_sequence_duplication_levels_plot.png", + "multiqc/multiqc_plots/png/fastqc_sequence_length_distribution_plot.png", + "multiqc/multiqc_plots/png/fastqc_top_overrepresented_sequences_table.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-reported-empirical-plot.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/mosdepth-coverage-per-contig-single-cnt.png", + "multiqc/multiqc_plots/png/mosdepth-coverage-per-contig-single-pct.png", + "multiqc/multiqc_plots/png/mosdepth-cumcoverage-dist-id.png", + "multiqc/multiqc_plots/png/picard_deduplication-cnt.png", + "multiqc/multiqc_plots/png/picard_deduplication-pct.png", + "multiqc/multiqc_plots/png/samtools-stats-dp.png", + "multiqc/multiqc_plots/png/samtools_alignment_plot-cnt.png", + "multiqc/multiqc_plots/png/samtools_alignment_plot-pct.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_count.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_qual.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/bcftools_stats_indel-lengths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Indels.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_SNP.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transitions.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transversions.svg", + "multiqc/multiqc_plots/svg/fastqc-status-check-heatmap.svg", + "multiqc/multiqc_plots/svg/fastqc_adapter_content_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_per_base_n_content_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_per_base_sequence_quality_plot.svg", + 
"multiqc/multiqc_plots/svg/fastqc_per_sequence_gc_content_plot_Counts.svg", + "multiqc/multiqc_plots/svg/fastqc_per_sequence_gc_content_plot_Percentages.svg", + "multiqc/multiqc_plots/svg/fastqc_per_sequence_quality_scores_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_sequence_counts_plot-cnt.svg", + "multiqc/multiqc_plots/svg/fastqc_sequence_counts_plot-pct.svg", + "multiqc/multiqc_plots/svg/fastqc_sequence_duplication_levels_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_sequence_length_distribution_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_top_overrepresented_sequences_table.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-reported-empirical-plot.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/mosdepth-coverage-per-contig-single-cnt.svg", + "multiqc/multiqc_plots/svg/mosdepth-coverage-per-contig-single-pct.svg", + "multiqc/multiqc_plots/svg/mosdepth-cumcoverage-dist-id.svg", + "multiqc/multiqc_plots/svg/picard_deduplication-cnt.svg", + "multiqc/multiqc_plots/svg/picard_deduplication-pct.svg", + "multiqc/multiqc_plots/svg/samtools-stats-dp.svg", + "multiqc/multiqc_plots/svg/samtools_alignment_plot-cnt.svg", + "multiqc/multiqc_plots/svg/samtools_alignment_plot-pct.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_count.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_qual.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/markduplicates", + "preprocessing/markduplicates/test", + "preprocessing/markduplicates/test/test.md.cram", + "preprocessing/markduplicates/test/test.md.cram.crai", + "preprocessing/recal_table", + "preprocessing/recal_table/test", + "preprocessing/recal_table/test/test.recal.table", 
+ "preprocessing/recalibrated", + "preprocessing/recalibrated/test", + "preprocessing/recalibrated/test/test.recal.cram", + "preprocessing/recalibrated/test/test.recal.cram.crai", + "reference", + "reports", + "reports/bcftools", + "reports/bcftools/strelka", + "reports/bcftools/strelka/test", + "reports/bcftools/strelka/test/test.strelka.variants.bcftools_stats.txt", + "reports/fastqc", + "reports/fastqc/test-test_L1", + "reports/fastqc/test-test_L1/test-test_L1_1_fastqc.html", + "reports/fastqc/test-test_L1/test-test_L1_1_fastqc.zip", + "reports/fastqc/test-test_L1/test-test_L1_2_fastqc.html", + "reports/fastqc/test-test_L1/test-test_L1_2_fastqc.zip", + "reports/fastqc/test-test_L2", + "reports/fastqc/test-test_L2/test-test_L2_1_fastqc.html", + "reports/fastqc/test-test_L2/test-test_L2_1_fastqc.zip", + "reports/fastqc/test-test_L2/test-test_L2_2_fastqc.html", + "reports/fastqc/test-test_L2/test-test_L2_2_fastqc.zip", + "reports/markduplicates", + "reports/markduplicates/test", + "reports/markduplicates/test/test.md.cram.metrics", + "reports/mosdepth", + "reports/mosdepth/test", + "reports/mosdepth/test/test.md.mosdepth.global.dist.txt", + "reports/mosdepth/test/test.md.mosdepth.region.dist.txt", + "reports/mosdepth/test/test.md.mosdepth.summary.txt", + "reports/mosdepth/test/test.md.regions.bed.gz", + "reports/mosdepth/test/test.md.regions.bed.gz.csi", + "reports/mosdepth/test/test.recal.mosdepth.global.dist.txt", + "reports/mosdepth/test/test.recal.mosdepth.region.dist.txt", + "reports/mosdepth/test/test.recal.mosdepth.summary.txt", + "reports/mosdepth/test/test.recal.regions.bed.gz", + "reports/mosdepth/test/test.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/test", + "reports/samtools/test/test.md.cram.stats", + "reports/samtools/test/test.recal.cram.stats", + "reports/vcftools", + "reports/vcftools/strelka", + "reports/vcftools/strelka/test", + "reports/vcftools/strelka/test/test.strelka.variants.FILTER.summary", + 
"reports/vcftools/strelka/test/test.strelka.variants.TsTv.count", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.qual", + "variant_calling", + "variant_calling/strelka", + "variant_calling/strelka/test", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz.tbi", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz.tbi" + ], + [ + "bcftools_stats_indel-lengths.txt:md5,deccb75341ca46a6f09658f7fd9e348b", + "bcftools_stats_vqc_Count_Indels.txt:md5,7b2a64880b653ccf0400ed9073e290dd", + "bcftools_stats_vqc_Count_SNP.txt:md5,72e934e0e8ed9b9712105bbd66dd9ffd", + "bcftools_stats_vqc_Count_Transitions.txt:md5,72e934e0e8ed9b9712105bbd66dd9ffd", + "bcftools_stats_vqc_Count_Transversions.txt:md5,72e934e0e8ed9b9712105bbd66dd9ffd", + "fastqc-status-check-heatmap.txt:md5,a020b9689ddeb4abec16b4854fe452f1", + "fastqc_adapter_content_plot.txt:md5,2e1b72be741319e7fadbbb39d7e5b37d", + "fastqc_per_base_n_content_plot.txt:md5,ad3b971a6bb4e8ba6c844c8a03584eb8", + "fastqc_per_base_sequence_quality_plot.txt:md5,1bc03889d243a944253ac637d81ae10c", + "fastqc_per_sequence_gc_content_plot_Counts.txt:md5,2c42d140ce06c08dad2b58f397c23239", + "fastqc_per_sequence_gc_content_plot_Percentages.txt:md5,59e22821d350bfb97c37ffd9088f5ad9", + "fastqc_per_sequence_quality_scores_plot.txt:md5,f33615cc98bb6225f39545a415fa7c0f", + "fastqc_sequence_counts_plot.txt:md5,7f0f19a58e8e54e792a751fd04a9ae13", + "fastqc_sequence_duplication_levels_plot.txt:md5,92b02e250ff78725deb9a10d510fcecc", + "fastqc_sequence_length_distribution_plot.txt:md5,fb04dce68ec566314125bc9438211b28", + "fastqc_top_overrepresented_sequences_table.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt:md5,20b2630a7400c9c279bf8c0c66341f7d", + 
"gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt:md5,b9b943621e50c7f3e75a37871667d5ed", + "mosdepth-coverage-per-contig-single.txt:md5,264db67a99d2c90ea7b075e33c201b77", + "mosdepth-cumcoverage-dist-id.txt:md5,5235e965da7ebe3bfebb24ffa88defff", + "mosdepth_cov_dist.txt:md5,8d0d7cb485a7bffb07da17b28f827120", + "mosdepth_cumcov_dist.txt:md5,8d0d7cb485a7bffb07da17b28f827120", + "mosdepth_perchrom.txt:md5,264db67a99d2c90ea7b075e33c201b77", + "multiqc_bcftools_stats.txt:md5,103ba59d44fc60e9308e64bbd0d0e504", + "multiqc_citations.txt:md5,ace4ca89138a5f1e2be289c157c00bd9", + "multiqc_fastqc.txt:md5,bde0d0bffa62228b33fb68b7e25b6ff8", + "multiqc_samtools_stats.txt:md5,0f1e4c6c497d9a952765f9f3068ea4b9", + "picard_histogram.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "picard_histogram_1.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "picard_histogram_2.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "samtools-stats-dp.txt:md5,c94f4d3ffa3f510552f90e173fdd9f9d", + "samtools_alignment_plot.txt:md5,717f499a3543e7ee4c7a8454bf80aeca", + "vcftools_tstv_by_count.txt:md5,50efc5214fe2c39f21efb66a710d2ed6", + "test.strelka.variants.bcftools_stats.txt:md5,86bd4938eed920d36f3f5937102a2967", + "test.md.mosdepth.global.dist.txt:md5,b61e1acee11a6ddf7ce3232a5948a6a0", + "test.md.mosdepth.region.dist.txt:md5,1a382f98d488d2ae3df83a0d87caafc1", + "test.md.mosdepth.summary.txt:md5,839108358878ada89e1eaddf6e0541ba", + "test.md.regions.bed.gz:md5,6fdaec99e739dc0f47fe55dd64dfe93e", + "test.md.regions.bed.gz.csi:md5,5f9c60279af78e3aeafc96a8c11fb35f", + "test.recal.mosdepth.global.dist.txt:md5,b61e1acee11a6ddf7ce3232a5948a6a0", + "test.recal.mosdepth.region.dist.txt:md5,1a382f98d488d2ae3df83a0d87caafc1", + "test.recal.mosdepth.summary.txt:md5,839108358878ada89e1eaddf6e0541ba", + "test.recal.regions.bed.gz:md5,6fdaec99e739dc0f47fe55dd64dfe93e", + "test.recal.regions.bed.gz.csi:md5,5f9c60279af78e3aeafc96a8c11fb35f", + "test.md.cram.stats:md5,7d19da3fc342afe0884c944f97a578b1", + 
"test.recal.cram.stats:md5,820d123e746d1abdc90fd8710828082e", + "test.strelka.variants.FILTER.summary:md5,ad417bc96d31223f61170987975d8128", + "test.strelka.variants.TsTv.count:md5,fa27f678965b7cba6a92efcd039f802a" + ], + [ + [ + "test.md.cram", + "724c601c9daf019d356a53a7d5e1c8b1" + ], + [ + "test.recal.cram", + "dbd6f40b1e6d72501dc034e62e9d54eb" + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-11-09T21:41:27.897118" + } +} diff --git a/tests/main.nf.test b/tests/main.nf.test deleted file mode 100644 index 53e3dc0ca6..0000000000 --- a/tests/main.nf.test +++ /dev/null @@ -1,26 +0,0 @@ -nextflow_pipeline { - - name "Test pipeline" - script "../main.nf" - tag "pipeline" - tag "pipeline_sarek" - - test("Run with profile test") { - - when { - params { - outdir = "results" - max_cpus = 2 - max_memory = '6.GB' - max_time = '6.h' - input = "$projectDir/tests/csv/3.0/fastq_pair.csv" - } - } - - then { - assertAll( - { assert workflow.success } - ) - } - } -} diff --git a/tests/save_output_as_bam.nf.test b/tests/save_output_as_bam.nf.test new file mode 100644 index 0000000000..d19683a971 --- /dev/null +++ b/tests/save_output_as_bam.nf.test @@ -0,0 +1,43 @@ +nextflow_pipeline { + + name "Test pipeline" + script "../main.nf" + tag "pipeline" + tag "pipeline_sarek" + + test("Run with profile test | --save_output_as_bam | skip QC/recal/md") { + + when { + params { + outdir = "$outputDir" + save_output_as_bam = true + skip_tools = 'baserecalibrator,fastqc,markduplicates,mosdepth,multiqc,samtools' + tools = '' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // bam_files: All bam files + def bam_files 
= getAllFilesFromDir(params.outdir, include: ['**/*.bam']) + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All bam files + bam_files.collect{ file -> [ file.getName(), bam(file.toString()).getReadsMD5() ] } + ).match() } + ) + } + } +} diff --git a/tests/save_output_as_bam.nf.test.snap b/tests/save_output_as_bam.nf.test.snap new file mode 100644 index 0000000000..55f6083979 --- /dev/null +++ b/tests/save_output_as_bam.nf.test.snap @@ -0,0 +1,51 @@ +{ + "Run with profile test | --save_output_as_bam | skip QC/recal/md": { + "content": [ + 10, + { + "BAM_TO_CRAM_MAPPING": { + "samtools": 1.21 + }, + "BWAMEM1_MEM": { + "bwa": "0.7.18-r1243-dirty", + "samtools": 1.2 + }, + "INDEX_MERGE_BAM": { + "samtools": 1.21 + }, + "MERGE_BAM": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/mapped.csv", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/mapped", + "preprocessing/mapped/test", + "preprocessing/mapped/test/test.sorted.bam", + "preprocessing/mapped/test/test.sorted.bam.bai", + "reference" + ], + [ + + ], + [ + [ + "test.sorted.bam", + "5534c350547fd253f0f2b9450362bed" + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.09.0" + }, + "timestamp": "2024-10-08T11:11:44.283548" + } +} diff --git a/tests/saved_mapped.nf.test b/tests/saved_mapped.nf.test new file mode 100644 index 0000000000..06e5524e50 --- /dev/null +++ b/tests/saved_mapped.nf.test @@ -0,0 +1,43 @@ +nextflow_pipeline { + + name "Test pipeline" + script 
"../main.nf" + tag "pipeline" + tag "pipeline_sarek" + + test("Run with profile test | --save_mapped | skip QC/recal/md") { + + when { + params { + outdir = "$outputDir" + save_mapped = true + skip_tools = 'baserecalibrator,fastqc,markduplicates,mosdepth,multiqc,samtools' + tools = '' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // bam_files: All bam files + def bam_files = getAllFilesFromDir(params.outdir, include: ['**/*.bam']) + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All bam files + bam_files.collect{ file -> [ file.getName(), bam(file.toString()).getReadsMD5() ] } + ).match() } + ) + } + } +} diff --git a/tests/saved_mapped.nf.test.snap b/tests/saved_mapped.nf.test.snap new file mode 100644 index 0000000000..987f214d70 --- /dev/null +++ b/tests/saved_mapped.nf.test.snap @@ -0,0 +1,48 @@ +{ + "Run with profile test | --save_mapped | skip QC/recal/md": { + "content": [ + 10, + { + "BAM_TO_CRAM_MAPPING": { + "samtools": 1.21 + }, + "BWAMEM1_MEM": { + "bwa": "0.7.18-r1243-dirty", + "samtools": 1.2 + }, + "INDEX_MERGE_BAM": { + "samtools": 1.21 + }, + "MERGE_BAM": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/mapped.csv", + "pipeline_info", + 
"pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/mapped", + "preprocessing/mapped/test", + "preprocessing/mapped/test/test.sorted.cram", + "preprocessing/mapped/test/test.sorted.cram.crai", + "reference" + ], + [ + + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.09.0" + }, + "timestamp": "2024-10-08T09:53:32.063494" + } +} diff --git a/tests/sentieon.nf.test b/tests/sentieon.nf.test new file mode 100644 index 0000000000..2a774e0803 --- /dev/null +++ b/tests/sentieon.nf.test @@ -0,0 +1,48 @@ +nextflow_pipeline { + + name "Test sentieon" + script "../main.nf" + tag "pipeline" + tag "pipeline_sarek" + tag "sentieon" + + test("stub") { + + tag "stub" + + options "-stub" + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + aligner = "sentieon-bwamem" + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All 
cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } +} diff --git a/tests/sentieon.nf.test.snap b/tests/sentieon.nf.test.snap new file mode 100644 index 0000000000..4c66400e96 --- /dev/null +++ b/tests/sentieon.nf.test.snap @@ -0,0 +1,301 @@ +{ + "stub": { + "content": [ + 23, + { + "BCFTOOLS_STATS": { + "bcftools": 1.2 + }, + "FASTQC": { + "fastqc": "0.12.1" + }, + "GATK4_APPLYBQSR": { + "gatk4": "4.5.0.0" + }, + "GATK4_BASERECALIBRATOR": { + "gatk4": "4.5.0.0" + }, + "GATK4_MARKDUPLICATES": { + "gatk4": "4.5.0.0", + "samtools": "1.19.2" + }, + "INDEX_CRAM": { + "samtools": 1.21 + }, + "MOSDEPTH": { + "mosdepth": "0.3.8" + }, + "SAMTOOLS_STATS": { + "samtools": 1.21 + }, + "SENTIEON_BWAMEM": { + "sentieon": 202308.03, + "bwa": "0.7.17-r1188" + }, + "STRELKA_SINGLE": { + "strelka": "2.9.10" + }, + "VCFTOOLS_TSTV_COUNT": { + "vcftools": "0.1.16" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/markduplicates.csv", + "csv/markduplicates_no_table.csv", + "csv/recalibrated.csv", + "csv/variantcalled.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_plots", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/markduplicates", + "preprocessing/markduplicates/test", + "preprocessing/markduplicates/test/test.cram", + "preprocessing/markduplicates/test/test.cram.crai", + "preprocessing/recal_table", + "preprocessing/recal_table/test", + "preprocessing/recal_table/test/test.recal.table", + "preprocessing/recalibrated", + "preprocessing/recalibrated/test", + "preprocessing/recalibrated/test/test.recal.cram", + "preprocessing/recalibrated/test/test.recal.cram.crai", + "reference", + "reports", + "reports/bcftools", + "reports/bcftools/strelka", + "reports/bcftools/strelka/test", + "reports/bcftools/strelka/test/test.strelka.variants.bcftools_stats.txt", + 
"reports/fastqc", + "reports/fastqc/test-test_L1", + "reports/fastqc/test-test_L1/test-test_L1.html", + "reports/fastqc/test-test_L1/test-test_L1.zip", + "reports/fastqc/test-test_L2", + "reports/fastqc/test-test_L2/test-test_L2.html", + "reports/fastqc/test-test_L2/test-test_L2.zip", + "reports/markduplicates", + "reports/markduplicates/test", + "reports/markduplicates/test/test.md.cram.metrics", + "reports/mosdepth", + "reports/mosdepth/test", + "reports/mosdepth/test/test.md.global.dist.txt", + "reports/mosdepth/test/test.md.per-base.bed.gz", + "reports/mosdepth/test/test.md.per-base.bed.gz.csi", + "reports/mosdepth/test/test.md.per-base.d4", + "reports/mosdepth/test/test.md.quantized.bed.gz", + "reports/mosdepth/test/test.md.quantized.bed.gz.csi", + "reports/mosdepth/test/test.md.region.dist.txt", + "reports/mosdepth/test/test.md.regions.bed.gz", + "reports/mosdepth/test/test.md.regions.bed.gz.csi", + "reports/mosdepth/test/test.md.summary.txt", + "reports/mosdepth/test/test.md.thresholds.bed.gz", + "reports/mosdepth/test/test.md.thresholds.bed.gz.csi", + "reports/mosdepth/test/test.recal.global.dist.txt", + "reports/mosdepth/test/test.recal.per-base.bed.gz", + "reports/mosdepth/test/test.recal.per-base.bed.gz.csi", + "reports/mosdepth/test/test.recal.per-base.d4", + "reports/mosdepth/test/test.recal.quantized.bed.gz", + "reports/mosdepth/test/test.recal.quantized.bed.gz.csi", + "reports/mosdepth/test/test.recal.region.dist.txt", + "reports/mosdepth/test/test.recal.regions.bed.gz", + "reports/mosdepth/test/test.recal.regions.bed.gz.csi", + "reports/mosdepth/test/test.recal.summary.txt", + "reports/mosdepth/test/test.recal.thresholds.bed.gz", + "reports/mosdepth/test/test.recal.thresholds.bed.gz.csi", + "reports/samtools", + "reports/samtools/test", + "reports/samtools/test/test.md.cram.stats", + "reports/samtools/test/test.recal.cram.stats", + "reports/vcftools", + "reports/vcftools/strelka", + "reports/vcftools/strelka/test", + 
"reports/vcftools/strelka/test/test.strelka.variants.012", + "reports/vcftools/strelka/test/test.strelka.variants.012.indv", + "reports/vcftools/strelka/test/test.strelka.variants.012.pos", + "reports/vcftools/strelka/test/test.strelka.variants.BEAGLE.GL", + "reports/vcftools/strelka/test/test.strelka.variants.BEAGLE.PL", + "reports/vcftools/strelka/test/test.strelka.variants.FILTER.summary", + "reports/vcftools/strelka/test/test.strelka.variants.FORMAT", + "reports/vcftools/strelka/test/test.strelka.variants.INFO", + "reports/vcftools/strelka/test/test.strelka.variants.LROH", + "reports/vcftools/strelka/test/test.strelka.variants.Tajima.D", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.count", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.qual", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.summary", + "reports/vcftools/strelka/test/test.strelka.variants.bcf", + "reports/vcftools/strelka/test/test.strelka.variants.diff.discordance.matrix", + "reports/vcftools/strelka/test/test.strelka.variants.diff.indv", + "reports/vcftools/strelka/test/test.strelka.variants.diff.indv_in_files", + "reports/vcftools/strelka/test/test.strelka.variants.diff.sites", + "reports/vcftools/strelka/test/test.strelka.variants.diff.sites_in_files", + "reports/vcftools/strelka/test/test.strelka.variants.diff.switch", + "reports/vcftools/strelka/test/test.strelka.variants.frq", + "reports/vcftools/strelka/test/test.strelka.variants.frq.count", + "reports/vcftools/strelka/test/test.strelka.variants.gdepth", + "reports/vcftools/strelka/test/test.strelka.variants.geno.chisq", + "reports/vcftools/strelka/test/test.strelka.variants.geno.ld", + "reports/vcftools/strelka/test/test.strelka.variants.hap.ld", + "reports/vcftools/strelka/test/test.strelka.variants.hapcount", + "reports/vcftools/strelka/test/test.strelka.variants.het", + "reports/vcftools/strelka/test/test.strelka.variants.hwe", + 
"reports/vcftools/strelka/test/test.strelka.variants.idepth", + "reports/vcftools/strelka/test/test.strelka.variants.ifreqburden", + "reports/vcftools/strelka/test/test.strelka.variants.imiss", + "reports/vcftools/strelka/test/test.strelka.variants.impute.hap", + "reports/vcftools/strelka/test/test.strelka.variants.impute.hap.indv", + "reports/vcftools/strelka/test/test.strelka.variants.impute.hap.legend", + "reports/vcftools/strelka/test/test.strelka.variants.indel.hist", + "reports/vcftools/strelka/test/test.strelka.variants.interchrom.geno.ld", + "reports/vcftools/strelka/test/test.strelka.variants.interchrom.hap.ld", + "reports/vcftools/strelka/test/test.strelka.variants.kept.sites", + "reports/vcftools/strelka/test/test.strelka.variants.ldepth", + "reports/vcftools/strelka/test/test.strelka.variants.ldepth.mean", + "reports/vcftools/strelka/test/test.strelka.variants.ldhat.locs", + "reports/vcftools/strelka/test/test.strelka.variants.ldhat.sites", + "reports/vcftools/strelka/test/test.strelka.variants.list.geno.ld", + "reports/vcftools/strelka/test/test.strelka.variants.list.hap.ld", + "reports/vcftools/strelka/test/test.strelka.variants.lmiss", + "reports/vcftools/strelka/test/test.strelka.variants.lqual", + "reports/vcftools/strelka/test/test.strelka.variants.map", + "reports/vcftools/strelka/test/test.strelka.variants.mendel", + "reports/vcftools/strelka/test/test.strelka.variants.ped", + "reports/vcftools/strelka/test/test.strelka.variants.relatedness", + "reports/vcftools/strelka/test/test.strelka.variants.relatedness2", + "reports/vcftools/strelka/test/test.strelka.variants.removed.sites", + "reports/vcftools/strelka/test/test.strelka.variants.singletons", + "reports/vcftools/strelka/test/test.strelka.variants.sites.pi", + "reports/vcftools/strelka/test/test.strelka.variants.snpden", + "reports/vcftools/strelka/test/test.strelka.variants.tfam", + "reports/vcftools/strelka/test/test.strelka.variants.tped", + 
"reports/vcftools/strelka/test/test.strelka.variants.vcf", + "reports/vcftools/strelka/test/test.strelka.variants.weir.fst", + "reports/vcftools/strelka/test/test.strelka.variants.windowed.pi", + "variant_calling", + "variant_calling/strelka", + "variant_calling/strelka/test", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz.tbi", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz.tbi" + ], + [ + "test.cram:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.cram.crai:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.bcftools_stats.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "test-test_L1.html:md5,d41d8cd98f00b204e9800998ecf8427e", + "test-test_L1.zip:md5,d41d8cd98f00b204e9800998ecf8427e", + "test-test_L2.html:md5,d41d8cd98f00b204e9800998ecf8427e", + "test-test_L2.zip:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.md.global.dist.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.md.per-base.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test.md.per-base.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.md.per-base.d4:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.md.quantized.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test.md.quantized.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.md.region.dist.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.md.regions.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test.md.regions.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.md.summary.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.md.thresholds.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test.md.thresholds.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.recal.global.dist.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.recal.per-base.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test.recal.per-base.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + 
"test.recal.per-base.d4:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.recal.quantized.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test.recal.quantized.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.recal.region.dist.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.recal.regions.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test.recal.regions.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.recal.summary.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.recal.thresholds.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "test.recal.thresholds.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.md.cram.stats:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.recal.cram.stats:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.012:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.012.indv:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.012.pos:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.BEAGLE.GL:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.BEAGLE.PL:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.FILTER.summary:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.FORMAT:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.INFO:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.LROH:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.Tajima.D:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.TsTv:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.TsTv.count:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.TsTv.summary:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.bcf:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.diff.discordance.matrix:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.diff.indv:md5,d41d8cd98f00b204e9800998ecf8427e", + 
"test.strelka.variants.diff.indv_in_files:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.diff.sites:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.diff.sites_in_files:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.diff.switch:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.frq:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.frq.count:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.gdepth:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.geno.chisq:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.geno.ld:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.hap.ld:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.hapcount:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.het:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.hwe:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.idepth:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.ifreqburden:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.imiss:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.impute.hap:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.impute.hap.indv:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.impute.hap.legend:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.indel.hist:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.interchrom.geno.ld:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.interchrom.hap.ld:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.kept.sites:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.ldepth:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.ldepth.mean:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.ldhat.locs:md5,d41d8cd98f00b204e9800998ecf8427e", + 
"test.strelka.variants.ldhat.sites:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.list.geno.ld:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.list.hap.ld:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.lmiss:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.lqual:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.map:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.mendel:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.ped:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.relatedness:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.relatedness2:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.removed.sites:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.singletons:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.sites.pi:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.snpden:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.tfam:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.tped:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.vcf:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.weir.fst:md5,d41d8cd98f00b204e9800998ecf8427e", + "test.strelka.variants.windowed.pi:md5,d41d8cd98f00b204e9800998ecf8427e" + ], + [ + [ + "test.cram", + "d41d8cd98f00b204e9800998ecf8427e" + ], + [ + "test.recal.cram", + "d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "meta": { + "nf-test": "0.9.2", + "nextflow": "24.10.2" + }, + "timestamp": "2024-12-09T10:15:55.105507" + } +} diff --git a/tests/start_from_markduplicates.nf.test b/tests/start_from_markduplicates.nf.test new file mode 100644 index 0000000000..b9bc2f4434 --- /dev/null +++ b/tests/start_from_markduplicates.nf.test @@ -0,0 +1,162 @@ +nextflow_pipeline { + + name "Test pipeline when starting from markduplicates" + script "../main.nf" + tag "pipeline" + tag 
"pipeline_sarek" + + test("Run with profile test | --input tests/csv/3.0/mapped_single_bam.csv --step markduplicates --tools null ") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/mapped_single_bam.csv" + step = 'markduplicates' + tools = null + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we test pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path names, as relative paths + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --input tests/csv/3.0/mapped_single_bam.csv --step markduplicates --skip_tools markduplicates --tools null ") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/mapped_single_bam.csv" + step = 'markduplicates' + skip_tools = 
"markduplicates" + tools = null + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we test pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path names, as relative paths + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + + test("Run with profile test | --input tests/csv/3.0/mapped_single_cram.csv --step markduplicates --tools null ") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/mapped_single_cram.csv" + step = 'markduplicates' + tools = null + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') 
+ // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we test pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path names, as relative paths + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --input tests/csv/3.0/mapped_single_cram.csv --step markduplicates --skip_tools markduplicates --tools null ") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/mapped_single_cram.csv" + step = 'markduplicates' + skip_tools = "markduplicates" + tools = null + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline 
versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } +} diff --git a/tests/start_from_markduplicates.nf.test.snap b/tests/start_from_markduplicates.nf.test.snap new file mode 100644 index 0000000000..def63ca189 --- /dev/null +++ b/tests/start_from_markduplicates.nf.test.snap @@ -0,0 +1,633 @@ +{ + "Run with profile test | --input tests/csv/3.0/mapped_single_bam.csv --step markduplicates --skip_tools markduplicates --tools null ": { + "content": [ + 9, + { + "GATK4_APPLYBQSR": { + "gatk4": "4.5.0.0" + }, + "GATK4_BASERECALIBRATOR": { + "gatk4": "4.5.0.0" + }, + "MOSDEPTH": { + "mosdepth": "0.3.8" + }, + "SAMTOOLS_STATS": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-reported-empirical-plot.txt", + "multiqc/multiqc_data/gatk_base_recalibrator.txt", + "multiqc/multiqc_data/mosdepth-coverage-per-contig-single.txt", + "multiqc/multiqc_data/mosdepth-cumcoverage-dist-id.txt", + "multiqc/multiqc_data/mosdepth_cov_dist.txt", + "multiqc/multiqc_data/mosdepth_cumcov_dist.txt", + "multiqc/multiqc_data/mosdepth_perchrom.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + 
"multiqc/multiqc_data/multiqc_samtools_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/samtools-stats-dp.txt", + "multiqc/multiqc_data/samtools_alignment_plot.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-reported-empirical-plot.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-coverage-per-contig-single-cnt.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-coverage-per-contig-single-pct.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-cumcoverage-dist-id.pdf", + "multiqc/multiqc_plots/pdf/samtools-stats-dp.pdf", + "multiqc/multiqc_plots/pdf/samtools_alignment_plot-cnt.pdf", + "multiqc/multiqc_plots/pdf/samtools_alignment_plot-pct.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-reported-empirical-plot.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/mosdepth-coverage-per-contig-single-cnt.png", + "multiqc/multiqc_plots/png/mosdepth-coverage-per-contig-single-pct.png", + "multiqc/multiqc_plots/png/mosdepth-cumcoverage-dist-id.png", + "multiqc/multiqc_plots/png/samtools-stats-dp.png", + "multiqc/multiqc_plots/png/samtools_alignment_plot-cnt.png", + "multiqc/multiqc_plots/png/samtools_alignment_plot-pct.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.svg", + 
"multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-reported-empirical-plot.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/mosdepth-coverage-per-contig-single-cnt.svg", + "multiqc/multiqc_plots/svg/mosdepth-coverage-per-contig-single-pct.svg", + "multiqc/multiqc_plots/svg/mosdepth-cumcoverage-dist-id.svg", + "multiqc/multiqc_plots/svg/samtools-stats-dp.svg", + "multiqc/multiqc_plots/svg/samtools_alignment_plot-cnt.svg", + "multiqc/multiqc_plots/svg/samtools_alignment_plot-pct.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/recal_table", + "preprocessing/recal_table/test", + "preprocessing/recal_table/test/test.recal.table", + "reference", + "reports", + "reports/mosdepth", + "reports/mosdepth/test", + "reports/mosdepth/test/test.sorted.mosdepth.global.dist.txt", + "reports/mosdepth/test/test.sorted.mosdepth.region.dist.txt", + "reports/mosdepth/test/test.sorted.mosdepth.summary.txt", + "reports/mosdepth/test/test.sorted.regions.bed.gz", + "reports/mosdepth/test/test.sorted.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/test", + "reports/samtools/test/test.sorted.cram.stats" + ], + [ + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt:md5,d2650a5bec510d798e347f36a4d00e2d", + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt:md5,f2123f7ee3d060c1547efc6247a04e42", + "mosdepth-coverage-per-contig-single.txt:md5,8b48f3336b063dcb1e086928b28a2cc6", + "mosdepth-cumcoverage-dist-id.txt:md5,3148977f0c4684ba681ee298d677fe38", + "mosdepth_cov_dist.txt:md5,9a531d5a5c05e568a1aeb2e738ac23c4", + "mosdepth_cumcov_dist.txt:md5,9a531d5a5c05e568a1aeb2e738ac23c4", + "mosdepth_perchrom.txt:md5,8b48f3336b063dcb1e086928b28a2cc6", + 
"multiqc_citations.txt:md5,7d0b4b866fa577272c48a1f3ad72e75d", + "multiqc_samtools_stats.txt:md5,7f5f43de35c194be7f5980b62eacfab7", + "samtools-stats-dp.txt:md5,85c4ca7a3a6f2534d4d329937be49966", + "samtools_alignment_plot.txt:md5,301dda049c8aa2f848c98c81f584c315", + "test.sorted.mosdepth.global.dist.txt:md5,bdb8f185c35dd1eec7ce2f69bce57972", + "test.sorted.mosdepth.region.dist.txt:md5,f1f1ad86fc280bced1888a5d7d25a3f2", + "test.sorted.mosdepth.summary.txt:md5,32ea70ef1b99def3dc900b4afd513a40", + "test.sorted.regions.bed.gz:md5,07bbc084a889f1cece4307fd00214a6e", + "test.sorted.regions.bed.gz.csi:md5,c5d0be930ffc9e562f21519a0d488d5d", + "test.sorted.cram.stats:md5,a15b3a5e59337db312d66020c7bb93ac" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-11-09T16:15:20.865586" + }, + "Run with profile test | --input tests/csv/3.0/mapped_single_bam.csv --step markduplicates --tools null ": { + "content": [ + 13, + { + "GATK4_APPLYBQSR": { + "gatk4": "4.5.0.0" + }, + "GATK4_BASERECALIBRATOR": { + "gatk4": "4.5.0.0" + }, + "GATK4_MARKDUPLICATES": { + "gatk4": "4.5.0.0", + "samtools": "1.19.2" + }, + "INDEX_CRAM": { + "samtools": 1.21 + }, + "MOSDEPTH": { + "mosdepth": "0.3.8" + }, + "SAMTOOLS_STATS": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/markduplicates.csv", + "csv/markduplicates_no_table.csv", + "csv/recalibrated.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-reported-empirical-plot.txt", + "multiqc/multiqc_data/gatk_base_recalibrator.txt", + "multiqc/multiqc_data/mosdepth-coverage-per-contig-single.txt", + "multiqc/multiqc_data/mosdepth-cumcoverage-dist-id.txt", + "multiqc/multiqc_data/mosdepth_cov_dist.txt", + 
"multiqc/multiqc_data/mosdepth_cumcov_dist.txt", + "multiqc/multiqc_data/mosdepth_perchrom.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_picard_dups.txt", + "multiqc/multiqc_data/multiqc_samtools_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/picard_deduplication.txt", + "multiqc/multiqc_data/picard_histogram.txt", + "multiqc/multiqc_data/picard_histogram_1.txt", + "multiqc/multiqc_data/picard_histogram_2.txt", + "multiqc/multiqc_data/samtools-stats-dp.txt", + "multiqc/multiqc_data/samtools_alignment_plot.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-reported-empirical-plot.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-coverage-per-contig-single-cnt.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-coverage-per-contig-single-pct.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-cumcoverage-dist-id.pdf", + "multiqc/multiqc_plots/pdf/picard_deduplication-cnt.pdf", + "multiqc/multiqc_plots/pdf/picard_deduplication-pct.pdf", + "multiqc/multiqc_plots/pdf/samtools-stats-dp.pdf", + "multiqc/multiqc_plots/pdf/samtools_alignment_plot-cnt.pdf", + "multiqc/multiqc_plots/pdf/samtools_alignment_plot-pct.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.png", + 
"multiqc/multiqc_plots/png/gatk-base-recalibrator-reported-empirical-plot.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/mosdepth-coverage-per-contig-single-cnt.png", + "multiqc/multiqc_plots/png/mosdepth-coverage-per-contig-single-pct.png", + "multiqc/multiqc_plots/png/mosdepth-cumcoverage-dist-id.png", + "multiqc/multiqc_plots/png/picard_deduplication-cnt.png", + "multiqc/multiqc_plots/png/picard_deduplication-pct.png", + "multiqc/multiqc_plots/png/samtools-stats-dp.png", + "multiqc/multiqc_plots/png/samtools_alignment_plot-cnt.png", + "multiqc/multiqc_plots/png/samtools_alignment_plot-pct.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-reported-empirical-plot.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/mosdepth-coverage-per-contig-single-cnt.svg", + "multiqc/multiqc_plots/svg/mosdepth-coverage-per-contig-single-pct.svg", + "multiqc/multiqc_plots/svg/mosdepth-cumcoverage-dist-id.svg", + "multiqc/multiqc_plots/svg/picard_deduplication-cnt.svg", + "multiqc/multiqc_plots/svg/picard_deduplication-pct.svg", + "multiqc/multiqc_plots/svg/samtools-stats-dp.svg", + "multiqc/multiqc_plots/svg/samtools_alignment_plot-cnt.svg", + "multiqc/multiqc_plots/svg/samtools_alignment_plot-pct.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/markduplicates", + "preprocessing/markduplicates/test", + "preprocessing/markduplicates/test/test.md.cram", + "preprocessing/markduplicates/test/test.md.cram.crai", + "preprocessing/recal_table", + "preprocessing/recal_table/test", + "preprocessing/recal_table/test/test.recal.table", + "preprocessing/recalibrated", 
+ "preprocessing/recalibrated/test", + "preprocessing/recalibrated/test/test.recal.cram", + "preprocessing/recalibrated/test/test.recal.cram.crai", + "reference", + "reports", + "reports/markduplicates", + "reports/markduplicates/test", + "reports/markduplicates/test/test.md.cram.metrics", + "reports/mosdepth", + "reports/mosdepth/test", + "reports/mosdepth/test/test.md.mosdepth.global.dist.txt", + "reports/mosdepth/test/test.md.mosdepth.region.dist.txt", + "reports/mosdepth/test/test.md.mosdepth.summary.txt", + "reports/mosdepth/test/test.md.regions.bed.gz", + "reports/mosdepth/test/test.md.regions.bed.gz.csi", + "reports/mosdepth/test/test.recal.mosdepth.global.dist.txt", + "reports/mosdepth/test/test.recal.mosdepth.region.dist.txt", + "reports/mosdepth/test/test.recal.mosdepth.summary.txt", + "reports/mosdepth/test/test.recal.regions.bed.gz", + "reports/mosdepth/test/test.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/test", + "reports/samtools/test/test.md.cram.stats", + "reports/samtools/test/test.recal.cram.stats" + ], + [ + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt:md5,902745b5a1915e5c1a25267b11bebbe7", + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt:md5,832ed176357d6a5b4e50c718f5e4a704", + "mosdepth-coverage-per-contig-single.txt:md5,76d816c3f71faf2009c8a6f88092a2f3", + "mosdepth-cumcoverage-dist-id.txt:md5,3af8f7d8ed7d1fdff6118e0098258192", + "mosdepth_cov_dist.txt:md5,4a2236db76d75e45012f6d7c180c90d6", + "mosdepth_cumcov_dist.txt:md5,4a2236db76d75e45012f6d7c180c90d6", + "mosdepth_perchrom.txt:md5,76d816c3f71faf2009c8a6f88092a2f3", + "multiqc_citations.txt:md5,7d0b4b866fa577272c48a1f3ad72e75d", + "multiqc_samtools_stats.txt:md5,de9451d4736a410d09de58828761ea87", + "picard_histogram.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "picard_histogram_1.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "picard_histogram_2.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + 
"samtools-stats-dp.txt:md5,2247da9fa269d826da3f33ba6fa66954", + "samtools_alignment_plot.txt:md5,22572fcd0791878ed37ae2f48213cee2", + "test.md.mosdepth.global.dist.txt:md5,8e875e20e3fb9cf288d68c1d223f6fd5", + "test.md.mosdepth.region.dist.txt:md5,75e1ce7e55af51f4985fa91654a5ea2d", + "test.md.mosdepth.summary.txt:md5,b23cf96942b2ada3f41172a9349a1175", + "test.md.regions.bed.gz:md5,74cd0c779c7b3228adcf3b177333886a", + "test.md.regions.bed.gz.csi:md5,080731cdedcd389e72135f048d6e2e00", + "test.recal.mosdepth.global.dist.txt:md5,8e875e20e3fb9cf288d68c1d223f6fd5", + "test.recal.mosdepth.region.dist.txt:md5,75e1ce7e55af51f4985fa91654a5ea2d", + "test.recal.mosdepth.summary.txt:md5,b23cf96942b2ada3f41172a9349a1175", + "test.recal.regions.bed.gz:md5,74cd0c779c7b3228adcf3b177333886a", + "test.recal.regions.bed.gz.csi:md5,080731cdedcd389e72135f048d6e2e00", + "test.md.cram.stats:md5,f181d98f08ad94c3926ac149a87d834b", + "test.recal.cram.stats:md5,18346c938c7b1bfaf9ac9413fdba90d8" + ], + [ + [ + "test.md.cram", + "2f11e4fe3390b8ad0a1852616fd1da04" + ], + [ + "test.recal.cram", + "463ac3b905fbf4ddf113a94dbfa8d69f" + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-11-09T16:12:33.604156" + }, + "Run with profile test | --input tests/csv/3.0/mapped_single_cram.csv --step markduplicates --skip_tools markduplicates --tools null ": { + "content": [ + 12, + { + "GATK4_APPLYBQSR": { + "gatk4": "4.5.0.0" + }, + "GATK4_BASERECALIBRATOR": { + "gatk4": "4.5.0.0" + }, + "INDEX_CRAM": { + "samtools": 1.21 + }, + "MOSDEPTH": { + "mosdepth": "0.3.8" + }, + "SAMTOOLS_STATS": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/recalibrated.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt", + 
"multiqc/multiqc_data/gatk-base-recalibrator-reported-empirical-plot.txt", + "multiqc/multiqc_data/gatk_base_recalibrator.txt", + "multiqc/multiqc_data/mosdepth-coverage-per-contig-single.txt", + "multiqc/multiqc_data/mosdepth-cumcoverage-dist-id.txt", + "multiqc/multiqc_data/mosdepth_cov_dist.txt", + "multiqc/multiqc_data/mosdepth_cumcov_dist.txt", + "multiqc/multiqc_data/mosdepth_perchrom.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_samtools_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/samtools-stats-dp.txt", + "multiqc/multiqc_data/samtools_alignment_plot.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-reported-empirical-plot.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-coverage-per-contig-single-cnt.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-coverage-per-contig-single-pct.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-cumcoverage-dist-id.pdf", + "multiqc/multiqc_plots/pdf/samtools-stats-dp.pdf", + "multiqc/multiqc_plots/pdf/samtools_alignment_plot-cnt.pdf", + "multiqc/multiqc_plots/pdf/samtools_alignment_plot-pct.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-reported-empirical-plot.png", + 
"multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/mosdepth-coverage-per-contig-single-cnt.png", + "multiqc/multiqc_plots/png/mosdepth-coverage-per-contig-single-pct.png", + "multiqc/multiqc_plots/png/mosdepth-cumcoverage-dist-id.png", + "multiqc/multiqc_plots/png/samtools-stats-dp.png", + "multiqc/multiqc_plots/png/samtools_alignment_plot-cnt.png", + "multiqc/multiqc_plots/png/samtools_alignment_plot-pct.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-reported-empirical-plot.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/mosdepth-coverage-per-contig-single-cnt.svg", + "multiqc/multiqc_plots/svg/mosdepth-coverage-per-contig-single-pct.svg", + "multiqc/multiqc_plots/svg/mosdepth-cumcoverage-dist-id.svg", + "multiqc/multiqc_plots/svg/samtools-stats-dp.svg", + "multiqc/multiqc_plots/svg/samtools_alignment_plot-cnt.svg", + "multiqc/multiqc_plots/svg/samtools_alignment_plot-pct.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/recal_table", + "preprocessing/recal_table/test", + "preprocessing/recal_table/test/test.recal.table", + "preprocessing/recalibrated", + "preprocessing/recalibrated/test", + "preprocessing/recalibrated/test/test.recal.cram", + "preprocessing/recalibrated/test/test.recal.cram.crai", + "reference", + "reports", + "reports/mosdepth", + "reports/mosdepth/test", + "reports/mosdepth/test/test.recal.mosdepth.global.dist.txt", + "reports/mosdepth/test/test.recal.mosdepth.region.dist.txt", + "reports/mosdepth/test/test.recal.mosdepth.summary.txt", + "reports/mosdepth/test/test.recal.regions.bed.gz", + 
"reports/mosdepth/test/test.recal.regions.bed.gz.csi", + "reports/mosdepth/test/test.sorted.mosdepth.global.dist.txt", + "reports/mosdepth/test/test.sorted.mosdepth.region.dist.txt", + "reports/mosdepth/test/test.sorted.mosdepth.summary.txt", + "reports/mosdepth/test/test.sorted.regions.bed.gz", + "reports/mosdepth/test/test.sorted.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/test", + "reports/samtools/test/test.recal.cram.stats", + "reports/samtools/test/test.sorted.cram.stats" + ], + [ + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt:md5,d2650a5bec510d798e347f36a4d00e2d", + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt:md5,f2123f7ee3d060c1547efc6247a04e42", + "mosdepth-coverage-per-contig-single.txt:md5,8b48f3336b063dcb1e086928b28a2cc6", + "mosdepth-cumcoverage-dist-id.txt:md5,3148977f0c4684ba681ee298d677fe38", + "mosdepth_cov_dist.txt:md5,9a531d5a5c05e568a1aeb2e738ac23c4", + "mosdepth_cumcov_dist.txt:md5,9a531d5a5c05e568a1aeb2e738ac23c4", + "mosdepth_perchrom.txt:md5,8b48f3336b063dcb1e086928b28a2cc6", + "multiqc_citations.txt:md5,7d0b4b866fa577272c48a1f3ad72e75d", + "multiqc_samtools_stats.txt:md5,7f5f43de35c194be7f5980b62eacfab7", + "samtools-stats-dp.txt:md5,85c4ca7a3a6f2534d4d329937be49966", + "samtools_alignment_plot.txt:md5,301dda049c8aa2f848c98c81f584c315", + "test.recal.mosdepth.global.dist.txt:md5,bdb8f185c35dd1eec7ce2f69bce57972", + "test.recal.mosdepth.region.dist.txt:md5,f1f1ad86fc280bced1888a5d7d25a3f2", + "test.recal.mosdepth.summary.txt:md5,32ea70ef1b99def3dc900b4afd513a40", + "test.recal.regions.bed.gz:md5,07bbc084a889f1cece4307fd00214a6e", + "test.recal.regions.bed.gz.csi:md5,c5d0be930ffc9e562f21519a0d488d5d", + "test.sorted.mosdepth.global.dist.txt:md5,bdb8f185c35dd1eec7ce2f69bce57972", + "test.sorted.mosdepth.region.dist.txt:md5,f1f1ad86fc280bced1888a5d7d25a3f2", + "test.sorted.mosdepth.summary.txt:md5,32ea70ef1b99def3dc900b4afd513a40", + 
"test.sorted.regions.bed.gz:md5,07bbc084a889f1cece4307fd00214a6e", + "test.sorted.regions.bed.gz.csi:md5,c5d0be930ffc9e562f21519a0d488d5d", + "test.recal.cram.stats:md5,9f75ec16d22ce12c348cbd7477c9886e", + "test.sorted.cram.stats:md5,308a4213cc2ea25cbdd6d58b562673a5" + ], + [ + [ + "test.recal.cram", + "463ac3b905fbf4ddf113a94dbfa8d69f" + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-11-09T16:23:55.741166" + }, + "Run with profile test | --input tests/csv/3.0/mapped_single_cram.csv --step markduplicates --tools null ": { + "content": [ + 13, + { + "GATK4_APPLYBQSR": { + "gatk4": "4.5.0.0" + }, + "GATK4_BASERECALIBRATOR": { + "gatk4": "4.5.0.0" + }, + "GATK4_MARKDUPLICATES": { + "gatk4": "4.5.0.0", + "samtools": "1.19.2" + }, + "INDEX_CRAM": { + "samtools": 1.21 + }, + "MOSDEPTH": { + "mosdepth": "0.3.8" + }, + "SAMTOOLS_STATS": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/markduplicates.csv", + "csv/markduplicates_no_table.csv", + "csv/recalibrated.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-reported-empirical-plot.txt", + "multiqc/multiqc_data/gatk_base_recalibrator.txt", + "multiqc/multiqc_data/mosdepth-coverage-per-contig-single.txt", + "multiqc/multiqc_data/mosdepth-cumcoverage-dist-id.txt", + "multiqc/multiqc_data/mosdepth_cov_dist.txt", + "multiqc/multiqc_data/mosdepth_cumcov_dist.txt", + "multiqc/multiqc_data/mosdepth_perchrom.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_picard_dups.txt", + "multiqc/multiqc_data/multiqc_samtools_stats.txt", + 
"multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/picard_deduplication.txt", + "multiqc/multiqc_data/picard_histogram.txt", + "multiqc/multiqc_data/picard_histogram_1.txt", + "multiqc/multiqc_data/picard_histogram_2.txt", + "multiqc/multiqc_data/samtools-stats-dp.txt", + "multiqc/multiqc_data/samtools_alignment_plot.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-reported-empirical-plot.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-coverage-per-contig-single-cnt.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-coverage-per-contig-single-pct.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-cumcoverage-dist-id.pdf", + "multiqc/multiqc_plots/pdf/picard_deduplication-cnt.pdf", + "multiqc/multiqc_plots/pdf/picard_deduplication-pct.pdf", + "multiqc/multiqc_plots/pdf/samtools-stats-dp.pdf", + "multiqc/multiqc_plots/pdf/samtools_alignment_plot-cnt.pdf", + "multiqc/multiqc_plots/pdf/samtools_alignment_plot-pct.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-reported-empirical-plot.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/mosdepth-coverage-per-contig-single-cnt.png", + "multiqc/multiqc_plots/png/mosdepth-coverage-per-contig-single-pct.png", + "multiqc/multiqc_plots/png/mosdepth-cumcoverage-dist-id.png", + "multiqc/multiqc_plots/png/picard_deduplication-cnt.png", + 
"multiqc/multiqc_plots/png/picard_deduplication-pct.png", + "multiqc/multiqc_plots/png/samtools-stats-dp.png", + "multiqc/multiqc_plots/png/samtools_alignment_plot-cnt.png", + "multiqc/multiqc_plots/png/samtools_alignment_plot-pct.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-reported-empirical-plot.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/mosdepth-coverage-per-contig-single-cnt.svg", + "multiqc/multiqc_plots/svg/mosdepth-coverage-per-contig-single-pct.svg", + "multiqc/multiqc_plots/svg/mosdepth-cumcoverage-dist-id.svg", + "multiqc/multiqc_plots/svg/picard_deduplication-cnt.svg", + "multiqc/multiqc_plots/svg/picard_deduplication-pct.svg", + "multiqc/multiqc_plots/svg/samtools-stats-dp.svg", + "multiqc/multiqc_plots/svg/samtools_alignment_plot-cnt.svg", + "multiqc/multiqc_plots/svg/samtools_alignment_plot-pct.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/markduplicates", + "preprocessing/markduplicates/test", + "preprocessing/markduplicates/test/test.md.cram", + "preprocessing/markduplicates/test/test.md.cram.crai", + "preprocessing/recal_table", + "preprocessing/recal_table/test", + "preprocessing/recal_table/test/test.recal.table", + "preprocessing/recalibrated", + "preprocessing/recalibrated/test", + "preprocessing/recalibrated/test/test.recal.cram", + "preprocessing/recalibrated/test/test.recal.cram.crai", + "reference", + "reports", + "reports/markduplicates", + "reports/markduplicates/test", + "reports/markduplicates/test/test.md.cram.metrics", + "reports/mosdepth", + "reports/mosdepth/test", + "reports/mosdepth/test/test.md.mosdepth.global.dist.txt", + 
"reports/mosdepth/test/test.md.mosdepth.region.dist.txt", + "reports/mosdepth/test/test.md.mosdepth.summary.txt", + "reports/mosdepth/test/test.md.regions.bed.gz", + "reports/mosdepth/test/test.md.regions.bed.gz.csi", + "reports/mosdepth/test/test.recal.mosdepth.global.dist.txt", + "reports/mosdepth/test/test.recal.mosdepth.region.dist.txt", + "reports/mosdepth/test/test.recal.mosdepth.summary.txt", + "reports/mosdepth/test/test.recal.regions.bed.gz", + "reports/mosdepth/test/test.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/test", + "reports/samtools/test/test.md.cram.stats", + "reports/samtools/test/test.recal.cram.stats" + ], + [ + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt:md5,902745b5a1915e5c1a25267b11bebbe7", + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt:md5,832ed176357d6a5b4e50c718f5e4a704", + "mosdepth-coverage-per-contig-single.txt:md5,76d816c3f71faf2009c8a6f88092a2f3", + "mosdepth-cumcoverage-dist-id.txt:md5,3af8f7d8ed7d1fdff6118e0098258192", + "mosdepth_cov_dist.txt:md5,4a2236db76d75e45012f6d7c180c90d6", + "mosdepth_cumcov_dist.txt:md5,4a2236db76d75e45012f6d7c180c90d6", + "mosdepth_perchrom.txt:md5,76d816c3f71faf2009c8a6f88092a2f3", + "multiqc_citations.txt:md5,7d0b4b866fa577272c48a1f3ad72e75d", + "multiqc_samtools_stats.txt:md5,de9451d4736a410d09de58828761ea87", + "picard_histogram.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "picard_histogram_1.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "picard_histogram_2.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "samtools-stats-dp.txt:md5,2247da9fa269d826da3f33ba6fa66954", + "samtools_alignment_plot.txt:md5,22572fcd0791878ed37ae2f48213cee2", + "test.md.mosdepth.global.dist.txt:md5,8e875e20e3fb9cf288d68c1d223f6fd5", + "test.md.mosdepth.region.dist.txt:md5,75e1ce7e55af51f4985fa91654a5ea2d", + "test.md.mosdepth.summary.txt:md5,b23cf96942b2ada3f41172a9349a1175", + "test.md.regions.bed.gz:md5,74cd0c779c7b3228adcf3b177333886a", + 
"test.md.regions.bed.gz.csi:md5,080731cdedcd389e72135f048d6e2e00", + "test.recal.mosdepth.global.dist.txt:md5,8e875e20e3fb9cf288d68c1d223f6fd5", + "test.recal.mosdepth.region.dist.txt:md5,75e1ce7e55af51f4985fa91654a5ea2d", + "test.recal.mosdepth.summary.txt:md5,b23cf96942b2ada3f41172a9349a1175", + "test.recal.regions.bed.gz:md5,74cd0c779c7b3228adcf3b177333886a", + "test.recal.regions.bed.gz.csi:md5,080731cdedcd389e72135f048d6e2e00", + "test.md.cram.stats:md5,f181d98f08ad94c3926ac149a87d834b", + "test.recal.cram.stats:md5,18346c938c7b1bfaf9ac9413fdba90d8" + ], + [ + [ + "test.md.cram", + "2f11e4fe3390b8ad0a1852616fd1da04" + ], + [ + "test.recal.cram", + "463ac3b905fbf4ddf113a94dbfa8d69f" + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-11-09T16:18:25.238396" + } +} diff --git a/tests/start_from_preparerecalibration.nf.test b/tests/start_from_preparerecalibration.nf.test new file mode 100644 index 0000000000..8d445aaf40 --- /dev/null +++ b/tests/start_from_preparerecalibration.nf.test @@ -0,0 +1,162 @@ +nextflow_pipeline { + + name "Test pipeline when starting from prepare recalibration" + script "../main.nf" + tag "pipeline" + tag "pipeline_sarek" + + test("Run with profile test | --input tests/csv/3.0/mapped_single_bam.csv --step prepare_recalibration --tools null ") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/mapped_single_bam.csv" + step = 'prepare_recalibration' + tools = null + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: 
All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we test pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path names, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --input tests/csv/3.0/mapped_single_bam.csv --step prepare_recalibration --skip_tools baserecalibrator --tools strelka ") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/mapped_single_bam.csv" + step = 'prepare_recalibration' + skip_tools = "baserecalibrator" + tools = 'strelka' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline
versions.yml file for multiqc from which Nextflow version is removed because we test pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path names, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + + test("Run with profile test | --input tests/csv/3.0/mapped_single_cram.csv --step prepare_recalibration --tools null ") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/mapped_single_cram.csv" + step = 'prepare_recalibration' + tools = null + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we test pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path names, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(),
cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --input tests/csv/3.0/mapped_single_cram.csv --step prepare_recalibration --skip_tools baserecalibrator --tools strelka ") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/mapped_single_cram.csv" + step = 'prepare_recalibration' + skip_tools = "baserecalibrator" + tools = 'strelka' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we test pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path names, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } +} diff --git a/tests/start_from_preparerecalibration.nf.test.snap b/tests/start_from_preparerecalibration.nf.test.snap new file mode 100644 index 0000000000..bfaab47657 --- /dev/null +++ b/tests/start_from_preparerecalibration.nf.test.snap @@ -0,0 +1,396 @@
+{ + "Run with profile test | --input tests/csv/3.0/mapped_single_bam.csv --step prepare_recalibration --tools null ": { + "content": [ + 7, + { + "GATK4_APPLYBQSR": { + "gatk4": "4.5.0.0" + }, + "GATK4_BASERECALIBRATOR": { + "gatk4": "4.5.0.0" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/markduplicates.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-reported-empirical-plot.txt", + "multiqc/multiqc_data/gatk_base_recalibrator.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-reported-empirical-plot.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-reported-empirical-plot.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-reported-empirical-plot.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + 
"pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/recal_table", + "preprocessing/recal_table/test", + "preprocessing/recal_table/test/test.recal.table", + "reference" + ], + [ + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt:md5,d2650a5bec510d798e347f36a4d00e2d", + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt:md5,f2123f7ee3d060c1547efc6247a04e42", + "multiqc_citations.txt:md5,3815a9f79e41890653a0e0d602c92ac9" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-11-09T16:57:10.392231" + }, + "Run with profile test | --input tests/csv/3.0/mapped_single_bam.csv --step prepare_recalibration --skip_tools baserecalibrator --tools strelka ": { + "content": [ + 10, + { + "BCFTOOLS_STATS": { + "bcftools": 1.2 + }, + "STRELKA_SINGLE": { + "strelka": "2.9.10" + }, + "VCFTOOLS_TSTV_COUNT": { + "vcftools": "0.1.16" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/variantcalled.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/bcftools-stats-subtypes.txt", + "multiqc/multiqc_data/bcftools_stats_indel-lengths.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Indels.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_SNP.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transitions.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transversions.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_bcftools_stats.txt", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/vcftools_tstv_by_count.txt", + "multiqc/multiqc_data/vcftools_tstv_by_qual.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + 
"multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-cnt.pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-pct.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_indel-lengths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Indels.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_SNP.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transitions.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transversions.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_count.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_qual.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-cnt.png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-pct.png", + "multiqc/multiqc_plots/png/bcftools_stats_indel-lengths.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Indels.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_SNP.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transitions.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transversions.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_count.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_qual.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-cnt.svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-pct.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_indel-lengths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Indels.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_SNP.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transitions.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transversions.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_count.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_qual.svg", + 
"multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/bcftools", + "reports/bcftools/strelka", + "reports/bcftools/strelka/test", + "reports/bcftools/strelka/test/test.strelka.variants.bcftools_stats.txt", + "reports/vcftools", + "reports/vcftools/strelka", + "reports/vcftools/strelka/test", + "reports/vcftools/strelka/test/test.strelka.variants.FILTER.summary", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.count", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.qual", + "variant_calling", + "variant_calling/strelka", + "variant_calling/strelka/test", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz.tbi", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz.tbi" + ], + [ + "bcftools-stats-subtypes.txt:md5,08ac2a27511e23e15734fa84e97b53bb", + "bcftools_stats_indel-lengths.txt:md5,e2ae4bb4c56cff896db88a20ff2933d9", + "bcftools_stats_vqc_Count_Indels.txt:md5,aeeb155cb7bf39e5f349ce38dbe1699f", + "bcftools_stats_vqc_Count_SNP.txt:md5,ad63f8dfcc15fdc738b1e8ff54c8e452", + "bcftools_stats_vqc_Count_Transitions.txt:md5,6155d186233346cda2cb67147db8f6e7", + "bcftools_stats_vqc_Count_Transversions.txt:md5,60a0122e9d3828a4019336a87e158e42", + "multiqc_bcftools_stats.txt:md5,3a9838b06c741231489075203469d2c4", + "multiqc_citations.txt:md5,ac2b3cf2dfb12c40837b9bbad8112d86", + "vcftools_tstv_by_count.txt:md5,5ca47a28d7ae697981041d8b458ae889", + "test.strelka.variants.bcftools_stats.txt:md5,3b37a441393a9329104451f44e56c619", + "test.strelka.variants.FILTER.summary:md5,39ff2cc8eb7495a14a6b76e0ab627027", + "test.strelka.variants.TsTv.count:md5,ee7dafc8d941b8502a04a63dc3126fff" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-11-09T16:58:48.877928" + }, + "Run with profile 
test | --input tests/csv/3.0/mapped_single_cram.csv --step prepare_recalibration --skip_tools baserecalibrator --tools strelka ": { + "content": [ + 10, + { + "BCFTOOLS_STATS": { + "bcftools": 1.2 + }, + "STRELKA_SINGLE": { + "strelka": "2.9.10" + }, + "VCFTOOLS_TSTV_COUNT": { + "vcftools": "0.1.16" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/variantcalled.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/bcftools-stats-subtypes.txt", + "multiqc/multiqc_data/bcftools_stats_indel-lengths.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Indels.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_SNP.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transitions.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transversions.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_bcftools_stats.txt", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/vcftools_tstv_by_count.txt", + "multiqc/multiqc_data/vcftools_tstv_by_qual.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-cnt.pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-pct.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_indel-lengths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Indels.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_SNP.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transitions.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transversions.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_count.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_qual.pdf", + "multiqc/multiqc_plots/png", + 
"multiqc/multiqc_plots/png/bcftools-stats-subtypes-cnt.png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-pct.png", + "multiqc/multiqc_plots/png/bcftools_stats_indel-lengths.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Indels.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_SNP.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transitions.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transversions.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_count.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_qual.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-cnt.svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-pct.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_indel-lengths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Indels.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_SNP.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transitions.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transversions.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_count.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_qual.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/bcftools", + "reports/bcftools/strelka", + "reports/bcftools/strelka/test", + "reports/bcftools/strelka/test/test.strelka.variants.bcftools_stats.txt", + "reports/vcftools", + "reports/vcftools/strelka", + "reports/vcftools/strelka/test", + "reports/vcftools/strelka/test/test.strelka.variants.FILTER.summary", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.count", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.qual", + "variant_calling", + "variant_calling/strelka", + "variant_calling/strelka/test", + 
"variant_calling/strelka/test/test.strelka.genome.vcf.gz", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz.tbi", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz.tbi" + ], + [ + "bcftools-stats-subtypes.txt:md5,08ac2a27511e23e15734fa84e97b53bb", + "bcftools_stats_indel-lengths.txt:md5,e2ae4bb4c56cff896db88a20ff2933d9", + "bcftools_stats_vqc_Count_Indels.txt:md5,aeeb155cb7bf39e5f349ce38dbe1699f", + "bcftools_stats_vqc_Count_SNP.txt:md5,ad63f8dfcc15fdc738b1e8ff54c8e452", + "bcftools_stats_vqc_Count_Transitions.txt:md5,6155d186233346cda2cb67147db8f6e7", + "bcftools_stats_vqc_Count_Transversions.txt:md5,60a0122e9d3828a4019336a87e158e42", + "multiqc_bcftools_stats.txt:md5,3a9838b06c741231489075203469d2c4", + "multiqc_citations.txt:md5,ac2b3cf2dfb12c40837b9bbad8112d86", + "vcftools_tstv_by_count.txt:md5,5ca47a28d7ae697981041d8b458ae889", + "test.strelka.variants.bcftools_stats.txt:md5,3b37a441393a9329104451f44e56c619", + "test.strelka.variants.FILTER.summary:md5,39ff2cc8eb7495a14a6b76e0ab627027", + "test.strelka.variants.TsTv.count:md5,ee7dafc8d941b8502a04a63dc3126fff" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-11-09T17:02:51.883814" + }, + "Run with profile test | --input tests/csv/3.0/mapped_single_cram.csv --step prepare_recalibration --tools null ": { + "content": [ + 10, + { + "GATK4_APPLYBQSR": { + "gatk4": "4.5.0.0" + }, + "GATK4_BASERECALIBRATOR": { + "gatk4": "4.5.0.0" + }, + "INDEX_CRAM": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/markduplicates.csv", + "csv/recalibrated.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt", + 
"multiqc/multiqc_data/gatk-base-recalibrator-reported-empirical-plot.txt", + "multiqc/multiqc_data/gatk_base_recalibrator.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-reported-empirical-plot.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-reported-empirical-plot.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-reported-empirical-plot.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/recal_table", + "preprocessing/recal_table/test", + "preprocessing/recal_table/test/test.recal.table", + "preprocessing/recalibrated", + "preprocessing/recalibrated/test", + "preprocessing/recalibrated/test/test.recal.cram", + "preprocessing/recalibrated/test/test.recal.cram.crai", + "reference", + "reports", + "reports/mosdepth", + "reports/mosdepth/test", + "reports/mosdepth/test/test.recal.mosdepth.global.dist.txt", + "reports/mosdepth/test/test.recal.mosdepth.region.dist.txt", + 
"reports/mosdepth/test/test.recal.mosdepth.summary.txt", + "reports/mosdepth/test/test.recal.regions.bed.gz", + "reports/mosdepth/test/test.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/test", + "reports/samtools/test/test.recal.cram.stats" + ], + [ + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt:md5,d2650a5bec510d798e347f36a4d00e2d", + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt:md5,f2123f7ee3d060c1547efc6247a04e42", + "multiqc_citations.txt:md5,3815a9f79e41890653a0e0d602c92ac9", + "test.recal.mosdepth.global.dist.txt:md5,bdb8f185c35dd1eec7ce2f69bce57972", + "test.recal.mosdepth.region.dist.txt:md5,f1f1ad86fc280bced1888a5d7d25a3f2", + "test.recal.mosdepth.summary.txt:md5,32ea70ef1b99def3dc900b4afd513a40", + "test.recal.regions.bed.gz:md5,07bbc084a889f1cece4307fd00214a6e", + "test.recal.regions.bed.gz.csi:md5,c5d0be930ffc9e562f21519a0d488d5d", + "test.recal.cram.stats:md5,9f75ec16d22ce12c348cbd7477c9886e" + ], + [ + [ + "test.recal.cram", + "463ac3b905fbf4ddf113a94dbfa8d69f" + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-11-09T17:01:30.130705" + } +} diff --git a/tests/start_from_recalibration.nf.test b/tests/start_from_recalibration.nf.test new file mode 100644 index 0000000000..3549bd793a --- /dev/null +++ b/tests/start_from_recalibration.nf.test @@ -0,0 +1,162 @@ +nextflow_pipeline { + + name "Test pipeline when starting from recalibration" + script "../main.nf" + tag "pipeline" + tag "pipeline_sarek" + + test("Run with profile test | --input tests/csv/3.0/mapped_single_bam.csv --step recalibrate --tools null ") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/prepare_recalibration_single_bam.csv" + step = 'recalibrate' + tools = null + } + } + + then { + // stable_name: All files + folders in 
${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --input tests/csv/3.0/mapped_single_bam.csv --step recalibrate --skip_tools baserecalibrator --tools strelka ") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/prepare_recalibration_single_bam.csv" + step = 'recalibrate' + skip_tools = "baserecalibrator" + tools = 'strelka' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // 
cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + + test("Run with profile test | --input tests/csv/3.0/mapped_single_cram.csv --step recalibrate --tools null ") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/prepare_recalibration_single_cram.csv" + step = 'recalibrate' + tools = null + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is 
removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --input tests/csv/3.0/mapped_single_cram.csv --step recalibrate --skip_tools baserecalibrator --tools strelka ") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + input = "${projectDir}/tests/csv/3.0/prepare_recalibration_single_cram.csv" + step = 'recalibrate' + skip_tools = "baserecalibrator" + tools = 'strelka' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ 
file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } +} diff --git a/tests/start_from_recalibration.nf.test.snap b/tests/start_from_recalibration.nf.test.snap new file mode 100644 index 0000000000..b07d2def0c --- /dev/null +++ b/tests/start_from_recalibration.nf.test.snap @@ -0,0 +1,344 @@ +{ + "Run with profile test | --input tests/csv/3.0/mapped_single_cram.csv --step recalibrate --tools null ": { + "content": [ + 9, + { + "GATK4_APPLYBQSR": { + "gatk4": "4.5.0.0" + }, + "INDEX_CRAM": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/recalibrated.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/recalibrated", + "preprocessing/recalibrated/test", + "preprocessing/recalibrated/test/test.recal.cram", + "preprocessing/recalibrated/test/test.recal.cram.crai", + "reference", + "reports", + "reports/mosdepth", + "reports/mosdepth/test", + "reports/mosdepth/test/test.recal.mosdepth.global.dist.txt", + "reports/mosdepth/test/test.recal.mosdepth.region.dist.txt", + "reports/mosdepth/test/test.recal.mosdepth.summary.txt", + "reports/mosdepth/test/test.recal.regions.bed.gz", + "reports/mosdepth/test/test.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/test", + "reports/samtools/test/test.recal.cram.stats" + ], + [ + "multiqc_citations.txt:md5,4c806e63a283ec1b7e78cdae3a923d4f", + "test.recal.mosdepth.global.dist.txt:md5,bdb8f185c35dd1eec7ce2f69bce57972", + "test.recal.mosdepth.region.dist.txt:md5,f1f1ad86fc280bced1888a5d7d25a3f2", + "test.recal.mosdepth.summary.txt:md5,32ea70ef1b99def3dc900b4afd513a40", 
+ "test.recal.regions.bed.gz:md5,07bbc084a889f1cece4307fd00214a6e", + "test.recal.regions.bed.gz.csi:md5,c5d0be930ffc9e562f21519a0d488d5d", + "test.recal.cram.stats:md5,1aacf5ac46a0e600925934cf9ee92fc4" + ], + [ + [ + "test.recal.cram", + "463ac3b905fbf4ddf113a94dbfa8d69f" + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-11-09T17:22:03.443847" + }, + "Run with profile test | --input tests/csv/3.0/mapped_single_bam.csv --step recalibrate --skip_tools baserecalibrator --tools strelka ": { + "content": [ + 10, + { + "BCFTOOLS_STATS": { + "bcftools": 1.2 + }, + "STRELKA_SINGLE": { + "strelka": "2.9.10" + }, + "VCFTOOLS_TSTV_COUNT": { + "vcftools": "0.1.16" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/variantcalled.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/bcftools-stats-subtypes.txt", + "multiqc/multiqc_data/bcftools_stats_indel-lengths.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Indels.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_SNP.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transitions.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transversions.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_bcftools_stats.txt", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/vcftools_tstv_by_count.txt", + "multiqc/multiqc_data/vcftools_tstv_by_qual.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-cnt.pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-pct.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_indel-lengths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Indels.pdf", + 
"multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_SNP.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transitions.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transversions.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_count.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_qual.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-cnt.png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-pct.png", + "multiqc/multiqc_plots/png/bcftools_stats_indel-lengths.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Indels.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_SNP.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transitions.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transversions.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_count.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_qual.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-cnt.svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-pct.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_indel-lengths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Indels.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_SNP.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transitions.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transversions.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_count.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_qual.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/bcftools", + "reports/bcftools/strelka", + "reports/bcftools/strelka/test", + 
"reports/bcftools/strelka/test/test.strelka.variants.bcftools_stats.txt", + "reports/vcftools", + "reports/vcftools/strelka", + "reports/vcftools/strelka/test", + "reports/vcftools/strelka/test/test.strelka.variants.FILTER.summary", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.count", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.qual", + "variant_calling", + "variant_calling/strelka", + "variant_calling/strelka/test", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz.tbi", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz.tbi" + ], + [ + "bcftools-stats-subtypes.txt:md5,08ac2a27511e23e15734fa84e97b53bb", + "bcftools_stats_indel-lengths.txt:md5,e2ae4bb4c56cff896db88a20ff2933d9", + "bcftools_stats_vqc_Count_Indels.txt:md5,aeeb155cb7bf39e5f349ce38dbe1699f", + "bcftools_stats_vqc_Count_SNP.txt:md5,ad63f8dfcc15fdc738b1e8ff54c8e452", + "bcftools_stats_vqc_Count_Transitions.txt:md5,6155d186233346cda2cb67147db8f6e7", + "bcftools_stats_vqc_Count_Transversions.txt:md5,60a0122e9d3828a4019336a87e158e42", + "multiqc_bcftools_stats.txt:md5,3a9838b06c741231489075203469d2c4", + "multiqc_citations.txt:md5,ac2b3cf2dfb12c40837b9bbad8112d86", + "vcftools_tstv_by_count.txt:md5,5ca47a28d7ae697981041d8b458ae889", + "test.strelka.variants.bcftools_stats.txt:md5,3b37a441393a9329104451f44e56c619", + "test.strelka.variants.FILTER.summary:md5,39ff2cc8eb7495a14a6b76e0ab627027", + "test.strelka.variants.TsTv.count:md5,ee7dafc8d941b8502a04a63dc3126fff" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-11-09T17:20:43.396817" + }, + "Run with profile test | --input tests/csv/3.0/mapped_single_cram.csv --step recalibrate --skip_tools baserecalibrator --tools strelka ": { + "content": [ + 10, + { + "BCFTOOLS_STATS": { + "bcftools": 1.2 + }, + "STRELKA_SINGLE": { + "strelka": 
"2.9.10" + }, + "VCFTOOLS_TSTV_COUNT": { + "vcftools": "0.1.16" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/variantcalled.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/bcftools-stats-subtypes.txt", + "multiqc/multiqc_data/bcftools_stats_indel-lengths.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Indels.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_SNP.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transitions.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transversions.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_bcftools_stats.txt", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/vcftools_tstv_by_count.txt", + "multiqc/multiqc_data/vcftools_tstv_by_qual.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-cnt.pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-pct.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_indel-lengths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Indels.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_SNP.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transitions.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transversions.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_count.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_qual.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-cnt.png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-pct.png", + "multiqc/multiqc_plots/png/bcftools_stats_indel-lengths.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Indels.png", + 
"multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_SNP.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transitions.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transversions.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_count.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_qual.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-cnt.svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-pct.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_indel-lengths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Indels.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_SNP.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transitions.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transversions.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_count.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_qual.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/bcftools", + "reports/bcftools/strelka", + "reports/bcftools/strelka/test", + "reports/bcftools/strelka/test/test.strelka.variants.bcftools_stats.txt", + "reports/vcftools", + "reports/vcftools/strelka", + "reports/vcftools/strelka/test", + "reports/vcftools/strelka/test/test.strelka.variants.FILTER.summary", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.count", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.qual", + "variant_calling", + "variant_calling/strelka", + "variant_calling/strelka/test", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz.tbi", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz.tbi" + ], + [ + 
"bcftools-stats-subtypes.txt:md5,08ac2a27511e23e15734fa84e97b53bb", + "bcftools_stats_indel-lengths.txt:md5,e2ae4bb4c56cff896db88a20ff2933d9", + "bcftools_stats_vqc_Count_Indels.txt:md5,aeeb155cb7bf39e5f349ce38dbe1699f", + "bcftools_stats_vqc_Count_SNP.txt:md5,ad63f8dfcc15fdc738b1e8ff54c8e452", + "bcftools_stats_vqc_Count_Transitions.txt:md5,6155d186233346cda2cb67147db8f6e7", + "bcftools_stats_vqc_Count_Transversions.txt:md5,60a0122e9d3828a4019336a87e158e42", + "multiqc_bcftools_stats.txt:md5,3a9838b06c741231489075203469d2c4", + "multiqc_citations.txt:md5,ac2b3cf2dfb12c40837b9bbad8112d86", + "vcftools_tstv_by_count.txt:md5,5ca47a28d7ae697981041d8b458ae889", + "test.strelka.variants.bcftools_stats.txt:md5,3b37a441393a9329104451f44e56c619", + "test.strelka.variants.FILTER.summary:md5,39ff2cc8eb7495a14a6b76e0ab627027", + "test.strelka.variants.TsTv.count:md5,ee7dafc8d941b8502a04a63dc3126fff" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-11-09T17:23:40.285914" + }, + "Run with profile test | --input tests/csv/3.0/mapped_single_bam.csv --step recalibrate --tools null ": { + "content": [ + 6, + { + "GATK4_APPLYBQSR": { + "gatk4": "4.5.0.0" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "reference" + ], + [ + "multiqc_citations.txt:md5,4c806e63a283ec1b7e78cdae3a923d4f" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-11-09T17:19:08.093003" + } +} diff --git a/tests/test_aligner_bwamem.yml b/tests/test_aligner_bwamem.yml deleted file mode 
100644 index ccd81021ef..0000000000 --- a/tests/test_aligner_bwamem.yml +++ /dev/null @@ -1,81 +0,0 @@ -- name: Run bwamem - command: nextflow run main.nf -profile test --aligner bwa-mem --save_reference --outdir results - tags: - - aligner - - bwamem - - preprocessing - files: - - path: results/csv/markduplicates.csv - md5sum: 0d6120bb99e92f6810343270711ca53e - - path: results/csv/markduplicates_no_table.csv - md5sum: 2a2d3d4842befd4def39156463859ee3 - - path: results/csv/recalibrated.csv - md5sum: 2d29d9e53894dcce96a1b5beb6ef3312 - - path: results/multiqc - - path: results/preprocessing/markduplicates/test/test.md.cram - # binary changes md5sums on reruns - - path: results/preprocessing/markduplicates/test/test.md.cram.crai - # binary changes md5sums on reruns - - path: results/preprocessing/recal_table/test/test.recal.table - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/reference/bwa/genome.amb - md5sum: 1891c1de381b3a96d4e72f590fde20c1 - - path: results/reference/bwa/genome.ann - md5sum: 2df4aa2d7580639fa0fcdbcad5e2e969 - - path: results/reference/bwa/genome.bwt - md5sum: 815eded87e4cb6b0f1daab5c4d6e30af - - path: results/reference/bwa/genome.pac - md5sum: 8569fbdb2c98c6fb16dfa73d8eacb070 - - path: results/reference/bwa/genome.sa - md5sum: e7cff62b919448a3a3d0fe4aaf427594 - - path: results/reference/intervals/chr22_1-40001.bed - md5sum: 87a15eb9c2ff20ccd5cd8735a28708f7 - - path: results/reference/intervals/chr22_1-40001.bed.gz - md5sum: d3341fa28986c40b24fcc10a079dbb80 - - path: results/reference/intervals/genome.bed - md5sum: a87dc7d20ebca626f65cc16ff6c97a3e - - path: results/reports/fastqc/test-test_L1 - - path: results/reports/markduplicates/test/test.md.cram.metrics - contains: ["test 17094 1534 168 1046782 12429 197 0 0.635998", "1.0 
0.999991 1171"] - - path: results/reports/mosdepth/test/test.md.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.md.regions.bed.gz - - path: results/reports/mosdepth/test/test.md.regions.bed.gz.csi - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/samtools/test/test.md.cram.stats - # conda changes md5sums for test - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test -- name: Build only index with bwa - command: nextflow run main.nf -profile test --build_only_index --input false --outdir results - tags: - - aligner - - build_only_index - - bwamem - files: - - path: results/multiqc - - path: results/reference/bwa/genome.amb - md5sum: 1891c1de381b3a96d4e72f590fde20c1 - - path: results/reference/bwa/genome.ann - md5sum: 2df4aa2d7580639fa0fcdbcad5e2e969 - - path: results/reference/bwa/genome.bwt - md5sum: 815eded87e4cb6b0f1daab5c4d6e30af - - path: results/reference/bwa/genome.pac - md5sum: 8569fbdb2c98c6fb16dfa73d8eacb070 - - path: results/reference/bwa/genome.sa - md5sum: e7cff62b919448a3a3d0fe4aaf427594 - - path: results/reference/intervals/chr22_1-40001.bed - md5sum: 87a15eb9c2ff20ccd5cd8735a28708f7 - - path: results/reference/intervals/chr22_1-40001.bed.gz - md5sum: d3341fa28986c40b24fcc10a079dbb80 - - path: results/reference/intervals/genome.bed - md5sum: a87dc7d20ebca626f65cc16ff6c97a3e diff --git a/tests/test_aligner_bwamem2.yml b/tests/test_aligner_bwamem2.yml deleted file mode 100644 index be19c08df6..0000000000 --- 
a/tests/test_aligner_bwamem2.yml +++ /dev/null @@ -1,81 +0,0 @@ -- name: Run bwamem2 - command: nextflow run main.nf -profile test --aligner bwa-mem2 --save_reference --outdir results - tags: - - aligner - - bwamem2 - - preprocessing - files: - - path: results/csv/markduplicates.csv - md5sum: 0d6120bb99e92f6810343270711ca53e - - path: results/csv/markduplicates_no_table.csv - md5sum: 2a2d3d4842befd4def39156463859ee3 - - path: results/csv/recalibrated.csv - md5sum: 2d29d9e53894dcce96a1b5beb6ef3312 - - path: results/multiqc - - path: results/preprocessing/markduplicates/test/test.md.cram - # binary changes md5sums on reruns - - path: results/preprocessing/markduplicates/test/test.md.cram.crai - # binary changes md5sums on reruns - - path: results/preprocessing/recal_table/test/test.recal.table - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/reference/bwamem2/genome.fasta.0123 - md5sum: d73300d44f733bcdb7c988fc3ff3e3e9 - - path: results/reference/bwamem2/genome.fasta.amb - md5sum: 1891c1de381b3a96d4e72f590fde20c1 - - path: results/reference/bwamem2/genome.fasta.ann - md5sum: 2df4aa2d7580639fa0fcdbcad5e2e969 - - path: results/reference/bwamem2/genome.fasta.bwt.2bit.64 - md5sum: cd4bdf496eab05228a50c45ee43c1ed0 - - path: results/reference/bwamem2/genome.fasta.pac - md5sum: 8569fbdb2c98c6fb16dfa73d8eacb070 - - path: results/reference/intervals/chr22_1-40001.bed - md5sum: 87a15eb9c2ff20ccd5cd8735a28708f7 - - path: results/reference/intervals/chr22_1-40001.bed.gz - md5sum: d3341fa28986c40b24fcc10a079dbb80 - - path: results/reference/intervals/genome.bed - md5sum: a87dc7d20ebca626f65cc16ff6c97a3e - - path: results/reports/fastqc/test-test_L1 - - path: results/reports/markduplicates/test/test.md.cram.metrics - contains: ["test 17094 1534 168 1046782 12429 197 0 
0.635998", "1.0 0.999991 1171"] - - path: results/reports/mosdepth/test/test.md.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.md.regions.bed.gz - - path: results/reports/mosdepth/test/test.md.regions.bed.gz.csi - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/samtools/test/test.md.cram.stats - # conda changes md5sums for test - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test -- name: Build only index with bwa-mem2 - command: nextflow run main.nf -profile test --build_only_index --aligner bwa-mem2 --input false --outdir results - tags: - - aligner - - build_only_index - - bwamem2 - files: - - path: results/multiqc - - path: results/reference/bwamem2/genome.fasta.0123 - md5sum: d73300d44f733bcdb7c988fc3ff3e3e9 - - path: results/reference/bwamem2/genome.fasta.amb - md5sum: 1891c1de381b3a96d4e72f590fde20c1 - - path: results/reference/bwamem2/genome.fasta.ann - md5sum: 2df4aa2d7580639fa0fcdbcad5e2e969 - - path: results/reference/bwamem2/genome.fasta.bwt.2bit.64 - md5sum: cd4bdf496eab05228a50c45ee43c1ed0 - - path: results/reference/bwamem2/genome.fasta.pac - md5sum: 8569fbdb2c98c6fb16dfa73d8eacb070 - - path: results/reference/intervals/chr22_1-40001.bed - md5sum: 87a15eb9c2ff20ccd5cd8735a28708f7 - - path: results/reference/intervals/chr22_1-40001.bed.gz - md5sum: d3341fa28986c40b24fcc10a079dbb80 - - path: results/reference/intervals/genome.bed - md5sum: a87dc7d20ebca626f65cc16ff6c97a3e diff --git a/tests/test_aligner_dragmap.yml 
b/tests/test_aligner_dragmap.yml deleted file mode 100644 index 6f38538926..0000000000 --- a/tests/test_aligner_dragmap.yml +++ /dev/null @@ -1,107 +0,0 @@ -- name: Run dragmap - command: nextflow run main.nf -profile test --aligner dragmap --save_reference --outdir results - tags: - - aligner - - dragmap - - preprocessing - files: - - path: results/csv/markduplicates.csv - md5sum: 0d6120bb99e92f6810343270711ca53e - - path: results/csv/markduplicates_no_table.csv - md5sum: 2a2d3d4842befd4def39156463859ee3 - - path: results/csv/recalibrated.csv - md5sum: 2d29d9e53894dcce96a1b5beb6ef3312 - - path: results/multiqc - - path: results/preprocessing/markduplicates/test/test.md.cram - # binary changes md5sums on reruns - - path: results/preprocessing/markduplicates/test/test.md.cram.crai - # binary changes md5sums on reruns - - path: results/preprocessing/recal_table/test/test.recal.table - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/reference/dragmap/hash_table.cfg - contains: - [ - "reference_sequences = 1", - "reference_len = 368640", - "reference_len_raw = 40001", - "reference_len_not_n = 40001", - "reference_alt_seed = 204800", - ] - - path: results/reference/dragmap/hash_table.cfg.bin - # binary changes md5sums on reruns - - path: results/reference/dragmap/hash_table.cmp - md5sum: 1caab4ffc89f81ace615a2e813295cf4 - - path: results/reference/dragmap/hash_table_stats.txt - contains: ["A bases: 10934", "C bases: 8612", "G bases: 8608", "T bases: 11847"] - - path: results/reference/dragmap/ref_index.bin - md5sum: dbb5c7d26b974e0ac338024fe4535044 - - path: results/reference/dragmap/reference.bin - md5sum: be67b80ee48aa96b383fd72f1ccfefea - - path: results/reference/dragmap/repeat_mask.bin - md5sum: 294939f1f80aa7f4a70b9b537e4c0f21 - - path: 
results/reference/dragmap/str_table.bin - md5sum: 45f7818c4a10fdeed04db7a34b5f9ff1 - - path: results/reference/intervals/chr22_1-40001.bed - md5sum: 87a15eb9c2ff20ccd5cd8735a28708f7 - - path: results/reference/intervals/chr22_1-40001.bed.gz - md5sum: d3341fa28986c40b24fcc10a079dbb80 - - path: results/reference/intervals/genome.bed - md5sum: a87dc7d20ebca626f65cc16ff6c97a3e - - path: results/reports/fastqc/test-test_L1 - - path: results/reports/markduplicates/test/test.md.cram.metrics - contains: ["LB0 27214 1086 322 1037558 20017 100 0 0.687981"] - - path: results/reports/mosdepth/test/test.md.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.md.regions.bed.gz - - path: results/reports/mosdepth/test/test.md.regions.bed.gz.csi - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/samtools/test/test.md.cram.stats - # conda changes md5sums for test - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test -- name: Build only index with dragmap - command: nextflow run main.nf -profile test --build_only_index --aligner dragmap --input false --outdir results - tags: - - aligner - - build_only_index - - dragmap - files: - - path: results/multiqc - - path: results/reference/dragmap/hash_table.cfg - contains: - [ - "reference_sequences = 1", - "reference_len = 368640", - "reference_len_raw = 40001", - "reference_len_not_n = 40001", - "reference_alt_seed = 204800", - ] - - path: results/reference/dragmap/hash_table.cfg.bin - # binary changes md5sums on 
reruns - - path: results/reference/dragmap/hash_table.cmp - md5sum: 1caab4ffc89f81ace615a2e813295cf4 - - path: results/reference/dragmap/hash_table_stats.txt - contains: ["A bases: 10934", "C bases: 8612", "G bases: 8608", "T bases: 11847"] - - path: results/reference/dragmap/ref_index.bin - md5sum: dbb5c7d26b974e0ac338024fe4535044 - - path: results/reference/dragmap/reference.bin - md5sum: be67b80ee48aa96b383fd72f1ccfefea - - path: results/reference/dragmap/repeat_mask.bin - md5sum: 294939f1f80aa7f4a70b9b537e4c0f21 - - path: results/reference/dragmap/str_table.bin - md5sum: 45f7818c4a10fdeed04db7a34b5f9ff1 - - path: results/reference/intervals/chr22_1-40001.bed - md5sum: 87a15eb9c2ff20ccd5cd8735a28708f7 - - path: results/reference/intervals/chr22_1-40001.bed.gz - md5sum: d3341fa28986c40b24fcc10a079dbb80 - - path: results/reference/intervals/genome.bed - md5sum: a87dc7d20ebca626f65cc16ff6c97a3e diff --git a/tests/test_alignment_from_everything.yml b/tests/test_alignment_from_everything.yml deleted file mode 100644 index cdbbb15b36..0000000000 --- a/tests/test_alignment_from_everything.yml +++ /dev/null @@ -1,42 +0,0 @@ -- name: Run alignment to bam fastq and spring files to bam files - command: nextflow run main.nf -profile test,alignment_from_everything --outdir results --save_mapped --save_output_as_bam - tags: - - alignment_from_everything - files: - - path: results/csv/mapped.csv - md5sum: 5a10f49a6a691c84e31e7dd8e91c8201 - - path: results/csv/markduplicates.csv - md5sum: 293ae6ec0286272470dd8d6edf4b4fc9 - - path: results/csv/markduplicates_no_table.csv - md5sum: e379af37e14b94c17465654f971bd23f - - path: results/csv/recalibrated.csv - md5sum: 976529de568cdf201e6b9dfb2da3f62f - - path: results/multiqc - - path: results/preprocessing/mapped/test/test.sorted.bam - - path: results/preprocessing/mapped/test/test.sorted.bam.bai - - path: results/preprocessing/mapped/test2/test2.sorted.bam - - path: results/preprocessing/mapped/test2/test2.sorted.bam.bai - - path: 
results/preprocessing/mapped/test3/test3.sorted.bam - - path: results/preprocessing/mapped/test3/test3.sorted.bam.bai - - path: results/preprocessing/mapped/test_bam/test_bam.sorted.bam - - path: results/preprocessing/mapped/test_bam/test_bam.sorted.bam.bai - - path: results/preprocessing/markduplicates/test/test.md.bam - - path: results/preprocessing/markduplicates/test/test.md.bam.bai - - path: results/preprocessing/markduplicates/test2/test2.md.bam - - path: results/preprocessing/markduplicates/test2/test2.md.bam.bai - - path: results/preprocessing/markduplicates/test3/test3.md.bam - - path: results/preprocessing/markduplicates/test3/test3.md.bam.bai - - path: results/preprocessing/markduplicates/test_bam/test_bam.md.bam - - path: results/preprocessing/markduplicates/test_bam/test_bam.md.bam.bai - - path: results/preprocessing/recal_table/test/test.recal.table - - path: results/preprocessing/recal_table/test2/test2.recal.table - - path: results/preprocessing/recal_table/test3/test3.recal.table - - path: results/preprocessing/recal_table/test_bam/test_bam.recal.table - - path: results/preprocessing/recalibrated/test/test.recal.bam - - path: results/preprocessing/recalibrated/test/test.recal.bam.bai - - path: results/preprocessing/recalibrated/test2/test2.recal.bam - - path: results/preprocessing/recalibrated/test2/test2.recal.bam.bai - - path: results/preprocessing/recalibrated/test3/test3.recal.bam - - path: results/preprocessing/recalibrated/test3/test3.recal.bam.bai - - path: results/preprocessing/recalibrated/test_bam/test_bam.recal.bam - - path: results/preprocessing/recalibrated/test_bam/test_bam.recal.bam.bai diff --git a/tests/test_alignment_to_fastq.yml b/tests/test_alignment_to_fastq.yml deleted file mode 100644 index bc7a3e5057..0000000000 --- a/tests/test_alignment_to_fastq.yml +++ /dev/null @@ -1,42 +0,0 @@ -- name: Run alignment to fastq and then remap on bam files - command: nextflow run main.nf -profile test,alignment_to_fastq --outdir results - 
tags: - - alignment_to_fastq - - input_bam - files: - - path: results/csv/markduplicates.csv - md5sum: 0d6120bb99e92f6810343270711ca53e - - path: results/csv/markduplicates_no_table.csv - md5sum: 2a2d3d4842befd4def39156463859ee3 - - path: results/csv/recalibrated.csv - md5sum: 2d29d9e53894dcce96a1b5beb6ef3312 - - path: results/multiqc - - path: results/preprocessing/markduplicates/test/test.md.cram - # binary changes md5sums on reruns - - path: results/preprocessing/markduplicates/test/test.md.cram.crai - # binary changes md5sums on reruns - - path: results/preprocessing/recal_table/test/test.recal.table - md5sum: 9c0517ffdc5d30a5c73b9f7df1ff3060 - - path: results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/reports/fastqc/test-1 - - path: results/reports/markduplicates/test/test.md.cram.metrics - contains: ["test 0 2820 2 2 0 828 0 0.293617 3807", "1.0 0.999986 1178 1178", "2.0 1.47674 800 800", "100.0 1.911145 0 0"] - - path: results/reports/mosdepth/test/test.md.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.md.regions.bed.gz - - path: results/reports/mosdepth/test/test.md.regions.bed.gz.csi - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/samtools/test/test.md.cram.stats - # conda changes md5sums for test - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test - - 
path: results/preprocessing/mapped/ - should_exist: false diff --git a/tests/test_annotation_bcfann.yml b/tests/test_annotation_bcfann.yml deleted file mode 100644 index 99ac781d8e..0000000000 --- a/tests/test_annotation_bcfann.yml +++ /dev/null @@ -1,10 +0,0 @@ -- name: Run bcfann - command: nextflow run main.nf -profile test,annotation --tools bcfann --outdir results - tags: - - annotation - - bcfann - files: - - path: results/annotation/test/test_BCF.ann.vcf.gz - # binary changes md5sums on reruns - - path: results/annotation/test/test_BCF.ann.vcf.gz.tbi - # binary changes md5sums on reruns diff --git a/tests/test_annotation_cache.yml b/tests/test_annotation_cache.yml deleted file mode 100644 index 290169d32d..0000000000 --- a/tests/test_annotation_cache.yml +++ /dev/null @@ -1,29 +0,0 @@ -- name: Only download annotation cache - command: nextflow run main.nf -profile test,annotation --tools merge --download_cache --input false --build_only_index --outdir results - tags: - - annotation - - cache - - vep - - snpeff - files: - - path: results/multiqc - - path: results/cache/snpeff_cache - - path: results/cache/vep_cache - - path: results/annotation - should_exist: false - -- name: Fail to locate VEP cache - command: nextflow run main.nf -profile test,annotation --vep_cache s3://annotation-cache/vep_cache/ --vep_cache_version 1 --tools vep --input false --build_only_index --outdir results - tags: - - annotation - - cache - - vep - exit_code: 1 - -- name: Fail to locate snpEff cache - command: nextflow run main.nf -profile test,annotation --snpeff_cache s3://annotation-cache/snpeff_cache/ --snpeff_genome na --tools snpeff --input false --build_only_index --outdir results - tags: - - annotation - - cache - - snpeff - exit_code: 1 diff --git a/tests/test_annotation_merge.yml b/tests/test_annotation_merge.yml deleted file mode 100644 index 7c39948743..0000000000 --- a/tests/test_annotation_merge.yml +++ /dev/null @@ -1,58 +0,0 @@ -- name: Run snpEff followed by VEP - 
command: nextflow run main.nf -profile test,annotation --tools merge --outdir results --download_cache - tags: - - annotation - - merge - files: - - path: results/annotation/test/test_snpEff_VEP.ann.vcf.gz - # binary changes md5sums on reruns - - path: results/annotation/test/test_snpEff_VEP.ann.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/reports/EnsemblVEP/test/test_snpEff_VEP.ann.summary.html - contains: ["test_snpEff.ann.vcf.gzOutput filetest_snpEff_VEP.ann.vcf.gz"] - - path: results/multiqc - - path: results/annotation/test/test_snpEff.ann.vcf.gz - should_exist: false - - path: results/annotation/test/test_snpEff.ann.vcf.gz.tbi - should_exist: false - - path: results/annotation/test/test_VEP.ann.vcf.gz - should_exist: false - - path: results/annotation/test/test_VEP.ann.vcf.gz.tbi - should_exist: false - - path: results/reports/snpeff/test/snpEff_summary.html - should_exist: false - - path: results/reports/snpeff/test/test_snpEff.csv - should_exist: false - - path: results/reports/snpeff/test/test_snpEff.genes.txt - should_exist: false - - path: results/reports/EnsemblVEP/test/test_VEP.ann.summary.html - should_exist: false -- name: Run VEP and snpEff followed by VEP - command: nextflow run main.nf -profile test,annotation --tools merge,snpeff,vep --outdir results --download_cache - tags: - - annotation - - merge - files: - - path: results/annotation/test/test_VEP.ann.vcf.gz - # binary changes md5sums on reruns - - path: results/annotation/test/test_VEP.ann.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/annotation/test/test_snpEff.ann.vcf.gz - # binary changes md5sums on reruns - - path: results/annotation/test/test_snpEff.ann.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/annotation/test/test_snpEff_VEP.ann.vcf.gz - # binary changes md5sums on reruns - - path: results/annotation/test/test_snpEff_VEP.ann.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/multiqc - - path: 
results/reports/EnsemblVEP/test/test_VEP.ann.summary.html - # text-based file changes md5sums on reruns - - path: results/reports/EnsemblVEP/test/test_snpEff_VEP.ann.summary.html - # text-based file changes md5sums on reruns - - path: results/reports/snpeff/test/snpEff_summary.html - # text-based file changes md5sums on reruns - - path: results/reports/snpeff/test/test_snpEff.csv - # text-based file changes md5sums on reruns - - path: results/reports/snpeff/test/test_snpEff.genes.txt - md5sum: 130536bf0237d7f3f746d32aaa32840a diff --git a/tests/test_annotation_snpeff.yml b/tests/test_annotation_snpeff.yml deleted file mode 100644 index 5ad71d0664..0000000000 --- a/tests/test_annotation_snpeff.yml +++ /dev/null @@ -1,27 +0,0 @@ -- name: Run snpEff - command: nextflow run main.nf -profile test,annotation --tools snpeff --outdir results --download_cache - tags: - - annotation - - snpeff - files: - - path: results/annotation/test/test_snpEff.ann.vcf.gz - # binary changes md5sums on reruns - - path: results/annotation/test/test_snpEff.ann.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/multiqc - - path: results/reports/snpeff/test/snpEff_summary.html - contains: [" Genome total length ", " 100,286,402 ", " MT192765.1 "] - - path: results/reports/snpeff/test/test_snpEff.csv - contains: - [ - "Values , 50,100", - "Count , 1,8", - "Reference , 0", - "Het , 1", - "Hom , 8", - "Missing , 0", - "MT192765.1, Position,0,1", - "MT192765.1,Count,0,0", - ] - - path: results/reports/snpeff/test/test_snpEff.genes.txt - md5sum: 130536bf0237d7f3f746d32aaa32840a diff --git a/tests/test_annotation_vep.yml b/tests/test_annotation_vep.yml deleted file mode 100644 index c8c16747b2..0000000000 --- a/tests/test_annotation_vep.yml +++ /dev/null @@ -1,27 +0,0 @@ -- name: Run VEP - command: nextflow run main.nf -profile test,annotation --tools vep --outdir results --download_cache - tags: - - annotation - - vep - files: - - path: results/annotation/test/test_VEP.ann.vcf.gz - # 
binary changes md5sums on reruns - - path: results/annotation/test/test_VEP.ann.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/multiqc - - path: results/reports/EnsemblVEP/test/test_VEP.ann.summary.html - contains: ["test.vcf.gzOutput filetest_VEP.ann.vcf.gz"] -- name: Run VEP with fasta - command: nextflow run main.nf -profile test,annotation --tools vep --vep_include_fasta --outdir results --download_cache - tags: - - annotation - - vep - files: - - path: results/annotation/test/test_VEP.ann.vcf.gz - # binary changes md5sums on reruns - - path: results/annotation/test/test_VEP.ann.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/multiqc - - path: results/reports/EnsemblVEP/test/test_VEP.ann.summary.html - # text-based file changes md5sums on reruns - contains: ["test.vcf.gzOutput filetest_VEP.ann.vcf.gz"] diff --git a/tests/test_controlfreec.yml b/tests/test_controlfreec.yml deleted file mode 100644 index 5483c69469..0000000000 --- a/tests/test_controlfreec.yml +++ /dev/null @@ -1,173 +0,0 @@ -- name: Run variant calling on somatic samples with controlfreec - command: nextflow run main.nf -profile test,tools_somatic --tools controlfreec --outdir results - tags: - - controlfreec - - somatic - - variant_calling - - copy_number_calling - files: - - path: results/multiqc - - path: results/variant_calling/controlfreec/sample4_vs_sample3/config.txt - contains: - [ - "BedGraphOutput = TRUE", - "minExpectedGC = 0", - "maxThreads = 2", - "noisyData = TRUE", - "readCountThreshold = 1", - "sex = XX", - "window = 10", - ] - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.bed - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.circos.txt - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.normal.mpileup.gz_control.cpn - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.p.value.txt - # binary changes md5sums 
on reruns - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.tumor.mpileup.gz_BAF.txt - # binary changes md5sums on reruns - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.tumor.mpileup.gz_CNVs - # binary changes md5sums on reruns - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.tumor.mpileup.gz_info.txt - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.tumor.mpileup.gz_ratio.BedGraph - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.tumor.mpileup.gz_ratio.txt - # binary changes md5sums on reruns - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.tumor.mpileup.gz_sample.cpn - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_BAF.png - # binary changes md5sums on reruns - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_ratio.log2.png - # binary changes md5sums on reruns - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_ratio.png - # binary changes md5sums on reruns - - path: results/variant_calling/mpileup/sample4_vs_sample3/sample4_vs_sample3.normal.mpileup.gz - should_exist: false - - path: results/variant_calling/mpileup/sample4_vs_sample3/sample4_vs_sample3.tumor.mpileup.gz - should_exist: false - - path: results/cnvkit - should_exist: false - - path: results/reports/mosdepth/sample3/sample3.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample3/sample3.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/sample3/sample3.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample3/sample3.recal.regions.bed.gz - - path: results/reports/mosdepth/sample3/sample3.recal.regions.bed.gz.csi - - path: results/reports/samtools/sample3/sample3.recal.cram.stats - # conda changes md5sums for test - - path: 
results/reports/mosdepth/sample4/sample4.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample4/sample4.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/sample4/sample4.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample4/sample4.recal.regions.bed.gz - - path: results/reports/mosdepth/sample4/sample4.recal.regions.bed.gz.csi - - path: results/reports/samtools/sample4/sample4.recal.cram.stats - # conda changes md5sums for test -- name: Run variant calling on somatic samples with controlfreec without intervals - command: nextflow run main.nf -profile test,tools_somatic --tools controlfreec --no_intervals -stub-run --outdir results - tags: - - controlfreec - - no_intervals - - somatic - - variant_calling - - copy_number_calling - files: - - path: results/multiqc - - path: results/no_intervals.bed - md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: results/no_intervals.bed.gz - md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: results/no_intervals.bed.gz.tbi - md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: results/variant_calling/controlfreec/sample4_vs_sample3/GC_profile.sample4_vs_sample3.cpn - md5sum: d41d8cd98f00b204e9800998ecf8427e # This is the md5sum of an empty file. Are all these files suppose to be empty? 
- - path: results/variant_calling/controlfreec/sample4_vs_sample3/config.txt - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.bed - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.circos.txt - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.p.value.txt - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_BAF.png - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_BAF.txt - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_CNVs - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_info.txt - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_ratio.BedGraph - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_ratio.log2.png - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_ratio.png - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_ratio.txt - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_sample.cpn - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/mpileup/sample4_vs_sample3/sample4_vs_sample3.normal.mpileup.gz - should_exist: false - - path: results/variant_calling/mpileup/sample4_vs_sample3/sample4_vs_sample3.tumor.mpileup.gz - 
should_exist: false - - path: results/controlfreec - should_exist: false - - path: results/mpileup - should_exist: false - - path: results/reports/mosdepth/sample3/sample3.recal.global.dist.txt - - path: results/reports/mosdepth/sample3/sample3.recal.summary.txt - - path: results/reports/mosdepth/sample3/sample3.recal.per-base.bed.gz - - path: results/reports/mosdepth/sample3/sample3.recal.per-base.bed.gz.csi - - path: results/reports/samtools/sample3/sample3.recal.cram.stats - # conda changes md5sums for test - - path: results/reports/mosdepth/sample4/sample4.recal.global.dist.txt - - path: results/reports/mosdepth/sample4/sample4.recal.summary.txt - - path: results/reports/mosdepth/sample4/sample4.recal.per-base.bed.gz - - path: results/reports/mosdepth/sample4/sample4.recal.per-base.bed.gz.csi - - path: results/reports/samtools/sample4/sample4.recal.cram.stats - # conda changes md5sums for test -- name: Run variant calling on tumor_only sample with controlfreec - command: nextflow run main.nf -profile test,tools_tumoronly --tools controlfreec -stub-run --outdir results - tags: - - controlfreec - - tumor_only - - variant_calling - - copy_number_calling - files: - - path: results/multiqc - - path: results/variant_calling/controlfreec/sample2/GC_profile.sample2.cpn - md5sum: d41d8cd98f00b204e9800998ecf8427e # This is the md5sum of an empty file. Are all these files suppose to be empty? 
- - path: results/variant_calling/controlfreec/sample2/config.txt - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample2/sample2.bed - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample2/sample2.circos.txt - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample2/sample2.p.value.txt - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample2/sample2_BAF.png - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample2/sample2_BAF.txt - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample2/sample2_CNVs - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample2/sample2_info.txt - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample2/sample2_ratio.BedGraph - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample2/sample2_ratio.log2.png - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample2/sample2_ratio.png - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample2/sample2_ratio.txt - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/controlfreec/sample2/sample2_sample.cpn - md5sum: d41d8cd98f00b204e9800998ecf8427e - - path: results/variant_calling/mpileup/sample2/sample2.tumor.mpileup.gz - should_exist: false - - path: results/controlfreec - should_exist: false - - path: results/mpileup - should_exist: false - - path: results/reports/mosdepth/sample2/sample2.recal.global.dist.txt - - path: results/reports/mosdepth/sample2/sample2.recal.region.dist.txt - - path: results/reports/mosdepth/sample2/sample2.recal.summary.txt - - path: results/reports/mosdepth/sample2/sample2.recal.regions.bed.gz - - path: 
results/reports/mosdepth/sample2/sample2.recal.regions.bed.gz.csi - - path: results/reports/samtools/sample2/sample2.recal.cram.stats diff --git a/tests/test_deepvariant.yml b/tests/test_deepvariant.yml index e79be5e2f6..05f4be02a4 100644 --- a/tests/test_deepvariant.yml +++ b/tests/test_deepvariant.yml @@ -1,37 +1,27 @@ - name: Run variant calling on germline sample with deepvariant - command: nextflow run main.nf -profile test,tools_germline --tools deepvariant --outdir results + command: nextflow run main.nf -profile test,tools_germline_deepvariant --tools deepvariant --outdir results tags: - deepvariant - germline - variant_calling files: - path: results/multiqc - - path: results/reports/bcftools/deepvariant/sample1/sample1.deepvariant.bcftools_stats.txt - # md5sum: a6634ceb1c712de14009b05d273713a7 - - path: results/reports/vcftools/deepvariant/sample1/sample1.deepvariant.FILTER.summary - md5sum: acce7a163f4070226429f9d6bc3fbd2c - - path: results/reports/vcftools/deepvariant/sample1/sample1.deepvariant.TsTv.count - md5sum: de1632b8413f4c14c78acdc2df5c5224 - - path: results/reports/vcftools/deepvariant/sample1/sample1.deepvariant.TsTv.qual - md5sum: a9c05f0ecb0bb71123e345589bd7089c - - path: results/variant_calling/deepvariant/sample1/sample1.deepvariant.g.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/deepvariant/sample1/sample1.deepvariant.g.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/deepvariant/sample1/sample1.deepvariant.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/deepvariant/sample1/sample1.deepvariant.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/deepvariant - should_exist: false - - path: results/reports/mosdepth/sample1/sample1.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample1/sample1.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/sample1/sample1.recal.mosdepth.summary.txt - - path: 
results/reports/mosdepth/sample1/sample1.recal.regions.bed.gz - - path: results/reports/mosdepth/sample1/sample1.recal.regions.bed.gz.csi - - path: results/reports/samtools/sample1/sample1.recal.cram.stats + - path: results/reports/bcftools/deepvariant/test/test.deepvariant.bcftools_stats.txt + - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt + - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt + - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt + - path: results/reports/mosdepth/test/test.recal.regions.bed.gz + - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi + - path: results/reports/samtools/test/test.recal.cram.stats + - path: results/reports/vcftools/deepvariant/test/test.deepvariant.FILTER.summary + - path: results/reports/vcftools/deepvariant/test/test.deepvariant.TsTv.count + - path: results/reports/vcftools/deepvariant/test/test.deepvariant.TsTv.qual + - path: results/variant_calling/deepvariant/test/test.deepvariant.g.vcf.gz + - path: results/variant_calling/deepvariant/test/test.deepvariant.g.vcf.gz.tbi + - path: results/variant_calling/deepvariant/test/test.deepvariant.vcf.gz + - path: results/variant_calling/deepvariant/test/test.deepvariant.vcf.gz.tbi - name: Run variant calling on germline sample with deepvariant without intervals - command: nextflow run main.nf -profile test,tools_germline --tools deepvariant --no_intervals --outdir results + command: nextflow run main.nf -profile test,tools_germline_deepvariant --tools deepvariant --no_intervals --outdir results tags: - deepvariant - germline @@ -40,31 +30,18 @@ files: - path: results/multiqc - path: results/no_intervals.bed - md5sum: f3dac01ea66b95fe477446fde2d31489 - path: results/no_intervals.bed.gz - md5sum: f3dac01ea66b95fe477446fde2d31489 - path: results/no_intervals.bed.gz.tbi - md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: 
results/reports/bcftools/deepvariant/sample1/sample1.deepvariant.bcftools_stats.txt - # md5sum: 0c48d8e315ca23c5dc2e7bf71ea0b6a6 - - path: results/reports/vcftools/deepvariant/sample1/sample1.deepvariant.FILTER.summary - md5sum: 7b17bd18c2d4bf129561c7c6a419a889 - - path: results/reports/vcftools/deepvariant/sample1/sample1.deepvariant.TsTv.count - md5sum: e570b07835a793bbab4f517cabed5a45 - - path: results/reports/vcftools/deepvariant/sample1/sample1.deepvariant.TsTv.qual - md5sum: 03f64b8092fc212bcb746b08f9e676a5 - - path: results/variant_calling/deepvariant/sample1/sample1.deepvariant.g.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/deepvariant/sample1/sample1.deepvariant.g.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/deepvariant/sample1/sample1.deepvariant.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/deepvariant/sample1/sample1.deepvariant.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/deepvariant - should_exist: false - - path: results/reports/mosdepth/sample1/sample1.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample1/sample1.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample1/sample1.recal.per-base.bed.gz - - path: results/reports/mosdepth/sample1/sample1.recal.per-base.bed.gz.csi - - path: results/reports/samtools/sample1/sample1.recal.cram.stats + - path: results/reports/bcftools/deepvariant/test/test.deepvariant.bcftools_stats.txt + - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt + - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt + - path: results/reports/mosdepth/test/test.recal.per-base.bed.gz + - path: results/reports/mosdepth/test/test.recal.per-base.bed.gz.csi + - path: results/reports/samtools/test/test.recal.cram.stats + - path: results/reports/vcftools/deepvariant/test/test.deepvariant.FILTER.summary + - path: 
results/reports/vcftools/deepvariant/test/test.deepvariant.TsTv.count + - path: results/reports/vcftools/deepvariant/test/test.deepvariant.TsTv.qual + - path: results/variant_calling/deepvariant/test/test.deepvariant.g.vcf.gz + - path: results/variant_calling/deepvariant/test/test.deepvariant.g.vcf.gz.tbi + - path: results/variant_calling/deepvariant/test/test.deepvariant.vcf.gz + - path: results/variant_calling/deepvariant/test/test.deepvariant.vcf.gz.tbi diff --git a/tests/test_default.yml b/tests/test_default.yml deleted file mode 100644 index 97ebb3918d..0000000000 --- a/tests/test_default.yml +++ /dev/null @@ -1,62 +0,0 @@ -- name: Run default pipeline - command: nextflow run main.nf -profile test --outdir results - tags: - - default - - preprocessing - - variant_calling - files: - - path: results/csv/markduplicates.csv - md5sum: 0d6120bb99e92f6810343270711ca53e - - path: results/csv/markduplicates_no_table.csv - md5sum: 2a2d3d4842befd4def39156463859ee3 - - path: results/csv/recalibrated.csv - md5sum: 2d29d9e53894dcce96a1b5beb6ef3312 - - path: results/csv/variantcalled.csv - md5sum: 4d0effd3d8dc2b814230a189e7ca9dba - - path: results/multiqc - - path: results/preprocessing/markduplicates/test/test.md.cram - # binary changes md5sums on reruns - - path: results/preprocessing/markduplicates/test/test.md.cram.crai - # binary changes md5sums on reruns - - path: results/preprocessing/recal_table/test/test.recal.table - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/reports/bcftools/strelka/test/test.strelka.variants.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/fastqc/test-test_L1 - - path: results/reports/markduplicates/test/test.md.cram.metrics - contains: ["test 17094 1534 168 1046782 12429 197 0 0.635998", "1.0 
0.999991 1171"] - - path: results/reports/mosdepth/test/test.md.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.md.regions.bed.gz - - path: results/reports/mosdepth/test/test.md.regions.bed.gz.csi - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/samtools/test/test.md.cram.stats - # conda changes md5sums for test - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/test/test.strelka.variants.FILTER.summary - md5sum: ad417bc96d31223f61170987975d8128 - - path: results/reports/vcftools/strelka/test/test.strelka.variants.TsTv.count - md5sum: fa27f678965b7cba6a92efcd039f802a - - path: results/reports/vcftools/strelka/test/test.strelka.variants.TsTv.qual - md5sum: bc68ae4e688e9fb772b457069e604883 - - path: results/variant_calling/strelka/test/test.strelka.genome.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.variants.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.variants.vcf.gz.tbi - - path: results/strelka - should_exist: false - - path: results/preprocessing/mapped/ - should_exist: false diff --git a/tests/test_fastp.yml b/tests/test_fastp.yml index 20eab4065f..3d70bf120d 100644 --- a/tests/test_fastp.yml +++ b/tests/test_fastp.yml @@ -29,7 +29,7 @@ - path: 
results/reports/fastp/test - path: results/reports/fastqc/test-test_L1 - path: results/reports/markduplicates/test/test.md.cram.metrics - contains: ["test 16608 1860 160 1046616 12117 256 0 0.621261"] + contains: ["test 16608 1860 164 1046488 12097 254 0 0.620081 6174"] - path: results/reports/mosdepth/test/test.md.mosdepth.global.dist.txt - path: results/reports/mosdepth/test/test.md.mosdepth.region.dist.txt - path: results/reports/mosdepth/test/test.md.mosdepth.summary.txt @@ -78,7 +78,10 @@ - path: results/reports/fastp/test - path: results/reports/fastqc/test-test_L1 - path: results/reports/markduplicates/test/test.md.cram.metrics - contains: ["test 17482 890 170 1047682 12552 69 0 0.65881"] + contains: + [ + "LIBRARY UNPAIRED_READS_EXAMINED READ_PAIRS_EXAMINED SECONDARY_OR_SUPPLEMENTARY_RDS UNMAPPED_READS UNPAIRED_READ_DUPLICATES READ_PAIR_DUPLICATES READ_PAIR_OPTICAL_DUPLICATES PERCENT_DUPLICATION ESTIMATED_LIBRARY_SIZE", + ] - path: results/reports/mosdepth/test/test.md.mosdepth.global.dist.txt - path: results/reports/mosdepth/test/test.md.mosdepth.region.dist.txt - path: results/reports/mosdepth/test/test.md.mosdepth.summary.txt diff --git a/tests/test_haplotypecaller.yml b/tests/test_haplotypecaller.yml index 43cfca38bc..43179e69f0 100644 --- a/tests/test_haplotypecaller.yml +++ b/tests/test_haplotypecaller.yml @@ -8,10 +8,6 @@ - path: results/csv/variantcalled.csv md5sum: d7d86e82902a4f57876b2414a4f812a4 - path: results/multiqc - - path: results/preprocessing/converted/test/test.converted.cram - # binary changes md5sums on reruns - - path: results/preprocessing/converted/test/test.converted.cram.crai - # binary changes md5sums on reruns - path: results/preprocessing/recalibrated/test/test.recal.cram should_exist: false - path: results/preprocessing/recalibrated/test/test.recal.cram.crai @@ -58,10 +54,6 @@ md5sum: f3dac01ea66b95fe477446fde2d31489 - path: results/no_intervals.bed.gz.tbi md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: 
results/preprocessing/converted/test/test.converted.cram - # binary changes md5sums on reruns - - path: results/preprocessing/converted/test/test.converted.cram.crai - # binary changes md5sums on reruns - path: results/preprocessing/recalibrated/test/test.recal.cram should_exist: false - path: results/preprocessing/recalibrated/test/test.recal.cram.crai diff --git a/tests/test_haplotypecaller_skip_filter.yml b/tests/test_haplotypecaller_skip_filter.yml index dd9faf7e42..98db41beee 100644 --- a/tests/test_haplotypecaller_skip_filter.yml +++ b/tests/test_haplotypecaller_skip_filter.yml @@ -8,10 +8,6 @@ - path: results/csv/variantcalled.csv md5sum: f1041cfc30cedb240f224dd8e3dbf9d2 - path: results/multiqc - - path: results/preprocessing/converted/test/test.converted.cram - # binary changes md5sums on reruns - - path: results/preprocessing/converted/test/test.converted.cram.crai - # binary changes md5sums on reruns - path: results/preprocessing/recalibrated/test/test.recal.cram should_exist: false - path: results/preprocessing/recalibrated/test/test.recal.cram.crai @@ -61,10 +57,6 @@ md5sum: f3dac01ea66b95fe477446fde2d31489 - path: results/no_intervals.bed.gz.tbi md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: results/preprocessing/converted/test/test.converted.cram - # binary changes md5sums on reruns - - path: results/preprocessing/converted/test/test.converted.cram.crai - # binary changes md5sums on reruns - path: results/preprocessing/recalibrated/test/test.recal.cram should_exist: false - path: results/preprocessing/recalibrated/test/test.recal.cram.crai diff --git a/tests/test_lofreq.yml b/tests/test_lofreq.yml new file mode 100644 index 0000000000..b5a268ad5f --- /dev/null +++ b/tests/test_lofreq.yml @@ -0,0 +1,64 @@ +- name: Run variant calling on tumor only sample with lofreq + command: nextflow run main.nf -profile test,tools_tumoronly --tools lofreq --outdir results + tags: + - lofreq + - tumor_only + - variant_calling + files: + - path: 
results/csv/variantcalled.csv + md5sum: cc7725ef0808ee07002a50ab873ee45c + - path: results/multiqc + - path: results/reports/bcftools/lofreq/sample2/sample2.lofreq.bcftools_stats.txt + md5sum: 9c9de2e4ed2f324adf1912a45d73601f + # conda changes md5sums for test + - path: results/reports/samtools/sample2/sample2.recal.cram.stats + # unstable + # conda changes md5sums for test + - path: results/variant_calling/lofreq/sample2/sample2.lofreq.vcf.gz + contains: + [ + '##INFO=', + ] + # conda changes md5sums for test + - path: results/reports/vcftools/lofreq/sample2/sample2.lofreq.FILTER.summary + md5sum: 8dd8a0c91d5c4a260b462e04f615e502 + - path: results/reports/vcftools/lofreq/sample2/sample2.lofreq.TsTv.qual + md5sum: fe2f1133a9894852603b5252f48bbc05 + # binary changes md5sums on reruns + - path: results/reports/mosdepth/sample2/sample2.recal.mosdepth.global.dist.txt + - path: results/reports/mosdepth/sample2/sample2.recal.mosdepth.region.dist.txt + - path: results/reports/mosdepth/sample2/sample2.recal.mosdepth.summary.txt + - path: results/reports/mosdepth/sample2/sample2.recal.regions.bed.gz + - path: results/reports/mosdepth/sample2/sample2.recal.regions.bed.gz.csi +- name: Run variant calling on tumor only sample with lofreq without intervals + command: nextflow run main.nf -profile test,tools_tumoronly --tools lofreq --no_intervals --outdir results + tags: + - lofreq + - no_intervals + - tumor_only + - variant_calling + files: + - path: results/csv/variantcalled.csv + md5sum: cc7725ef0808ee07002a50ab873ee45c + - path: results/multiqc + - path: results/reports/bcftools/lofreq/sample2/sample2.lofreq.bcftools_stats.txt + md5sum: e838ce412fc091918059e79727b35785 + # conda changes md5sums for test + - path: results/reports/samtools/sample2/sample2.recal.cram.stats + # unstable + # conda changes md5sums for test + - path: results/variant_calling/lofreq/sample2/sample2.lofreq.vcf.gz + contains: + [ + '##INFO=', + ] + # conda changes md5sums for test + - path: 
results/reports/vcftools/lofreq/sample2/sample2.lofreq.FILTER.summary + md5sum: 72beda1b57da053eb352204828605a40 + - path: results/reports/vcftools/lofreq/sample2/sample2.lofreq.TsTv.qual + md5sum: e4cd60cf32b0a24df426d243b337cf90 + # binary changes md5sums on reruns + - path: results/reports/mosdepth/sample2/sample2.recal.mosdepth.global.dist.txt + - path: results/reports/mosdepth/sample2/sample2.recal.mosdepth.summary.txt + - path: results/reports/mosdepth/sample2/sample2.recal.per-base.bed.gz + - path: results/reports/mosdepth/sample2/sample2.recal.per-base.bed.gz.csi diff --git a/tests/test_markduplicates_from_bam.yml b/tests/test_markduplicates_from_bam.yml deleted file mode 100644 index ab436a7ce2..0000000000 --- a/tests/test_markduplicates_from_bam.yml +++ /dev/null @@ -1,84 +0,0 @@ -- name: Run markduplicates starting from BAM - command: nextflow run main.nf -profile test,markduplicates_bam --outdir results - tags: - - input_bam - - gatk4/markduplicates - - preprocessing - files: - - path: results/csv/markduplicates.csv - md5sum: 8e9408ef8d4f9e6e00e531268eebd42a - - path: results/csv/markduplicates_no_table.csv - md5sum: f8b1b25fec472453a98c3f7f0e3a7953 - - path: results/csv/recalibrated.csv - md5sum: 1888a924bc70bd80165a96ad641e22d6 - - path: results/multiqc - - path: results/preprocessing/markduplicates/test/test.md.cram - # binary changes md5sums on reruns - - path: results/preprocessing/markduplicates/test/test.md.cram.crai - # binary changes md5sums on reruns - - path: results/preprocessing/recal_table/test/test.recal.table - md5sum: 9603b69fdc3b5090de2e0dd78bfcc4bf - - path: results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/reports/markduplicates/test/test.md.cram.metrics - contains: ["testN 0 2820 2 2 0 828 0 0.293617 3807", "1.0 0.999986 1178 1178", "100.0 1.911145 0 0"] - - path: 
results/reports/mosdepth/test/test.md.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.md.regions.bed.gz - - path: results/reports/mosdepth/test/test.md.regions.bed.gz.csi - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/samtools/test/test.md.cram.stats - # conda changes md5sums for test - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test - -- name: Run skip markduplicates bam from step markduplicates - command: nextflow run main.nf -profile test,markduplicates_bam,skip_markduplicates --outdir results - tags: - - input_bam - - markduplicates - - preprocessing - - skip_markduplicates - files: - - path: results/csv/recalibrated.csv - md5sum: 1888a924bc70bd80165a96ad641e22d6 - - path: results/multiqc - - path: results/preprocessing/converted/test/test.converted.cram - # binary changes md5sums on reruns - - path: results/preprocessing/converted/test/test.converted.cram.crai - # binary changes md5sums on reruns - - path: results/preprocessing/recal_table/test/test.recal.table - md5sum: 35d89a3811aa31711fc9815b6b80e6ec - - path: results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: 
results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/mosdepth/test/test.sorted.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.sorted.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.sorted.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.sorted.regions.bed.gz - - path: results/reports/mosdepth/test/test.sorted.regions.bed.gz.csi - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test - - path: results/reports/samtools/test/test.sorted.cram.stats - # conda changes md5sums for test - - path: results/csv/markduplicates.csv - should_exist: false - - path: results/csv/markduplicates_no_table.csv - should_exist: false - - path: results/preprocessing/mapped/test/test.bam - should_exist: false - - path: results/preprocessing/mapped/test/test.sorted.bam - should_exist: false diff --git a/tests/test_markduplicates_from_cram.yml b/tests/test_markduplicates_from_cram.yml deleted file mode 100644 index ff28636112..0000000000 --- a/tests/test_markduplicates_from_cram.yml +++ /dev/null @@ -1,81 +0,0 @@ -- name: Run markduplicates starting from CRAM - command: nextflow run main.nf -profile test,markduplicates_cram --outdir results - tags: - - input_cram - - gatk4/markduplicates - - preprocessing - files: - - path: results/csv/markduplicates.csv - md5sum: 8e9408ef8d4f9e6e00e531268eebd42a - - path: results/csv/markduplicates_no_table.csv - md5sum: f8b1b25fec472453a98c3f7f0e3a7953 - - path: results/csv/recalibrated.csv - md5sum: 1888a924bc70bd80165a96ad641e22d6 - - path: results/multiqc - - path: results/preprocessing/markduplicates/test/test.md.cram - # binary changes md5sums on reruns - - path: results/preprocessing/markduplicates/test/test.md.cram.crai - # binary changes md5sums on reruns - - path: 
results/preprocessing/recal_table/test/test.recal.table - md5sum: 9603b69fdc3b5090de2e0dd78bfcc4bf - - path: results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/reports/markduplicates/test/test.md.cram.metrics - contains: ["testN 0 2820 2 2 0 828 0 0.293617 3807", "1.0 0.999986 1178 1178", "100.0 1.911145 0 0"] - - path: results/reports/mosdepth/test/test.md.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.md.regions.bed.gz - - path: results/reports/mosdepth/test/test.md.regions.bed.gz.csi - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/samtools/test/test.md.cram.stats - # conda changes md5sums for test - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test - - path: results/preprocessing/mapped/ - should_exist: false -- name: Run skip markduplicates cram from step markduplicates - command: nextflow run main.nf -profile test,markduplicates_cram,skip_markduplicates --outdir results - tags: - - input_cram - - markduplicates - - preprocessing - - skip_markduplicates - files: - - path: results/csv/recalibrated.csv - md5sum: 1888a924bc70bd80165a96ad641e22d6 - - path: results/multiqc - - path: results/preprocessing/recal_table/test/test.recal.table - md5sum: 35d89a3811aa31711fc9815b6b80e6ec - - path: results/preprocessing/recalibrated/test/test.recal.cram - # binary changes 
md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/mosdepth/test/test.sorted.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.sorted.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.sorted.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.sorted.regions.bed.gz - - path: results/reports/mosdepth/test/test.sorted.regions.bed.gz.csi - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test - - path: results/reports/samtools/test/test.sorted.cram.stats - # conda changes md5sums for test - - path: results/csv/markduplicates.csv - should_exist: false - - path: results/csv/markduplicates_no_table.csv - should_exist: false - - path: results/preprocessing/mapped/test/test.sorted.cram - should_exist: false - - path: results/preprocessing/mapped/test/test.sorted.cram.crai - should_exist: false diff --git a/tests/test_prepare_recalibration_from_bam.yml b/tests/test_prepare_recalibration_from_bam.yml deleted file mode 100644 index 335a67deed..0000000000 --- a/tests/test_prepare_recalibration_from_bam.yml +++ /dev/null @@ -1,79 +0,0 @@ -- name: Run prepare_recalibration starting from bam - command: nextflow run main.nf -profile test,prepare_recalibration_bam --outdir results - tags: - - input_bam - - prepare_recalibration - - preprocessing - files: - - path: results/csv/markduplicates.csv - md5sum: 90e2ab85d8af642d6548af448a9d4226 - - path: results/csv/recalibrated.csv - md5sum: 1888a924bc70bd80165a96ad641e22d6 - - path: 
results/multiqc - - path: results/preprocessing/converted/test/test.converted.cram - # binary changes md5sums on reruns - - path: results/preprocessing/converted/test/test.converted.cram.crai - # binary changes md5sums on reruns - - path: results/preprocessing/recal_table/test/test.recal.table - md5sum: 35d89a3811aa31711fc9815b6b80e6ec - - path: results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test - - path: results/preprocessing/mapped/ - should_exist: false - - path: results/preprocessing/markduplicates/ - should_exist: false -- name: Run prepare_recalibration starting from bam and skip baserecalibration - command: nextflow run main.nf -profile test,prepare_recalibration_bam,skip_bqsr --tools strelka --outdir results - tags: - - input_bam - - prepare_recalibration - - preprocessing - files: - - path: results/csv/variantcalled.csv - md5sum: 4d0effd3d8dc2b814230a189e7ca9dba - - path: results/multiqc - - path: results/preprocessing/converted/test/test.converted.cram - # binary changes md5sums on reruns - - path: results/preprocessing/converted/test/test.converted.cram.crai - # binary changes md5sums on reruns - - path: results/reports/bcftools/strelka/test/test.strelka.variants.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/test/test.strelka.variants.FILTER.summary - md5sum: 39ff2cc8eb7495a14a6b76e0ab627027 - - path: 
results/reports/vcftools/strelka/test/test.strelka.variants.TsTv.count - md5sum: ee7dafc8d941b8502a04a63dc3126fff - - path: results/reports/vcftools/strelka/test/test.strelka.variants.TsTv.qual - # conda changes md5sums for test - - path: results/variant_calling/strelka/test/test.strelka.genome.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.variants.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.variants.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/csv/recalibrated.csv - should_exist: false - - path: results/preprocessing/recal_table/test/test.recal.table - should_exist: false - - path: results/preprocessing/recalibrated/test/test.recal.cram - should_exist: false - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - should_exist: false - - path: results/reports/mosdepth - should_exist: false - - path: results/reports/samtools_stats - should_exist: false - - path: results/preprocessing/mapped/ - should_exist: false - - path: results/preprocessing/markduplicates/ - should_exist: false diff --git a/tests/test_prepare_recalibration_from_cram.yml b/tests/test_prepare_recalibration_from_cram.yml deleted file mode 100644 index 6e362ccd7b..0000000000 --- a/tests/test_prepare_recalibration_from_cram.yml +++ /dev/null @@ -1,79 +0,0 @@ -- name: Run prepare_recalibration starting from cram - command: nextflow run main.nf -profile test,prepare_recalibration_cram --outdir results - tags: - - input_cram - - prepare_recalibration - - preprocessing - files: - - path: results/csv/recalibrated.csv - md5sum: 1888a924bc70bd80165a96ad641e22d6 - - path: results/multiqc - - path: results/preprocessing/recal_table/test/test.recal.table - md5sum: 35d89a3811aa31711fc9815b6b80e6ec - - path: 
results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test - - path: results/preprocessing/mapped/ - should_exist: false - - path: results/preprocessing/markduplicates/ - should_exist: false -- name: Run prepare_recalibration starting from cram and skip baserecalibration - command: nextflow run main.nf -profile test,prepare_recalibration_cram,skip_bqsr --tools strelka --outdir results - tags: - - input_cram - - prepare_recalibration - - preprocessing - files: - - path: results/csv/variantcalled.csv - md5sum: 4d0effd3d8dc2b814230a189e7ca9dba - - path: results/multiqc - - path: results/reports/bcftools/strelka/test/test.strelka.variants.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/test/test.strelka.variants.FILTER.summary - md5sum: 39ff2cc8eb7495a14a6b76e0ab627027 - - path: results/reports/vcftools/strelka/test/test.strelka.variants.TsTv.count - md5sum: ee7dafc8d941b8502a04a63dc3126fff - - path: results/reports/vcftools/strelka/test/test.strelka.variants.TsTv.qual - # conda changes md5sums for test - - path: results/variant_calling/strelka/test/test.strelka.genome.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.variants.vcf.gz - # binary changes md5sums on 
reruns - - path: results/variant_calling/strelka/test/test.strelka.variants.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/csv/recalibrated.csv - should_exist: false - - path: results/preprocessing/markduplicates/test/test.md.cram - should_exist: false - - path: results/preprocessing/markduplicates/test/test.md.cram.crai - should_exist: false - - path: results/preprocessing/recal_table/test/test.recal.table - should_exist: false - - path: results/preprocessing/recalibrated/test/test.recal.cram - should_exist: false - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - should_exist: false - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - should_exist: false - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - should_exist: false - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - should_exist: false - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - should_exist: false - - path: results/reports/samtools/test/test.recal.cram.stats - should_exist: false - - path: results/preprocessing/mapped/ - should_exist: false - - path: results/preprocessing/markduplicates/ - should_exist: false diff --git a/tests/test_recalibrate_from_bam.yml b/tests/test_recalibrate_from_bam.yml deleted file mode 100644 index fb96a3d305..0000000000 --- a/tests/test_recalibrate_from_bam.yml +++ /dev/null @@ -1,82 +0,0 @@ -- name: Run Recalibration starting from bam - command: nextflow run main.nf -profile test,recalibrate_bam --outdir results - tags: - - input_bam - - recalibrate - - preprocessing - files: - - path: results/csv/recalibrated.csv - md5sum: 1888a924bc70bd80165a96ad641e22d6 - - path: results/multiqc - - path: results/preprocessing/converted/test/test.converted.cram - # binary changes md5sums on reruns - - path: results/preprocessing/converted/test/test.converted.cram.crai - # binary changes md5sums on reruns - - path: 
results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test - - path: results/preprocessing/mapped/ - should_exist: false - - path: results/preprocessing/markduplicates/ - should_exist: false -- name: Run Recalibration starting from bam and skip baserecalibration - command: nextflow run main.nf -profile test,recalibrate_bam,skip_bqsr --tools strelka --outdir results - tags: - - input_bam - - recalibrate - - preprocessing - - variant_calling - files: - - path: results/csv/variantcalled.csv - md5sum: 4d0effd3d8dc2b814230a189e7ca9dba - - path: results/multiqc - - path: results/preprocessing/converted/test/test.converted.cram - # binary changes md5sums on reruns - - path: results/preprocessing/converted/test/test.converted.cram.crai - # binary changes md5sums on reruns - - path: results/reports/bcftools/strelka/test/test.strelka.variants.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/test/test.strelka.variants.FILTER.summary - md5sum: 39ff2cc8eb7495a14a6b76e0ab627027 - - path: results/reports/vcftools/strelka/test/test.strelka.variants.TsTv.count - md5sum: ee7dafc8d941b8502a04a63dc3126fff - - path: results/reports/vcftools/strelka/test/test.strelka.variants.TsTv.qual - contains: ["0 0 0 -nan 3 4 0.75", "2 0 1 0 2 4 0.5", "5 1 1 1 2 3 0.666667"] - - path: results/variant_calling/strelka/test/test.strelka.genome.vcf.gz - # binary 
changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.variants.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.variants.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/csv/recalibrated.csv - should_exist: false - - path: results/preprocessing/recal_table/test/test.recal.table - should_exist: false - - path: results/preprocessing/recalibrated/test/test.recal.cram - should_exist: false - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - should_exist: false - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - should_exist: false - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - should_exist: false - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - should_exist: false - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - should_exist: false - - path: results/reports/samtools/test/test.recal.cram.stats - should_exist: false - - path: results/preprocessing/mapped/ - should_exist: false - - path: results/preprocessing/markduplicates/ - should_exist: false diff --git a/tests/test_recalibrate_from_cram.yml b/tests/test_recalibrate_from_cram.yml deleted file mode 100644 index 9e53da4f2d..0000000000 --- a/tests/test_recalibrate_from_cram.yml +++ /dev/null @@ -1,78 +0,0 @@ -- name: Run Recalibration starting from cram - command: nextflow run main.nf -profile test,recalibrate_cram --outdir results - tags: - - input_cram - - recalibrate - - preprocessing - files: - - path: results/csv/recalibrated.csv - md5sum: 1888a924bc70bd80165a96ad641e22d6 - - path: results/multiqc - - path: results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary 
changes md5sums on reruns - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test - - path: results/preprocessing/mapped/ - should_exist: false - - path: results/preprocessing/markduplicates/ - should_exist: false -- name: Run Recalibration starting from cram and skip baserecalibration - command: nextflow run main.nf -profile test,recalibrate_cram,skip_bqsr --tools strelka --outdir results - tags: - - input_cram - - recalibrate - - preprocessing - - variant_calling - files: - - path: results/csv/variantcalled.csv - md5sum: 4d0effd3d8dc2b814230a189e7ca9dba - - path: results/multiqc - - path: results/reports/bcftools/strelka/test/test.strelka.variants.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/test/test.strelka.variants.FILTER.summary - md5sum: 39ff2cc8eb7495a14a6b76e0ab627027 - - path: results/reports/vcftools/strelka/test/test.strelka.variants.TsTv.count - md5sum: ee7dafc8d941b8502a04a63dc3126fff - - path: results/reports/vcftools/strelka/test/test.strelka.variants.TsTv.qual - contains: ["0 0 0 -nan 3 4 0.75", "2 0 1 0 2 4 0.5", "5 1 1 1 2 3 0.666667"] - - path: results/variant_calling/strelka/test/test.strelka.genome.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.variants.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.variants.vcf.gz.tbi - # binary changes md5sums on reruns - - path: 
results/csv/recalibrated.csv - should_exist: false - - path: results/preprocessing/markduplicates/test/test.md.cram - should_exist: false - - path: results/preprocessing/markduplicates/test/test.md.cram.crai - should_exist: false - - path: results/preprocessing/recal_table/test/test.recal.table - should_exist: false - - path: results/preprocessing/recalibrated/test/test.recal.cram - should_exist: false - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - should_exist: false - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - should_exist: false - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - should_exist: false - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - should_exist: false - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - should_exist: false - - path: results/reports/samtools/test/test.recal.cram.stats - should_exist: false - - path: results/preprocessing/mapped/ - should_exist: false - - path: results/preprocessing/markduplicates/ - should_exist: false diff --git a/tests/test_samplesheet_validation_spaces.yml b/tests/test_samplesheet_validation_spaces.yml deleted file mode 100644 index f0983b0518..0000000000 --- a/tests/test_samplesheet_validation_spaces.yml +++ /dev/null @@ -1,9 +0,0 @@ -- name: Test that pipeline fail when sample/patient name contain space - command: nextflow run main.nf -profile test --input ./tests/csv/3.0/sample_with_space.csv --outdir results - tags: - - sample_with_space - - validation_checks - exit_code: 1 - stderr: - contains: - - "Sample ID must be provided and cannot contain spaces" diff --git a/tests/test_save_mapped.yml b/tests/test_save_mapped.yml deleted file mode 100644 index 251e7ddacc..0000000000 --- a/tests/test_save_mapped.yml +++ /dev/null @@ -1,66 +0,0 @@ -- name: Run save_mapped - command: nextflow run main.nf -profile test --save_mapped --outdir results - tags: - - default_extended - - preprocessing - 
- save_mapped - - save_mapped_only - - variant_calling - files: - - path: results/csv/mapped.csv - md5sum: 3bee45ccf65e301ce09ee4eed8f26933 - - path: results/csv/markduplicates.csv - md5sum: 0d6120bb99e92f6810343270711ca53e - - path: results/csv/markduplicates_no_table.csv - md5sum: 2a2d3d4842befd4def39156463859ee3 - - path: results/csv/recalibrated.csv - md5sum: 2d29d9e53894dcce96a1b5beb6ef3312 - - path: results/multiqc - - path: results/preprocessing/mapped/test/test.sorted.cram - # binary changes md5sums on reruns - - path: results/preprocessing/mapped/test/test.sorted.cram.crai - # binary changes md5sums on reruns - - path: results/preprocessing/markduplicates/test/test.md.cram - # binary changes md5sums on reruns - - path: results/preprocessing/markduplicates/test/test.md.cram.crai - # binary changes md5sums on reruns - - path: results/preprocessing/recal_table/test/test.recal.table - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/reports/fastqc/test-test_L1 - - path: results/reports/markduplicates/test/test.md.cram.metrics - contains: ["test 17094 1534 168 1046782 12429 197 0 0.635998", "1.0 0.999991 1171"] - - path: results/reports/mosdepth/test/test.md.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.md.regions.bed.gz - - path: results/reports/mosdepth/test/test.md.regions.bed.gz.csi - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: 
results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/samtools/test/test.md.cram.stats - # conda changes md5sums for test - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test - - path: results/reports/bcftools/strelka/test/test.strelka.variants.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/test/test.strelka.variants.FILTER.summary - md5sum: ad417bc96d31223f61170987975d8128 - - path: results/reports/vcftools/strelka/test/test.strelka.variants.TsTv.count - md5sum: fa27f678965b7cba6a92efcd039f802a - - path: results/reports/vcftools/strelka/test/test.strelka.variants.TsTv.qual - md5sum: bc68ae4e688e9fb772b457069e604883 - - path: results/variant_calling/strelka/test/test.strelka.genome.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.variants.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.variants.vcf.gz.tbi - - path: results/strelka - should_exist: false diff --git a/tests/test_skip_all_qc.yml b/tests/test_skip_all_qc.yml deleted file mode 100644 index 24938562c2..0000000000 --- a/tests/test_skip_all_qc.yml +++ /dev/null @@ -1,52 +0,0 @@ -- name: Run default pipeline with skipping all QC steps - command: nextflow run main.nf -profile test --skip_tools fastqc,markduplicates_report,mosdepth,multiqc,samtools --outdir results - tags: - - default_extended - - preprocessing - - skip_all_qc - - skip_qc - - variant_calling - files: - - path: results/csv/markduplicates.csv - md5sum: 0d6120bb99e92f6810343270711ca53e - - path: results/csv/markduplicates_no_table.csv - md5sum: 2a2d3d4842befd4def39156463859ee3 - - path: results/csv/recalibrated.csv - md5sum: 2d29d9e53894dcce96a1b5beb6ef3312 - - path: 
results/preprocessing/markduplicates/test/test.md.cram - # binary changes md5sums on reruns - - path: results/preprocessing/markduplicates/test/test.md.cram.crai - # binary changes md5sums on reruns - - path: results/preprocessing/recal_table/test/test.recal.table - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/multiqc - should_exist: false - - path: results/reports/fastqc - should_exist: false - - path: results/reports/markduplicates - should_exist: false - - path: results/reports/mosdepth - should_exist: false - - path: results/reports/samtools - should_exist: false - - path: results/reports/bcftools/strelka/test/test.strelka.variants.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/test/test.strelka.variants.FILTER.summary - md5sum: ad417bc96d31223f61170987975d8128 - - path: results/reports/vcftools/strelka/test/test.strelka.variants.TsTv.count - md5sum: fa27f678965b7cba6a92efcd039f802a - - path: results/reports/vcftools/strelka/test/test.strelka.variants.TsTv.qual - md5sum: bc68ae4e688e9fb772b457069e604883 - - path: results/variant_calling/strelka/test/test.strelka.genome.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.variants.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/test/test.strelka.variants.vcf.gz.tbi - - path: results/strelka - should_exist: false diff --git a/tests/test_skip_markduplicates.yml b/tests/test_skip_markduplicates.yml deleted file mode 100644 index 93718954c8..0000000000 --- a/tests/test_skip_markduplicates.yml +++ /dev/null @@ -1,133 +0,0 @@ -- name: Run default 
pipeline with skipping Markduplicates - command: nextflow run main.nf -profile test,skip_markduplicates --outdir results - tags: - - default_extended - - preprocessing - - skip_markduplicates - files: - - path: results/csv/mapped.csv - md5sum: 3bee45ccf65e301ce09ee4eed8f26933 - - path: results/csv/recalibrated.csv - md5sum: 2d29d9e53894dcce96a1b5beb6ef3312 - - path: results/multiqc - - path: results/preprocessing/mapped/test/test.sorted.cram - # binary changes md5sums on reruns - - path: results/preprocessing/mapped/test/test.sorted.cram.crai - # binary changes md5sums on reruns - - path: results/preprocessing/recal_table/test/test.recal.table - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/reports/fastqc/test-test_L1 - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/mosdepth/test/test.sorted.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.sorted.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.sorted.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.sorted.regions.bed.gz - - path: results/reports/mosdepth/test/test.sorted.regions.bed.gz.csi - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test - - path: results/reports/samtools/test/test.sorted.cram.stats - # conda changes md5sums for test - - path: results/csv/markduplicates.csv - should_exist: false - - path: results/csv/markduplicates_no_table.csv - should_exist: 
false - - path: results/preprocessing/mapped/test/test.bam - should_exist: false - - path: results/preprocessing/mapped/test/test.sorted.bam - should_exist: false -- name: Run default pipeline with skipping Markduplicates with save_mapped - command: nextflow run main.nf -profile test,skip_markduplicates --save_mapped --outdir results - tags: - - default_extended - - preprocessing - - save_mapped - - skip_markduplicates - files: - - path: results/csv/mapped.csv - md5sum: 3bee45ccf65e301ce09ee4eed8f26933 - - path: results/csv/recalibrated.csv - md5sum: 2d29d9e53894dcce96a1b5beb6ef3312 - - path: results/multiqc - - path: results/preprocessing/mapped/test/test.sorted.cram - # binary changes md5sums on reruns - - path: results/preprocessing/mapped/test/test.sorted.cram.crai - # binary changes md5sums on reruns - - path: results/preprocessing/recal_table/test/test.recal.table - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/reports/fastqc/test-test_L1 - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/mosdepth/test/test.sorted.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.sorted.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.sorted.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.sorted.regions.bed.gz - - path: results/reports/mosdepth/test/test.sorted.regions.bed.gz.csi - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test 
- - path: results/reports/samtools/test/test.sorted.cram.stats - # conda changes md5sums for test - - path: results/csv/markduplicates.csv - should_exist: false - - path: results/csv/markduplicates_no_table.csv - should_exist: false - - path: results/preprocessing/mapped/test/test.bam - should_exist: false -- name: Run default pipeline with skipping Markduplicates with save_mapped & save_output_as_bam - command: nextflow run main.nf -profile test,skip_markduplicates --save_mapped --save_output_as_bam --outdir results - tags: - - default_extended - - preprocessing - - save_output_as_bam - - skip_markduplicates - files: - - path: results/csv/mapped.csv - md5sum: 7f21bf40d3fbc248ee2ea3fdf0f7cdb2 - - path: results/csv/recalibrated.csv - md5sum: 2dfbcaaeaaf4937c51c5c310f1c77614 - - path: results/multiqc - - path: results/preprocessing/mapped/test/test.sorted.bam - # binary changes md5sums on reruns - - path: results/preprocessing/mapped/test/test.sorted.bam.bai - # binary changes md5sums on reruns - - path: results/preprocessing/recal_table/test/test.recal.table - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.bam - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.bam.bai - # binary changes md5sums on reruns - - path: results/reports/fastqc/test-test_L1 - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/mosdepth/test/test.sorted.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.sorted.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.sorted.mosdepth.summary.txt - - path: 
results/reports/mosdepth/test/test.sorted.regions.bed.gz - - path: results/reports/mosdepth/test/test.sorted.regions.bed.gz.csi - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test - - path: results/reports/samtools/test/test.sorted.cram.stats - # conda changes md5sums for test - - path: results/csv/markduplicates.csv - should_exist: false - - path: results/csv/markduplicates_no_table.csv - should_exist: false - - path: results/preprocessing/mapped/test/test.bam - should_exist: false diff --git a/tests/test_strelka.yml b/tests/test_strelka.yml deleted file mode 100644 index b2b25672a2..0000000000 --- a/tests/test_strelka.yml +++ /dev/null @@ -1,389 +0,0 @@ -- name: Skip variant calling on matched normal - command: nextflow run main.nf -profile test,variantcalling_channels --tools strelka --only_paired_variant_calling --outdir results - tags: - - somatic - - strelka - - variantcalling_channel - files: - - path: results/multiqc - - path: results/reports/bcftools/strelka/sample1/sample1.strelka.variants.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/bcftools/strelka/sample2/sample2.strelka.variants.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/sample1/sample1.strelka.variants.FILTER.summary - md5sum: 2048a5de0201a6052c988a0189979a5f - - path: results/reports/vcftools/strelka/sample1/sample1.strelka.variants.TsTv.count - md5sum: c5b7a8eda2526d899098439ae4c06a49 - - path: results/reports/vcftools/strelka/sample1/sample1.strelka.variants.TsTv.qual - # conda changes md5sums for test - - path: 
results/reports/vcftools/strelka/sample2/sample2.strelka.variants.FILTER.summary - md5sum: fa3112841a4575d104916027c8851b30 - - path: results/reports/vcftools/strelka/sample2/sample2.strelka.variants.TsTv.count - md5sum: d7f54d09d38af01a574a4930af21cfc9 - - path: results/reports/vcftools/strelka/sample2/sample2.strelka.variants.TsTv.qual - contains: - [ - "19 453 47848 0.00946748 11 50 0.22", - "56 456 47875 0.0095248 8 25 0.32", - "72 458 47880 0.00956558 6 20 0.3", - "314 463 47899 0.00966617 1 1 1", - ] - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.FILTER.summary - md5sum: 3441628cd6550ed459ca1c3db989ceea - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.count - md5sum: 8dcfdbcaac118df1d5ad407dd2af699f - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.qual - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.FILTER.summary - md5sum: 4fc17fa5625b4d1dcc5d791b1eb22d85 - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.count - md5sum: fc7af1f534890c4ad3025588b3af62ae - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.qual - # conda changes md5sums for test - - path: results/variant_calling/strelka/sample1/sample1.strelka.genome.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample1/sample1.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample1/sample1.strelka.variants.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample1/sample1.strelka.variants.vcf.gz.tbi - # binary changes md5sums on reruns - - path: 
results/variant_calling/strelka/sample2/sample2.strelka.genome.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample2/sample2.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample2/sample2.strelka.variants.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample2/sample2.strelka.variants.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/sample3/strelka/sample3.strelka.variants.vcf.gz - should_exist: false - - path: results/variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz.tbi - should_exist: false - - path: results/variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz - should_exist: false - - path: results/variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz.tbi - should_exist: false - - path: results/reports/mosdepth/sample1/sample1.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample1/sample1.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/sample1/sample1.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample1/sample1.recal.regions.bed.gz - - path: results/reports/mosdepth/sample1/sample1.recal.regions.bed.gz.csi - - path: results/reports/samtools/sample1/sample1.recal.cram.stats - - path: 
results/reports/mosdepth/sample2/sample2.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample2/sample2.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/sample2/sample2.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample2/sample2.recal.regions.bed.gz - - path: results/reports/mosdepth/sample2/sample2.recal.regions.bed.gz.csi - - path: results/reports/samtools/sample2/sample2.recal.cram.stats - - path: results/reports/mosdepth/sample3/sample3.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample3/sample3.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/sample3/sample3.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample3/sample3.recal.regions.bed.gz - - path: results/reports/mosdepth/sample3/sample3.recal.regions.bed.gz.csi - - path: results/reports/samtools/sample3/sample3.recal.cram.stats - # conda changes md5sums for test - - path: results/reports/mosdepth/sample4/sample4.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample4/sample4.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/sample4/sample4.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample4/sample4.recal.regions.bed.gz - - path: results/reports/mosdepth/sample4/sample4.recal.regions.bed.gz.csi - - path: results/reports/samtools/sample4/sample4.recal.cram.stats - # conda changes md5sums for test -- name: Run variant calling on germline sample with strelka - command: nextflow run main.nf -profile test,tools_germline --tools strelka --outdir results - tags: - - germline - - strelka - - variant_calling - files: - - path: results/csv/variantcalled.csv - md5sum: cd8a47dfc3e44c395e9f693770ccc6c9 - - path: results/multiqc - - path: results/reports/bcftools/strelka/sample1/sample1.strelka.variants.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/sample1/sample1.strelka.variants.FILTER.summary - md5sum: 
2048a5de0201a6052c988a0189979a5f - - path: results/reports/vcftools/strelka/sample1/sample1.strelka.variants.TsTv.count - md5sum: c5b7a8eda2526d899098439ae4c06a49 - - path: results/reports/vcftools/strelka/sample1/sample1.strelka.variants.TsTv.qual - # conda changes md5sums for test - - path: results/variant_calling/strelka/sample1/sample1.strelka.genome.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample1/sample1.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample1/sample1.strelka.variants.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample1/sample1.strelka.variants.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/strelka - should_exist: false - - path: results/reports/mosdepth/sample1/sample1.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample1/sample1.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/sample1/sample1.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample1/sample1.recal.regions.bed.gz - - path: results/reports/mosdepth/sample1/sample1.recal.regions.bed.gz.csi - - path: results/reports/samtools/sample1/sample1.recal.cram.stats -- name: Run variant calling on germline sample with strelka without intervals - command: nextflow run main.nf -profile test,tools_germline --tools strelka --no_intervals --outdir results - tags: - - germline - - strelka - - no_intervals - - variant_calling - files: - - path: results/csv/variantcalled.csv - md5sum: cd8a47dfc3e44c395e9f693770ccc6c9 - - path: results/multiqc - - path: results/no_intervals.bed - md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: results/no_intervals.bed.gz - md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: results/no_intervals.bed.gz.tbi - md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: results/reports/bcftools/strelka/sample1/sample1.strelka.variants.bcftools_stats.txt - # conda 
changes md5sums for test - - path: results/reports/vcftools/strelka/sample1/sample1.strelka.variants.FILTER.summary - md5sum: 2b7be6ff481fddc655210b836587810d - - path: results/reports/vcftools/strelka/sample1/sample1.strelka.variants.TsTv.count - md5sum: 1481854d2a765f5641856ecf95ca4097 - - path: results/reports/vcftools/strelka/sample1/sample1.strelka.variants.TsTv.qual - # conda changes md5sums for test - - path: results/variant_calling/strelka/sample1/sample1.strelka.genome.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample1/sample1.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample1/sample1.strelka.variants.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample1/sample1.strelka.variants.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/strelka - should_exist: false - - path: results/reports/mosdepth/sample1/sample1.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample1/sample1.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample1/sample1.recal.per-base.bed.gz - - path: results/reports/mosdepth/sample1/sample1.recal.per-base.bed.gz.csi - - path: results/reports/samtools/sample1/sample1.recal.cram.stats -- name: Run variant calling on tumor only sample with strelka - command: nextflow run main.nf -profile test,tools_tumoronly --tools strelka --outdir results - tags: - - strelka - - tumor_only - - variant_calling - files: - - path: results/csv/variantcalled.csv - md5sum: 8d2a5e0ad12781c99e773b828e478d35 - - path: results/multiqc - - path: results/reports/bcftools/strelka/sample2/sample2.strelka.variants.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/sample2/sample2.strelka.variants.FILTER.summary - md5sum: fa3112841a4575d104916027c8851b30 - - path: results/reports/vcftools/strelka/sample2/sample2.strelka.variants.TsTv.count - 
md5sum: d7f54d09d38af01a574a4930af21cfc9 - - path: results/reports/vcftools/strelka/sample2/sample2.strelka.variants.TsTv.qual - contains: ["19 453 47848 0.00946748 11 50 0.22", "72 458 47880 0.00956558 6 20 0.3", "314 463 47899 0.00966617 1 1 1"] - - path: results/variant_calling/strelka/sample2/sample2.strelka.genome.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample2/sample2.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample2/sample2.strelka.variants.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample2/sample2.strelka.variants.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/strelka - should_exist: false - - path: results/reports/mosdepth/sample2/sample2.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample2/sample2.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/sample2/sample2.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample2/sample2.recal.regions.bed.gz - - path: results/reports/mosdepth/sample2/sample2.recal.regions.bed.gz.csi - - path: results/reports/samtools/sample2/sample2.recal.cram.stats -- name: Run variant calling on tumor only sample with strelka without intervals - command: nextflow run main.nf -profile test,tools_tumoronly --tools strelka --no_intervals --outdir results - tags: - - no_intervals - - strelka - - tumor_only - - variant_calling - files: - - path: results/csv/variantcalled.csv - md5sum: 8d2a5e0ad12781c99e773b828e478d35 - - path: results/multiqc - - path: results/no_intervals.bed - md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: results/no_intervals.bed.gz - md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: results/no_intervals.bed.gz.tbi - md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: results/reports/bcftools/strelka/sample2/sample2.strelka.variants.bcftools_stats.txt - # conda changes md5sums for test - - path: 
results/reports/vcftools/strelka/sample2/sample2.strelka.variants.FILTER.summary - md5sum: d1dcce19d82ced016724ace746e95d01 - - path: results/reports/vcftools/strelka/sample2/sample2.strelka.variants.TsTv.count - md5sum: 9de35bbe9ebe45166b6bd195717f733a - - path: results/reports/vcftools/strelka/sample2/sample2.strelka.variants.TsTv.qual - # conda changes md5sums for test - - path: results/variant_calling/strelka/sample2/sample2.strelka.genome.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample2/sample2.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample2/sample2.strelka.variants.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample2/sample2.strelka.variants.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/strelka - should_exist: false - - path: results/reports/mosdepth/sample2/sample2.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample2/sample2.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample2/sample2.recal.per-base.bed.gz - - path: results/reports/mosdepth/sample2/sample2.recal.per-base.bed.gz.csi - - path: results/reports/samtools/sample2/sample2.recal.cram.stats -- name: Run variant calling on somatic sample with strelka - command: nextflow run main.nf -profile test,tools_somatic --tools strelka --outdir results - tags: - - somatic - - strelka - - variant_calling - files: - - path: results/csv/variantcalled.csv - md5sum: 31ccee9472fed8bd15798724c62aee15 - - path: results/multiqc - - path: results/reports/bcftools/strelka/sample3/sample3.strelka.variants.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.bcftools_stats.txt - # conda changes md5sums for test - - path: 
results/reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/sample3/sample3.strelka.variants.FILTER.summary - md5sum: 2048a5de0201a6052c988a0189979a5f - - path: results/reports/vcftools/strelka/sample3/sample3.strelka.variants.TsTv.count - md5sum: c5b7a8eda2526d899098439ae4c06a49 - - path: results/reports/vcftools/strelka/sample3/sample3.strelka.variants.TsTv.qual - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.FILTER.summary - md5sum: 3441628cd6550ed459ca1c3db989ceea - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.count - md5sum: 8dcfdbcaac118df1d5ad407dd2af699f - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.qual - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.FILTER.summary - md5sum: 4fc17fa5625b4d1dcc5d791b1eb22d85 - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.count - md5sum: fc7af1f534890c4ad3025588b3af62ae - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.qual - # conda changes md5sums for test - - path: results/variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz.tbi - # binary changes md5sums on reruns - - path: 
results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/strelka - should_exist: false - - path: results/reports/mosdepth/sample3/sample3.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample3/sample3.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/sample3/sample3.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample3/sample3.recal.regions.bed.gz - - path: results/reports/mosdepth/sample3/sample3.recal.regions.bed.gz.csi - - path: results/reports/samtools/sample3/sample3.recal.cram.stats - # conda changes md5sums for test - - path: results/reports/mosdepth/sample4/sample4.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample4/sample4.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/sample4/sample4.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample4/sample4.recal.regions.bed.gz - - path: results/reports/mosdepth/sample4/sample4.recal.regions.bed.gz.csi - - path: results/reports/samtools/sample4/sample4.recal.cram.stats - # conda changes md5sums for test -- name: Run variant calling on somatic sample with strelka without intervals - command: nextflow run main.nf -profile test,tools_somatic --tools strelka --no_intervals --outdir results - tags: - - no_intervals - - somatic - - strelka - - variant_calling - files: - - path: results/csv/variantcalled.csv - md5sum: 31ccee9472fed8bd15798724c62aee15 - - path: results/multiqc - - path: 
results/no_intervals.bed - md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: results/no_intervals.bed.gz - md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: results/no_intervals.bed.gz.tbi - md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: results/reports/bcftools/strelka/sample3/sample3.strelka.variants.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/sample3/sample3.strelka.variants.FILTER.summary - md5sum: 2b7be6ff481fddc655210b836587810d - - path: results/reports/vcftools/strelka/sample3/sample3.strelka.variants.TsTv.count - md5sum: 1481854d2a765f5641856ecf95ca4097 - - path: results/reports/vcftools/strelka/sample3/sample3.strelka.variants.TsTv.qual - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.FILTER.summary - md5sum: 3441628cd6550ed459ca1c3db989ceea - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.count - md5sum: 8dcfdbcaac118df1d5ad407dd2af699f - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.qual - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.FILTER.summary - md5sum: 7a81b11aa29fec73d5bc872b7b58f8aa - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.count - md5sum: a922c51ca3b2ea7cdcfa09e9c8c55d52 - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.qual - # conda changes md5sums for test - - 
path: results/variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/strelka - should_exist: false - - path: results/reports/mosdepth/sample3/sample3.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample3/sample3.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample3/sample3.recal.per-base.bed.gz - - path: results/reports/mosdepth/sample3/sample3.recal.per-base.bed.gz.csi - - path: results/reports/samtools/sample3/sample3.recal.cram.stats - # conda changes md5sums for test - - path: results/reports/mosdepth/sample4/sample4.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample4/sample4.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample4/sample4.recal.per-base.bed.gz - - path: results/reports/mosdepth/sample4/sample4.recal.per-base.bed.gz.csi - - path: results/reports/samtools/sample4/sample4.recal.cram.stats - # conda changes md5sums for test diff --git a/tests/test_strelka_bp.yml 
b/tests/test_strelka_bp.yml deleted file mode 100644 index f5954de37c..0000000000 --- a/tests/test_strelka_bp.yml +++ /dev/null @@ -1,213 +0,0 @@ -- name: Run variant calling on somatic sample with Strelka BP - command: nextflow run main.nf -profile test,tools_somatic --tools strelka,manta --outdir results - tags: - - somatic - - strelka_bp - - variant_calling - files: - - path: results/csv/variantcalled.csv - md5sum: eff248896ca462b76c79749403e44f48 - - path: results/multiqc - - path: results/reports/bcftools/manta/sample3/sample3.manta.diploid_sv.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/bcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/bcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/bcftools/strelka/sample3/sample3.strelka.variants.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/vcftools/manta/sample3/sample3.manta.diploid_sv.FILTER.summary - md5sum: 1ce42d34e4ae919afb519efc99146423 - - path: results/reports/vcftools/manta/sample3/sample3.manta.diploid_sv.TsTv.count - md5sum: fa27f678965b7cba6a92efcd039f802a - - path: results/reports/vcftools/manta/sample3/sample3.manta.diploid_sv.TsTv.qual - # conda changes md5sums for test - - path: results/reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.FILTER.summary - md5sum: 1ce42d34e4ae919afb519efc99146423 - - path: results/reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.TsTv.count - 
md5sum: fa27f678965b7cba6a92efcd039f802a - - path: results/reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.TsTv.qual - # conda changes md5sums for test - - path: results/reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.FILTER.summary - md5sum: 1ce42d34e4ae919afb519efc99146423 - - path: results/reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.TsTv.count - md5sum: 8dcfdbcaac118df1d5ad407dd2af699f - - path: results/reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.TsTv.qual - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/sample3/sample3.strelka.variants.FILTER.summary - md5sum: 2048a5de0201a6052c988a0189979a5f - - path: results/reports/vcftools/strelka/sample3/sample3.strelka.variants.TsTv.count - md5sum: c5b7a8eda2526d899098439ae4c06a49 - - path: results/reports/vcftools/strelka/sample3/sample3.strelka.variants.TsTv.qual - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.FILTER.summary - md5sum: 3441628cd6550ed459ca1c3db989ceea - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.count - md5sum: 8dcfdbcaac118df1d5ad407dd2af699f - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.qual - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.FILTER.summary - md5sum: 4fc17fa5625b4d1dcc5d791b1eb22d85 - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.count - md5sum: fc7af1f534890c4ad3025588b3af62ae - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.qual - # conda changes md5sums for test - - path: 
results/variant_calling/manta/sample3/sample3.manta.diploid_sv.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/manta/sample3/sample3.manta.diploid_sv.vcf.gz.tbi - md5sum: 4cb176febbc8c26d717a6c6e67b9c905 - - path: results/variant_calling/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.vcf.gz.tbi - md5sum: 4cb176febbc8c26d717a6c6e67b9c905 - - path: results/variant_calling/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.vcf.gz.tbi - md5sum: 4cb176febbc8c26d717a6c6e67b9c905 - - path: results/variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/manta - should_exist: false - - path: results/strelka - should_exist: false - - path: 
results/reports/mosdepth/sample3/sample3.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample3/sample3.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/sample3/sample3.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample3/sample3.recal.regions.bed.gz - - path: results/reports/mosdepth/sample3/sample3.recal.regions.bed.gz.csi - - path: results/reports/samtools/sample3/sample3.recal.cram.stats - # conda changes md5sums for test - - path: results/reports/mosdepth/sample4/sample4.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample4/sample4.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/sample4/sample4.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample4/sample4.recal.regions.bed.gz - - path: results/reports/mosdepth/sample4/sample4.recal.regions.bed.gz.csi - - path: results/reports/samtools/sample4/sample4.recal.cram.stats - # conda changes md5sums for test -- name: Run variant calling on somatic sample with Strelka BP without intervals - command: nextflow run main.nf -profile test,tools_somatic --tools strelka,manta --no_intervals --outdir results - tags: - - no_intervals - - somatic - - strelka_bp - - variant_calling - files: - - path: results/csv/variantcalled.csv - md5sum: eff248896ca462b76c79749403e44f48 - - path: results/multiqc - - path: results/no_intervals.bed - md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: results/no_intervals.bed.gz - md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: results/no_intervals.bed.gz.tbi - md5sum: f3dac01ea66b95fe477446fde2d31489 - - path: results/reports/bcftools/manta/sample3/sample3.manta.diploid_sv.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/bcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/bcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.bcftools_stats.txt - 
# conda changes md5sums for test - - path: results/reports/bcftools/strelka/sample3/sample3.strelka.variants.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.bcftools_stats.txt - # conda changes md5sums for test - - path: results/reports/vcftools/manta/sample3/sample3.manta.diploid_sv.FILTER.summary - md5sum: 1ce42d34e4ae919afb519efc99146423 - - path: results/reports/vcftools/manta/sample3/sample3.manta.diploid_sv.TsTv.count - md5sum: fa27f678965b7cba6a92efcd039f802a - - path: results/reports/vcftools/manta/sample3/sample3.manta.diploid_sv.TsTv.qual - # conda changes md5sums for test - - path: results/reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.FILTER.summary - md5sum: 1ce42d34e4ae919afb519efc99146423 - - path: results/reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.TsTv.count - md5sum: fa27f678965b7cba6a92efcd039f802a - - path: results/reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.TsTv.qual - # conda changes md5sums for test - - path: results/reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.FILTER.summary - md5sum: 1ce42d34e4ae919afb519efc99146423 - - path: results/reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.TsTv.count - md5sum: 8dcfdbcaac118df1d5ad407dd2af699f - - path: results/reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.TsTv.qual - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/sample3/sample3.strelka.variants.FILTER.summary - md5sum: 2b7be6ff481fddc655210b836587810d - - path: results/reports/vcftools/strelka/sample3/sample3.strelka.variants.TsTv.count - md5sum: 
1481854d2a765f5641856ecf95ca4097 - - path: results/reports/vcftools/strelka/sample3/sample3.strelka.variants.TsTv.qual - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.FILTER.summary - md5sum: 3441628cd6550ed459ca1c3db989ceea - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.count - md5sum: 8dcfdbcaac118df1d5ad407dd2af699f - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.qual - # conda changes md5sums for test - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.FILTER.summary - md5sum: 7a81b11aa29fec73d5bc872b7b58f8aa - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.count - md5sum: a922c51ca3b2ea7cdcfa09e9c8c55d52 - - path: results/reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.qual - # conda changes md5sums for test - - path: results/variant_calling/manta/sample3/sample3.manta.diploid_sv.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/manta/sample3/sample3.manta.diploid_sv.vcf.gz.tbi - md5sum: 4cb176febbc8c26d717a6c6e67b9c905 - - path: results/variant_calling/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.vcf.gz.tbi - md5sum: 4cb176febbc8c26d717a6c6e67b9c905 - - path: results/variant_calling/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.vcf.gz.tbi - md5sum: 4cb176febbc8c26d717a6c6e67b9c905 - - path: results/variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz - # binary 
changes md5sums on reruns - - path: results/variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz - # binary changes md5sums on reruns - - path: results/variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz.tbi - # binary changes md5sums on reruns - - path: results/manta - should_exist: false - - path: results/strelka - should_exist: false - - path: results/reports/mosdepth/sample3/sample3.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample3/sample3.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample3/sample3.recal.per-base.bed.gz - - path: results/reports/mosdepth/sample3/sample3.recal.per-base.bed.gz.csi - - path: results/reports/samtools/sample3/sample3.recal.cram.stats - # conda changes md5sums for test - - path: results/reports/mosdepth/sample4/sample4.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/sample4/sample4.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/sample4/sample4.recal.per-base.bed.gz - - path: results/reports/mosdepth/sample4/sample4.recal.per-base.bed.gz.csi - - path: results/reports/samtools/sample4/sample4.recal.cram.stats - # conda changes md5sums for test diff --git a/tests/test_tools_manually.yml b/tests/test_tools_manually.yml index 
fe1642b3c3..19e519e268 100644 --- a/tests/test_tools_manually.yml +++ b/tests/test_tools_manually.yml @@ -287,7 +287,7 @@ - path: results/reports/samtools/sample3/sample3.recal.cram.stats # conda changes md5sums for test - name: Run full pipeline on tumoronly with most tools - command: nextflow run . -profile test --input tests/csv/3.0/fastq_tumor_only.csv --tools cnvkit,freebayes,merge,mpileup,mutect2,snpeff,strelka,tiddit,vep --outdir results + command: nextflow run . -profile test --input tests/csv/3.0/fastq_tumor_only.csv --tools cnvkit,freebayes,merge,mpileup,mutect2,snpeff,tiddit,vep --outdir results tags: - full_pipeline_manual - manual diff --git a/tests/test_tumor_normal_pair.yml b/tests/test_tumor_normal_pair.yml deleted file mode 100644 index 711e494ca1..0000000000 --- a/tests/test_tumor_normal_pair.yml +++ /dev/null @@ -1,70 +0,0 @@ -- name: Run default pipeline for tumor normal pair - command: nextflow run main.nf -profile test,pair --outdir results - tags: - - default_extended - - preprocessing - - tumor_normal_pair - files: - - path: results/csv/markduplicates.csv - md5sum: e8e587ac25253ff7ab8f1cc66d410c98 - - path: results/csv/markduplicates_no_table.csv - md5sum: 617574c9b607e5daaf4ad56d48982247 - - path: results/csv/recalibrated.csv - md5sum: 008dff17e2a0d96ef9c1cae12fcab6ab - - path: results/multiqc - - path: results/preprocessing/markduplicates/test/test.md.cram - # binary changes md5sums on reruns - - path: results/preprocessing/markduplicates/test/test.md.cram.crai - # binary changes md5sums on reruns - - path: results/preprocessing/recal_table/test/test.recal.table - md5sum: 4ac774bf5f1157e77426fd82f5ac0fbe - - path: results/preprocessing/recalibrated/test/test.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test/test.recal.cram.crai - # binary changes md5sums on reruns - - path: results/preprocessing/markduplicates/test2/test2.md.cram - # binary changes md5sums on reruns - - path: 
results/preprocessing/markduplicates/test2/test2.md.cram.crai - # binary changes md5sums on reruns - - path: results/preprocessing/recal_table/test2/test2.recal.table - md5sum: 0626cd4337eab79b38b5bc5c95e0c003 - - path: results/preprocessing/recalibrated/test2/test2.recal.cram - # binary changes md5sums on reruns - - path: results/preprocessing/recalibrated/test2/test2.recal.cram.crai - # binary changes md5sums on reruns - - path: results/reports/fastqc/test-test_L1 - - path: results/reports/fastqc/test2-test_L1 - - path: results/reports/markduplicates/test/test.md.cram.metrics - contains: ["test 8547 767 84 523391 3882 0 0 0.385081", "1.0 767 767"] - - path: results/reports/markduplicates/test2/test2.md.cram.metrics - contains: ["test2 10103 880 35 523579 4837 2 0 0.408076 193306", "1.0 1 876 876", "100.0 80.515303 0 0"] - - path: results/reports/mosdepth/test/test.md.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.md.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.md.regions.bed.gz - - path: results/reports/mosdepth/test/test.md.regions.bed.gz.csi - - path: results/reports/mosdepth/test/test.recal.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test/test.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz - - path: results/reports/mosdepth/test/test.recal.regions.bed.gz.csi - - path: results/reports/mosdepth/test2/test2.md.mosdepth.global.dist.txt - - path: results/reports/mosdepth/test2/test2.md.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test2/test2.md.mosdepth.summary.txt - - path: results/reports/mosdepth/test2/test2.md.regions.bed.gz - - path: results/reports/mosdepth/test2/test2.md.regions.bed.gz.csi - - path: results/reports/mosdepth/test2/test2.recal.mosdepth.global.dist.txt - - path: 
results/reports/mosdepth/test2/test2.recal.mosdepth.region.dist.txt - - path: results/reports/mosdepth/test2/test2.recal.mosdepth.summary.txt - - path: results/reports/mosdepth/test2/test2.recal.regions.bed.gz - - path: results/reports/mosdepth/test2/test2.recal.regions.bed.gz.csi - - path: results/reports/samtools/test/test.md.cram.stats - # conda changes md5sums for test - - path: results/reports/samtools/test/test.recal.cram.stats - # conda changes md5sums for test - - path: results/reports/samtools/test2/test2.md.cram.stats - # conda changes md5sums for test - - path: results/reports/samtools/test2/test2.recal.cram.stats - # conda changes md5sums for test - - path: results/preprocessing/mapped/ - should_exist: false diff --git a/tests/test_umi.yml b/tests/test_umi.yml index 0c8392f40e..c5c6b08fe4 100644 --- a/tests/test_umi.yml +++ b/tests/test_umi.yml @@ -6,7 +6,7 @@ files: - path: results/preprocessing/umi/test/test-test_L1_umi-consensus.bam # binary changes md5sums on reruns. - - path: results/reports/umi/test-test_L1_umi_histogram.txt + - path: results/reports/umi/test-test_L1_umi-grouped_histogram.txt md5sum: 85292e9acb83edf17110dce17be27f44 - path: results/csv/markduplicates.csv md5sum: 0d6120bb99e92f6810343270711ca53e diff --git a/tests/tumor-normal-pair.nf.test b/tests/tumor-normal-pair.nf.test new file mode 100644 index 0000000000..0a1c9f5cde --- /dev/null +++ b/tests/tumor-normal-pair.nf.test @@ -0,0 +1,43 @@ +nextflow_pipeline { + + name "Test pipeline" + script "../main.nf" + tag "pipeline" + tag "pipeline_sarek" + + test("Run with profile test | --input tests/csv/3.0/fastq_pair.csv") { + + when { + params { + input = "${projectDir}/tests/csv/3.0/fastq_pair.csv" + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + outdir = "$outputDir" + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: 
true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } +} diff --git a/tests/tumor-normal-pair.nf.test.snap b/tests/tumor-normal-pair.nf.test.snap new file mode 100644 index 0000000000..92aef2e017 --- /dev/null +++ b/tests/tumor-normal-pair.nf.test.snap @@ -0,0 +1,393 @@ +{ + "Run with profile test | --input tests/csv/3.0/fastq_pair.csv": { + "content": [ + 40, + { + "BCFTOOLS_STATS": { + "bcftools": 1.2 + }, + "BWAMEM1_MEM": { + "bwa": "0.7.18-r1243-dirty", + "samtools": 1.2 + }, + "FASTQC": { + "fastqc": "0.12.1" + }, + "GATK4_APPLYBQSR": { + "gatk4": "4.5.0.0" + }, + "GATK4_BASERECALIBRATOR": { + "gatk4": "4.5.0.0" + }, + "GATK4_MARKDUPLICATES": { + "gatk4": "4.5.0.0", + "samtools": "1.19.2" + }, + "INDEX_CRAM": { + "samtools": 1.21 + }, + "MOSDEPTH": { + "mosdepth": "0.3.8" + }, + "SAMTOOLS_STATS": { + "samtools": 1.21 + }, + "STRELKA_SINGLE": { + "strelka": "2.9.10" + }, + "STRELKA_SOMATIC": { + "strelka": "2.9.10" + }, + "VCFTOOLS_TSTV_COUNT": { + "vcftools": "0.1.16" + }, + 
"Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/markduplicates.csv", + "csv/markduplicates_no_table.csv", + "csv/recalibrated.csv", + "csv/variantcalled.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/bcftools_stats_indel-lengths.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Indels.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_SNP.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transitions.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transversions.txt", + "multiqc/multiqc_data/fastqc-status-check-heatmap.txt", + "multiqc/multiqc_data/fastqc_adapter_content_plot.txt", + "multiqc/multiqc_data/fastqc_per_base_n_content_plot.txt", + "multiqc/multiqc_data/fastqc_per_base_sequence_quality_plot.txt", + "multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Counts.txt", + "multiqc/multiqc_data/fastqc_per_sequence_gc_content_plot_Percentages.txt", + "multiqc/multiqc_data/fastqc_per_sequence_quality_scores_plot.txt", + "multiqc/multiqc_data/fastqc_sequence_counts_plot.txt", + "multiqc/multiqc_data/fastqc_sequence_duplication_levels_plot.txt", + "multiqc/multiqc_data/fastqc_sequence_length_distribution_plot.txt", + "multiqc/multiqc_data/fastqc_top_overrepresented_sequences_table.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt", + "multiqc/multiqc_data/gatk-base-recalibrator-reported-empirical-plot.txt", + "multiqc/multiqc_data/gatk_base_recalibrator.txt", + "multiqc/multiqc_data/mosdepth-coverage-per-contig-single.txt", + "multiqc/multiqc_data/mosdepth-cumcoverage-dist-id.txt", + "multiqc/multiqc_data/mosdepth_cov_dist.txt", + "multiqc/multiqc_data/mosdepth_cumcov_dist.txt", + "multiqc/multiqc_data/mosdepth_perchrom.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_bcftools_stats.txt", + 
"multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_fastqc.txt", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_picard_dups.txt", + "multiqc/multiqc_data/multiqc_samtools_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/picard_deduplication.txt", + "multiqc/multiqc_data/picard_histogram.txt", + "multiqc/multiqc_data/picard_histogram_1.txt", + "multiqc/multiqc_data/picard_histogram_2.txt", + "multiqc/multiqc_data/samtools-stats-dp.txt", + "multiqc/multiqc_data/samtools_alignment_plot.txt", + "multiqc/multiqc_data/vcftools_tstv_by_count.txt", + "multiqc/multiqc_data/vcftools_tstv_by_qual.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_indel-lengths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Indels.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_SNP.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transitions.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transversions.pdf", + "multiqc/multiqc_plots/pdf/fastqc-status-check-heatmap.pdf", + "multiqc/multiqc_plots/pdf/fastqc_adapter_content_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_base_n_content_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_base_sequence_quality_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_sequence_gc_content_plot_Counts.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_sequence_gc_content_plot_Percentages.pdf", + "multiqc/multiqc_plots/pdf/fastqc_per_sequence_quality_scores_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_sequence_counts_plot-cnt.pdf", + "multiqc/multiqc_plots/pdf/fastqc_sequence_counts_plot-pct.pdf", + "multiqc/multiqc_plots/pdf/fastqc_sequence_duplication_levels_plot.pdf", + "multiqc/multiqc_plots/pdf/fastqc_sequence_length_distribution_plot.pdf", + 
"multiqc/multiqc_plots/pdf/fastqc_top_overrepresented_sequences_table.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.pdf", + "multiqc/multiqc_plots/pdf/gatk-base-recalibrator-reported-empirical-plot.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-coverage-per-contig-single-cnt.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-coverage-per-contig-single-pct.pdf", + "multiqc/multiqc_plots/pdf/mosdepth-cumcoverage-dist-id.pdf", + "multiqc/multiqc_plots/pdf/picard_deduplication-cnt.pdf", + "multiqc/multiqc_plots/pdf/picard_deduplication-pct.pdf", + "multiqc/multiqc_plots/pdf/samtools-stats-dp.pdf", + "multiqc/multiqc_plots/pdf/samtools_alignment_plot-cnt.pdf", + "multiqc/multiqc_plots/pdf/samtools_alignment_plot-pct.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_count.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_qual.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/bcftools_stats_indel-lengths.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Indels.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_SNP.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transitions.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transversions.png", + "multiqc/multiqc_plots/png/fastqc-status-check-heatmap.png", + "multiqc/multiqc_plots/png/fastqc_adapter_content_plot.png", + "multiqc/multiqc_plots/png/fastqc_per_base_n_content_plot.png", + "multiqc/multiqc_plots/png/fastqc_per_base_sequence_quality_plot.png", + "multiqc/multiqc_plots/png/fastqc_per_sequence_gc_content_plot_Counts.png", + "multiqc/multiqc_plots/png/fastqc_per_sequence_gc_content_plot_Percentages.png", + "multiqc/multiqc_plots/png/fastqc_per_sequence_quality_scores_plot.png", + "multiqc/multiqc_plots/png/fastqc_sequence_counts_plot-cnt.png", + 
"multiqc/multiqc_plots/png/fastqc_sequence_counts_plot-pct.png", + "multiqc/multiqc_plots/png/fastqc_sequence_duplication_levels_plot.png", + "multiqc/multiqc_plots/png/fastqc_sequence_length_distribution_plot.png", + "multiqc/multiqc_plots/png/fastqc_top_overrepresented_sequences_table.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.png", + "multiqc/multiqc_plots/png/gatk-base-recalibrator-reported-empirical-plot.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/mosdepth-coverage-per-contig-single-cnt.png", + "multiqc/multiqc_plots/png/mosdepth-coverage-per-contig-single-pct.png", + "multiqc/multiqc_plots/png/mosdepth-cumcoverage-dist-id.png", + "multiqc/multiqc_plots/png/picard_deduplication-cnt.png", + "multiqc/multiqc_plots/png/picard_deduplication-pct.png", + "multiqc/multiqc_plots/png/samtools-stats-dp.png", + "multiqc/multiqc_plots/png/samtools_alignment_plot-cnt.png", + "multiqc/multiqc_plots/png/samtools_alignment_plot-pct.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_count.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_qual.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/bcftools_stats_indel-lengths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Indels.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_SNP.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transitions.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transversions.svg", + "multiqc/multiqc_plots/svg/fastqc-status-check-heatmap.svg", + "multiqc/multiqc_plots/svg/fastqc_adapter_content_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_per_base_n_content_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_per_base_sequence_quality_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_per_sequence_gc_content_plot_Counts.svg", + 
"multiqc/multiqc_plots/svg/fastqc_per_sequence_gc_content_plot_Percentages.svg", + "multiqc/multiqc_plots/svg/fastqc_per_sequence_quality_scores_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_sequence_counts_plot-cnt.svg", + "multiqc/multiqc_plots/svg/fastqc_sequence_counts_plot-pct.svg", + "multiqc/multiqc_plots/svg/fastqc_sequence_duplication_levels_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_sequence_length_distribution_plot.svg", + "multiqc/multiqc_plots/svg/fastqc_top_overrepresented_sequences_table.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.svg", + "multiqc/multiqc_plots/svg/gatk-base-recalibrator-reported-empirical-plot.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/mosdepth-coverage-per-contig-single-cnt.svg", + "multiqc/multiqc_plots/svg/mosdepth-coverage-per-contig-single-pct.svg", + "multiqc/multiqc_plots/svg/mosdepth-cumcoverage-dist-id.svg", + "multiqc/multiqc_plots/svg/picard_deduplication-cnt.svg", + "multiqc/multiqc_plots/svg/picard_deduplication-pct.svg", + "multiqc/multiqc_plots/svg/samtools-stats-dp.svg", + "multiqc/multiqc_plots/svg/samtools_alignment_plot-cnt.svg", + "multiqc/multiqc_plots/svg/samtools_alignment_plot-pct.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_count.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_qual.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "preprocessing", + "preprocessing/markduplicates", + "preprocessing/markduplicates/test", + "preprocessing/markduplicates/test/test.md.cram", + "preprocessing/markduplicates/test/test.md.cram.crai", + "preprocessing/markduplicates/test2", + "preprocessing/markduplicates/test2/test2.md.cram", + "preprocessing/markduplicates/test2/test2.md.cram.crai", + "preprocessing/recal_table", + 
"preprocessing/recal_table/test", + "preprocessing/recal_table/test/test.recal.table", + "preprocessing/recal_table/test2", + "preprocessing/recal_table/test2/test2.recal.table", + "preprocessing/recalibrated", + "preprocessing/recalibrated/test", + "preprocessing/recalibrated/test/test.recal.cram", + "preprocessing/recalibrated/test/test.recal.cram.crai", + "preprocessing/recalibrated/test2", + "preprocessing/recalibrated/test2/test2.recal.cram", + "preprocessing/recalibrated/test2/test2.recal.cram.crai", + "reference", + "reports", + "reports/bcftools", + "reports/bcftools/strelka", + "reports/bcftools/strelka/test", + "reports/bcftools/strelka/test/test.strelka.variants.bcftools_stats.txt", + "reports/bcftools/strelka/test2_vs_test", + "reports/bcftools/strelka/test2_vs_test/test2_vs_test.strelka.somatic_indels.bcftools_stats.txt", + "reports/bcftools/strelka/test2_vs_test/test2_vs_test.strelka.somatic_snvs.bcftools_stats.txt", + "reports/fastqc", + "reports/fastqc/test-test_L1", + "reports/fastqc/test-test_L1/test-test_L1_1_fastqc.html", + "reports/fastqc/test-test_L1/test-test_L1_1_fastqc.zip", + "reports/fastqc/test-test_L1/test-test_L1_2_fastqc.html", + "reports/fastqc/test-test_L1/test-test_L1_2_fastqc.zip", + "reports/fastqc/test2-test_L1", + "reports/fastqc/test2-test_L1/test2-test_L1_1_fastqc.html", + "reports/fastqc/test2-test_L1/test2-test_L1_1_fastqc.zip", + "reports/fastqc/test2-test_L1/test2-test_L1_2_fastqc.html", + "reports/fastqc/test2-test_L1/test2-test_L1_2_fastqc.zip", + "reports/markduplicates", + "reports/markduplicates/test", + "reports/markduplicates/test/test.md.cram.metrics", + "reports/markduplicates/test2", + "reports/markduplicates/test2/test2.md.cram.metrics", + "reports/mosdepth", + "reports/mosdepth/test", + "reports/mosdepth/test/test.md.mosdepth.global.dist.txt", + "reports/mosdepth/test/test.md.mosdepth.region.dist.txt", + "reports/mosdepth/test/test.md.mosdepth.summary.txt", + "reports/mosdepth/test/test.md.regions.bed.gz", + 
"reports/mosdepth/test/test.md.regions.bed.gz.csi", + "reports/mosdepth/test/test.recal.mosdepth.global.dist.txt", + "reports/mosdepth/test/test.recal.mosdepth.region.dist.txt", + "reports/mosdepth/test/test.recal.mosdepth.summary.txt", + "reports/mosdepth/test/test.recal.regions.bed.gz", + "reports/mosdepth/test/test.recal.regions.bed.gz.csi", + "reports/mosdepth/test2", + "reports/mosdepth/test2/test2.md.mosdepth.global.dist.txt", + "reports/mosdepth/test2/test2.md.mosdepth.region.dist.txt", + "reports/mosdepth/test2/test2.md.mosdepth.summary.txt", + "reports/mosdepth/test2/test2.md.regions.bed.gz", + "reports/mosdepth/test2/test2.md.regions.bed.gz.csi", + "reports/mosdepth/test2/test2.recal.mosdepth.global.dist.txt", + "reports/mosdepth/test2/test2.recal.mosdepth.region.dist.txt", + "reports/mosdepth/test2/test2.recal.mosdepth.summary.txt", + "reports/mosdepth/test2/test2.recal.regions.bed.gz", + "reports/mosdepth/test2/test2.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/test", + "reports/samtools/test/test.md.cram.stats", + "reports/samtools/test/test.recal.cram.stats", + "reports/samtools/test2", + "reports/samtools/test2/test2.md.cram.stats", + "reports/samtools/test2/test2.recal.cram.stats", + "reports/vcftools", + "reports/vcftools/strelka", + "reports/vcftools/strelka/test", + "reports/vcftools/strelka/test/test.strelka.variants.FILTER.summary", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.count", + "reports/vcftools/strelka/test/test.strelka.variants.TsTv.qual", + "reports/vcftools/strelka/test2_vs_test", + "reports/vcftools/strelka/test2_vs_test/test2_vs_test.strelka.somatic_indels.FILTER.summary", + "reports/vcftools/strelka/test2_vs_test/test2_vs_test.strelka.somatic_indels.TsTv.count", + "reports/vcftools/strelka/test2_vs_test/test2_vs_test.strelka.somatic_indels.TsTv.qual", + "reports/vcftools/strelka/test2_vs_test/test2_vs_test.strelka.somatic_snvs.FILTER.summary", + 
"reports/vcftools/strelka/test2_vs_test/test2_vs_test.strelka.somatic_snvs.TsTv.count", + "reports/vcftools/strelka/test2_vs_test/test2_vs_test.strelka.somatic_snvs.TsTv.qual", + "variant_calling", + "variant_calling/strelka", + "variant_calling/strelka/test", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz", + "variant_calling/strelka/test/test.strelka.genome.vcf.gz.tbi", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz", + "variant_calling/strelka/test/test.strelka.variants.vcf.gz.tbi", + "variant_calling/strelka/test2_vs_test", + "variant_calling/strelka/test2_vs_test/test2_vs_test.strelka.somatic_indels.vcf.gz", + "variant_calling/strelka/test2_vs_test/test2_vs_test.strelka.somatic_indels.vcf.gz.tbi", + "variant_calling/strelka/test2_vs_test/test2_vs_test.strelka.somatic_snvs.vcf.gz", + "variant_calling/strelka/test2_vs_test/test2_vs_test.strelka.somatic_snvs.vcf.gz.tbi" + ], + [ + "bcftools_stats_indel-lengths.txt:md5,a180ee1e52441923154a87e86949fe5f", + "bcftools_stats_vqc_Count_Indels.txt:md5,90d196606a7945f77edd6bea2b4625ed", + "bcftools_stats_vqc_Count_SNP.txt:md5,9682508dd65bb9889fe073b1bec59666", + "bcftools_stats_vqc_Count_Transitions.txt:md5,9682508dd65bb9889fe073b1bec59666", + "bcftools_stats_vqc_Count_Transversions.txt:md5,9682508dd65bb9889fe073b1bec59666", + "fastqc-status-check-heatmap.txt:md5,eeb4e7e7a45f4223c86bfe3aea81f90b", + "fastqc_adapter_content_plot.txt:md5,cc7a809f9f001c10646ee4199ccdb40f", + "fastqc_per_base_n_content_plot.txt:md5,1eba855ae0fa5b5ed4a1f90d1c97f759", + "fastqc_per_base_sequence_quality_plot.txt:md5,cbb2743dfb2ec74e72b578c83ec28ee8", + "fastqc_per_sequence_gc_content_plot_Counts.txt:md5,73c884822eba0bafcdf34b90fe81aec5", + "fastqc_per_sequence_gc_content_plot_Percentages.txt:md5,24eeb00e5e2b11c7ab90a3223d429d15", + "fastqc_per_sequence_quality_scores_plot.txt:md5,6f048594f02effb93608665be29bd35a", + "fastqc_sequence_counts_plot.txt:md5,fca7ee9ef3382e2837a302d8c5d33769", + 
"fastqc_sequence_duplication_levels_plot.txt:md5,2aa0c6f33e4cffbb29cdabe2c28bb097", + "fastqc_sequence_length_distribution_plot.txt:md5,61b1fe978a2c73b86c30c27ee4bc60ae", + "fastqc_top_overrepresented_sequences_table.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Count.txt:md5,611d52bfafbf4118e738fee9346ffb1c", + "gatk-base-recalibrator-quality-scores-plot_Pre-recalibration_Percent.txt:md5,9924357c076930220c7b3a9b02067f47", + "mosdepth-coverage-per-contig-single.txt:md5,e2bac78b61847b15c755dc7069670939", + "mosdepth-cumcoverage-dist-id.txt:md5,1a6c11f5d74f772870d49af0a7ff84d2", + "mosdepth_cov_dist.txt:md5,cb92686a4d6b7dcdeeee0ac13a63f369", + "mosdepth_cumcov_dist.txt:md5,cb92686a4d6b7dcdeeee0ac13a63f369", + "mosdepth_perchrom.txt:md5,e2bac78b61847b15c755dc7069670939", + "multiqc_bcftools_stats.txt:md5,7b8f1b48aa1c33067679168a3f0d9fd1", + "multiqc_citations.txt:md5,ace4ca89138a5f1e2be289c157c00bd9", + "multiqc_fastqc.txt:md5,2fd25e8c81f962594b801d5a9df3cd87", + "multiqc_samtools_stats.txt:md5,60aa328e68503436f51e9a9b13dc6665", + "picard_histogram.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "picard_histogram_1.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "picard_histogram_2.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "samtools-stats-dp.txt:md5,d81e84864ecb732951a137e88d87a263", + "samtools_alignment_plot.txt:md5,e757696d1f7107df2bd4ad92f607272e", + "vcftools_tstv_by_count.txt:md5,9d3f23467779f62d81573894f71ba0d4", + "test.strelka.variants.bcftools_stats.txt:md5,550430d0b8336ac650b7e50bdfdc914c", + "test2_vs_test.strelka.somatic_indels.bcftools_stats.txt:md5,2ce8bffbfbb88c4da6a5239b11dc2f44", + "test2_vs_test.strelka.somatic_snvs.bcftools_stats.txt:md5,869e2d3a46760e2bea9b2e027002848d", + "test.md.mosdepth.global.dist.txt:md5,76fa71922a3f748e507c2364c531dfcb", + "test.md.mosdepth.region.dist.txt:md5,abc5df85e302b79985627888870882da", + "test.md.mosdepth.summary.txt:md5,d536456436eb275159b8c6af83213d80", + 
"test.md.regions.bed.gz:md5,b25a2798061021c0b2f4e1d18219bbbd", + "test.md.regions.bed.gz.csi:md5,05d571f8d51ca6b1bde804d7a6d999af", + "test.recal.mosdepth.global.dist.txt:md5,76fa71922a3f748e507c2364c531dfcb", + "test.recal.mosdepth.region.dist.txt:md5,abc5df85e302b79985627888870882da", + "test.recal.mosdepth.summary.txt:md5,d536456436eb275159b8c6af83213d80", + "test.recal.regions.bed.gz:md5,b25a2798061021c0b2f4e1d18219bbbd", + "test.recal.regions.bed.gz.csi:md5,05d571f8d51ca6b1bde804d7a6d999af", + "test2.md.mosdepth.global.dist.txt:md5,2020cf6dfc7ddca020c921dd9f0549b7", + "test2.md.mosdepth.region.dist.txt:md5,38ff8b38c33b9231f047fea8ea830aae", + "test2.md.mosdepth.summary.txt:md5,8b991358768cade225470a07cd34f573", + "test2.md.regions.bed.gz:md5,08e767f91a0a8d82733f0040e804a85f", + "test2.md.regions.bed.gz.csi:md5,eba5aca0b8f72759192bbdd29330278e", + "test2.recal.mosdepth.global.dist.txt:md5,2020cf6dfc7ddca020c921dd9f0549b7", + "test2.recal.mosdepth.region.dist.txt:md5,38ff8b38c33b9231f047fea8ea830aae", + "test2.recal.mosdepth.summary.txt:md5,8b991358768cade225470a07cd34f573", + "test2.recal.regions.bed.gz:md5,08e767f91a0a8d82733f0040e804a85f", + "test2.recal.regions.bed.gz.csi:md5,eba5aca0b8f72759192bbdd29330278e", + "test.md.cram.stats:md5,7d0a7d3d4ca6bf21702e7775b1777fb8", + "test.recal.cram.stats:md5,269fd4151ac69c1de63e2224310386a0", + "test2.md.cram.stats:md5,592aae2575ef68c8c8d9e993ca288f2d", + "test2.recal.cram.stats:md5,47c5ad8a7d4221d03673fd67388f9960", + "test.strelka.variants.FILTER.summary:md5,dd87f507da7de20d5318841af312493b", + "test.strelka.variants.TsTv.count:md5,fa27f678965b7cba6a92efcd039f802a", + "test2_vs_test.strelka.somatic_indels.FILTER.summary:md5,1ce42d34e4ae919afb519efc99146423", + "test2_vs_test.strelka.somatic_indels.TsTv.count:md5,8dcfdbcaac118df1d5ad407dd2af699f", + "test2_vs_test.strelka.somatic_snvs.FILTER.summary:md5,1ce42d34e4ae919afb519efc99146423", + 
"test2_vs_test.strelka.somatic_snvs.TsTv.count:md5,8dcfdbcaac118df1d5ad407dd2af699f" + ], + [ + [ + "test.md.cram", + "59ecc5c82c7af1283eea7507c590c831" + ], + [ + "test2.md.cram", + "bac87cf9290577fd9a4def63e046031f" + ], + [ + "test.recal.cram", + "654909615a48db30bdc14ec4d9d7d17c" + ], + [ + "test2.recal.cram", + "f4205ab086600ba2927e1468dc732976" + ] + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-11-09T18:22:56.125003" + } +} diff --git a/tests/variant_calling_controlfreec.nf.test b/tests/variant_calling_controlfreec.nf.test new file mode 100644 index 0000000000..934cf97c4c --- /dev/null +++ b/tests/variant_calling_controlfreec.nf.test @@ -0,0 +1,187 @@ +nextflow_pipeline { + + name "Test pipeline" + script "../main.nf" + tag "pipeline" + tag "pipeline_sarek" + + test("Run with profile test | --tools controlfreec | somatic") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + chr_dir = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/chromosomes.tar.gz' + dbsnp = params.modules_testdata_base_path + '/genomics/homo_sapiens/genome/chr21/germlineresources/dbsnp_138.hg38.vcf.gz' + dbsnp_tbi = params.modules_testdata_base_path + '/genomics/homo_sapiens/genome/chr21/germlineresources/dbsnp_138.hg38.vcf.gz.tbi' + fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta' + fasta_fai = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai' + input = "${projectDir}/tests/csv/3.0/recalibrated_somatic.csv" + intervals = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed' + outdir = "$outputDir" + step = "variant_calling" + tools = 'controlfreec' + wes = true + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = 
getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we test pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --tools controlfreec --no_intervals | tumoronly") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + chr_dir = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/chromosomes.tar.gz' + dbsnp = params.modules_testdata_base_path + '/genomics/homo_sapiens/genome/chr21/germlineresources/dbsnp_138.hg38.vcf.gz' + dbsnp_tbi = params.modules_testdata_base_path + '/genomics/homo_sapiens/genome/chr21/germlineresources/dbsnp_138.hg38.vcf.gz.tbi' + fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta' + fasta_fai = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai' + input = "${projectDir}/tests/csv/3.0/recalibrated_tumoronly.csv" + no_intervals = true + outdir =
"$outputDir" + step = "variant_calling" + tools = 'controlfreec' + wes = true + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we test pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --tools controlfreec --no_intervals | somatic | stub") { + options "-stub" + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + chr_dir = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/chromosomes.tar.gz' + dbsnp = params.modules_testdata_base_path + '/genomics/homo_sapiens/genome/chr21/germlineresources/dbsnp_138.hg38.vcf.gz' + dbsnp_tbi = params.modules_testdata_base_path + '/genomics/homo_sapiens/genome/chr21/germlineresources/dbsnp_138.hg38.vcf.gz.tbi' + fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta' +
fasta_fai = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai' + input = "${projectDir}/tests/csv/3.0/recalibrated_somatic.csv" + no_intervals = true + outdir = "$outputDir" + step = "variant_calling" + tools = 'controlfreec' + wes = true + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we test pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --tools controlfreec | tumoronly | stub") { + options "-stub" + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + chr_dir = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/chromosomes.tar.gz' + dbsnp = params.modules_testdata_base_path + '/genomics/homo_sapiens/genome/chr21/germlineresources/dbsnp_138.hg38.vcf.gz' + dbsnp_tbi = params.modules_testdata_base_path
+ '/genomics/homo_sapiens/genome/chr21/germlineresources/dbsnp_138.hg38.vcf.gz.tbi' + fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta' + fasta_fai = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai' + input = "${projectDir}/tests/csv/3.0/recalibrated_tumoronly.csv" + intervals = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed' + outdir = "$outputDir" + step = "variant_calling" + tools = 'controlfreec' + wes = true + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we test pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } +} diff --git a/tests/variant_calling_controlfreec.nf.test.snap b/tests/variant_calling_controlfreec.nf.test.snap new file mode 100644 index 0000000000..f468d84f5c --- /dev/null +++
b/tests/variant_calling_controlfreec.nf.test.snap @@ -0,0 +1,612 @@ +{ + "Run with profile test | --tools controlfreec --no_intervals | somatic": { + "content": [ + 8, + { + "SAMTOOLS_MPILEUP": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "no_intervals.bed", + "no_intervals.bed.gz", + "no_intervals.bed.gz.tbi", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/mosdepth", + "reports/mosdepth/sample3", + "reports/mosdepth/sample3/sample3.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample3/sample3.recal.mosdepth.summary.txt", + "reports/mosdepth/sample3/sample3.recal.per-base.bed.gz", + "reports/mosdepth/sample3/sample3.recal.per-base.bed.gz.csi", + "reports/mosdepth/sample4", + "reports/mosdepth/sample4/sample4.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample4/sample4.recal.mosdepth.summary.txt", + "reports/mosdepth/sample4/sample4.recal.per-base.bed.gz", + "reports/mosdepth/sample4/sample4.recal.per-base.bed.gz.csi", + "reports/samtools", + "reports/samtools/sample3", + "reports/samtools/sample3/sample3.recal.cram.stats", + "reports/samtools/sample4", + "reports/samtools/sample4/sample4.recal.cram.stats" + ], + [ + "sample3.recal.mosdepth.global.dist.txt:md5,69e29702ef01fd8f6c7a5468fc35a16a", + "sample3.recal.mosdepth.summary.txt:md5,d2775eb102acc5950f7f53883dcb503d", + "sample3.recal.per-base.bed.gz:md5,297f96648928d0ca5184223fb9941e7c", + "sample3.recal.per-base.bed.gz.csi:md5,519cc5bf84da0d71b87a88c76f83194e", + "sample4.recal.mosdepth.global.dist.txt:md5,f2dcd00a64947c49e8e4b93c2f4fbf27", + "sample4.recal.mosdepth.summary.txt:md5,0a7300e56eda6fba7c7564f00aa000f0", + "sample4.recal.per-base.bed.gz:md5,39a1bc436aa8546c26faedbe94cb676c", + "sample4.recal.per-base.bed.gz.csi:md5,aaa7bed9e7ef873b23bca249b8b58eb9", + "sample3.recal.cram.stats:md5,bcc229318527e414e69aaa5cd092ad9b", + 
"sample4.recal.cram.stats:md5,0d1784cb4c3f14b9858247ac6128dd03" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.10.2" + }, + "timestamp": "2024-12-10T09:46:49.516904333" + }, + "Run with profile test | --tools controlfreec | tumoronly": { + "content": [ + 11, + { + "FREEC2BED": { + "controlfreec": "11.6b" + }, + "FREEC2CIRCOS": { + "controlfreec": "11.6b" + }, + "FREEC_TUMORONLY": { + "controlfreec": "11.6b" + }, + "SAMTOOLS_MPILEUP": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/mosdepth", + "reports/mosdepth/sample2", + "reports/mosdepth/sample2/sample2.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample2/sample2.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample2/sample2.recal.mosdepth.summary.txt", + "reports/mosdepth/sample2/sample2.recal.per-base.bed.gz", + "reports/mosdepth/sample2/sample2.recal.per-base.bed.gz.csi", + "reports/mosdepth/sample2/sample2.recal.regions.bed.gz", + "reports/mosdepth/sample2/sample2.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/sample2", + "reports/samtools/sample2/sample2.recal.cram.stats", + "variant_calling", + "variant_calling/controlfreec", + "variant_calling/controlfreec/sample2", + "variant_calling/controlfreec/sample2/config.txt", + "variant_calling/controlfreec/sample2/sample2.bed", + "variant_calling/controlfreec/sample2/sample2.circos.txt", + "variant_calling/controlfreec/sample2/sample2.p.value.txt", + "variant_calling/controlfreec/sample2/sample2.tumor.mpileup.gz_BAF.txt", + "variant_calling/controlfreec/sample2/sample2.tumor.mpileup.gz_CNVs", + "variant_calling/controlfreec/sample2/sample2.tumor.mpileup.gz_info.txt", + "variant_calling/controlfreec/sample2/sample2.tumor.mpileup.gz_ratio.BedGraph", + "variant_calling/controlfreec/sample2/sample2.tumor.mpileup.gz_ratio.txt", + 
"variant_calling/controlfreec/sample2/sample2.tumor.mpileup.gz_sample.cpn" + ], + [ + "sample2.recal.mosdepth.global.dist.txt:md5,f2dcd00a64947c49e8e4b93c2f4fbf27", + "sample2.recal.mosdepth.region.dist.txt:md5,39005ffaac22871ffaaf19656fe69c5b", + "sample2.recal.mosdepth.summary.txt:md5,68d4b98f17361fddf73052ead34fa370", + "sample2.recal.per-base.bed.gz:md5,39a1bc436aa8546c26faedbe94cb676c", + "sample2.recal.per-base.bed.gz.csi:md5,aaa7bed9e7ef873b23bca249b8b58eb9", + "sample2.recal.regions.bed.gz:md5,b7561bc56a955f7db0f11e67e2ec0386", + "sample2.recal.regions.bed.gz.csi:md5,538cb5d244411a670a4b041691f8825b", + "sample2.recal.cram.stats:md5,0d1784cb4c3f14b9858247ac6128dd03", + "sample2.bed:md5,480eb7f5467f7496c0c96da382563fed", + "sample2.circos.txt:md5,83b313af17990145404e4a98d59b98b2", + "sample2.p.value.txt:md5,a5baccb0b7e35c93848fc7d3f586bc98", + "sample2.tumor.mpileup.gz_BAF.txt:md5,c7f0d18d66988fe2938a8f5691772bcb", + "sample2.tumor.mpileup.gz_CNVs:md5,29d1456ae74a2a07c87dfae5dd1bf4a9", + "sample2.tumor.mpileup.gz_info.txt:md5,da8f48d99b48e85c2aac89f143e20e5b", + "sample2.tumor.mpileup.gz_ratio.BedGraph:md5,869661a75d7ec23f6cf1192b4c6f8512", + "sample2.tumor.mpileup.gz_ratio.txt:md5,cc7f5e1ae29196cc1f4052eebc03b65c", + "sample2.tumor.mpileup.gz_sample.cpn:md5,1b53d73a6f04b29c9693ecbffe8dc3a2" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.10.2" + }, + "timestamp": "2024-12-10T09:55:52.492595419" + }, + "Run with profile test | --tools controlfreec --no_intervals | somatic | stub": { + "content": [ + 14, + { + "ASSESS_SIGNIFICANCE": { + "controlfreec": 11.6 + }, + "FREEC2BED": { + "controlfreec": "11.6b" + }, + "FREEC2CIRCOS": { + "controlfreec": "11.6b" + }, + "FREEC_SOMATIC": { + "controlfreec": "11.6b" + }, + "MAKEGRAPH2": { + "controlfreec": "11.6b" + }, + "SAMTOOLS_MPILEUP": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_plots", 
+ "multiqc/multiqc_report.html", + "no_intervals.bed", + "no_intervals.bed.gz", + "no_intervals.bed.gz.tbi", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/mosdepth", + "reports/mosdepth/sample3", + "reports/mosdepth/sample3/sample3.recal.global.dist.txt", + "reports/mosdepth/sample3/sample3.recal.per-base.bed.gz", + "reports/mosdepth/sample3/sample3.recal.per-base.bed.gz.csi", + "reports/mosdepth/sample3/sample3.recal.per-base.d4", + "reports/mosdepth/sample3/sample3.recal.quantized.bed.gz", + "reports/mosdepth/sample3/sample3.recal.quantized.bed.gz.csi", + "reports/mosdepth/sample3/sample3.recal.region.dist.txt", + "reports/mosdepth/sample3/sample3.recal.regions.bed.gz", + "reports/mosdepth/sample3/sample3.recal.regions.bed.gz.csi", + "reports/mosdepth/sample3/sample3.recal.summary.txt", + "reports/mosdepth/sample3/sample3.recal.thresholds.bed.gz", + "reports/mosdepth/sample3/sample3.recal.thresholds.bed.gz.csi", + "reports/mosdepth/sample4", + "reports/mosdepth/sample4/sample4.recal.global.dist.txt", + "reports/mosdepth/sample4/sample4.recal.per-base.bed.gz", + "reports/mosdepth/sample4/sample4.recal.per-base.bed.gz.csi", + "reports/mosdepth/sample4/sample4.recal.per-base.d4", + "reports/mosdepth/sample4/sample4.recal.quantized.bed.gz", + "reports/mosdepth/sample4/sample4.recal.quantized.bed.gz.csi", + "reports/mosdepth/sample4/sample4.recal.region.dist.txt", + "reports/mosdepth/sample4/sample4.recal.regions.bed.gz", + "reports/mosdepth/sample4/sample4.recal.regions.bed.gz.csi", + "reports/mosdepth/sample4/sample4.recal.summary.txt", + "reports/mosdepth/sample4/sample4.recal.thresholds.bed.gz", + "reports/mosdepth/sample4/sample4.recal.thresholds.bed.gz.csi", + "reports/samtools", + "reports/samtools/sample3", + "reports/samtools/sample3/sample3.recal.cram.stats", + "reports/samtools/sample4", + "reports/samtools/sample4/sample4.recal.cram.stats", + "variant_calling", + 
"variant_calling/controlfreec", + "variant_calling/controlfreec/sample4_vs_sample3", + "variant_calling/controlfreec/sample4_vs_sample3/GC_profile.sample4_vs_sample3.cpn", + "variant_calling/controlfreec/sample4_vs_sample3/config.txt", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.bed", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.circos.txt", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.p.value.txt", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_BAF.png", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_BAF.txt", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_CNVs", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_info.txt", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_ratio.BedGraph", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_ratio.log2.png", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_ratio.png", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_ratio.txt", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_sample.cpn" + ], + [ + "sample3.recal.global.dist.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample3.recal.per-base.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "sample3.recal.per-base.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample3.recal.per-base.d4:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample3.recal.quantized.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "sample3.recal.quantized.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample3.recal.region.dist.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample3.recal.regions.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "sample3.recal.regions.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample3.recal.summary.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + 
"sample3.recal.thresholds.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "sample3.recal.thresholds.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4.recal.global.dist.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4.recal.per-base.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "sample4.recal.per-base.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4.recal.per-base.d4:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4.recal.quantized.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "sample4.recal.quantized.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4.recal.region.dist.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4.recal.regions.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "sample4.recal.regions.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4.recal.summary.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4.recal.thresholds.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "sample4.recal.thresholds.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample3.recal.cram.stats:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4.recal.cram.stats:md5,d41d8cd98f00b204e9800998ecf8427e", + "GC_profile.sample4_vs_sample3.cpn:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4_vs_sample3.bed:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4_vs_sample3.circos.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4_vs_sample3.p.value.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4_vs_sample3_BAF.png:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4_vs_sample3_BAF.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4_vs_sample3_CNVs:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4_vs_sample3_info.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4_vs_sample3_ratio.BedGraph:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4_vs_sample3_ratio.log2.png:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4_vs_sample3_ratio.png:md5,d41d8cd98f00b204e9800998ecf8427e", + 
"sample4_vs_sample3_ratio.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample4_vs_sample3_sample.cpn:md5,d41d8cd98f00b204e9800998ecf8427e" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.10.2" + }, + "timestamp": "2024-12-10T10:14:32.674251218" + }, + "Run with profile test | --tools controlfreec | somatic": { + "content": [ + 16, + { + "ASSESS_SIGNIFICANCE": { + "controlfreec": 11.6 + }, + "FREEC2BED": { + "controlfreec": "11.6b" + }, + "FREEC2CIRCOS": { + "controlfreec": "11.6b" + }, + "FREEC_SOMATIC": { + "controlfreec": "11.6b" + }, + "MAKEGRAPH2": { + "controlfreec": "11.6b" + }, + "SAMTOOLS_MPILEUP": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/mosdepth", + "reports/mosdepth/sample3", + "reports/mosdepth/sample3/sample3.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample3/sample3.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample3/sample3.recal.mosdepth.summary.txt", + "reports/mosdepth/sample3/sample3.recal.per-base.bed.gz", + "reports/mosdepth/sample3/sample3.recal.per-base.bed.gz.csi", + "reports/mosdepth/sample3/sample3.recal.regions.bed.gz", + "reports/mosdepth/sample3/sample3.recal.regions.bed.gz.csi", + "reports/mosdepth/sample4", + "reports/mosdepth/sample4/sample4.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample4/sample4.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample4/sample4.recal.mosdepth.summary.txt", + "reports/mosdepth/sample4/sample4.recal.per-base.bed.gz", + "reports/mosdepth/sample4/sample4.recal.per-base.bed.gz.csi", 
+ "reports/mosdepth/sample4/sample4.recal.regions.bed.gz", + "reports/mosdepth/sample4/sample4.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/sample3", + "reports/samtools/sample3/sample3.recal.cram.stats", + "reports/samtools/sample4", + "reports/samtools/sample4/sample4.recal.cram.stats", + "variant_calling", + "variant_calling/controlfreec", + "variant_calling/controlfreec/sample4_vs_sample3", + "variant_calling/controlfreec/sample4_vs_sample3/config.txt", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.bed", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.circos.txt", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.normal.mpileup.gz_control.cpn", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.p.value.txt", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.tumor.mpileup.gz_BAF.txt", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.tumor.mpileup.gz_CNVs", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.tumor.mpileup.gz_info.txt", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.tumor.mpileup.gz_ratio.BedGraph", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.tumor.mpileup.gz_ratio.txt", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3.tumor.mpileup.gz_sample.cpn", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_BAF.png", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_ratio.log2.png", + "variant_calling/controlfreec/sample4_vs_sample3/sample4_vs_sample3_ratio.png" + ], + [ + "multiqc_citations.txt:md5,4c806e63a283ec1b7e78cdae3a923d4f", + "sample3.recal.mosdepth.global.dist.txt:md5,69e29702ef01fd8f6c7a5468fc35a16a", + "sample3.recal.mosdepth.region.dist.txt:md5,6ec49cd7d510c2eb3d9d90fdb79b783a", + "sample3.recal.mosdepth.summary.txt:md5,103098d0bf76ed82d2b87d5f242b099a", + 
"sample3.recal.per-base.bed.gz:md5,297f96648928d0ca5184223fb9941e7c", + "sample3.recal.per-base.bed.gz.csi:md5,519cc5bf84da0d71b87a88c76f83194e", + "sample3.recal.regions.bed.gz:md5,314ce8d7273eff353072108aa77c327c", + "sample3.recal.regions.bed.gz.csi:md5,538cb5d244411a670a4b041691f8825b", + "sample4.recal.mosdepth.global.dist.txt:md5,f2dcd00a64947c49e8e4b93c2f4fbf27", + "sample4.recal.mosdepth.region.dist.txt:md5,39005ffaac22871ffaaf19656fe69c5b", + "sample4.recal.mosdepth.summary.txt:md5,68d4b98f17361fddf73052ead34fa370", + "sample4.recal.per-base.bed.gz:md5,39a1bc436aa8546c26faedbe94cb676c", + "sample4.recal.per-base.bed.gz.csi:md5,aaa7bed9e7ef873b23bca249b8b58eb9", + "sample4.recal.regions.bed.gz:md5,b7561bc56a955f7db0f11e67e2ec0386", + "sample4.recal.regions.bed.gz.csi:md5,538cb5d244411a670a4b041691f8825b", + "sample3.recal.cram.stats:md5,bcc229318527e414e69aaa5cd092ad9b", + "sample4.recal.cram.stats:md5,0d1784cb4c3f14b9858247ac6128dd03", + "sample4_vs_sample3.bed:md5,47f60179409e9389e59b2e2525e42210", + "sample4_vs_sample3.circos.txt:md5,68addb1d8bda08355842bef0ab15cd6e", + "sample4_vs_sample3.normal.mpileup.gz_control.cpn:md5,d50bf2c9a4d35f022364901c284e80ed", + "sample4_vs_sample3.p.value.txt:md5,3fad51341e7ee56c3b02de6a51d77efa", + "sample4_vs_sample3.tumor.mpileup.gz_BAF.txt:md5,723779bd103b66dcfa6fcfa692135a61", + "sample4_vs_sample3.tumor.mpileup.gz_CNVs:md5,1d9166f66bf72adf2aea74adfc4ab015", + "sample4_vs_sample3.tumor.mpileup.gz_info.txt:md5,b79da6ae026d86777d60d9f9edb9c6f6", + "sample4_vs_sample3.tumor.mpileup.gz_ratio.BedGraph:md5,cb087117ea046a6350885a34cb4bf667", + "sample4_vs_sample3.tumor.mpileup.gz_ratio.txt:md5,690cbefd87a77a6a37689135585c401c", + "sample4_vs_sample3.tumor.mpileup.gz_sample.cpn:md5,b4f97163fdb6a3d97ca4ea560394cdb1", + "sample4_vs_sample3_BAF.png:md5,16456932bb16e79c8cec4f747846c321", + "sample4_vs_sample3_ratio.log2.png:md5,89dce170acbb4c438be8359d242940df", + 
"sample4_vs_sample3_ratio.png:md5,6ac35ba93babc019c2d863f7e06b49b1" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-14T16:46:15.872055864" + }, + "Run with profile test | --tools controlfreec | tumoronly | stub": { + "content": [ + 13, + { + "ASSESS_SIGNIFICANCE": { + "controlfreec": 11.6 + }, + "FREEC2BED": { + "controlfreec": "11.6b" + }, + "FREEC2CIRCOS": { + "controlfreec": "11.6b" + }, + "FREEC_TUMORONLY": { + "controlfreec": "11.6b" + }, + "MAKEGRAPH2": { + "controlfreec": "11.6b" + }, + "SAMTOOLS_MPILEUP": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_plots", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/mosdepth", + "reports/mosdepth/sample2", + "reports/mosdepth/sample2/sample2.recal.global.dist.txt", + "reports/mosdepth/sample2/sample2.recal.per-base.bed.gz", + "reports/mosdepth/sample2/sample2.recal.per-base.bed.gz.csi", + "reports/mosdepth/sample2/sample2.recal.per-base.d4", + "reports/mosdepth/sample2/sample2.recal.quantized.bed.gz", + "reports/mosdepth/sample2/sample2.recal.quantized.bed.gz.csi", + "reports/mosdepth/sample2/sample2.recal.region.dist.txt", + "reports/mosdepth/sample2/sample2.recal.regions.bed.gz", + "reports/mosdepth/sample2/sample2.recal.regions.bed.gz.csi", + "reports/mosdepth/sample2/sample2.recal.summary.txt", + "reports/mosdepth/sample2/sample2.recal.thresholds.bed.gz", + "reports/mosdepth/sample2/sample2.recal.thresholds.bed.gz.csi", + "reports/samtools", + "reports/samtools/sample2", + "reports/samtools/sample2/sample2.recal.cram.stats", + "variant_calling", + "variant_calling/controlfreec", + "variant_calling/controlfreec/sample2", + "variant_calling/controlfreec/sample2/GC_profile.sample2.cpn", + "variant_calling/controlfreec/sample2/config.txt", + 
"variant_calling/controlfreec/sample2/sample2.bed", + "variant_calling/controlfreec/sample2/sample2.circos.txt", + "variant_calling/controlfreec/sample2/sample2.p.value.txt", + "variant_calling/controlfreec/sample2/sample2_BAF.png", + "variant_calling/controlfreec/sample2/sample2_BAF.txt", + "variant_calling/controlfreec/sample2/sample2_CNVs", + "variant_calling/controlfreec/sample2/sample2_info.txt", + "variant_calling/controlfreec/sample2/sample2_ratio.BedGraph", + "variant_calling/controlfreec/sample2/sample2_ratio.log2.png", + "variant_calling/controlfreec/sample2/sample2_ratio.png", + "variant_calling/controlfreec/sample2/sample2_ratio.txt", + "variant_calling/controlfreec/sample2/sample2_sample.cpn" + ], + [ + "sample2.recal.global.dist.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.recal.per-base.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "sample2.recal.per-base.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.recal.per-base.d4:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.recal.quantized.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "sample2.recal.quantized.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.recal.region.dist.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.recal.regions.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "sample2.recal.regions.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.recal.summary.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.recal.thresholds.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940", + "sample2.recal.thresholds.bed.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.recal.cram.stats:md5,d41d8cd98f00b204e9800998ecf8427e", + "GC_profile.sample2.cpn:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.bed:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.circos.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.p.value.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2_BAF.png:md5,d41d8cd98f00b204e9800998ecf8427e", + 
"sample2_BAF.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2_CNVs:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2_info.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2_ratio.BedGraph:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2_ratio.log2.png:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2_ratio.png:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2_ratio.txt:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2_sample.cpn:md5,d41d8cd98f00b204e9800998ecf8427e" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.10.2" + }, + "timestamp": "2024-12-10T10:33:44.36063821" + }, + "Run with profile test | --tools controlfreec --no_intervals | tumoronly": { + "content": [ + 11, + { + "ASSESS_SIGNIFICANCE": { + "controlfreec": 11.6 + }, + "FREEC2BED": { + "controlfreec": "11.6b" + }, + "FREEC2CIRCOS": { + "controlfreec": "11.6b" + }, + "FREEC_TUMORONLY": { + "controlfreec": "11.6b" + }, + "MAKEGRAPH2": { + "controlfreec": "11.6b" + }, + "SAMTOOLS_MPILEUP": { + "samtools": 1.21 + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_report.html", + "no_intervals.bed", + "no_intervals.bed.gz", + "no_intervals.bed.gz.tbi", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/mosdepth", + "reports/mosdepth/sample2", + "reports/mosdepth/sample2/sample2.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample2/sample2.recal.mosdepth.summary.txt", + "reports/mosdepth/sample2/sample2.recal.per-base.bed.gz", + "reports/mosdepth/sample2/sample2.recal.per-base.bed.gz.csi", + "reports/samtools", + "reports/samtools/sample2", + "reports/samtools/sample2/sample2.recal.cram.stats", + 
"variant_calling", + "variant_calling/controlfreec", + "variant_calling/controlfreec/sample2", + "variant_calling/controlfreec/sample2/config.txt", + "variant_calling/controlfreec/sample2/sample2.bed", + "variant_calling/controlfreec/sample2/sample2.circos.txt", + "variant_calling/controlfreec/sample2/sample2.p.value.txt", + "variant_calling/controlfreec/sample2/sample2.tumor.mpileup.gz_BAF.txt", + "variant_calling/controlfreec/sample2/sample2.tumor.mpileup.gz_CNVs", + "variant_calling/controlfreec/sample2/sample2.tumor.mpileup.gz_info.txt", + "variant_calling/controlfreec/sample2/sample2.tumor.mpileup.gz_ratio.BedGraph", + "variant_calling/controlfreec/sample2/sample2.tumor.mpileup.gz_ratio.txt", + "variant_calling/controlfreec/sample2/sample2.tumor.mpileup.gz_sample.cpn", + "variant_calling/controlfreec/sample2/sample2_BAF.png", + "variant_calling/controlfreec/sample2/sample2_ratio.log2.png", + "variant_calling/controlfreec/sample2/sample2_ratio.png" + ], + [ + "multiqc_citations.txt:md5,4c806e63a283ec1b7e78cdae3a923d4f", + "sample2.recal.mosdepth.global.dist.txt:md5,f2dcd00a64947c49e8e4b93c2f4fbf27", + "sample2.recal.mosdepth.summary.txt:md5,0a7300e56eda6fba7c7564f00aa000f0", + "sample2.recal.per-base.bed.gz:md5,39a1bc436aa8546c26faedbe94cb676c", + "sample2.recal.per-base.bed.gz.csi:md5,aaa7bed9e7ef873b23bca249b8b58eb9", + "sample2.recal.cram.stats:md5,0d1784cb4c3f14b9858247ac6128dd03", + "sample2.bed:md5,5249f46e614b60867bdd6b9b83327979", + "sample2.circos.txt:md5,2efab24d023931cec8b158c56d1f1765", + "sample2.p.value.txt:md5,38c8c9ad33a4fca3804a34d5c436cd1e", + "sample2.tumor.mpileup.gz_BAF.txt:md5,0bb91da6a637ed64d7622eb7d539fd71", + "sample2.tumor.mpileup.gz_CNVs:md5,741831784091e9a51e0c07117b67e18f", + "sample2.tumor.mpileup.gz_info.txt:md5,fed6aa0e0f4232255d5152f5774161b9", + "sample2.tumor.mpileup.gz_ratio.BedGraph:md5,d2347daecbb4eb1f1a3b5558acdf657a", + "sample2.tumor.mpileup.gz_ratio.txt:md5,7587b17b4303715aa45eae017e357c23", + 
"sample2.tumor.mpileup.gz_sample.cpn:md5,8bf25e5cf94e89bcbbd4bb0d453d3057", + "sample2_BAF.png:md5,32e1189f07f9ddb4892cae10f4003e4a", + "sample2_ratio.log2.png:md5,6d8b8cabc35d391f9a92ede6128eb378", + "sample2_ratio.png:md5,47fce58116c63d6cad34f80575e34e43" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-14T16:47:40.762784422" + } +} diff --git a/tests/variant_calling_strelka.nf.test b/tests/variant_calling_strelka.nf.test new file mode 100644 index 0000000000..0411c409fc --- /dev/null +++ b/tests/variant_calling_strelka.nf.test @@ -0,0 +1,215 @@ +nextflow_pipeline { + + name "Test pipeline" + script "../main.nf" + tag "pipeline" + tag "pipeline_sarek" + + test("Run with profile test | --tools strelka --only_paired_variant_calling") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta' + fasta_fai = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai' + intervals = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed' + input = "${projectDir}/tests/csv/3.0/recalibrated.csv" + outdir = "$outputDir" + step = "variant_calling" + tools = 'strelka' + only_paired_variant_calling = true + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 
'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --tools strelka | germline") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta' + fasta_fai = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai' + intervals = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed' + input = "${projectDir}/tests/csv/3.0/recalibrated_germline.csv" + outdir = "$outputDir" + step = "variant_calling" + tools = 'strelka' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of 
successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --tools strelka --no_intervals | germline") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta' + fasta_fai = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai' + intervals = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed' + input = "${projectDir}/tests/csv/3.0/recalibrated_germline.csv" + outdir = "$outputDir" + no_intervals = true + step = "variant_calling" + tools = 'strelka' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for 
multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --tools strelka | somatic") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta' + fasta_fai = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai' + intervals = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed' + input = "${projectDir}/tests/csv/3.0/recalibrated_somatic.csv" + outdir = "$outputDir" + step = "variant_calling" + tools = 'strelka' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + 
removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + // All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } + + test("Run with profile test | --tools strelka --no_intervals | somatic") { + + when { + params { + modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' + fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta' + fasta_fai = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai' + intervals = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed' + input = "${projectDir}/tests/csv/3.0/recalibrated_somatic.csv" + outdir = "$outputDir" + no_intervals = true + step = "variant_calling" + tools = 'strelka' + } + } + + then { + // stable_name: All files + folders in ${params.outdir}/ with a stable name + def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}']) + // stable_path: All files in ${params.outdir}/ with stable content + def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore') + // cram_files: All cram files + def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram']) + def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta' + assertAll( + { assert workflow.success}, + { assert snapshot( + // Number of successful tasks + workflow.trace.succeeded().size(), + // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions + removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"), + 
// All stable path name, with a relative path + stable_name, + // All files with stable contents + stable_path, + // All cram files + cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] } + ).match() } + ) + } + } +} diff --git a/tests/variant_calling_strelka.nf.test.snap b/tests/variant_calling_strelka.nf.test.snap new file mode 100644 index 0000000000..c72841c0f2 --- /dev/null +++ b/tests/variant_calling_strelka.nf.test.snap @@ -0,0 +1,835 @@ +{ + "Run with profile test | --tools strelka --only_paired_variant_calling": { + "content": [ + 26, + { + "BCFTOOLS_STATS": { + "bcftools": 1.2 + }, + "STRELKA_SINGLE": { + "strelka": "2.9.10" + }, + "STRELKA_SOMATIC": { + "strelka": "2.9.10" + }, + "VCFTOOLS_TSTV_COUNT": { + "vcftools": "0.1.16" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/variantcalled.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/bcftools-stats-subtypes.txt", + "multiqc/multiqc_data/bcftools_stats_indel-lengths.txt", + "multiqc/multiqc_data/bcftools_stats_variant_depths.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Indels.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_SNP.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transitions.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transversions.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_bcftools_stats.txt", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/vcftools_tstv_by_count.txt", + "multiqc/multiqc_data/vcftools_tstv_by_qual.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-cnt.pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-pct.pdf", + 
"multiqc/multiqc_plots/pdf/bcftools_stats_indel-lengths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_variant_depths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Indels.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_SNP.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transitions.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transversions.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_count.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_qual.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-cnt.png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-pct.png", + "multiqc/multiqc_plots/png/bcftools_stats_indel-lengths.png", + "multiqc/multiqc_plots/png/bcftools_stats_variant_depths.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Indels.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_SNP.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transitions.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transversions.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_count.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_qual.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-cnt.svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-pct.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_indel-lengths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_variant_depths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Indels.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_SNP.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transitions.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transversions.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_count.svg", + 
"multiqc/multiqc_plots/svg/vcftools_tstv_by_qual.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/bcftools", + "reports/bcftools/strelka", + "reports/bcftools/strelka/sample1", + "reports/bcftools/strelka/sample1/sample1.strelka.variants.bcftools_stats.txt", + "reports/bcftools/strelka/sample4_vs_sample3", + "reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.bcftools_stats.txt", + "reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.bcftools_stats.txt", + "reports/mosdepth", + "reports/mosdepth/sample1", + "reports/mosdepth/sample1/sample1.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample1/sample1.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample1/sample1.recal.mosdepth.summary.txt", + "reports/mosdepth/sample1/sample1.recal.regions.bed.gz", + "reports/mosdepth/sample1/sample1.recal.regions.bed.gz.csi", + "reports/mosdepth/sample2", + "reports/mosdepth/sample2/sample2.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample2/sample2.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample2/sample2.recal.mosdepth.summary.txt", + "reports/mosdepth/sample2/sample2.recal.regions.bed.gz", + "reports/mosdepth/sample2/sample2.recal.regions.bed.gz.csi", + "reports/mosdepth/sample3", + "reports/mosdepth/sample3/sample3.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample3/sample3.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample3/sample3.recal.mosdepth.summary.txt", + "reports/mosdepth/sample3/sample3.recal.regions.bed.gz", + "reports/mosdepth/sample3/sample3.recal.regions.bed.gz.csi", + "reports/mosdepth/sample4", + "reports/mosdepth/sample4/sample4.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample4/sample4.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample4/sample4.recal.mosdepth.summary.txt", + 
"reports/mosdepth/sample4/sample4.recal.regions.bed.gz", + "reports/mosdepth/sample4/sample4.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/sample1", + "reports/samtools/sample1/sample1.recal.cram.stats", + "reports/samtools/sample2", + "reports/samtools/sample2/sample2.recal.cram.stats", + "reports/samtools/sample3", + "reports/samtools/sample3/sample3.recal.cram.stats", + "reports/samtools/sample4", + "reports/samtools/sample4/sample4.recal.cram.stats", + "reports/vcftools", + "reports/vcftools/strelka", + "reports/vcftools/strelka/sample1", + "reports/vcftools/strelka/sample1/sample1.strelka.variants.FILTER.summary", + "reports/vcftools/strelka/sample1/sample1.strelka.variants.TsTv.count", + "reports/vcftools/strelka/sample1/sample1.strelka.variants.TsTv.qual", + "reports/vcftools/strelka/sample4_vs_sample3", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.FILTER.summary", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.count", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.qual", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.FILTER.summary", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.count", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.qual", + "variant_calling", + "variant_calling/strelka", + "variant_calling/strelka/sample1", + "variant_calling/strelka/sample1/sample1.strelka.genome.vcf.gz", + "variant_calling/strelka/sample1/sample1.strelka.genome.vcf.gz.tbi", + "variant_calling/strelka/sample1/sample1.strelka.variants.vcf.gz", + "variant_calling/strelka/sample1/sample1.strelka.variants.vcf.gz.tbi", + "variant_calling/strelka/sample4_vs_sample3", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz", + 
"variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz.tbi", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz.tbi" + ], + [ + "bcftools-stats-subtypes.txt:md5,d4041869deeadd04932d77f1f376ba9d", + "bcftools_stats_indel-lengths.txt:md5,b649aa1a831505ab7909723e39d617b7", + "bcftools_stats_variant_depths.txt:md5,b4360600be4ee46148d30c428ec9e330", + "bcftools_stats_vqc_Count_Indels.txt:md5,a5a6f0a7bdd11356896815bc2a469824", + "bcftools_stats_vqc_Count_SNP.txt:md5,831b748f17546365d4a2a5311845d9a4", + "bcftools_stats_vqc_Count_Transitions.txt:md5,b1dbb01ad7ae16cb41da5e5a6d17f537", + "bcftools_stats_vqc_Count_Transversions.txt:md5,ddf32616fcb4dbb2f688dc7b6dae1b68", + "multiqc_bcftools_stats.txt:md5,b2bbb90ef05598e1a70cf36be416c3e2", + "multiqc_citations.txt:md5,ac2b3cf2dfb12c40837b9bbad8112d86", + "vcftools_tstv_by_count.txt:md5,d4fdc9722fec46722bda2d04be6801d0", + "sample1.strelka.variants.bcftools_stats.txt:md5,2936f10f99295fb772d8c35f246e223d", + "sample4_vs_sample3.strelka.somatic_indels.bcftools_stats.txt:md5,1c57e5cd6424157536276002ef1a58d6", + "sample4_vs_sample3.strelka.somatic_snvs.bcftools_stats.txt:md5,8cf6d0b3f41436cd2f2aa09c9831764d", + "sample1.recal.mosdepth.global.dist.txt:md5,d9a4dd6429560b2b647da346050766c5", + "sample1.recal.mosdepth.region.dist.txt:md5,1f3dab381958e08eb00f7c5e1135f677", + "sample1.recal.mosdepth.summary.txt:md5,d7676e7c1de851b0ee5185d21096123b", + "sample1.recal.regions.bed.gz:md5,6edeb8f7041a4403cb73651744b5bc82", + "sample1.recal.regions.bed.gz.csi:md5,f17cc9d960aa4a1e96548d570585cc8a", + "sample2.recal.mosdepth.global.dist.txt:md5,53f9ae9ab5002ffba340fa8cef7d70e4", + "sample2.recal.mosdepth.region.dist.txt:md5,17600d21ac453506c52249cf435ad8ea", + "sample2.recal.mosdepth.summary.txt:md5,7141030385af1f653718c9e0c9a5be80", + 
"sample2.recal.regions.bed.gz:md5,c680c5d75f0cea068e3f917f4cf9bf52", + "sample2.recal.regions.bed.gz.csi:md5,4003c8833ed5e9b9f45282a6915c935e", + "sample3.recal.mosdepth.global.dist.txt:md5,d9a4dd6429560b2b647da346050766c5", + "sample3.recal.mosdepth.region.dist.txt:md5,1f3dab381958e08eb00f7c5e1135f677", + "sample3.recal.mosdepth.summary.txt:md5,d7676e7c1de851b0ee5185d21096123b", + "sample3.recal.regions.bed.gz:md5,6edeb8f7041a4403cb73651744b5bc82", + "sample3.recal.regions.bed.gz.csi:md5,f17cc9d960aa4a1e96548d570585cc8a", + "sample4.recal.mosdepth.global.dist.txt:md5,53f9ae9ab5002ffba340fa8cef7d70e4", + "sample4.recal.mosdepth.region.dist.txt:md5,17600d21ac453506c52249cf435ad8ea", + "sample4.recal.mosdepth.summary.txt:md5,7141030385af1f653718c9e0c9a5be80", + "sample4.recal.regions.bed.gz:md5,c680c5d75f0cea068e3f917f4cf9bf52", + "sample4.recal.regions.bed.gz.csi:md5,4003c8833ed5e9b9f45282a6915c935e", + "sample1.recal.cram.stats:md5,bcc229318527e414e69aaa5cd092ad9b", + "sample2.recal.cram.stats:md5,0d1784cb4c3f14b9858247ac6128dd03", + "sample3.recal.cram.stats:md5,bcc229318527e414e69aaa5cd092ad9b", + "sample4.recal.cram.stats:md5,0d1784cb4c3f14b9858247ac6128dd03", + "sample1.strelka.variants.FILTER.summary:md5,fef8aeadd3b0f3b8c040c0da03bf1cbd", + "sample1.strelka.variants.TsTv.count:md5,c5b7a8eda2526d899098439ae4c06a49", + "sample4_vs_sample3.strelka.somatic_indels.FILTER.summary:md5,30a45e2bc87f40c89388032cbf75ec65", + "sample4_vs_sample3.strelka.somatic_indels.TsTv.count:md5,8dcfdbcaac118df1d5ad407dd2af699f", + "sample4_vs_sample3.strelka.somatic_snvs.FILTER.summary:md5,4fc17fa5625b4d1dcc5d791b1eb22d85", + "sample4_vs_sample3.strelka.somatic_snvs.TsTv.count:md5,fc7af1f534890c4ad3025588b3af62ae" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-15T22:22:40.803241819" + }, + "Run with profile test | --tools strelka --no_intervals | somatic": { + "content": [ + 20, + { + "BCFTOOLS_STATS": { + "bcftools": 1.2 + 
}, + "STRELKA_SINGLE": { + "strelka": "2.9.10" + }, + "STRELKA_SOMATIC": { + "strelka": "2.9.10" + }, + "VCFTOOLS_TSTV_COUNT": { + "vcftools": "0.1.16" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/variantcalled.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/bcftools-stats-subtypes.txt", + "multiqc/multiqc_data/bcftools_stats_indel-lengths.txt", + "multiqc/multiqc_data/bcftools_stats_variant_depths.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Indels.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_SNP.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transitions.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transversions.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_bcftools_stats.txt", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/vcftools_tstv_by_count.txt", + "multiqc/multiqc_data/vcftools_tstv_by_qual.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-cnt.pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-pct.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_indel-lengths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_variant_depths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Indels.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_SNP.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transitions.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transversions.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_count.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_qual.pdf", + "multiqc/multiqc_plots/png", + 
"multiqc/multiqc_plots/png/bcftools-stats-subtypes-cnt.png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-pct.png", + "multiqc/multiqc_plots/png/bcftools_stats_indel-lengths.png", + "multiqc/multiqc_plots/png/bcftools_stats_variant_depths.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Indels.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_SNP.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transitions.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transversions.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_count.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_qual.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-cnt.svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-pct.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_indel-lengths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_variant_depths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Indels.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_SNP.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transitions.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transversions.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_count.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_qual.svg", + "multiqc/multiqc_report.html", + "no_intervals.bed", + "no_intervals.bed.gz", + "no_intervals.bed.gz.tbi", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/bcftools", + "reports/bcftools/strelka", + "reports/bcftools/strelka/sample3", + "reports/bcftools/strelka/sample3/sample3.strelka.variants.bcftools_stats.txt", + "reports/bcftools/strelka/sample4_vs_sample3", + "reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.bcftools_stats.txt", + 
"reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.bcftools_stats.txt", + "reports/mosdepth", + "reports/mosdepth/sample3", + "reports/mosdepth/sample3/sample3.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample3/sample3.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample3/sample3.recal.mosdepth.summary.txt", + "reports/mosdepth/sample3/sample3.recal.regions.bed.gz", + "reports/mosdepth/sample3/sample3.recal.regions.bed.gz.csi", + "reports/mosdepth/sample4", + "reports/mosdepth/sample4/sample4.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample4/sample4.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample4/sample4.recal.mosdepth.summary.txt", + "reports/mosdepth/sample4/sample4.recal.regions.bed.gz", + "reports/mosdepth/sample4/sample4.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/sample3", + "reports/samtools/sample3/sample3.recal.cram.stats", + "reports/samtools/sample4", + "reports/samtools/sample4/sample4.recal.cram.stats", + "reports/vcftools", + "reports/vcftools/strelka", + "reports/vcftools/strelka/sample3", + "reports/vcftools/strelka/sample3/sample3.strelka.variants.FILTER.summary", + "reports/vcftools/strelka/sample3/sample3.strelka.variants.TsTv.count", + "reports/vcftools/strelka/sample3/sample3.strelka.variants.TsTv.qual", + "reports/vcftools/strelka/sample4_vs_sample3", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.FILTER.summary", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.count", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.qual", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.FILTER.summary", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.count", + 
"reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.qual", + "variant_calling", + "variant_calling/strelka", + "variant_calling/strelka/sample3", + "variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz", + "variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz.tbi", + "variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz", + "variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz.tbi", + "variant_calling/strelka/sample4_vs_sample3", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz.tbi", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz.tbi" + ], + [ + "bcftools-stats-subtypes.txt:md5,da49d6b55f73aea5ae4af253e510a9dd", + "bcftools_stats_indel-lengths.txt:md5,9a21a12ae8ba2eb34a09b902e1d4dfbc", + "bcftools_stats_variant_depths.txt:md5,201388db8c5b940aaf05735106f63980", + "bcftools_stats_vqc_Count_Indels.txt:md5,90076001a4996b4f015ee7cfdfdc0f2d", + "bcftools_stats_vqc_Count_SNP.txt:md5,5bd2de6cab28b6ac4bda66a97e73ec4c", + "bcftools_stats_vqc_Count_Transitions.txt:md5,b46f5fae18ee0526c636e89979c10c5e", + "bcftools_stats_vqc_Count_Transversions.txt:md5,4a127df36efe1b68b7ae93b8c71dd7b1", + "multiqc_bcftools_stats.txt:md5,e47ab6f334efa3a95dbd609a27711375", + "multiqc_citations.txt:md5,ac2b3cf2dfb12c40837b9bbad8112d86", + "vcftools_tstv_by_count.txt:md5,4244903e90d55bc6cd0cb9f6efcd8a80", + "sample3.strelka.variants.bcftools_stats.txt:md5,c75a458b1aa0e1bae3b667d48684e13c", + "sample4_vs_sample3.strelka.somatic_indels.bcftools_stats.txt:md5,1c57e5cd6424157536276002ef1a58d6", + "sample4_vs_sample3.strelka.somatic_snvs.bcftools_stats.txt:md5,110810e5702ef11bc002bd0948dbdfff", + 
"sample3.recal.mosdepth.global.dist.txt:md5,d9a4dd6429560b2b647da346050766c5", + "sample3.recal.mosdepth.region.dist.txt:md5,1f3dab381958e08eb00f7c5e1135f677", + "sample3.recal.mosdepth.summary.txt:md5,d7676e7c1de851b0ee5185d21096123b", + "sample3.recal.regions.bed.gz:md5,6edeb8f7041a4403cb73651744b5bc82", + "sample3.recal.regions.bed.gz.csi:md5,f17cc9d960aa4a1e96548d570585cc8a", + "sample4.recal.mosdepth.global.dist.txt:md5,53f9ae9ab5002ffba340fa8cef7d70e4", + "sample4.recal.mosdepth.region.dist.txt:md5,17600d21ac453506c52249cf435ad8ea", + "sample4.recal.mosdepth.summary.txt:md5,7141030385af1f653718c9e0c9a5be80", + "sample4.recal.regions.bed.gz:md5,c680c5d75f0cea068e3f917f4cf9bf52", + "sample4.recal.regions.bed.gz.csi:md5,4003c8833ed5e9b9f45282a6915c935e", + "sample3.recal.cram.stats:md5,bcc229318527e414e69aaa5cd092ad9b", + "sample4.recal.cram.stats:md5,0d1784cb4c3f14b9858247ac6128dd03", + "sample3.strelka.variants.FILTER.summary:md5,8697a0a983314e98b99b5f6038af65f6", + "sample3.strelka.variants.TsTv.count:md5,1481854d2a765f5641856ecf95ca4097", + "sample4_vs_sample3.strelka.somatic_indels.FILTER.summary:md5,30a45e2bc87f40c89388032cbf75ec65", + "sample4_vs_sample3.strelka.somatic_indels.TsTv.count:md5,8dcfdbcaac118df1d5ad407dd2af699f", + "sample4_vs_sample3.strelka.somatic_snvs.FILTER.summary:md5,7a81b11aa29fec73d5bc872b7b58f8aa", + "sample4_vs_sample3.strelka.somatic_snvs.TsTv.count:md5,a922c51ca3b2ea7cdcfa09e9c8c55d52" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-15T22:29:05.475529758" + }, + "Run with profile test | --tools strelka | somatic": { + "content": [ + 22, + { + "BCFTOOLS_STATS": { + "bcftools": 1.2 + }, + "STRELKA_SINGLE": { + "strelka": "2.9.10" + }, + "STRELKA_SOMATIC": { + "strelka": "2.9.10" + }, + "VCFTOOLS_TSTV_COUNT": { + "vcftools": "0.1.16" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/variantcalled.csv", + "multiqc", + "multiqc/multiqc_data", + 
"multiqc/multiqc_data/bcftools-stats-subtypes.txt", + "multiqc/multiqc_data/bcftools_stats_indel-lengths.txt", + "multiqc/multiqc_data/bcftools_stats_variant_depths.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Indels.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_SNP.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transitions.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transversions.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_bcftools_stats.txt", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/vcftools_tstv_by_count.txt", + "multiqc/multiqc_data/vcftools_tstv_by_qual.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-cnt.pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-pct.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_indel-lengths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_variant_depths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Indels.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_SNP.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transitions.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transversions.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_count.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_qual.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-cnt.png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-pct.png", + "multiqc/multiqc_plots/png/bcftools_stats_indel-lengths.png", + "multiqc/multiqc_plots/png/bcftools_stats_variant_depths.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Indels.png", + 
"multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_SNP.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transitions.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transversions.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_count.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_qual.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-cnt.svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-pct.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_indel-lengths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_variant_depths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Indels.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_SNP.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transitions.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transversions.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_count.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_qual.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/bcftools", + "reports/bcftools/strelka", + "reports/bcftools/strelka/sample3", + "reports/bcftools/strelka/sample3/sample3.strelka.variants.bcftools_stats.txt", + "reports/bcftools/strelka/sample4_vs_sample3", + "reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.bcftools_stats.txt", + "reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.bcftools_stats.txt", + "reports/mosdepth", + "reports/mosdepth/sample3", + "reports/mosdepth/sample3/sample3.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample3/sample3.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample3/sample3.recal.mosdepth.summary.txt", + 
"reports/mosdepth/sample3/sample3.recal.regions.bed.gz", + "reports/mosdepth/sample3/sample3.recal.regions.bed.gz.csi", + "reports/mosdepth/sample4", + "reports/mosdepth/sample4/sample4.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample4/sample4.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample4/sample4.recal.mosdepth.summary.txt", + "reports/mosdepth/sample4/sample4.recal.regions.bed.gz", + "reports/mosdepth/sample4/sample4.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/sample3", + "reports/samtools/sample3/sample3.recal.cram.stats", + "reports/samtools/sample4", + "reports/samtools/sample4/sample4.recal.cram.stats", + "reports/vcftools", + "reports/vcftools/strelka", + "reports/vcftools/strelka/sample3", + "reports/vcftools/strelka/sample3/sample3.strelka.variants.FILTER.summary", + "reports/vcftools/strelka/sample3/sample3.strelka.variants.TsTv.count", + "reports/vcftools/strelka/sample3/sample3.strelka.variants.TsTv.qual", + "reports/vcftools/strelka/sample4_vs_sample3", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.FILTER.summary", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.count", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.qual", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.FILTER.summary", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.count", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.qual", + "variant_calling", + "variant_calling/strelka", + "variant_calling/strelka/sample3", + "variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz", + "variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz.tbi", + "variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz", + 
"variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz.tbi", + "variant_calling/strelka/sample4_vs_sample3", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz.tbi", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz.tbi" + ], + [ + "bcftools-stats-subtypes.txt:md5,d508119a3abde1e1f737cda79bbdbdbc", + "bcftools_stats_indel-lengths.txt:md5,668fd3e3d2c68e4a084ee1099980906f", + "bcftools_stats_variant_depths.txt:md5,b4360600be4ee46148d30c428ec9e330", + "bcftools_stats_vqc_Count_Indels.txt:md5,def7f6a03fac61287483a7c86d44fe49", + "bcftools_stats_vqc_Count_SNP.txt:md5,a014a7c6d4fb7bea63e89da4cfeef2a9", + "bcftools_stats_vqc_Count_Transitions.txt:md5,ec192419e14a2aeb0c18742879bef563", + "bcftools_stats_vqc_Count_Transversions.txt:md5,b7ca7c37370d2db7052efe21c41d81e9", + "multiqc_bcftools_stats.txt:md5,c2674bda60d8fa805601c436afe16f18", + "multiqc_citations.txt:md5,ac2b3cf2dfb12c40837b9bbad8112d86", + "vcftools_tstv_by_count.txt:md5,3888b9a69fdeaf6667dcc600ff776413", + "sample3.strelka.variants.bcftools_stats.txt:md5,d505d381e8c3906788e4135bb975ff84", + "sample4_vs_sample3.strelka.somatic_indels.bcftools_stats.txt:md5,1c57e5cd6424157536276002ef1a58d6", + "sample4_vs_sample3.strelka.somatic_snvs.bcftools_stats.txt:md5,8cf6d0b3f41436cd2f2aa09c9831764d", + "sample3.recal.mosdepth.global.dist.txt:md5,d9a4dd6429560b2b647da346050766c5", + "sample3.recal.mosdepth.region.dist.txt:md5,1f3dab381958e08eb00f7c5e1135f677", + "sample3.recal.mosdepth.summary.txt:md5,d7676e7c1de851b0ee5185d21096123b", + "sample3.recal.regions.bed.gz:md5,6edeb8f7041a4403cb73651744b5bc82", + "sample3.recal.regions.bed.gz.csi:md5,f17cc9d960aa4a1e96548d570585cc8a", + 
"sample4.recal.mosdepth.global.dist.txt:md5,53f9ae9ab5002ffba340fa8cef7d70e4", + "sample4.recal.mosdepth.region.dist.txt:md5,17600d21ac453506c52249cf435ad8ea", + "sample4.recal.mosdepth.summary.txt:md5,7141030385af1f653718c9e0c9a5be80", + "sample4.recal.regions.bed.gz:md5,c680c5d75f0cea068e3f917f4cf9bf52", + "sample4.recal.regions.bed.gz.csi:md5,4003c8833ed5e9b9f45282a6915c935e", + "sample3.recal.cram.stats:md5,bcc229318527e414e69aaa5cd092ad9b", + "sample4.recal.cram.stats:md5,0d1784cb4c3f14b9858247ac6128dd03", + "sample3.strelka.variants.FILTER.summary:md5,fef8aeadd3b0f3b8c040c0da03bf1cbd", + "sample3.strelka.variants.TsTv.count:md5,c5b7a8eda2526d899098439ae4c06a49", + "sample4_vs_sample3.strelka.somatic_indels.FILTER.summary:md5,30a45e2bc87f40c89388032cbf75ec65", + "sample4_vs_sample3.strelka.somatic_indels.TsTv.count:md5,8dcfdbcaac118df1d5ad407dd2af699f", + "sample4_vs_sample3.strelka.somatic_snvs.FILTER.summary:md5,4fc17fa5625b4d1dcc5d791b1eb22d85", + "sample4_vs_sample3.strelka.somatic_snvs.TsTv.count:md5,fc7af1f534890c4ad3025588b3af62ae" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-15T22:27:18.76415046" + }, + "Run with profile test | --tools strelka | germline": { + "content": [ + 11, + { + "BCFTOOLS_STATS": { + "bcftools": 1.2 + }, + "STRELKA_SINGLE": { + "strelka": "2.9.10" + }, + "VCFTOOLS_TSTV_COUNT": { + "vcftools": "0.1.16" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/variantcalled.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/bcftools-stats-subtypes.txt", + "multiqc/multiqc_data/bcftools_stats_indel-lengths.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Indels.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_SNP.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transitions.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transversions.txt", + "multiqc/multiqc_data/multiqc.log", + 
"multiqc/multiqc_data/multiqc_bcftools_stats.txt", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/vcftools_tstv_by_count.txt", + "multiqc/multiqc_data/vcftools_tstv_by_qual.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-cnt.pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-pct.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_indel-lengths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Indels.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_SNP.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transitions.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transversions.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_count.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_qual.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-cnt.png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-pct.png", + "multiqc/multiqc_plots/png/bcftools_stats_indel-lengths.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Indels.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_SNP.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transitions.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transversions.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_count.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_qual.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-cnt.svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-pct.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_indel-lengths.svg", + 
"multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Indels.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_SNP.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transitions.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transversions.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_count.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_qual.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/bcftools", + "reports/bcftools/strelka", + "reports/bcftools/strelka/sample1", + "reports/bcftools/strelka/sample1/sample1.strelka.variants.bcftools_stats.txt", + "reports/mosdepth", + "reports/mosdepth/sample1", + "reports/mosdepth/sample1/sample1.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample1/sample1.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample1/sample1.recal.mosdepth.summary.txt", + "reports/mosdepth/sample1/sample1.recal.regions.bed.gz", + "reports/mosdepth/sample1/sample1.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/sample1", + "reports/samtools/sample1/sample1.recal.cram.stats", + "reports/vcftools", + "reports/vcftools/strelka", + "reports/vcftools/strelka/sample1", + "reports/vcftools/strelka/sample1/sample1.strelka.variants.FILTER.summary", + "reports/vcftools/strelka/sample1/sample1.strelka.variants.TsTv.count", + "reports/vcftools/strelka/sample1/sample1.strelka.variants.TsTv.qual", + "variant_calling", + "variant_calling/strelka", + "variant_calling/strelka/sample1", + "variant_calling/strelka/sample1/sample1.strelka.genome.vcf.gz", + "variant_calling/strelka/sample1/sample1.strelka.genome.vcf.gz.tbi", + "variant_calling/strelka/sample1/sample1.strelka.variants.vcf.gz", + "variant_calling/strelka/sample1/sample1.strelka.variants.vcf.gz.tbi" + ], + [ + "bcftools-stats-subtypes.txt:md5,86e480b8a6d2717e9d7094f592d7a9fd", 
+ "bcftools_stats_indel-lengths.txt:md5,dbace4ab8202077acde658b8502ff121", + "bcftools_stats_vqc_Count_Indels.txt:md5,39a35f0514206b8b11850eafbd784497", + "bcftools_stats_vqc_Count_SNP.txt:md5,e8ba3f061de6f390e1638a825c21923e", + "bcftools_stats_vqc_Count_Transitions.txt:md5,b05927211410e013c298137c6ae451df", + "bcftools_stats_vqc_Count_Transversions.txt:md5,f6b8e4b89945ce8cae3ddb94eeed63d7", + "multiqc_bcftools_stats.txt:md5,cde557db5eb46f8325c06e9bb32e7a94", + "multiqc_citations.txt:md5,ac2b3cf2dfb12c40837b9bbad8112d86", + "vcftools_tstv_by_count.txt:md5,69cbd9060c01c1102ddf5839ee7b404a", + "sample1.strelka.variants.bcftools_stats.txt:md5,2936f10f99295fb772d8c35f246e223d", + "sample1.recal.mosdepth.global.dist.txt:md5,d9a4dd6429560b2b647da346050766c5", + "sample1.recal.mosdepth.region.dist.txt:md5,1f3dab381958e08eb00f7c5e1135f677", + "sample1.recal.mosdepth.summary.txt:md5,d7676e7c1de851b0ee5185d21096123b", + "sample1.recal.regions.bed.gz:md5,6edeb8f7041a4403cb73651744b5bc82", + "sample1.recal.regions.bed.gz.csi:md5,f17cc9d960aa4a1e96548d570585cc8a", + "sample1.recal.cram.stats:md5,bcc229318527e414e69aaa5cd092ad9b", + "sample1.strelka.variants.FILTER.summary:md5,fef8aeadd3b0f3b8c040c0da03bf1cbd", + "sample1.strelka.variants.TsTv.count:md5,c5b7a8eda2526d899098439ae4c06a49" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-15T22:24:13.636617483" + }, + "Run with profile test | --tools strelka --no_intervals | germline": { + "content": [ + 9, + { + "BCFTOOLS_STATS": { + "bcftools": 1.2 + }, + "STRELKA_SINGLE": { + "strelka": "2.9.10" + }, + "VCFTOOLS_TSTV_COUNT": { + "vcftools": "0.1.16" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/variantcalled.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/bcftools-stats-subtypes.txt", + "multiqc/multiqc_data/bcftools_stats_indel-lengths.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Indels.txt", + 
"multiqc/multiqc_data/bcftools_stats_vqc_Count_SNP.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transitions.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transversions.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_bcftools_stats.txt", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/vcftools_tstv_by_count.txt", + "multiqc/multiqc_data/vcftools_tstv_by_qual.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-cnt.pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-pct.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_indel-lengths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Indels.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_SNP.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transitions.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transversions.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_count.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_qual.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-cnt.png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-pct.png", + "multiqc/multiqc_plots/png/bcftools_stats_indel-lengths.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Indels.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_SNP.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transitions.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transversions.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_count.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_qual.png", + 
"multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-cnt.svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-pct.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_indel-lengths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Indels.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_SNP.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transitions.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transversions.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_count.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_qual.svg", + "multiqc/multiqc_report.html", + "no_intervals.bed", + "no_intervals.bed.gz", + "no_intervals.bed.gz.tbi", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/bcftools", + "reports/bcftools/strelka", + "reports/bcftools/strelka/sample1", + "reports/bcftools/strelka/sample1/sample1.strelka.variants.bcftools_stats.txt", + "reports/mosdepth", + "reports/mosdepth/sample1", + "reports/mosdepth/sample1/sample1.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample1/sample1.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample1/sample1.recal.mosdepth.summary.txt", + "reports/mosdepth/sample1/sample1.recal.regions.bed.gz", + "reports/mosdepth/sample1/sample1.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/sample1", + "reports/samtools/sample1/sample1.recal.cram.stats", + "reports/vcftools", + "reports/vcftools/strelka", + "reports/vcftools/strelka/sample1", + "reports/vcftools/strelka/sample1/sample1.strelka.variants.FILTER.summary", + "reports/vcftools/strelka/sample1/sample1.strelka.variants.TsTv.count", + "reports/vcftools/strelka/sample1/sample1.strelka.variants.TsTv.qual", + "variant_calling", + "variant_calling/strelka", + "variant_calling/strelka/sample1", + 
"variant_calling/strelka/sample1/sample1.strelka.genome.vcf.gz", + "variant_calling/strelka/sample1/sample1.strelka.genome.vcf.gz.tbi", + "variant_calling/strelka/sample1/sample1.strelka.variants.vcf.gz", + "variant_calling/strelka/sample1/sample1.strelka.variants.vcf.gz.tbi" + ], + [ + "bcftools-stats-subtypes.txt:md5,4857f0b890dbe13ef936c7f0de106e89", + "bcftools_stats_indel-lengths.txt:md5,59c9a8c7da8958b5ae0f66979ce87f0d", + "bcftools_stats_vqc_Count_Indels.txt:md5,e71d6ad11b1232d8b1e5de78b1996066", + "bcftools_stats_vqc_Count_SNP.txt:md5,bdc61a25e4dcfe51df06b7d56ede3908", + "bcftools_stats_vqc_Count_Transitions.txt:md5,a12e39e38f11e764758367a290e6de9c", + "bcftools_stats_vqc_Count_Transversions.txt:md5,469dcb73eab6ef2e3c3e783eddf04985", + "multiqc_bcftools_stats.txt:md5,23c19e033f3086bec87c5e2022795e79", + "multiqc_citations.txt:md5,ac2b3cf2dfb12c40837b9bbad8112d86", + "vcftools_tstv_by_count.txt:md5,ea99842f730543b5fd1a08d0e1d68278", + "sample1.strelka.variants.bcftools_stats.txt:md5,0e829f5d31d768a8e99786786282c9ef", + "sample1.recal.mosdepth.global.dist.txt:md5,d9a4dd6429560b2b647da346050766c5", + "sample1.recal.mosdepth.region.dist.txt:md5,1f3dab381958e08eb00f7c5e1135f677", + "sample1.recal.mosdepth.summary.txt:md5,d7676e7c1de851b0ee5185d21096123b", + "sample1.recal.regions.bed.gz:md5,6edeb8f7041a4403cb73651744b5bc82", + "sample1.recal.regions.bed.gz.csi:md5,f17cc9d960aa4a1e96548d570585cc8a", + "sample1.recal.cram.stats:md5,bcc229318527e414e69aaa5cd092ad9b", + "sample1.strelka.variants.FILTER.summary:md5,8697a0a983314e98b99b5f6038af65f6", + "sample1.strelka.variants.TsTv.count:md5,1481854d2a765f5641856ecf95ca4097" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-15T22:25:42.940353635" + } +} diff --git a/tests/variant_calling_strelka_bp.nf.test b/tests/variant_calling_strelka_bp.nf.test new file mode 100644 index 0000000000..3e03ea85f4 --- /dev/null +++ b/tests/variant_calling_strelka_bp.nf.test @@ 
-0,0 +1,90 @@
+nextflow_pipeline {
+
+    name "Test pipeline"
+    script "../main.nf"
+    tag "pipeline"
+    tag "pipeline_sarek"
+
+    test("Run with profile test | --tools manta,strelka | somatic") {
+
+        when {
+            params {
+                modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/'
+                fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta'
+                fasta_fai = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai'
+                intervals = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed'
+                input = "${projectDir}/tests/csv/3.0/recalibrated_somatic.csv"
+                outdir = "$outputDir"
+                step = "variant_calling"
+                tools = 'manta,strelka'
+            }
+        }
+
+        then {
+            // stable_name: All files + folders in ${params.outdir}/ with a stable name
+            def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}'])
+            // stable_path: All files in ${params.outdir}/ with stable content
+            def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore')
+            // cram_files: All cram files
+            def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram'])
+            def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta'
+            assertAll(
+                { assert workflow.success},
+                { assert snapshot(
+                    // Number of successful tasks
+                    workflow.trace.succeeded().size(),
+                    // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions
+                    removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"),
+                    // All stable path name, with a relative path
+                    stable_name,
+                    // All files with stable contents
+                    stable_path,
+                    // All cram files
+                    cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] }
+                ).match() }
+            )
+        }
+    }
+
+    test("Run with profile test | --tools manta,strelka --no_intervals | somatic") {
+
+        when {
+            params {
+                modules_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/'
+                fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta'
+                fasta_fai = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome.fasta.fai'
+                intervals = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed'
+                input = "${projectDir}/tests/csv/3.0/recalibrated_somatic.csv"
+                outdir = "$outputDir"
+                no_intervals = true
+                step = "variant_calling"
+                tools = 'manta,strelka'
+            }
+        }
+
+        then {
+            // stable_name: All files + folders in ${params.outdir}/ with a stable name
+            def stable_name = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}'])
+            // stable_path: All files in ${params.outdir}/ with stable content
+            def stable_path = getAllFilesFromDir(params.outdir, ignoreFile: 'tests/.nftignore')
+            // cram_files: All cram files
+            def cram_files = getAllFilesFromDir(params.outdir, include: ['**/*.cram'])
+            def fasta = params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta'
+            assertAll(
+                { assert workflow.success},
+                { assert snapshot(
+                    // Number of successful tasks
+                    workflow.trace.succeeded().size(),
+                    // pipeline versions.yml file for multiqc from which Nextflow version is removed because we tests pipelines on multiple Nextflow versions
+                    removeNextflowVersion("$outputDir/pipeline_info/nf_core_sarek_software_mqc_versions.yml"),
+                    // All stable path name, with a relative path
+                    stable_name,
+                    // All files with stable contents
+                    stable_path,
+                    // All cram files
+                    cram_files.collect{ file -> [ file.getName(), cram(file.toString(), fasta).getReadsMD5() ] }
+                ).match() }
+            )
+        }
+    }
+}
diff --git a/tests/variant_calling_strelka_bp.nf.test.snap
b/tests/variant_calling_strelka_bp.nf.test.snap new file mode 100644 index 0000000000..7a773f06ec --- /dev/null +++ b/tests/variant_calling_strelka_bp.nf.test.snap @@ -0,0 +1,443 @@ +{ + "Run with profile test | --tools manta,strelka --no_intervals | somatic": { + "content": [ + 34, + { + "BCFTOOLS_STATS": { + "bcftools": 1.2 + }, + "MANTA_GERMLINE": { + "manta": "1.6.0" + }, + "MANTA_SOMATIC": { + "manta": "1.6.0" + }, + "STRELKA_SINGLE": { + "strelka": "2.9.10" + }, + "STRELKA_SOMATIC": { + "strelka": "2.9.10" + }, + "VCFTOOLS_TSTV_COUNT": { + "vcftools": "0.1.16" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/variantcalled.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/bcftools-stats-subtypes.txt", + "multiqc/multiqc_data/bcftools_stats_indel-lengths.txt", + "multiqc/multiqc_data/bcftools_stats_variant_depths.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Indels.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_SNP.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transitions.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transversions.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_bcftools_stats.txt", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + "multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/vcftools_tstv_by_count.txt", + "multiqc/multiqc_data/vcftools_tstv_by_qual.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-cnt.pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-pct.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_indel-lengths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_variant_depths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Indels.pdf", + 
"multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_SNP.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transitions.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transversions.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_count.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_qual.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-cnt.png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-pct.png", + "multiqc/multiqc_plots/png/bcftools_stats_indel-lengths.png", + "multiqc/multiqc_plots/png/bcftools_stats_variant_depths.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Indels.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_SNP.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transitions.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transversions.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_count.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_qual.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-cnt.svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-pct.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_indel-lengths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_variant_depths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Indels.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_SNP.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transitions.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transversions.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_count.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_qual.svg", + "multiqc/multiqc_report.html", + "no_intervals.bed", + "no_intervals.bed.gz", + "no_intervals.bed.gz.tbi", + "pipeline_info", + 
"pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/bcftools", + "reports/bcftools/manta", + "reports/bcftools/manta/sample3", + "reports/bcftools/manta/sample3/sample3.manta.diploid_sv.bcftools_stats.txt", + "reports/bcftools/manta/sample4_vs_sample3", + "reports/bcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.bcftools_stats.txt", + "reports/bcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.bcftools_stats.txt", + "reports/bcftools/strelka", + "reports/bcftools/strelka/sample3", + "reports/bcftools/strelka/sample3/sample3.strelka.variants.bcftools_stats.txt", + "reports/bcftools/strelka/sample4_vs_sample3", + "reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.bcftools_stats.txt", + "reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.bcftools_stats.txt", + "reports/mosdepth", + "reports/mosdepth/sample3", + "reports/mosdepth/sample3/sample3.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample3/sample3.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample3/sample3.recal.mosdepth.summary.txt", + "reports/mosdepth/sample3/sample3.recal.regions.bed.gz", + "reports/mosdepth/sample3/sample3.recal.regions.bed.gz.csi", + "reports/mosdepth/sample4", + "reports/mosdepth/sample4/sample4.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample4/sample4.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample4/sample4.recal.mosdepth.summary.txt", + "reports/mosdepth/sample4/sample4.recal.regions.bed.gz", + "reports/mosdepth/sample4/sample4.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/sample3", + "reports/samtools/sample3/sample3.recal.cram.stats", + "reports/samtools/sample4", + "reports/samtools/sample4/sample4.recal.cram.stats", + "reports/vcftools", + "reports/vcftools/manta", + "reports/vcftools/manta/sample3", + 
"reports/vcftools/manta/sample3/sample3.manta.diploid_sv.FILTER.summary", + "reports/vcftools/manta/sample3/sample3.manta.diploid_sv.TsTv.count", + "reports/vcftools/manta/sample3/sample3.manta.diploid_sv.TsTv.qual", + "reports/vcftools/manta/sample4_vs_sample3", + "reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.FILTER.summary", + "reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.TsTv.count", + "reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.TsTv.qual", + "reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.FILTER.summary", + "reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.TsTv.count", + "reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.TsTv.qual", + "reports/vcftools/strelka", + "reports/vcftools/strelka/sample3", + "reports/vcftools/strelka/sample3/sample3.strelka.variants.FILTER.summary", + "reports/vcftools/strelka/sample3/sample3.strelka.variants.TsTv.count", + "reports/vcftools/strelka/sample3/sample3.strelka.variants.TsTv.qual", + "reports/vcftools/strelka/sample4_vs_sample3", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.FILTER.summary", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.count", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.qual", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.FILTER.summary", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.count", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.qual", + "variant_calling", + "variant_calling/manta", + "variant_calling/manta/sample3", + "variant_calling/manta/sample3/sample3.manta.diploid_sv.vcf.gz", + 
"variant_calling/manta/sample3/sample3.manta.diploid_sv.vcf.gz.tbi", + "variant_calling/manta/sample4_vs_sample3", + "variant_calling/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.vcf.gz", + "variant_calling/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.vcf.gz.tbi", + "variant_calling/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.vcf.gz", + "variant_calling/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.vcf.gz.tbi", + "variant_calling/strelka", + "variant_calling/strelka/sample3", + "variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz", + "variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz.tbi", + "variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz", + "variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz.tbi", + "variant_calling/strelka/sample4_vs_sample3", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz.tbi", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz.tbi" + ], + [ + "bcftools-stats-subtypes.txt:md5,3268a22605b0fddf637de19c9202ade0", + "bcftools_stats_indel-lengths.txt:md5,5e76de7e494eb573dbce64f91c79d641", + "bcftools_stats_variant_depths.txt:md5,201388db8c5b940aaf05735106f63980", + "bcftools_stats_vqc_Count_Indels.txt:md5,90076001a4996b4f015ee7cfdfdc0f2d", + "bcftools_stats_vqc_Count_SNP.txt:md5,5bd2de6cab28b6ac4bda66a97e73ec4c", + "bcftools_stats_vqc_Count_Transitions.txt:md5,b46f5fae18ee0526c636e89979c10c5e", + "bcftools_stats_vqc_Count_Transversions.txt:md5,4a127df36efe1b68b7ae93b8c71dd7b1", + "multiqc_bcftools_stats.txt:md5,5c9c75a7ba51eec6d7ed539a5d0a2397", + "multiqc_citations.txt:md5,ac2b3cf2dfb12c40837b9bbad8112d86", + 
"vcftools_tstv_by_count.txt:md5,d5919b93082809cb0663b16ce1f0f60e", + "sample3.manta.diploid_sv.bcftools_stats.txt:md5,8adad91e1c8dc8db63cf9b3607bee3a0", + "sample4_vs_sample3.manta.diploid_sv.bcftools_stats.txt:md5,17dc847445b98885bc18622f862f44d9", + "sample4_vs_sample3.manta.somatic_sv.bcftools_stats.txt:md5,a8660b352950f0b32768fcbbd6b48896", + "sample3.strelka.variants.bcftools_stats.txt:md5,c75a458b1aa0e1bae3b667d48684e13c", + "sample4_vs_sample3.strelka.somatic_indels.bcftools_stats.txt:md5,1c57e5cd6424157536276002ef1a58d6", + "sample4_vs_sample3.strelka.somatic_snvs.bcftools_stats.txt:md5,110810e5702ef11bc002bd0948dbdfff", + "sample3.recal.mosdepth.global.dist.txt:md5,d9a4dd6429560b2b647da346050766c5", + "sample3.recal.mosdepth.region.dist.txt:md5,1f3dab381958e08eb00f7c5e1135f677", + "sample3.recal.mosdepth.summary.txt:md5,d7676e7c1de851b0ee5185d21096123b", + "sample3.recal.regions.bed.gz:md5,6edeb8f7041a4403cb73651744b5bc82", + "sample3.recal.regions.bed.gz.csi:md5,f17cc9d960aa4a1e96548d570585cc8a", + "sample4.recal.mosdepth.global.dist.txt:md5,53f9ae9ab5002ffba340fa8cef7d70e4", + "sample4.recal.mosdepth.region.dist.txt:md5,17600d21ac453506c52249cf435ad8ea", + "sample4.recal.mosdepth.summary.txt:md5,7141030385af1f653718c9e0c9a5be80", + "sample4.recal.regions.bed.gz:md5,c680c5d75f0cea068e3f917f4cf9bf52", + "sample4.recal.regions.bed.gz.csi:md5,4003c8833ed5e9b9f45282a6915c935e", + "sample3.recal.cram.stats:md5,bcc229318527e414e69aaa5cd092ad9b", + "sample4.recal.cram.stats:md5,0d1784cb4c3f14b9858247ac6128dd03", + "sample3.manta.diploid_sv.FILTER.summary:md5,1ce42d34e4ae919afb519efc99146423", + "sample3.manta.diploid_sv.TsTv.count:md5,fa27f678965b7cba6a92efcd039f802a", + "sample4_vs_sample3.manta.diploid_sv.FILTER.summary:md5,1ce42d34e4ae919afb519efc99146423", + "sample4_vs_sample3.manta.diploid_sv.TsTv.count:md5,fa27f678965b7cba6a92efcd039f802a", + "sample4_vs_sample3.manta.somatic_sv.FILTER.summary:md5,1ce42d34e4ae919afb519efc99146423", + 
"sample4_vs_sample3.manta.somatic_sv.TsTv.count:md5,8dcfdbcaac118df1d5ad407dd2af699f", + "sample3.strelka.variants.FILTER.summary:md5,8697a0a983314e98b99b5f6038af65f6", + "sample3.strelka.variants.TsTv.count:md5,1481854d2a765f5641856ecf95ca4097", + "sample4_vs_sample3.strelka.somatic_indels.FILTER.summary:md5,30a45e2bc87f40c89388032cbf75ec65", + "sample4_vs_sample3.strelka.somatic_indels.TsTv.count:md5,8dcfdbcaac118df1d5ad407dd2af699f", + "sample4_vs_sample3.strelka.somatic_snvs.FILTER.summary:md5,7a81b11aa29fec73d5bc872b7b58f8aa", + "sample4_vs_sample3.strelka.somatic_snvs.TsTv.count:md5,a922c51ca3b2ea7cdcfa09e9c8c55d52" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-15T22:39:15.50268779" + }, + "Run with profile test | --tools manta,strelka | somatic": { + "content": [ + 36, + { + "BCFTOOLS_STATS": { + "bcftools": 1.2 + }, + "MANTA_GERMLINE": { + "manta": "1.6.0" + }, + "MANTA_SOMATIC": { + "manta": "1.6.0" + }, + "STRELKA_SINGLE": { + "strelka": "2.9.10" + }, + "STRELKA_SOMATIC": { + "strelka": "2.9.10" + }, + "VCFTOOLS_TSTV_COUNT": { + "vcftools": "0.1.16" + }, + "Workflow": { + "nf-core/sarek": "v3.5.0" + } + }, + [ + "csv", + "csv/variantcalled.csv", + "multiqc", + "multiqc/multiqc_data", + "multiqc/multiqc_data/bcftools-stats-subtypes.txt", + "multiqc/multiqc_data/bcftools_stats_indel-lengths.txt", + "multiqc/multiqc_data/bcftools_stats_variant_depths.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Indels.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_SNP.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transitions.txt", + "multiqc/multiqc_data/bcftools_stats_vqc_Count_Transversions.txt", + "multiqc/multiqc_data/multiqc.log", + "multiqc/multiqc_data/multiqc_bcftools_stats.txt", + "multiqc/multiqc_data/multiqc_citations.txt", + "multiqc/multiqc_data/multiqc_data.json", + "multiqc/multiqc_data/multiqc_general_stats.txt", + "multiqc/multiqc_data/multiqc_software_versions.txt", + 
"multiqc/multiqc_data/multiqc_sources.txt", + "multiqc/multiqc_data/vcftools_tstv_by_count.txt", + "multiqc/multiqc_data/vcftools_tstv_by_qual.txt", + "multiqc/multiqc_plots", + "multiqc/multiqc_plots/pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-cnt.pdf", + "multiqc/multiqc_plots/pdf/bcftools-stats-subtypes-pct.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_indel-lengths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_variant_depths.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Indels.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_SNP.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transitions.pdf", + "multiqc/multiqc_plots/pdf/bcftools_stats_vqc_Count_Transversions.pdf", + "multiqc/multiqc_plots/pdf/general_stats_table.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_count.pdf", + "multiqc/multiqc_plots/pdf/vcftools_tstv_by_qual.pdf", + "multiqc/multiqc_plots/png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-cnt.png", + "multiqc/multiqc_plots/png/bcftools-stats-subtypes-pct.png", + "multiqc/multiqc_plots/png/bcftools_stats_indel-lengths.png", + "multiqc/multiqc_plots/png/bcftools_stats_variant_depths.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Indels.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_SNP.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transitions.png", + "multiqc/multiqc_plots/png/bcftools_stats_vqc_Count_Transversions.png", + "multiqc/multiqc_plots/png/general_stats_table.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_count.png", + "multiqc/multiqc_plots/png/vcftools_tstv_by_qual.png", + "multiqc/multiqc_plots/svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-cnt.svg", + "multiqc/multiqc_plots/svg/bcftools-stats-subtypes-pct.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_indel-lengths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_variant_depths.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Indels.svg", + 
"multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_SNP.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transitions.svg", + "multiqc/multiqc_plots/svg/bcftools_stats_vqc_Count_Transversions.svg", + "multiqc/multiqc_plots/svg/general_stats_table.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_count.svg", + "multiqc/multiqc_plots/svg/vcftools_tstv_by_qual.svg", + "multiqc/multiqc_report.html", + "pipeline_info", + "pipeline_info/nf_core_sarek_software_mqc_versions.yml", + "reference", + "reports", + "reports/bcftools", + "reports/bcftools/manta", + "reports/bcftools/manta/sample3", + "reports/bcftools/manta/sample3/sample3.manta.diploid_sv.bcftools_stats.txt", + "reports/bcftools/manta/sample4_vs_sample3", + "reports/bcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.bcftools_stats.txt", + "reports/bcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.bcftools_stats.txt", + "reports/bcftools/strelka", + "reports/bcftools/strelka/sample3", + "reports/bcftools/strelka/sample3/sample3.strelka.variants.bcftools_stats.txt", + "reports/bcftools/strelka/sample4_vs_sample3", + "reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.bcftools_stats.txt", + "reports/bcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.bcftools_stats.txt", + "reports/mosdepth", + "reports/mosdepth/sample3", + "reports/mosdepth/sample3/sample3.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample3/sample3.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample3/sample3.recal.mosdepth.summary.txt", + "reports/mosdepth/sample3/sample3.recal.regions.bed.gz", + "reports/mosdepth/sample3/sample3.recal.regions.bed.gz.csi", + "reports/mosdepth/sample4", + "reports/mosdepth/sample4/sample4.recal.mosdepth.global.dist.txt", + "reports/mosdepth/sample4/sample4.recal.mosdepth.region.dist.txt", + "reports/mosdepth/sample4/sample4.recal.mosdepth.summary.txt", + 
"reports/mosdepth/sample4/sample4.recal.regions.bed.gz", + "reports/mosdepth/sample4/sample4.recal.regions.bed.gz.csi", + "reports/samtools", + "reports/samtools/sample3", + "reports/samtools/sample3/sample3.recal.cram.stats", + "reports/samtools/sample4", + "reports/samtools/sample4/sample4.recal.cram.stats", + "reports/vcftools", + "reports/vcftools/manta", + "reports/vcftools/manta/sample3", + "reports/vcftools/manta/sample3/sample3.manta.diploid_sv.FILTER.summary", + "reports/vcftools/manta/sample3/sample3.manta.diploid_sv.TsTv.count", + "reports/vcftools/manta/sample3/sample3.manta.diploid_sv.TsTv.qual", + "reports/vcftools/manta/sample4_vs_sample3", + "reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.FILTER.summary", + "reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.TsTv.count", + "reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.TsTv.qual", + "reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.FILTER.summary", + "reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.TsTv.count", + "reports/vcftools/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.TsTv.qual", + "reports/vcftools/strelka", + "reports/vcftools/strelka/sample3", + "reports/vcftools/strelka/sample3/sample3.strelka.variants.FILTER.summary", + "reports/vcftools/strelka/sample3/sample3.strelka.variants.TsTv.count", + "reports/vcftools/strelka/sample3/sample3.strelka.variants.TsTv.qual", + "reports/vcftools/strelka/sample4_vs_sample3", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.FILTER.summary", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.count", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.TsTv.qual", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.FILTER.summary", + 
"reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.count", + "reports/vcftools/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.TsTv.qual", + "variant_calling", + "variant_calling/manta", + "variant_calling/manta/sample3", + "variant_calling/manta/sample3/sample3.manta.diploid_sv.vcf.gz", + "variant_calling/manta/sample3/sample3.manta.diploid_sv.vcf.gz.tbi", + "variant_calling/manta/sample4_vs_sample3", + "variant_calling/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.vcf.gz", + "variant_calling/manta/sample4_vs_sample3/sample4_vs_sample3.manta.diploid_sv.vcf.gz.tbi", + "variant_calling/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.vcf.gz", + "variant_calling/manta/sample4_vs_sample3/sample4_vs_sample3.manta.somatic_sv.vcf.gz.tbi", + "variant_calling/strelka", + "variant_calling/strelka/sample3", + "variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz", + "variant_calling/strelka/sample3/sample3.strelka.genome.vcf.gz.tbi", + "variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz", + "variant_calling/strelka/sample3/sample3.strelka.variants.vcf.gz.tbi", + "variant_calling/strelka/sample4_vs_sample3", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_indels.vcf.gz.tbi", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz", + "variant_calling/strelka/sample4_vs_sample3/sample4_vs_sample3.strelka.somatic_snvs.vcf.gz.tbi" + ], + [ + "bcftools-stats-subtypes.txt:md5,3bd8ab52708f28306c9cf9d2779c1715", + "bcftools_stats_indel-lengths.txt:md5,a836b904ccf4fa2a733a26b0184a5f9e", + "bcftools_stats_variant_depths.txt:md5,b4360600be4ee46148d30c428ec9e330", + "bcftools_stats_vqc_Count_Indels.txt:md5,def7f6a03fac61287483a7c86d44fe49", + "bcftools_stats_vqc_Count_SNP.txt:md5,a014a7c6d4fb7bea63e89da4cfeef2a9", + 
"bcftools_stats_vqc_Count_Transitions.txt:md5,ec192419e14a2aeb0c18742879bef563", + "bcftools_stats_vqc_Count_Transversions.txt:md5,b7ca7c37370d2db7052efe21c41d81e9", + "multiqc_bcftools_stats.txt:md5,97bac5dbfe6cb21a78ca7f7327655b65", + "multiqc_citations.txt:md5,ac2b3cf2dfb12c40837b9bbad8112d86", + "vcftools_tstv_by_count.txt:md5,27388870c36896df470e2f5aa982332e", + "sample3.manta.diploid_sv.bcftools_stats.txt:md5,8adad91e1c8dc8db63cf9b3607bee3a0", + "sample4_vs_sample3.manta.diploid_sv.bcftools_stats.txt:md5,17dc847445b98885bc18622f862f44d9", + "sample4_vs_sample3.manta.somatic_sv.bcftools_stats.txt:md5,a8660b352950f0b32768fcbbd6b48896", + "sample3.strelka.variants.bcftools_stats.txt:md5,d505d381e8c3906788e4135bb975ff84", + "sample4_vs_sample3.strelka.somatic_indels.bcftools_stats.txt:md5,1c57e5cd6424157536276002ef1a58d6", + "sample4_vs_sample3.strelka.somatic_snvs.bcftools_stats.txt:md5,8cf6d0b3f41436cd2f2aa09c9831764d", + "sample3.recal.mosdepth.global.dist.txt:md5,d9a4dd6429560b2b647da346050766c5", + "sample3.recal.mosdepth.region.dist.txt:md5,1f3dab381958e08eb00f7c5e1135f677", + "sample3.recal.mosdepth.summary.txt:md5,d7676e7c1de851b0ee5185d21096123b", + "sample3.recal.regions.bed.gz:md5,6edeb8f7041a4403cb73651744b5bc82", + "sample3.recal.regions.bed.gz.csi:md5,f17cc9d960aa4a1e96548d570585cc8a", + "sample4.recal.mosdepth.global.dist.txt:md5,53f9ae9ab5002ffba340fa8cef7d70e4", + "sample4.recal.mosdepth.region.dist.txt:md5,17600d21ac453506c52249cf435ad8ea", + "sample4.recal.mosdepth.summary.txt:md5,7141030385af1f653718c9e0c9a5be80", + "sample4.recal.regions.bed.gz:md5,c680c5d75f0cea068e3f917f4cf9bf52", + "sample4.recal.regions.bed.gz.csi:md5,4003c8833ed5e9b9f45282a6915c935e", + "sample3.recal.cram.stats:md5,bcc229318527e414e69aaa5cd092ad9b", + "sample4.recal.cram.stats:md5,0d1784cb4c3f14b9858247ac6128dd03", + "sample3.manta.diploid_sv.FILTER.summary:md5,1ce42d34e4ae919afb519efc99146423", + 
"sample3.manta.diploid_sv.TsTv.count:md5,fa27f678965b7cba6a92efcd039f802a", + "sample4_vs_sample3.manta.diploid_sv.FILTER.summary:md5,1ce42d34e4ae919afb519efc99146423", + "sample4_vs_sample3.manta.diploid_sv.TsTv.count:md5,fa27f678965b7cba6a92efcd039f802a", + "sample4_vs_sample3.manta.somatic_sv.FILTER.summary:md5,1ce42d34e4ae919afb519efc99146423", + "sample4_vs_sample3.manta.somatic_sv.TsTv.count:md5,8dcfdbcaac118df1d5ad407dd2af699f", + "sample3.strelka.variants.FILTER.summary:md5,fef8aeadd3b0f3b8c040c0da03bf1cbd", + "sample3.strelka.variants.TsTv.count:md5,c5b7a8eda2526d899098439ae4c06a49", + "sample4_vs_sample3.strelka.somatic_indels.FILTER.summary:md5,30a45e2bc87f40c89388032cbf75ec65", + "sample4_vs_sample3.strelka.somatic_indels.TsTv.count:md5,8dcfdbcaac118df1d5ad407dd2af699f", + "sample4_vs_sample3.strelka.somatic_snvs.FILTER.summary:md5,4fc17fa5625b4d1dcc5d791b1eb22d85", + "sample4_vs_sample3.strelka.somatic_snvs.TsTv.count:md5,fc7af1f534890c4ad3025588b3af62ae" + ], + [ + + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-15T22:37:23.231917992" + } +} diff --git a/workflows/sarek/main.nf b/workflows/sarek/main.nf index 90307f19c2..f554cd9ddd 100644 --- a/workflows/sarek/main.nf +++ b/workflows/sarek/main.nf @@ -4,7 +4,7 @@ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ -include { paramsSummaryMap } from 'plugin/nf-validation' +include { paramsSummaryMap } from 'plugin/nf-schema' include { paramsSummaryMultiqc } from '../../subworkflows/nf-core/utils_nfcore_pipeline' include { softwareVersionsToYAML } from '../../subworkflows/nf-core/utils_nfcore_pipeline' include { methodsDescriptionText } from '../../subworkflows/local/utils_nfcore_sarek_pipeline' @@ -16,6 +16,7 @@ include { CHANNEL_BASERECALIBRATOR_CREATE_CSV } from '../../subwor include { CHANNEL_APPLYBQSR_CREATE_CSV } from '../../subworkflows/local/channel_applybqsr_create_csv/main' include { 
CHANNEL_VARIANT_CALLING_CREATE_CSV } from '../../subworkflows/local/channel_variant_calling_create_csv/main' + // Convert BAM files to FASTQ files include { BAM_CONVERT_SAMTOOLS as CONVERT_FASTQ_INPUT } from '../../subworkflows/local/bam_convert_samtools/main' include { BAM_CONVERT_SAMTOOLS as CONVERT_FASTQ_UMI } from '../../subworkflows/local/bam_convert_samtools/main' @@ -41,7 +42,6 @@ include { FASTQ_ALIGN_BWAMEM_MEM2_DRAGMAP_SENTIEON } from '../../subwor include { BAM_MERGE_INDEX_SAMTOOLS } from '../../subworkflows/local/bam_merge_index_samtools/main' // Convert BAM files -include { SAMTOOLS_CONVERT as BAM_TO_CRAM } from '../../modules/nf-core/samtools/convert/main' include { SAMTOOLS_CONVERT as BAM_TO_CRAM_MAPPING } from '../../modules/nf-core/samtools/convert/main' // Convert CRAM files (optional) @@ -249,6 +249,7 @@ workflow SAREK { FASTP( reads_for_fastp, [], // we are not using any adapter fastas at the moment + false, // we don't use discard_trimmed_pass at the moment save_trimmed_fail, save_merged ) @@ -386,16 +387,7 @@ workflow SAREK { if (params.step == 'mapping') { cram_skip_markduplicates = BAM_TO_CRAM_MAPPING.out.cram.join(BAM_TO_CRAM_MAPPING.out.crai, failOnDuplicate: true, failOnMismatch: true) } else { - input_markduplicates_convert = input_sample.branch{ - bam: it[0].data_type == "bam" - cram: it[0].data_type == "cram" - } - - // Convert any input BAMs to CRAM - BAM_TO_CRAM(input_markduplicates_convert.bam, fasta, fasta_fai) - versions = versions.mix(BAM_TO_CRAM.out.versions) - - cram_skip_markduplicates = Channel.empty().mix(input_markduplicates_convert.cram, BAM_TO_CRAM.out.cram.join(BAM_TO_CRAM.out.crai, failOnDuplicate: true, failOnMismatch: true)) + cram_skip_markduplicates = Channel.empty().mix(input_sample) } CRAM_QC_NO_MD(cram_skip_markduplicates, fasta, intervals_for_preprocessing) @@ -477,22 +469,10 @@ workflow SAREK { // Run if starting from step "prepare_recalibration" if (params.step == 'prepare_recalibration') { - // Support if 
starting from BAM or CRAM files
-            input_prepare_recal_convert = input_sample.branch{
-                bam: it[0].data_type == "bam"
-                cram: it[0].data_type == "cram"
-            }
-
-            // BAM files first must be converted to CRAM files since from this step on we base everything on CRAM format
-            BAM_TO_CRAM(input_prepare_recal_convert.bam, fasta, fasta_fai)
-            versions = versions.mix(BAM_TO_CRAM.out.versions)
-
-            ch_cram_from_bam = BAM_TO_CRAM.out.cram.join(BAM_TO_CRAM.out.crai, failOnDuplicate: true, failOnMismatch: true)
-                // Make sure correct data types are carried through
-                .map{ meta, cram, crai -> [ meta + [data_type: "cram"], cram, crai ] }
+            ch_cram_for_bam_baserecalibrator = Channel.empty().mix(input_sample)

-            ch_cram_for_bam_baserecalibrator = Channel.empty().mix(ch_cram_from_bam, input_prepare_recal_convert.cram)
-            ch_md_cram_for_restart = ch_cram_from_bam
+            // Set the input samples for restart so we generate a samplesheet that contains the input files together with the recalibration table
+            ch_md_cram_for_restart = ch_cram_for_bam_baserecalibrator

         } else {

@@ -566,27 +546,8 @@ workflow SAREK {
         // Run if starting from step "prepare_recalibration"
         if (params.step == 'recalibrate') {

-            // Support if starting from BAM or CRAM files
-            input_recal_convert = input_sample.branch{
-                bam: it[0].data_type == "bam"
-                cram: it[0].data_type == "cram"
-            }
-
-            // If BAM file, split up table and mapped file to convert BAM to CRAM
-            input_only_table = input_recal_convert.bam.map{ meta, bam, bai, table -> [ meta, table ] }
-            input_only_bam = input_recal_convert.bam.map{ meta, bam, bai, table -> [ meta, bam, bai ] }
-
-            // BAM files first must be converted to CRAM files since from this step on we base everything on CRAM format
-            BAM_TO_CRAM(input_only_bam, fasta, fasta_fai)
-            versions = versions.mix(BAM_TO_CRAM.out.versions)
+            cram_applybqsr = Channel.empty().mix(input_sample)

-            cram_applybqsr = Channel.empty().mix(
-                BAM_TO_CRAM.out.cram.join(BAM_TO_CRAM.out.crai, failOnDuplicate: true,
failOnMismatch: true)
-                    .join(input_only_table, failOnDuplicate: true, failOnMismatch: true),
-
-                input_recal_convert.cram)
-                // Join together converted cram with input tables
-                .map{ meta, cram, crai, table -> [ meta + [data_type: "cram"], cram, crai, table ]}
         }

         if (!(params.skip_tools && params.skip_tools.split(',').contains('baserecalibrator'))) {
@@ -641,9 +602,7 @@ workflow SAREK {
                 // cram_variant_calling contains either:
                 // - input bams converted to crams, if started from step recal + skip BQSR
                 // - input crams if started from step recal + skip BQSR
-                cram_variant_calling = Channel.empty().mix(
-                    BAM_TO_CRAM.out.cram.join(BAM_TO_CRAM.out.crai, failOnDuplicate: true, failOnMismatch: true),
-                    input_recal_convert.cram.map{ meta, cram, crai, table -> [ meta, cram, crai ] })
+                cram_variant_calling = Channel.empty().mix(input_sample.map{ meta, cram, crai, table -> [ meta, cram, crai ] })
             } else {
                 // cram_variant_calling contains either:
                 // - crams from markduplicates = ch_cram_for_bam_baserecalibrator if skip BQSR but not started from step recalibration
@@ -651,21 +610,10 @@ workflow SAREK {
             }
         }

-        if (params.step == 'variant_calling') {
-            input_variant_calling_convert = input_sample.branch{
-                bam: it[0].data_type == "bam"
-                cram: it[0].data_type == "cram"
-            }
-
-            // BAM files first must be converted to CRAM files since from this step on we base everything on CRAM format
-            BAM_TO_CRAM(input_variant_calling_convert.bam, fasta, fasta_fai)
-            versions = versions.mix(BAM_TO_CRAM.out.versions)
+        if (params.step == 'variant_calling') {

-            cram_variant_calling = Channel.empty().mix(
-                BAM_TO_CRAM.out.cram.join(BAM_TO_CRAM.out.crai, failOnDuplicate: true, failOnMismatch: true),
-                input_variant_calling_convert.cram
-            )
+            cram_variant_calling = Channel.empty().mix( input_sample )

         }
@@ -853,6 +801,8 @@ workflow SAREK {
         reports = reports.mix(VCF_QC_BCFTOOLS_VCFTOOLS.out.vcftools_tstv_counts.collect{ meta, counts -> [ counts ] })
         reports =
reports.mix(VCF_QC_BCFTOOLS_VCFTOOLS.out.vcftools_tstv_qual.collect{ meta, qual -> [ qual ] })
         reports = reports.mix(VCF_QC_BCFTOOLS_VCFTOOLS.out.vcftools_filter_summary.collect{ meta, summary -> [ summary ] })
+        reports = reports.mix(BAM_VARIANT_CALLING_GERMLINE_ALL.out.out_indexcov.collect{ meta, indexcov -> indexcov.flatten() })
+        reports = reports.mix(BAM_VARIANT_CALLING_SOMATIC_ALL.out.out_indexcov.collect{ meta, indexcov -> indexcov.flatten() })

         CHANNEL_VARIANT_CALLING_CREATE_CSV(vcf_to_annotate, params.outdir)

@@ -874,7 +824,7 @@ workflow SAREK {
                 vcf_to_annotate.map{meta, vcf -> [ meta + [ file_name: vcf.baseName ], vcf ] },
                 vep_fasta,
                 params.tools,
-                params.snpeff_genome ? "${params.snpeff_genome}.${params.snpeff_db}" : "${params.genome}.${params.snpeff_db}",
+                params.snpeff_db,
                 snpeff_cache,
                 vep_genome,
                 vep_species,
@@ -921,7 +871,9 @@ workflow SAREK {
             ch_multiqc_files.collect(),
             ch_multiqc_config.toList(),
             ch_multiqc_custom_config.toList(),
-            ch_multiqc_logo.toList()
+            ch_multiqc_logo.toList(),
+            [],
+            []
         )

         multiqc_report = MULTIQC.out.report.toList()
@@ -937,45 +889,68 @@ workflow SAREK {
     FUNCTIONS
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 */
+
 // Add readgroup to meta and remove lane
 def addReadgroupToMeta(meta, files) {
     def CN = params.seq_center ? "CN:${params.seq_center}\\t" : ''
-
-    // Here we're assuming that fastq_1 and fastq_2 are from the same flowcell:
     def flowcell = flowcellLaneFromFastq(files[0])
-    // TO-DO: Would it perhaps be better to also call flowcellLaneFromFastq(files[1]) and check that we get the same flowcell-id?
+
+    // Check if flowcell ID matches
+    if ( flowcell && flowcell != flowcellLaneFromFastq(files[1]) ){
+        error("Flowcell ID does not match for paired reads of sample ${meta.id} - ${files}")
+    }
+
+    // If we cannot read the flowcell ID from the fastq file, then we don't use it
+    def sample_lane_id = flowcell ? "${flowcell}.${meta.sample}.${meta.lane}" : "${meta.sample}.${meta.lane}"

     // Don't use a random element for ID, it breaks resuming
-    def read_group = "\"@RG\\tID:${flowcell}.${meta.sample}.${meta.lane}\\t${CN}PU:${meta.lane}\\tSM:${meta.patient}_${meta.sample}\\tLB:${meta.sample}\\tDS:${params.fasta}\\tPL:${params.seq_platform}\""
+    def read_group = "\"@RG\\tID:${sample_lane_id}\\t${CN}PU:${meta.lane}\\tSM:${meta.patient}_${meta.sample}\\tLB:${meta.sample}\\tDS:${params.fasta}\\tPL:${params.seq_platform}\""
     meta = meta - meta.subMap('lane') + [read_group: read_group.toString()]
     return [ meta, files ]
 }
+
 // Parse first line of a FASTQ file, return the flowcell id and lane number.
 def flowcellLaneFromFastq(path) {
-    // expected format:
-    // xx:yy:FLOWCELLID:LANE:... (seven fields)
-    // or
-    // FLOWCELLID:LANE:xx:... (five fields)
-    def line
-    path.withInputStream {
-        InputStream gzipStream = new java.util.zip.GZIPInputStream(it)
-        Reader decoder = new InputStreamReader(gzipStream, 'ASCII')
-        BufferedReader buffered = new BufferedReader(decoder)
-        line = buffered.readLine()
+    // First line of FASTQ file contains sequence identifier plus optional description
+    def firstLine = readFirstLineOfFastq(path)
+    def flowcell_id = null
+
+    // Expected format from ILLUMINA
+    // cf https://en.wikipedia.org/wiki/FASTQ_format#Illumina_sequence_identifiers
+    // Five fields:
+    // @<instrument>:<lane>:<tile>:<x-pos>:<y-pos>...
+    // Seven fields or more (from CASAVA 1.8+):
+    // "@<instrument>:<run number>:<flowcell ID>:<lane>:<tile>:<x-pos>:<y-pos>..."
+
+    def fields = firstLine ?
firstLine.split(':') : []
+    if (fields.size() == 5) {
+        // Get the instrument name as flowcell ID
+        flowcell_id = fields[0].substring(1)
+    } else if (fields.size() >= 7) {
+        // Get the actual flowcell ID
+        flowcell_id = fields[2]
+    } else if (fields.size() != 0) {
+        log.warn "FASTQ file(${path}): Cannot extract flowcell ID from ${firstLine}"
     }
-    assert line.startsWith('@')
-    line = line.substring(1)
-    def fields = line.split(':')
-    String fcid
-
-    if (fields.size() >= 7) {
-        // CASAVA 1.8+ format, from https://support.illumina.com/help/BaseSpace_OLH_009008/Content/Source/Informatics/BS/FileFormat_FASTQ-files_swBS.htm
-        // "@<instrument>:<run number>:<flowcell ID>:<lane>:<tile>:<x-pos>:<y-pos>:<UMI> <read>:<is filtered>:<control number>:<index>"
-        fcid = fields[2]
-    } else if (fields.size() == 5) {
-        fcid = fields[0]
+    return flowcell_id
+}
+
+// Get first line of a FASTQ file
+def readFirstLineOfFastq(path) {
+    def line = null
+    try {
+        path.withInputStream {
+            InputStream gzipStream = new java.util.zip.GZIPInputStream(it)
+            Reader decoder = new InputStreamReader(gzipStream, 'ASCII')
+            BufferedReader buffered = new BufferedReader(decoder)
+            line = buffered.readLine()
+            assert line.startsWith('@')
+        }
+    } catch (Exception e) {
+        log.warn "FASTQ file(${path}): Error streaming"
+        log.warn "${e.message}"
     }
-    return fcid
+    return line
 }

 /*