Release 0.3.0 #325

Merged 60 commits on Aug 30, 2024
Changes from 55 commits
ffbbb28
Bump dev version to 0.3.0dev (#229)
fellen31 Jun 27, 2024
43f55c1
Add nf-test to short variant calling (#230)
fellen31 Jun 28, 2024
39fc829
Rework --preset parameter (#232)
fellen31 Jun 28, 2024
18d1eb0
Add deepvariant_model_type parameter (#234)
fellen31 Jul 1, 2024
53bb361
Don't allow running HiPhase with ONT data (#235)
fellen31 Jul 1, 2024
0ea8efd
Add ONT tests (#231)
fellen31 Jul 1, 2024
a053494
Remove --extra_gvcfs (#238)
fellen31 Jul 1, 2024
d9873c6
Allow ONT with HiFiCNV (#233)
fellen31 Jul 1, 2024
ca3ea3a
Remove CONVERT_ONT_READNAMES (#237)
fellen31 Jul 1, 2024
d359bb6
Add nf-test for pipeline (#239)
fellen31 Jul 1, 2024
df73c32
Reorganise processes for SNV calling and annotation (#240)
fellen31 Jul 3, 2024
c3317e9
Test snv annotation (#243)
fellen31 Jul 4, 2024
0735185
Annotate repeat expansions with Stranger (#245)
fellen31 Jul 9, 2024
8bc567d
Light refactoring of snv calling process names (#246)
fellen31 Jul 9, 2024
d8faba6
Run mosdepth in fast mode (#250)
fellen31 Jul 10, 2024
3c761de
updagte vep annotation (#244)
fellen31 Jul 10, 2024
97df846
Change ONT minimap2 preset and snapshot BAM reads (#247)
fellen31 Jul 10, 2024
04bea11
Annotate multisample SNV VCF (#251)
fellen31 Jul 11, 2024
e8d2f9f
Fix duplicate SNVs when overlapping BED regions and add SCATTER_GENOM…
fellen31 Jul 15, 2024
8de16ec
fix stranger (#256)
fellen31 Jul 18, 2024
9764f32
faster test (#258)
fellen31 Jul 18, 2024
f580e97
remove todos (#257)
fellen31 Jul 18, 2024
55c55c5
Add rank_variants (#255)
fellen31 Jul 18, 2024
0fabd11
Run snv-annotation in parallel (#261)
fellen31 Jul 23, 2024
b7ade8e
update deepvariant and htslib (#260)
fellen31 Jul 24, 2024
4b6099d
Fix modkit warning (#267)
fellen31 Jul 24, 2024
7a6cb47
Add BCFTools stats for SNVs (#269)
fellen31 Jul 25, 2024
a00684a
Only output unphased aligments when phasing is off (#268)
fellen31 Jul 25, 2024
901d7cb
Allow CNV calling to start as soon as SNV calling as done for that sa…
fellen31 Jul 25, 2024
2b430d7
Add skip for QC aligned reads (#271)
fellen31 Jul 26, 2024
b907245
Update README.md (#262)
fellen31 Jul 26, 2024
4ee898e
Add whatshap stats to MultiQC (#270)
fellen31 Jul 26, 2024
ef6a9a2
Rank variants in parallel (#278)
fellen31 Jul 26, 2024
1b1c8aa
Formatting (#300)
fellen31 Aug 7, 2024
687743e
Add CADD annotation (#266)
fellen31 Aug 9, 2024
7f064db
Use project name instead of multisample (#264)
fellen31 Aug 9, 2024
f3d9af3
Unused module (#305)
fellen31 Aug 12, 2024
a3d7252
Update echtvar (#306)
fellen31 Aug 12, 2024
0c98285
Add a second somalier relate (#307)
fellen31 Aug 12, 2024
a829933
Update modules (#308)
fellen31 Aug 13, 2024
16c9fd4
Treat BAM as primary input (#304)
fellen31 Aug 13, 2024
ed66340
Use project name in echtvar encode (#312)
fellen31 Aug 13, 2024
c7dc538
Fix typo (#315)
fellen31 Aug 13, 2024
f319465
Remove samtools reset from fastq (#319)
fellen31 Aug 14, 2024
ced1328
DeepVariant improved haploid calling (#313)
fellen31 Aug 14, 2024
341b8b5
Split vep_cache into vep_cache and vep_plugin_files (#314)
fellen31 Aug 15, 2024
4548dc2
Update citations (#320)
fellen31 Aug 15, 2024
e64470f
Use meta.id in BUILD_INTERVALS input (#321)
fellen31 Aug 15, 2024
cc6d26a
Fix file requirements (#317)
fellen31 Aug 15, 2024
1f15553
Fix parallel alignments in CI tests (#323)
fellen31 Aug 15, 2024
e76879d
Fix file requirements
fellen31 Aug 13, 2024
7a855b1
ignore .prettierignore
fellen31 Aug 15, 2024
107869f
update parameters
fellen31 Aug 15, 2024
b14b7ad
try again with linting
fellen31 Aug 15, 2024
83f346c
Merge pull request #318 from fellen31/update-usage
jemten Aug 19, 2024
a6e4fee
Update docs/README.md
fellen31 Aug 27, 2024
8ed32d7
Update output.md
fellen31 Aug 27, 2024
6d2c31f
version bump and fix missing stranger in readme (#330)
fellen31 Aug 27, 2024
10d7081
Use updated sex in genmod PED-file (#332)
fellen31 Aug 28, 2024
14849a8
Add sample name to TRGT output (#333)
fellen31 Aug 29, 2024
5 changes: 5 additions & 0 deletions .editorconfig
@@ -31,3 +31,8 @@ indent_size = unset
# ignore python and markdown
[*.{py,md}]
indent_style = unset

# ignore parameters.md
[parameters.md]
trim_trailing_whitespace = false
indent_style = unset
63 changes: 62 additions & 1 deletion .github/workflows/ci.yml
@@ -15,6 +15,9 @@ concurrency:
  group: "${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}"
  cancel-in-progress: true

permissions:
  checks: write

jobs:
  test:
    name: Run pipeline with test data
@@ -25,7 +28,7 @@ jobs:
      matrix:
        parameters:
          - ""
          - "--input https://raw.githubusercontent.com/genomic-medicine-sweden/test-datasets/nallo/testdata/samplesheet_multisample_bam.csv --split_fastq 2 --parallel_snv 1 --phaser hiphase_sv"
          - "--preset ONT_R10 --input https://github.com/genomic-medicine-sweden/test-datasets/raw/e2266a34c14d1e0a9ef798de3cd81a76c9216fc1/testdata/samplesheet_multisample_bam_ont.csv --parallel_alignments 2 --parallel_snv 1"
        NXF_VER:
          - "23.04.0"
          - "latest-everything"
@@ -44,3 +47,61 @@ jobs:
      - name: Run pipeline with test data
        run: |
          nextflow run ${GITHUB_WORKSPACE} -profile test,docker --outdir ./results ${{ matrix.parameters }}
  nftest:
    name: ${{ matrix.tags }} ${{ matrix.profile }} NF-${{ matrix.NXF_VER }}
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        NXF_VER:
          - "latest-everything"
          - "23.04.0"
        tags:
          - "SHORT_VARIANT_CALLING"
          - "SNV_ANNOTATION"
          - "samplesheet"
          - "samplesheet_multisample_bam"
        profile:
          - "docker"

    steps:
      - name: Check out pipeline code
        uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4

      - name: Install Nextflow
        uses: nf-core/setup-nextflow@b9f764e8ba5c76b712ace14ecbfcef0e40ae2dd8 # v1
        with:
          version: "${{ matrix.NXF_VER }}"

      - uses: nf-core/setup-nf-test@v1

      - uses: actions/setup-python@v4
        with:
          python-version: "3.11"
          architecture: "x64"

      - name: Install pdiff to see diff between nf-test snapshots
        run: |
          python -m pip install --upgrade pip
          pip install pdiff

      - name: Run nf-test
        run: |
          nf-test test --verbose --tag ${{ matrix.tags }} --profile "+${{ matrix.profile }}" --junitxml=test.xml --tap=test.tap

      - uses: pcolby/tap-summary@v1
        with:
          path: >-
            test.tap

      - name: Output log on failure
        if: failure()
        run: |
          sudo apt install bat > /dev/null
          batcat --decorations=always --color=always ${{ github.workspace }}/.nf-test/tests/*/meta/nextflow.log

      - name: Publish Test Report
        uses: mikepenz/action-junit-report@v3
        if: always() # always run even if the previous step fails
        with:
          report_paths: test.xml
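
To make the size of the new `nftest` matrix concrete, here is a minimal Python sketch that expands it the way a GitHub Actions matrix is expanded (as a Cartesian product of its axes). The values are copied from the workflow above; the function name `expand_matrix` and the returned dictionary layout are illustrative assumptions, not part of the pipeline or of GitHub Actions, and the order in which jobs are listed is not meaningful.

```python
from itertools import product

# Axis values copied from the nftest job matrix in ci.yml.
NXF_VERSIONS = ["latest-everything", "23.04.0"]
TAGS = [
    "SHORT_VARIANT_CALLING",
    "SNV_ANNOTATION",
    "samplesheet",
    "samplesheet_multisample_bam",
]
PROFILES = ["docker"]


def expand_matrix():
    """Return one entry per matrix combination (hypothetical helper)."""
    jobs = []
    for nxf_ver, tag, profile in product(NXF_VERSIONS, TAGS, PROFILES):
        # Same command template as the "Run nf-test" step.
        command = (
            f"nf-test test --verbose --tag {tag} "
            f'--profile "+{profile}" --junitxml=test.xml --tap=test.tap'
        )
        # Same name template as the job's `name:` field.
        jobs.append({"name": f"{tag} {profile} NF-{nxf_ver}", "command": command})
    return jobs


if __name__ == "__main__":
    for job in expand_matrix():
        print(job["name"], "->", job["command"])
```

With two Nextflow versions, four tags and one profile, this yields eight nf-test jobs in addition to the existing `test` job.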
1 change: 1 addition & 0 deletions .gitignore
@@ -6,3 +6,4 @@ results/
testing/
testing*
*.pyc
.nf-test*
3 changes: 3 additions & 0 deletions .nf-core.yml
@@ -2,6 +2,7 @@ lint:
  files_exist:
    - CODE_OF_CONDUCT.md
    - assets/nf-core-nallo_logo_light.png
    - docs/README.md
    - docs/images/nf-core-nallo_logo_light.png
    - docs/images/nf-core-nallo_logo_dark.png
    - .github/ISSUE_TEMPLATE/config.yml
@@ -11,10 +12,12 @@ lint:
  files_unchanged:
    - CODE_OF_CONDUCT.md
    - assets/nf-core-nallo_logo_light.png
    - docs/README.md
    - docs/images/nf-core-nallo_logo_light.png
    - docs/images/nf-core-nallo_logo_dark.png
    - .github/ISSUE_TEMPLATE/bug_report.yml
    - .github/CONTRIBUTING.md
    - .prettierignore
  multiqc_config:
    - report_comment
  nextflow_config:
1 change: 1 addition & 0 deletions .prettierignore
@@ -4,6 +4,7 @@ slackreport.json
.nextflow*
work/
data/
docs/parameters.md
results/
.DS_Store
testing/
122 changes: 122 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,128 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## v0.3.0dev - [xxxx-xx-xx]

### `Added`

- [#230](https://github.com/genomic-medicine-sweden/nallo/pull/230) - Added nf-test to the short variant calling workflow
- [#231](https://github.com/genomic-medicine-sweden/nallo/pull/231) - Added initial tests for ONT data
- [#234](https://github.com/genomic-medicine-sweden/nallo/pull/234) - Added a `--deepvariant_model_type` parameter to override the model type set by `--preset`
- [#239](https://github.com/genomic-medicine-sweden/nallo/pull/239) - Added initial nf-test to the pipeline
- [#243](https://github.com/genomic-medicine-sweden/nallo/pull/243) - Added nf-test to the short variant annotation workflow
- [#245](https://github.com/genomic-medicine-sweden/nallo/pull/245) - Added repeat annotation with Stranger
- [#252](https://github.com/genomic-medicine-sweden/nallo/pull/252) - Added a new `SCATTER_GENOME` subworkflow
- [#255](https://github.com/genomic-medicine-sweden/nallo/pull/255) - Added a new `RANK_VARIANTS` subworkflow to rank SNVs using genmod
- [#261](https://github.com/genomic-medicine-sweden/nallo/pull/261) - Added a `--skip_rank_variants` parameter to skip the rank_variants subworkflow
- [#264](https://github.com/genomic-medicine-sweden/nallo/pull/264) - Added a `project` column to the samplesheet
- [#266](https://github.com/genomic-medicine-sweden/nallo/pull/266) - Added CADD to dynamically calculate indel CADD-scores
- [#270](https://github.com/genomic-medicine-sweden/nallo/pull/270) - Added SNV phasing stats to MultiQC
- [#271](https://github.com/genomic-medicine-sweden/nallo/pull/271) - Added a `--skip_aligned_read_qc` parameter to skip the qc aligned reads subworkflow
- [#314](https://github.com/genomic-medicine-sweden/nallo/pull/314) - Added a `--vep_plugin_files` parameter to separate VEP plugins from cache
- [#320](https://github.com/genomic-medicine-sweden/nallo/pull/320) - Added complete citations to CITATIONS.md and MultiQC report

### `Changed`

- [#232](https://github.com/genomic-medicine-sweden/nallo/pull/232) - Changed to softer `--preset` requirements, non-supported subworkflows can now be explicitly enabled if necessary
- [#232](https://github.com/genomic-medicine-sweden/nallo/pull/232) - Changed `--skip_repeat_wf` to default to true for preset ONT_R10
- [#233](https://github.com/genomic-medicine-sweden/nallo/pull/233) - Changed the CNV calling workflow to allow calling using ONT data
- [#235](https://github.com/genomic-medicine-sweden/nallo/pull/235) - Changed the ONT_R10 preset to not allow phasing with HiPhase
- [#240](https://github.com/genomic-medicine-sweden/nallo/pull/240) - Reorganized processes in the snv annotation and short variant calling workflows
- [#240](https://github.com/genomic-medicine-sweden/nallo/pull/240) - GLNexus multisample output is now decomposed and normalized
- [#244](https://github.com/genomic-medicine-sweden/nallo/pull/244) - Updated VEP with more annotations
- [#245](https://github.com/genomic-medicine-sweden/nallo/pull/245) - Merged (multisample) repeats from TRGT are now output even if there's only one sample
- [#245](https://github.com/genomic-medicine-sweden/nallo/pull/245) - Split the repeat analysis workflow into one calling and one annotation workflow, `--skip_repeat_wf` becomes `--skip_repeat_calling` and `--skip_repeat_annotation`
- [#246](https://github.com/genomic-medicine-sweden/nallo/pull/246) - Renamed processes and light refactoring of the short variant calling workflow
- [#246](https://github.com/genomic-medicine-sweden/nallo/pull/246) - Use groupKey to remove bottleneck in the short variant calling workflow
- [#247](https://github.com/genomic-medicine-sweden/nallo/pull/247) - Updated nft-bam to 0.3.0 and added BAM reads to snapshot
- [#247](https://github.com/genomic-medicine-sweden/nallo/pull/247) - Changed minimap2 preset from `map-ont` to `lr:hq` for `--preset ONT_R10`
- [#250](https://github.com/genomic-medicine-sweden/nallo/pull/250) - Run mosdepth with `--fast-mode` and add to MultiQC report
- [#251](https://github.com/genomic-medicine-sweden/nallo/pull/251) - Switched from annotating single sample VCFs to annotating a multisample VCF, splitting the VCF per sample afterwards to keep outputs almost consistent
- [#256](https://github.com/genomic-medicine-sweden/nallo/pull/256) - Changed Stranger to annotate single-sample VCFs instead of a multi-sample VCF
- [#258](https://github.com/genomic-medicine-sweden/nallo/pull/258) - Updated test profile parameters to speed up tests
- [#260](https://github.com/genomic-medicine-sweden/nallo/pull/260) - Updated DeepVariant to 1.6.1 and htslib (tabix) to 1.20
- [#261](https://github.com/genomic-medicine-sweden/nallo/pull/261) - Changed SNV annotation to run in parallel
- [#261](https://github.com/genomic-medicine-sweden/nallo/pull/261) - Changed SNV output file names and directory structure
- [#262](https://github.com/genomic-medicine-sweden/nallo/pull/262) - Updated README
- [#264](https://github.com/genomic-medicine-sweden/nallo/pull/264) - Changed PED file creation from groovy script to process
- [#264](https://github.com/genomic-medicine-sweden/nallo/pull/264) - Changed all `multisample` filenames to `{project}` from samplesheet
- [#268](https://github.com/genomic-medicine-sweden/nallo/pull/268) - Only output unphased alignments when phasing is off
- [#268](https://github.com/genomic-medicine-sweden/nallo/pull/268) - Changed alignment output file names and directory structure
- [#270](https://github.com/genomic-medicine-sweden/nallo/pull/270) - Changed whatshap stats to always run, regardless of phasing software, and changed the output from `*.stats.tsv.gz` to `*.stats.tsv` to allow being picked up by MultiQC
- [#277](https://github.com/genomic-medicine-sweden/nallo/pull/277) - Allowed CNV calling as soon as SNV calling for a sample is finished
- [#278](https://github.com/genomic-medicine-sweden/nallo/pull/278) - Changed the SNV ranking to run in parallel per region
- [#300](https://github.com/genomic-medicine-sweden/nallo/pull/300) - Clarified and formatted nallo.nf
- [#304](https://github.com/genomic-medicine-sweden/nallo/pull/304) - Changed to treat (u)BAM as the primary input by skipping fastq conversion before aligning
- [#306](https://github.com/genomic-medicine-sweden/nallo/pull/306) - Updated echtvar version
- [#307](https://github.com/genomic-medicine-sweden/nallo/pull/307) - Changed somalier relate to also run per sample on samples with unknown sex, removing the need to wait for all samples to finish alignment before starting variant calling
- [#307](https://github.com/genomic-medicine-sweden/nallo/pull/307) - Changed the removal of n_files from meta from bam_infer_sex to nallo.nf
- [#308](https://github.com/genomic-medicine-sweden/nallo/pull/308) - Updated nf-core modules, fixed warnings in local modules, added Dockerfile to fqcrs
- [#312](https://github.com/genomic-medicine-sweden/nallo/pull/312) - Changed echtvar encode database creation to use dynamic `${project}` from samplesheet
- [#313](https://github.com/genomic-medicine-sweden/nallo/pull/313) - Updated calling of variants in non-autosomal contigs for DeepVariant
- [#314](https://github.com/genomic-medicine-sweden/nallo/pull/314) - Changed VEP annotation added in #244 to not include SpliceAI
- [#317](https://github.com/genomic-medicine-sweden/nallo/pull/317) - Changed so that `--reduced_penetrance` and `--score_config_snv` are required by rank variants and not SNV annotation
- [#318](https://github.com/genomic-medicine-sweden/nallo/pull/318) - Updated docs and schema to clarify pipeline usage
- [#321](https://github.com/genomic-medicine-sweden/nallo/pull/321) - Changed the input to BUILD_INTERVALS to have `meta.id` when building intervals from reference
- [#323](https://github.com/genomic-medicine-sweden/nallo/pull/323) - Changed `parallel_alignment` to `parallel_alignments` in CI tests as well

### `Removed`

- [#237](https://github.com/genomic-medicine-sweden/nallo/pull/237) - Removed the CONVERT_ONT_READNAMES module that was run before calling repeats with TRGT
- [#238](https://github.com/genomic-medicine-sweden/nallo/pull/238) - Removed the `--extra_gvcfs` parameter
- [#243](https://github.com/genomic-medicine-sweden/nallo/pull/243) - Removed VEP report from output files
- [#257](https://github.com/genomic-medicine-sweden/nallo/pull/257) - Removed obsolete TODO statements
- [#258](https://github.com/genomic-medicine-sweden/nallo/pull/258) - Removed VCF report from DeepVariant output
- [#264](https://github.com/genomic-medicine-sweden/nallo/pull/264) - Removed the option to provide extra SNF files to Sniffles with `--extra_snfs`
- [#305](https://github.com/genomic-medicine-sweden/nallo/pull/305) - Removed unused local module bcftools view regions
- [#319](https://github.com/genomic-medicine-sweden/nallo/pull/319) - Removed samtools reset before samtools fastq when converting BAM to FASTQ

### `Fixed`

- [#231](https://github.com/genomic-medicine-sweden/nallo/pull/231) - Fixed certain tags in input BAM files being transferred over to (re)aligned BAM
- [#252](https://github.com/genomic-medicine-sweden/nallo/pull/252) - Fixed duplicate SNVs in outputs when providing a BED file with overlapping regions
- [#267](https://github.com/genomic-medicine-sweden/nallo/pull/267) - Fixed warning where `MODKIT_PILEUP_HAPLOTYPES` would be defined more than once
- [#300](https://github.com/genomic-medicine-sweden/nallo/pull/300) - Fixed missing paraphase version

### Parameters

| Old parameter | New parameter |
| ------------------ | -------------------------- |
| `--skip_repeat_wf` | `--skip_repeat_calling` |
| `--skip_repeat_wf` | `--skip_repeat_annotation` |
| | `--deepvariant_model_type` |
| | `--skip_rank_variants` |
| | `--skip_aligned_read_qc` |
| | `--cadd_resources` |
| | `--cadd_prescored` |
| `--split_fastq` | `--parallel_alignments` |
| `--extra_gvcfs` | |
| `--extra_snfs` | |
| `--dipcall_par` | `--par_regions` |
| | `--vep_plugin_files` |

> [!NOTE]
> Parameter has been updated if both old and new parameter information is present.
> Parameter has been added if just the new parameter information is present.
> Parameter has been removed if new parameter information isn't present.
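
As an illustration of how the table above could be applied to an existing set of parameters (for example, the contents of a params file), here is a minimal Python sketch. The helper `migrate_params` and its return shape are hypothetical and not part of nallo; the mapping is copied from the table, and the sketch assumes that values carry over unchanged to the renamed parameters, which may not hold in every case.

```python
# Hypothetical helper: the mappings are copied from the table above,
# the function and variable names are illustrative and not part of nallo.
RENAMED = {
    "split_fastq": ["parallel_alignments"],
    "dipcall_par": ["par_regions"],
    # One old switch became two new ones.
    "skip_repeat_wf": ["skip_repeat_calling", "skip_repeat_annotation"],
}
REMOVED = {"extra_gvcfs", "extra_snfs"}


def migrate_params(old_params):
    """Return (new_params, dropped) for a dict of 0.2.0-style parameters."""
    new_params = {}
    dropped = []
    for name, value in old_params.items():
        if name in REMOVED:
            # Removed parameters have no replacement; report them instead.
            dropped.append(name)
        elif name in RENAMED:
            # Assumption: the value carries over unchanged to each new name.
            for new_name in RENAMED[name]:
                new_params[new_name] = value
        else:
            new_params[name] = value
    return new_params, dropped


if __name__ == "__main__":
    old = {"split_fastq": 2, "skip_repeat_wf": True, "extra_snfs": "snfs.csv"}
    print(migrate_params(old))
```

Parameters that are new in this release (such as `--deepvariant_model_type` or `--vep_plugin_files`) have no old counterpart and are left for the user to set.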

### Module updates

| Tool | Old version | New version |
| --------------------------- | ----------- | ----------- |
| deepvariant | 1.5.0 | 1.6.1 |
| tabix | 1.19.1 | 1.20 |
| echtvar | 0.1.7 | 0.2.0 |
| somalier | 0.2.15 | 0.2.18 |
| cadd | | 1.6.post1 |
| gawk | | 5.3.0 |
| add_most_severe_consequence | | v1.0 |
| add_most_severe_pli | | v1.0 |
| create_pedigree_file | | v1.0 |
| genmod | | 3.8.2 |
| stranger | | 0.9.1 |
| splitubam | | 0.1.1 |
| fastp | 0.23.4 | |

## v0.2.0 - [2024-06-26]

### `Added`