Add longphase
fellen31 committed Sep 20, 2024
1 parent 1450dd4 commit d56bcf0
Showing 34 changed files with 1,731 additions and 450 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -16,6 +16,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#366](https://github.com/genomic-medicine-sweden/nallo/pull/366) - Added sorting of samples when creating PED files, so the output is always the same
- [#367](https://github.com/genomic-medicine-sweden/nallo/pull/367) - Added Severus as the default SV caller, together with a `--sv_caller` parameter to choose caller
- [#371](https://github.com/genomic-medicine-sweden/nallo/pull/371) - Added `FOUND_IN=caller` tags to SV output
- [#388](https://github.com/genomic-medicine-sweden/nallo/pull/388) - Added longphase as the default phaser
- [#388](https://github.com/genomic-medicine-sweden/nallo/pull/388) - Added single-sample tbi output to the short variant calling subworkflow

### `Changed`

@@ -31,12 +33,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#365](https://github.com/genomic-medicine-sweden/nallo/pull/365) - Changed CI to only use nf-test for pipeline tests
- [#381](https://github.com/genomic-medicine-sweden/nallo/pull/381) - Updated CI nf-test version to 0.9.0
- [#382](https://github.com/genomic-medicine-sweden/nallo/pull/382) - Changed vep_plugin_files description in schema and docs
- [#388](https://github.com/genomic-medicine-sweden/nallo/pull/388) - Changed phasing output structure and naming, and updated docs

### `Removed`

- [#352](https://github.com/genomic-medicine-sweden/nallo/pull/352) - Removed the fqcrs module
- [#356](https://github.com/genomic-medicine-sweden/nallo/pull/356) - Removed filter_vep section from output documentation since it is not in the pipeline
- [#379](https://github.com/genomic-medicine-sweden/nallo/pull/379) - Removed VEP Plugins from testdata ([genomic-medicine-sweden/test-datasets#16](https://github.com/genomic-medicine-sweden/test-datasets/pull/16))
- [#388](https://github.com/genomic-medicine-sweden/nallo/pull/388) - Removed support for co-phasing SVs with HiPhase, as the officially supported caller (pbsv) is not in the pipeline

### `Fixed`

4 changes: 4 additions & 0 deletions CITATIONS.md
@@ -70,6 +70,10 @@
- [HiFiCNV](https://github.com/PacificBiosciences/HiFiCNV)

- [LongPhase](https://github.com/twolinin/longphase)

> Jyun-Hong Lin, Liang-Chi Chen, Shu-Chi Yu, Yao-Ting Huang, LongPhase: an ultra-fast chromosome-scale phasing algorithm for small and large variants, Bioinformatics, Volume 38, Issue 7, March 2022, Pages 1816–1822, https://doi.org/10.1093/bioinformatics/btac058
- [minimap2](https://academic.oup.com/bioinformatics/article/34/18/3094/4994778)

> Heng Li, Minimap2: pairwise alignment for nucleotide sequences, Bioinformatics, Volume 34, Issue 18, September 2018, Pages 3094–3100, https://doi.org/10.1093/bioinformatics/bty191
2 changes: 1 addition & 1 deletion conf/base.config
@@ -61,7 +61,7 @@ process {
maxRetries = 2
}

withName: '.*:SAMTOOLS_MERGE' {
withName: 'SAMTOOLS_MERGE|SAMTOOLS_INDEX' {
label = 'process_medium'
}
}
37 changes: 21 additions & 16 deletions conf/modules/phasing.config
@@ -24,8 +24,7 @@ process {
]
}

withName: '.*:PHASING:HIPHASE_SNV' {
ext.prefix = { "$meta.id}_phased" }
withName: '.*:PHASING:HIPHASE' {
ext.args = { [
'--ignore-read-groups',
"--stats-file ${meta.id}_phased.stats.tsv",
@@ -35,22 +34,28 @@ process {
publishDir = [
path: { "${params.outdir}/" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : ((filename.endsWith('bam') || filename.endsWith('bai')) ? "aligned_reads/${meta.id}/${filename}" : "phasing/hiphase/snv/${meta.id}/${filename}" ) }
saveAs: { filename -> filename.equals('versions.yml') ? null : ((filename.endsWith('bam') || filename.endsWith('bai')) ? "aligned_reads/${meta.id}/${filename}" : "phased_variants/${meta.id}/${filename}" ) }
]
}

withName: '.*:PHASING:HIPHASE_SV' {
ext.prefix = { "$meta.id}_phased" }
ext.args = { [
'--ignore-read-groups',
"--stats-file ${meta.id}_phased.stats.tsv",
"--blocks-file ${meta.id}_phased.blocks.tsv",
"--summary-file ${meta.id}_phased.summary.tsv"
].join(' ') }
withName: '.*:PHASING:LONGPHASE_PHASE' {
ext.prefix = { "${meta.id}_phased" }
ext.args = [
params.preset.equals('ONT_R10') ? "--ont" : "--pb",
'--indels'
].join(' ')
publishDir = [
path: { "${params.outdir}/" },
path: { "${params.outdir}/phased_variants/${meta.id}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : ((filename.endsWith('bam') || filename.endsWith('bai')) ? "aligned_reads/${meta.id}/${filename}" : "phasing/hiphase/sv/${meta.id}/${filename}" ) }
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}
withName: '.*:PHASING:LONGPHASE_HAPLOTAG' {
ext.prefix = { "${meta.id}_haplotagged" }
publishDir = [
path: { "${params.outdir}/aligned_reads/${meta.id}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}
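
The `LONGPHASE_PHASE` arguments in this hunk depend on `params.preset`: `ONT_R10` selects `--ont`, any other value falls back to `--pb`, and `--indels` is always passed. A minimal Python sketch of that selection, for illustration only. The function name `longphase_args` and the placeholder preset `any_other_preset` are hypothetical and not part of the pipeline.

```python
def longphase_args(preset: str) -> str:
    """Mirror the ext.args expression: pick the platform flag, then add --indels."""
    # Hypothetical helper; the pipeline does this inline in conf/modules/phasing.config.
    platform_flag = "--ont" if preset == "ONT_R10" else "--pb"
    return " ".join([platform_flag, "--indels"])


print(longphase_args("ONT_R10"))
print(longphase_args("any_other_preset"))
```

Because the fallback branch is `--pb`, any preset that is not exactly `ONT_R10` is treated as PacBio data in this sketch, which matches the ternary in the config.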

@@ -61,7 +66,7 @@ process {
'--indels'
].join(' ')
publishDir = [
path: { "${params.outdir}/phasing/whatshap/phase/${meta.id}" },
path: { "${params.outdir}/phased_variants/${meta.id}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -70,14 +75,14 @@ process {
withName: '.*:PHASING:WHATSHAP_STATS' {
ext.prefix = { "${meta.id}_stats" }
publishDir = [
path: { "${params.outdir}/phasing/whatshap/stats/${meta.id}" },
path: { "${params.outdir}/qc/phasing_stats/${meta.id}" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}

withName: '.*:PHASING:WHATSHAP_HAPLOTAG' {
ext.prefix = { "${meta.id}_phased" }
ext.prefix = { "${meta.id}_haplotagged" }
ext.args = [
'--ignore-read-groups',
'--tag-supplementary'
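
The updated HiPhase `saveAs` closure in this file routes each published file by name: `versions.yml` is skipped, BAM files and their indexes go to `aligned_reads/`, and everything else now goes to `phased_variants/`. The Python sketch below restates that routing so the new layout is easier to check. It is an illustration under the assumption that the closure behaves exactly as written in the diff; the function name `hiphase_publish_path` is hypothetical.

```python
from typing import Optional


def hiphase_publish_path(filename: str, sample_id: str) -> Optional[str]:
    """Return the path relative to params.outdir, or None when the file is not published."""
    # versions.yml is never published (the closure returns null for it).
    if filename == "versions.yml":
        return None
    # Haplotagged reads and their index are grouped with the other alignments.
    if filename.endswith(("bam", "bai")):
        return f"aligned_reads/{sample_id}/{filename}"
    # Everything else (VCF, stats, blocks, summary) lands in phased_variants.
    return f"phased_variants/{sample_id}/{filename}"


print(hiphase_publish_path("sample1_phased.stats.tsv", "sample1"))
```

Note that in this closure the HiPhase TSV files follow the VCF into `phased_variants/`, whereas the WhatsHap stats process publishes to `qc/phasing_stats/`.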
3 changes: 2 additions & 1 deletion conf/modules/short_variant_calling.config
@@ -47,7 +47,8 @@ process {
ext.args = [
'-m -',
'-w 10000',
'--output-type u',
'--output-type z',
'--write-index=tbi'
].join(' ')
}
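
Assuming these are bcftools arguments (the process name is not visible in this hunk), the switch from `--output-type u` to `--output-type z` goes hand in hand with `--write-index=tbi`: a tabix (`.tbi`) index can only be written for bgzip-compressed VCF, not for uncompressed BCF. A small Python sketch of that constraint, as an illustration rather than anything bcftools or the pipeline provides. The helper name `can_write_tbi` is hypothetical.

```python
def can_write_tbi(output_type: str) -> bool:
    """True only for output type 'z' (bgzip-compressed VCF), the format tabix indexes."""
    # 'u' = uncompressed BCF, 'b' = compressed BCF, 'v' = uncompressed VCF.
    return output_type == "z"


# Argument lists before and after this commit, joined the same way as ext.args.
old_args = ["-m -", "-w 10000", "--output-type u"]
new_args = ["-m -", "-w 10000", "--output-type z", "--write-index=tbi"]

print(" ".join(new_args))
print(can_write_tbi("u"), can_write_tbi("z"))
```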

2 changes: 1 addition & 1 deletion conf/test.config
@@ -26,7 +26,7 @@ params {

// References
fasta = params.pipelines_testdata_base_path + 'nallo/reference/hg38.test.fa.gz'
input = 'https://github.com/genomic-medicine-sweden/test-datasets/raw/2948776ddf24ea131f527aa1f2dc23a43bb7b952/testdata/samplesheet.csv'
input = params.pipelines_testdata_base_path + 'nallo/testdata/samplesheet.csv'
bed = params.pipelines_testdata_base_path + 'nallo/reference/test_data.bed'
hificnv_xy = params.pipelines_testdata_base_path + 'nallo/reference/expected_cn.hg38.XY.bed'
hificnv_xx = params.pipelines_testdata_base_path + 'nallo/reference/expected_cn.hg38.XX.bed'
30 changes: 6 additions & 24 deletions docs/output.md
@@ -157,40 +157,22 @@ Results generated by MultiQC collate pipeline QC from supported tools e.g. FastQ

### Phasing

[WhatsHap](https://whatshap.readthedocs.io/en/latest/) or [HiPhase](https://github.com/PacificBiosciences/HiPhase) are used to phase variants and haplotag reads.
[LongPhase](https://github.com/twolinin/longphase), [WhatsHap](https://whatshap.readthedocs.io/en/latest/) or [HiPhase](https://github.com/PacificBiosciences/HiPhase) are used to phase variants and haplotag reads.

<details markdown="1">
<summary>Output files from WhatsHap</summary>
<summary>Output files from phasing</summary>

- `{outputdir}/aligned_reads/{sample}/`
- `{sample}_phased.bam`: BAM file with haplotags
- `{sample}_phased.bam.bai`: Index of the corresponding bam file
- `{outputdir}/phasing/whatshap/phase/{sample}/`
- `{sample}_haplotagged.bam`: BAM file with haplotags
- `{sample}_haplotagged.bam.bai`: Index of the corresponding bam file
- `{outputdir}/phased_variants/{sample}/`
- `*.vcf.gz`: VCF file with phased variants
- `*.vcf.gz.tbi`: Index of the corresponding VCF file
- `{outputdir}/phasing/whatshap/stats/{sample}/`
- `{outputdir}/qc/phasing_stats/{sample}/`
- `*.blocks.tsv`: File with phase blocks
- `*.stats.tsv`: File with phasing statistics
</details>

<details markdown="1">
<summary>Output files from HiPhase</summary>

- `{outputdir}/aligned_reads/{sample}/`

- `{sample}_phased.bam`: BAM file with haplotags
- `{sample}_phased.bam.bai`: Index of the corresponding bam file

- `{outputdir}/phasing/hiphase/{snv,sv}/{sample}/`

- `*.blocks.tsv`: File with phase blocks
- `*.stats.tsv.gz`: File with phasing statistics
- `*.vcf.gz`: VCF file with phased variants
- `*.vcf.gz.tbi`: Index of the corresponding VCF file
- `*.summary.tsv`: HiPhase summary file

</details>
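
To check a finished run against the phasing layout documented in this diff, the expected locations can be written out per sample. The sketch below only restates the paths listed in the updated docs. The function name `phasing_output_patterns` and the example values `results` and `sample1` are hypothetical, and the `*` entries are glob patterns because the docs do not fix those file names.

```python
from pathlib import PurePosixPath


def phasing_output_patterns(outdir: str, sample: str) -> dict:
    """Map each documented phasing output to its path or glob pattern under outdir."""
    base = PurePosixPath(outdir)
    reads = base / "aligned_reads" / sample
    variants = base / "phased_variants" / sample
    stats = base / "qc" / "phasing_stats" / sample
    return {
        "haplotagged_bam": str(reads / f"{sample}_haplotagged.bam"),
        "haplotagged_bai": str(reads / f"{sample}_haplotagged.bam.bai"),
        "phased_vcf": str(variants / "*.vcf.gz"),
        "phased_vcf_index": str(variants / "*.vcf.gz.tbi"),
        "phase_blocks": str(stats / "*.blocks.tsv"),
        "phasing_stats": str(stats / "*.stats.tsv"),
    }


for name, pattern in phasing_output_patterns("results", "sample1").items():
    print(f"{name}: {pattern}")
```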

### Pipeline information

[Nextflow](https://www.nextflow.io/docs/latest/tracing.html) provides excellent functionality for generating various reports relevant to the running and execution of the pipeline. This will allow you to troubleshoot errors with the running of the pipeline, and also provide you with other information such as launch commands, run times and resource usage.