medaka workflow fails if you have few reads #77

Open
fwa93 opened this issue May 10, 2023 · 1 comment

Comments

@fwa93

fwa93 commented May 10, 2023

Hi. There is a problem with the medaka workflow.
If I remove barcode71 (a really bad sample), the pipeline finishes. If I keep it, the pipeline crashes. It seems that longshot with the -A flag fails when the coverage is 0. See a similar issue here: artic-network/fieldbioinformatics#91

nextflow run main.nf -profile singularity --medaka --prefix "full_test1" --basecalled_fastq 23v17_Sars-cov2/no_sample/20230427_1424_X1_FAV87849_b991bac3/fastq_pass/ --outdir full_test_results --scheme midnight-primer --schemeVersion V1
N E X T F L O W ~ version 20.10.0
Launching main.nf [determined_wescoff] - revision: 4b2eb4a204
WARN: DSL 2 IS AN EXPERIMENTAL FEATURE UNDER DEVELOPMENT -- SYNTAX MAY CHANGE IN FUTURE RELEASE
executor > local (182)
[25/8bc07c] process > articNcovNanopore:sequenceAnalysisMedaka:versions [100%] 1 of 1 ✔
[2c/4a21a0] process > articNcovNanopore:sequenceAnalysisMedaka:pangoversions [100%] 1 of 1 ✔
[41/19da2b] process > articNcovNanopore:sequenceAnalysisMedaka:fastqcNanopore (46) [ 98%] 46 of 47
[69/ac0778] process > articNcovNanopore:sequenceAnalysisMedaka:multiqcNanopore (46) [ 98%] 45 of 46
[ac/9b0bba] process > articNcovNanopore:sequenceAnalysisMedaka:articDownloadScheme (https://github.com/genomic-medicine-sweden/gms-art... [100%] 1 of 1 ✔
[5c/00539b] process > articNcovNanopore:sequenceAnalysisMedaka:articGuppyPlex (full_test1-barcode60) [ 85%] 40 of 47
[7a/ebc876] process > articNcovNanopore:sequenceAnalysisMedaka:articMinIONMedaka (full_test1_barcode63) [ 0%] 0 of 39
[- ] process > articNcovNanopore:sequenceAnalysisMedaka:articRemoveUnmappedReads -
[- ] process > articNcovNanopore:sequenceAnalysisMedaka:makeQCCSV -
[- ] process > articNcovNanopore:sequenceAnalysisMedaka:writeQCSummaryCSV -
[- ] process > articNcovNanopore:sequenceAnalysisMedaka:collateSamples -
[- ] process > articNcovNanopore:sequenceAnalysisMedaka:nextclade -
[- ] process > articNcovNanopore:sequenceAnalysisMedaka:pangolinTyping -
[- ] process > articNcovNanopore:sequenceAnalysisMedaka:makeReport -
executor > local (182)
[25/8bc07c] process > articNcovNanopore:sequenceAnalysisMedaka:versions [100%] 1 of 1 ✔
[2c/4a21a0] process > articNcovNanopore:sequenceAnalysisMedaka:pangoversions [100%] 1 of 1 ✔
[56/707da2] process > articNcovNanopore:sequenceAnalysisMedaka:fastqcNanopore (43) [100%] 46 of 46
[69/ac0778] process > articNcovNanopore:sequenceAnalysisMedaka:multiqcNanopore (46) [ 98%] 45 of 46
[ac/9b0bba] process > articNcovNanopore:sequenceAnalysisMedaka:articDownloadScheme (https://github.com/genomic-medicine-sweden/gms-art... [100%] 1 of 1 ✔
[f3/4f282b] process > articNcovNanopore:sequenceAnalysisMedaka:articGuppyPlex (full_test1-barcode70) [100%] 40 of 40
[29/b56259] process > articNcovNanopore:sequenceAnalysisMedaka:articMinIONMedaka (full_test1_barcode33) [ 6%] 1 of 18, failed: 1
[- ] process > articNcovNanopore:sequenceAnalysisMedaka:articRemoveUnmappedReads -
[- ] process > articNcovNanopore:sequenceAnalysisMedaka:makeQCCSV -
[- ] process > articNcovNanopore:sequenceAnalysisMedaka:writeQCSummaryCSV -
[- ] process > articNcovNanopore:sequenceAnalysisMedaka:collateSamples -
[- ] process > articNcovNanopore:sequenceAnalysisMedaka:nextclade -
[- ] process > articNcovNanopore:sequenceAnalysisMedaka:pangolinTyping -
[- ] process > articNcovNanopore:sequenceAnalysisMedaka:makeReport -
Error executing process > 'articNcovNanopore:sequenceAnalysisMedaka:articMinIONMedaka (full_test1_barcode71)'

Caused by:
Process articNcovNanopore:sequenceAnalysisMedaka:articMinIONMedaka (full_test1_barcode71) terminated with an error exit status (20)

Command executed:

artic minion --medaka --normalise 500 --minimap2 --threads 1 --scheme-directory gms-artic --read-file full_test1_barcode71.fastq midnight-primer/V1 full_test1_barcode71

Command exit status:
20

Command output:
error: {} ERROR: Max read coverage set to 0. printing empty VCF file

Command error:
Running: samtools view -b -r "nCoV-2019_2" full_test1_barcode71.primertrimmed.rg.sorted.bam > full_test1_barcode71.primertrimmed.nCoV-2019_2.sorted.bam
Running: samtools index full_test1_barcode71.primertrimmed.nCoV-2019_2.sorted.bam
Running: samtools view -b -r "nCoV-2019_1" full_test1_barcode71.primertrimmed.rg.sorted.bam > full_test1_barcode71.primertrimmed.nCoV-2019_1.sorted.b] Initializing data loader
[17:11:17 - PWorker] Running inference for 0.0M draft bases.
[17:11:17 - Sampler] Initializing sampler for consensus of region MN908947.3:0-29903.
[17:11:17 - Sampler] Took 0.00s to make features.
[17:11:18 - PWorker] All done, 0 remainder regions.
[17:11:18 - Predict] Finished processing all regions.
[17:11:21 - DataIndex] Loaded 1/1 (100.00%) sample files.
[17:11:24 - Predict] Processing region(s): MN908947.3:0-29903
[17:11:24 - Predict] Setting tensorflow threads to 1.
[17:11:24 - Predict] Processing 1 long region(s) with batching.
[17:11:24 - Predict] Using model: /opt/conda/envs/artic/lib/python3.6/site-packages/medaka/data/r941_min_high_g360_model.hdf5.
[17:11:24 - ModelLoad] Building model with cudnn optimization: False
[17:11:25 - DLoader] Initializing data loader
[17:11:25 - PWorker] Running inference for 0.0M draft bases.
[17:11:25 - Sampler] Initializing sampler for consensus of region MN908947.3:0-29903.
[17:11:25 - Feature] Pileup counts do not span requested region, requested MN908947.3:0-29903, received 28699-29506.
[17:11:25 - Feature] Processed MN908947.3:28699.0-29506.0 (median depth 1.0)
[17:11:25 - Sampler] Took 0.01s to make features.
[17:11:26 - PWorker] All done, 0 remainder regions.
[17:11:26 - Predict] Finished processing all regions.
[17:11:29 - DataIndex] Loaded 1/1 (100.00%) sample files.
[17:11:29 - Variants] Processing MN908947.3:0-.

2023-05-10 17:11:31 Automatically determining max read coverage.
2023-05-10 17:11:31 Estimating mean read coverage...
2023-05-10 17:11:31 WARNING: Max coverage calculation is highly likely to be incorrect. The number of reference bases covered by the bam file (808) differs significantly from the expected number of positions in the reference (29903). If you are using a bam file that only covers part of the genome, please specify this region exactly with the --region argument so the number of reference bases is known. Alternatively, disable maximum coverage filtering by setting -C to a large number.
2023-05-10 17:11:31 Total reference positions: 29903
2023-05-10 17:11:31 Total bases in bam: 808
2023-05-10 17:11:31 Mean read coverage: 0.03
Running: minimap2 -a -x map-ont -t 1 gms-artic/midnight-primer/V1/midnight-primer.reference.fasta full_test1_barcode71.fastq | samtools view -bS -F 4 - | samtools sort -o full_test1_barcode71.sorted.bam -
Running: samtools index full_test1_barcode71.sorted.bam
Running: align_trim --start --normalise 500 gms-artic/midnight-primer/V1/midnight-primer.scheme.bed --report full_test1_barcode71.alignreport.txt < full_test1_barcode71.sorted.bam 2> full_test1_barcode71.alignreport.er | samtools sort -T full_test1_barcode71 - -o full_test1_barcode71.trimmed.rg.sorted.bam
Running: align_trim --normalise 500 gms-artic/midnight-primer/V1/midnight-primer.scheme.bed --remove-incorrect-pairs --report full_test1_barcode71.alignreport.txt < full_test1_barcode71.sorted.bam 2> full_test1_barcode71.alignreport.er | samtools sort -T full_test1_barcode71 - -o full_test1_barcode71.primertrimmed.rg.sorted.bam
Running: samtools index full_test1_barcode71.trimmed.rg.sorted.bam
Running: samtools index full_test1_barcode71.primertrimmed.rg.sorted.bam
Running: samtools view -b -r "nCoV-2019_2" full_test1_barcode71.primertrimmed.rg.sorted.bam > full_test1_barcode71.primertrimmed.nCoV-2019_2.sorted.bam
Running: samtools index full_test1_barcode71.primertrimmed.nCoV-2019_2.sorted.bam
Running: samtools view -b -r "nCoV-2019_1" full_test1_barcode71.primertrimmed.rg.sorted.bam > full_test1_barcode71.primertrimmed.nCoV-2019_1.sorted.bam
Running: samtools index full_test1_barcode71.primertrimmed.nCoV-2019_1.sorted.bam
Running: medaka consensus --chunk_len 800 --chunk_ovlp 400 full_test1_barcode71.primertrimmed.nCoV-2019_2.sorted.bam full_test1_barcode71.nCoV-2019_2.hdf
Running: medaka variant gms-artic/midnight-primer/V1/midnight-primer.reference.fasta full_test1_barcode71.nCoV-2019_2.hdf full_test1_barcode71.nCoV-2019_2.vcf
Running: medaka consensus --chunk_len 800 --chunk_ovlp 400 full_test1_barcode71.primertrimmed.nCoV-2019_1.sorted.bam full_test1_barcode71.nCoV-2019_1.hdf
Running: medaka variant gms-artic/midnight-primer/V1/midnight-primer.reference.fasta full_test1_barcode71.nCoV-2019_1.hdf full_test1_barcode71.nCoV-2019_1.vcf
Running: artic_vcf_merge full_test1_barcode71 gms-artic/midnight-primer/V1/midnight-primer.scheme.bed nCoV-2019_2:full_test1_barcode71.nCoV-2019_2.vcf nCoV-2019_1:full_test1_barcode71.nCoV-2019_1.vcf
Running: bgzip -f full_test1_barcode71.merged.vcf
Running: tabix -p vcf full_test1_barcode71.merged.vcf.gz
Running: longshot -P 0 -F -A --no_haps --bam full_test1_barcode71.primertrimmed.rg.sorted.bam --ref gms-artic/midnight-primer/V1/midnight-primer.reference.fasta --out full_test1_barcode71.longshot.vcf --potential_variants full_test1_barcode71.merged.vcf.gz
Command failed:longshot -P 0 -F -A --no_haps --bam full_test1_barcode71.primertrimmed.rg.sorted.bam --ref gms-artic/midnight-primer/V1/midnight-primer.reference.fasta --out full_test1_barcode71.longshot.vcf --potential_variants full_test1_barcode71.merged.vcf.gz

Work dir:
/aux/db/gms-artic/work/71/6f2fcada2474605bd307b6decbf047

Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line
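
As a possible workaround until the underlying issue is fixed, near-empty samples could be screened out before they reach `artic minion`. The sketch below is not part of gms-artic or the artic pipeline; the function names and the `min_coverage` threshold are hypothetical. It estimates mean coverage from the raw FASTQ as total bases divided by the reference length (29903, taken from the log above). Raw bases will overestimate the aligned coverage that longshot reports, so treat this as a rough upper bound only.

```python
import gzip

REFERENCE_LENGTH = 29903  # "Total reference positions" reported in the log above


def fastq_stats(path):
    """Return (read_count, total_bases) for a 4-line-per-record FASTQ file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    reads = 0
    bases = 0
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # second line of each record is the sequence
                reads += 1
                bases += len(line.strip())
    return reads, bases


def mean_coverage(total_bases, reference_length=REFERENCE_LENGTH):
    """Approximate mean coverage as total bases over reference length."""
    return total_bases / reference_length


def has_enough_data(path, min_coverage=1.0):
    """Hypothetical guard: True if the sample looks worth passing downstream."""
    _, bases = fastq_stats(path)
    return mean_coverage(bases) >= min_coverage
```

With the numbers from the log, `mean_coverage(808)` is about 0.03, matching the "Mean read coverage: 0.03" line, so barcode71 would be rejected by any threshold of 1x or more.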

@JD2112
Member

JD2112 commented Jun 16, 2023

@fwa93 #78 looks like a longshot installation problem. In the container (environment.yaml), we need to add the longshot module from Conda. Could you please check locally whether that works? I don't have any Medaka data to test with.
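
One quick way to check locally whether longshot is installed in the active environment or container is to look it up on `PATH`. This is a minimal sketch using only the Python standard library; the helper name `find_tool` is hypothetical and it only confirms that an executable is present, not that it runs correctly on low-coverage input.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable found on PATH, or None if absent."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_tool("longshot")
    if location is None:
        print("longshot not found on PATH")
    else:
        print(f"longshot found at {location}")
```

Running this inside the same Singularity image used by the pipeline would show whether the executable is visible to `artic minion`.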
