Hello,
I am trying to assemble ~20 linear plasmids that are almost identical to each other. They average ~16 kbp in length and share the same backbone; only a single ~20 bp insert region differs between them. I already tried reducing correctedErrorRate to 0.13 (we have Oxford Nanopore data), but the final assembly still contains only ~10 of the 20 sequences.
I realize this is a special case for any genome assembler, but since we have had really good results with Canu for regular genome assembly, we would like to give it a try.
I am wondering how Canu treats these localized 20 bp mismatches during assembly, and whether there are any parameters I can adjust to make this work. My current command is:
canu maxMemory=80 redMemory=50 oeaMemory=50 gridOptions="--time=100:00:00 --partition=**" -p $prefix -d $dir genomeSize=320k correctedErrorRate=0.13 gnuplotTested=true -nanopore-raw $OXN_file
In case it helps, we have high-coverage raw data, on the order of a few thousand X.
Thank you!
Yuanwen
A 20 bp difference out of 16 kb is quite small given nanopore error rates; in general I'd expect the correction step to corrupt that difference. Ideally, I'd suggest something similar to what I suggested in #1885: use a consensus plasmid to map the reads and call variants. The variant callers can report which reads support each variant, so you can bin the reads and assemble each subset separately rather than mixing them all.
You could also try re-basecalling the ONT data with a newer basecaller (such as bonito or a recent guppy version). The resulting reads are accurate enough to be assembled without correction, as I suggested in #1715 (-untrimmed 'batOptions=-eg 0.12 -sb 0.01' 'correctedErrorRate=0.12' 'maxInputCoverage=100' -pacbio-hifi <your nanopore fastq>), assuming you're using Canu 2.1.1.
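For reference, here is a rough sketch of how those options might be combined with the genomeSize, -p and -d settings from the original command. The default values for prefix, dir and the reads file are placeholders I made up for illustration, and the memory and grid options from the original command are left out (add them back as needed). It only prints the command unless RUN=1 is set, so you can check it before submitting anything.

```shell
#!/bin/sh
# Sketch only: builds the suggested Canu invocation and prints it.
# The defaults below (plasmids, canu_out, reads.fastq) are hypothetical
# placeholders; override them via the environment.
set -eu

prefix="${prefix:-plasmids}"
dir="${dir:-canu_out}"
reads="${OXN_file:-reads.fastq}"   # re-basecalled nanopore reads

# Store the command in the positional parameters so that the quoting of
# the batOptions value (which contains spaces) is preserved.
set -- canu \
  -p "$prefix" -d "$dir" \
  genomeSize=320k \
  -untrimmed \
  "batOptions=-eg 0.12 -sb 0.01" \
  correctedErrorRate=0.12 \
  maxInputCoverage=100 \
  -pacbio-hifi "$reads"

# Show the command for review; set RUN=1 to actually execute it.
echo "Would run: $*"
if [ "${RUN:-0}" = "1" ]; then
  "$@"
fi
```

Note that with maxInputCoverage=100 only a fraction of a few-thousand-X dataset will be used, so whether all ~20 variants are still represented in the assembly is something to check rather than assume.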