Flye does not generate any output ("No disjointigs were assembled" message) #128

StefanoLonardi · 2019-06-24T01:18:50Z

I have been trying to assemble a 10Mb genome with uncorrected nanopore data (3-4 chromosomes expected). We have a lot of data, is that the reason Flye fails at the end?

[2019-06-22 11:00:05] INFO: >>>STAGE: configure
[2019-06-22 11:00:05] INFO: Configuring run
[2019-06-22 11:00:27] INFO: Total read length: 10964270213
[2019-06-22 11:00:27] INFO: Input genome size: 10000000
[2019-06-22 11:00:27] INFO: Estimated coverage: 1096
[2019-06-22 11:00:27] WARNING: Expected read coverage is 1096, the assembly is not guaranteed to be optimal in this setting. Are you sure that the genome size was entered correctly?
[2019-06-22 11:00:27] INFO: Reads N50/N90: 29675 / 9753
[2019-06-22 11:00:27] INFO: Minimum overlap set to 5000
[2019-06-22 11:00:27] INFO: Selected k-mer size: 15
[2019-06-22 11:00:27] INFO: >>>STAGE: assembly
[2019-06-22 11:00:27] INFO: Assembling disjointigs
[2019-06-22 11:00:27] INFO: Reading sequences
[2019-06-22 11:01:01] INFO: Generating solid k-mer index
[2019-06-22 11:01:17] INFO: Counting k-mers (1/2):
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2019-06-22 11:02:49] INFO: Counting k-mers (2/2):
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2019-06-22 11:08:39] INFO: Filling index table
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2019-06-22 11:13:50] INFO: Extending reads
[2019-06-22 12:54:29] INFO: Overlap-based coverage: 1177
[2019-06-22 12:54:29] INFO: Median overlap divergence: 0.119637
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2019-06-23 17:20:11] INFO: Assembled 0 disjointigs
[2019-06-23 17:20:23] INFO: Generating sequence
[2019-06-23 17:22:11] ERROR: No disjointigs were assembled - please check if the read type and genome size parameters are correct

flye --nano-raw one.fastq --out-dir flye --genome-size 10m --threads 20

mikolmogorov · 2019-06-25T06:10:03Z

Interesting, it looks like a lot of overlaps were indeed found, but no disjointigs were assembled. Is it possible to send me the full flye.log? I also suggest trying --meta mode - its solid k-mer selection is more robust in case there is any contamination or instrument-derived artificial sequence.

StefanoLonardi · 2019-06-25T15:32:58Z

[2019-06-22 11:00:05] root: INFO: Starting Flye 2.4.2-release
[2019-06-22 11:00:05] root: DEBUG: Cmd: /home/stelo/miniconda2/bin/flye --nano-raw Bduncani_06182019_pass.fastq --out-dir babesia_flye --genome-size 10m --threads 20
[2019-06-22 11:00:05] root: INFO: >>>STAGE: configure
[2019-06-22 11:00:05] root: INFO: Configuring run
[2019-06-22 11:00:27] root: INFO: Total read length: 10964270213
[2019-06-22 11:00:27] root: INFO: Input genome size: 10000000
[2019-06-22 11:00:27] root: INFO: Estimated coverage: 1096
[2019-06-22 11:00:27] root: WARNING: Expected read coverage is 1096, the assembly is not guaranteed to be optimal in this setting. Are you sure that the genome size was entered correctly?
[2019-06-22 11:00:27] root: INFO: Reads N50/N90: 29675 / 9753
[2019-06-22 11:00:27] root: INFO: Minimum overlap set to 5000
[2019-06-22 11:00:27] root: INFO: Selected k-mer size: 15
[2019-06-22 11:00:27] root: INFO: >>>STAGE: assembly
[2019-06-22 11:00:27] root: INFO: Assembling disjointigs
[2019-06-22 11:00:27] root: DEBUG: -----Begin assembly log------
[2019-06-22 11:00:27] root: DEBUG: Running: flye-assemble -l /24-2/home/stelo/babesia/babesia_flye/flye.log -t 20 -v 5000 -k 15 Bduncani_06182019_pass.fastq /24-2/home/stelo/babesia/babesia_flye/00-assembly/draft_assembly.fasta 10000000 /home/stelo/miniconda2/lib/python2.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg
[2019-06-22 11:00:27] DEBUG: Build date: Apr 7 2019 02:34:37
[2019-06-22 11:00:27] DEBUG: Total RAM: 251 Gb
[2019-06-22 11:00:27] DEBUG: Available RAM: 245 Gb
[2019-06-22 11:00:27] DEBUG: Total CPUs: 40
[2019-06-22 11:00:27] DEBUG: Parameters:
[2019-06-22 11:00:27] DEBUG: big_genome_threshold=29000000
[2019-06-22 11:00:27] DEBUG: low_cutoff_warning=1
[2019-06-22 11:00:27] DEBUG: hard_min_coverage_rate=10
[2019-06-22 11:00:27] DEBUG: assemble_kmer_sample=1
[2019-06-22 11:00:27] DEBUG: repeat_graph_kmer_sample=1
[2019-06-22 11:00:27] DEBUG: read_align_kmer_sample=1
[2019-06-22 11:00:27] DEBUG: maximum_jump=1500
[2019-06-22 11:00:27] DEBUG: maximum_overhang=1500
[2019-06-22 11:00:27] DEBUG: repeat_kmer_rate=100
[2019-06-22 11:00:27] DEBUG: assemble_ovlp_divergence=0.30
[2019-06-22 11:00:27] DEBUG: repeat_graph_ovlp_divergence=0.15
[2019-06-22 11:00:27] DEBUG: repeat_graph_ovlp_end_adjust=0.00
[2019-06-22 11:00:27] DEBUG: read_align_ovlp_divergence=0.25
[2019-06-22 11:00:27] DEBUG: max_coverage_drop_rate=5
[2019-06-22 11:00:27] DEBUG: chimera_window=100
[2019-06-22 11:00:27] DEBUG: min_reads_in_disjointig=4
[2019-06-22 11:00:27] DEBUG: max_inner_reads=10
[2019-06-22 11:00:27] DEBUG: max_inner_fraction=0.25
[2019-06-22 11:00:27] DEBUG: add_unassembled_reads=0
[2019-06-22 11:00:27] DEBUG: max_separation=500
[2019-06-22 11:00:27] DEBUG: tip_length_threshold=100000
[2019-06-22 11:00:27] DEBUG: unique_edge_length=50000
[2019-06-22 11:00:27] DEBUG: min_repeat_res_support=0.51
[2019-06-22 11:00:27] DEBUG: out_paths_ratio=5
[2019-06-22 11:00:27] DEBUG: graph_cov_drop_rate=10
[2019-06-22 11:00:27] DEBUG: coverage_estimate_window=100
[2019-06-22 11:00:27] DEBUG: extend_contigs_with_repeats=1
[2019-06-22 11:00:27] DEBUG: Running with k-mer size: 15
[2019-06-22 11:00:27] DEBUG: Running with minimum overlap 5000
[2019-06-22 11:00:27] DEBUG: Metagenome mode: N
[2019-06-22 11:00:27] INFO: Reading sequences
[2019-06-22 11:01:01] DEBUG: Building positional index
[2019-06-22 11:01:01] DEBUG: Total sequence: 10964270213 bp
[2019-06-22 11:01:01] DEBUG: Expected read coverage: 1096
[2019-06-22 11:01:01] INFO: Generating solid k-mer index
[2019-06-22 11:01:01] DEBUG: Hard threshold set to 5
[2019-06-22 11:01:01] DEBUG: Started k-mer counting
[2019-06-22 11:01:17] INFO: Counting k-mers (1/2):
[2019-06-22 11:02:49] INFO: Counting k-mers (2/2):
[2019-06-22 11:08:39] DEBUG: Estimated minimum kmer coverage: 155
[2019-06-22 11:08:39] DEBUG: Filtered 301351751 erroneous k-mers
[2019-06-22 11:08:39] DEBUG: Repetitive k-mer frequency: 55681
[2019-06-22 11:08:39] DEBUG: Filtered 897 repetitive k-mers (8.98678e-05)
[2019-06-22 11:08:39] INFO: Filling index table
[2019-06-22 11:08:44] DEBUG: Sampling rate: 1
[2019-06-22 11:08:44] DEBUG: Solid k-mers: 9980428
[2019-06-22 11:08:44] DEBUG: K-mer index size: 5380562281
[2019-06-22 11:08:44] DEBUG: Mean k-mer frequency: 539.111
[2019-06-22 11:12:31] DEBUG: Sorting k-mer index
[2019-06-22 11:13:50] DEBUG: Peak RAM usage: 28 Gb
[2019-06-22 11:13:50] INFO: Extending reads
[2019-06-22 11:13:50] DEBUG: Estimating overlap coverage
[2019-06-22 12:54:29] INFO: Overlap-based coverage: 1177
[2019-06-22 12:54:29] INFO: Median overlap divergence: 0.119637
[2019-06-22 12:54:29] DEBUG: Sequence divergence distribution:

|                      *
|                      *
|                    * *
|                   ** **
|                   *****
|                   ******
|                   ********
|                   ********
|                  *********
|                  *********
|                  ***********
|                 ************
|                 ************* *
|                 ************* *
|                 ************* *
|                *****************  *
|                *********************
|                **********************
|               *************************
|             **************************************** * *     ** *
----------------------------------------------------------------------------------------------------
0%        5%        10%       15%       20%       25%       30%       35%       40%       45%

Q25 = 0.1, Q50 = 0.12, Q75 = 0.14

[2019-06-23 17:20:11] INFO: Assembled 0 disjointigs
[2019-06-23 17:20:23] INFO: Generating sequence
[2019-06-23 17:20:23] DEBUG: Writing FASTA
[2019-06-23 17:20:23] DEBUG: Peak RAM usage: 78 Gb
-----------End assembly log------------
[2019-06-23 17:22:11] root: ERROR: No disjointigs were assembled - please check if the read type and genome size parameters are correct

mikolmogorov · 2019-06-27T21:01:20Z

Thank you, indeed looks strange. Maybe high coverage confuses Flye, but I also suspect there might be some non-target reads in the sample.

I suggest trying two more runs: (i) metagenome mode, and (ii) normal mode with --asm-coverage 50, which uses the longest 50x of reads for disjointig assembly. Please post the corresponding logs as well.
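
The idea of using only the longest reads up to a target coverage can be illustrated with a small sketch. It assumes the read lengths are already known; the function name and the toy numbers are hypothetical, and this is not Flye's actual implementation:

```python
def longest_reads_for_coverage(read_lengths, genome_size, target_coverage):
    """Pick the longest reads until their total length reaches
    target_coverage * genome_size (hypothetical helper, not Flye code)."""
    budget = target_coverage * genome_size
    selected, total = [], 0
    for length in sorted(read_lengths, reverse=True):
        if total >= budget:
            break
        selected.append(length)
        total += length
    return selected

# Toy example: a 100 bp "genome" and a 2x target, i.e. a 200 bp budget.
reads = [90, 10, 60, 30, 80, 20]
subset = longest_reads_for_coverage(reads, genome_size=100, target_coverage=2)
print(subset, sum(subset))
```

For the run discussed here, the same budget would be 50 x 10,000,000 = 500,000,000 bp out of the roughly 11 Gb of reads reported in the log.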

StefanoLonardi · 2019-07-18T19:43:12Z

I just finished running Flye with the two settings you suggested. Both runs completed, but the assembly with "--asm-coverage 50" seems better (in terms of N50, total size, etc.)
Thank you

mikolmogorov · 2019-07-21T17:29:47Z

Glad that it helped!

dgiguer · 2019-11-14T19:25:10Z

The solution of normal mode with --asm-coverage 50 has helped in a similar case where lots of overlap is found but no disjointigs are assembled for a plasmid!

ptrebert · 2020-02-13T10:10:03Z

@fenderglass
Could you please take a quick look at the log output for the sample where flye fails to assemble disjointigs:
gist.github.com/ptrebert/3964d66cd60af3e7a19d95d166707ed2

Since I am running flye with --asm-coverage 50 by default, I am a bit unsure how to proceed with this sample.

mikolmogorov · 2020-02-13T20:52:51Z

@ptrebert Seems strange. My only guess would be that PacBio reads might not be properly split into subreads (we had a couple cases like that before). Try to process the reads with https://github.com/fenderglass/pbclip - it should tell you if there is a significant amount of "chimeric" subreads.

Alternatively, you can also try to run with --meta option if the reads turn out ok.

ptrebert · 2020-02-14T11:48:06Z

@fenderglass
Ok, thanks for pointing out your tool, I'll check that and get back to you.

ptrebert · 2020-02-19T08:42:03Z

ping: testing Flye 2.7b-b1562 on sample with no disjointigs assembled - still running...

ptrebert · 2020-02-26T07:35:21Z

@fenderglass
For my problematic sample, flye 2.7b did not solve the issue (same "no disjointigs assembled"). I followed your suggestion and used your pbclip tool, which finished and reported the following:

Good: 15725667 chopped: 409754 bad: 662955

Could you help with interpreting these numbers (I may want to get in touch with the seq lab about this sample)? I'll try to assemble the output FASTA now with flye v2.7b, let's see what happens.

mikolmogorov · 2020-02-27T01:13:19Z

@ptrebert

pbclip finds PacBio reads that were not properly split into subreads. Depending on the DNA library, polymerase might make multiple passes over the fragment (which is used to produce high quality CCS reads). However, fragments in CLR libraries (at least from the assembly perspective) are not expected to be read multiple times to produce longer reads. When multiple passes do happen, such reads should be split into subreads (each subread is a single polymerase pass). Typically this is handled by the PacBio software at the FASTQ generation stage.

The numbers suggest that ~40% of your reads have multiple polymerase passes. This is a lot (typical value could be 1-2%) and suggests that there is indeed an issue with subread splitting. The number of chopped reads are those reads that pbclip was able to split into parts successfully. The bad reads are the reads with the same pattern that pbclip was not able to recover.

Feel free to run the latest Flye version on the output produced by pbclip - I think it should work now. You can also double check with the lab if they performed subread splitting or have raw PacBio files to regenerate valid FASTQ files.

ptrebert · 2020-02-27T08:52:02Z

@fenderglass
Thanks a lot for your detailed explanation. I am not sure, however, I can follow your argument about the 40% "bad" reads:
Total: 16798376
Bad = chopped + bad = 409754 + 662955 = 1072709
% bad = 1072709 / 16798376 ~6.4%
Am I missing something, or did you just misread the "bad" number as 6 million instead of 600k?
In either case, thanks again for all your input, that is very valuable. I'll update this issue as soon as I have the 2.7b results for the corrected reads.
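
The percentage worked out above can be reproduced in a few lines. The three counts are the ones from the pbclip output quoted earlier; the variable names are only illustrative:

```python
# Counts reported by pbclip for this sample (from the output quoted above).
good, chopped, bad = 15_725_667, 409_754, 662_955

total = good + chopped + bad
flagged = chopped + bad          # reads showing the multi-pass pattern
fraction = flagged / total

print(f"total={total} flagged={flagged} fraction={fraction:.1%}")
```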

ptrebert · 2020-03-02T13:43:25Z

probably last comment regarding this: even with the corrected reads (FASTA input now), flye 2.7b fails to assemble disjointigs. Seems like there is something else off about this data...

mikolmogorov · 2020-03-02T17:49:24Z

@ptrebert I see - this could be tricky sometimes. Did you have any luck with other assemblers? Wtdbg2 might be a fast way to check.

ptrebert · 2020-03-03T09:14:06Z

@fenderglass If I find the time, I'll try another assembler. For now, I asked the sequencing centre to double-check everything about this particular sample, let's see if they find something...

ptrebert · 2020-03-17T10:20:51Z

@fenderglass
A postdoc in the sequencing center that produced the problematic data in the first place ran a couple of tests with different input combinations, and also with wtdbg2 as a comparison. Since none of those test runs produced an assembly, it seems fairly clear that the problem is the data. Just out of curiosity, since we have all the flye logs for the different runs, is there any statistic in those log files that could tell us anything about the problem(s) in the data? To me, they all look pretty similar (well, they all failed), so just being thorough here...

mikolmogorov · 2020-03-17T20:12:10Z

@ptrebert good to know, thanks for the update! At this early stage of assembly, not much can be inferred from the logs, I think. I guess if the log shows that "Overlap-based coverage" is reasonable (let's say, >10), but no disjointigs are produced, then there is a problem somewhere.
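
A rough way to apply that rule of thumb across several runs is to pull the two numbers out of each flye.log. The sketch below assumes the log line wording shown earlier in this thread; the function name is hypothetical and the >10 cutoff is just the informal value mentioned above:

```python
import re

COVERAGE_RE = re.compile(r"Overlap-based coverage: (\d+)")
DISJOINTIGS_RE = re.compile(r"Assembled (\d+) disjointigs")

def summarize_log(lines):
    """Return (overlap_based_coverage, disjointig_count) found in log lines.
    Either value is None if the corresponding line is absent."""
    coverage = disjointigs = None
    for line in lines:
        match = COVERAGE_RE.search(line)
        if match:
            coverage = int(match.group(1))
        match = DISJOINTIGS_RE.search(line)
        if match:
            disjointigs = int(match.group(1))
    return coverage, disjointigs

# Two lines copied from the first log in this thread.
sample = [
    "[2019-06-22 12:54:29] INFO: Overlap-based coverage: 1177",
    "[2019-06-23 17:20:11] INFO: Assembled 0 disjointigs",
]
coverage, disjointigs = summarize_log(sample)
suspicious = coverage is not None and coverage > 10 and disjointigs == 0
print(coverage, disjointigs, suspicious)
```

For a real run, the lines could come from `open("flye.log")` instead of the hard-coded list.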

ptrebert · 2020-03-19T13:26:38Z

No, they all show a zero for the "overlap-based coverage". Whatever the problem is, it's in the data then... thanks for all your support!

vappiah · 2020-05-21T16:54:10Z

Hello all, I am working on a Mycobacterium ulcerans genome that was sequenced with Oxford Nanopore technology. I am trying to do de novo assembly with Flye, but I run into a warning and the pipeline stops. The command I used is
flye --nano-raw filename.fa -o outdir -g 0.05m -t 34 -i 2

I get this message below

WARNING: Expected read coverage is 4744, the assembly is not guaranteed to be optimal in this setting. Are you sure that the genome size was entered correctly?
Pipeline aborted

mikolmogorov · 2020-05-23T00:39:49Z

@jotes35 your expected genome size is 50kb (0.05 Mb). It needs to be "5m", not "0.05m" (assuming you are aiming for 5 Mb genome).
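
To see how much the suffix matters, here is a small sketch that converts a size string to base pairs and recomputes the expected coverage. The parser is a hypothetical helper, not Flye's own code, and the integer division is an assumption that happens to match the coverage values printed in the logs in this thread:

```python
def parse_genome_size(text):
    """Convert a size string such as '5m' or '15k' to base pairs
    (hypothetical helper for illustration)."""
    multipliers = {"k": 10**3, "m": 10**6, "g": 10**9}
    text = text.strip().lower()
    if text and text[-1] in multipliers:
        return round(float(text[:-1]) * multipliers[text[-1]])
    return int(text)

def expected_coverage(total_read_length, genome_size):
    """Total bases divided by genome size, rounded down."""
    return total_read_length // genome_size

# Numbers from the first log in this thread: 10964270213 bp of reads, 10m genome.
print(expected_coverage(10964270213, parse_genome_size("10m")))

# '0.05m' means 50 kb, a hundred times smaller than '5m'.
print(parse_genome_size("0.05m"), parse_genome_size("5m"))
```

With a genome size that is 100 times too small, the expected coverage comes out 100 times too high, which is consistent with the 4744x warning above dropping to roughly 47x for a 5 Mb genome.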

vappiah · 2020-05-23T01:31:09Z

Is there a way to know the expected genome size beforehand, please?

vappiah · 2020-05-26T02:06:59Z

@fenderglass is there a way to know the expected genome size before starting the assembly?

mikolmogorov · 2020-05-27T21:39:54Z

@jotes35 Please check the FAQ - it provides some answers to your question. Let me know if anything is unclear.

eyayd · 2020-06-03T09:10:27Z

Hello, I have the same problem, "No disjointigs were assembled". The expected genome size is 110M and my expected coverage is about 49. I tried --meta and different --asm-coverage values (since my overall coverage is smaller than 50x), but it didn't solve the issue. My N50 is quite high; could that be the reason I am getting the error?
P40.pdf

mikolmogorov · 2022-09-12T15:24:52Z

@matteo1313 It seems that you have ~800kb of reads for a bacterium with a genome size of 1.6Mb, so there is simply not enough coverage to assemble. You typically need at least 10x, and 30x+ is recommended.

Also, your read N50 is 70kb, which seems too good to be true for a bacterium - something might be wrong with the input data formatting.
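
As a quick sanity check on both points, a FASTQ file can be scanned for basic formatting problems while summing the read lengths. This is only a sketch: it assumes strict four-line records (no wrapped sequences, no trailing blank lines), and the function name and toy data are made up for illustration:

```python
import io

def read_fastq_lengths(handle):
    """Yield read lengths from a four-line-per-record FASTQ stream,
    raising ValueError on obviously malformed records."""
    while True:
        header = handle.readline()
        if not header:
            break
        seq = handle.readline().rstrip("\n")
        plus = handle.readline()
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record near {header.strip()!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header.strip()!r}")
        yield len(seq)

# Toy input: two reads of 8 bp and 4 bp against a 16 bp "genome".
toy = io.StringIO("@r1\nACGTACGT\n+\nIIIIIIII\n@r2\nACGT\n+\nIIII\n")
lengths = list(read_fastq_lengths(toy))
total = sum(lengths)
coverage = total / 16
print(lengths, total, coverage)
```

With the numbers mentioned above, 800 kb of reads over a 1.6 Mb genome gives 0.5x, far below the 10x minimum.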

Scott-Godwin · 2022-09-30T07:47:24Z

I'm also encountering this error. I'm running Flye as a plugin in Geneious Prime. My data consists of Nanopore reads generated from a cDNA library produced from RNA extracted from a cell culture infected with a virus. I'm trying to assemble the viral genome. I've filtered my reads by mapping against the host transcriptome, but this process is imperfect. I think that of the ~100,000 unmapped reads I have left, about 90% are viral. The virus has a segmented genome consisting of eight segments, with a total size of about 15 Kb. I've tried setting the genome size to various values including 15k, 100k and 2.4g (the approximate size of the host genome), but I keep getting the same error message.

ERROR: No disjointigs were assembled - please check if the read type and genome size parameters are correct

Failed to run: C:\WINDOWS\System32\bash.exe -c '/mnt/c/Users/sgodwin/AppData/Local/Geneious/plugins/Flye/resources/Windows/bin/flye' --nano-corr input_0_Unpaired.fastq --threads 24 --genome-size 15k --meta --iterations 1 --out-dir out >stdout.txt 2>stderr.txt, exit code: 1

Flye reported the following errors:
[2022-09-30 17:43:19] INFO: Starting Flye 2.7-b1585
[2022-09-30 17:43:19] INFO: >>>STAGE: configure
[2022-09-30 17:43:19] INFO: Configuring run
[2022-09-30 17:43:19] INFO: Total read length: 4464863
[2022-09-30 17:43:19] INFO: Input genome size: 15000
[2022-09-30 17:43:19] INFO: Estimated coverage: 297
[2022-09-30 17:43:19] INFO: Reads N50/N90: 699 / 191
[2022-09-30 17:43:19] INFO: Minimum overlap set to 1000
[2022-09-30 17:43:19] INFO: Selected k-mer size: 17
[2022-09-30 17:43:19] INFO: >>>STAGE: assembly
[2022-09-30 17:43:19] INFO: Assembling disjointigs
[2022-09-30 17:43:19] INFO: Reading sequences
[2022-09-30 17:43:20] INFO: Generating solid k-mer index
[2022-09-30 17:43:31] INFO: Counting k-mers (1/2): 00102030405060708090100% 0% 020% % 02030% % 0203040% % 020304050% % 02030405060% % 0203040506070% % 020304050607080% % 02030405060708090% % %
[2022-09-30 17:43:31] INFO: Counting k-mers (2/2): 0% 506% % 604% 60% % 6040% % % 60% % % % % 80% 90% 100%

[2022-09-30 17:43:31] INFO: Filling index table (1/2) 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2022-09-30 17:43:31] INFO: Filling index table (2/2) 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2022-09-30 17:43:32] INFO: Extending reads
[2022-09-30 17:43:51] INFO: Overlap-based coverage: 66
[2022-09-30 17:43:51] INFO: Median overlap divergence: 0.0697123
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2022-09-30 17:43:52] INFO: Assembled 0 disjointigs
[2022-09-30 17:43:52] INFO: Generating sequence
[2022-09-30 17:43:52] ERROR: No disjointigs were assembled - please check if the read type and genome size parameters are correct
[2022-09-30 17:43:52] ERROR: Pipeline aborted

mikolmogorov · 2022-09-30T23:03:39Z

@Scott-Godwin you are using an outdated version of Flye. The latest release (2.9+) was optimized for viral assembly and should work better for you.

katievigil · 2022-10-01T00:55:18Z

I stopped using flye because it did not work on all my virus fastq files. What tools are people using for viral assembly now? I want to try it again. I remember my error was with the genome size.

thanks!

Scott-Godwin · 2022-10-05T05:40:41Z

@fenderglass Can I run Flye 2.9 from a bash terminal on a windows machine? I'm a wet lab guy. I'm a total beginner when it comes to all things bioinformatics.

tolot27 · 2022-10-05T06:55:12Z

@Scott-Godwin No, you can't. But you can install WSL (Windows Subsystem for Linux) and a Linux distribution like Ubuntu.

katievigil · 2022-11-19T22:11:08Z

Hi, I uploaded the new version of Flye and I am still getting "Pipeline aborted".
Thanks!

Also, do you know why Canu can assemble contigs with this fastq file but flye cannot? I am trying to understand the theory behind different long-read de novo assemblers and why some can assemble and some cannot, even though I am using the same fastq file.

Thanks!

flye --nano-raw barcode01.fastq --out-dir barcode01.flye --meta --threads 20
[2022-11-19 15:57:26] INFO: Starting Flye 2.9.1-b1780
[2022-11-19 15:57:26] INFO: >>>STAGE: configure
[2022-11-19 15:57:26] INFO: Configuring run
[2022-11-19 15:57:26] INFO: Total read length: 2427265
[2022-11-19 15:57:26] INFO: Reads N50/N90: 760 / 486
[2022-11-19 15:57:26] INFO: Minimum overlap set to 1000
[2022-11-19 15:57:26] INFO: >>>STAGE: assembly
[2022-11-19 15:57:26] INFO: Assembling disjointigs
[2022-11-19 15:57:26] INFO: Reading sequences
[2022-11-19 15:59:56] INFO: Counting k-mers:
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2022-11-19 16:00:54] INFO: Filling index table (1/2)
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2022-11-19 16:00:54] INFO: Filling index table (2/2)
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2022-11-19 16:00:55] INFO: Extending reads
[2022-11-19 16:00:56] INFO: Overlap-based coverage: 59
[2022-11-19 16:00:56] INFO: Median overlap divergence: 0.191868
0% 100%
[2022-11-19 16:00:56] INFO: Assembled 0 disjointigs
[2022-11-19 16:00:56] INFO: Generating sequence
[2022-11-19 16:00:56] INFO: Filtering contained disjointigs
[2022-11-19 16:00:57] INFO: Contained seqs: 0
[2022-11-19 16:00:57] ERROR: No disjointigs were assembled - please check if the read type and genome size parameters are correct
[2022-11-19 16:00:57] ERROR: Pipeline aborted
(/lustre/project/taw/share/conda-envs/flye) [kvigil@cypress2 Fastq_Concat]$ flye --nano-raw barcode01.fastq --out-dir barcode01.flye --meta --threads 20
[2022-11-19 16:03:02] INFO: Starting Flye 2.9.1-b1780
[2022-11-19 16:03:02] INFO: >>>STAGE: configure
[2022-11-19 16:03:02] INFO: Configuring run
[2022-11-19 16:03:02] INFO: Total read length: 2427265
[2022-11-19 16:03:02] INFO: Reads N50/N90: 760 / 486
[2022-11-19 16:03:02] INFO: Minimum overlap set to 1000
[2022-11-19 16:03:02] INFO: >>>STAGE: assembly
[2022-11-19 16:03:02] INFO: Assembling disjointigs
[2022-11-19 16:03:02] INFO: Reading sequences
[2022-11-19 16:05:35] INFO: Counting k-mers:
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2022-11-19 16:06:36] INFO: Filling index table (1/2)
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2022-11-19 16:06:36] INFO: Filling index table (2/2)
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2022-11-19 16:06:37] INFO: Extending reads
[2022-11-19 16:06:38] INFO: Overlap-based coverage: 59
[2022-11-19 16:06:38] INFO: Median overlap divergence: 0.191868
0% 100%
[2022-11-19 16:06:38] INFO: Assembled 0 disjointigs
[2022-11-19 16:06:38] INFO: Generating sequence
[2022-11-19 16:06:39] INFO: Filtering contained disjointigs
[2022-11-19 16:06:39] INFO: Contained seqs: 0
[2022-11-19 16:06:39] ERROR: No disjointigs were assembled - please check if the read type and genome size parameters are correct
[2022-11-19 16:06:39] ERROR: Pipeline aborted

katievigil · 2022-11-23T19:18:40Z

Looks like my N50 is <1kb, so Flye can't assemble anything where the N50 is <1kb? What does N50 mean?

ChristopherRichie · 2022-11-23T20:07:48Z

https://en.wikipedia.org/wiki/N50,_L50,_and_related_statistics

The N50 statistic defines assembly quality in terms of contiguity. Given a set of contigs, the N50 is defined as the sequence length of the shortest contig at 50% of the total assembly length. It can be thought of as the point of half of the mass of the distribution; the number of bases from all contigs longer than the N50 will be close to the number of bases from all contigs shorter than the N50. For example, consider 9 contigs with the lengths 2, 3, 4, 5, 6, 7, 8, 9, and 10; their sum is 54, half of the sum is 27, and the size of the genome also happens to be 54. 50% of this assembly would be 10 + 9 + 8 = 27 (half the length of the sequence). Thus the N50 = 8, which is the size of the contig which, along with the larger contigs, contains half of the sequence of a particular genome. Note: when comparing N50 values from different assemblies, the assembly sizes must be the same for N50 to be meaningful. N50 can be described as a weighted median statistic such that 50% of the entire assembly is contained in contigs or scaffolds equal to or larger than this value.
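
The worked example in that definition can be checked with a short function. This is a minimal sketch of the standard N50 calculation (the function name is arbitrary); the same calculation applies to read lengths, which is what the "Reads N50/N90" line in the Flye log refers to:

```python
def n50(lengths):
    """Return the length L such that sequences of length >= L
    together cover at least half of the total length."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    raise ValueError("N50 of an empty list is undefined")

# The nine contigs from the example: 10 + 9 + 8 = 27, half of 54.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```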

katievigil · 2023-01-02T20:42:03Z

@ChristopherRichie Thank you! I figured out that metaFlye is based on a De Bruijn graph and Canu is an overlap graph (OLC) based method.

jolespin · 2023-03-27T16:35:52Z

I've had this issue when using --nano-hq (my guppy version was 6.4.6+ae70e8f). When I changed the input to --nano-raw it ran to completion.

PavithraV0223 · 2023-05-03T07:45:59Z

Hello, I'm working with Nanopore data from alpacas. I have tried all the different parameters, but each run gives the same error, and I'm unsure what the problem is. I have been using the adapter- and barcode-trimmed fastq file as input with --nano-raw. I have tried all the troubleshooting steps mentioned above in the discussion but end up with the same error.
I have provided my log file for your reference. I have tried both the meta and the normal mode as well. Your help would be much appreciated.

[2023-04-27 12:58:27] root: INFO: Starting Flye 2.9.2-b1786
[2023-04-27 12:58:27] root: DEBUG: Cmd: /home/pavi/miniconda3/bin/flye --nano-raw /home/pavi/flye/fitered3_MinIONadapt.fastq --out-dir ./flye_output
[2023-04-27 12:58:27] root: DEBUG: Python version: 3.7.16 (default, Jan 17 2023, 22:20:44)
[GCC 11.2.0]
[2023-04-27 12:58:27] root: INFO: >>>STAGE: configure
[2023-04-27 12:58:27] root: INFO: Configuring run
[2023-04-27 12:58:28] root: INFO: Total read length: 252562133
[2023-04-27 12:58:28] root: INFO: Reads N50/N90: 1137 / 994
[2023-04-27 12:58:28] root: INFO: Minimum overlap set to 1000
[2023-04-27 12:58:28] root: INFO: >>>STAGE: assembly
[2023-04-27 12:58:28] root: INFO: Assembling disjointigs
[2023-04-27 12:58:28] root: DEBUG: -----Begin assembly log------
[2023-04-27 12:58:28] root: DEBUG: Running: flye-modules assemble --reads /home/pavi/flye/fitered3_MinIONadapt.fastq --out-asm /home/pavi/flye/flye_output/00-assembly/draft_assembly.fasta --config /home/pavi/miniconda3/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg --log /home/pavi/flye/flye_output/flye.log --threads 1 --min-ovlp 1000
[2023-04-27 12:58:28] DEBUG: Build date: Mar 27 2023 14:17:04
[2023-04-27 12:58:28] DEBUG: Total RAM: 22 Gb
[2023-04-27 12:58:28] DEBUG: Available RAM: 19 Gb
[2023-04-27 12:58:28] DEBUG: Total CPUs: 7
[2023-04-27 12:58:28] DEBUG: Loading /home/pavi/miniconda3/lib/python3.7/site-packages/flye/config/bin_cfg/asm_raw_reads.cfg
[2023-04-27 12:58:28] DEBUG: Loading /home/pavi/miniconda3/lib/python3.7/site-packages/flye/config/bin_cfg/asm_defaults.cfg
[2023-04-27 12:58:28] DEBUG: big_genome_threshold=29000000
[2023-04-27 12:58:28] DEBUG: meta_read_filter_kmer_freq=100
[2023-04-27 12:58:28] DEBUG: chain_large_gap_penalty=2
[2023-04-27 12:58:28] DEBUG: chain_small_gap_penalty=0.5
[2023-04-27 12:58:28] DEBUG: chain_gap_jump_threshold=100
[2023-04-27 12:58:28] DEBUG: max_coverage_drop_rate=5
[2023-04-27 12:58:28] DEBUG: max_extensions_drop_rate=5
[2023-04-27 12:58:28] DEBUG: chimera_window=100
[2023-04-27 12:58:28] DEBUG: chimera_overhang=1000
[2023-04-27 12:58:28] DEBUG: min_reads_in_disjointig=4
[2023-04-27 12:58:28] DEBUG: max_inner_reads=10
[2023-04-27 12:58:28] DEBUG: max_inner_fraction=0.25
[2023-04-27 12:58:28] DEBUG: max_separation=500
[2023-04-27 12:58:28] DEBUG: unique_edge_length=50000
[2023-04-27 12:58:28] DEBUG: min_repeat_res_support=0.51
[2023-04-27 12:58:28] DEBUG: out_paths_ratio=5
[2023-04-27 12:58:28] DEBUG: graph_cov_drop_rate=5
[2023-04-27 12:58:28] DEBUG: coverage_estimate_window=100
[2023-04-27 12:58:28] DEBUG: max_bubble_length=50000
[2023-04-27 12:58:28] DEBUG: loop_coverage_rate=1.5
[2023-04-27 12:58:28] DEBUG: repeat_edge_cov_mult=1.75
[2023-04-27 12:58:28] DEBUG: weak_detach_rate=5
[2023-04-27 12:58:28] DEBUG: tip_coverage_rate=2
[2023-04-27 12:58:28] DEBUG: tip_length_rate=2
[2023-04-27 12:58:28] DEBUG: output_gfa_before_rr=0
[2023-04-27 12:58:28] DEBUG: remove_alt_edges=0
[2023-04-27 12:58:28] DEBUG: low_cutoff_warning=1
[2023-04-27 12:58:28] DEBUG: kmer_size=17
[2023-04-27 12:58:28] DEBUG: use_minimizers=0
[2023-04-27 12:58:28] DEBUG: reads_base_alignment=0
[2023-04-27 12:58:28] DEBUG: meta_read_top_kmer_rate=0.40
[2023-04-27 12:58:28] DEBUG: maximum_jump=1500
[2023-04-27 12:58:28] DEBUG: maximum_overhang=1500
[2023-04-27 12:58:28] DEBUG: repeat_kmer_rate=100
[2023-04-27 12:58:28] DEBUG: assemble_ovlp_divergence=0.10
[2023-04-27 12:58:28] DEBUG: assemble_divergence_relative=1
[2023-04-27 12:58:28] DEBUG: repeat_graph_ovlp_divergence=0.08
[2023-04-27 12:58:28] DEBUG: read_align_ovlp_divergence=0.25
[2023-04-27 12:58:28] DEBUG: hpc_scoring_on=0
[2023-04-27 12:58:28] DEBUG: add_unassembled_reads=0
[2023-04-27 12:58:28] DEBUG: extend_contigs_with_repeats=0
[2023-04-27 12:58:28] DEBUG: min_read_cov_cutoff=3
[2023-04-27 12:58:28] DEBUG: short_tip_length=20000
[2023-04-27 12:58:28] DEBUG: long_tip_length=100000
[2023-04-27 12:58:28] DEBUG: Running with k-mer size: 17
[2023-04-27 12:58:28] DEBUG: Running with minimum overlap 1000
[2023-04-27 12:58:28] DEBUG: Metagenome mode: N
[2023-04-27 12:58:28] DEBUG: Short mode: N
[2023-04-27 12:58:28] INFO: Reading sequences
[2023-04-27 12:58:29] DEBUG: Building positional index
[2023-04-27 12:58:29] DEBUG: Total sequence: 224735072 bp
[2023-04-27 12:58:31] INFO: Counting k-mers:
[2023-04-27 12:59:01] DEBUG: Updating k-mer histogram
[2023-04-27 12:59:39] DEBUG: Hash size: 1033102
[2023-04-27 12:59:39] DEBUG: Total k-mers 40609435
[2023-04-27 12:59:39] INFO: Filling index table (1/2)
[2023-04-27 13:00:49] DEBUG: Mean k-mer frequency: 340.156
[2023-04-27 13:00:49] DEBUG: Repetitive k-mer frequency: 34015
[2023-04-27 13:00:49] DEBUG: Filtered 28293692 repetitive k-mers (0.319157)
[2023-04-27 13:00:49] INFO: Filling index table (2/2)
[2023-04-27 13:01:59] DEBUG: Sorting k-mer index
[2023-04-27 13:02:00] DEBUG: Selected k-mers: 354076
[2023-04-27 13:02:00] DEBUG: Index size: 60427371
[2023-04-27 13:02:00] DEBUG: Mean k-mer index frequency: 170.662
[2023-04-27 13:02:00] DEBUG: Peak RAM usage: 8 Gb
[2023-04-27 13:02:00] DEBUG: Estimating k-mer identity bias
[2023-04-27 13:04:53] DEBUG: Initial divergence estimate : 0.234128
[2023-04-27 13:04:53] DEBUG: Relative threshold: Y
[2023-04-27 13:04:53] DEBUG: Max divergence threshold set to 0.334128
[2023-04-27 13:04:53] INFO: Extending reads
[2023-04-27 13:04:53] DEBUG: Estimating overlap coverage
[2023-04-27 13:07:48] INFO: Overlap-based coverage: 205
[2023-04-27 13:07:48] INFO: Median overlap divergence: 0.234818
[2023-04-27 13:07:48] DEBUG: Sequence divergence distribution:

[histogram of sequence divergence; x-axis 0% to 45%]

Q25 = 0.21, Q50 = 0.23, Q75 = 0.26
[2023-04-27 22:07:06] INFO: Assembled 0 disjointigs
[2023-04-27 22:07:06] INFO: Generating sequence
[2023-04-27 22:07:06] DEBUG: Building positional index
[2023-04-27 22:07:06] DEBUG: Mean k-mer frequency: 0
[2023-04-27 22:07:06] DEBUG: Repetitive k-mer frequency: 0
[2023-04-27 22:07:06] DEBUG: Filtered 0 repetitive k-mers (-nan)
[2023-04-27 22:07:06] DEBUG: Sorting k-mer index
[2023-04-27 22:07:06] DEBUG: Selected k-mers: 0
[2023-04-27 22:07:06] DEBUG: K-mer index size: 0
[2023-04-27 22:07:06] DEBUG: Mean k-mer frequency: -nan
[2023-04-27 22:07:06] DEBUG: Minimizer rate: -nan
[2023-04-27 22:07:06] INFO: Filtering contained disjointigs
[2023-04-27 22:07:06] DEBUG: Computing transitive closure for overlaps
[2023-04-27 22:07:06] DEBUG: Found 0 overlaps
[2023-04-27 22:07:06] DEBUG: Left 0 overlaps after filtering
[2023-04-27 22:07:06] INFO: Contained seqs: 0
[2023-04-27 22:07:06] DEBUG: Writing FASTA
[2023-04-27 22:07:06] DEBUG: Peak RAM usage: 8 Gb
-----------End assembly log------------
[2023-04-27 22:07:06] root: ERROR: No disjointigs were assembled - please check if the read type and genome size parameters are correct
[2023-04-27 22:07:06] root: ERROR: Pipeline aborted

mikolmogorov · 2023-05-03T15:01:48Z

@PavithraV0223 could you tell us more about your sample, and attach the log from a --meta run? The reads also look very short (about 1 kb N50); is this some kind of amplicon sequencing?
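For readers unsure what the N50 figure in the log measures, here is a minimal sketch of the usual definition: the largest length L such that reads of length L or longer contain at least half of all bases. The function name `read_nx` and the toy lengths are made up for illustration; this is not Flye's own code.

```python
def read_nx(lengths, fraction):
    """Return the Nx value: the largest length L such that reads of
    length >= L together cover at least `fraction` of all bases."""
    if not lengths:
        raise ValueError("no read lengths given")
    target = fraction * sum(lengths)
    covered = 0
    # Walk from the longest read down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        covered += length
        if covered >= target:
            return length
    return min(lengths)  # only reached if fraction > 1


# Hypothetical toy read set (12,000 bases in total), not from the thread.
toy_lengths = [5000, 3000, 1200, 900, 800, 600, 500]
n50 = read_nx(toy_lengths, 0.5)
n90 = read_nx(toy_lengths, 0.9)
print(n50, n90)
```

An N50 near 1 kb therefore means that half of the sequenced bases sit in reads of roughly 1 kb or shorter, which is why the read length is being questioned here.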

emmannaemeka · 2023-05-22T08:35:50Z

Hello, I am having a similar issue. I have tried both --meta mode and --asm-coverage 50 without success.

[2023-05-22 09:29:27] root: INFO: Starting Flye 2.9-b1778
[2023-05-22 09:29:27] root: DEBUG: Cmd: /Users/pamluka/Desktop/programs_bioinformatics/Flye/bin/flye --meta --nano-raw /Users/pamluka/Desktop/UNGSM/sample_6/Sample-06-X-2022_fastq.fastq.gz -o /Users/pamluka/Desktop/UNGSM
[2023-05-22 09:29:27] root: DEBUG: Python version: 3.6.15 | packaged by conda-forge | (default, Dec 3 2021, 18:49:43)
[GCC Clang 11.1.0]
[2023-05-22 09:29:27] root: INFO: >>>STAGE: configure
[2023-05-22 09:29:27] root: INFO: Configuring run
[2023-05-22 09:29:37] root: INFO: Total read length: 229301908
[2023-05-22 09:29:37] root: INFO: Reads N50/N90: 353 / 282
[2023-05-22 09:29:37] root: INFO: Minimum overlap set to 1000
[2023-05-22 09:29:37] root: INFO: >>>STAGE: assembly
[2023-05-22 09:29:37] root: INFO: Assembling disjointigs
[2023-05-22 09:29:37] root: DEBUG: -----Begin assembly log------
[2023-05-22 09:29:37] root: DEBUG: Running: flye-modules assemble --reads /Users/pamluka/Desktop/UNGSM/sample_6/Sample-06-X-2022_fastq.fastq.gz --out-asm /Users/pamluka/Desktop/UNGSM/00-assembly/draft_assembly.fasta --config /Users/pamluka/Desktop/programs_bioinformatics/Flye/flye/config/bin_cfg/asm_raw_reads.cfg --log /Users/pamluka/Desktop/UNGSM/flye.log --threads 1 --meta --min-ovlp 1000
[2023-05-22 09:29:37] DEBUG: Build date: Jun 7 2022 09:22:15
[2023-05-22 09:29:37] DEBUG: Total RAM: 16 Gb
[2023-05-22 09:29:37] DEBUG: Available RAM: 0 Gb
[2023-05-22 09:29:37] DEBUG: Total CPUs: 8
[2023-05-22 09:29:37] DEBUG: Loading /Users/pamluka/Desktop/programs_bioinformatics/Flye/flye/config/bin_cfg/asm_raw_reads.cfg
[2023-05-22 09:29:37] DEBUG: Loading /Users/pamluka/Desktop/programs_bioinformatics/Flye/flye/config/bin_cfg/asm_defaults.cfg
[2023-05-22 09:29:37] DEBUG: big_genome_threshold=29000000
[2023-05-22 09:29:37] DEBUG: meta_read_filter_kmer_freq=100
[2023-05-22 09:29:37] DEBUG: chain_large_gap_penalty=2
[2023-05-22 09:29:37] DEBUG: chain_small_gap_penalty=0.5
[2023-05-22 09:29:37] DEBUG: chain_gap_jump_threshold=100
[2023-05-22 09:29:37] DEBUG: max_coverage_drop_rate=5
[2023-05-22 09:29:37] DEBUG: max_extensions_drop_rate=5
[2023-05-22 09:29:37] DEBUG: chimera_window=100
[2023-05-22 09:29:37] DEBUG: chimera_overhang=1000
[2023-05-22 09:29:37] DEBUG: min_reads_in_disjointig=4
[2023-05-22 09:29:37] DEBUG: max_inner_reads=10
[2023-05-22 09:29:37] DEBUG: max_inner_fraction=0.25
[2023-05-22 09:29:37] DEBUG: max_separation=500
[2023-05-22 09:29:37] DEBUG: unique_edge_length=50000
[2023-05-22 09:29:37] DEBUG: min_repeat_res_support=0.51
[2023-05-22 09:29:37] DEBUG: out_paths_ratio=5
[2023-05-22 09:29:37] DEBUG: graph_cov_drop_rate=5
[2023-05-22 09:29:37] DEBUG: coverage_estimate_window=100
[2023-05-22 09:29:37] DEBUG: max_bubble_length=50000
[2023-05-22 09:29:37] DEBUG: loop_coverage_rate=1.5
[2023-05-22 09:29:37] DEBUG: repeat_edge_cov_mult=1.75
[2023-05-22 09:29:37] DEBUG: weak_detach_rate=5
[2023-05-22 09:29:37] DEBUG: tip_coverage_rate=2
[2023-05-22 09:29:37] DEBUG: tip_length_rate=2
[2023-05-22 09:29:37] DEBUG: output_gfa_before_rr=0
[2023-05-22 09:29:37] DEBUG: remove_alt_edges=0
[2023-05-22 09:29:37] DEBUG: low_cutoff_warning=1
[2023-05-22 09:29:37] DEBUG: kmer_size=17
[2023-05-22 09:29:37] DEBUG: use_minimizers=0
[2023-05-22 09:29:37] DEBUG: reads_base_alignment=0
[2023-05-22 09:29:37] DEBUG: meta_read_top_kmer_rate=0.40
[2023-05-22 09:29:37] DEBUG: maximum_jump=1500
[2023-05-22 09:29:37] DEBUG: maximum_overhang=1500
[2023-05-22 09:29:37] DEBUG: repeat_kmer_rate=100
[2023-05-22 09:29:37] DEBUG: assemble_ovlp_divergence=0.10
[2023-05-22 09:29:37] DEBUG: assemble_divergence_relative=1
[2023-05-22 09:29:37] DEBUG: repeat_graph_ovlp_divergence=0.08
[2023-05-22 09:29:37] DEBUG: read_align_ovlp_divergence=0.25
[2023-05-22 09:29:37] DEBUG: hpc_scoring_on=0
[2023-05-22 09:29:37] DEBUG: add_unassembled_reads=0
[2023-05-22 09:29:37] DEBUG: extend_contigs_with_repeats=0
[2023-05-22 09:29:37] DEBUG: min_read_cov_cutoff=3
[2023-05-22 09:29:37] DEBUG: short_tip_length=20000
[2023-05-22 09:29:37] DEBUG: long_tip_length=100000
[2023-05-22 09:29:37] DEBUG: Running with k-mer size: 17
[2023-05-22 09:29:37] DEBUG: Running with minimum overlap 1000
[2023-05-22 09:29:37] DEBUG: Metagenome mode: Y
[2023-05-22 09:29:37] DEBUG: Short mode: N
[2023-05-22 09:29:37] INFO: Reading sequences
[2023-05-22 09:29:42] DEBUG: Building positional index
[2023-05-22 09:29:42] DEBUG: Total sequence: 3440345 bp
[2023-05-22 09:29:46] INFO: Counting k-mers:
[2023-05-22 09:29:47] DEBUG: Updating k-mer histogram
[2023-05-22 09:30:31] DEBUG: Hash size: 10893
[2023-05-22 09:30:31] DEBUG: Total k-mers 1848766
[2023-05-22 09:30:31] INFO: Filling index table (1/2)
[2023-05-22 09:30:32] DEBUG: Mean k-mer frequency: 7.46855
[2023-05-22 09:30:32] DEBUG: Repetitive k-mer frequency: 746
[2023-05-22 09:30:32] DEBUG: Filtered 5983 repetitive k-mers (0.00455754)
[2023-05-22 09:30:32] INFO: Filling index table (2/2)
[2023-05-22 09:30:34] DEBUG: Sorting k-mer index
[2023-05-22 09:30:34] DEBUG: Selected k-mers: 220513
[2023-05-22 09:30:34] DEBUG: Index size: 1350695
[2023-05-22 09:30:34] DEBUG: Mean k-mer index frequency: 6.12524
[2023-05-22 09:30:34] DEBUG: Peak RAM usage: 8 Gb
[2023-05-22 09:30:34] DEBUG: Estimating k-mer identity bias
[2023-05-22 09:30:35] DEBUG: Initial divergence estimate : 0.0703537
[2023-05-22 09:30:35] DEBUG: Relative threshold: Y
[2023-05-22 09:30:35] DEBUG: Max divergence threshold set to 0.170354
[2023-05-22 09:30:35] INFO: Extending reads
[2023-05-22 09:30:35] DEBUG: Estimating overlap coverage
[2023-05-22 09:30:37] INFO: Overlap-based coverage: 1
[2023-05-22 09:30:37] INFO: Median overlap divergence: 0.0717406
[2023-05-22 09:30:37] DEBUG: Sequence divergence distribution:

|              *                   |                                                                 
|              *                   |                                                                 
|              *                   |                                                                 
|              *                   |                                                                 
|              *                   |                                                                 
|              *                   |                                                                 
|             **                   |                                                                 
|             **                   |                                                                 
|            ***                   |                                                                 
|           ****                   |                                                                 
|           ****                   |                                                                 
|           *****                  |                                                                 
|           *****                  |                                                                 
|           ******                 |                                                                 
|           ****** *               |                                                                 
|           ********               |                                                                 
|          *********               |                                                                 
|          *********  **           |                                                                 
|       *  ********* ***           |                 *                                               
|      ** ************** * *       | *           *   *  *   *                                        
----------------------------------------------------------------------------------------------------
0%        5%        10%       15%       20%       25%       30%       35%       40%       45%       

Q25 = 0.064, Q50 = 0.072, Q75 = 0.083

[2023-05-22 09:30:42] INFO: Assembled 0 disjointigs
[2023-05-22 09:30:42] INFO: Generating sequence
[2023-05-22 09:30:42] DEBUG: Building positional index
[2023-05-22 09:30:42] DEBUG: Mean k-mer frequency: 0
[2023-05-22 09:30:42] DEBUG: Repetitive k-mer frequency: 0
[2023-05-22 09:30:42] DEBUG: Filtered 0 repetitive k-mers (nan)
[2023-05-22 09:30:42] DEBUG: Sorting k-mer index
[2023-05-22 09:30:42] DEBUG: Selected k-mers: 0
[2023-05-22 09:30:42] DEBUG: K-mer index size: 0
[2023-05-22 09:30:42] DEBUG: Mean k-mer frequency: nan
[2023-05-22 09:30:42] DEBUG: Minimizer rate: nan
[2023-05-22 09:30:42] INFO: Filtering contained disjointigs
[2023-05-22 09:30:42] DEBUG: Computing transitive closure for overlaps
[2023-05-22 09:30:42] DEBUG: Found 0 overlaps
[2023-05-22 09:30:42] DEBUG: Left 0 overlaps after filtering
[2023-05-22 09:30:42] INFO: Contained seqs: 0
[2023-05-22 09:30:42] DEBUG: Writing FASTA
[2023-05-22 09:30:42] DEBUG: Peak RAM usage: 8 Gb
-----------End assembly log------------
[2023-05-22 09:30:42] root: ERROR: No disjointigs were assembled - please check if the read type and genome size parameters are correct
[2023-05-22 09:30:42] root: ERROR: Pipeline aborted
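Two of the derived numbers in this log can be reproduced from the config values printed above them. The sketch below assumes, based only on the logged numbers, that the relative divergence threshold is the initial estimate plus `assemble_ovlp_divergence`, and that the repetitive k-mer cutoff is the mean k-mer frequency times `repeat_kmer_rate`, truncated. The function names are hypothetical, and Flye's actual implementation may differ.

```python
import math

# Values copied from the debug log above.
assemble_ovlp_divergence = 0.10      # config value
assemble_divergence_relative = True  # config value (logged as 1)
initial_divergence = 0.0703537       # "Initial divergence estimate"
repeat_kmer_rate = 100               # config value
mean_kmer_frequency = 7.46855        # "Mean k-mer frequency"


def max_divergence(initial, allowed, relative):
    # Assumption: in relative mode the allowed divergence is added
    # on top of the estimate; otherwise it is used as an absolute cap.
    return initial + allowed if relative else allowed


def repetitive_cutoff(mean_freq, rate):
    # Assumption: cutoff = mean frequency * rate, truncated to an integer.
    return int(mean_freq * rate)


threshold = max_divergence(
    initial_divergence, assemble_ovlp_divergence, assemble_divergence_relative
)
cutoff = repetitive_cutoff(mean_kmer_frequency, repeat_kmer_rate)
print(f"{threshold:.6f}", cutoff)
```

The results match the logged "Max divergence threshold set to 0.170354" and "Repetitive k-mer frequency: 746". The same arithmetic is consistent with the 2023-04-27 log earlier in the thread (0.234128 + 0.10 = 0.334128, and 340.156 x 100 truncated to 34015).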

mikolmogorov · 2023-05-29T16:59:28Z

@emmannaemeka it looks like you're assembling very short reads (your log reports a read N50 of 353 bp, against a minimum overlap of 1000). Flye really needs reads of at least a few kb to work.
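This check can be run on a log before waiting for the assembly stage. The sketch below pulls the "Reads N50/N90" and "Minimum overlap set to" lines out of a Flye log, in the format shown in this thread, and flags the run when the N50 is below the minimum overlap. The function name and the "N50 below minimum overlap" rule are an illustrative heuristic, not something Flye itself reports.

```python
import re

N50_RE = re.compile(r"Reads N50/N90:\s*(\d+)\s*/\s*(\d+)")
MIN_OVLP_RE = re.compile(r"Minimum overlap set to\s*(\d+)")


def check_read_lengths(log_text):
    """Return N50, minimum overlap, and a 'too short' flag, or None
    if either line is missing from the log text."""
    n50_match = N50_RE.search(log_text)
    ovlp_match = MIN_OVLP_RE.search(log_text)
    if n50_match is None or ovlp_match is None:
        return None
    n50 = int(n50_match.group(1))
    min_overlap = int(ovlp_match.group(1))
    # Heuristic (assumption): reads whose N50 is below the minimum
    # overlap are unlikely to produce usable overlaps.
    return {"n50": n50, "min_overlap": min_overlap, "too_short": n50 < min_overlap}


# Lines copied from the log posted above.
sample_log = """\
[2023-05-22 09:29:37] root: INFO: Total read length: 229301908
[2023-05-22 09:29:37] root: INFO: Reads N50/N90: 353 / 282
[2023-05-22 09:29:37] root: INFO: Minimum overlap set to 1000
"""
report = check_read_lengths(sample_log)
print(report)
```

For the log above the flag is set, which fits the later "Overlap-based coverage: 1" and "Assembled 0 disjointigs" lines.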

miniluphy · 2024-04-09T09:47:52Z

I encountered a similar issue.
Using the latest version (2.9.3) with --pacbio-hifi input, I received the following error message:

INFO: Overlap-based coverage: 0
ERROR: No disjointigs were assembled - please check if the read type and genome size parameters are correct.

I have attached the log file.

Upon checking the fastq.gz file via pbclip, the result shows:
Good: 152693 chopped: 5824 bad: 1897.

The --meta result is similar. How should I resolve this issue?
Thank you
flye.log
flye-meta.log

mikolmogorov · 2024-04-12T14:46:27Z

@miniluphy your read error rate is ~13%, so these are not HiFi reads. If the data is PacBio, use --pacbio-raw instead.
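A rough way to spot this kind of read-type mismatch is to compare the divergence reported in the log with what the chosen mode expects. The thread does not say where the ~13% figure came from; the sketch below assumes the "Median overlap divergence" log line is a reasonable proxy, and the 0.03 cutoff for HiFi-like data is a hypothetical value chosen for illustration, not a Flye parameter.

```python
import re

DIVERGENCE_RE = re.compile(r"Median overlap divergence:\s*([0-9.]+)")

# Hypothetical cutoff: HiFi reads are expected to be far more accurate
# than raw reads, so a high divergence suggests the wrong read type.
HIFI_MAX_DIVERGENCE = 0.03


def median_divergence(log_text):
    """Extract the median overlap divergence from a Flye log, if present."""
    match = DIVERGENCE_RE.search(log_text)
    return float(match.group(1)) if match else None


def looks_like_hifi(log_text, cutoff=HIFI_MAX_DIVERGENCE):
    """Return True/False, or None if the log has no divergence line."""
    divergence = median_divergence(log_text)
    if divergence is None:
        return None
    return divergence <= cutoff


# Line format taken from a log posted earlier in this thread.
sample_line = "[2023-05-22 09:30:37] INFO: Median overlap divergence: 0.0717406"
print(median_divergence(sample_line), looks_like_hifi(sample_line))
```

A `False` result is only a hint to re-check the read-type flag (for example --pacbio-raw versus --pacbio-hifi); it is not a measurement of the true error rate.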

SinaedaA · 2024-07-31T14:42:48Z

Hi,

I have more of a conceptual question, which came up while solving an issue similar to the one in this thread.
I tried both suggested fixes for the "No disjointigs were assembled" error (--asm-coverage and --meta), and both worked. The problem did seem to be very high starting coverage (the normal run reports "Overlap-based coverage: 515"), which Flye doesn't handle well.

With --asm-coverage 50, I get a single contig at the end of the assembly (length 3,568,177 bp, coverage 490), which is what I expect, since I'm supposed to be working with pure bacterial strains.
With --meta, I get 5 contigs: the longest is 3,568,354 bp (coverage 491), and the other 4 are between 6,000 and 35,000 bp long with a very low coverage of 3.

I guess the --meta flag tells Flye that several strains may be present, so it behaves differently, but I'm worried that my strain might not have been pure (although we checked it beforehand by 16S amplification and sequencing). If it wasn't pure, would Flye return two contigs of similar length and coverage?

If this question doesn't belong here and should be a separate issue, I will move it :)

Thank you for any additional information you can provide !
(PS: I didn't think the log files were super important in this case, but if needed I will upload them)
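For context on why --asm-coverage helps with very deep datasets: Flye's documentation describes it as using a subset of the longest reads for the initial disjointig assembly. The sketch below illustrates that idea only. The function name, the exact stopping rule, and the toy numbers are assumptions, not Flye's implementation.

```python
def select_longest_reads(lengths, genome_size, target_coverage):
    """Pick the longest reads until their total length reaches
    target_coverage x genome_size (illustrative stopping rule)."""
    budget = genome_size * target_coverage
    chosen, total = [], 0
    for length in sorted(lengths, reverse=True):
        if total >= budget:
            break
        chosen.append(length)
        total += length
    return chosen


# Hypothetical toy numbers: a 1 kb "genome" sequenced with six reads.
toy_reads = [900, 800, 700, 600, 500, 400]
subset = select_longest_reads(toy_reads, genome_size=1000, target_coverage=2)
print(subset, sum(subset) / 1000)
```

Under this reading, the final coverage reported for a contig can still be the full depth (490 in the run above), because only the initial disjointig step uses the reduced read set.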

mikolmogorov · 2024-08-16T19:37:42Z

@SinaedaA sorry for the late response! The shorter fragments may be plasmids. You can visualize the assembly graph with Bandage to see whether they form separate connected components and are circular. To check for strain heterogeneity, run Flye with --keep-haplotypes and see whether it produces additional "bubbles" on your chromosome. You can also try our new tool for strain profiling: https://github.com/katerinakazantseva/stRainy
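The same check can be approximated without a viewer by reading the graph file directly. The sketch below is a generic GFA1 parser, not part of Flye or Bandage: it groups segments (`S` lines) into connected components via their links (`L` lines) and marks a single-segment component as circular when the segment links to itself in the same orientation. The segment names in the toy graph are made up.

```python
from collections import defaultdict


def gfa_components(gfa_text):
    """Group GFA1 segments into connected components (union-find) and
    flag single-segment components that carry a same-orientation self-link."""
    parent = {}
    self_loops = set()

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for line in gfa_text.splitlines():
        fields = line.split("\t")
        if fields[0] == "S":
            parent.setdefault(fields[1], fields[1])
        elif fields[0] == "L":
            a, a_orient, b, b_orient = fields[1:5]
            parent.setdefault(a, a)
            parent.setdefault(b, b)
            if a == b and a_orient == b_orient:
                self_loops.add(a)
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for segment in parent:
        groups[find(segment)].append(segment)
    components = [
        {
            "segments": sorted(members),
            "circular": len(members) == 1 and members[0] in self_loops,
        }
        for members in groups.values()
    ]
    return sorted(components, key=lambda c: c["segments"])


# Hypothetical toy graph: one circular segment and one two-segment component.
toy_gfa = "\n".join(
    "\t".join(fields)
    for fields in [
        ("H", "VN:Z:1.0"),
        ("S", "edge_1", "ACGT"),
        ("S", "edge_2", "GGCC"),
        ("S", "edge_3", "TTAA"),
        ("L", "edge_1", "+", "edge_1", "+", "0M"),
        ("L", "edge_2", "+", "edge_3", "+", "0M"),
    ]
)
result = gfa_components(toy_gfa)
print(result)
```

A small, circular component that is disconnected from the chromosome would fit the plasmid explanation, although with a coverage of only 3 the short contigs deserve a closer look either way.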

mikolmogorov closed this as completed Jul 21, 2019

mikolmogorov mentioned this issue Feb 6, 2020

No disjointigs were assembled #211

Closed

mikolmogorov reopened this Feb 13, 2020

mikolmogorov added the question label Feb 13, 2020

mikolmogorov changed the title ~~No disjointigs were assembled~~ Flye does not generate any output ("No disjointigs were assembled" message) Feb 13, 2020

plnspineda mentioned this issue Nov 6, 2022

Stuck at Sequence divergence distribution but I am not getting any error #545

Closed

mikolmogorov mentioned this issue Dec 15, 2022

ERROR: No disjointigs were assembled #557

Closed

mikolmogorov added the discussion label Mar 18, 2023

mikolmogorov mentioned this issue May 2, 2023

No disjointigs were assembled - please check if the read type and genome size parameters are correct #600

Closed

granek added a commit to granek/wf-bacterial-genomes that referenced this issue Aug 4, 2023

Trying to fix fly failure

c86ffd4

Trying to fix Error "No disjointigs were assembled", based on mikolmogorov/Flye#128

granek added a commit to granek/wf-bacterial-genomes that referenced this issue Aug 4, 2023

Still trying to fix flye failure

a7263cf

trying --meta, the other suggestion from mikolmogorov/Flye#128 --asm-coverage requires genome size estimate

bsalehe mentioned this issue Aug 4, 2023

Got "No disjointigs were assembled" error when running denovo epi2me-labs/wf-bacterial-genomes#17

Closed

bgruening mentioned this issue Nov 5, 2023

Flye error for PacBio sequence galaxyproject/galaxy#16980

Closed

mikolmogorov mentioned this issue May 22, 2024

assembly error No disjointigs were assembled #704

Closed

mikolmogorov mentioned this issue Jun 10, 2024

No disjointigs were assembled - please check if the read type and genome size parameters are correct #708

Closed

pdobbler mentioned this issue Oct 17, 2024

[Feature request] Add extra flags for running Flye gbouras13/hybracter#101

Closed
