Hi SALSA authors,
Thanks for providing the tool. I have a question about how it scaffolds contigs; details are below.
I have a heterozygous diploid genome and am trying to scaffold an initial set of 9727 contigs (N50: ~55 kb, from canu) to chromosome level, using ~40x Hi-C coverage and the canu-generated .gfa file. The ~40x figure is after filtering out non-uniquely mapped reads, which left an alignment.bed file of 8.3 Gb with 124 million lines. After running SALSA, however, the output contained 6294 contigs (N50: ~220 kb), which is far from the chromosome-level result I expected.
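For reference, the N50 values quoted above can be recomputed from a FASTA index. The following is a minimal sketch, assuming a samtools-style .fai file with the sequence name in column 1 and its length in column 2; the function names `n50` and `lengths_from_fai` are illustrative and not part of SALSA.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (0 if empty).

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0


def lengths_from_fai(path):
    """Read sequence lengths from column 2 of a .fai index (assumed layout)."""
    with open(path) as handle:
        return [int(line.split("\t")[1]) for line in handle if line.strip()]
```

Calling `n50(lengths_from_fai("ln_contigs.fasta.fai"))` on the input index, and again on an index of the SALSA output, would give the two numbers to compare.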
The command I used was:
python /bin/SALSA/run_pipeline.py -a ln_contigs.fasta -l ln_contigs.fasta.fai -g asm.unitigs.gfa -m yes -b alignment.bed -e GATC -o scaffolds_utgs -s 480000000 >salsa2_run_pipeline.log
It finished successfully after 5 iterations, though without the expected result.
The log is below. Is it normal to see "Hi-C implied edges = 0"? And can you spot any other message that would help me adjust the parameters for a rerun?
Or could it be that the initial N50 was not good enough?
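One check that might help narrow this down is counting how many Hi-C read pairs in the BED file actually connect two different contigs. The sketch below is only an assumption-laden illustration: it assumes the BED file has the contig name in column 1 and the read name in column 4, that mates share a name with a `/1` or `/2` suffix, and that the file is sorted by read name so mates sit on adjacent lines. `link_summary` is a hypothetical helper, not a SALSA function.

```python
from itertools import groupby


def _pair_id(line):
    """Read name from column 4 with any trailing /1 or /2 mate suffix removed."""
    return line.rstrip("\n").split("\t")[3].rsplit("/", 1)[0]


def link_summary(bed_lines):
    """Classify read pairs as single-end, intra-contig, or inter-contig.

    Assumes the lines are sorted by read name so that both mates of a
    pair are adjacent; lines with fewer than 4 columns are skipped.
    """
    counts = {"single_end": 0, "intra_contig": 0, "inter_contig": 0}
    records = (l for l in bed_lines if len(l.rstrip("\n").split("\t")) >= 4)
    for _, group in groupby(records, key=_pair_id):
        contigs = [line.split("\t")[0] for line in group]
        if len(contigs) < 2:
            counts["single_end"] += 1      # mate missing, e.g. filtered out
        elif len(set(contigs)) == 1:
            counts["intra_contig"] += 1    # both mates on the same contig
        else:
            counts["inter_contig"] += 1    # the pair links two contigs
    return counts
```

Passing an open file handle, for example `link_summary(open("alignment.bed"))`, streams the file without loading it into memory. A very small `inter_contig` count relative to the total would suggest that few pairs are available for joining contigs.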
Thank you in advance for the help. I look forward to your replies!
Best,
Hequan Sun
Starting Iteration 1
bedfile started
bedfile loaded
Done loading GFA file
Number of Nodes = 19450
Number of Edges = 101976
Loading Hi-C links
Hybrid scaffold graph loaded, nodes = 12166 edges = 8210
Hi-C implied edges = 0
Starting Iteration 2
bedfile started
bedfile loaded
Starting Iteration 2
Loading Hi-C links
Hybrid scaffold graph loaded, nodes = 9794 edges = 5859
Hi-C implied edges = 0
Starting Iteration 3
bedfile started
bedfile loaded
Starting Iteration 3
Loading Hi-C links
Hybrid scaffold graph loaded, nodes = 8038 edges = 4453
Hi-C implied edges = 0
Starting Iteration 4
bedfile started
bedfile loaded
Starting Iteration 4
Loading Hi-C links
Hybrid scaffold graph loaded, nodes = 7054 edges = 3737
Hi-C implied edges = 0
/bin/SALSA
python /bin/SALSA/RE_sites.py -a scaffolds_utgs/assembly.cleaned.fasta -e GATC > scaffolds_utgs/re_counts_iteration_1
python /bin/SALSA/make_links.py -b scaffolds_utgs/alignment_iteration_1.bed -d scaffolds_utgs -i 1 -x abc
python /bin/SALSA/fast_scaled_scores.py -d scaffolds_utgs -i 1
sort -k 5 -gr scaffolds_utgs/contig_links_scaled_iteration_1 > scaffolds_utgs/contig_links_scaled_sorted_iteration_1
python /bin/SALSA/layout_unitigs.py -x asmRojoPasion.unitigs.gfa -l scaffolds_utgs/contig_links_scaled_sorted_iteration_1 -c 1000 -i 1 -d scaffolds_utgs
/bin/SALSA/break_contigs -a scaffolds_utgs/alignment_iteration_2.bed -b scaffolds_utgs/breakpoints_iteration_2.txt -l scaffolds_utgs/scaffold_length_iteration_2 -i 2 -s 100 > scaffolds_utgs/misasm_iteration_2.report
python /bin/SALSA/refactor_breaks.py -d scaffolds_utgs -i 2
python /bin/SALSA/make_links.py -b scaffolds_utgs/alignment_iteration_2.bed -d scaffolds_utgs -i 2
python /bin/SALSA/layout_unitigs.py -x abc -l scaffolds_utgs/contig_links_scaled_sorted_iteration_2 -c 1000 -i 2 -d scaffolds_utgs
/bin/SALSA/break_contigs -a scaffolds_utgs/alignment_iteration_3.bed -b scaffolds_utgs/breakpoints_iteration_3.txt -l scaffolds_utgs/scaffold_length_iteration_3 -i 3 -s 100 > scaffolds_utgs/misasm_iteration_3.report
python /bin/SALSA/refactor_breaks.py -d scaffolds_utgs -i 3 > scaffolds_utgs/misasm_3.log
python /bin/SALSA/make_links.py -b scaffolds_utgs/alignment_iteration_3.bed -d scaffolds_utgs -i 3
python /bin/SALSA/layout_unitigs.py -x abc -l scaffolds_utgs/contig_links_scaled_sorted_iteration_3 -c 1000 -i 3 -d scaffolds_utgs
/bin/SALSA/break_contigs -a scaffolds_utgs/alignment_iteration_4.bed -b scaffolds_utgs/breakpoints_iteration_4.txt -l scaffolds_utgs/scaffold_length_iteration_4 -i 4 -s 100 > scaffolds_utgs/misasm_iteration_4.report
python /bin/SALSA/refactor_breaks.py -d scaffolds_utgs -i 4 > scaffolds_utgs/misasm_4.log
python /bin/SALSA/make_links.py -b scaffolds_utgs/alignment_iteration_4.bed -d scaffolds_utgs -i 4
python /bin/SALSA/layout_unitigs.py -x abc -l scaffolds_utgs/contig_links_scaled_sorted_iteration_4 -c 1000 -i 4 -d scaffolds_utgs
/bin/SALSA/break_contigs -a scaffolds_utgs/alignment_iteration_5.bed -b scaffolds_utgs/breakpoints_iteration_5.txt -l scaffolds_utgs/scaffold_length_iteration_5 -i 5 -s 100 > scaffolds_utgs/misasm_iteration_5.report
python /bin/SALSA/refactor_breaks.py -d scaffolds_utgs -i 5 > scaffolds_utgs/misasm_5.log
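To compare iterations at a glance, the graph sizes reported in the log above can be pulled out with a short script. This is a rough sketch that assumes the "Hybrid scaffold graph loaded, nodes = N edges = M" line format seen in this log; `graph_sizes` is an illustrative name, not part of SALSA.

```python
import re

# Pattern matching the graph-size lines as they appear in this log (assumed format).
GRAPH_LINE = re.compile(r"Hybrid scaffold graph loaded, nodes = (\d+) edges = (\d+)")


def graph_sizes(log_lines):
    """Return a list of (nodes, edges) tuples, one per matching log line, in order."""
    sizes = []
    for line in log_lines:
        match = GRAPH_LINE.search(line)
        if match:
            sizes.append((int(match.group(1)), int(match.group(2))))
    return sizes
```

Applied to the log above, for example via `graph_sizes(open("salsa2_run_pipeline.log"))`, this lists the node and edge counts per iteration, which makes it easier to see how quickly the graph shrinks between iterations.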