No chr-level scaffolding #107

Open
HeQSun opened this issue Aug 5, 2020 · 0 comments

Hi SALSA authors,

Thanks for providing the tool. I have a question about how it scaffolds contigs. Details are below.

I have a heterozygous diploid genome and am trying to scaffold an initial set of 9727 contigs (N50: ~55 kb, from canu) to chr-level, using Hi-C data at ~40x coverage together with the canu-generated .gfa file. The ~40x figure is after filtering out non-uniquely mapped reads, which gave an alignment.bed file of 8.3 Gb with 124 million lines. After running SALSA, however, the output had 6294 contigs (N50: ~220 kb), so the expected improvement to chr-level did not happen.
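For reference, the N50 values quoted above can be recomputed from the .fai index that is passed to SALSA with -l. This is a minimal sketch, assuming the standard samtools faidx layout (tab-separated, sequence length in the second column); the function names `n50` and `lengths_from_fai` are illustrative and not part of SALSA.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input


def lengths_from_fai(path):
    """Read sequence lengths from a .fai index (assumed: column 2 is the length)."""
    with open(path) as handle:
        return [int(line.split("\t")[1]) for line in handle if line.strip()]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python n50.py ln_contigs.fasta.fai
    if len(sys.argv) > 1:
        print(n50(lengths_from_fai(sys.argv[1])))
```

Running it on the index of the input contigs and on the index of the final scaffolds gives directly comparable before/after numbers.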

The command I used was:

python /bin/SALSA/run_pipeline.py -a ln_contigs.fasta -l ln_contigs.fasta.fai -g asm.unitigs.gfa -m yes -b alignment.bed -e GATC -o scaffolds_utgs -s 480000000 >salsa2_run_pipeline.log
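One sanity check on the -b input is to count how many Hi-C read pairs link two different contigs, since only those pairs can support a join. The following is a rough sketch, not a SALSA utility. It assumes the BED file was produced by bamToBed and sorted by read name, so that the two mates of a pair sit on adjacent lines with names ending in /1 and /2 (column 4). The function name `count_pairs` is hypothetical.

```python
def count_pairs(lines):
    """Count intra-contig and inter-contig read pairs in a name-sorted BED stream.

    Assumptions (not verified against SALSA itself):
      - column 1 is the contig name, column 4 is the read name
      - mates share a name apart from a trailing /1 or /2
      - mates appear on consecutive lines
    Returns a tuple (intra, inter).
    """
    intra = inter = 0
    prev_name = prev_contig = None
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip malformed or empty lines
        contig, name = fields[0], fields[3]
        base = name[:-2] if name.endswith(("/1", "/2")) else name
        if base == prev_name:
            if contig == prev_contig:
                intra += 1
            else:
                inter += 1
            prev_name = prev_contig = None  # pair consumed
        else:
            prev_name, prev_contig = base, contig
    return intra, inter


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python count_pairs.py alignment.bed
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            print("intra=%d inter=%d" % count_pairs(handle))
```

The file is streamed line by line, so the 8.3 Gb size should not be a memory problem. If almost no pairs are found at all, the sort order or read naming probably differs from what is assumed here.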

It finished successfully in 5 iterations, though without the expected result.

Below is the log. Is it normal to see "Hi-C implied edges = 0"? And can you spot any other message that could help me adjust the parameters for a rerun?

Or could it be that the initial N50 was not good enough?

Thank you in advance for the help. Looking forward to your replies!

Best,
Hequan Sun

Starting Iteration 1
bedfile started
bedfile loaded
Done loading GFA file
Number of Nodes = 19450
Number of Edges = 101976
Loading Hi-C links
Hybrid scaffold graph loaded, nodes = 12166 edges = 8210
Hi-C implied edges = 0
Starting Iteration 2
bedfile started
bedfile loaded
Starting Iteration 2
Loading Hi-C links
Hybrid scaffold graph loaded, nodes = 9794 edges = 5859
Hi-C implied edges = 0
Starting Iteration 3
bedfile started
bedfile loaded
Starting Iteration 3
Loading Hi-C links
Hybrid scaffold graph loaded, nodes = 8038 edges = 4453
Hi-C implied edges = 0
Starting Iteration 4
bedfile started
bedfile loaded
Starting Iteration 4
Loading Hi-C links
Hybrid scaffold graph loaded, nodes = 7054 edges = 3737
Hi-C implied edges = 0
/bin/SALSA
python /bin/SALSA/RE_sites.py -a scaffolds_utgs/assembly.cleaned.fasta -e GATC > scaffolds_utgs/re_counts_iteration_1
python /bin/SALSA/make_links.py -b scaffolds_utgs/alignment_iteration_1.bed -d scaffolds_utgs -i 1 -x abc
python /bin/SALSA/fast_scaled_scores.py -d scaffolds_utgs -i 1
sort -k 5 -gr scaffolds_utgs/contig_links_scaled_iteration_1 > scaffolds_utgs/contig_links_scaled_sorted_iteration_1
python /bin/SALSA/layout_unitigs.py -x asmRojoPasion.unitigs.gfa -l scaffolds_utgs/contig_links_scaled_sorted_iteration_1 -c 1000 -i 1 -d scaffolds_utgs
/bin/SALSA/break_contigs -a scaffolds_utgs/alignment_iteration_2.bed -b scaffolds_utgs/breakpoints_iteration_2.txt -l scaffolds_utgs/scaffold_length_iteration_2 -i 2 -s 100 > scaffolds_utgs/misasm_iteration_2.report
python /bin/SALSA/refactor_breaks.py -d scaffolds_utgs -i 2
python /bin/SALSA/make_links.py -b scaffolds_utgs/alignment_iteration_2.bed -d scaffolds_utgs -i 2
python /bin/SALSA/layout_unitigs.py -x abc -l scaffolds_utgs/contig_links_scaled_sorted_iteration_2 -c 1000 -i 2 -d scaffolds_utgs
/bin/SALSA/break_contigs -a scaffolds_utgs/alignment_iteration_3.bed -b scaffolds_utgs/breakpoints_iteration_3.txt -l scaffolds_utgs/scaffold_length_iteration_3 -i 3 -s 100 > scaffolds_utgs/misasm_iteration_3.report
python /bin/SALSA/refactor_breaks.py -d scaffolds_utgs -i 3 > scaffolds_utgs/misasm_3.log
python /bin/SALSA/make_links.py -b scaffolds_utgs/alignment_iteration_3.bed -d scaffolds_utgs -i 3
python /bin/SALSA/layout_unitigs.py -x abc -l scaffolds_utgs/contig_links_scaled_sorted_iteration_3 -c 1000 -i 3 -d scaffolds_utgs
/bin/SALSA/break_contigs -a scaffolds_utgs/alignment_iteration_4.bed -b scaffolds_utgs/breakpoints_iteration_4.txt -l scaffolds_utgs/scaffold_length_iteration_4 -i 4 -s 100 > scaffolds_utgs/misasm_iteration_4.report
python /bin/SALSA/refactor_breaks.py -d scaffolds_utgs -i 4 > scaffolds_utgs/misasm_4.log
python /bin/SALSA/make_links.py -b scaffolds_utgs/alignment_iteration_4.bed -d scaffolds_utgs -i 4
python /bin/SALSA/layout_unitigs.py -x abc -l scaffolds_utgs/contig_links_scaled_sorted_iteration_4 -c 1000 -i 4 -d scaffolds_utgs
/bin/SALSA/break_contigs -a scaffolds_utgs/alignment_iteration_5.bed -b scaffolds_utgs/breakpoints_iteration_5.txt -l scaffolds_utgs/scaffold_length_iteration_5 -i 5 -s 100 > scaffolds_utgs/misasm_iteration_5.report
python /bin/SALSA/refactor_breaks.py -d scaffolds_utgs -i 5 > scaffolds_utgs/misasm_5.log
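The per-iteration numbers in the log above can be pulled into a small table to see how the graph shrinks while the Hi-C implied edge count stays at 0. This is a minimal sketch that assumes the log lines are worded exactly as in this issue; the function name `summarize_log` is illustrative and not part of SALSA.

```python
import re

# Patterns copied from the wording of the log lines shown in this issue.
GRAPH_RE = re.compile(r"Hybrid scaffold graph loaded, nodes = (\d+) edges = (\d+)")
HIC_RE = re.compile(r"Hi-C implied edges = (\d+)")


def summarize_log(text):
    """Return one dict per iteration with graph size and Hi-C implied edge count."""
    rows = []
    for line in text.splitlines():
        match = GRAPH_RE.search(line)
        if match:
            rows.append({
                "nodes": int(match.group(1)),
                "edges": int(match.group(2)),
                "hic_implied_edges": None,  # filled in by the following line, if present
            })
            continue
        match = HIC_RE.search(line)
        if match and rows:
            rows[-1]["hic_implied_edges"] = int(match.group(1))
    return rows


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python summarize_log.py saved_log.txt
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            for number, row in enumerate(summarize_log(handle.read()), start=1):
                print(number, row["nodes"], row["edges"], row["hic_implied_edges"])
```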

HeQSun changed the title from "Unexpected chr-level scaffolding" to "No chr-level scaffolding" on Aug 7, 2020