the halting problem, NG50 compared with a few other metrics #104

Open
jbh-cas opened this issue May 10, 2020 · 0 comments

jbh-cas commented May 10, 2020

Thanks for the wonderful tool. Recently I ran Salsa2 with -i 10 and it terminated before reaching that number of iterations. The results were good, but several super-scaffolds that a 3D-dna run had produced were not present.

So, inspecting the code, I saw that an NG50 advancement test determines when to break out of the loop. To see what more iterations might accomplish, I commented that out and reran with -i 30, which created the 31 scaffolds_ITERATION_# agp files.

I'm attaching stats_scaffolds_ITERATION.txt (I changed the suffix to txt for uploading), a short awk script that printed the info below for each of the agp files. The columns are filename, total of scaffold lengths, number of scaffolds, and various N#/L# values.
An asterisk before a value means it is the same as it was in the prior agp file.
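
For anyone who wants to reproduce the columns without awk, here is a minimal Python sketch of the same kind of calculation. It is not the attached script: the function names `scaffold_lengths` and `nx_lx` are made up for illustration, and it assumes standard tab-separated AGP lines, where column 1 is the scaffold name and column 3 is the end coordinate of each part on that scaffold.

```python
from collections import defaultdict


def scaffold_lengths(agp_lines):
    """Return {scaffold_name: length} from an iterable of AGP lines.

    Assumption: column 1 is the object (scaffold) name and column 3 is
    the 1-based end coordinate of the part, so the largest end seen for
    a scaffold is its total length (gaps included).
    """
    ends = defaultdict(int)
    for line in agp_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        name, obj_end = fields[0], int(fields[2])
        ends[name] = max(ends[name], obj_end)
    return dict(ends)


def nx_lx(lengths, x=50):
    """Return (Nx, Lx): the length of the scaffold at which the running
    total, taken from longest to shortest, first reaches x% of the sum
    of all lengths, and how many scaffolds it took to get there."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running * 100 >= total * x:  # integer comparison, no rounding
            return length, count
    raise ValueError("no lengths given")


# Toy example: total is 290, so half is 145; 80 + 70 = 150 crosses it.
print(nx_lx([80, 70, 50, 40, 30, 20], 50))  # (70, 2)
print(nx_lx([80, 70, 50, 40, 30, 20], 90))  # (30, 5)
```

Note this computes N50 against the summed scaffold lengths, as in the table below. NG50 would use an estimated genome size in place of `total`.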

You can see the N50 test has iter_7 as the first repeat, but using the number of scaffolds it's iter_10, which has all the other values repeated from iter_9 as well.

If we allow one additional iteration of look-ahead, then iter_11 gives us something new. iter_13 repeats iter_12, but again a 1-iteration look-ahead gets us something new at iter_14. The first double repeat (i.e., 3 of the same set of values in a row) starts at iter_16, which begins a run of 8 identical sets of values. So that improves from iter_6 (142 scaffolds, N50 60,979,473, L50 16) to iter_16 (126 scaffolds, N50 79,034,342, L50 14).
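
To make the look-ahead idea concrete, here is a small Python sketch of a stopping rule with a configurable number of tolerated repeats. This is not Salsa2's actual loop; `first_stall` and `patience` are hypothetical names, and the numbers are copied from the output below (N50 for iterations 1 to 10, scaffold counts for all 31).

```python
def first_stall(values, patience=1):
    """Return the 1-based iteration at which `values` has repeated the
    previous iteration's value `patience` times in a row, or None.

    patience=1 stops at the first repeat; patience=2 requires a double
    repeat (3 identical values in a row), i.e. one iteration of
    look-ahead. The plateau itself began `patience` iterations before
    the returned iteration. Items can be single numbers or tuples of
    several metrics.
    """
    repeats = 0
    for i in range(1, len(values)):
        if values[i] == values[i - 1]:
            repeats += 1
            if repeats >= patience:
                return i + 1  # convert 0-based index to iteration number
        else:
            repeats = 0  # something changed, start counting again
    return None


# N50 for iterations 1..10 and scaffold counts for iterations 1..31.
n50 = [29638679, 36411706, 40639694, 50315364, 50898031,
       60979473, 60979473, 67572291, 67572291, 67572291]
scaffolds = [183, 167, 155, 150, 146, 142, 140, 135, 133, 133,
             132, 131, 131, 130, 128, 126, 126, 126, 126, 126,
             126, 126, 126, 125, 125, 125, 125, 125, 124, 124, 124]

print(first_stall(n50))                    # 7
print(first_stall(scaffolds))              # 10
print(first_stall(scaffolds, patience=2))  # 18, plateau began at iter_16
```

With `patience=2` the rule would not fire until iter_18, so the run would keep going through the single repeats at iter_10 and iter_13 and reach the iter_16 result. It would still miss the later changes at iter_24 and iter_29, which is the trade-off in picking any fixed look-ahead.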

Anyway, food for thought. Thanks again for a great tool.

--Jim Henderson, California Academy of Sciences

output:

scaffolds_ITERATION_1.agp	 2644769511	 183	 N50:29638679 L50:30	 N60:24241170 L60:40	 N70:20428790 L70:52	 N80:14244363 L80:66	 N90:7701609 L90:92
scaffolds_ITERATION_2.agp	 2644777511	 167	 N50:36411706 L50:24	 N60:28328424 L60:33	 N70:23927955 L70:43	 N80:18131835 L80:56	 N90:8502176 L90:78
scaffolds_ITERATION_3.agp	 2644783511	 155	 N50:40639694 L50:20	 N60:31698195 L60:27	 N70:25231206 L70:36	 N80:19159797 L80:49	 N90:9145689 L90:67
scaffolds_ITERATION_4.agp	 2644786011	 150	 N50:50315364 L50:18	 N60:36411706 L60:24	 N70:27576223 L70:33	 N80:19959479 L80:44	*N90:9145689 L90:63
scaffolds_ITERATION_5.agp	 2644788011	 146	 N50:50898031 L50:17	 N60:40166381 L60:23	 N70:29638679 L70:31	 N80:20428790 L80:42	 N90:9663506 L90:59
scaffolds_ITERATION_6.agp	 2644790011	 142	 N50:60979473 L50:16	 N60:40538746 L60:22	 N70:31406863 L70:29	 N80:20615824 L80:39	 N90:9949571 L90:56
scaffolds_ITERATION_7.agp	 2644791011	 140	*N50:60979473 L50:16	 N60:40639694 L60:21	*N70:31406863 L70:29	*N80:20615824 L80:39	 N90:9989659 L90:55
scaffolds_ITERATION_8.agp	 2644793511	 135	 N50:67572291 L50:15	 N60:44061934 L60:20	 N70:31746683 L70:27	 N80:21748937 L80:37	 N90:13181980 L90:51
scaffolds_ITERATION_9.agp	 2644794511	 133	*N50:67572291 L50:15	 N60:50315364 L60:19	 N70:33095761 L70:26	 N80:24234933 L80:35	*N90:13181980 L90:50
scaffolds_ITERATION_10.agp	*2644794511	*133	*N50:67572291 L50:15	*N60:50315364 L60:19	*N70:33095761 L70:26	*N80:24234933 L80:35	*N90:13181980 L90:50
scaffolds_ITERATION_11.agp	 2644795011	 132	*N50:67572291 L50:15	 N60:58838040 L60:19	 N70:36411706 L70:25	*N80:24234933 L80:34	*N90:13181980 L90:49
scaffolds_ITERATION_12.agp	 2644795511	 131	 N50:67716364 L50:15	*N60:58838040 L60:19	 N70:40166381 L70:24	*N80:24234933 L80:33	*N90:13181980 L90:48
scaffolds_ITERATION_13.agp	*2644795511	*131	*N50:67716364 L50:15	*N60:58838040 L60:19	*N70:40166381 L70:24	*N80:24234933 L80:33	*N90:13181980 L90:48
scaffolds_ITERATION_14.agp	 2644796011	 130	*N50:67716364 L50:15	*N60:58838040 L60:19	*N70:40166381 L70:24	*N80:24234933 L80:33	 N90:13773325 L90:47
scaffolds_ITERATION_15.agp	 2644797011	 128	 N50:68788111 L50:15	 N60:67206133 L60:18	*N70:40166381 L70:24	*N80:24234933 L80:33	 N90:13922777 L90:46
scaffolds_ITERATION_16.agp	 2644798011	 126	 N50:79034342 L50:14	*N60:67206133 L60:18	 N70:40538746 L70:23	 N80:24261773 L80:31	*N90:13922777 L90:45
scaffolds_ITERATION_17.agp	*2644798011	*126	*N50:79034342 L50:14	*N60:67206133 L60:18	*N70:40538746 L70:23	*N80:24261773 L80:31	*N90:13922777 L90:45
scaffolds_ITERATION_18.agp	*2644798011	*126	*N50:79034342 L50:14	*N60:67206133 L60:18	*N70:40538746 L70:23	*N80:24261773 L80:31	*N90:13922777 L90:45
scaffolds_ITERATION_19.agp	*2644798011	*126	*N50:79034342 L50:14	*N60:67206133 L60:18	*N70:40538746 L70:23	*N80:24261773 L80:31	*N90:13922777 L90:45
scaffolds_ITERATION_20.agp	*2644798011	*126	*N50:79034342 L50:14	*N60:67206133 L60:18	*N70:40538746 L70:23	*N80:24261773 L80:31	*N90:13922777 L90:45
scaffolds_ITERATION_21.agp	*2644798011	*126	*N50:79034342 L50:14	*N60:67206133 L60:18	*N70:40538746 L70:23	*N80:24261773 L80:31	*N90:13922777 L90:45
scaffolds_ITERATION_22.agp	*2644798011	*126	*N50:79034342 L50:14	*N60:67206133 L60:18	*N70:40538746 L70:23	*N80:24261773 L80:31	*N90:13922777 L90:45
scaffolds_ITERATION_23.agp	*2644798011	*126	*N50:79034342 L50:14	*N60:67206133 L60:18	*N70:40538746 L70:23	*N80:24261773 L80:31	*N90:13922777 L90:45
scaffolds_ITERATION_24.agp	 2644798511	 125	*N50:79034342 L50:13	*N60:67206133 L60:17	*N70:40538746 L70:22	*N80:24261773 L80:30	*N90:13922777 L90:44
scaffolds_ITERATION_25.agp	*2644798511	*125	*N50:79034342 L50:13	*N60:67206133 L60:17	*N70:40538746 L70:22	*N80:24261773 L80:30	*N90:13922777 L90:44
scaffolds_ITERATION_26.agp	*2644798511	*125	*N50:79034342 L50:13	*N60:67206133 L60:17	*N70:40538746 L70:22	*N80:24261773 L80:30	*N90:13922777 L90:44
scaffolds_ITERATION_27.agp	*2644798511	*125	*N50:79034342 L50:13	*N60:67206133 L60:17	*N70:40538746 L70:22	*N80:24261773 L80:30	*N90:13922777 L90:44
scaffolds_ITERATION_28.agp	*2644798511	*125	*N50:79034342 L50:13	*N60:67206133 L60:17	*N70:40538746 L70:22	*N80:24261773 L80:30	*N90:13922777 L90:44
scaffolds_ITERATION_29.agp	 2644799011	 124	*N50:79034342 L50:13	*N60:67206133 L60:17	*N70:40538746 L70:22	 N80:26372909 L80:30	 N90:14244363 L90:43
scaffolds_ITERATION_30.agp	*2644799011	*124	*N50:79034342 L50:13	*N60:67206133 L60:17	*N70:40538746 L70:22	*N80:26372909 L80:30	*N90:14244363 L90:43
scaffolds_ITERATION_31.agp	*2644799011	*124	*N50:79034342 L50:13	*N60:67206133 L60:17	*N70:40538746 L70:22	*N80:26372909 L80:30	*N90:14244363 L90:43