Optimize vertical ordering of variants in beadplots #58
Comments
Now sorting vertical position of variant by number of mutations from parental variant, using collection date of earliest sample of variants to break ties: Lines 67 to 74 in 332b6c5
|
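For readers without the referenced lines at hand, the ordering described above can be sketched roughly as follows. This is not the code from 332b6c5; the tuple layout and the `order_children` name are assumptions made for illustration only.

```python
from datetime import date

# Hypothetical child records:
# (variant label, mutations from parent, earliest collection date)
children = [
    ("C", 2, date(2020, 3, 1)),
    ("A", 1, date(2020, 4, 15)),
    ("B", 1, date(2020, 3, 20)),
]


def order_children(children):
    """Sort by mutation count from the parent, then by earliest collection date."""
    return sorted(children, key=lambda c: (c[1], c[2]))


ordered = [label for label, _, _ in order_children(children)]
print(ordered)
```

Sorting on a tuple key means the date is only consulted when two children sit at the same distance from the parent.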
New idea (the above isn't bad, but this could be better) - instead of ordering children by genetic distance and then collection date for tie-breaking, what about genetic distance and then total number of descendant variants? The problem is that some variants that are only 1 mutation away are being bounced downwards by an earlier variant that is ancestral to a large number of descendant variants. |
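A minimal sketch of this alternative tie-breaker, assuming a hypothetical `tree` dictionary that maps each variant to its children and their mutation counts. It also assumes that children with fewer descendants should be placed first, so that a large subtree does not push a close relative downwards; the comment above does not state the direction explicitly.

```python
# Hypothetical tree: parent label -> list of (child label, mutations from parent)
tree = {
    "root": [("early", 1), ("late", 1), ("far", 3)],
    "early": [("e1", 1), ("e2", 1), ("e3", 2)],
    "late": [],
    "far": [],
    "e1": [],
    "e2": [],
    "e3": [],
}


def count_descendants(tree, node):
    """Total number of variants below `node`, counted recursively."""
    return sum(1 + count_descendants(tree, child) for child, _ in tree.get(node, []))


def order_children(tree, node):
    """Sort children by genetic distance, then by ascending descendant count (assumed)."""
    return sorted(
        tree.get(node, []),
        key=lambda c: (c[1], count_descendants(tree, c[0])),
    )


ordered = [label for label, _ in order_children(tree, "root")]
print(ordered)
```

Here `late` and `early` are both one mutation from the root, but `early` carries three descendants, so `late` is drawn first.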
Unsampled lineages can inherit the earliest sample date from their "parent" lineage (if that is sampled) - if parent is also unsampled, then proceed down the tree until you reach a sampled parent. |
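One way to read this, sketched under assumptions: walk from the unsampled lineage toward the root until a lineage with a sample date is found. The `parent` and `earliest` dictionaries and the `effective_date` helper are hypothetical names, not part of the repo.

```python
from datetime import date

# Hypothetical inputs: child -> parent map, and earliest sample dates
# (None marks an unsampled lineage)
parent = {"B": "A", "C": "B", "D": "C"}
earliest = {"A": date(2020, 2, 1), "B": None, "C": None, "D": date(2020, 5, 1)}


def effective_date(node, parent, earliest):
    """Return the node's own date, or the date of its nearest sampled ancestor."""
    while node is not None:
        if earliest.get(node) is not None:
            return earliest[node]
        node = parent.get(node)  # the root has no entry, so this ends the loop
    return None  # no sampled ancestor exists


print(effective_date("C", parent, earliest))
```

If no ancestor is sampled the helper returns `None`, and the caller would need to decide how to place such a lineage.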
We should consider re-rooting these neighbor-joining trees so that the sampled lineage with the earliest sample date (and also largest number of descendants as a tie-breaker?) becomes the root. |
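A rough sketch of the re-rooting step, assuming the tree is available as an undirected adjacency list (the `adjacency` and `earliest` structures below are made up for illustration). The descendant-count tie-breaker is left out because descendant counts themselves depend on where the tree is rooted.

```python
from collections import deque
from datetime import date

# Hypothetical unrooted tree as an undirected adjacency list
adjacency = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B"],
    "D": ["B"],
}
# None marks an unsampled lineage
earliest = {
    "A": date(2020, 3, 1),
    "B": None,
    "C": date(2020, 1, 15),
    "D": date(2020, 2, 1),
}


def pick_root(earliest):
    """Choose the sampled lineage with the earliest sample date."""
    sampled = {name: d for name, d in earliest.items() if d is not None}
    return min(sampled, key=sampled.get)


def reroot(adjacency, root):
    """Return a child -> parent map for the tree re-rooted at `root`."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in parent:  # not visited yet
                parent[neighbour] = node
                queue.append(neighbour)
    return parent


root = pick_root(earliest)
parents = reroot(adjacency, root)
print(root, parents)
```

The breadth-first walk only changes edge directions; branch lengths (mutation counts) would carry over unchanged.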
@SyouTono242 can you please start committing your code changes to a branch of this repo? thanks! |
@ArtPoon Yes... But I was an idiot who forgot to checkout and I accidentally committed and pushed my script to master... I am extremely sorry; I shouldn't be checking GitHub at 4am and I learned my lesson the hard way... I didn't commit anything other than the one single script I wrote, and as it's a standalone script it shouldn't interfere with any other existing file. Please roll back master if you can... Again I apologize for all the inconvenience if anyone else is fetching the incorrect changes. I'm really sorry. |
Haha whoops! Don’t stress about it @SyouTono242 - accidents happen and I’ve made the same mistake before. @GopiGugan can you please roll back this commit and @SyouTono242 can you push your changes to the dev branch? |
Code committed to |
Keep an eye on this new code while reviewing PR #409 |
Let's work on porting this into JS for the next PR (after #409) |
@GopiGugan to generate toy data sets to run |
Apply sorting algorithm directly to data structure before it is serialized as a JSON file |
@bonnielu has test data from @GopiGugan , code analysis in progress |
porting is finished, @bonnielu requesting larger test data |
We need to write some unit test fixtures (trees that should be re-ordered a specific way) to determine why these test data sets are not being changed. |
Can you please post the "before" layouts? |
Ok thanks! I'm willing to call the first, smaller beadplot an improved layout. Difficult to say for the second one - but I wonder what the heck is going on with all those samples taken on the same date? |
Can we please see what the beadplots look like if we prioritize earliest sampling date, and then tree traversal? |
Thanks for implementing this different approach @bonnielu - after reviewing the outputs, I think we should go ahead with the first revision (Yiran's version). Please revert to that algorithm and submit a PR to dev. |
I don't think we're going to find a good solution to this - the problem might be better addressed by collapsing variants (#434) |
Presently, the length of a vertical edge bears no relation to the genetic distance (i.e., number of mutations) it represents. Variants are ordered by pre-order traversal, which helps keep subtrees together.
Is a better arrangement of variants possible?
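For reference, the pre-order layout described above can be sketched as follows, assuming a hypothetical `tree` dictionary of parent to ordered children. Each variant gets the next free row as it is visited, which is why a subtree always occupies a contiguous block of rows regardless of how many mutations its edges represent.

```python
# Hypothetical tree: parent -> ordered list of children
tree = {"root": ["A", "B"], "A": ["A1", "A2"], "B": ["B1"]}


def preorder(tree, node, out=None):
    """Visit a node first, then each of its subtrees in the given child order."""
    if out is None:
        out = []
    out.append(node)
    for child in tree.get(node, []):
        preorder(tree, child, out)
    return out


order = preorder(tree, "root")
# Vertical position is simply the visit index
y = {node: row for row, node in enumerate(order)}
print(order)
```

Any of the child orderings discussed in the comments only changes the order of each list in `tree`; the traversal itself stays the same.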