
very very large dataset recommendations #167

Open
sapuizait opened this issue Dec 11, 2023 · 2 comments

@sapuizait

Hi all

As the title says, I have a very large dataset of 1500 genomes that share 1200 single-copy genes. The plan is to build a concatenated alignment (let's see if it's even possible :D ) and then use raxml to build a global phylogeny.
Do you think this is feasible, or am I daydreaming and should be considering alternative approaches?

Cheers
P

PS: I have access to a cluster where jobs can run for a maximum of 7 days, with 64 nodes and 500 GB RAM.
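
A quick way to sanity-check whether a dataset of this size fits into memory, before committing cluster time, is raxml-ng's --parse mode, which compresses the alignment and prints an estimate of the memory requirements and a recommended number of threads (supermatrix.fasta below is just a placeholder name for the concatenated alignment):

    # dry run: parse and compress the alignment, report estimated memory use and thread count
    raxml-ng --parse --msa supermatrix.fasta --model GTR+G --prefix T1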

@amkozlov
Owner

In principle, it sounds feasible.

We have successfully used raxml-ng for concatenated datasets with ~1400 taxa and ~1000 genes, as well as ~350 taxa and ~64000 loci (unfortunately, neither paper has been published yet).
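
For datasets in this size range, a plain (unpartitioned) ML tree search might look like the sketch below; the file name, thread count, and number of starting trees are illustrative placeholders rather than settings taken from this thread:

    # ML tree search from 10 random and 10 parsimony starting trees
    raxml-ng --search --msa supermatrix.fasta --model GTR+G \
             --tree rand{10},pars{10} --threads 16 --seed 42 --prefix T2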

@sapuizait
Author

That's excellent! Any advice or suggestions on how to do that? Do you use partitions and check models for each partition, etc.? Which algorithm? Thanks in advance for any tips! :)
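
For reference, raxml-ng takes per-partition models via a partition file with one line per gene (model, partition name = coordinates in the supermatrix). The models and coordinates below are made-up placeholders; per-partition models could first be selected with a tool such as ModelTest-NG:

    GTR+G, gene0001 = 1-1520
    GTR+G+I, gene0002 = 1521-2780
    HKY+G, gene0003 = 2781-3905

The partition file is then passed in place of a single model string:

    raxml-ng --search --msa supermatrix.fasta --model partitions.txt \
             --threads 16 --seed 42 --prefix T3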
