Bug in cladeAnalysis.ipynb inflates main result by a factor of six #2

Open
nizzaneela opened this issue Jul 27, 2023 · 0 comments

The second cell generates two dictionaries for storing details of clades:

  • "clade_analyses_CC_d", for one mutation clades; and
  • "clade_analyses_AB_d", for two mutation clades.

Under "A/B analysis" in the function clade_analysis_updated, two mutation clades are tested against the topology requirements. For each run, the largest two mutation clade is identified in the list "clade_sizes" and tested against the size criteria, and its index is stored as "max2mutCladeLoc". The length of the corresponding entry in the list "subclade_sizes" (i.e. len(subclade_sizes[max2mutCladeLoc]) or, if there is only one clade, len(subclade_sizes[0])) is then used to check whether that clade has enough direct descendants to be considered a polytomy.
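For illustration, the check described above can be sketched roughly as follows. This is not the notebook's code: the function name `largest_clade_is_polytomy`, the threshold parameter `min_polytomy_size`, and the toy inputs are assumptions made for this example; only the names `clade_sizes`, `subclade_sizes` and `max2mutCladeLoc` come from the notebook.

```python
def largest_clade_is_polytomy(clade_sizes, subclade_sizes, min_polytomy_size):
    """Check whether the largest clade has enough direct descendants.

    clade_sizes[i] is the size of clade i, and subclade_sizes[i] lists the
    sizes of that same clade's direct descendants. The two lists must
    describe the same set of clades for the index to be meaningful.
    min_polytomy_size is a hypothetical threshold, not the notebook's value.
    """
    # Index of the largest clade (the first one if there are ties).
    max2mutCladeLoc = clade_sizes.index(max(clade_sizes))
    # The number of direct descendants decides the polytomy test.
    return len(subclade_sizes[max2mutCladeLoc]) >= min_polytomy_size


# Toy data (made up): clade 1 is the largest and has 4 direct descendants.
clade_sizes = [10, 50]
subclade_sizes = [[5, 5], [20, 10, 10, 10]]
result = largest_clade_is_polytomy(clade_sizes, subclade_sizes, min_polytomy_size=3)
print(result)
```

The sketch only gives a sensible answer when both lists are taken from the same dictionary, which is the assumption the rest of this issue is about.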

The list "clade_sizes" is assigned values with the code:

"clade_sizes = clade_analyses_AB_d[run]['clade_sizes']"

The list "subclade_sizes" is assigned values with the code:

"subclade_sizes = clade_analyses_CC_d[run]['subclade_sizes'].copy()"

Because the second assignment reads from the one mutation dictionary, the polytomy test uses the number of subclades of a one mutation clade that has nothing to do with the two mutation clade being tested.

This seems to be a straightforward copy-paste error that can be corrected by replacing "CC" with "AB" in the "subclade_sizes" assignment. However, the correction will significantly increase the proportion of two mutation clades that meet the topology requirements (since polytomies and large clade sizes are correlated), so that the Bayes factors are substantially reduced. For the Main analysis, the Bayes factor is reduced by a factor of 6.
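A minimal sketch of the mismatch, using made-up stand-ins for the two dictionaries (the values below are invented for illustration and do not come from any simulation run):

```python
# Toy stand-ins for the notebook's dictionaries; all values are hypothetical.
clade_analyses_CC_d = {
    0: {"clade_sizes": [30, 30], "subclade_sizes": [[10, 20], [30]]},
}
clade_analyses_AB_d = {
    0: {"clade_sizes": [15, 45], "subclade_sizes": [[15], [20, 15, 5, 5]]},
}

run = 0
clade_sizes = clade_analyses_AB_d[run]["clade_sizes"]
max2mutCladeLoc = clade_sizes.index(max(clade_sizes))  # index into the AB lists

# Current behaviour: subclades are read from the one mutation (CC) dictionary,
# so the index points at an unrelated clade.
buggy_subclade_sizes = clade_analyses_CC_d[run]["subclade_sizes"].copy()
buggy_count = len(buggy_subclade_sizes[max2mutCladeLoc])

# Proposed correction: read the subclades from the same (AB) dictionary.
fixed_subclade_sizes = clade_analyses_AB_d[run]["subclade_sizes"].copy()
fixed_count = len(fixed_subclade_sizes[max2mutCladeLoc])

print("direct descendants counted (current):", buggy_count)
print("direct descendants counted (corrected):", fixed_count)
```

In this toy case the largest two mutation clade has four direct descendants, but the current lookup counts only one, so a polytomy test with any threshold above one would wrongly reject it.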

I've made a branch, "bugfixes", with the correction, but you may prefer to implement the fix independently, since it is small.

A published correction may be necessary.
