using compareCluster to compare gene sets from different species #722

Open
lanasushko opened this issue Sep 13, 2024 · 1 comment

Comments

@lanasushko

Hi,

I have a question about the usage of the compareCluster function. Is it only meant to be used to compare gene sets between different experimental conditions in the same species, or can it also be used to compare gene sets from different species?

I have a number of closely related species that I want to test for enrichment in a comparative manner. Their gene annotations are quite similar. I tried concatenating the TERM2GENE tables and extracting the unique rows, and the concatenated table did not contain significantly more genes than the table for any single species. Would it be correct to use the compareCluster function in this case?

Thanks!

@guidohooiveld

guidohooiveld commented Sep 16, 2024

It is not fully clear to me what you are trying to achieve.

The function compareCluster indeed runs a gene set analysis method (ORA or GSEA) on each of the gene clusters (i.e. lists of genes) provided as input. Depending on the analysis method, it checks which gene sets (i.e. 'terms') present in TERM2GENE are over-represented (ORA) or enriched (GSEA) in these clusters.
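
To make the ORA case concrete, here is a minimal, language-neutral sketch in Python of what "run ORA on each cluster against TERM2GENE" amounts to. It is not clusterProfiler's implementation: the function names, the toy gene IDs and the choice of background (every gene that appears in the TERM2GENE table) are assumptions made for illustration only.

```python
from math import comb


def hypergeom_upper_tail(k, M, n, N):
    """P(X >= k) when drawing N genes from M, of which n belong to the term."""
    return sum(
        comb(n, i) * comb(M - n, N - i) for i in range(k, min(n, N) + 1)
    ) / comb(M, N)


def ora_per_cluster(clusters, term2gene):
    """Hypothetical helper: one over-representation test per cluster and term."""
    # Group the (term, gene) rows into one gene set per term.
    gene_sets = {}
    for term, gene in term2gene:
        gene_sets.setdefault(term, set()).add(gene)

    # Assumption: the background is every gene that appears in term2gene.
    universe = set().union(*gene_sets.values())

    results = {}
    for name, genes in clusters.items():
        genes = set(genes) & universe  # genes without any annotation are dropped
        results[name] = {}
        for term, members in gene_sets.items():
            k = len(genes & members)
            if k == 0:
                continue  # terms without overlap are skipped in this sketch
            results[name][term] = hypergeom_upper_tail(
                k, len(universe), len(members), len(genes)
            )
    return results


# Toy data: two terms over ten genes, one gene list per species.
term2gene = [("T1", g) for g in ("g1", "g2", "g3")] + [
    ("T2", g) for g in ("g4", "g5", "g6", "g7", "g8", "g9", "g10")
]
clusters = {
    "species_A": ["g1", "g2", "g3"],
    "species_B": ["g1", "g4", "g5"],
}
pvals = ora_per_cluster(clusters, term2gene)
for cluster_name, term_pvals in pvals.items():
    print(cluster_name, term_pvals)
```

The point of the sketch is that each cluster is tested on its own against whatever mappings the table contains; nothing in the calculation knows which species a gene list came from.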

compareCluster is agnostic to the content of TERM2GENE. Thus, as long as the TERM2GENE table contains biologically relevant 'mappings', it can be used.
Please note that if you have 200 gene sets defined for one species and another 200 sets for the related species, then the multiple testing adjustment will correct for testing 400 gene sets, and not for the 200 tests that would be performed in each of 2 independent, species-specific analyses. Hence, FDR values will not be identical between the 2 approaches (but p-values should be!)
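
The effect of the number of tests on the adjusted values can be illustrated with a small sketch of the Benjamini-Hochberg procedure (as far as I know the default adjustment in clusterProfiler, `pAdjustMethod = "BH"`). The p-values below are made up, and the sketch assumes, as stated above, that the raw p-values for a species' own gene sets are the same in both approaches; only the number of tests changes.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for one cluster against its own species' gene sets.
own_pvals = [0.001, 0.02, 0.5]
# Hypothetical p-values for the extra sets added by the other species' table.
extra_pvals = [0.004, 0.03, 0.8]

# Species-specific analysis: adjust over 3 tests.
separate = bh_adjust(own_pvals)
# Concatenated TERM2GENE: adjust over 6 tests, then look at the same 3 sets.
pooled = bh_adjust(own_pvals + extra_pvals)[: len(own_pvals)]

print("separate:", separate)
print("pooled:  ", pooled)
```

With these numbers the adjusted values for the same three gene sets are larger in the pooled run, even though their raw p-values are untouched.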
