Support for exact tests #372

Open
wants to merge 2 commits into
base: main
@ytzfhqs ytzfhqs commented Aug 14, 2023

Support for exact tests, including fisher exact, barnard exact and boschloo exact

@raphaelvallat
Owner

Hi @ytzfhqs

Thanks for opening the PR! Would you be able to provide some details/justification for adding this to Pingouin? I'm not in favor of adding them since the lower-level scipy functions are already pretty straightforward to use and well documented.

@ytzfhqs
Author

ytzfhqs commented Sep 6, 2023

I'm very happy to receive your reply. I wanted to add exact tests to pingouin for two main reasons:

  1. When expected frequencies are below 5, scipy provides three methods: fisher_exact, barnard_exact, and boschloo_exact. I wanted to integrate them into a single collection of exact tests, if you remember #23.

  2. I feel like switching back and forth between pingouin and scipy while doing hypothesis testing gives a disjointed feeling. I need to look up documentation in two different places, and the final result formats are also different. Of course this is just my personal opinion, and may not align with pingouin's development principles.
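
To make the "two different result formats" point concrete, here is a minimal sketch of a thin wrapper that dispatches to the three SciPy functions and returns one uniform result. The helper name `exact_test`, the dispatch table, and the dict keys are assumptions made for illustration; this is not the code proposed in the PR. It relies only on `scipy.stats.fisher_exact`, `scipy.stats.barnard_exact` and `scipy.stats.boschloo_exact`, all of which accept a 2x2 table and an `alternative` argument.

```python
from scipy.stats import barnard_exact, boschloo_exact, fisher_exact

# Hypothetical dispatch table: method name -> SciPy exact test for 2x2 tables.
_EXACT_TESTS = {
    "fisher": fisher_exact,
    "barnard": barnard_exact,
    "boschloo": boschloo_exact,
}


def exact_test(table, method="fisher", alternative="two-sided"):
    """Run one SciPy exact test on a 2x2 table and return a plain dict."""
    if method not in _EXACT_TESTS:
        raise ValueError(
            f"method must be one of {sorted(_EXACT_TESTS)}, got {method!r}"
        )
    result = _EXACT_TESTS[method](table, alternative=alternative)
    if method == "fisher":
        # fisher_exact returns (odds ratio, p-value) and can be unpacked.
        statistic, pval = result
    else:
        # barnard_exact / boschloo_exact return objects with named fields.
        statistic, pval = result.statistic, result.pvalue
    # Note: "statistic" means something different for each method
    # (odds ratio, Wald statistic, one-sided Fisher p-value respectively).
    return {
        "method": method,
        "alternative": alternative,
        "statistic": float(statistic),
        "pval": float(pval),
    }


if __name__ == "__main__":
    table = [[8, 2], [1, 5]]
    for name in ("fisher", "barnard", "boschloo"):
        print(exact_test(table, method=name))
```

Because the statistic differs between the three tests, a single "odds ratio" column is only meaningful for the Fisher case; the sketch therefore uses a generic `statistic` key.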

Additionally, I have another question I wanted to ask. scipy's fisher_exact can only be used for 2x2 contingency tables. If my table looks like this:

       A    B    C
YES   11    6   17
NO    13    2   20

How can I use fisher_exact to test if A, B, C have significant effects on the target variable? Could we provide a solution for this in pingouin? I noticed pingouin's chi2_independence uses Yates correction when expected frequencies are <5 - could we prompt the user to use exact tests instead, and provide API interfaces?
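
One possible approach to the 2x3 question, sketched with the standard library only: compute the expected counts to see which cells fall below 5, then apply the Freeman-Halton extension of Fisher's exact test, which enumerates every table with the same row and column totals and sums the probabilities of those no more likely than the observed one. The function names `expected_counts` and `freeman_halton_2xk` are hypothetical, and this is neither SciPy's nor Pingouin's implementation. The sketch handles only tables with two rows, and the enumeration grows quickly with the column totals.

```python
from itertools import product
from math import comb, prod


def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def freeman_halton_2xk(table):
    """Two-sided exact p-value for a 2 x k table with all margins fixed."""
    first_row, second_row = table
    col_totals = [a + b for a, b in zip(first_row, second_row)]
    row_total = sum(first_row)

    def weight(cells):
        # Numerator of the multivariate hypergeometric probability of a
        # first row `cells`; the denominator is the same for every table.
        return prod(comb(c, x) for c, x in zip(col_totals, cells))

    observed = weight(first_row)
    extreme = 0
    # Enumerate the first k-1 cells of row one; the last cell is forced
    # by the row total, and row two is forced by the column totals.
    for head in product(*(range(c + 1) for c in col_totals[:-1])):
        last = row_total - sum(head)
        if 0 <= last <= col_totals[-1]:
            w = weight(head + (last,))
            if w <= observed:  # exact integer comparison, no float tolerance
                extreme += w
    return extreme / comb(sum(col_totals), row_total)


if __name__ == "__main__":
    table = [[11, 6, 17], [13, 2, 20]]  # rows: YES, NO; columns: A, B, C
    small = [e for row in expected_counts(table) for e in row if e < 5]
    print(len(small), "expected counts below 5")
    print("exact p-value:", freeman_halton_2xk(table))
```

For a 2x2 table this reduces to the usual two-sided Fisher exact test, which gives a simple way to sanity-check the sketch against `scipy.stats.fisher_exact`.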

Looking forward to your reply!

@raphaelvallat
Owner

Hi @ytzfhqs,

Thanks for the details. Actually, most of the contingency functions implemented in Pingouin were written by @arthurpaulino, who might be able to chime in and answer your last question. I rarely use these functions so might not be the best person to answer.

I understand your points and I think that we can add the chi2_exact function to Pingouin. However, would you please be able to include some unit tests in https://github.com/raphaelvallat/pingouin/blob/master/pingouin/tests/test_contingency.py?

Thanks,
Raphael

@arthurpaulino
Contributor

Oh it's been so long! Unfortunately, at this point, I have less context than @ytzfhqs ):

pingouin/contingency.py (outdated, resolved)
Co-authored-by: Quentin Barthélemy <[email protected]>
@ytzfhqs
Author

ytzfhqs commented Sep 6, 2024

@qbarthelemy That's a great idea, thank you very much for the suggestion, I will optimise my code!

Owner

@raphaelvallat raphaelvallat left a comment


Hi @ytzfhqs,

We should be good to merge once you've addressed the two minor comments below.

Thanks,
Raphael


stats = pd.DataFrame(stats)[["alternative", "odds ratio", "pval"]]

return expected, observed, _postprocess_dataframe(stats)
Owner


Could you please add some simple unit tests in https://github.com/raphaelvallat/pingouin/blob/main/tests/test_contingency.py to verify that the function works as expected?


*Boschloo’s test is an exact test used in the analysis of contingency tables.
It examines the association of two categorical variables, and is a uniformly
more powerful alternative to Fisher’s exact test for 2x2 contingency tables.*
Owner


Add:

See also
--------
scipy.stats.fisher_exact, scipy.stats.barnard_exact, scipy.stats.boschloo_exact

Comment on lines +359 to +360
method : string
Methods of exact test. Options include``fisher``,``barnard``,``boschloo``.
Contributor


Suggested change
- method : string
-     Methods of exact test. Options include``fisher``,``barnard``,``boschloo``.
+ method : {"fisher", "barnard", "boschloo"}, default="fisher"
+     Method of exact test.

4 participants