
powered_effect is not calculated in the StudentsTTest #80

Open
jpzhangvincent opened this issue Jan 10, 2023 · 1 comment
jpzhangvincent commented Jan 10, 2023

I'm a bit confused about why `powered_effect` is not calculated by `StudentsTTest`, even though it is provided by `ZTest`.

[Screenshot: the input data frame]

The above is the data frame that I passed into both

```python
stat_res_df = confidence.ZTest(
    stats_df,
    numerator_column='conversions',
    numerator_sum_squares_column=None,
    denominator_column='total',
    categorical_group_columns='variant_id',
    correction_method='bonferroni')
```

and

```python
stat_res_df = confidence.StudentsTTest(
    stats_df,
    numerator_column='conversions',
    numerator_sum_squares_column=None,
    denominator_column='total',
    categorical_group_columns='variant_id',
    correction_method='bonferroni')
```

but when I called `stat_res_df.difference(level_1='control', level_2='treatment')`, I found that the result from the Z-test provides the `powered_effect` column, as below:

[Screenshot: Z-test `difference()` output including the `powered_effect` column]

but it is missing from the t-test result. Another question: why is `required_sample_size` missing? Is there a way to also provide the sample size estimation in the result? Thanks!
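For context, the two quantities being asked about can be approximated by hand with the standard normal-approximation power formulas. This is a hedged sketch, not spotify-confidence's actual implementation; the function names and any finite-sample adjustments the library applies are assumptions here:

```python
from scipy.stats import norm

def powered_effect(n_per_group, var, alpha=0.05, power=0.8):
    """Minimal detectable absolute difference for a two-sample z-test
    (normal approximation, equal group sizes, two-sided alpha)."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    return (z_alpha + z_beta) * (2 * var / n_per_group) ** 0.5

def required_sample_size(effect, var, alpha=0.05, power=0.8):
    """Per-group sample size needed to detect `effect` at the given power."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    return (z_alpha + z_beta) ** 2 * 2 * var / effect ** 2

# Example: a conversion rate around 10%, so var = p * (1 - p) = 0.09
mde = powered_effect(n_per_group=10_000, var=0.09)
n = required_sample_size(effect=mde, var=0.09)
```

The two formulas are inverses of each other: plugging the minimal detectable effect back into `required_sample_size` recovers the original per-group sample size.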

iampelle (Contributor) commented Feb 6, 2023

Since our sample sizes at Spotify are usually very large, it doesn't make a difference whether we use Z-tests or t-tests. We have therefore mostly focused on the Z-test case and just haven't gotten around to implementing everything for the other variants. It should be simple to add, though. The only difference should be in these lines, where we could use the corresponding t-distribution methods to get the test statistics.
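To illustrate the swap being suggested (a sketch under assumed signatures, not the library's actual code), the z- and t-based two-sided p-values differ only in which distribution's survival function is called:

```python
from scipy import stats

def two_sided_p_value(point_estimate, std_err, dof=None):
    """Two-sided p-value for a difference estimate.

    With dof=None the standard normal is used (Z-test); with a
    degrees-of-freedom value, Student's t-distribution is used
    (t-test). For large samples the two are nearly identical.
    """
    statistic = point_estimate / std_err
    if dof is None:
        return 2 * stats.norm.sf(abs(statistic))   # Z-test
    return 2 * stats.t.sf(abs(statistic), df=dof)  # Student's t-test

# With large degrees of freedom the t-test converges to the Z-test:
p_z = two_sided_p_value(0.02, 0.01)
p_t = two_sided_p_value(0.02, 0.01, dof=1_000_000)
```

The same pattern applies to the power and sample-size calculations: replace `norm.ppf` with `t.ppf(..., df=dof)` wherever a critical value is needed.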

Time to make a first PR? 😉
