
Bias in UncertaintyForest performance compared to paper #377

Open
EYezerets opened this issue Nov 23, 2020 · 6 comments
@EYezerets
Collaborator

The UncertaintyForest benchmarks notebook shows that the UncertaintyForest class from ProgLearn underperforms IRF at d=20, which we did not see in the original paper.

I checked that samples are taken without replacement now in both the deprecated uncertainty-forest repo and in ProgLearn, i.e. bootstrap = False in the figure 2 tutorial in the uncertainty-forest repo, and replace = False in progressive-learner.py in ProgLearn. Also, I believe that the n_estimators (300), tree_construction_proportion (0.4), and kappa (3) values are the same.
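For reference, a minimal sketch of the configuration under comparison. The import path and the assumption that these are constructor keywords of ProgLearn's UncertaintyForest are guesses based on the parameter names listed above, not something verified against either repo:

```python
# Hypothetical sketch of the settings being compared; import path and
# keyword names are assumptions drawn from the parameters named in this issue.
from proglearn.forest import UncertaintyForest

uf = UncertaintyForest(
    n_estimators=300,                  # same number of trees in both implementations
    tree_construction_proportion=0.4,  # fraction of samples used to build tree structure
    kappa=3,                           # kappa value reported above
)
# Corresponding subsampling settings: bootstrap=False in the figure 2 tutorial
# of the deprecated uncertainty-forest repo, and replace=False in
# progressive-learner.py in ProgLearn, so both draw samples without replacement.
```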

Snapshot of documentation error:

From the paper (original Figure 2): [image]

From benchmarks in EYezerets/ProgLearn on the fig2benchmark branch: [image]

Additional context

Sorry, for some reason I'm not able to assign Richard to this issue. Could someone please help me include him in this conversation?

@PSSF23
Member

PSSF23 commented Nov 28, 2020

@EYezerets As Richard is not a contributor, he has to comment on this issue before he can be assigned.

@jovo
Member

jovo commented Nov 29, 2020

@rguo123 i can't assign you, so i slacked you ;)

@jovo jovo assigned jovo and EYezerets and unassigned jovo, EYezerets and rmehta004 Nov 29, 2020
@levinwil
Collaborator

@EYezerets Has there been enough progress on this to close this issue?

@EYezerets
Collaborator Author

@levinwil Sorry Will, we haven't really had any new ideas on this recently. Is it impeding progress on the repo?

@PSSF23
Member

PSSF23 commented Jan 14, 2021

@EYezerets One difference I found between the original notebook and the new benchmark functions: in the estimate_ce function, the original UF never held out the 0.3 testing data for evaluation. So the original UF might perform better because it used all of the training data, which would make the original figures the more "biased" ones?

@PSSF23
Member

PSSF23 commented Jan 14, 2021

To put it more specifically, the original UF has a |DP| : |DV| : |DE| ratio of 0.4 : 0.3 : 0.3, while the new benchmark functions have 0.4*0.7 : 0.6*0.7 : 0.3 = 0.28 : 0.42 : 0.3. In addition, the 0.3 evaluation dataset is the same for all trees, which differs from the original implementation.
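A hypothetical sketch of the two splitting schemes, using toy data and sklearn's train_test_split purely for illustration (neither notebook necessarily splits this way, but the resulting proportions match the ratios above):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))   # toy stand-in for the d=20 benchmark data
y = rng.integers(0, 2, size=1000)

# Original UF notebook: per tree, |DP| : |DV| : |DE| = 0.4 : 0.3 : 0.3,
# so each tree gets its own evaluation set DE.
X_rest, X_eval_per_tree, y_rest, y_eval_per_tree = train_test_split(X, y, test_size=0.3)
X_struct, X_vote, y_struct, y_vote = train_test_split(
    X_rest, y_rest, test_size=0.3 / 0.7)   # 0.3 of all data goes to voting (DV)

# New benchmark functions: a shared 0.3 evaluation set DE is held out first,
# then the remaining 0.7 is split 0.4 : 0.6 within each tree, giving
# 0.28 : 0.42 : 0.3 of the full data -- with the same DE for every tree.
X_train, X_eval_shared, y_train, y_eval_shared = train_test_split(X, y, test_size=0.3)
X_struct2, X_vote2, y_struct2, y_vote2 = train_test_split(
    X_train, y_train, test_size=0.6)       # 0.6 * 0.7 = 0.42 of all data to voting
```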
