Reproducibility Issue With Parallel Processing? #22
Thanks for the report! I am not sure this has anything to do with mlr3automl, but I am looking into it. I'll give you an update when I know more.
As an experiment, I removed the lines relating to create_autotuners, substituted a simple random-search autotuner for the ranger learner, and left the other learners untuned. This modified code gives reproducible results for the tuned ranger learner (as well as for the other untuned learners).
This result makes me think the choice of autotuner is what accounts for the difference in reproducibility.
Here is the code where I would have expected the aggregate results at the end of two identical benchmarks to be identical, but they are not. Since I am only an intermediate-level R coder, perhaps there is something wrong with my code. In any event, I pass this along for your consideration as a possible issue in mlr3automl. As you can imagine, this code takes a while to execute, roughly 10 minutes on my iMac Pro.
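The original script is not reproduced in this thread; a minimal sketch of the kind of check described, built from plain mlr3/mlr3tuning pieces (the task, learners, tuning budget, and seed below are placeholders, not the reporter's actual setup), might look like this:

```r
library(mlr3)
library(mlr3learners)   # provides classif.ranger
library(mlr3tuning)
library(paradox)        # provides to_tune()

# A random-search autotuner wrapped around ranger (placeholder search space)
make_at = function() {
  auto_tuner(
    tuner      = tnr("random_search"),
    learner    = lrn("classif.ranger", num.trees = to_tune(100, 500)),
    resampling = rsmp("cv", folds = 3),
    measure    = msr("classif.ce"),
    term_evals = 10
  )
}

design = benchmark_grid(
  tasks       = tsk("sonar"),
  learners    = list(make_at(), lrn("classif.rpart"), lrn("classif.featureless")),
  resamplings = rsmp("cv", folds = 5)
)

# Run the identical benchmark twice under the same seed
set.seed(1)
bmr1 = benchmark(design)
set.seed(1)
bmr2 = benchmark(design)

# If everything were reproducible, the aggregated scores would match exactly
agg1 = bmr1$aggregate(msr("classif.ce"))
agg2 = bmr2$aggregate(msr("classif.ce"))
print(agg1$classif.ce == agg2$classif.ce)
```

The final comparison is the "mix of TRUE and FALSE results" referred to below: one logical value per benchmarked learner, indicating whether its aggregated score was identical across the two runs.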
Here are a couple of interesting clues. If I run this code several times, the end result is the same each time (i.e., the same mix of TRUE and FALSE results for the different stochastic learners). But if I run the same code in R and then in RStudio, I get a different mix of TRUE and FALSE results depending on the platform. Finally, if I substitute a different dataset, I get a different mix of TRUE and FALSE results at the end.
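Since the title asks about parallel processing, one thing worth ruling out is how the parallel backend seeds its workers. mlr3 parallelizes via the future framework, and results from parallel workers are only reproducible when a parallel-safe seed is requested explicitly. The toy example below is not taken from the issue, just an illustration of that point:

```r
library(future)
library(future.apply)

plan(multisession, workers = 2)

# With future.seed = TRUE, each worker gets an L'Ecuyer-CMRG stream derived
# from the current RNG state, so setting the same seed beforehand makes the
# parallel draws reproducible across runs.
set.seed(123)
a = future_sapply(1:4, function(i) rnorm(1), future.seed = TRUE)
set.seed(123)
b = future_sapply(1:4, function(i) rnorm(1), future.seed = TRUE)
identical(a, b)  # TRUE when parallel-safe seeding is used

plan(sequential)
```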