Caching Prototype #382
Conversation
My thoughts about things to consider, in random order:

I added your comments and some responses.
Have we concluded to do this on a per-package basis now rather than upstream in mlr3?
This would actually be broader than doing it in
The only drawback would then be that in order to benefit from caching, those would need to be part of a pipeline, which they should be anyways in most cases.
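To make the "caching within a pipeline" idea concrete, here is a minimal sketch of hash-based memoisation of a train step. This is not the actual mlr3pipelines API; the function `cached_train` and the environment-backed store are hypothetical, illustrating only the principle of keying cached results on the operator id, its hyperparameters, and the input data (via `digest::digest`):

```r
# Conceptual sketch only: memoise a PipeOp-style train step by hashing
# its id, hyperparameters, and input. Names are illustrative.
library(digest)

cache_env = new.env()

cached_train = function(op_id, param_vals, input, train_fun) {
  # Build a cache key from everything that determines the result.
  key = digest::digest(list(op_id, param_vals, input))
  if (!is.null(cache_env[[key]])) {
    return(cache_env[[key]])  # cache hit: skip recomputation
  }
  result = train_fun(input)   # cache miss: compute and store
  cache_env[[key]] = result
  result
}
```

Under this scheme, a second call with identical id, parameters, and input returns the stored result without invoking `train_fun` again, which is exactly the benefit a stochastic or expensive PipeOp would get from caching.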
OK. I again want to raise awareness that people who want to use mlr3 without pipelines should also profit from caching. For example, when filtering, users should be able to profit from caching, but this would then require adding a per-package caching implementation (and there might be more extension packages besides filters). This could potentially conflict with the pipelines caching. Also, did you look at how drake does this? In the end it's also a workflow package that aims to cache steps within the "Graph" and detect which steps in the cache do not need to be rerun.
In general, different levels of caching should not interfere, the worst case I could imagine would be to cache things twice, i.e.
I cannot really judge this, but I am not sure I agree. I agree that we should provide enough convenience functions to enable people to work without learning all the ins and outs of `mlr3pipelines`. Currently it is as simple as `flt("variance") %>>% lrn("classif.rpart")` to write a filtered learner. So correct me if I am wrong, but when would you do filtering without using it? I have not looked at
Yeah, or maybe rely on the lower-level caching implementation if it exists.
Yeah, maybe it does not make sense and mlr3pipelines is a "must-use" in the whole game. For now I've only done toy benchmarks without any wrapped learners - all of these are still written in the old mlr. Maybe you'll find mlr3-learnerdrake interesting. We/I should extend it with a pipelines example.
Review Bernd:
```r
},
stochastic = function(val) {
  if (!missing(val)) {
    private$.stochastic = assert_subset(val, c("train", "predict"))
```
shouldn't this be read-only and set during initialization?
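The reviewer's suggestion could look like the following sketch: validate the value once in `initialize()` and expose it through an active binding that rejects assignment. The class name `PipeOpExample` and its structure are illustrative, not the actual PR code:

```r
# Sketch of a read-only field set during initialization.
# Class and field names are illustrative only.
library(R6)
library(checkmate)

PipeOpExample = R6Class("PipeOpExample",
  public = list(
    initialize = function(stochastic = character()) {
      # Validate once, at construction time.
      private$.stochastic = assert_subset(stochastic, c("train", "predict"))
    }
  ),
  active = list(
    stochastic = function(val) {
      if (!missing(val)) {
        stop("`stochastic` is read-only")
      }
      private$.stochastic
    }
  ),
  private = list(
    .stochastic = NULL
  )
)
```

Compared to the diff above, assignment after construction now errors instead of silently re-validating and overwriting the field.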
Hi folks, we have a use case in mlr3 where caching inside mlr3pipelines would be super useful:
See caching.md