-
We are trying to translate mlr code to mlr3, but this seems to be an order of magnitude more complex than what we expected. Basically all function names are different, object types are different, and the functionality is split across several packages. This is the operation we would like to translate (see complete code):
It is explained in detail in: How to do ensemble ML with mlr3 and derive predictions? Thank you!
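For illustration, even a simple resampling setup looks quite different in the two frameworks. The following is only a rough sketch for comparison, not our actual code; df and the target column "y" are placeholders, and you would not normally attach both packages in one session:
# mlr (old) -- sketch
library(mlr)
task_old = makeClassifTask(data = df, target = "y")
lrn_old = makeLearner("classif.ranger", predict.type = "prob")
rdesc = makeResampleDesc("CV", iters = 4)
res_old = resample(lrn_old, task_old, rdesc)
# mlr3 -- sketch of the equivalent
library(mlr3)
library(mlr3learners)
task_new = as_task_classif(df, target = "y")
lrn_new = lrn("classif.ranger", predict_type = "prob")
res_new = resample(task_new, lrn_new, rsmp("cv", folds = 4))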
-
Hey, did you read the stacking section in the book?
-
@thengl well, it would be great if you could still use the resources to build your answer incrementally ;-)
-
Does this help?
-
For reference, I'll post here again the code I wrote during my stay at OpenGeoHub in September; it should address most of the questions discussed here:
library("mlr3verse")
library("mlr3spatiotempcv")
library("mlr3viz")
library(data.table)
set.seed(42)
logger = lgr::get_logger("bbotk")
logger$set_threshold("warn")
# parallelization --------------------------------------------------------------
future::plan("multisession", workers = 4)
# data -------------------------------------------------------------------------
# create dataset with blocking from example dataset
task = tsk("ecuador")
data_raw = task$backend$data(1:task$nrow, task$feature_names)
group = as.factor(sample(c("class1", "class2", "class3", "class4", "class5",
  "class6", "class7", "class8"),
  task$nrow, replace = TRUE))
task$cbind(data.table("group" = group))
# tell the task about the grouping
task$set_col_roles("group", roles = "group")
task$col_roles
# preprocessing ----------------------------------------------------------------
# list of PipeOps for preprocessing
mlr_pipeops
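# the dictionary can also be converted to a data.table for easier browsing (illustrative)
as.data.table(mlr_pipeops)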
# resampling -------------------------------------------------------------------
set.seed(42)
# spcv = rsmp("spcv_coords", folds = 4)$instantiate(task)
cv = rsmp("cv", folds = 4)$instantiate(task)
# bug in mlr3spatiotempcv
autoplot(cv, task, fold_id = 1)
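# optional sanity check (illustrative, not part of the original code): with the
# "group" column role set above, each group should appear in exactly one test fold
fold_groups = lapply(seq_len(cv$iters), function(i) unique(group[cv$test_set(i)]))
stopifnot(!any(duplicated(unlist(fold_groups))))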
# tuning -----------------------------------------------------------------------
# note: I would strongly suggest not using grid search
terminator = trm("evals", n_evals = 2)
tuner = tnr("random_search")
# the same logic can be applied to a random forest or any other learner
learner_xgb = lrn("classif.xgboost", predict_type = "prob")
# parallel predictions
learner_xgb$parallel_predict = TRUE
search_space = ps(
  max_depth = p_int(5, 10),
  eta = p_dbl(0.5, 0.8),
  subsample = p_dbl(0.9, 1),
  min_child_weight = p_int(8, 10),
  colsample_bytree = p_dbl(0.5, 1)
)
at_xgb = AutoTuner$new(
  learner = learner_xgb,
  resampling = rsmp("cv", folds = 4),
  measure = msr("classif.ce"),
  terminator = terminator,
  search_space = search_space,
  tuner = tuner,
  store_models = FALSE)
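# note (illustrative, not in the original code): recent mlr3tuning releases also ship
# an auto_tuner() helper that wraps the constructor call above; roughly equivalent,
# assuming a current package version:
# at_xgb = auto_tuner(
#   tuner = tuner, learner = learner_xgb, resampling = rsmp("cv", folds = 4),
#   measure = msr("classif.ce"), terminator = terminator,
#   search_space = search_space, store_models = FALSE)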
# stacking ---------------------------------------------------------------------
stacked_graph = gunion(list(
  po("learner_cv", at_xgb),
  po("learner_cv", lrn("classif.ranger", predict_type = "prob"))
)) %>>%
  po("featureunion") %>>% lrn("classif.log_reg", predict_type = "prob")
stacked_graph$plot()
stacked_graph$keep_results = TRUE
stacked_learner = as_learner(stacked_graph)
# train stacked learner on the full task
# stacked_learner$train(task)
# resample ---------------------------------------------------------------------
# Cross-validation for stacked learner
rr_res = resample(task, stacked_learner, cv)
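# inspect the cross-validation result (illustrative)
rr_res$aggregate(msr("classif.ce"))
rr_res$score(msr("classif.ce"))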
# benchmark --------------------------------------------------------------------
# benchmark ensemble model against SVM and KKNN
# note: the design needs Learner objects, hence the GraphLearner (stacked_learner), not the raw Graph
bmr = benchmark(data.table(
  task = list(task),
  learner = list(
    stacked_learner,
    lrn("classif.svm"),
    lrn("classif.kknn")
  ),
  resampling = list(cv)))
autoplot(bmr)
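# aggregated performance per learner (illustrative)
bmr$aggregate(msr("classif.ce"))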
# train & predict --------------------------------------------------------------
stacked_learner$train(task, row_ids = 1:100)
# get base learner Predictions
# requires $keep_results = TRUE to be set
# Note Patrick: we will most likely simplify this in the future
base_learner_preds = stacked_learner$graph$pipeops$featureunion$.result[[1]]$data()
# keep only the predicted probability columns
base_learner_preds[, grepl("prob.TRUE", colnames(base_learner_preds)), with = FALSE]
pred = stacked_learner$predict(task, row_ids = 101:200)
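# evaluate the hold-out predictions (illustrative)
pred$score(msr("classif.ce"))
pred$confusion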
# get base learners ------------------------------------------------------------
# extract learner ids before the featureunion PO - no easy way currently
pos = names(stacked_learner$graph$pipeops)
i1 = pos == "featureunion"
grp = cumsum(i1)
base_learner_ids = split(pos[!i1], grp[!i1])$`0`
# for newdata, use $predict_newdata()
# predict with the fitted base learners
# usually you should only have .tuned models and then the else block can be discarded
pred_base_learners = lapply(base_learner_ids, function(x) {
  if (grepl(".tuned", x, fixed = TRUE)) {
    stacked_learner$model[[x]]$model$learner$predict(task)
  } else {
    # for untuned models we need to call the S3 predict method because the object structure differs
    predict(stacked_learner$model[[x]]$model, task$data())
  }
})
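# for genuinely new data (a data.frame with the same feature columns), the equivalent
# call would be something like the following (sketch; `newdata` is a placeholder object):
# pred_new = stacked_learner$predict_newdata(newdata)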
-
@pat-s as mlr3ensembles is now on the roadmap, how about closing this here, with a link to the new repo?