Automatically predict NA for rows w/ NAs and learners that don't support missings #2099
@@ -362,7 +362,7 @@ plotHyperParsEffect = function(hyperpars.effect.data, x = NULL, y = NULL,
 regr.task = makeRegrTask(id = "interp", data = d.run[, c(x, y, z)],
 target = z)
 mod = train(lrn, regr.task)
-prediction = predict(mod, newdata = grid)
+prediction = predict(mod, newdata = grid[c(x, y)])
Bonus bugfix!
tests/testthat/test_base_predict.R
Outdated
@@ -144,3 +144,11 @@ test_that("predict works with data.table as newdata", {
expect_warning(predict(mod, newdata = data.table(iris)), regexp = "Provided data for prediction is not a pure data.frame but from class data.table, hence it will be converted.")
})

test_that("predict with NA rows for learners that don't support missings automatically returns NA", {
mod = train("classif.knn", pid.task)
Could you also add a test for the original random forest problem please?
R/predictLearner.R
Outdated
removeNALines = function(newdata) {
namat = is.na(newdata)
if (!any(vlapply(namat, any))) {
Is this check necessary? As far as I can see the code after would do the right thing in this case as well.
I was wrong about the format that is.na(data.frame) returns; I'll drop this part.
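For reference, a small base-R sketch of what `is.na()` returns when called on a data.frame: a logical matrix, not a list of columns, so row-wise and whole-table checks can be run on it directly. The data frame `d` and the variable names are made up for illustration and are not taken from the PR.

```r
# Illustrative data frame (hypothetical, not from the PR).
d = data.frame(a = c(1, NA, 3), b = c("x", "y", NA))

# is.na() on a data.frame returns a logical matrix with one cell per entry.
namat = is.na(d)
print(is.matrix(namat))

# Because it is a matrix, the usual matrix helpers apply directly.
row.has.na = rowSums(namat) > 0  # which rows contain at least one NA
any.na = any(namat)              # is there any NA at all
print(row.has.na)
print(any.na)
```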
Thanks, merging.
…ort missings (#2099)
* predict NA if learner doesn't support that
* adding test
* drop = FALSE
* bugfix
* using old prediction as fallback when all rows are NA
* implementing @larskotthoff's suggestions
A comprehensive fix for the larger issue around #1515, this does what I described in my comment to #2068:
If the learner doesn't support 'missings', the rows containing missings are stripped, and NAs are added to the prediction in their place. This has two weaknesses:
* It takes "missings" to mean support for missings in the prediction data.
* If all rows contain NA, this falls back to the old prediction mode (and possibly creates an error if the Learner doesn't silently ignore NAs). A more thorough implementation could create the matrix / vector of NAs of appropriate type without calling predictLearner at all.
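The strip-and-pad idea described above can be sketched in base R roughly as follows. This is a minimal illustration, not the code merged in this PR: `predictStrict` and `predictWithNARows` are hypothetical names, and `predictStrict` is a stand-in for a learner whose predict function cannot handle missing values.

```r
# Hypothetical stand-in for a predict function that does not support NAs.
predictStrict = function(newdata) {
  stopifnot(!anyNA(newdata))
  newdata$x1 + 2 * newdata$x2
}

# Hypothetical wrapper: strip rows with NAs, predict on the complete rows,
# then put NA in the positions of the stripped rows.
predictWithNARows = function(predict.fun, newdata) {
  has.na = rowSums(is.na(newdata)) > 0
  if (all(has.na)) {
    # Mirrors the fallback described above: with no complete rows, call the
    # predict function on the full data (this may fail for strict learners).
    return(predict.fun(newdata))
  }
  p = predict.fun(newdata[!has.na, , drop = FALSE])
  # Indexing with NA yields an NA of the same type as the prediction.
  result = rep(p[NA_integer_], nrow(newdata))
  result[!has.na] = p
  result
}

newdata = data.frame(x1 = c(1, NA, 3, 4), x2 = c(1, 2, NA, 0))
pred = predictWithNARows(predictStrict, newdata)
print(pred)
```

The sketch handles only vector predictions; a matrix of class probabilities would need the same padding applied row-wise.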