quickpred seemingly not detecting strong correlations & looking for faster way to specify "include" variable list #496
As per the subject, when I run quickpred, the imputation seems to miss correlations that are substantially higher than the value I set for "mincor". For example, with mincor = .2, a number of variables are left out of the imputation model, as per the predictor matrix. For instance, Variables A and B, which both correlate with Variable C at r > .40, are left out of the model when Variable C is being imputed. Variable C shares a proportion of usable cases > .90 with both Variables A and B.

Here is the code I have used:

ini <- mice(rtc.pre,

Given this discrepancy, I am curious whether I am doing something wrong or not understanding how to properly use the mincor or minpuc arguments, or what else might be causing this. Thanks in advance.

My second point is that I am looking for a quicker way to specify the variables to always include in the imputations. For instance, say I have 5 outcomes that I want to make sure get included in the imputation. Using quickpred, I have the following, which works:

ini <- mice(rtc.pre, method = meth.pre, predictorMatrix = quickpred(rtc.pre,

My admittedly shallow knowledge of R coding conventions leads me to change this to the following (let's assume these 5 outcome variables occupy columns 6 to 10 in the "rtc.pre" dataset):

ini <- mice(rtc.pre, method = meth.pre, predictorMatrix = quickpred(rtc.pre,

While the imputation runs without being stopped by an error, these 5 variables do not appear as predictors for most of the imputed variables in the dataset (as per the predictor matrix).

Any direction would be very much appreciated, if only to let me know that there is no way to shorten how one identifies variables to always include in the imputation model.

Ian.
Replies: 2 comments 2 replies
Can you provide a …
If you install the `reprex` package, you can simply copy all your (relevant) code to your clipboard (i.e., [ctrl + c] the code you use), and subsequently run `reprex::reprex()` in the R console (make sure that this selection of code runs in your R environment, otherwise `reprex::reprex()` will just throw errors at you). This creates an entirely reproducible script, containing both the input from the script and the output.

Hope this helps, because on the basis of a single line of code, it is near impossible to identify what goes wrong with `quickpred()`.

With regard to your second problem, you could use `colnames(rtc.pre)[6:10]`, which creates a vector containing the 6th to 10th column names.
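
To illustrate the suggestion, here is a minimal sketch on made-up data. The data frame `toy`, its column names, and the vector `always_in` are hypothetical stand-ins for `rtc.pre` and the five outcomes, not the original poster's data. The sketch assumes the outcomes sit in columns 6 to 10 and relies on the `include` argument of `quickpred()`, which the mice documentation describes as taking variable names. The `mice` part only runs if the package is installed.

```r
# Hypothetical stand-in for rtc.pre: 10 numeric columns, outcomes in 6 to 10.
set.seed(1)
toy <- as.data.frame(matrix(rnorm(200 * 10), ncol = 10))
colnames(toy) <- c(paste0("x", 1:5), paste0("out", 1:5))

# Introduce missing values in two columns so there is something to impute.
toy[sample(200, 40), "x1"] <- NA
toy[sample(200, 40), "x2"] <- NA

# Character vector holding the names of columns 6 to 10.
always_in <- colnames(toy)[6:10]
print(always_in)

if (requireNamespace("mice", quietly = TRUE)) {
  # Pass the names (not the column positions) to `include`.
  pred <- mice::quickpred(toy, mincor = 0.2, include = always_in)

  # Rows are the variables being imputed, columns are candidate predictors.
  # Inspect whether the outcomes are selected for the incomplete variables.
  print(pred[c("x1", "x2"), always_in])
}
```

Inspecting the relevant rows of the predictor matrix before calling `mice()` is a quick way to check that the intended variables are actually selected as predictors.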