How to get manual prediction from caret's glmnet train object?

Asked by: Sayf Said | Asked: 11/16/2023 | Last edited by: Sayf Said | Updated: 11/16/2023 | Views: 18

Q:

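I have a caret model trained with glmnet, using cross-validation and hyperparameter tuning. I need to compute the predicted probability for each case manually. I am multiplying the variables by the model's coefficients, but the result differs from what caret::predict.train returns. I'm not sure whether this is the right approach or whether I'm missing something. Here is my attempt at reproducing the issue.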

Edit: fixed a typo in the code; it now uses model$FinalModel$xNames rather than pred.

library(caret)
library(tidyverse)
set.seed(1)
df1 <- data.frame(dep_var = sample(c("No","Yes"), size =1000, replace = TRUE),
                  var1= runif(1000, min = 0, max= 100),
                  var2 = runif(1000, min = 50, max= 100),
                  var_cat= sample(c("Male", "Female"), size = 1000, replace = TRUE))
set.seed(1)
train <- sample(1:nrow(df1), 0.75*nrow(df1))                  
dftrain <- df1[train,]
dftest <- df1[-train,]
fmla <- as.formula(paste("dep_var", "~", paste(c('var1', 'var2', 'var_cat'), collapse = "+")))

train_obj <- trainControl(method = "repeatedcv", 
                          number= 100, 
                          repeats=3,
                          classProbs = TRUE,
                          preProcOptions = c("BoxCox", "scale", "zv"))
pr_grid <- expand.grid(alpha = seq(0,1, length=10),
                       lambda = seq(0.0001,10, length= 20))

# Model
set.seed(2)
model <- train(fmla, 
               data = dftrain, 
               method = "glmnet",
               family = "binomial",
               metric= "ROC",
               tuneGrid= pr_grid, 
               trControl=train_obj, 
               na.action = "na.omit")

dfx <- dftest[1,]
dfx$dep_var <- NULL
pred <- caret::predict.train(model, newdata = dfx, type='prob')

#changing the name and value of a categorical variable

dfx2 <- dfx
colnames(dfx2) <- model$finalModel$xNames
dfx2$var_catMale <- 1
dfx2$`(Intercept)` <- 1

dfx2<- select(dfx2, "(Intercept)", "var1", "var2", "var_catMale")

coef <- coef(model$finalModel, model$bestTune$lambda)

pred_man <- sum(as.matrix(dfx2) %*% as.matrix(coef))

isTRUE(pred$Yes == pred_man)

> pred$Yes
[1] 0.5142378

> pred_man
[1] 0.05696666

> model$preProcess
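
As a side note on the manual step above, here is a minimal sketch of building the design row with model.matrix() instead of renaming columns and setting var_catMale by hand. The new_case values and the coefficients in beta are hypothetical placeholders, not values from the fitted model; in the real case the coefficients would come from coef(model$finalModel, model$bestTune$lambda). It assumes the factor levels are ordered Female, Male, which matches the var_catMale column name in the question. The result is the linear predictor on the log-odds scale, not a probability.

```r
# One new case with the same predictor columns as the training data.
# The factor levels must match those used in training so the dummy column lines up.
new_case <- data.frame(
  var1    = 42,
  var2    = 75,
  var_cat = factor("Female", levels = c("Female", "Male"))
)

# model.matrix() adds the "(Intercept)" column and encodes var_cat as var_catMale (0/1).
x_row <- model.matrix(~ var1 + var2 + var_cat, data = new_case)

# Hypothetical coefficients (placeholders), named after the design columns.
beta <- c("(Intercept)" = 0.10, var1 = 0.002, var2 = -0.001, var_catMale = 0.05)

# Linear predictor (log-odds): design row times coefficients, matched by column name.
log_odds <- as.numeric(x_row %*% beta[colnames(x_row)])
log_odds
```

Matching beta to colnames(x_row) by name guards against the columns and coefficients being in different orders.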

Tags: r, r-caret, predict, glmnet

Comments

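0 upvotes, Sayf Said, 11/16/2023
I found the problem. I was missing the last step of computing the probability. pred_man is the linear predictor (log-odds), so it has to be passed through the inverse logit: probability = 1/(1+exp(-pred_man)). Hate to say it, but ChatGPT saved me!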
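
The fix described in this comment can be sketched as follows, using the pred_man value printed in the question. This is an illustration rather than code from the original post; plogis() is base R's logistic function and should give the same result as the hand-written formula. The comparison uses a tolerance because an exact == test on floating-point values, as in the question, is fragile.

```r
# Linear predictor (log-odds) as printed in the question.
pred_man <- 0.05696666

# Inverse logit written out by hand, as in the comment.
prob_manual <- 1 / (1 + exp(-pred_man))

# The same transformation using base R's logistic distribution function.
prob_plogis <- plogis(pred_man)

# Compare with a tolerance instead of exact equality.
same_formula  <- isTRUE(all.equal(prob_manual, prob_plogis))
matches_caret <- isTRUE(all.equal(prob_manual, 0.5142378, tolerance = 1e-6))
same_formula
matches_caret
```

Here 0.5142378 is the pred$Yes value reported in the question, so matches_caret checks the manual probability against caret's own output.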

A: No answers yet