Cannot Extract Information from glm model using tidy function from rsample package

Asked by: George K-Agyen  Asked: 8/13/2023  Updated: 10/3/2023  Views: 62

Q:

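I have been working through the logistic regression chapter of Hands-On Programming with R. When I started, all of the code worked fine, but after I restarted my R session, running this code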

tidy(model1)

throws this error message.

`Error in UseMethod("tidy") : 
  no applicable method for 'tidy' applied to an object of class "c('glm', 'lm')"`

Here is my code up to the point where the error is thrown:

library(dplyr)
library(ggplot2)
library(rsample)
library(modeldata) #contains the attrition dataset
library(caret)
library(vip)

df <- attrition |> 
  mutate_if(is.ordered, factor, ordered = FALSE)

# create training (70%) and test (30%) sets
set.seed(191) # for reproducibility
churn_split <- initial_split(df, prop = 0.7, strata = 'Attrition')
churn_train <- training(churn_split)
churn_test <- testing(churn_split)

#model simple logistic regression (use 1 variable for prediction)
model1 <- glm(Attrition ~ MonthlyIncome, 
              family ='binomial',
              data = churn_train)
model2 <- glm(Attrition ~ OverTime, 
              family = 'binomial',
              data = churn_train)
summary(model1)
exp(coef(model1))

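All of this code runs fine, and `tidy(model1)` also worked until I restarted RStudio. I would like to know whether I did something wrong or whether the function is just acting up, and how I can fix it.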

logistic-regression r-caret rsample


A:

2 upvotes jpmostats456 10/3/2023 #1

I tried running your code and got the same error; however, after reloading the `broom` package, `tidy()` worked for both `model1` and `model2`:

set.seed(123)  # for reproducibility
churn_split <- initial_split(df, prop = .7, strata = "Attrition")
churn_train <- training(churn_split)
churn_test  <- testing(churn_split)

model1 <- glm(Attrition ~ MonthlyIncome, family = "binomial", data = churn_train) # prob. of attrition on income
model2 <- glm(Attrition ~ OverTime, family = "binomial", data = churn_train) # prob. of attrition on overtime

broom::tidy(model1) 
# A tibble: 2 × 5
  term           estimate std.error statistic      p.value
  <chr>             <dbl>     <dbl>     <dbl>        <dbl>
1 (Intercept)   -0.886    0.157         -5.64 0.0000000174
2 MonthlyIncome -0.000139 0.0000272     -5.10 0.000000344 

broom::tidy(model2)
# A tibble: 2 × 5
  term        estimate std.error statistic  p.value
  <chr>          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)    -2.13     0.119    -17.9  1.46e-71
2 OverTimeYes     1.29     0.176      7.35 2.01e-13

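There may be a problem with a package in your R environment, or with the combination of loaded packages, that leaves R confused about the model's class, `c("glm", "lm")`.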

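As far as I know, the `tidy` function in the `broom` package works by using a mechanism that identifies the specific class of the model object. Depending on the object's class, it selects the appropriate tidy method (i.e. `tidy.glm` or `tidy.lm`) and converts the model into a tidy data frame.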
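The dispatch mechanism described above can be sketched in base R. This is a minimal illustration, not broom's actual code: `tidy_demo` and `tidy_demo.lm` are hypothetical names invented here, and the built-in `mtcars` data stands in for the attrition data.

```r
# Hypothetical generic, named tidy_demo so it cannot clash with broom::tidy
tidy_demo <- function(x, ...) UseMethod("tidy_demo")

# A method for "lm" objects only. A glm fit can still reach it,
# because its class vector is c("glm", "lm") and dispatch tries each class in turn.
tidy_demo.lm <- function(x, ...) {
  data.frame(term = names(coef(x)), estimate = unname(coef(x)))
}

fit <- glm(am ~ wt, family = binomial, data = mtcars)
class(fit)               # the class vector that dispatch walks through
res <- tidy_demo(fit)    # no tidy_demo.glm exists, so tidy_demo.lm is used
print(res)

# With no method for any class of the object and no default method, dispatch fails
err <- tryCatch(tidy_demo("text"), error = function(e) conditionMessage(e))
print(err)
```

The last call fails in the same way as the error in the question: the generic is found, but no method matches the object's class.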

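R probably tried to work out the model's class but stopped once the class failed the check for `tidy.glm`. (You can type `broom:::tidy.glm` to see the function's arguments.)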
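One way to check what R can currently see is sketched below. `tidy_status()` is a hypothetical helper written for illustration, not part of any package, and it uses only base R functions. The assumption behind it is that another attached package supplies the `tidy` generic itself (rsample appears to re-export it from the generics package), while the `glm` method only becomes available once broom's namespace is loaded.

```r
# Hypothetical helper: reports where tidy() comes from and whether broom is available
tidy_status <- function() {
  # Is any function named tidy visible from the global environment?
  visible <- exists("tidy", mode = "function", envir = globalenv())
  list(
    generic_visible = visible,
    # Environment (usually a package namespace) of the visible function, NA if none
    generic_from = if (visible) {
      environmentName(environment(get("tidy", mode = "function", envir = globalenv())))
    } else {
      NA_character_
    },
    broom_attached = "package:broom" %in% search(),      # library(broom) was called
    broom_loaded   = "broom" %in% loadedNamespaces()     # namespace loaded, methods registered
  )
}

st <- tidy_status()
str(st)
```

If `generic_visible` is `TRUE` while `broom_loaded` is `FALSE`, that would be consistent with the "no applicable method" error in the question.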

You will see that a model object of class `glm` displays the following result:

summary(model1)
Call:
glm(formula = Attrition ~ MonthlyIncome, family = "binomial", # identifies the model
    data = churn_train)

Coefficients:
                Estimate Std. Error z value Pr(>|z|)    # calculates p-value.
(Intercept)   -8.861e-01  1.572e-01  -5.636 1.74e-08 ***
MonthlyIncome -1.386e-04  2.719e-05  -5.098 3.44e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1) # Provides a dispersion parameter like `lm` objects

    Null deviance: 905.68  on 1027  degrees of freedom 
Residual deviance: 870.83  on 1026  degrees of freedom
AIC: 874.83 # Provides a statistic (AIC instead of the R-squared and the F-statistic)

Number of Fisher Scoring iterations: 5

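So, in theory, `tidy(model1)` should work whether your input is a glm or an lm model. If there is some problem with your packages, R may throw the error because it believes no `tidy` method accepts an object of class `c("glm","lm")`.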

Reloading the package seems to be the simple fix for this problem.
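A minimal sketch of that fix, assuming broom is installed: either call `library(broom)` at the top of the script, or qualify the call with the namespace as below. The built-in `mtcars` data is used here only so the snippet runs on its own; it is not the attrition model from the question.

```r
if (requireNamespace("broom", quietly = TRUE)) {
  fit <- glm(am ~ wt, family = binomial, data = mtcars)
  # Namespace-qualified call: loads broom if needed, so the glm method is available
  coefs <- broom::tidy(fit)
  print(coefs)
} else {
  message("broom is not installed; install.packages('broom') would be needed first")
}
```

Putting `library(broom)` next to the other `library()` calls keeps the script working after a restart, since a fresh session starts without any of the previously loaded packages.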

Comments

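1 upvote George K-Agyen 10/5/2023
Thank you for your answer. I think the problem was related to the broom package. Everything works fine after loading the broom package 👍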