plot_slopes无法在迭代函数中计算预测的回归模型值，但可以在函数外部预测每个值-解网

问：

这是一个很难回答的问题，因为我无法提供实际的复制数据（很抱歉，数据太大/敏感），但我很感激任何人的想法。我在这里使用示例数据来显示代码的结构，尽管此代码不会产生我在下面详述的错误。

我有一个复杂的函数，我执行多重回归以隔离交互效应，并使用精彩包中的函数将它们绘制在一起。我的代码结构如下。plot_slopesmarginaleffects

library(tidyverse)
library(marginaleffects)

outcome_var_list = c("mpg","cyl","hp")
interact_var_list =  c("am","wt")
subset_var_list = c("3","4")

combos = expand.grid(outcome_var_list, subset_var_list)


model_ <- function(k,c){

mtcars = mtcars[mtcars$gear == c,]

outcome_var_list = outcome_var_list[outcome_var_list == k] 

results = list()
  for (r in interact_var_list) {
    f = paste(k, "~", r, "*factor(vs)")
    m = lm(f, subset(mtcars, carb %in% c(1,2,3,4)))
    s = plot_slopes(m, variables = r, condition = "vs", draw = FALSE)
    tmp = s[, c("estimate", "conf.low", "conf.high", "vs")]
    tmp$outcome = k
    tmp$regressor = r
    results = c(results, list(tmp))
  }
results = do.call("rbind", results)  


plot1 = results %>% 
  mutate(min = min(conf.low), max = max(conf.high)) %>% 
  ggplot(aes(x=factor(vs), 
             y=estimate, 
             color = regressor, 
             ymin=conf.low, 
             ymax=conf.high)) + 
  geom_errorbar(position = position_dodge(0.4)) + 
  geom_point(position = position_dodge(0.4)) + 
  scale_x_discrete(expand = c(0,0)) + 
  theme_light() + 
  ggtitle(label = paste0("Model 1: ",k)) + 
  theme(plot.title = element_text(hjust = 0.5)) + 
  labs(y= "Interaction Coefficient", x = "X") + 
  theme(plot.title = element_textbox_simple(vjust=-1)) +
  theme(plot.margin = margin(2,0,0,0, "cm")) + 
  theme(axis.text.x = element_text(size = 5))

  ggsave(plot1,file=paste0("plot_",k,".png"),path=paste0(c))

}


safe_model <- safely(model_)
iterate <-mapply(safe_model,combos$Var1,combos$Var2)

当我在添加额外内容之前运行它时，一切都很好。subset_var_list

但是，当我添加子集 var 并切换到 mapply 时，我现在为整个数据子集生成模型，其中只有几秒可以生成要保存的图。我认为这是由于缺乏覆盖范围或其他原因，所以我删除了，这揭示了以下错误：outcome_varsafely

Error: Unable to compute predicted values with this model. You can try to supply a different dataset to the `newdata` argument.
  If this does not work, you can file a report on the Github Issue Tracker:
  https://github.com/vincentarelbundock/marginaleffects/issues

我再次认为这是由于缺乏覆盖率/变异性，因此我尝试手动运行每个单独的交互回归，而不是通过函数运行，并且每个图/回归都成功为其中一个子集生成，否则不会生成图。因此，这似乎不是模型/缺乏变化的问题。plot_slopes

我可能会误解错误消息中提出的论点是如何工作的，但似乎它具有执行任务的完整数据，这应该不会带来任何好处。回归/斜率模型可以单独/手动生成，而当放在这个函数中时，它们无法预测，这有什么原因吗？如何解决此问题并为每个子集创建所需的所有绘图？newdata

更新 2

我能够让它工作，但我仍然想知道问题是什么，希望是否有办法提高运行模型的速度。

我在 for 循环中过滤了模型，以删除结果变量的 NA。我这样做的理由是，我的数据集中的某些交互变量没有任何数据覆盖率，因此，当它们都是 NA 时，即使所有 NA 都在回归中被删除，它也会触发有关因子水平不足的消息，因此应该删除它们。在回归之前，我手动执行此操作，现在它可以工作了

我的新代码如下：


library(tidyverse)
library(marginaleffects)

outcome_var_list = c("mpg","cyl","hp")
interact_var_list =  c("am","wt")
subset_var_list = c("3","4")

combos = expand.grid(outcome_var_list, subset_var_list)


model_ <- function(k,c){

mtcars = mtcars[mtcars$gear == c,]

outcome_var_list = outcome_var_list[outcome_var_list == k] 

results = list()
  for (r in interact_var_list) {
    f = paste(k, "~", r, "*factor(vs)")
    m = lm(f, mtcars %>% filter(is.na(k))
    s = plot_slopes(m, variables = r, condition = "vs", draw = FALSE)
    tmp = s[, c("estimate", "conf.low", "conf.high", "vs")]
    tmp$outcome = k
    tmp$regressor = r
    results = c(results, list(tmp))
  }
results = do.call("rbind", results)  


plot1 = results %>% 
  mutate(min = min(conf.low), max = max(conf.high)) %>% 
  ggplot(aes(x=factor(vs), 
             y=estimate, 
             color = regressor, 
             ymin=conf.low, 
             ymax=conf.high)) + 
  geom_errorbar(position = position_dodge(0.4)) + 
  geom_point(position = position_dodge(0.4)) + 
  scale_x_discrete(expand = c(0,0)) + 
  theme_light() + 
  ggtitle(label = paste0("Model 1: ",k)) + 
  theme(plot.title = element_text(hjust = 0.5)) + 
  labs(y= "Interaction Coefficient", x = "X") + 
  theme(plot.title = element_textbox_simple(vjust=-1)) +
  theme(plot.margin = margin(2,0,0,0, "cm")) + 
  theme(axis.text.x = element_text(size = 5))

  ggsave(plot1,file=paste0("plot_",k,".png"),path=paste0(c))

}


safe_model <- safely(model_)
iterate <-mapply(safe_model,combos$Var1,combos$Var2)

这将运行模型并允许我成功生成它们。但是，当我尝试过滤循环外的数据帧以提高处理速度时，我仍然收到原始错误：results = list(...


Error: Unable to compute predicted values with this model. You can try to supply a different dataset to the `newdata`
  argument. If this does not work, you can file a report on the Github Issue Tracker:
  https://github.com/vincentarelbundock/marginaleffects/issues

这对我来说很奇怪，因为从理论上讲，如果在任一阶段过滤 df，结果应该是相同的。在for循环之外过滤时，是否有任何原因导致此错误？如果无法解决，在for循环中过滤时，有没有办法加快这个过程？

ggplot2 mapply 际 r-边际效应

我不确定问题出在哪里，这是软件包中最棘手和最难调试的角落之一（即：在嵌套环境中挖掘模型对象以恢复训练数据）。但如果我是你，我会尝试直接使用而不是.您可以使用以下方法轻松获得相同的结果：slopes()plot_slopes()slopes(model, newdata=datagrid(vs = seq(min(vs), max(vs), length.out = 100)))

0赞 flâneur 10/5/2023

谢谢你的想法。在这种情况下，奇怪的是，可以生成的模型更少@Vincent

0赞 flâneur 10/7/2023

我已经更新了这个问题，再次尝试解决它，这次有不同的错误。

答：

1赞 I_O 10/6/2023 #1

无论如何，使用 tidyverse，您可以创建一个具有所需组合的网格（），并使用一列来扩展此数据帧以保存模型，另一列用于保存绘图对象以供进一步使用（在此过程中将它们保存为副作用）。请注意，使用 rowwise（）和 list（） 将任意对象放入 DataFrame 单元格中：expand.grid

library(dplyr)
library(ggplot2)

## subset mtcars once only (unless you want different subsets
## per row):
d <- subset(mtcars, carb %in% 1:4)

result <- 
  expand.grid(outcome = c('mpg','cyl','hp'),
              regressor = c('am', 'wt')
              ) |>
  rowwise() |>
  mutate(model = list(
           lm(sprintf('%s ~ %s * factor(vs)', outcome, regressor), data = d)
         ),
         slopes = list(slopes(model)),
         plots = list({p <- ggplot(data = slopes) +
                    geom_point(aes(factor(vs), estimate, color = term))
                    ## save the plot as you go:
                    ggsave(p, filename = sprintf('plot_%s_%s.png', outcome, regressor))
                    p ## return the plot object
         })
         )

输出：

## > result
## # A tibble: 6 x 5
## # Rowwise: 
##   outcome regressor model  slopes             plots 
##   <fct>   <fct>     <list> <list>             <list>
## 1 mpg     am        <lm>   <slopes [60 x 15]> <gg>  
## 2 cyl     am        <lm>   <slopes [60 x 15]> <gg>  
## 3 hp      am        <lm>   <slopes [60 x 15]> <gg>  
## 4 mpg     wt        <lm>   <slopes [60 x 15]> <gg>  
## 5 cyl     wt        <lm>   <slopes [60 x 15]> <gg>

根据需要拾取和重复使用对象，例如：

> result$model[[3]]

Call:
lm(formula = sprintf("%s ~ %s * factor(vs)", outcome, regressor), 
    data = d)

Coefficients:
   (Intercept)              am     factor(vs)1  am:factor(vs)1  
        194.17          -50.42          -92.02           28.85

编辑

要计算每个齿轮的叶子，您可以稍微修改上面的代码（{dplyr} 将是另一种方法）：nest_by

result <- 
  expand.grid(gear = 3:5, ## add gear to factor combinations
              outcome = c('mpg','cyl','hp'),
              regressor = c('am', 'wt')
              ) |>
  rowwise() |>
  mutate(model = list(
           lm(sprintf('%s ~ %s * factor(vs)', outcome, regressor),
              ## subset by gear:
              data = subset(d, gear == gear)
              )
         ),
         slopes = list(slopes(model)),
         plots = list({p <- ggplot(data = slopes) +
                    geom_point(aes(factor(vs), estimate, color = term))
                    ## save the plot as you go:
                    ggsave(p, filename = sprintf('plot_%s_%s.png', outcome, regressor))
                    p ## return the plot object
         })
         )

输出：

## # A tibble: 18 x 6
## # Rowwise: 
##     gear outcome regressor model  slopes             plots 
##    <int> <fct>   <fct>     <list> <list>             <list>
##  1     3 mpg     am        <lm>   <slopes [60 x 15]> <gg>  
##  2     4 mpg     am        <lm>   <slopes [60 x 15]> <gg>  
##  3     5 mpg     am        <lm>   <slopes [60 x 15]> <gg>  
##  4     3 cyl     am        <lm>   <slopes [60 x 15]> <gg>  
## ... 14 more rows

谢谢，这更接近了，我确实可以为我曾经无法的变量生成绘图。但它仍然与我在第一个模型中所做的并不完全匹配。在编辑中，结果是每个回归的多个图。但我希望每个回归都绘制在同一图中。我只想对回归变量和 vs 的交互作用进行建模，而不是像当前模型那样获得 VS 的独立估计值。这可能吗？（我也很欣赏 spintf 添加的简单性，这是一个很好的触感！

0赞 I_O 10/7/2023

现在，您已经有了一个工作模板，用于迭代自定义输入组合，将任意函数的有效负载提供给工厂，收集副作用（例如绘图）并将结果与输入一起存储在数据帧中......为什么不从那里开始定制您的解决方案呢？如果失败了，你总是可以挑出具体的问题（除了迭代机制之外，可能会有一些细节）并将其作为最小的可重现示例发布？

上一个：是否可以通过glmnet计算对脊/套索逻辑回归模型拟合的边际效应？

下一个：如何防止 R 使用混合模型将回归线外推到数据点范围之外？

plot_slopes无法在迭代函数中计算预测的回归模型值，但可以在函数外部预测每个值

plot_slopes cannot compute predicted regression model values in an iterative function, but can predict each one outside of the function

评论

评论