Creating a 'marginsplot' in R

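Asked by: Sandro  Asked: 7/26/2023  Last edited by: jay.sf, Sandro  Updated: 7/27/2023  Views: 81

Q:

Inspired by this YouTube video, https://www.youtube.com/watch?v=7maMbX_65b0, how can I recreate Stata's marginsplot in R?

In other words, for the cplot() line at the end of the code block, how can I make the plot display the predicted values over increments of 'age' by levels of 'smoke'?

Any help is always much appreciated!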

library(margins)
set.seed(42)
n <- 1000
patient <- data.frame(id=1:n,
                      treat = factor(sample(c('Treat','Control'), n, rep=TRUE, prob=c(.5, .5))),
                      age=sample(18:80, n, replace=TRUE),
                      sex = factor(sample(c('Male','Female'), n, rep=TRUE, prob=c(.6, .4))),
                      smoke=factor(sample(c("Never", 'Former', 'Current'), n, rep=TRUE, prob=c(.25, .6, .15))),
                      outcome=runif(n, min=16, max=45))

model <- lm(outcome ~ treat*age + smoke, data = patient)
cplot(model, x="age", by="smoke", overlay=TRUE)
Tags: r, lm, marginal-effects


A:

1 upvote SamR 7/26/2023 #1

I think what you are looking for can be done with sjPlot::plot_model():

library(ggplot2)
library(sjPlot)

plot_model(
    model,
    type = "pred",
    terms = c("age", "smoke"),
    ci.lvl = NA
) +
    theme_bw()

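1 upvote Vincent 7/26/2023 #2

You can do all of this with the marginaleffects package (disclaimer: I am the maintainer). On the website you will find over 25 vignettes, including a full vignette on plots.

Note that I added an interaction to make the plot look more interesting: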

library(marginaleffects)
library(ggplot2)
set.seed(42)
n <- 1000
patient <- data.frame(id=1:n,
                      treat = factor(sample(c('Treat','Control'), n, rep=TRUE, prob=c(.5, .5))),
                      age=sample(18:80, n, replace=TRUE),
                      sex = factor(sample(c('Male','Female'), n, rep=TRUE, prob=c(.6, .4))),
                      smoke=factor(sample(c("Never", 'Former', 'Current'), n, rep=TRUE, prob=c(.25, .6, .15))),
                      outcome=runif(n, min=16, max=45))

model <- lm(outcome ~ treat * age * smoke, data = patient)

plot_predictions(model, condition = c("age", "smoke")) + theme_minimal()

Or you can plot slopes:

plot_slopes(model, variables = "treat", condition = c("age", "smoke")) + theme_minimal()

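1 upvote jay.sf 7/27/2023 #3

To replicate Stata's marginsplot, what you want is to visualize the mean of all possible predictions of the model along the desired xy variables, in this case age and smoke.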

lm1 <- lm(outcome ~ treat*age*smoke, data=patient)

So first, we use expand.grid() to build a grid of all combinations of the predictor values:

.newdata <- expand.grid(
  treat=unique(patient$treat),
  age=with(patient, min(age):max(age)),
  sex=unique(patient$sex),
  smoke=unique(patient$smoke)
  )

We feed this grid to predict() and cbind() the result to it:

.newdata <- cbind(.newdata, predict(lm1, newdata=.newdata, interval='conf'))

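Next, we aggregate() the means of the fitted values (fit) along the smoke and age variables, together with the respective lwr and upr bounds of the confidence interval.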

agg <- aggregate(cbind(fit, lwr, upr) ~ smoke + age, .newdata, mean)
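
The three steps above (grid, predict, aggregate) can be sketched in a self-contained form. This is only an illustration on the built-in mtcars data, not the asker's data; the model formula and the names d, m, and grid are assumptions made up for the example.

```r
# Toy data from the built-in mtcars set; these are NOT the asker's variables.
d <- transform(mtcars, cyl = factor(cyl), am = factor(am))
m <- lm(mpg ~ wt * cyl + am, data = d)

# Step 1: every combination of predictor values (5 x 3 x 2 = 30 rows).
grid <- expand.grid(
  wt  = seq(min(d$wt), max(d$wt), length.out = 5),
  cyl = levels(d$cyl),
  am  = levels(d$am)
)

# Step 2: predictions plus confidence bounds, bound to the grid as
# columns fit, lwr and upr.
grid <- cbind(grid, predict(m, newdata = grid, interval = "confidence"))

# Step 3: average over am, the predictor that is not displayed.
agg <- aggregate(cbind(fit, lwr, upr) ~ cyl + wt, data = grid, FUN = mean)
head(agg)
```

Because the grid is balanced, each level of the averaged-out variable (am here) gets equal weight in the mean, regardless of how often it occurs in the data.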

With that, the preprocessing is done and we are ready to plot.

par(mar=c(4, 4, 3, 2) + .1)
plot.new();plot.window(range(agg$age) + c(0, 2), range(agg[3:5]) + c(0, 2))
by(agg, agg$smoke, \(x) with(x, lines(age + as.integer(smoke) - 2, fit, col=smoke)))
dec <- agg$age %% 10 == 0
by(agg[dec, ], agg[dec, ]$smoke, \(x) 
   with(x, points(age + as.integer(smoke) - 2, fit, col=smoke, pch=20)))
by(agg[dec, ], agg[dec, ]$smoke, \(x) 
   with(x, arrows(age + as.integer(smoke) - 2, lwr, age + as.integer(smoke) - 2, upr, 
                  col=smoke, code=3, angle=90, length=.05)))
axis(1, axTicks(1)); axis(2, axTicks(2))
mtext('age', 1, 2.5); mtext('pred. outcome', 2, 2.5)
legend('topleft', pch=20, col=1:3, legend=unique(agg$smoke),
       title='smoke', horiz=TRUE, cex=.9)
box()

Data:

set.seed(42)
n <- 1000
patient <- data.frame(
  id=1:n, treat=factor(sample(c('Treat','Control'), n, T)),
  age=sample(18:80, n, T), sex=factor(sample(c('Male','Female'), n, T, c(.6, .4))),
  smoke=factor(sample(c("Never", 'Former', 'Current'), n, T, c(.25, .6, .15))),
  outcome=runif(n, min=16, max=45))

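Comments

0 upvotes Sandro 7/28/2023
Thanks @jay.sf!! When trying to use your solution on my actual data, I ran into the following error message: "Error: vector memory exhausted (limit reached?)". Any tips on temporarily providing more memory?
0 upvotes jay.sf 7/28/2023
@Sandro You're welcome! Hmm, strange, not sure. When exactly does the error occur? Maybe try this: stackoverflow.com/q/51295402/6574038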
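
One possible cause, offered as an assumption because the real data are not shown, is the size of the prediction grid: the row count of expand.grid() is the product of the number of values per variable, so it grows quickly. A minimal sketch of how to estimate that size, and of restricting the grid to the predictors that actually appear in the model formula, follows. The counts match the example data in this thread; n_levels and used are names introduced only for this illustration.

```r
# Number of distinct values per variable in the example data
# (age covers 18:80, i.e. 63 values).
n_levels <- c(treat = 2, age = 63, sex = 2, smoke = 3)

# Rows in the full grid: the product of all counts.
prod(n_levels)

# Predictors that actually appear in the model formula; sex is not one
# of them, so it can be left out of expand.grid() without changing the
# predictions.
f <- outcome ~ treat * age * smoke
used <- all.vars(f)[-1]   # drop the response

# Rows in the reduced grid.
prod(n_levels[used])
```

With many or high-cardinality predictors, a coarser grid for the continuous variable (for example every second year of age) shrinks the grid further.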