提问人:Rodrigo Saquicela 提问时间:2/15/2023 最后编辑:jay.sfRodrigo Saquicela 更新时间:2/15/2023 访问量:34
增强似然比,用于比较使用 lm() 函数获得的模型
Enhanced likelihood ratio for comparison of models obtained with the lm() function
问:
我在 R 中寻找一个函数来比较使用 lm() 函数创建的模型,并具有以下统计数据:
Models df AIC BIC logLik Test L.ratio p.value
1 model1 6 161.65 170.44 -74.83
2 model2 8 159.39 171.11 -71.69 1 vs. 2 6.26 0.0437
3 model3 5 159.77 167.10 -74.88 2 vs. 3 6.38 0.0945
4 model4 5 162.26 169.59 -76.13 3 vs. 4 2.49 0
R 中是否有函数可以返回所示的比较表,特别是针对使用 lm() 创建的模型?我在 R 中使用以下代码获取了比较表:
dat <- read.csv2("https://raw.githubusercontent.com/saquicela/mineria/main/vapor_data.csv",sep=";",dec = ".", stringsAsFactors = TRUE)
model1 <- lm(y ~ x1 + x2 + x3 + x4, data = dat)
model2 <- lm(y ~ poly(x1,2) + x2 + poly(x3,2) + x4, data = dat)
model3 <- lm(y ~ x2 + x3 + x4, data = dat)
model4 <- lm(y ~ x1 + x2 + x4 , data = dat)
aic <- AIC(model1,model2,model3,model4)
bic <- BIC(model1,model2,model3,model4)
row_names <- rownames(aic)
library(lmtest)
likelihood_tests <- lrtest(model1,model2,model3,model4)
test_names <- c("",paste(seq(1,length(row_names)-1,1),
"vs.",
seq(2,length(row_names),1)))
models_comparison <- data.frame("Models" = row_names,
"df" = aic$df,
"AIC" = round(aic$AIC,2),
"BIC" = round(bic$BIC,2),
"logLik" = round(likelihood_tests$LogLik,2),
"Test" = test_names,
"L.ratio" = round(likelihood_tests$Chisq,2),
"p-value" = round(likelihood_tests$`Pr(>Chisq)`,4))
models_comparison[is.na(models_comparison)] <- ""
models_comparison
非常感谢您的帮助。
答:
1赞
jay.sf
2/15/2023
#1
实际上,您刚刚编写了函数。我们可以使用代替和添加意义星星,这也是一个回报,这不是ed。formatC
round
invisible
round
my_LR <- \(..., quiet=FALSE) {
aic <- AIC(...)
bic <- BIC(...)
row_names <- sapply(substitute(...()), deparse)
likelihood_tests <- lmtest::lrtest(...)
test_names <- c("", paste(seq(1, length(row_names)-1, 1),
"vs.",
seq(2, length(row_names), 1)))
models_comparison <- out <- data.frame(Models=row_names,
df=aic$df,
AIC=aic$AIC,
BIC=bic$BIC,
logLik=likelihood_tests$LogLik,
Test=test_names,
L.ratio=likelihood_tests$Chisq,
p.value=likelihood_tests$`Pr(>Chisq)`
)
symp <- symnum(out$p.value[-1], corr=FALSE,
cutpoints=c(0, .001, .01, .05, .1, 1),
symbols=c("***", "** ", "* ", ". ", " "))
out[, c(3:5, 7:8)] <- mapply(\(x, y) formatC(x, digits=y, format='f'),
out[, c(3:5, 7:8)], c(2, 2, 2, 2, 4))
out[1, 7:8] <- ''
if (!quiet) print(cbind(out, ' '=c(' ', symp)))
return(invisible(models_comparison))
}
用法
res <- my_LR(model1, model2, model3, model4)
# Models df AIC BIC logLik Test L.ratio p.value
# 1 model1 6 161.65 170.44 -74.83
# 2 model2 8 159.39 171.11 -71.69 1 vs. 2 6.26 0.0437 *
# 3 model3 5 159.77 167.10 -74.88 2 vs. 3 6.38 0.0945 .
# 4 model4 5 162.26 169.59 -76.13 3 vs. 4 2.49 0.0000 ***
res
# Models df AIC BIC logLik Test L.ratio p.value
# 1 model1 6 161.6505 170.4449 -74.82524 NA NA
# 2 model2 8 159.3886 171.1145 -71.69430 1 vs. 2 6.261884 0.04367663
# 3 model3 5 159.7684 167.0971 -74.88422 2 vs. 3 6.379839 0.09452374
# 4 model4 5 162.2623 169.5909 -76.13113 3 vs. 4 2.493821 0.00000000
如果我们不想打印任何东西,只需返回即可。quiet=FALSE
评论
anova(model1, model2, model3, model4)
不完全是你要找的,但它能帮上忙吗?