提问人:Lis 提问时间:8/29/2022 更新时间:8/29/2022 访问量:274
手动获取具有伽马分布的 GLM 和具有反瓜斯分布的 GLM 的响应
Manually get the responses from GLM with gamma distribution and a GLM with inverse guassian distribution
问:
我一直在尝试从 R 中的 stats 包中手动获取 predict.glm 函数给出的响应值。但是,我无法这样做。我只知道如何手动获取二项分布的值。我真的很感激一些帮助。我创建了两个小模型(一个是伽马族,另一个是逆高斯族)。
library(stats)
library(dplyr)
data("USArrests")
#Gamma distribution model
model_gam <- glm(Rape~Murder + Assault + UrbanPop, data=USArrests, family=Gamma)
print(summary(model_gam))
responses_gam <- model_gam %>% predict(USArrests[1,], type="response")
print(responses_gam)
#Trying to manually get responses for gamma model
paste(coef(model_gam), names(coef(model_gam)), sep="*", collapse="+")
# "0.108221470842499*(Intercept)+-0.00122165587689519*Murder+-9.47425665022909e-05*Assault+-0.000467789606041651*UrbanPop"
print(USArrests[1,])
#Murder: 13.2, Assault: 236, UrbanPop: 58
x = 0.108221470842499 - 0.00122165587689519 * 13.2 - 9.47425665022909e-05 * 236 - 0.000467789606041651 * 58
# This is wrong. Do I have to include the dispersion? (which is 0.10609)
print (exp(x)/(1+exp(x)))
# result should be (from predict function): 26.02872
# exp(x)/(1+exp(x)) gives: 0.510649
# Gaussian distribution model
model_gaus <- glm(Rape~Murder + Assault + UrbanPop, data=USArrests, family=inverse.gaussian(link="log"))
responses_gaus <- model_gaus %>% predict(USArrests[1,], type="response")
print(summary(model_gaus))
print(responses_gaus)
#Trying to manually get responses for gaussian model
paste(coef(model_gaus), names(coef(model_gaus)), sep="*", collapse="+")
# "0.108221470842499*(Intercept)+-0.00122165587689519*Murder+-9.47425665022909e-05*Assault+-0.000467789606041651*UrbanPop"
x = 1.70049202188329-0.0326196928618521* 13.2 -0.00234379099421488*236-0.00991369000675323*58
# Dispersion in this case is 0.004390825
print(exp(x)/(1+exp(x)))
# result should be (from predict function): 26.02872
# exp(x)/(1+exp(x)) it is: 0.5353866
答:
3赞
Ben Bolker
8/29/2022
#1
内置predict()
predict(model_gaus)["Alabama"] ## 3.259201
手工
cat(paste(round(coef(model_gaus),5), names(coef(model_gaus)), sep="*", collapse="+"),"\n")
## 1.70049*(Intercept)+0.03262*Murder+0.00234*Assault+0.00991*UrbanPop
USArrests["Albama",]
## Murder Assault UrbanPop Rape
## Alabama 13.2 236 58 21.2
截距的值始终为 1,因此我们有
1.70049*1+0.03262*13.2+0.00234*236+0.00991*58
## [1] 3.258094
(足够接近,因为我四舍五入了一些东西)
您不需要对色散或反链接函数执行任何操作,因为高斯模型使用恒等链接。
使用模型矩阵
在数学上,回归方程被定义为 其中 是系数向量,是模型矩阵(例如,它是截距加预测变量的一列;对于具有分类预测变量或更复杂的术语(如样条)的模型,它稍微复杂一些)。您可以使用以下命令从矩阵中提取模型矩阵:X %*% beta
beta
X
model.matrix()
Xg <- model.matrix(model_gaus)
drop(Xg["Alabama",] %*% coef(model_gaus))
对于 Gamma 模型,您将使用完全相同的过程,但最后您将转换您计算的线性表达式(线性预测变量)(Gamma 的反链接函数)。(请注意,您需要获得逆变换的预测;否则 [default ] R 将只为您提供纯线性表达式。如果改用日志链接,则会求幂。更一般地说,1/x
predict(..., type = "response")
type = "link"
invlinkfun <- family(fitted_model)$linkinv
X <- model.matrix(fitted_model)
beta <- coef(fitted_model)
invlinkfun(X %*% beta)
逆高斯模型默认使用链接; 是1/mu^2
inverse.gaussian()$linkinv
function(eta) { 1/sqrt(eta) }
评论