为什么我的 Gibbs 采样器没有收敛到真实的参数值?

Why is my Gibbs Sampler not converging to true parameter values?

提问人:redmuffin 提问时间:11/9/2023 最后编辑:redmuffin 更新时间:11/9/2023 访问量:23

问:

我模拟了根据以下模型生成的时间序列数据:y_ij = alpha_i + epsilon_i,其中 i=1 或 2.,j=1,...,100。(错误epsilon_i假定为行为良好)。时间序列数据在 tau=50 处有一个变化点,其中 i=2。我正在尝试对数据使用吉布斯采样,看看我是否可以让 MCMC 收敛到我最初设置的真实参数值(只有制度 1 和 2 表示 alpha_1 和 alpha_2,误差方差sig2_1和 sig2_2),但是,尽管链中进行了多次迭代,但我还是收敛到与这些值不同的值。前面的发行版如下所述,我使用的代码遵循它,结果在底部。任何帮助将不胜感激:

      alpha_1 ~ N((sumdata1/sig2_1 + mu0_1/ssquared0_1)/(tau/sig2_1 + 1/ssquared0_1), 
                            1/(tau/sig2_1 + 1/ssquared0_1))     
              
              alpha_2 ~ N((sumdata2/sig2_2 + mu0_2/ssquared0_2)/(n-tau)/sig2_2 +     1/ssquared0_2), 
              1/(n-tau)/sig2_2 + 1/ssquared0_2))
              
              sig2_1 ~ InvGamma(zeta1 + tau/2, delta1 + sumsq1/2)
              
              sig2_2 ~ InvGamma(zeta2 + (n-tau)/2, delta2 + sumsq2/2)
              
              #Where sumdatai is the sum of the y's for regime i=1 or 2 and ssquared0_i and mu0_i are prior
              #hyperparameters for alpha_i. sumsqi is the sum of squared differences between the y's from regime i
              #and the corresponding estimated mean alpha_i. deltai and zetai are hyperparameters for sig2_i.
              
              # Set parameters
              n=100
              tau<-50
              alpha_1<-2
              alpha_2<-5
              sig2_1<-1
              sig2_2<-1
              
              # Generate time series data
              
              Epsilon <- c(rnorm(tau, mean = 0, sd = sqrt(sig2_1)), 
                           rnorm(n - tau, mean = 0, sd = sqrt(sig2_2)))
              
              y <- rep(0, n)
              for (i in 1:n) 
                if (i <= tau) {
                  y[i] <- alpha_1 + Epsilon[i]
                } else {
                  y[i] <- alpha_2 + Epsilon[i]
                }
              # Plot the time series data
              #plot(1:n, Y, type = "l", xlab = "Time", ylab = "Y", main = "Time Series with Change-      Point")
              #abline(v = tau, col = "red", lty = 2) # Plot change-point line
              #write functions to simulate from full conditionals, inputting the mean and 
              #variance parameters from the full conditionals and finally with rnorm
              #we take a draw from the normal distribution with mean mu1 and variance ssquared1
              #to take a sample for alpha1 and alpha2
              update_alpha1 = function(sumdata1, sig2_1, mu0_1, ssquared0_1, tau) {
                sumdata1 = sum(y[1:tau])
                ssquared1 = (1.0/((tau/sig2_1) + (1.0/ssquared0_1)))^2
                mu1 = ssquared1*((sumdata1/sig2_1) + (mu0_1/ssquared0_1))
                rnorm(n=1, mean=mu1, sd=ssquared1)
              }
              update_alpha2 = function(sumdata2, sig2_2, mu0_2, ssquared0_2, tau) {
                sumdata2 = sum(y[(tau + 1):n])
                ssquared2 = (1.0/((n-tau)/sig2_2) + 1.0/(ssquared0_2))^2
                mu2 = ssquared2*((sumdata2/sig2_2) + (mu0_2/ssquared0_2))
                rnorm(n=1, mean=mu2, sd=ssquared2)
              }
              
              #we input parameters from the full conditional for sig2_1 and sig2_2
              #in the first three lines we perform the calculations described by the 
              #full conditionals and finally we draw from an inverse gamma distribution
              #using out_gamma
              update_sig2_1 = function(zeta1, delta1, tau, y, alpha_1){
                shape1 = zeta1 + (tau/2.0)
                sumsq =sum((y[1:tau]-alpha_1)^2)
                rate1 = delta1 + (sumsq1)/2.0
                out_gamma1 = rgamma(n=1, shape=shape1, rate=rate1)
                1.0/out_gamma1
              }
              update_sig2_2 = function(n, zeta2, delta2, tau, y, alpha_2){
                shape2 = zeta2 + ((n - tau)/2.0)
                sumsq2 = sumsq2=sum((y[(tau + 1):100]-alpha_2)^2)
                rate2 = delta2 + (sumsq2)/2.0
                out_gamma2 = rgamma(n=1, shape=shape2, rate=rate2)
                1.0/out_gamma2
              }
              
              #Create a function for the Gibbs Sampling with arguments y = data vector, n_iter = number of iterations we want to sample for,
              #init=initial values of the parameters we set, prior=list with prior hyperparameters
              n_iter=10000
              
              gibbs = function(y=y, n_iter=n_iter, init, prior) {
                n = length(y)
                #Vectors of length n_iter for draws of parameters  
                alpha1_out = numeric(n_iter)
                alpha2_out = numeric(n_iter)
                sig2_1_out = numeric(n_iter)
                sig2_2_out = numeric(n_iter)
                #Gibbs sampling will update alphas and sigmas by trading off, sampling 
                #back and forth given the other parameter so we need to initialise one of the 
                #variables to get this started from each regime 
                alpha1_now=init$alpha1
                alpha2_now=init$alpha2
                
                #Gibbs Sampler
                #side note: can speed up computation time by using vectorised code rather than for loops
                for (i in 1:n_iter) {
                  #next two steps update sigma1^2 and sigma2^2 respectively we need to take 
                  #draws from the full conditional distribution given the prior hyperparameters
                  #which are stored in a list so need '$' to access it. These steps update 
                  #sig2_1 and sig2_2 so we can go back and update alpha1 and alpha2 again.
                  sig2_1_now = update_sig2_1(y=y, alpha_1=alpha1_now, zeta1=prior$zeta1, delta1=prior$delta1, tau=prior$tau)
                  sig2_2_now = update_sig2_2(n=n, y=y, alpha_2=alpha2_now, zeta2=prior$zeta2, delta2=prior$delta2, tau=prior$tau)
                  #now we update alpha1 and alpha2
                  alpha1_now = update_alpha1(sumdata1=sumdata1, sig2_1 = sig2_1_now, mu0_1 = prior$mu0_1, ssquared0_1 = prior$ssquared0_1, tau=prior$tau) #data needs to inform these parameters
                  alpha2_now = update_alpha2(sumdata2=sumdata2, sig2_2 = sig2_2_now, mu0_2 = prior$mu0_2, ssquared0_2 = prior$ssquared0_2, tau=prior$tau) #think of data generating part then posterior
                  #data needs to get into full conditionals 
                  #tau_now = update_tau(n=n, tau=tau_now, sig2_1=sig2_1_now, sig2_2=sig2_2_now, y=y, alpha1=alpha1_now, alpha2=alpha2_now)
                  #We have updated and now we must save our updated parameters
                  sig2_1_out[i] = sig2_1_now
                  sig2_2_out[i] = sig2_2_now
                  alpha1_out[i] = alpha1_now
                  alpha2_out[i] = alpha2_now
                  #out_tau[i] = tau_now
                }
                
                #now we output results, combining 5 vectors (sig2_1_out, sig2_2_out, alpha1_out, alpha2_out, tau_out)
                #into a matrix using cbind
                post <- cbind(alpha1=alpha1_out, alpha2=alpha2_out, sig2_1=sig2_1_out, sig2_2=sig2_2_out) #tau=out_tau)
                
              }
              #for the prior, the Gibbs function accepts it as a list
              prior = list()
              #ssquared0_1 is a representation of confidence in our choice of prior mean 
              #mu1 for alpha1 etc.
              prior$mu0_1 = 0#1.0
              prior$mu0_2 = 0#7.0
              prior$ssquared0_1 = 10 #1.0
              prior$ssquared0_2 = 10 #1.8
              prior$tau = 42
              #This parametrization of the Inverse Gamma Distribution is 
              #the scaled inverse Chi-squared distribution, with n1_0, sig2_1_0 and n2_0,
              #sig2_2_0 the two parameters for each respectively (sigma_1^2 and sigma_2^2).
              #We are now free to enter our prior effective sample size (info contained 
              #in the chain, how much independent draws we would have to take from the 
              #stationary distribution to have equivalent information). Once we specify prior
              #effective sample size and our prior guesses for sig2_1 and sig2_2 our 
              #parameters zeta and delta will be automatically determined.
              prior$n1_0 = 2*prior$zeta1 #prior effective sample size for sigma_1^2, it is
              #two times the zeta1 parameter
              prior$n2_0 = 2*prior$zeta2 #prior effective sample size for sigma_2^2, it is
              #two times the zeta2 parameter therefore:
              prior$zeta1 = 0.01
              prior$zeta2 = 0.01
              #and:
              prior$delta1 = 0.01 #(prior$n1_0*prior$sig2_1_0)/2.0 #the relationship between our prior point estimate for sig2_1
              #and delta 1 is this 
              prior$delta2 = 0.01 #(prior$n2_0*prior$sig2_2_0)/2.0 #the relationship between our prior point estimate for sig2_1
              #and delta 2 is this 
              prior$sig2_1_0=1.0 #prior point estimate for sigma_1^2
              prior$sig2_2_0=1.2 #prior point estimate for sigma_2^2
              
              #Choose an initial value for alpha1 to get the sampler started, create the 
              #variables we are going to pass to our Gibbs function
              init=list()
              init$alpha1 = 1.5
              init$alpha2 = 2.5
              #set up sampler specifying y, number of iterations, initial values list, prior hyperparameters
              post = gibbs(y=y, n_iter=n_iter, init=init, prior=prior)
              
              set.seed(53)
              library(mcmcr)
              install.packages("coda")
              library("coda")  
              summary(as.mcmc(post))
              plot(as.mcmc(post))
              
              Iterations = 1:10000
              Thinning interval = 1 
              Number of chains = 1 
              Sample size per chain = 10000 
              
              1. Empirical mean and standard deviation for each variable,
              plus standard error of the mean:
                
                Mean      SD  Naive SE Time-series SE
              alpha1 0.2246 0.05307 0.0005307      0.0005307
              alpha2 1.9485 0.12237 0.0012237      0.0009630
              sig2_1 5.1990 1.19878 0.0119878      0.0122971
              sig2_2 9.5833 1.93015 0.0193015      0.0148634
              
              2. Quantiles for each variable:
                
                2.5%    25%    50%     75%   97.5%
              alpha1 0.1458 0.1874 0.2162  0.2526  0.3517
              alpha2 1.7703 1.8630 1.9266  2.0120  2.2476
              sig2_1 3.3649 4.3542 5.0182  5.8550  8.0327
              sig2_2 6.3770 8.2385 9.3699 10.7054 13.9536
R 时间序列 蒙特卡洛 马尔科夫链分层 贝叶斯链

评论


答: 暂无答案