提问人:Gabriele Gurrieri 提问时间:6/9/2023 更新时间:6/9/2023 访问量:20
满怀期待的 DID
DID with anticipation
问:
我正在尝试构建一个函数来计算个体预期治疗时的群体时间平均治疗效果。我使用的数据集是这样的:
year State countyreal first.treat treat_group emp pop lemp lpop
<dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2003 CO 8001 2007 1 4729 364 8.46 5.90
2 2004 CO 8001 2007 1 4175 364 8.34 5.90
3 2005 CO 8001 2007 1 4189 364 8.34 5.90
4 2006 CO 8001 2007 1 4351 364 8.38 5.90
5 2007 CO 8001 2007 1 4304 364 8.37 5.90
6 2003 CO 8019 2007 1 148 9 5.00 2.20
7 2004 CO 8019 2007 1 161 9 5.08 2.20
8 2005 CO 8019 2007 1 120 9 4.79 2.20
9 2006 CO 8019 2007 1 147 9 4.99 2.20
10 2007 CO 8019 2007 1 139 9 4.93 2.20
我有县的地方,他们可以在 2004 年、2006 年、2007 年未接受治疗或接受治疗。
我写了这个函数:
compute.attgt <- function(mpdta1) {
# pick up all groups
groups <- unique(mpdta1$first.treat)
# pick up all time periods
time.periods <- unique(mpdta1$year)
# sort the groups and drop the untreated group
groups <- sort(groups)[-1]
# sort the time periods and drop the first two
# (can't compute treatment effects for these two
# periods with one-period anticipation -- w/o anticipation
# we would just drop one period here)
time.periods <- sort(time.periods)[-c(1,2)]
# drop last time period (because we don't know if
# these units are affected by anticipation or not
# and we are being very careful)
# (if you were worried about more than one anticipation
# period here, would need to drop more time periods
# from the end)
time.periods <- time.periods[-length(time.periods)]
# list to store all group-time average treatment effects
# that we calculate
attgt.list <- list()
counter <- 1
# loop over all groups
for (g in groups) {
# get the correct "base" period for this group
# (subtract 2 to avoid anticipation)
main.base.period <- g - 2
# loop over all time periods
for (tp in time.periods) {
#----------------------------------------------------
# if it's a pre-treatment time period (used for the
# pre-test, we need to adjust the base period)
# group not treated yet
if (tp < g) {
# move two periods before
base.period <- tp - 2
} else {
# this is a post-treatment period
base.period <- main.base.period
}
#----------------------------------------------------
#----------------------------------------------------
# now, all we need to do is collect the right subset
# of the data and estimate a 2x2 DiD
# get group g and untreated group
this.data <- subset(mpdta1, first.treat==g | first.treat==0)
# get current period and base period data
this.data <- subset(this.data, year==tp | year==base.period)
# set up to compute 2x2 estimator
lemppost <- subset(this.data, year==tp)$lemp
lemppre <- subset(this.data, year==base.period)$lemp
# dummy variable being in group g
G <- 1*(subset(this.data, year==tp)$first.treat == g)
# compute 2x2 estimator using DRDID package
# (in this unconditional case, it would be straightforward
# to calculate the 2x2 att just using averages, but we
# like the DRDID package as it will work for more complicated
# cases as well)
attgt <- DRDID::reg_did_panel(lemppost, lemppre, G, covariates=NULL)$ATT
# save results
attgt.list[[counter]] <- list(att=attgt, group=g, time.period=tp)
counter <- counter+1
#----------------------------------------------------
}
}
#-----------------------------------------------------------------------------
# aggregate into dynamic effects
# turn results into a data.frame
attgt.results <- do.call("rbind.data.frame", attgt.list)
# add event time to the results
attgt.results$e <- attgt.results$time.period - attgt.results$group
# calculate relative sizes of each group
# (will be used as weights)
n.group <- sapply(groups, function(gg) nrow(subset(mpdta1, first.treat==gg)))
# merge in group sizes
ngroup.mat <- cbind(groups, n.group)
attgt.results <- merge(attgt.results, ngroup.mat, by.x = "group", by.y = "groups")
# event times to calculate dynamic effects
eseq <- unique(attgt.results$e)
eseq <- sort(eseq)
# calculate average effects by event time
att.e <- c()
counter <- 1
for (this.e in eseq) {
# get subset of results at this event time
res.e <- subset(attgt.results, e==this.e)
# calculate weights by group size
res.e$weight <- res.e$n.group / sum(res.e$n.group)
# calculate dynamic effect as weighted average
att.e[counter] <- sum(res.e$att * res.e$weight)
# on to the next one
counter <- counter+1
}
# store dynamic effects results
dyn.results <- cbind.data.frame(e = eseq, att.e = att.e)
# return group-time average treatment effects and dynamic effects
return(list(attgt.results=attgt.results[,c("group","att","time.period")],
dyn.results=dyn.results))
}
但是当我尝试计算数据集上的attgt时,出现以下错误:
Error in model.frame.default(formula = deltaY ~ -1 + int.cov, subset = D == :
variable lengths differ (found for 'int.cov')
我该如何解决?
答: 暂无答案
评论