Asked by: Mark Asked: 9/20/2022 Last edited by: ThomasIsCodingMark Updated: 9/20/2022 Views: 94
Calculate difference between index date and date with first indicator
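Q:
Suppose I have the following data frame with an index date and follow-up dates, where an indicator of 1 marks a stop. I want to record the difference in days between the index date and the date of the first stop indicator. If an id has no stop indicator, I want the number of days from the index date to its last observation: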
id date group indicator
1 15-01-2022 1 0
1 15-01-2022 2 0
1 16-01-2022 2 1
1 20-01-2022 2 0
2 18-01-2022 1 0
2 20-01-2022 2 0
2 27-01-2022 2 0
to:
id date group indicator stoptime
1 15-01-2022 1 0 NA
1 15-01-2022 2 0 NA
1 16-01-2022 2 1 1
1 20-01-2022 2 0 NA
2 18-01-2022 1 0 NA
2 20-01-2022 2 0 NA
2 27-01-2022 2 0 9
A:
1 upvote
akrun
9/20/2022
#1
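Convert 'date' to Date class, group by 'id', and find the position of the first 1 in 'indicator' (if none is found, use the last position, n()). Then take the difference in days between the 'date' at that position and the first 'date' of the group.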
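library(dplyr)
library(lubridate)

df1 %>%
  mutate(date = dmy(date)) %>%  # parse "dd-mm-yyyy" strings as Date
  group_by(id) %>%
  mutate(
    # position of the first 1 in indicator; falls back to the last row, n()
    ind = match(1, indicator, nomatch = n()),
    # days since the first date, filled only on row `ind`; other rows get NA
    stoptime = case_when(row_number() == ind ~
      as.integer(difftime(date[ind], first(date), units = "days"))),
    ind = NULL  # drop the helper column
  ) %>%
  ungroup()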
-output
# A tibble: 7 × 5
id date group indicator stoptime
<int> <date> <int> <int> <int>
1 1 2022-01-15 1 0 NA
2 1 2022-01-15 2 0 NA
3 1 2022-01-16 2 1 1
4 1 2022-01-20 2 0 NA
5 2 2022-01-18 1 0 NA
6 2 2022-01-20 2 0 NA
7 2 2022-01-27 2 0 9
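To see why id 2 ends up with 9 on its last row, it may help to look at the match() step on its own. The following is a minimal base R sketch, not part of the original answer; the vector names (ind_id1, ind_id2, dates_id2 and so on) are made up for illustration, and length() stands in for n(), which is only available inside dplyr verbs.

```r
# Indicator values for the two ids in the question's sample data
ind_id1 <- c(0L, 0L, 1L, 0L)
ind_id2 <- c(0L, 0L, 0L)

# match() gives the position of the first 1; nomatch supplies the fallback,
# here the last position of the vector
pos_id1 <- match(1, ind_id1, nomatch = length(ind_id1))
pos_id2 <- match(1, ind_id2, nomatch = length(ind_id2))

# Dates for id 2, parsed from dd-mm-yyyy strings
dates_id2 <- as.Date(c("18-01-2022", "20-01-2022", "27-01-2022"),
                     format = "%d-%m-%Y")

# Days from the first date of id 2 to the date at the fallback position
stop_id2 <- as.integer(difftime(dates_id2[pos_id2], dates_id2[1],
                                units = "days"))
```

Here pos_id1 is 3 because the first 1 sits in the third row of id 1, while pos_id2 is also 3 only because the fallback picks the last of its three rows, giving the 9 days from 18-01-2022 to 27-01-2022.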
data
df1 <- structure(list(id = c(1L, 1L, 1L, 1L, 2L, 2L, 2L), date = c("15-01-2022",
"15-01-2022", "16-01-2022", "20-01-2022", "18-01-2022", "20-01-2022",
"27-01-2022"), group = c(1L, 2L, 2L, 2L, 1L, 2L, 2L), indicator = c(0L,
0L, 1L, 0L, 0L, 0L, 0L)), class = "data.frame",
row.names = c(NA,
-7L))
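For readers without dplyr or lubridate, the same idea can probably be expressed in base R. This is a rough sketch under stated assumptions rather than the answerer's method: it rebuilds the sample data with data.frame(), assumes rows are already ordered by date within each id (so the first row of an id carries the index date), and uses the made-up helper names d, rows and stop_row.

```r
# Sample data from the question, rebuilt so the block runs on its own
df1 <- data.frame(
  id = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
  date = c("15-01-2022", "15-01-2022", "16-01-2022", "20-01-2022",
           "18-01-2022", "20-01-2022", "27-01-2022"),
  group = c(1L, 2L, 2L, 2L, 1L, 2L, 2L),
  indicator = c(0L, 0L, 1L, 0L, 0L, 0L, 0L)
)

# Parse the dd-mm-yyyy strings once, and start with stoptime all NA
d <- as.Date(df1$date, format = "%d-%m-%Y")
df1$stoptime <- NA_integer_

# Handle the row numbers belonging to each id separately
for (rows in split(seq_len(nrow(df1)), df1$id)) {
  # Row of the first indicator == 1, or the last row of the id if there is none
  stop_row <- rows[match(1, df1$indicator[rows], nomatch = length(rows))]
  # Days from the first row of the id (assumed index date) to the stop row
  df1$stoptime[stop_row] <- as.integer(
    difftime(d[stop_row], d[rows[1]], units = "days")
  )
}

df1$stoptime
```

If the rows were not sorted by date, the data would need to be ordered first, since both this loop and first(date) in the dplyr version treat the first row of each id as the index date.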