Asked by: diggi2395 Asked: 10/19/2023 Last edited by: diggi2395 Updated: 10/20/2023 Views: 67
How to cut datetime rows by midnight?
Q:
My data frame looks like this:
event start.time end.time
connected 04.08.21 20:48 04.08.21 20:59
disconnected 04.08.21 20:59 06.08.21 14:29
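How can I split the rows at midnight, like this? Note that the number of days spanned can vary: in my example it is August 4 to 6, but it could just as well be September 10 to 17. A tidy solution would be much appreciated!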
event start.time end.time
connected 04.08.21 20:48 04.08.21 20:59
disconnected 04.08.21 20:59 04.08.21 23:59
disconnected 05.08.21 00:00 05.08.21 23:59
disconnected 06.08.21 00:00 06.08.21 14:29
Here is my reproducible code:
data <- data.frame(
  event = c("connected", "disconnected"),
  start.time = c("04.08.21 20:48", "04.08.21 20:59"),
  end.time = c("04.08.21 20:59", "06.08.21 14:29")
)
A:
1 upvote
jay.sf
10/19/2023
#1
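Using by, you can build a seq.POSIXt sequence in steps of "min" for each event group. In that sequence, look for the positions where midnight, or midnight minus one minute, is not duplicated. Then replace the last 23:59 value with the last observed timestamp.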
data[-1] <- lapply(data[-1], strptime, format='%d.%m.%y %H:%M') ## format time first
by(data, data$event, \(x) {
  ## minute-by-minute sequence from start to end
  sq <- seq.POSIXt(x[[2]], x[[3]], 'min')
  ## starts: first timestamp of each calendar day in the sequence
  r <- data.frame(sq[!duplicated(strptime(sq, '%F'))],
                  ## ends: each 23:59 in the sequence, then the actual end time
                  c(sq[!duplicated(strptime(sq, '%F 23:59'))][-1], as.POSIXct(x[[3]]))) |>
    setNames(names(x)[-1])
}) |> c(make.row.names=F) |> do.call(what='rbind')
# start.time end.time
# 1 2021-08-04 20:48:00 2021-08-04 20:59:00
# 2 2021-08-04 20:59:00 2021-08-04 23:59:00
# 3 2021-08-05 00:00:00 2021-08-05 23:59:00
# 4 2021-08-06 00:00:00 2021-08-06 14:29:00
For older R versions:
do.call(c(by(data, data$event, function(x) {
  sq <- seq.POSIXt(x[[2]], x[[3]], "min")
  r <- setNames(data.frame(sq[!duplicated(strptime(sq, "%F"))],
                           c(sq[!duplicated(strptime(sq, "%F 23:59"))][-1], as.POSIXct(x[[3]]))),
                names(x)[-1])
}), make.row.names = F), what = "rbind")
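The newer and older variants differ only in syntax. As far as I know, both the \(x) lambda shorthand and the native pipe |> arrived in R 4.1.0, so older installations reject them as parse errors. A minimal sketch of the equivalence (square_new and square_old are made-up names for illustration):

```r
# \(x) is shorthand for function(x); needs R >= 4.1.0
square_new <- \(x) x^2
square_old <- function(x) x^2

# x |> f() is parsed as f(x); needs R >= 4.1.0
piped  <- c(1, 2, 3) |> square_new() |> sum()
nested <- sum(square_old(c(1, 2, 3)))

identical(piped, nested)
# TRUE
```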
Data:
data <- structure(list(event = c("connected", "disconnected"), start.time = c("04.08.21 20:48",
"04.08.21 20:59"), end.time = c("04.08.21 20:59", "06.08.21 14:29"
)), class = "data.frame", row.names = c(NA, -2L))
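As a possible alternative to generating a minute-by-minute sequence, the day boundaries can be computed directly, which also keeps the event column. The following is only a sketch under stated assumptions: split_at_midnight is a hypothetical helper, the timestamps are parsed as UTC (to sidestep daylight-saving gaps), they have minute resolution, and start is never after end.

```r
# Hypothetical helper: split one interval [start, end] into per-day pieces.
# Assumes POSIXct inputs in UTC, minute resolution, and start <= end.
split_at_midnight <- function(event, start, end) {
  # Calendar days touched by the interval
  days <- seq(as.Date(start, tz = "UTC"), as.Date(end, tz = "UTC"), by = "day")
  # Midnight at the beginning of each of those days
  midnights <- as.POSIXct(format(days), tz = "UTC")
  data.frame(
    event = event,
    start.time = c(start, midnights[-1]),     # original start, then each later midnight
    end.time = c(midnights[-1] - 60, end)     # 23:59 of each earlier day, then original end
  )
}

data <- data.frame(
  event = c("connected", "disconnected"),
  start.time = c("04.08.21 20:48", "04.08.21 20:59"),
  end.time = c("04.08.21 20:59", "06.08.21 14:29")
)
# Parse as UTC (an assumption; the question does not state a time zone)
data[-1] <- lapply(data[-1], as.POSIXct, format = "%d.%m.%y %H:%M", tz = "UTC")

# Split row by row and stack the pieces
pieces <- lapply(seq_len(nrow(data)), function(i) {
  split_at_midnight(data$event[i], data$start.time[i], data$end.time[i])
})
res <- do.call(rbind, pieces)
res
```

For the example data this should give the four rows requested in the question, one "connected" row and three "disconnected" rows. Because it indexes columns with $ and [i], the same loop should also work on a tibble, though that has not been checked here.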
Comments
0 upvotes
diggi2395
10/20/2023
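Thanks for your help! Could I ask you for a solution that also works for tibbles? Your code only works with structure(). Also, where does the syntax (e.g. \(x) or "|>") come from?
1 upvote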
jay.sf
10/20/2023
@diggi2395 It looks to me like you are using a very outdated R version. I have added alternative code. It is best to keep R up to date.
0 upvotes
diggi2395
10/20/2023
#2
Thanks to jay.sf's answer, I was able to replicate the logic and build my preferred tidyverse solution:
library(dplyr)
library(tidyr)
library(purrr)
library(lubridate)

data <- data.frame(
  event = c("connected", "disconnected"),
  start.time = c("04.08.21 20:48", "04.08.21 20:59"),
  end.time = c("04.08.21 20:59", "06.08.21 14:29")
)
data[-1] <- lapply(data[-1], function(x) as.POSIXct(strptime(x, format='%d.%m.%y %H:%M')))
data %>%
  # Group data by event
  group_by(event) %>%
  rowwise() %>%
  # Create rows when times are "overnight"
  mutate(timestamp_seq = list(seq(from = start.time, to = end.time, by = "secs"))) %>%
  ungroup() %>%
  mutate(timestamp_seq = map(timestamp_seq, ~ .x[!duplicated(as.Date(.x)) | (as.POSIXct(.x) == max(.x))])) %>%
  unnest(timestamp_seq) %>%
  filter(!start.time==timestamp_seq) %>%
  # Correct the times
  mutate(start.time = if_else(row_number() == 1, start.time, lag(timestamp_seq)),
         tmp = if_else(lead(hms(format(start.time, format = "%H:%M:%S"))) == hms("00:00:00"), timestamp_seq, end.time),
         end.time = if_else(!is.na(tmp), tmp, end.time)) %>%
  select(-c(timestamp_seq, tmp))