Asked by: Ryan Gary  Asked: 10/10/2023  Updated: 10/10/2023  Views: 35
Create new column with sequential numbering based on another column with two variables
Q:
I have a data frame that looks like this:
df <- structure(list(Date = structure(c(1630544400, 1630548000, 1630551600,
1630555200, 1630558800, 1630562400, 1630566000, 1630569600, 1630573200,
1630576800, 1630580400, 1630630800, 1630634400, 1630638000, 1630641600,
1630645200, 1630648800, 1630652400, 1630656000, 1630659600), tzone = "America/Chicago", class = c("POSIXct",
"POSIXt")), daytime = c("Night", "Night", "Night", "Night", "Morning",
"Morning", "Morning", "Morning", "Morning", "Morning", "Morning",
"Night", "Night", "Night", "Night", "Morning", "Morning", "Morning",
"Morning", "Morning")), row.names = c(NA, -20L), class = c("tbl_df",
"tbl", "data.frame"))
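I want to create another column that numbers each night together with the morning that follows it, in sequence, so the output would look like this: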
   Date                daytime nightcount
   <dttm>              <chr>        <dbl>
 1 2021-09-01 20:00:00 Night            1
 2 2021-09-01 21:00:00 Night            1
 3 2021-09-01 22:00:00 Night            1
 4 2021-09-01 23:00:00 Night            1
 5 2021-09-02 00:00:00 Morning          1
 6 2021-09-02 01:00:00 Morning          1
 7 2021-09-02 02:00:00 Morning          1
 8 2021-09-02 03:00:00 Morning          1
 9 2021-09-02 04:00:00 Morning          1
10 2021-09-02 05:00:00 Morning          1
11 2021-09-02 06:00:00 Morning          1
12 2021-09-02 20:00:00 Night            2
13 2021-09-02 21:00:00 Night            2
14 2021-09-02 22:00:00 Night            2
15 2021-09-02 23:00:00 Night            2
16 2021-09-03 00:00:00 Morning          2
17 2021-09-03 01:00:00 Morning          2
18 2021-09-03 02:00:00 Morning          2
19 2021-09-03 03:00:00 Morning          2
20 2021-09-03 04:00:00 Morning          2
Is there a simple solution using dplyr?
A:
3 votes
Seth
10/10/2023
#1
For this dplyr answer, we create a group_by variable (day) and then use it to get a sequential ID for the nightcount variable.
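library(dplyr)      # the .by argument needs dplyr >= 1.1.0
library(lubridate)

df %>%
  # Assign each row to the calendar date on which its night began:
  # Night rows keep their own date, Morning rows use the previous day.
  mutate(day = if_else(
    daytime == 'Night', date(Date), date(Date) - days(1)
  )) %>%
  # Number the groups in order of appearance.
  mutate(nightcount = cur_group_id(),
         .by = day) %>%
  # Drop the helper column.
  select(-day)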
#> # A tibble: 20 × 3
#>    Date                daytime nightcount
#>    <dttm>              <chr>        <int>
#>  1 2021-09-01 20:00:00 Night            1
#>  2 2021-09-01 21:00:00 Night            1
#>  3 2021-09-01 22:00:00 Night            1
#>  4 2021-09-01 23:00:00 Night            1
#>  5 2021-09-02 00:00:00 Morning          1
#>  6 2021-09-02 01:00:00 Morning          1
#>  7 2021-09-02 02:00:00 Morning          1
#>  8 2021-09-02 03:00:00 Morning          1
#>  9 2021-09-02 04:00:00 Morning          1
#> 10 2021-09-02 05:00:00 Morning          1
#> 11 2021-09-02 06:00:00 Morning          1
#> 12 2021-09-02 20:00:00 Night            2
#> 13 2021-09-02 21:00:00 Night            2
#> 14 2021-09-02 22:00:00 Night            2
#> 15 2021-09-02 23:00:00 Night            2
#> 16 2021-09-03 00:00:00 Morning          2
#> 17 2021-09-03 01:00:00 Morning          2
#> 18 2021-09-03 02:00:00 Morning          2
#> 19 2021-09-03 03:00:00 Morning          2
#> 20 2021-09-03 04:00:00 Morning          2
Created on 2023-10-10 with reprex v2.0.2
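The date-shifting idea in this answer can also be sketched in base R, which makes the intermediate day value visible. This is only an illustration under stated assumptions: the six-row sample below is hypothetical (not the asker's data), and match() on unique() is swapped in for cur_group_id() as a way to number groups by first appearance.

```r
# Hypothetical sample: two nights, each running past midnight.
Date <- as.POSIXct(
  c("2021-09-01 22:00:00", "2021-09-01 23:00:00", "2021-09-02 00:00:00",
    "2021-09-02 22:00:00", "2021-09-03 00:00:00", "2021-09-03 01:00:00"),
  tz = "America/Chicago"
)
daytime <- c("Night", "Night", "Morning", "Night", "Morning", "Morning")

# Calendar date in the data's own time zone (not UTC).
cal_date <- as.Date(Date, tz = "America/Chicago")

# Morning rows belong to the night that started the previous evening,
# so move them back one day; Night rows keep their own date.
day <- cal_date - ifelse(daytime == "Morning", 1, 0)

# Number the distinct days in order of first appearance.
nightcount <- match(day, unique(day))

print(data.frame(Date, daytime, day, nightcount))
```

Because the grouping key is a calendar date, this approach does not depend on row order, but it does assume that every Night row falls before midnight and every Morning row after it.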
4 votes
LMc
10/10/2023
#2
You can create a logical value that is TRUE on each row where "Morning" changes to "Night", then use cumsum to sum these logical values across rows:
library(dplyr)

df |>
  mutate(nightcount = cumsum(daytime == "Night" & lag(daytime, default = "Morning") == "Morning"))
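To see why the running sum produces the group number, it may help to break the one-liner into its intermediate vectors. The sketch below uses base R only and a hypothetical six-element vector; the names previous and starts_night are introduced purely for illustration.

```r
# Hypothetical sample, not the asker's data.
daytime <- c("Night", "Night", "Morning", "Morning", "Night", "Morning")

# Previous row's value; the first row has no predecessor, so treat it as
# "Morning" (the role played by default = "Morning" in lag()).
previous <- c("Morning", head(daytime, -1))

# TRUE only on the first row of each night.
starts_night <- daytime == "Night" & previous == "Morning"

# Each TRUE adds 1 to the running total, so every row carries the
# number of nights that have started so far.
nightcount <- cumsum(starts_night)

print(data.frame(daytime, previous, starts_night, nightcount))
```

This approach relies on the rows being in chronological order. If the data began with Morning rows, those rows would get a count of 0 until the first Night row appears.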