Data manipulation in R with data over time

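Question:

Based on the R data.frame below, I am looking for an elegant solution to count the number of people who transition between groups from one time to the next.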

dat <- data.frame(people = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4),
                  time = c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5),
                  group = c(5,4,4,3,2,4,4,3,2,1,5,5,4,4,4,3,3,2,2,1))

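I would like a general solution because my actual problem is much larger. I am thinking something with mutate could do this, but I am not sure where to start.

An example of the start of the output I am looking for is: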

dat_result <- data.frame(time_start = c(1,1,1,1,1),
                         time_end = c(2,2,2,2,2),
                         group_start = c(1,1,1,1,1),
                         group_end = c(1,2,3,4,5),
                         count = "")

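This would be repeated for all time transitions and all group transitions. Time is of course linear, so 1 can only go to 2, 2 to 3, and so on. However, any group can transition to any other group, including staying in the same group between two times.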

Tags: r, dataframe, data-manipulation, dplyr

Answer:

library(data.table)

# Convert to data.table:
setDT(dat)

# Make sure your data is ordered by people and time:
setorder(dat, people, time)

# Create a new column with the next group
dat[, next.group := shift(group, -1), by = people]

# Keep only rows where the group changes (this also drops each person's last
# row, where next.group is NA). Since this removes data, you may want to
# assign the result to a new object:
new <- dat[group != next.group]

# Add end.time:
new[, end.time := shift(time, -1, max(dat$time)), by = people]

# Count the occurrences (and order the result):
new[, .N, by = .(time, end.time, group, next.group)][order(time, end.time, group)]
#>    time end.time group next.group N
#> 1:    1        3     5          4 1
#> 2:    2        3     4          3 1
#> 3:    2        4     3          2 1
#> 4:    2        5     5          4 1
#> 5:    3        4     3          2 1
#> 6:    3        4     4          3 1
#> 7:    4        5     2          1 2
#> 8:    4        5     3          2 1
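
Note that the result above only lists changes of group, and end.time is the time of that person's next change rather than the next time step. If what is wanted is the layout from the question (every consecutive time step, every pair of groups, stays included, zero where nothing was observed), a minimal sketch could look like the following. It uses base R instead of data.table, which is a substitution on my part, and it assumes every person has a row at every time point. The helper names (from, to, steps, counts, grid) are only illustrative.

```r
# Example data from the question.
dat <- data.frame(people = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4),
                  time = c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5),
                  group = c(5,4,4,3,2,4,4,3,2,1,5,5,4,4,4,3,3,2,2,1))

# Order by person and time so that adjacent rows are consecutive observations.
# Assumption: no person is missing a time point.
dat <- dat[order(dat$people, dat$time), ]

# Pair each row with the row that follows it, keeping only pairs that
# belong to the same person.
n <- nrow(dat)
from <- dat[-n, ]
to <- dat[-1, ]
same_person <- from$people == to$people
steps <- data.frame(time_start = from$time[same_person],
                    time_end = to$time[same_person],
                    group_start = from$group[same_person],
                    group_end = to$group[same_person],
                    count = 1)

# Count how often each observed transition (including stays) occurs.
counts <- aggregate(count ~ time_start + time_end + group_start + group_end,
                    data = steps, FUN = sum)

# Build the full grid of consecutive time steps and all group pairs.
times <- sort(unique(dat$time))
groups <- sort(unique(dat$group))
grid <- expand.grid(time_start = times[-length(times)],
                    group_start = groups,
                    group_end = groups)
grid$time_end <- times[match(grid$time_start, times) + 1]

# Left-join the counts onto the grid; unobserved transitions get 0.
dat_result <- merge(grid, counts, all.x = TRUE)
dat_result$count[is.na(dat_result$count)] <- 0
dat_result <- dat_result[order(dat_result$time_start,
                               dat_result$group_start,
                               dat_result$group_end),
                         c("time_start", "time_end", "group_start",
                           "group_end", "count")]
```

With the example data, dat_result has one row per combination of time step, starting group and ending group. If some people are missing time points, the pairing step would need to check that time_end really is the next time value before counting.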
