Asked by: VR28 Asked: 1/29/2023 Last edited by: VR28 Updated: 1/29/2023 Views: 104
Complete the sequence of column names and fill the values in R
Q:
I have a large dataset similar to the following:
week_0<-c(5,0,1,0,8,1)
week_4<-c(1,0,1,0,1,1)
week_8<-c(1,0,6,0,0,0)
week_9<-c(2,4,1,7,8,1)
week_10<-c(2,4,1,7,8,1)
Participant<-c("Lion","Cat","Dog","Snake","Tiger","Mouse")
test_data<-data.frame(Participant,week_0,week_4,week_8,week_9,week_10)
> test_data
Participant week_0 week_4 week_8 week_9 week_10
1 Lion 5 1 1 2 2
2 Cat 0 0 0 4 4
3 Dog 1 1 6 1 1
4 Snake 0 0 0 7 7
5 Tiger 8 1 0 8 8
6 Mouse 1 1 0 1 1
I want to fill in the gaps between the numbers in the column names. The final result I am looking for is:
test_data
Participant week_0 week_1 week_2 week_3 week_4 week_5 week_6 week_7 week_8 week_9 week_10
1 Lion 5 5 5 5 1 1 1 1 1 2 2
2 Cat 0 0 0 0 0 0 0 0 0 4 4
3 Dog 1 1 1 1 1 1 1 1 6 1 1
4 Snake 0 0 0 0 0 0 0 0 0 7 7
5 Tiger 8 8 8 8 1 1 1 1 0 8 8
6 Mouse 1 1 1 1 1 1 1 1 0 1 1
I have looked at the fill function in R, but I could not get the result I want. Any suggestions on how to do this?
A:
1 upvote
jkatam
1/29/2023
#1
Please check the code below.
library(dplyr)
library(tidyr)
library(stringr)
# Reshape to long format and extract the week number from each column name
test_data <- data.frame(Participant, week_0, week_4, week_8, week_9, week_10) %>%
  pivot_longer(starts_with('week'), names_to = 'name', values_to = 'value') %>%
  mutate(seq = as.numeric(str_replace_all(name, '\\w*\\_', ''))) %>%
  arrange(Participant)
# Build every Participant/week combination for weeks 0 to 10
seq <- data.frame(Participant = rep(unique(Participant), 11)) %>%
  group_by(Participant) %>%
  mutate(seq = row_number() - 1) %>%
  arrange(Participant)
# Join, name the added weeks, carry the last observed value forward, reshape to wide
test_data2 <- test_data %>%
  right_join(seq, by = c('Participant', 'seq')) %>%
  mutate(name = ifelse(is.na(name), paste0('week_', seq), name)) %>%
  arrange(Participant, seq) %>%
  group_by(Participant) %>%
  fill(value) %>%
  pivot_wider(id_cols = Participant, names_from = name, values_from = value)
Created on 2023-01-28 with reprex v2.0.2
# A tibble: 6 × 11
# Groups: Participant [6]
Participant week_0 week_2 week_3 week_4 week_5 week_6 week_7 week_8 week_9 week_10
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Cat 0 0 0 0 0 0 0 0 4 4
2 Dog 1 1 1 1 1 1 1 6 1 1
3 Lion 5 5 5 1 1 1 1 1 2 2
4 Mouse 1 1 1 1 1 1 1 0 1 1
5 Snake 0 0 0 0 0 0 0 0 7 7
6 Tiger 8 8 8 1 1 1 1 0 8 8
1 upvote
akrun
1/29/2023
#2
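Using base R: extract the numeric suffix from the 'week' column names, get the full sequence between the min/max values ('i2'), replicate the columns based on the index obtained with match, and rename the columns using i2.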
i1 <- as.integer(sub("week_", "", names(test_data)[-1]))
i2 <- Reduce(`:`, as.list(range(i1)))
test_data <- cbind(test_data[1], test_data[-1][cumsum(!is.na(match(i2, i1)))])
names(test_data)[-1] <- paste0("week_", i2)
-output
> test_data
Participant week_0 week_1 week_2 week_3 week_4 week_5 week_6 week_7 week_8 week_9 week_10
1 Lion 5 5 5 5 1 1 1 1 1 2 2
2 Cat 0 0 0 0 0 0 0 0 0 4 4
3 Dog 1 1 1 1 1 1 1 1 6 1 1
4 Snake 0 0 0 0 0 0 0 0 0 7 7
5 Tiger 8 8 8 8 1 1 1 1 0 8 8
6 Mouse 1 1 1 1 1 1 1 1 0 1 1
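The column index built from cumsum and match can be traced by hand. The sketch below uses only the week numbers from the question; the names pos and idx are illustrative and do not appear in the answer. It assumes the existing columns are already in increasing week order.

```r
# Week numbers present in the data, and the full range to produce
i1 <- c(0, 4, 8, 9, 10)
i2 <- min(i1):max(i1)

# Position of each target week among the existing columns (NA if absent)
pos <- match(i2, i1)

# Counting the non-missing matches seen so far gives, for each week,
# the index of the most recent existing column
idx <- cumsum(!is.na(pos))
print(idx)
# [1] 1 1 1 1 2 2 2 2 3 4 5
```

Indexing the week columns with idx therefore repeats week_0 four times, week_4 four times, and then takes week_8, week_9 and week_10 once each.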
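Using tidyverse, an option is to reshape to 'long' with pivot_longer, use complete to expand the data, fill the missing values using the previous non-NA value, and reshape back to 'wide' with pivot_wider.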
library(dplyr)
library(tidyr)
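test_data %>%
  # reshape to long format, converting the week suffix to an integer
  pivot_longer(cols = starts_with('week_'),
               names_prefix = "week_", names_transform = as.integer) %>%
  # add a row for every missing week of every participant
  complete(Participant, name = full_seq(name, period = 1)) %>%
  # fill the added rows from the previous non-NA value
  fill(value, .direction = "downup") %>%
  # reshape back to wide format and restore the original row order
  pivot_wider(names_from = name, values_from = value,
              names_prefix = "week_") %>%
  arrange(match(Participant, test_data$Participant))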
-output
# A tibble: 6 × 12
Participant week_0 week_1 week_2 week_3 week_4 week_5 week_6 week_7 week_8 week_9 week_10
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Lion 5 5 5 5 1 1 1 1 1 2 2
2 Cat 0 0 0 0 0 0 0 0 0 4 4
3 Dog 1 1 1 1 1 1 1 1 6 1 1
4 Snake 0 0 0 0 0 0 0 0 0 7 7
5 Tiger 8 8 8 8 1 1 1 1 0 8 8
6 Mouse 1 1 1 1 1 1 1 1 0 1 1
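In the pipeline above, fill runs on ungrouped data. That is safe here because every participant has a value at week_0. As a possible variation (a sketch on a made-up two-row data frame, not part of the original answer), grouping by Participant before fill keeps values from ever being carried from one participant to the next. The names toy and toy_filled are hypothetical.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: week_1 is missing between week_0 and week_2
toy <- data.frame(Participant = c("A", "B"),
                  week_0 = c(1, 2),
                  week_2 = c(3, 4))

toy_filled <- toy %>%
  pivot_longer(cols = starts_with("week_"),
               names_prefix = "week_", names_transform = as.integer) %>%
  complete(Participant, name = full_seq(name, period = 1)) %>%
  # fill within each participant only
  group_by(Participant) %>%
  fill(value, .direction = "down") %>%
  ungroup() %>%
  pivot_wider(names_from = name, values_from = value,
              names_prefix = "week_")
```

Here toy_filled gains a week_1 column holding each participant's week_0 value.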