Asked by: Abby L, Asked: 9/20/2023, Last edited by: jpsmith, Updated: 9/20/2023, Views: 26
How to copy column values and column names into a unique ID value and maintain cell information in R?
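Q:
I have a wide-format data frame that I want to make longer by collapsing its columns, but I can't figure out how to get pivot_longer() and paste() to work together correctly.
Here is a sample dataset: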
df <- data.frame(Site_Number = c(1, 1, 1, 2),
                 Subsite_Number = c(1, 2, 3, 1),
                 C1 = c("Red", "Blue", "Green", "Red"),
                 C2 = c("Red", "Red", "Red", "Red"),
                 C3 = c("Blue", "Green", "NA", "Blue"),
                 C4 = c("Red", "NA", "NA", "NA"))
It looks like this:
Site_Number Subsite_Number    C1  C2    C3   C4
          1              1   Red Red  Blue  Red
          1              2  Blue Red Green   NA
          1              3 Green Red    NA   NA
          2              1   Red Red  Blue   NA
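I want to build a unique ID from Site_Number, Subsite_Number, and the "C" column name, and reshape the data frame to long format while keeping the values from the C columns. The catch is that some of the "C" columns contain NA, and those cells should not be carried over into the new dataset. This is the output I'm hoping for: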
ID Color
S1T1C1 Red
S1T1C2 Red
S1T1C3 Blue
S1T1C4 Red
S1T2C1 Blue
S1T2C2 Red
S1T2C3 Green
S1T3C1 Green
S1T3C2 Red
S2T1C1 Red
S2T1C2 Red
S2T1C3 Blue
S = Site_Number, T = Subsite_Number, and C comes from the column names C1:C4; together they form the ID.
Any ideas on how to do this properly?
A:
2 votes
Gregor Thomas
9/20/2023
#1
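library(dplyr)
library(tidyr)
df |>
  # build the site/subsite prefix, then drop the two source columns
  mutate(ID = paste0("S", Site_Number, "T", Subsite_Number)) |>
  select(-Site_Number, -Subsite_Number) |>
  pivot_longer(-ID) |>
  # drop both the literal string "NA" and true missing values
  filter(value != "NA" & !is.na(value)) |>
  mutate(ID = paste0(ID, name))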
# # A tibble: 12 × 3
# ID name value
# <chr> <chr> <chr>
# 1 S1T1C1 C1 Red
# 2 S1T1C2 C2 Red
# 3 S1T1C3 C3 Blue
# 4 S1T1C4 C4 Red
# 5 S1T2C1 C1 Blue
# 6 S1T2C2 C2 Red
# 7 S1T2C3 C3 Green
# 8 S1T3C1 C1 Green
# 9 S1T3C2 C2 Red
# 10 S2T1C1 C1 Red
# 11 S2T1C2 C2 Red
# 12 S2T1C3 C3 Blue
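The result above keeps the helper column `name` and leaves the value column called `value`, while the question asks for just `ID` and `Color`. Here is a minimal sketch of one way to get that shape, assuming dplyr and tidyr are installed and R is 4.1 or later (for the native pipe). The `values_to` argument and the final `select()` are additions for illustration and not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Sample data from the question; note that "NA" is stored as a string here
df <- data.frame(Site_Number = c(1, 1, 1, 2),
                 Subsite_Number = c(1, 2, 3, 1),
                 C1 = c("Red", "Blue", "Green", "Red"),
                 C2 = c("Red", "Red", "Red", "Red"),
                 C3 = c("Blue", "Green", "NA", "Blue"),
                 C4 = c("Red", "NA", "NA", "NA"))

result <- df |>
  mutate(ID = paste0("S", Site_Number, "T", Subsite_Number)) |>
  select(-Site_Number, -Subsite_Number) |>
  # values_to names the value column "Color" instead of the default "value"
  pivot_longer(-ID, values_to = "Color") |>
  # drop both the literal string "NA" and true missing values
  filter(Color != "NA" & !is.na(Color)) |>
  # append the original column name (C1..C4) to the prefix
  mutate(ID = paste0(ID, name)) |>
  # keep only the two columns requested in the question
  select(ID, Color)

print(result)
```

Naming the column inside `pivot_longer()` avoids a separate `rename()` step; a `rename(Color = value)` after the original pipeline should work equally well.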
1 vote
Onyambu
9/20/2023
#2
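Another way:
library(dplyr)
library(tidyr)
library(stringr)
df %>%
  mutate(ID = str_c('S', Site_Number, 'T', Subsite_Number), .keep = 'unused') %>%
  # turn the literal string 'NA' into a real missing value
  mutate(across(-ID, ~ na_if(.x, 'NA'))) %>% # or: type.convert(as.is = TRUE)
  pivot_longer(C1:C4, values_to = 'Color', values_drop_na = TRUE) %>%
  # append the column name (C1..C4) to the prefix
  unite(ID, ID, name, sep = '')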
ID Color
<chr> <chr>
1 S1T1C1 Red
2 S1T1C2 Red
3 S1T1C3 Blue
4 S1T1C4 Red
5 S1T2C1 Blue
6 S1T2C2 Red
7 S1T2C3 Green
8 S1T3C1 Green
9 S1T3C2 Red
10 S2T1C1 Red
11 S2T1C2 Red
12 S2T1C3 Blue
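The commented-out `type.convert(as.is = TRUE)` in the code above appears to be offered as a replacement for the `na_if()` step. The following is a rough sketch of that variant, assuming the hint means piping the whole data frame through base R's `type.convert()`, whose default `na.strings = "NA"` treats the literal string as missing. The intermediate names `prefixed` and `n_missing` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data from the question; "NA" is a string, not a missing value
df <- data.frame(Site_Number = c(1, 1, 1, 2),
                 Subsite_Number = c(1, 2, 3, 1),
                 C1 = c("Red", "Blue", "Green", "Red"),
                 C2 = c("Red", "Red", "Red", "Red"),
                 C3 = c("Blue", "Green", "NA", "Blue"),
                 C4 = c("Red", "NA", "NA", "NA"))

prefixed <- df |>
  # build the prefix and drop the columns used to create it
  mutate(ID = paste0("S", Site_Number, "T", Subsite_Number), .keep = "unused") |>
  # re-read every column; the string "NA" becomes a real NA,
  # and as.is = TRUE keeps text as character instead of factor
  type.convert(as.is = TRUE)

# Count the cells that are now truly missing
n_missing <- sum(is.na(prefixed))

result <- prefixed |>
  # values_drop_na removes the rows that came from missing cells
  pivot_longer(C1:C4, values_to = "Color", values_drop_na = TRUE) |>
  unite(ID, ID, name, sep = "")

print(result)
```

One caveat with this route: `type.convert()` re-guesses the type of every column, so a column holding only digits would come back numeric. `na_if()` changes nothing except the matching values, which may be the safer choice when the column types matter.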