How to copy column values and column names into a unique ID value and maintain cell information in R?

Asked by: Abby L Asked: 9/20/2023 Last edited by: jpsmith Updated: 9/20/2023 Views: 26

Q:

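I have a wide-format data frame that I would like to make longer by collapsing its columns, but I can't figure out how to get pivot_longer() and paste() to work together correctly.

Here is a sample dataset: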

df <- data.frame(Site_Number    = c(1, 1, 1, 2),
                 Subsite_Number = c(1, 2, 3, 1),
                 C1 = c("Red", "Blue", "Green", "Red"),
                 C2 = c("Red", "Red", "Red", "Red"),
                 C3 = c("Blue", "Green", "NA", "Blue"),
                 C4 = c("Red", "NA", "NA", "NA"))

It looks like this:

  Site_Number Subsite_Number    C1  C2    C3  C4
           1              1   Red Red  Blue Red
           1              2  Blue Red Green  NA
           1              3 Green Red    NA  NA
           2              1   Red Red  Blue  NA

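I want to create a unique ID from Site_Number, Subsite_Number, and the "C" value (the column name), converting the data frame to long format while keeping the values from the C columns. The catch is that some "C" columns contain NAs, which should not be carried over into the new dataset. This is the output I am hoping for: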

     ID  Color
 S1T1C1   Red
 S1T1C2   Red
 S1T1C3  Blue
 S1T1C4   Red
 S1T2C1  Blue
 S1T2C2   Red
 S1T2C3 Green
 S1T3C1 Green
 S1T3C2   Red
 S2T1C1   Red
 S2T1C2   Red
 S2T1C3  Blue

S = Site_Number, T = Subsite_Number, and C comes from the column names C1:C4; together they make up the ID.

Any ideas on how to do this properly?

r data-manipulation

A:

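2 votes Gregor Thomas 9/20/2023 #1
library(dplyr)  # mutate(), select(), filter()
library(tidyr)  # pivot_longer()

df |>
  # build the site/subsite prefix of the ID
  mutate(ID = paste0("S", Site_Number, "T", Subsite_Number)) |>
  select(-Site_Number, -Subsite_Number) |>
  pivot_longer(-ID) |>
  # drop both the literal "NA" strings and true missing values
  filter(value != "NA" & !is.na(value)) |>
  mutate(ID = paste0(ID, name))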
# # A tibble: 12 × 3
#    ID     name  value
#    <chr>  <chr> <chr>
#  1 S1T1C1 C1    Red  
#  2 S1T1C2 C2    Red  
#  3 S1T1C3 C3    Blue 
#  4 S1T1C4 C4    Red  
#  5 S1T2C1 C1    Blue 
#  6 S1T2C2 C2    Red  
#  7 S1T2C3 C3    Green
#  8 S1T3C1 C1    Green
#  9 S1T3C2 C2    Red  
# 10 S2T1C1 C1    Red  
# 11 S2T1C2 C2    Red  
# 12 S2T1C3 C3    Blue 
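
The output above carries three columns (ID, name, value), while the desired output in the question has only ID and Color. As a rough sketch of one way to get there, assuming dplyr and tidyr are installed and R is at least 4.1 for the native pipe, the value column can be named directly in pivot_longer() and the helper column dropped at the end. The names `col` and `result` are made up for this illustration.

```r
library(dplyr)
library(tidyr)

# Sample data from the question; note that "NA" here is a string, not a missing value
df <- data.frame(Site_Number    = c(1, 1, 1, 2),
                 Subsite_Number = c(1, 2, 3, 1),
                 C1 = c("Red", "Blue", "Green", "Red"),
                 C2 = c("Red", "Red", "Red", "Red"),
                 C3 = c("Blue", "Green", "NA", "Blue"),
                 C4 = c("Red", "NA", "NA", "NA"))

result <- df |>
  # `col` (hypothetical name) holds the original column names C1..C4
  pivot_longer(C1:C4, names_to = "col", values_to = "Color") |>
  # remove true NAs as well as the literal "NA" strings
  filter(!is.na(Color), Color != "NA") |>
  mutate(ID = paste0("S", Site_Number, "T", Subsite_Number, col)) |>
  # keep only the two columns shown in the desired output
  select(ID, Color)

print(result)
```

Pivoting first and building the ID afterwards means the ID is assembled in a single paste0() call instead of two.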

1 vote Onyambu 9/20/2023 #2

Another way:

library(dplyr)    # mutate(), across(), na_if()
library(tidyr)    # pivot_longer(), unite()
library(stringr)  # str_c()

df %>%
  mutate(ID = str_c('S', Site_Number, 'T', Subsite_Number), .keep = 'unused') %>%
  mutate(across(-ID, ~na_if(.x, 'NA'))) %>% # type.convert(as.is = TRUE)
  pivot_longer(C1:C4, values_to = 'Color', values_drop_na = TRUE) %>%
  unite(ID, ID, name, sep = '')

   ID     Color
   <chr>  <chr>
 1 S1T1C1 Red  
 2 S1T1C2 Red  
 3 S1T1C3 Blue 
 4 S1T1C4 Red  
 5 S1T2C1 Blue 
 6 S1T2C2 Red  
 7 S1T2C3 Green
 8 S1T3C1 Green
 9 S1T3C2 Red  
10 S2T1C1 Red  
11 S2T1C2 Red  
12 S2T1C3 Blue
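
For comparison, the same reshaping can probably be done without any packages. The following is only a sketch in base R, assuming the color columns are exactly C1 to C4 as in the sample data; `color_cols`, `colors_by_row`, and `out` are names invented for this illustration.

```r
# Sample data from the question; "NA" is stored as a string
df <- data.frame(Site_Number    = c(1, 1, 1, 2),
                 Subsite_Number = c(1, 2, 3, 1),
                 C1 = c("Red", "Blue", "Green", "Red"),
                 C2 = c("Red", "Red", "Red", "Red"),
                 C3 = c("Blue", "Green", "NA", "Blue"),
                 C4 = c("Red", "NA", "NA", "NA"))

color_cols <- paste0("C", 1:4)  # assumed set of color columns

# Transposing first makes as.vector() read the values one original row at a time
colors_by_row <- as.vector(t(as.matrix(df[color_cols])))

out <- data.frame(
  ID = paste0("S", rep(df$Site_Number, each = length(color_cols)),
              "T", rep(df$Subsite_Number, each = length(color_cols)),
              rep(color_cols, times = nrow(df))),
  Color = colors_by_row
)

# Drop true missing values and the literal "NA" strings, then renumber the rows
out <- out[!is.na(out$Color) & out$Color != "NA", ]
rownames(out) <- NULL

print(out)
```

The rep() calls line up each site and subsite number with every color column, so the IDs come out in the same row-by-row order as the pivot_longer() answers.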