Asked by: Jack Asked: 9/29/2023 Updated: 9/29/2023 Views: 35
How do you reduce string combinations across two columns into one column using the mutate function in R?
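Q:
Poor data entry has left two columns that hold similar data and could be merged into one. I have three groups plus NA, and I want to reduce them to two groups plus NA.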
x <- data.frame(
  a = c("Group 1", "Group 2", "Group 1", NA, "Group 1", NA, "Group 3", "Group 3"),
  b = c("Group 1", NA, NA, "Group 2", "Group 1", NA, NA, "Group 3")
)
x
# The combinations of the groups and the desired output:
# Group 1 + Group 1 = Group 1
# Group 2 + NA = Group 2
# Group 1 + NA = Group 1
# NA + NA = NA
# Group 3 + NA = Group 1
# Group 3 + Group 3 = Unknown
x |>
  dplyr::mutate(Group = case_when(a == c("Group 1") | b == c("Group 1") ~ "Group 1",
                                  a == c("Group 2") | b == c("NA") ~ "Group 2",
                                  a == c("Group 1") | b == c("NA") ~ "Group 2",
                                  a == c("NA") | b == c("NA") ~ NA,
                                  a == c("Group 3") | b == c("NA") ~ "Group 1",
                                  a == c("Group 3") | b == c("Group 3") ~ "Unknown",
  ))
The result is below. Group 3 + Group 3 should be Unknown, but it is classified as Group 1.
        a       b   Group
1 Group 1 Group 1 Group 1
2 Group 2    <NA> Group 2
3 Group 1    <NA> Group 1
4    <NA> Group 2    <NA>
5 Group 1 Group 1 Group 1
6    <NA>    <NA>    <NA>
7 Group 3    <NA> Unknown
8 Group 3 Group 3 Group 1
Any help would be appreciated.
A:
0 votes
Chris Ruehlemann
9/29/2023
#1
Is this what you need?
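x |>
  dplyr::mutate(Group = dplyr::case_when(
    # both columns agree on Group 1
    a == "Group 1" & b == "Group 1" ~ "Group 1",
    # Group 2 in one column, missing in the other (`&` binds tighter than `|`)
    a == "Group 2" & is.na(b) | is.na(a) & b == "Group 2" ~ "Group 2",
    a == "Group 1" & is.na(b) ~ "Group 1",
    # both missing stays missing
    is.na(a) & is.na(b) ~ NA_character_,
    # a single Group 3 is recoded to Group 1
    a == "Group 3" & is.na(b) | is.na(a) & b == "Group 3" ~ "Group 1",
    # Group 3 in both columns is flagged as Unknown
    a == "Group 3" & b == "Group 3" ~ "Unknown"
  ))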
        a       b   Group
1 Group 1 Group 1 Group 1
2 Group 2    <NA> Group 2
3 Group 1    <NA> Group 1
4    <NA> Group 2 Group 2
5 Group 1 Group 1 Group 1
6    <NA>    <NA>    <NA>
7 Group 3    <NA> Group 1
8 Group 3 Group 3 Unknown
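To see why the attempt in the question puts Group 3 + Group 3 into Group 1, it may help to evaluate its two competing conditions on their own. This is a minimal base R sketch that assumes the same x as in the question; the names cond_group1 and cond_unknown are made up for illustration.

```r
x <- data.frame(
  a = c("Group 1", "Group 2", "Group 1", NA, "Group 1", NA, "Group 3", "Group 3"),
  b = c("Group 1", NA, NA, "Group 2", "Group 1", NA, NA, "Group 3")
)

# Fifth condition from the question: `|` is TRUE as soon as one side is TRUE
cond_group1 <- x$a == "Group 3" | x$b == "NA"
# Sixth condition from the question, listed after the fifth
cond_unknown <- x$a == "Group 3" | x$b == "Group 3"

# Row numbers where each condition is TRUE (which() drops NA results)
which(cond_group1)   # 7 8
which(cond_unknown)  # 7 8
```

Both conditions are TRUE for rows 7 and 8, and case_when() uses the first condition that matches, so with `|` the "Unknown" branch is never reached. Combining the tests with `&` and is.na(), as in the answer above, makes the conditions mutually exclusive.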
Comments
case_when(a == "Group 3" & b == "Group 3" ~ "Unknown", a == "Group 3" | b == "Group 3" ~ "Group 1", .default = coalesce(a, b))
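The shorter case_when() from the comment above can be tried out as follows. This is a sketch that assumes the x from the question, R 4.1 or later for the native pipe, and dplyr 1.1.0 or later for the .default argument; the name result is only illustrative.

```r
library(dplyr)

x <- data.frame(
  a = c("Group 1", "Group 2", "Group 1", NA, "Group 1", NA, "Group 3", "Group 3"),
  b = c("Group 1", NA, NA, "Group 2", "Group 1", NA, NA, "Group 3")
)

result <- x |>
  mutate(Group = case_when(
    # Group 3 in both columns is flagged first
    a == "Group 3" & b == "Group 3" ~ "Unknown",
    # any remaining Group 3 is recoded to Group 1
    a == "Group 3" | b == "Group 3" ~ "Group 1",
    # everything else: take a, or b when a is missing
    .default = coalesce(a, b)
  ))

result$Group
```

For this data it gives the same Group column as the answer. Note that coalesce(a, b) silently prefers a when both columns are filled in with different values, a case that does not occur in the example data.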
a == c("NA")|b == c("NA") tests for the string "NA", not the missing value NA.
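The difference between the string "NA" and the missing value NA can be checked in base R. A small sketch, using a throwaway vector named a for illustration:

```r
a <- c("Group 1", NA)

# Comparing with the text "NA" never matches a missing value
a == "NA"   # FALSE NA

# Comparing with NA itself yields NA for every element
a == NA     # NA NA

# is.na() is the test that detects missing values
is.na(a)    # FALSE TRUE
```

This is why the conditions in the question that rely on b == c("NA") never pick out the rows where b is missing, while the is.na() tests in the answer do.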