Asked by: Rara Asked: 7/25/2023 Last edited by: Rara Updated: 7/25/2023 Views: 39
How to replace NAs in a dataframe with values in another dataframe for non-unique keys?
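Q:
I would like to fill in the values of two columns (board and date) in df2 with the values from df1 wherever those columns are NA in df2. The key is subject, so in this case the rows for vand and haap. The problem is that subjects in df1 (e.g. vand) are not unique, although the values in their board and date columns are always the same. I would appreciate your suggestions.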
df1 <- structure(list(nature = c("sop", "dior", "coats", "sem", "wia",
"bodo"), subject = c("gank", "vand", "vand",
"jav", "vand", "haap"), board = c("REW", "EWW", "EWW", "SSD",
"EWW", "MMB"), date = c("2023-07-12",
"2023-06-09", "2023-06-09",
"2023-06-09", "2023-06-09",
"2023-03-05")), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
df2 <- structure(list(type = c("single", "couple", "couple", "couple", "couple",
"couple", "single", "couple", "couple", "couple"), name = c("ZIA",
"MIA", "lMIA", "LIA",
"LIA", "LIA", "DIA",
"LIA", "MIA", "SIA"
), subject = c("vand", "vank", "vank",
"jav", "tral", "twe",
"haap", "der", "leo",
"sdee"), board = c(NA,
"SSD", "REW", "EWW", "WWS, DDC", "SSD",
NA, "QQW", "XXD", "GGH"
), date = c(NA, "2023-07-03", "2023-07-03",
"2023-07-17", "2023-07-17",
"2023-01-16", NA,
"2023-07-17", "2023-06-08",
"2023-07-17")), class = "data.frame", row.names = c(NA,
-10L))
Expected output:
df3 <- structure(list(type = c("single", "couple", "couple", "couple", "couple",
"couple", "single", "couple", "couple", "couple"), name = c("ZIA",
"MIA", "lMIA", "LIA",
"LIA", "LIA", "DIA",
"LIA", "MIA", "SIA"
), subject = c("vand", "vank", "vank",
"jav", "tral", "twe",
"haap", "der", "leo",
"sdee"), board = c("EWW",
"SSD", "REW", "EWW", "WWS, DDC", "SSD",
"MMB", "QQW", "XXD", "GGH"
), date = c("2023-06-09", "2023-07-03", "2023-07-03",
"2023-07-17", "2023-07-17",
"2023-01-16", "2023-03-05",
"2023-07-17", "2023-06-08",
"2023-07-17")), class = "data.frame", row.names = c(NA,
-10L))
A:
1 upvote
Maël
7/25/2023
#1
You can use unique(df1) to reduce the first data.frame to unique rows, and then use dplyr::rows_update to replace the values:
dplyr::rows_update(df2, unique(df1), unmatched = "ignore")
Output
#      type name subject    board       date
# 1  single  ZIA    vand      EWW 2023-06-09
# 2  couple  MIA    vank      SSD 2023-07-03
# 3  couple lMIA    vank      REW 2023-07-03
# 4  couple  LIA     jav      SSD 2023-06-09
# 5  couple  LIA    tral WWS, DDC 2023-07-17
# 6  couple  LIA     twe      SSD 2023-01-16
# 7  single  DIA    haap      MMB 2023-03-05
# 8  couple  LIA     der      QQW 2023-07-17
# 9  couple  MIA     leo      XXD 2023-06-08
# 10 couple  SIA    sdee      GGH 2023-07-17
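One thing to note about the output above: rows_update() overwrites every matched row, so the jav row changes from EWW / 2023-07-17 to SSD / 2023-06-09, which the expected output in the question does not do. Below is a minimal sketch of an NA-only fill using dplyr::rows_patch(). It assumes dplyr 1.1.0 or later (for the `unmatched` argument) and assumes df1 is first cut down to the columns it shares with df2. The data frames are rebuilt compactly from the question's data, and `patched` is only an illustrative name.

```r
library(dplyr)

# df1 from the question, keeping only the columns that also exist in df2
# (the `nature` column is dropped here; that is an assumption of this sketch).
df1 <- data.frame(
  subject = c("gank", "vand", "vand", "jav", "vand", "haap"),
  board   = c("REW", "EWW", "EWW", "SSD", "EWW", "MMB"),
  date    = c("2023-07-12", "2023-06-09", "2023-06-09",
              "2023-06-09", "2023-06-09", "2023-03-05")
)

# df2 from the question.
df2 <- data.frame(
  type    = c("single", "couple", "couple", "couple", "couple",
              "couple", "single", "couple", "couple", "couple"),
  name    = c("ZIA", "MIA", "lMIA", "LIA", "LIA", "LIA", "DIA", "LIA", "MIA", "SIA"),
  subject = c("vand", "vank", "vank", "jav", "tral", "twe", "haap", "der", "leo", "sdee"),
  board   = c(NA, "SSD", "REW", "EWW", "WWS, DDC", "SSD", NA, "QQW", "XXD", "GGH"),
  date    = c(NA, "2023-07-03", "2023-07-03", "2023-07-17", "2023-07-17",
              "2023-01-16", NA, "2023-07-17", "2023-06-08", "2023-07-17")
)

# unique() collapses the three identical vand rows so `subject` is unique in y.
# rows_patch() only fills cells of df2 that are NA; existing values stay as they are.
# unmatched = "ignore" skips keys of y that do not occur in df2 (here: gank).
patched <- rows_patch(df2, unique(df1), by = "subject", unmatched = "ignore")

patched[patched$subject %in% c("vand", "jav", "haap"), ]
```

With this variant the vand and haap rows are filled in while the jav row keeps its original board and date, which should correspond to df3 in the question.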
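Comments
0 upvotes
Rara
7/25/2023
Thank you for the great solution. However, I should note that in my real data df1 contains other columns that do not exist in df2, and unfortunately the code then fails with this error: `y` key values must be unique.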
0 upvotes
Maël
7/25/2023
Then you can create another df1 that contains only the columns that are also in df2.
0 upvotes
Rara
7/25/2023
I only had to select the shared columns in df1 to get the desired output. Thank you for the advice.
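The column selection described in these comments could look roughly like the sketch below. It uses df1 as posted in the question (including `nature`) and, to stay short, a three-row excerpt of df2 (the vand, jav and haap rows). The names `df2_excerpt`, `shared`, `lookup` and `updated` are illustrative, and dplyr 1.1.0 or later is assumed for the `unmatched` argument.

```r
library(dplyr)

# df1 as posted in the question: `nature` has no counterpart in df2.
df1 <- data.frame(
  nature  = c("sop", "dior", "coats", "sem", "wia", "bodo"),
  subject = c("gank", "vand", "vand", "jav", "vand", "haap"),
  board   = c("REW", "EWW", "EWW", "SSD", "EWW", "MMB"),
  date    = c("2023-07-12", "2023-06-09", "2023-06-09",
              "2023-06-09", "2023-06-09", "2023-03-05")
)

# Three-row excerpt of df2 from the question (rows vand, jav, haap).
df2_excerpt <- data.frame(
  type    = c("single", "couple", "single"),
  name    = c("ZIA", "LIA", "DIA"),
  subject = c("vand", "jav", "haap"),
  board   = c(NA, "EWW", NA),
  date    = c(NA, "2023-07-17", NA)
)

# Columns present in both data frames, in df1's order.
shared <- intersect(names(df1), names(df2_excerpt))

# With `nature` still present every row of df1 is distinct, so `subject`
# stays duplicated. Dropping the extra column first lets the three
# identical vand rows collapse into one.
lookup <- distinct(df1[shared])

# Same call as in the answer, now with an explicit key.
# Note that rows_update() also overwrites the non-NA jav row.
updated <- rows_update(df2_excerpt, lookup, by = "subject", unmatched = "ignore")
updated
```

Computing the shared columns with intersect() rather than typing them out keeps the sketch usable when the real df1 carries further extra columns.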