Asked by: Usman YousafZai Asked: 6/20/2023 Updated: 6/20/2023 Views: 25
Counter in a new column if a number or string matches between two data frames in R
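Q:
I will try to give the clearest example I can. I have two data frames: df1 and df2. In the data frames below, if the Drug_Code from df1 (Drug_Code_1) matches the Drug_Code from df2 (Drug_Code_2), a new column named Drug_count should be created in df1 and incremented by 1. If the same Pat_ID has two different drugs, that should count as 2. Each Drug_Code has its own unique Drug_Names value. The column values for each Pat_ID should also be merged into one string separated by a comma ",". Below are df1, df2, and the resultant df. Thanks.
df1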
Pat_ID Date Drug_Code Drug_Names
1 2010-12-09 1.1.1 Alpha-21
1 2010-12-15 1.1.2 Alpha
1 2010-12-15 1.1.3 Beta
1 2010-12-15 1.1.3 Beta-29
2 2010-12-09 1.1.3 Beta
2 2010-12-17 1.1.4 Beta-32
1 2010-12-15 1.1.3 Beta
3 2011-02-09 1.2.1 Gamma-21
3 2011-04-25 1.2.2 Gamma
3 2011-04-25 1.3.1 Tango
df2
Drug_Code Drug_Names
1.1.1 Alpha-21
1.1.2 Alpha
1.1.3 Beta
1.1.4 Beta-2
1.2.1 Gamma-21
1.2.2 Gamma
1.3.1 Tango
Resultant df
Pat_ID  Date                                Drug_Code            Drug_Names             Count
1       2010-12-09, 2010-12-15, 2010-12-15  1.1.1, 1.1.2, 1.1.3  Alpha-21, Alpha, Beta  3
2       2010-12-09, 2010-12-15              1.1.3, 1.2.1         Beta, Gamma-21         2
3       2011-02-09, 2011-04-25              1.2.1, 1.2.2         Gamma-21, Gamma        2
A:
1 upvote
Phoenix
6/20/2023
#1
This should work:
df1 <- data.frame(
  Pat_ID = c(1, 1, 1, 1, 2, 2, 1, 3, 3, 3),
  Date = c("2010-12-09", "2010-12-15", "2010-12-15", "2010-12-15", "2010-12-09",
           "2010-12-17", "2010-12-15", "2011-02-09", "2011-04-25", "2011-04-25"),
  Drug_Code = c("1.1.1", "1.1.2", "1.1.3", "1.1.3", "1.1.3", "1.1.4", "1.1.3", "1.2.1",
                "1.2.2", "1.3.1"),
  Drug_Names = c("Alpha-21", "Alpha", "Beta", "Beta-29", "Beta", "Beta-32", "Beta",
                 "Gamma-21", "Gamma", "Tango")
)

df2 <- data.frame(
  Drug_Code = c("1.1.1", "1.1.2", "1.1.3", "1.1.4", "1.2.1", "1.2.2", "1.3.1"),
  Drug_Names = c("Alpha-21", "Alpha", "Beta", "Beta-2", "Gamma-21", "Gamma", "Tango")
)

library(dplyr)

merged_df <- df1 %>%
  left_join(df2, by = "Drug_Code")

result_df <- merged_df %>%
  group_by(Pat_ID) %>%
  summarise(
    Date = paste(Date, collapse = ", "),
    Drug_Code = paste(Drug_Code, collapse = ", "),
    Drug_Names = paste(`Drug_Names.x`, collapse = ", "), # Updated column name
    Count = n_distinct(Drug_Code)
  )

print(result_df)
The output will be:
Pat_ID Date Drug_Code Drug_Names Count
<dbl> <chr> <chr> <chr> <int>
1 1 2010-12-09, 2010-12-15, 2010-12-15, 2010-12~ 1.1.1, 1~ Alpha-21,~ 1
2 2 2010-12-09, 2010-12-17 1.1.3, 1~ Beta, Bet~ 1
3 3 2011-02-09, 2011-04-25, 2011-04-25 1.2.1, 1~ Gamma-21,~ 1
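The Count of 1 in every row appears to come from the order of the arguments inside summarise(): dplyr evaluates them one after another, and a later expression sees any column an earlier one has already replaced. The following is a minimal sketch of that effect on a small made-up data frame (toy is a hypothetical example, not part of the question's data):

```r
library(dplyr)

# Hypothetical toy data: one patient, three rows, two distinct drug codes
toy <- data.frame(
  Pat_ID = c(1, 1, 1),
  Drug_Code = c("1.1.1", "1.1.3", "1.1.3")
)

# Collapse first: n_distinct() only sees the single pasted string
collapse_first <- toy %>%
  group_by(Pat_ID) %>%
  summarise(
    Drug_Code = paste(Drug_Code, collapse = ", "),
    Count = n_distinct(Drug_Code)
  )

# Count first: n_distinct() still sees the original per-row codes
count_first <- toy %>%
  group_by(Pat_ID) %>%
  summarise(
    Count = n_distinct(Drug_Code),
    Drug_Code = paste(Drug_Code, collapse = ", ")
  )

print(collapse_first$Count) # 1
print(count_first$Count)    # 2
```

Moving Count above the paste() calls is therefore enough to change what gets counted.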
Comments
0 upvotes
Melissa Key
6/20/2023
In the solution, all 3 counts are 1. You need to count before collapsing the drug codes. You can also use paste with across.
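One way to apply the comment's two suggestions is sketched below. It is not the original answerer's code, and it rests on a few assumptions: a "match" means the Drug_Code exists in df2 (implemented here with semi_join()), repeated codes for the same Pat_ID are reduced to their first row with distinct(), and Count is moved to the last column with relocate() only to mirror the layout in the question.

```r
library(dplyr)

df1 <- data.frame(
  Pat_ID = c(1, 1, 1, 1, 2, 2, 1, 3, 3, 3),
  Date = c("2010-12-09", "2010-12-15", "2010-12-15", "2010-12-15", "2010-12-09",
           "2010-12-17", "2010-12-15", "2011-02-09", "2011-04-25", "2011-04-25"),
  Drug_Code = c("1.1.1", "1.1.2", "1.1.3", "1.1.3", "1.1.3", "1.1.4", "1.1.3", "1.2.1",
                "1.2.2", "1.3.1"),
  Drug_Names = c("Alpha-21", "Alpha", "Beta", "Beta-29", "Beta", "Beta-32", "Beta",
                 "Gamma-21", "Gamma", "Tango")
)

df2 <- data.frame(
  Drug_Code = c("1.1.1", "1.1.2", "1.1.3", "1.1.4", "1.2.1", "1.2.2", "1.3.1"),
  Drug_Names = c("Alpha-21", "Alpha", "Beta", "Beta-2", "Gamma-21", "Gamma", "Tango")
)

result_df <- df1 %>%
  # Assumption: keep only df1 rows whose Drug_Code also appears in df2
  semi_join(df2, by = "Drug_Code") %>%
  # Assumption: keep the first row for each patient/drug combination
  distinct(Pat_ID, Drug_Code, .keep_all = TRUE) %>%
  group_by(Pat_ID) %>%
  summarise(
    # Count first, while Drug_Code still holds one code per row
    Count = n_distinct(Drug_Code),
    # Then collapse each column into one comma-separated string
    across(c(Date, Drug_Code, Drug_Names), ~ paste(.x, collapse = ", "))
  ) %>%
  relocate(Count, .after = last_col())

print(result_df)
```

With the df1 and df2 from the answer, this gives counts of 3, 2 and 3 for Pat_ID 1, 2 and 3. Pat_ID 3 comes out as 3 rather than the 2 shown in the question because 1.3.1 (Tango) is also present in df2.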