Assign site name based on matching strings (without separating the address)

Asked by: Banji  Asked: 1/5/2023  Updated: 1/5/2023  Views: 19

Q:

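I am trying to assign a SiteName based on some matching conditions. I have accomplished the task with the steps below, and I would like to know whether I can get the same result without the `separate` and `pivot_longer` functions.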

Code:


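library(dplyr)    # %>%, mutate, case_when, summarise, ...
library(tidyr)    # separate, pivot_longer
library(stringr)  # str_to_upper, str_remove_all, str_trim

df <- tibble::tribble(
  ~Author, ~"Author Address",
  "BA",    "METRO NORTH; DALLAS",
  "OA",    "RBWH; TPCH; TOPEND",
  "OB",    "CABOOLTURE HOSPITAL; CALIFORNIA UNI",
  "AOS",   "CABOOLTURE HOSPITAL; RBWH",
  "KB",    "WOMENS HOSPITAL"
)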


df


df %>%
  # Split the address into up to three pieces; fill = "right" pads short rows with NA
  separate(`Author Address`, into = c("Site1", "Site2", "Site3"), sep = ";",
           remove = FALSE, fill = "right") %>%
  rename(Address = `Author Address`) %>%
  # One row per address piece
  pivot_longer(cols = -c(Author, Address), names_to = "Site", values_to = "Author Address") %>%
  mutate(`Author Address` = str_to_upper(`Author Address`),
         # Pieces that match no pattern get NA
         Site = case_when(grepl("METRO NORTH", `Author Address`) ~ "Metro North Health",
                          grepl("CABOOLTURE HOSPITAL", `Author Address`) ~ "Caboolture Hospital",
                          grepl("TPCH", `Author Address`) ~ "Prince charles hospital",
                          grepl("RBWH", `Author Address`) ~ "RBWH")) %>%
  # filter(!is.na(Site)) %>%
  select(Author, Address, Site) %>%
  distinct() %>%
  group_by(Author, Address) %>%
  # Collapse the matched sites back into one string per author
  summarise(Site = paste0(Site, collapse = ", "), .groups = "drop") %>%
  # Drop the trailing ", NA" left by unmatched pieces
  mutate(Site = str_trim(str_remove_all(Site, pattern = ", NA$"), "right"))

Output:

(The output was posted as an image, which is not included here.)

I have completed the task, but I am hoping for a solution without the `separate` and `pivot_longer` functions, ideally with shorter code.

r string manipulation data wrangling

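Comments

0 upvotes PavoDive 1/5/2023
You will probably need the `separate` and `pivot_longer` parts of your code, as they are required. You can use other functions (from other packages, or even from base R) that will essentially do the same thing. Why do you want to eliminate them?
0 upvotes Banji 1/5/2023
Thank you for the feedback. I am trying to eliminate the `separate` function because I have to supply a separator. Unfortunately, the data is very dirty (far more complex than my sample data), and I cannot rely on the semicolon always falling in the right place, which makes separating very tricky. I think matching strings without separating the field would be the best approach.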
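
The idea in the last comment, matching patterns against the whole address string instead of splitting it first, could be sketched roughly as below. This is only a sketch under assumptions, not a tested answer for the real (dirtier) data: `site_lookup` and `assign_sites` are hypothetical names introduced here, the patterns are the four from the question, and matching is done as fixed substrings on the upper-cased address. Sites are ordered by where they first appear in the address, and an address with no match yields a real `NA` rather than the string "NA" produced by the original pipeline.

```r
library(dplyr)
library(stringr)

df <- tibble::tibble(
  Author = c("BA", "OA", "OB", "AOS", "KB"),
  `Author Address` = c(
    "METRO NORTH; DALLAS",
    "RBWH; TPCH; TOPEND",
    "CABOOLTURE HOSPITAL; CALIFORNIA UNI",
    "CABOOLTURE HOSPITAL; RBWH",
    "WOMENS HOSPITAL"
  )
)

# Hypothetical lookup table: pattern to search for -> site name to assign
site_lookup <- c(
  "METRO NORTH"         = "Metro North Health",
  "CABOOLTURE HOSPITAL" = "Caboolture Hospital",
  "TPCH"                = "Prince charles hospital",
  "RBWH"                = "RBWH"
)

# Hypothetical helper: find every pattern in one address, no separator needed
assign_sites <- function(address, lookup = site_lookup) {
  # Start position of each pattern in the upper-cased address (NA if absent)
  starts <- str_locate(str_to_upper(address), fixed(names(lookup)))[, "start"]
  found <- which(!is.na(starts))
  if (length(found) == 0) return(NA_character_)
  # Order the matched sites by where they appear in the address
  found <- found[order(starts[found])]
  paste(unique(unname(lookup[found])), collapse = ", ")
}

result <- df %>%
  rename(Address = `Author Address`) %>%
  mutate(Site = vapply(Address, assign_sites, character(1), USE.NAMES = FALSE))

result
```

Because nothing is split, a missing or misplaced semicolon does not matter; the trade-off is that a pattern is matched anywhere in the string, so a short pattern such as "RBWH" would also match inside a longer unrelated token.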

A: No answers yet