Rename columns in R using str_replace_all for more than two string types

Q:

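I have a dataset (dataraw) with column labels such as

condition1_men, condition1_women, condition2_men, condition3_women (etc.)

I would like to replace the strings "condition1", "condition2", and so on with their names:

condition1_women = related_women;

condition2_men = unrelated_men;

condition3_men = filler_men;

Current code: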

data <- dataraw %>%
 rename_all(~ str_replace_all(str_replace(., 'condition1', "related"), 'condition2', "unrelated"))

This works for up to two strings, but every time I try to add a third one I get an "unexpected symbol" error.

 data <- dataraw %>%
rename_all(~ str_replace_all(str_replace((., 'condition1', "related"), 'condition2', "unrelated"), 'condition3', "filler")))

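I'm sure this must be something simple, but whichever combination I try, I get an error. Can anyone point out the simple mistake I'm making? Thanks.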

r dplyr replace stringr

library(dplyr)
dataraw <- data.frame(condition1_men=1, condition1_women=2, condition2_men=3, condition2_women=4, condition3_men=5)
dataraw
#   condition1_men condition1_women condition2_men condition2_women condition3_men
# 1              1                2              3                4              5
dataraw |>
  rename_with(.fn = ~ sub("^condition1_", "related_", sub("^condition2_", "unrelated_", .)))
#   related_men related_women unrelated_men unrelated_women condition3_men
# 1           1             2             3               4              5
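The same nesting extends to a third pattern. The sketch below is one way to do it, assuming the third label should be "filler" as stated in the question; the anchored patterns follow the code above. Note that the failing call in the question also contains a stray opening parenthesis in str_replace((., ...), which is a syntax error on its own, regardless of how many patterns are nested.

```r
library(dplyr)

dataraw <- data.frame(condition1_men = 1, condition1_women = 2,
                      condition2_men = 3, condition2_women = 4,
                      condition3_men = 5)

# One sub() call per pattern: the innermost call runs first, and each
# outer call works on the result of the call nested inside it.
renamed <- dataraw |>
  rename_with(.fn = ~ sub("^condition1_", "related_",
                      sub("^condition2_", "unrelated_",
                      sub("^condition3_", "filler_", .x))))

names(renamed)
```

Each extra pattern adds one more level of nesting, which is where parentheses are easy to lose; the named-vector approach avoids that.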

If you have a (named) vector of "from=to" assignments, we can also do this in a slightly more general way:

conds <- c(condition1="related", condition2="unrelated")
dataraw |>
  rename_with(.fn = ~ Reduce(function(st, i) sub(names(conds)[i], conds[i], st), seq_along(conds), init = .x))
#   related_men related_women unrelated_men unrelated_women condition3_men
# 1           1             2             3               4              5

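We need Reduce because each substitution has to be applied to the result of the previous one, so that all changes from earlier condition mappings are kept.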
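To make the chaining visible, here is a small base-R sketch that runs the same kind of Reduce call with accumulate = TRUE, which keeps every intermediate result. The vector cols is a hypothetical stand-in for the column names, used only for illustration.

```r
conds <- c(condition1 = "related", condition2 = "unrelated")

# Hypothetical column names, standing in for names(dataraw).
cols <- c("condition1_men", "condition2_women", "condition3_men")

# accumulate = TRUE returns a list: element 1 is the input, element 2 is the
# state after the condition1 mapping, element 3 after the condition2 mapping.
steps <- Reduce(function(st, i) sub(names(conds)[i], conds[i], st),
                seq_along(conds), init = cols, accumulate = TRUE)

steps[[2]]  # only condition1 has been replaced so far
steps[[3]]  # condition2 has been replaced on top of the previous step
```

Each step receives the output of the previous step as st, so no earlier replacement is lost. Names with no entry in conds (condition3 here) pass through unchanged.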

I often find that data like this is easier to work with in long format for later wrangling and analysis (as Limey suggested). For that, we can also do:

dataraw |>
  tidyr::pivot_longer(cols = everything(), names_pattern = "(.*)_(.*)",
                      names_to = c("cond", ".value")) |>
  mutate(cond2 = conds[match(sub("_.*", "", cond), names(conds))])
# # A tibble: 3 × 4
#   cond         men women cond2    
#   <chr>      <dbl> <dbl> <chr>    
# 1 condition1     1     2 related  
# 2 condition2     3     4 unrelated
# 3 condition3     5    NA NA

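It may be simpler, though (for data management, visualization, updates, and so on), to keep your mapping in a separate data frame that we can merge/join onto the original data: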

cond_df <- tribble(
  ~ cond, ~ cond2
  , "condition1", "related"
  , "condition2", "unrelated"
)
dataraw |>
  tidyr::pivot_longer(cols = everything(), names_pattern = "(.*)_(.*)",
                      names_to = c("cond", ".value")) |>
  left_join(cond_df, by = "cond")
# # A tibble: 3 × 4
#   cond         men women cond2    
#   <chr>      <dbl> <dbl> <chr>    
# 1 condition1     1     2 related  
# 2 condition2     3     4 unrelated
# 3 condition3     5    NA NA
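In the joined result, any condition without an entry in the lookup table ends up with NA in cond2. If falling back to the original label is preferable, one possible sketch uses dplyr::coalesce after the join. The data frame long below is a hand-written stand-in for the pivoted data, and labelled is a hypothetical name.

```r
library(dplyr)

# Hand-written stand-in for the pivoted (long) data shown above.
long <- data.frame(cond  = c("condition1", "condition2", "condition3"),
                   men   = c(1, 3, 5),
                   women = c(2, 4, NA))

cond_df <- data.frame(cond  = c("condition1", "condition2"),
                      cond2 = c("related", "unrelated"))

labelled <- long |>
  left_join(cond_df, by = "cond") |>
  # Where the lookup has no match (cond2 is NA), keep the original label.
  mutate(cond2 = coalesce(cond2, cond))

labelled$cond2
```

Whether a fallback or an explicit NA is better depends on the analysis; NA makes missing mappings easier to spot.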

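# tibble/tribble, map2_chr/modify/keep and str_detect/str_replace come from these packages
library(tibble)
library(purrr)
library(stringr)

dataraw <- tibble(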
  condition1_women = c("a", "b", "c"),
  condition2_men   = c("x", "y", "z"),
  condition3_men   = c("i", "j", "k"))

# A tibble: 3 × 3 -------------------
  condition1_women condition2_men condition3_men
  <chr>            <chr>          <chr>         
1 a                x              i             
2 b                y              j             
3 c                z              k

labels <- tribble(
  ~ old, ~ new,
  "condition1", "related",
  "condition2", "unrelated",
  "condition3", "filler")

# A tibble: 3 × 2 -------------------
  old        new      
  <chr>      <chr>    
1 condition1 related  
2 condition2 unrelated
3 condition3 filler 

old_names <- colnames(dataraw)

# -----------------------------------
[1] "condition1_women" "condition2_men"   "condition3_men"  

new_names <- map2_chr(
  labels$old, labels$new,
  \(y, z) modify(
    keep(old_names,\(x) str_detect(x, y)), 
    \(x) str_replace(x, y, z)))

# -----------------------------------   
[1] "related_women" "unrelated_men" "filler_men"

colnames(dataraw) <- new_names

# A tibble: 3 × 3 -------------------
  related_women unrelated_men filler_men
  <chr>         <chr>         <chr>     
1 a             x             i         
2 b             y             j         
3 c             z             k
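The map2_chr approach above yields one name per row of labels, so it relies on each label matching exactly one column and on the columns appearing in the same order as the labels. As a hedged alternative for the case in the question, where several columns share a prefix, stringr's str_replace_all accepts a named vector of pattern = replacement pairs. The sketch below builds that vector from a lookup table; lookup is a hypothetical name and old_names is written out by hand for illustration.

```r
library(stringr)

labels <- data.frame(
  old = c("condition1", "condition2", "condition3"),
  new = c("related", "unrelated", "filler"))

# Column names as in the question, with two columns sharing a prefix.
old_names <- c("condition1_men", "condition1_women",
               "condition2_men", "condition3_women")

# Named vector: the names are the patterns, the values are the replacements.
lookup <- setNames(labels$new, labels$old)

# Every pattern is applied to every element, so one call handles all three labels.
new_names <- str_replace_all(old_names, lookup)
new_names
```

The patterns are regular expressions, so "condition1" would also match inside a name such as "condition10_men"; anchoring the patterns (for example "^condition1_") is one way to guard against that if such names can occur.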
