Extract first word from a column and insert into existing column

Question:

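I have a tibble and want to extract the first word from a column and insert it into an existing column, based on an N/A criterion. The code I have doesn't work; can anyone help?

The second column holds state names, and some rows are N/A. What I want to do is: if a row in the second column is N/A, take the first word from column 7 and use it to replace the N/A.

This is the code I have so far, but I currently get an error.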


for (i in which(is.na(PRC[,2]))){
PRC[int(i/11),2] <- substr(PRC[int(i/11),7],6) # needs to create a substring to pull the first word from source column to the state column and replace unknown
}

r text-extract

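Using several tidyverse packages, you can easily get the desired result. I added an example df to show the solution. The code below uses mutate and across, and replaces the NA entries in the second column with the first word of the first column (using ifelse, str_extract and a regular expression).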

library(dplyr)
library(tibble)
library(stringr)

df <- tibble(fill = c("First state", "Second state", "Third state", "Fourth state", "Fifth state"),
             states = c("State B", "State A", NA, "State C", NA))

df |>
  mutate(across(states, ~ifelse(is.na(.x), 
                                str_extract(fill, "^[A-Za-z]+(?= )"),  # [A-Za-z], since [A-z] also matches [ \ ] ^ _ `
                                .x)))

# A tibble: 5 × 2
#  fill         states 
#  <chr>        <chr>  
#1 First state  State B
#2 Second state State A
#3 Third state  Third  
#4 Fourth state State C
#5 Fifth state  Fifth
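
One caveat with the lookahead (?= ) in the pattern above: it only matches when a space follows the first word, so a single-word value yields NA. The sketch below uses made-up strings (not the question's data) and compares it with a pattern that has no lookahead; the variable names are illustrative only.

```r
library(stringr)

# Made-up inputs: one multi-word value and one single-word value
x <- c("Third state", "Texas")

# Lookahead version: needs a space after the first word
with_lookahead <- str_extract(x, "^[A-Za-z]+(?= )")

# No lookahead: leading run of letters, with or without a following space
without_lookahead <- str_extract(x, "^[A-Za-z]+")

print(with_lookahead)     # "Third" NA
print(without_lookahead)  # "Third" "Texas"
```

If the source column can contain single-word entries, the second pattern may be the safer choice.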

Thanks for the answer. However, when I try it with my dataset, I get the following error:

Error in mutate():
ℹ In argument: across(...).
Caused by error in across():
! Can't subset columns that don't exist.
✖ Columns Minnesota, North Carolina, Massachusetts, Arizona, Tennessee, etc. don't exist.
Backtrace:
 1. dplyr::mutate(...)
 2. dplyr:::mutate.data.frame(...)
 3. dplyr:::mutate_cols(.data, dplyr_quosures(...), by)
 5. dplyr:::mutate_col(dots[[i]], data, mask, new_columns)
 6. dplyr:::expand_across(dot)

Any suggestions?
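
The state names in that message suggest that the values of a column, rather than its name, reached across(). That is a guess, since the failing call is not shown. The sketch below refers to the columns by position (2 and 7, as in the question) and avoids across() altogether. The tibble is a made-up stand-in for PRC, and state_col and text_col are illustrative names.

```r
library(dplyr)
library(stringr)
library(tibble)

# Made-up stand-in for PRC: column 2 holds states, column 7 holds free text
PRC <- tibble(
  id    = 1:3,
  state = c("Minnesota", NA, "Arizona"),
  c3 = "a", c4 = "b", c5 = "c", c6 = "d",
  notes = c("Minnesota office", "Tennessee branch", "Arizona depot")
)

# Look the column names up by position
state_col <- names(PRC)[2]
text_col  <- names(PRC)[7]

# Keep existing states; fill NA with the first word of the text column
PRC <- PRC |>
  mutate("{state_col}" := coalesce(.data[[state_col]],
                                   str_extract(.data[[text_col]], "\\w+")))

print(PRC[[2]])  # "Minnesota" "Tennessee" "Arizona"
```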

0 votes Onyambu 10/23/2023 #2

Use coalesce:

df %>%
   mutate(states=coalesce(states, str_extract(fill, "\\w+")))

# A tibble: 5 × 2
  fill         states 
  <chr>        <chr>  
1 First state  State B
2 Second state State A
3 Third state  Third  
4 Fourth state State C
5 Fifth state  Fifth
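
As for the loop in the question: base R has no int() function, and substr() requires both a start and a stop position, so those calls fail as written. Below is a vectorised base R sketch under the question's assumptions (NA marks the missing states, column 2 is the state, column 7 is the source text). The data frame is invented for illustration.

```r
# Invented stand-in for PRC with 7 columns
PRC <- data.frame(
  id    = 1:3,
  state = c("Minnesota", NA, NA),
  c3 = "a", c4 = "b", c5 = "c", c6 = "d",
  notes = c("Minnesota office", "Tennessee branch", "Arizona depot"),
  stringsAsFactors = FALSE
)

# Rows whose state is missing
missing <- is.na(PRC[[2]])

# Drop everything from the first space onward, leaving the first word
PRC[[2]][missing] <- sub(" .*$", "", PRC[[7]][missing])

print(PRC[[2]])  # "Minnesota" "Tennessee" "Arizona"
```

The same two assignment lines should also work on a tibble, since [[ ]] indexing by position behaves the same way there.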
