Asked by: Catherine · Asked: 10/27/2023 · Last edited by: zx8754, Catherine · Updated: 10/27/2023 · Views: 57
Looking for a negative lookahead for whitespace when using the cSplit_e function in the splitstackshape package
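Q:
I want to split a column that contains multiple comma-separated responses into multiple columns. I am using the cSplit_e function from the splitstackshape package. Unfortunately, some of the items contain commas within a single item, so I am trying to tell it to split only at commas that are not followed by whitespace.
This is the syntax I have now: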
cSplit_e(data=df,split.col="question",sep=",",type="character")
This takes this:
Behavior; green, pink, blue,Sleep; indigo, violet, puce
and creates a separate column for each of the following:
question_Behavior; green
question_pink
question_blue
question_Sleep; indigo
question_violet
question_puce
But I would like it to split like this:
question_Behavior; green, pink, blue
question_Sleep; indigo, violet, puce
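I am not sure how to indicate in the cSplit_e syntax that I only want it to split at commas that are immediately followed by a non-whitespace character, and would appreciate help!
Example data frame: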
id_num <- c("1","2","3","4","5")
question <- c("Behavior; green, pink, blue,Sleep; indigo, violet, puce","Behavior; green, pink, blue","","Sleep; indigo, violet, puce","Behavior; green, pink, blue,Sleep; indigo, violet, puce")
df <- data.frame(id_num,question)
A:
2 votes
Lucca Nielsen
10/27/2023
#1
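If you don't mind using the tidyr package, here is a suggestion for a possible solution. It may not be as elegant or simple as using the splitstackshape package, but I am not familiar with that one.
I had to remove the id_num whose value is empty for both answers (id = 3).
My code: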
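library(dplyr)
library(tidyr)
df %>%
  # one row per response block: split only at commas with no whitespace on either side
  separate_rows(question, sep = "(?<=\\S),(?=\\S)", convert = FALSE) %>%
  # text before the first ";" is the question name, the rest is the response
  separate(question, into = c("question", "response"), sep = ";", extra = "merge") %>%
  # drop the empty row (id_num 3), which has no response
  filter(!is.na(response)) %>%
  pivot_wider(names_from = question, values_from = response) %>%
  rename_all(~gsub("\\.", "_", .))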
Output:
# A tibble: 4 × 3
  id_num Behavior             Sleep
  <chr>  <chr>                <chr>
1 1      " green, pink, blue" " indigo, violet, puce"
2 2      " green, pink, blue" NA
3 4      NA                   " indigo, violet, puce"
4 5      " green, pink, blue" " indigo, violet, puce"
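To see what a "comma not followed by whitespace" separator does on its own, here is a minimal base R sketch. It is not part of the original answer. It assumes the simpler negative-lookahead pattern `,(?!\\s)` and uses `strsplit()` with `perl = TRUE`, since lookarounds need the PCRE engine. The variable names `x`, `parts`, `key` and `value` are illustrative only.

```r
# Example string taken from the question.
x <- "Behavior; green, pink, blue,Sleep; indigo, violet, puce"

# Split only at commas that are NOT followed by whitespace.
# perl = TRUE is needed because lookaheads are a PCRE feature.
parts <- strsplit(x, ",(?!\\s)", perl = TRUE)[[1]]
print(parts)

# Separate each piece at its first semicolon into a name and a response,
# trimming the leading space that the tidyr output above still contains.
key   <- trimws(sub(";.*$", "", parts))
value <- trimws(sub("^[^;]*;", "", parts))
print(setNames(value, key))
```

The commas inside "green, pink, blue" are each followed by a space, so the lookahead rejects them and only the comma before "Sleep" acts as a separator.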
Comments
dput(head(mydata))
,(?!\s)
fixed
cSplit_e
FALSE
sep
sep=",\\b"
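The fragments above appear to point at passing a regular expression as `sep` with `fixed = FALSE`. The following is a hedged sketch of that idea, not a confirmed solution from the thread. It first checks the pattern `,\\b` (a comma followed by a word boundary) with base `strsplit()`, then passes the same pattern to `cSplit_e`, guarded so it only runs if splitstackshape is installed. That the `fixed` argument behaves this way in `cSplit_e` is an assumption inferred from the comments, and the shortened data frame is for illustration only.

```r
# Two rows from the question's example data.
df <- data.frame(
  id_num = c("1", "2"),
  question = c("Behavior; green, pink, blue,Sleep; indigo, violet, puce",
               "Behavior; green, pink, blue"),
  stringsAsFactors = FALSE
)

# Default regex engine (fixed = FALSE, perl = FALSE): split at a comma that is
# followed by a word boundary, i.e. directly by a letter, digit or underscore.
pieces <- strsplit(df$question, ",\\b")
print(pieces)

# Assumption: cSplit_e forwards `sep` and `fixed` to a regex-based split.
if (requireNamespace("splitstackshape", quietly = TRUE)) {
  out <- splitstackshape::cSplit_e(data = df, split.col = "question",
                                   sep = ",\\b", type = "character",
                                   fixed = FALSE)
  print(names(out))
}
```

A word boundary is used here instead of a lookahead because lookarounds such as `,(?!\s)` need a PCRE engine, which R's default (non-`perl`) regex mode does not provide.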