Asked by: littleworth · Asked: 11/17/2022 · Last edited by: Wiktor Stribiżew, littleworth · Updated: 11/18/2022 · Views: 640
How to replace a character in a string with characters in a vector by preserving its order using R
Q:
I have this string:
seed_pattern <- "K?ED??HRDDKDKD?HE?REKE??DE?KKK"
Given another string:
bb_seq <- "rhhhhitv"
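What I want to do is replace each ? in seed_pattern with the characters of bb_seq, keeping the order of bb_seq. The total number of ? characters is guaranteed to equal the length of bb_seq. The expected result is: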
KrEDhhHRDDKDKDhHEhREKEitDEvKKK
How can I achieve this with R?
I tried this, but it failed:
seed_pattern <- "K?ED??HRDDKDKD?HE?REKE??DE?KKK"
bb_seq <- "rhhhhitv"
sp <- seed_pattern
gr <- gregexpr("\\?+", sp)
csml <- lapply(gr, function(sp) cumsum(attr(sp, "match.length")))
regmatches(sp, gr) <- lapply(csml, function(sp) substring(bb_seq, c(1, sp[1]), sp))
sp
# KrEDrhhHRDDKDKDrhhhHErhhhhREKErhhhhitDErhhhhitvKKK
I'm open to non-regex solutions.
A:
6 votes
Jilber Urbina
11/17/2022
#1
Split, replace, and paste back together:
> target <- strsplit(seed_pattern, "")[[1]]
> replacement <- strsplit(bb_seq, "")[[1]]
> target[target=="?"] <- replacement
> paste(target, collapse = "")
[1] "KrEDhhHRDDKDKDhHEhREKEitDEvKKK"
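The split, replace, paste steps above can be wrapped in a small helper. This is only a sketch: the function name fill_placeholders and its placeholder argument are hypothetical and not part of the original answer. It adds an explicit check for the length guarantee stated in the question, because a plain subset assignment would otherwise recycle or truncate the replacement with only a warning.

```r
# Hypothetical helper (name and arguments are illustrative assumptions).
fill_placeholders <- function(pattern, fill, placeholder = "?") {
  chars <- strsplit(pattern, "")[[1]]  # pattern as single characters
  repl  <- strsplit(fill, "")[[1]]     # replacement characters, in order
  slots <- chars == placeholder        # TRUE where a placeholder sits

  # Enforce the guarantee from the question: one replacement per placeholder.
  if (sum(slots) != length(repl)) {
    stop("number of placeholders does not match the length of 'fill'")
  }

  chars[slots] <- repl                 # fill the slots left to right
  paste(chars, collapse = "")
}

fill_placeholders("K?ED??HRDDKDKD?HE?REKE??DE?KKK", "rhhhhitv")
# [1] "KrEDhhHRDDKDKDhHEhREKEitDEvKKK"
```

Because the function does not modify seed_pattern, it can be called repeatedly with different inputs.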
1 vote
shadowtalker
11/17/2022
#2
You can do this by replacing one ? at a time (probably not very efficient):
seed_pattern <- "K?ED??HRDDKDKD?HE?REKE??DE?KKK"
bb_seq <- "rhhhhitv"
for (ch in unlist(strsplit(bb_seq, ""))) {
  print(ch)
  seed_pattern <- sub("?", ch, seed_pattern, fixed = TRUE)
}
print(seed_pattern)
# [1] "KrEDhhHRDDKDKDhHEhREKEitDEvKKK"
Sadly, sub is not vectorized over its replacement argument!
1 vote
TarJae
11/17/2022
#3
This is the long way around. I still can't do these things without thinking in tibbles or data frames. I hope I'll get the hang of it one day:
library(dplyr)
library(tidyr)
tibble(seed_pattern, bb_seq) %>%
  separate_rows(seed_pattern, sep='\\?') %>%
  mutate(seed_pattern = paste(paste0(seed_pattern, substr(bb_seq, row_number(), row_number())), collapse = "")) %>%
  slice(1) %>%
  pull(seed_pattern)
[1] "KrEDhhHRDDKDKDhHEhREKEitDEvKKK"
10 votes
lotus
11/17/2022
#4
You can do this in one line, with a slight change to the solution you got in your previous question (thanks to @thelatemail):
regmatches(seed_pattern, gregexpr("\\?", seed_pattern)) <- strsplit(bb_seq, "")
Check that it gives the expected result:
seed_pattern == "KrEDhhHRDDKDKDhHEhREKEitDEvKKK"
[1] TRUE
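The difference between the pattern "\\?+" used in the question and the pattern "\\?" used here can be seen by counting matches. The sketch below uses illustrative variable names (runs, singles, result) that do not appear in the original posts, and it assigns into a copy so that seed_pattern stays unchanged.

```r
seed_pattern <- "K?ED??HRDDKDKD?HE?REKE??DE?KKK"
bb_seq <- "rhhhhitv"

# "\\?+" matches each run of consecutive question marks as one match,
# while "\\?" matches every question mark individually.
runs    <- gregexpr("\\?+", seed_pattern)[[1]]
singles <- gregexpr("\\?", seed_pattern)[[1]]

length(runs)     # 6 runs: ?, ??, ?, ?, ??, ?
length(singles)  # 8 single matches
nchar(bb_seq)    # 8 replacement characters, one per single match

# Work on a copy, since `regmatches<-` modifies its first argument.
result <- seed_pattern
regmatches(result, gregexpr("\\?", result)) <- strsplit(bb_seq, "")
result
# [1] "KrEDhhHRDDKDKDhHEhREKEitDEvKKK"
```

With one match per question mark, the eight characters from strsplit line up one to one with the matches, which is what the one-line answer relies on.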
2 votes
Ottie
11/18/2022
#5
The regmatches approach is the best, but if you'd like to see the solution grow step by step, here is a sequential one-liner:
sapply(strsplit(bb_seq, "")[[1]], function(char) seed_pattern <<- sub("\\?", char, seed_pattern))
r
"KrED??HRDDKDKD?HE?REKE??DE?KKK"
h
"KrEDh?HRDDKDKD?HE?REKE??DE?KKK"
h
"KrEDhhHRDDKDKD?HE?REKE??DE?KKK"
h
"KrEDhhHRDDKDKDhHE?REKE??DE?KKK"
h
"KrEDhhHRDDKDKDhHEhREKE??DE?KKK"
i
"KrEDhhHRDDKDKDhHEhREKEi?DE?KKK"
t
"KrEDhhHRDDKDKDhHEhREKEitDE?KKK"
v
"KrEDhhHRDDKDKDhHEhREKEitDEvKKK"
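The one-liner above updates seed_pattern in the calling environment through <<-. If that side effect is unwanted, one possible alternative (a sketch, not taken from the original answer) is to fold over the replacement characters with base R's Reduce; the variable name steps is illustrative.

```r
seed_pattern <- "K?ED??HRDDKDKD?HE?REKE??DE?KKK"
bb_seq <- "rhhhhitv"

# Left fold: start from seed_pattern and substitute the first remaining "?"
# with each character of bb_seq in turn. accumulate = TRUE keeps every
# intermediate string, including the initial one.
steps <- Reduce(
  function(s, ch) sub("?", ch, s, fixed = TRUE),
  strsplit(bb_seq, "")[[1]],
  seed_pattern,
  accumulate = TRUE
)

steps[length(steps)]
# [1] "KrEDhhHRDDKDKDhHEhREKEitDEvKKK"
```

Here steps holds the starting string followed by one entry per substitution, and seed_pattern itself is left untouched.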
Comments
stringr::str_replace_all() is vectorized over its arguments, but not in the way I expected. I tried str_replace_all(seed_pattern, "\\?+", unlist(str_split(bb_seq, ""))).
regmatches(seed_pattern, gregexpr("\\?", seed_pattern)) <- strsplit(bb_seq, "")