All possible sequences of given sets of characters

Asked by: Ian  Asked: 12/7/2022  Last edited by: Maël  Updated: 12/8/2022  Views: 101

Q:

I have the following string:

'[ABC][abcd][XYZ]'

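I want to generate all possible strings in which the first character is A, B, or C, the second character is a, b, c, or d, and the third character is X, Y, or Z.

Examples: AcX, BaZ, etc.

How can I do this, preferably within the Tidyverse?

r string sequence tidyr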


A:

7 upvotes jay.sf 12/7/2022 #1

First split the string appropriately with strsplit to get a list of character sets, then use expand.grid and paste0 with do.call.

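# Split on either bracket; el() takes the first (only) element of the result
el(strsplit('[ABC][abcd][XYZ]', '[\\[|\\]]', perl=TRUE)) |>
  # Drop the empty strings left where the brackets were
  {\(x) x[x != '']}() |>
  # Break each group into single characters
  sapply(strsplit, '') |>
  # Build every combination, then paste each row into one string
  do.call(what=expand.grid) |>
  do.call(what=paste0)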
# [1] "AaX" "BaX" "CaX" "AbX" "BbX" "CbX" "AcX" "BcX" "CcX" "AdX" "BdX" "CdX" "AaY" "BaY" "CaY" "AbY" "BbY" "CbY" "AcY" "BcY"
# [21] "CcY" "AdY" "BdY" "CdY" "AaZ" "BaZ" "CaZ" "AbZ" "BbZ" "CbZ" "AcZ" "BcZ" "CcZ" "AdZ" "BdZ" "CdZ"
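
To make the intermediate values easier to inspect, the same pipeline can be unrolled into named steps. This is only a sketch of the approach above, not a replacement for it: the variable names (s, parts, sets, grid, res) are introduced here for illustration, the split pattern is written as a plain alternation, and stringsAsFactors = FALSE is an added choice that keeps the columns as character vectors.

```r
s <- '[ABC][abcd][XYZ]'

# Split on "[" or "]"; empty strings remain where the brackets were
parts <- strsplit(s, '\\[|\\]')[[1]]
parts <- parts[nzchar(parts)]          # keep only "ABC", "abcd", "XYZ"

# One character vector per position
sets <- strsplit(parts, '')

# One row per combination; expand.grid varies the first column fastest
grid <- expand.grid(sets, stringsAsFactors = FALSE)

# Paste the columns together row by row
res <- do.call(paste0, grid)

length(res)  # 3 * 4 * 3 combinations
head(res, 4)
```

Because expand.grid varies its first argument fastest, the results start with AaX, BaX, CaX before the second position changes.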

6 upvotes Maël 12/7/2022 #2

A stringr solution:

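library(stringr)
x <- '[ABC][abcd][XYZ]'
# Extract the text between each pair of brackets (lookbehind/lookahead)
str_extract_all(x,"(?<=\\[).+?(?=\\])", simplify = TRUE) |>
  str_split("") |>         # split each group into single characters
  expand.grid() |>         # every combination, one per row
  do.call(what = paste0)   # paste each row into one string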

# [1] "AaX" "BaX" "CaX" "AbX" "BbX" "CbX" "AcX" "BcX" "CcX" "AdX" "BdX" "CdX" "AaY" "BaY" "CaY" "AbY" "BbY" "CbY" "AcY" "BcY"
#[21] "CcY" "AdY" "BdY" "CdY" "AaZ" "BaZ" "CaZ" "AbZ" "BbZ" "CbZ" "AcZ" "BcZ" "CcZ" "AdZ" "BdZ" "CdZ"

This also works, using interaction:

library(stringr)
x <- '[ABC][abcd][XYZ]'
str_extract_all(x,"(?<=\\[).+?(?=\\])", simplify = TRUE) |>
  str_split("") |>
  # The levels of the interaction are all combinations of the characters
  interaction(sep = "") |> levels()

# [1] "AaX" "BaX" "CaX" "AbX" "BbX" "CbX" "AcX" "BcX" "CcX" "AdX" "BdX" "CdX" "AaY" "BaY" "CaY" "AbY" "BbY" "CbY" "AcY" "BcY"
#[21] "CcY" "AdY" "BdY" "CdY" "AaZ" "BaZ" "CaZ" "AbZ" "BbZ" "CbZ" "AcZ" "BcZ" "CcZ" "AdZ" "BdZ" "CdZ"
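
Since the question asks for a Tidyverse approach, one possible variant swaps expand.grid for tidyr's expand_grid. This is a sketch under a few assumptions rather than part of the answers above: the names sets, combos, res, and the column names pos1, pos2, pos3 are made up for illustration, and the inputs are named explicitly so that each becomes a column.

```r
library(stringr)
library(tidyr)

x <- '[ABC][abcd][XYZ]'

# Text between each pair of brackets, then one character vector per position
sets <- str_extract_all(x, "(?<=\\[).+?(?=\\])")[[1]] |>
  str_split("")

# Hypothetical column names, one per position
names(sets) <- paste0("pos", seq_along(sets))

# Tibble with one row per combination
combos <- do.call(expand_grid, sets)

# Paste the columns together row by row
res <- do.call(paste0, combos)

length(res)
```

Unlike expand.grid, expand_grid varies the first column slowest, so the same 36 strings come back in a different order (AaX, AaY, AaZ, AbX, and so on).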