Remove specific values from each element of a nested list

Question:

I have a nested list that looks like this:

nested.list <- list(c(46270L, 103154L, 159944L, 193405L, 199925L), c(24049L, 
  30454L, 55710L, 106407L, 122059L, 174131L), c(14520L, 46270L, 
  153636L, 188626L, 199925L), c(8150L, 24049L, 27321L, 30461L, 
  33513L, 55710L, 58933L, 71342L, 103154L, 122059L, 159920L, 169516L, 
  174131L), c(19195L, 71333L, 122059L, 137645L, 153636L, 183740L, 
  195065L, 199925L), c(14520L, 60368L, 80939L, 82381L, 95070L, 
  103172L, 106379L, 147215L, 166353L, 199925L), c(30461L, 68324L, 
  75981L, 77674L, 106407L, 120284L), c(24029L, 72751L, 103154L, 
  120284L, 142359L))

> nested.list
[[1]]
[1]  46270 103154 159944 193405 199925

[[2]]
[1]  24049  30454  55710 106407 122059 174131

[[3]]
[1]  14520  46270 153636 188626 199925

[[4]]
 [1]   8150  24049  27321  30461  33513  55710  58933  71342 103154 122059 159920 169516 174131

[[5]]
[1]  19195  71333 122059 137645 153636 183740 195065 199925

[[6]]
 [1]  14520  60368  80939  82381  95070 103172 106379 147215 166353 199925

[[7]]
[1]  30461  68324  75981  77674 106407 120284

[[8]]
[1]  24029  72751 103154 120284 142359

Within each element of the list, I want to keep only these numbers. (Edited to include a more illustrative list.)

target.values <- c(24029, 33513, 60368, 106407, 147215, 153636, 159920, 193405)

I tried the following, but I am clearly missing something about how it works:

purrr::keep(.x = nested.list, .p = function(x){all(x %in% target.values)})
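A likely reason the attempt above returns nothing useful: `keep()` called on the outer list evaluates the predicate once per vector and keeps or drops that whole vector. Since no vector consists entirely of target values, `all(...)` is `FALSE` every time. The sketch below illustrates this on the first three vectors of the question's data (`small.list` is a name introduced here only for illustration) and contrasts it with filtering inside each vector.

```r
library(purrr)

# First three vectors from the question (small.list is a hypothetical name)
small.list <- list(c(46270L, 103154L, 159944L, 193405L, 199925L),
                   c(24049L, 30454L, 55710L, 106407L, 122059L, 174131L),
                   c(14520L, 46270L, 153636L, 188626L, 199925L))
target.values <- c(24029, 33513, 60368, 106407, 147215, 153636, 159920, 193405)

# keep() on the outer list: the predicate is evaluated once per vector,
# and all() is FALSE for each of them, so every vector is dropped.
attempt <- keep(small.list, function(x) all(x %in% target.values))
length(attempt)  # 0

# Filtering inside each vector instead keeps the matching values.
filtered <- map(small.list, function(x) x[x %in% target.values])
filtered  # 193405, 106407 and 153636, one value per element
```

The key difference is the level at which the predicate is applied: to each vector as a whole, or to each value inside a vector.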

Tags: r, purrr, nested list

library(purrr)           # map(), keep() used in f5
library(microbenchmark)  # microbenchmark()

set.seed(0)
nested.list <- replicate(1000, sample.int(100, sample.int(100, 1)))
target.values <- sample.int(100, 50)

f1 <- function() {
    lapply(nested.list, \(x) x[x %in% target.values])
}

f2 <- function() {
    lapply(nested.list, intersect, target.values)
}

f3 <- function() {
    Map(intersect, nested.list, list(target.values))
}

f4 <- function() {
    rapply(nested.list, \(z) z[z %in% target.values], how = "list")
}

f5 <- function() {
    map(nested.list, keep, \(x) x %in% target.values)
}

microbenchmark(
    f1 = f1(),
    f2 = f2(),
    f3 = f3(),
    f4 = f4(),
    f5 = f5(),
    unit = "relative",
    check = "equivalent",
    times = 50L
)

This gives:

Unit: relative
 expr        min        lq      mean    median        uq       max neval
   f1  1.0000000  1.000000  1.000000  1.000000  1.000000  1.000000    50
   f2  4.1351785  4.099845  4.127982  4.088224  4.117141  3.649927    50
   f3  4.2296240  4.291979  4.842893  4.408095  4.817794 13.234459    50
   f4  0.9750334  0.979558  1.097199  1.010855  1.085660  1.760977    50
   f5 71.8225616 69.069376 64.921864 67.952181 64.960019 47.465078    50

We can see that lapply or rapply with %in% should be the most efficient approach, while keep is the least efficient.
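Beyond speed, there is one behavioural difference worth keeping in mind: subsetting with `%in%` preserves repeated values, whereas `intersect()` returns each matching value only once. The benchmark's `check = "equivalent"` passes because `sample.int()` draws without replacement, so the test vectors contain no duplicates. A minimal sketch with made-up values (`x` and `target` are illustrative only):

```r
x <- c(1, 1, 2, 5)       # note the repeated 1
target <- c(1, 2, 3)

by_in <- x[x %in% target]        # keeps every matching element
by_set <- intersect(x, target)   # set semantics: duplicates are dropped

by_in   # 1 1 2
by_set  # 1 2
```

If the vectors can contain repeated values and those repeats matter, the `%in%` form is probably the safer choice.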

3 upvotes · jblood94 · 10/24/2023 · #5

You can vectorize the comparison to get some speed-up for lists that contain a large number of vectors.

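f6 <- function(x, target) {
  u <- unlist(x)             # flatten all vectors into one
  i <- which(u %in% target)  # a single vectorized membership test
  # group id of each flattened value; unused factor levels keep empty results
  g <- rep(factor(seq_along(x)), lengths(x))
  setNames(split(u[i], g[i]), names(x))
}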
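To see how the flatten-and-split idea works, here is a step-by-step sketch on a tiny made-up list (`x`, `target`, `g` and `res` are illustrative names, not part of the original answer). The grouping factor records which list element each flattened value came from, and because `split()` keeps unused factor levels, an element with no matches comes back as an empty vector rather than disappearing.

```r
x <- list(c(1L, 2L, 3L), c(4L, 5L), c(2L, 6L))
target <- c(2L, 3L, 6L)

u <- unlist(x)                              # 1 2 3 4 5 2 6
g <- rep(factor(seq_along(x)), lengths(x))  # 1 1 1 2 2 3 3 (as a factor)
i <- which(u %in% target)                   # positions 2 3 6 7

# split the matching values by their original element; level "2" has no
# matches, so its slot is integer(0)
res <- unname(split(u[i], g[i]))
res
```

Here `res` holds `2 3` for the first element, an empty integer vector for the second, and `2 6` for the third.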

Using @ThomasIsCoding's benchmark:

microbenchmark::microbenchmark(
  f1 = f1(),
  f2 = f2(),
  f6 = f6(nested.list, target.values),
  unit = "relative",
  check = "equivalent"
)
#> Unit: relative
#>  expr      min       lq     mean   median       uq      max neval
#>    f1 2.106476 1.893394 2.113900 1.895352 1.883364 3.380201   100
#>    f2 6.219841 5.649225 6.325568 5.629023 5.760277 6.345773   100
#>    f6 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000   100
