I'm trying to find a way to loop through a series of numbers until all the numbers in the list are the same as the last number before a missing or NA

Q:

Let's take these numbers as an example, since I can't share the actual data: end=c(1,3,6,NA,6,7,8,NA,12,23,NA). I need to convert it to end=c(6,6,6,NA,8,8,8,NA,23,23,NA).

`repeat{
  end[k]=end[k+1]
  k=k+1
  if(is.na(end[k+1]) | k==length(end)){
    
    break
  }
}`

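I have tried this as a first step, but I'm stuck on how to keep looping while skipping the NAs. I tried wrapping it in the outer loop below, but it doesn't give me the output I need.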

`j=1
while (j <= length(end)){
k <- j
repeat{
  end[k]=end[k+1]
  k=k+1
  if(is.na(end[k+1]) | k==length(end)){
    
    break
  }
}
 j=j+1
}`

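Does anyone know how to iterate over these numbers so that the loop stops temporarily at each NA and all the numbers before it are converted to the last number before that NA?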

r loops nested na

Solution

Replace the values between NAs with the value right before each NA. This is the fastest solution (see the benchmark).

na_pos <- which(is.na(end))
end[-na_pos] <- rep(end[na_pos-1], diff(c(0,na_pos))-1)
end
#> [1]  6  6  6 NA  8  8  8 NA 23 23 NA
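To see why those two lines work, it can help to print the intermediate pieces. The sketch below uses the same example vector; the names `fill` and `runlen` are illustrative only and do not appear in the answer.

```r
end <- c(1, 3, 6, NA, 6, 7, 8, NA, 12, 23, NA)

na_pos <- which(is.na(end))        # positions of the NAs: 4 8 11
fill   <- end[na_pos - 1]          # value just before each NA: 6 8 23
runlen <- diff(c(0, na_pos)) - 1   # count of non-NA values in front of each NA: 3 3 2

# Each fill value is repeated once per element of its run
rep(fill, runlen)
#> [1]  6  6  6  8  8  8 23 23
```

The repeated values are then written into every non-NA position via `end[-na_pos]`, which is why the NAs themselves stay untouched.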

This solution will not work if `end` does not end with NA. So, to be safe, you can modify it as follows:

na_pos <- union(which(is.na(end)), length(end)+1)
end[-na_pos] <- rep(end[na_pos-1], diff(c(0,na_pos))-1)
end
#> [1]  6  6  6 NA  8  8  8 NA 23 23 NA

Although this version is slightly slower, it is still the fastest solution.
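As a quick check of the safer variant, here is a sketch on a made-up vector `x` that does not end with NA (the input is an assumption for illustration, not data from the question). The extra position `length(x) + 1` behaves like a virtual NA just past the end.

```r
x <- c(1, 3, 6, NA, 6, 7)   # hypothetical input with no trailing NA

# 4 is the real NA; 7 is the virtual NA one past the end
na_pos <- union(which(is.na(x)), length(x) + 1)
x[-na_pos] <- rep(x[na_pos - 1], diff(c(0, na_pos)) - 1)
x
#> [1]  6  6  6 NA  7  7
```

A vector that starts with NA is a separate case that this sketch does not cover.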

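Solution with a for loop

To help you along, here is a working loop solution. Go from the end to the start: whenever both the current value and the one before it are non-NA, replace the earlier value with the current one.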

for(i in seq(length(end), 2)){
  
  if(!is.na(end[i]) & !is.na(end[i-1])) end[i-1] <- end[i]
  
}
end
#> [1]  6  6  6 NA  8  8  8 NA 23 23 NA
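If the loop is going to be reused, one option is to wrap it in a function. The sketch below is an assumption about how that might look: `fill_back_loop` is a hypothetical name, and the length guard and the scalar `&&` are small additions that are not part of the answer's code.

```r
# Hypothetical helper; the name is illustrative, not from the answer
fill_back_loop <- function(x) {
  # seq(length(x), 2) would count upwards for a length-1 input, so return early
  if (length(x) < 2) return(x)
  for (i in seq(length(x), 2)) {
    # both operands are single values, so the scalar `&&` is enough
    if (!is.na(x[i]) && !is.na(x[i - 1])) x[i - 1] <- x[i]
  }
  x
}

res <- fill_back_loop(c(1, 3, 6, NA, 6, 7, 8, NA, 12, 23, NA))
# res holds 6 6 6 NA 8 8 8 NA 23 23 NA
```

Because the function works on its argument, the original vector is left unchanged and the result has to be assigned.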

Benchmark

In this example, the `napos` method is by far the fastest. I also took the liberty of comparing it with the other solutions.

library(dplyr)

microbenchmark::microbenchmark(
  
  napos = {
    end <- c(1,3,6,NA,6,7,8,NA,12,23,NA)
    na_pos <- union(which(is.na(end)), length(end)+1)
    end[-na_pos] <- rep(end[na_pos-1], diff(c(0,na_pos))-1)},
  
  forloop = {
    end <- c(1,3,6,NA,6,7,8,NA,12,23,NA)
    for(i in seq(length(end), 2)){

      if(!is.na(end[i]) & !is.na(end[i-1])) end[i-1] <- end[i]

    }
  },
  cumsum = {
    end <- c(1,3,6,NA,6,7,8,NA,12,23,NA)
    grp <- cumsum(is.na(end))
    end <- ave(end, grp, FUN=function(x) ifelse(!is.na(x), tail(x, 1), NA))
  },
  
  rle = {
    
    end <- c(1,3,6,NA,6,7,8,NA,12,23,NA)
    end <- ave(end, 
        with(rle(is.na(end)), rep(seq_along(values), lengths)),
        FUN = function(x) tail(x, 1))
  },
  
  dplyr = {
  
    end <- tibble(end = c(1,3,6,NA,6,7,8,NA,12,23,NA)) |> 
      mutate(
        section = if_else(is.na(end), NA_integer_, cumsum(is.na(end)))
      ) |> 
      group_by(section) |> 
      mutate(
        end = tail(end, 1) # or alternatively last element
      ) |> 
      pull(end)
})
#> Unit: microseconds
#>     expr     min       lq      mean   median       uq     max neval
#>    napos    23.4    41.65    63.071    64.25    77.70   133.0   100
#>  forloop  5430.0  6154.35  7307.699  6729.85  7359.55 15330.1   100
#>   cumsum    98.5   143.40   227.069   195.55   254.00  2614.1   100
#>      rle   134.3   192.05   270.753   259.20   291.65  1964.7   100
#>    dplyr 13386.2 14998.55 16530.285 15647.60 16779.35 40551.9   100

Created on 2023-03-21 with reprex v2.0.2
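Timings only matter if the approaches return the same thing. A minimal sketch of that check for the three base R variants is below; the `res_*` names are illustrative, and the dplyr variant is left out so that the snippet needs no packages.

```r
x <- c(1, 3, 6, NA, 6, 7, 8, NA, 12, 23, NA)

# napos approach
res_napos <- x
na_pos <- union(which(is.na(res_napos)), length(res_napos) + 1)
res_napos[-na_pos] <- rep(res_napos[na_pos - 1], diff(c(0, na_pos)) - 1)

# cumsum approach: each NA opens a new group
res_cumsum <- ave(x, cumsum(is.na(x)),
                  FUN = function(v) ifelse(!is.na(v), tail(v, 1), NA))

# rle approach: every run of NA / non-NA values is its own group
res_rle <- ave(x, with(rle(is.na(x)), rep(seq_along(values), lengths)),
               FUN = function(v) tail(v, 1))

# Stops with an error if any pair of results differs
stopifnot(isTRUE(all.equal(res_napos, res_cumsum)),
          isTRUE(all.equal(res_napos, res_rle)))
```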

0 votes tmfmnk 3/21/2023 #3

A variation on @MrFlick's answer could be:

ave(end, 
    with(rle(is.na(end)), rep(seq_along(values), lengths)),
    FUN = function(x) tail(x, 1))

 [1]  6  6  6 NA  8  8  8 NA 23 23 NA
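The grouping vector passed to `ave()` is the part that is easy to misread, so here is a small sketch that builds it step by step. The names `runs` and `grp` are illustrative.

```r
end <- c(1, 3, 6, NA, 6, 7, 8, NA, 12, 23, NA)

runs <- rle(is.na(end))   # alternating runs of non-NA and NA values
runs$lengths
#> [1] 3 1 3 1 2 1

# One group id per run, repeated to the run's length
grp <- rep(seq_along(runs$values), runs$lengths)
# grp is 1 1 1 2 3 3 3 4 5 5 6
```

Each NA ends up alone in its own group, so `tail(x, 1)` returns NA there and the last value of the run everywhere else.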

0 votes Hauke L. 3/21/2023 #4

An idea using dplyr. First create the groups, then use them to extract the desired number.

library(tidyverse)

data <-
  tibble(end = c(1,3,6,NA,6,7,8,NA,12,23,NA)) |> 
  mutate(
    section = if_else(is.na(end), NA_integer_, cumsum(is.na(end)))
  )

data2 <- 
  data |> 
  group_by(section) |> 
  mutate(
    end2 = max(end), # use maximum
    end3 = tail(end, 1) # or alternatively last element
  )
