Column indexing in tidyverse, replacing a working for loop

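I would like to convert a for loop into a tidyverse workflow, since I generally follow the tidyverse approach and like working with pipes. I can easily achieve what I want with a for loop.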

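Here is a minimal reproducible example in which I identify, for each row, the column (column number) where a 1 first occurs. I then use that to index my matrix and set every value before the first 1 in each row to NA.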

# set seed

# create a 10 x 10 matrix of 0s and 1s
matrix <- matrix(rbinom(100, 1, 0.4), nrow = 10, ncol = 10)

# for each row, get the index of the first non-zero column
first <- apply(matrix, 1, function(x) min(which(x != 0)))

# set all columns before the indexed column to NA
for (i in seq_len(nrow(matrix))) {
  if (first[i] > 1) matrix[i, 1:(first[i] - 1)] <- NA
}

for-loop matrix dplyr tidyverse data-manipulation



1 upvote Mark 7/28/2023 #1


library(dplyr)
library(stringr)

matrix |>
  as.data.frame() |>
  rowwise() |>
  mutate(first = which.max(c_across(everything())),
         # this is quite janky. If anyone knows a way of selecting column numbers within an across call, please comment with it
         across(-first, ~ if_else(as.numeric(str_extract(cur_column(), "\\d+")) < first, NA, .))) |>
  ungroup() |>
  select(-first) # assumed final step: drop the helper column
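
On the question raised in the comment above, one possible way to get a column's position inside `across()` without parsing digits out of its name is to look the name up in a stored vector of column names with `match()`. The following is only a sketch under stated assumptions: the seed value and the names `m`, `df`, `cols`, and `result` are made up for illustration, and `NA_integer_` is used so that it also runs on dplyr versions where `if_else()` is strict about types.

```r
library(dplyr)

# Arbitrary seed for this sketch only; the seed used in the question is not shown.
set.seed(42)
m <- matrix(rbinom(100, 1, 0.4), nrow = 10, ncol = 10)

df <- as.data.frame(m)
cols <- names(df)  # "V1" ... "V10", in column order

result <- df |>
  rowwise() |>
  # position of the first 1 in each row (which.max returns the first maximum)
  mutate(first = which.max(c_across(all_of(cols)))) |>
  ungroup() |>
  # match() turns the current column name into its position in `cols`
  mutate(across(all_of(cols),
                ~ if_else(match(cur_column(), cols) < first, NA_integer_, .x))) |>
  select(-first)

result
```

Note that `which.max()` returns 1 for a row containing only zeros, so such a row is left unchanged here; whether that is the desired behaviour depends on the data.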


1 upvote benson23 7/28/2023 #2


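The core idea here is to use cumsum to return the cumulative sum of each original row. If the cumulative sum is zero, replace the value with NA; otherwise leave it as it is.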
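
To see why this works, the idea can be tried on a single row first. This is just an illustrative sketch in base R; the vector `row` and the names `running` and `masked` are made up for the example.

```r
# One example row: the first 1 is in position 3.
row <- c(0L, 0L, 1L, 0L, 1L)

# The cumulative sum stays at 0 until the first 1 is reached.
running <- cumsum(row)
running  # 0 0 1 1 2

# Values that precede the first 1 become NA, the rest are kept.
masked <- ifelse(running == 0, NA, row)
masked   # NA NA 1 0 1
```

This relies on the values being only 0 and 1 (or at least non-negative), so the cumulative sum can never return to zero after the first 1.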


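library(dplyr)
library(tidyr)

as.data.frame(matrix) %>% 
  mutate(rn = row_number()) %>%   # remember the original row
  pivot_longer(-rn) %>%           # one row per cell
  mutate(value = ifelse(cumsum(value) == 0, NA, value), .by = rn) %>% 
  pivot_wider() %>%               # back to columns V1 ... V10
  select(-rn)                     # drop the helper column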

# A tibble: 10 × 10
      V1    V2    V3    V4    V5    V6    V7    V8    V9   V10
   <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
 1    NA     1     1     0     0     0     0     0     1     0
 2    NA    NA     1     0     0     1     0     0     0     1
 3    NA    NA     1     0     0     0     0     1     0     0
 4     1     0     0     1     1     0     0     0     1     0
 5    NA    NA    NA    NA    NA    NA     1     0     0     0
 6    NA    NA     1     1     0     0     1     0     0     1
 7    NA    NA     1     1     0     0     0     0     1     0
 8    NA    NA    NA     1     0     0     0     1     0     0
 9     1     0     1     1     0     0     0     0     0     1
10    NA     1     0     0     1     0     0     0     0     0