Expand a matrix of rankings (1 ~ 4) to a bigger binary matrix

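Q:

I have a matrix that I would like to convert to a matrix with binary output (0 vs 1). The matrix to be converted contains four rows of rankings (1 to 4):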

mat1.data <- c(4,   3,  3,  3,  3,  2,  2,  1,  1,  1,
               3,   4,  2,  4,  2,  3,  1,  3,  3,  2,
               2,   2,  4,  1,  1,  1,  4,  4,  2,  4,
               1,   1,  1,  2,  4,  4,  3,  2,  4,  3)
mat1 <- matrix(mat1.data,nrow=4,ncol=10,byrow=TRUE)
mat1
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    4    3    3    3    3    2    2    1    1     1
[2,]    3    4    2    4    2    3    1    3    3     2
[3,]    2    2    4    1    1    1    4    4    2     4
[4,]    1    1    1    2    4    4    3    2    4     3

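For each row in the input matrix, I would like to create four binary rows, one for each value of the rankings (1-4). In the binary matrix, each row-wise entry is 1 at the positions where the focal rank occurs in the input matrix, and 0 otherwise. Each row of the original matrix should therefore produce 10*4=40 entries in the output matrix.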

For example, for the first row in the input matrix...

4   3   3   3   3   2   2   1   1   1

...the output should be:

0   0   0   0   0   0   0   1   1   1 # Rank 1 in input
0   0   0   0   0   1   1   0   0   0 # Rank 2 in input
0   1   1   1   1   0   0   0   0   0 # Rank 3 in input
1   0   0   0   0   0   0   0   0   0 # Rank 4 in input

Continuing this procedure, the expected output for all four rows of rankings should look like this:

0   0   0   0   0   0   0   1   1   1 #first row of rankings starts
0   0   0   0   0   1   1   0   0   0
0   1   1   1   1   0   0   0   0   0
1   0   0   0   0   0   0   0   0   0 #first row of rankings ends
0   0   0   0   0   0   1   0   0   0 #second row of rankings starts
0   0   1   0   1   0   0   0   0   1
1   0   0   0   0   1   0   1   1   0
0   1   0   1   0   0   0   0   0   0 #second row of rankings ends
0   0   0   1   1   1   0   0   0   0 #third row of rankings starts
1   1   0   0   0   0   0   0   1   0
0   0   0   0   0   0   0   0   0   0
0   0   1   0   0   0   1   1   0   1 #third row of rankings ends
1   1   1   0   0   0   0   0   0   0 #fourth row of rankings starts
0   0   0   1   0   0   0   1   0   0
0   0   0   0   0   0   1   0   0   1
0   0   0   0   1   1   0   0   1   0 #fourth row of rankings ends

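How can I achieve this? I have a bigger dataset, so a more efficient method is preferred, but any help would be greatly appreciated!

Tags: R, matrix, data manipulation, transformation, ranking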

matrix(sapply(mat1, \(i) replace(numeric(4), i, 1)), ncol = ncol(mat1))
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,]    0    0    0    0    0    0    0    1    1     1
# [2,]    0    0    0    0    0    1    1    0    0     0
# [3,]    0    1    1    1    1    0    0    0    0     0
# [4,]    1    0    0    0    0    0    0    0    0     0
# [5,]    0    0    0    0    0    0    1    0    0     0
# [6,]    0    0    1    0    1    0    0    0    0     1
# [7,]    1    0    0    0    0    1    0    1    1     0
# [8,]    0    1    0    1    0    0    0    0    0     0
# [9,]    0    0    0    1    1    1    0    0    0     0
#[10,]    1    1    0    0    0    0    0    0    1     0
#[11,]    0    0    0    0    0    0    0    0    0     0
#[12,]    0    0    1    0    0    0    1    1    0     1
#[13,]    1    1    1    0    0    0    0    0    0     0
#[14,]    0    0    0    1    0    0    0    1    0     0
#[15,]    0    0    0    0    0    0    1    0    0     1
#[16,]    0    0    0    0    1    1    0    0    1     0

This takes two steps, so the pipe syntax may look clearer:

sapply(mat1, \(i) replace(numeric(4), i, 1)) |>  ## each value to binary vector
  matrix(ncol = ncol(mat1))  ## reshape

Actually, I don't need the anonymous function \(i). I can pass replace, along with its arguments, directly to sapply.

matrix(sapply(mat1, replace, x = numeric(4), values = 1), ncol = ncol(mat1))

sapply(mat1, replace, x = numeric(4), values = 1) |> matrix(ncol = ncol(mat1))
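
To see why the sapply + matrix combination works, here is a minimal sketch on a small made-up input. The matrix toy and the variable names are assumptions for illustration only; they do not come from the question. Each value becomes a length-4 indicator vector, sapply binds those vectors as columns in column-major order, and reshaping stacks the indicator blocks of one input column on top of each other.

```r
# Hypothetical toy input (not from the question): 2 rows, 3 columns, ranks in 1..4.
# matrix() fills column by column, so the columns are (4, 1), (2, 3), (3, 3).
toy <- matrix(c(4, 1,
                2, 3,
                3, 3), nrow = 2)

# One value -> one indicator vector: a 1 at position `i`, 0 elsewhere.
one_hot <- replace(numeric(4), 3, 1)

# sapply visits the matrix in column-major order and returns a 4 x 6 matrix,
# one indicator column per input element.
wide <- sapply(toy, replace, x = numeric(4), values = 1)

# Reshape to ncol(toy) columns: each output column now holds the stacked
# indicator blocks of the corresponding input column (2 blocks of 4 rows).
res <- matrix(wide, ncol = ncol(toy))
print(res)
```

Because the reshape relies only on column-major storage, the same two steps should carry over to inputs with a different number of rows or columns, as long as every value is a valid index into numeric(4).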

Miscellaneous

user20650 and I discussed this a bit in the comments. Here is a "vectorized" approach using outer:

matrix(+outer(1:4, c(mat1), "=="), ncol = ncol(mat1))
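
As a sanity check, the sketch below unpacks that one-liner and compares it with the sapply version on the matrix from the question. The intermediate names (hits, via_outer, via_sapply, same) are made up for illustration.

```r
# Input matrix from the question.
mat1 <- matrix(c(4, 3, 3, 3, 3, 2, 2, 1, 1, 1,
                 3, 4, 2, 4, 2, 3, 1, 3, 3, 2,
                 2, 2, 4, 1, 1, 1, 4, 4, 2, 4,
                 1, 1, 1, 2, 4, 4, 3, 2, 4, 3),
               nrow = 4, byrow = TRUE)

# outer() compares every rank 1..4 with every element of mat1 (column-major),
# giving a 4 x 40 logical matrix: column k is the indicator vector of element k.
hits <- outer(1:4, c(mat1), "==")

# Unary plus turns TRUE/FALSE into 1/0; matrix() then reshapes to 16 x 10.
via_outer <- matrix(+hits, ncol = ncol(mat1))

# The sapply-based version from earlier in this answer, for comparison.
via_sapply <- matrix(sapply(mat1, replace, x = numeric(4), values = 1),
                     ncol = ncol(mat1))

same <- all(via_outer == via_sapply)
print(same)
```

One difference worth knowing: the outer version stores integers, while the sapply version stores doubles, so identical() would report a mismatch even though the values agree.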

Henrik's answer is a more memory-efficient "vectorized" approach, but it over-complicates the index computation. Here is something simpler:

out <- matrix(0, nrow(mat1) * 4, ncol(mat1))
pos1 <- seq(0, length(mat1) - 1) * 4 + c(mat1)
out[pos1] <- 1
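
The arithmetic behind pos1 is that the k-th element of mat1 (in column-major order) owns the linear positions 4 * (k - 1) + 1 through 4 * k of out, so its 1 belongs at 4 * (k - 1) + rank. A minimal sketch of the same idea for an arbitrary number of ranks follows; the helper name expand_ranks, its K argument, and the toy input are hypothetical and only meant as an illustration.

```r
# Hypothetical helper: expand a matrix of ranks in 1..K into a
# (K * nrow) x ncol indicator matrix using linear indexing only.
expand_ranks <- function(m, K = max(m)) {
  out <- matrix(0, nrow(m) * K, ncol(m))
  # Element k of `m` owns a block of K consecutive cells in `out`;
  # the offset inside that block is the rank itself.
  pos <- (seq_along(m) - 1) * K + c(m)
  out[pos] <- 1
  out
}

# Made-up 2 x 2 input with ranks in 1..3; columns are (1, 3) and (2, 2).
toy <- matrix(c(1, 3, 2, 2), nrow = 2)
res <- expand_ranks(toy, K = 3)
print(res)
```

Only one vector of length(m) positions is built here, which is where the memory saving over the 4 x length(mat1) intermediate of the sapply and outer versions comes from.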

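So far, all approaches create a dense output matrix. This is fine, because 25% of the elements are non-zero, which is usually not considered sparse. However, if we want a sparse matrix, that is also simple: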

## in fact, this is what Henrik aims to compute
ij <- arrayInd(pos1, c(4 * nrow(mat1), ncol(mat1)))
## sparse matrix
Matrix::sparseMatrix(i = ij[, 1], j = ij[, 2], x = rep(1, length(mat1)))
#16 x 10 sparse Matrix of class "dgCMatrix"
#                         
# [1,] . . . . . . . 1 1 1
# [2,] . . . . . 1 1 . . .
# [3,] . 1 1 1 1 . . . . .
# [4,] 1 . . . . . . . . .
# [5,] . . . . . . 1 . . .
# [6,] . . 1 . 1 . . . . 1
# [7,] 1 . . . . 1 . 1 1 .
# [8,] . 1 . 1 . . . . . .
# [9,] . . . 1 1 1 . . . .
#[10,] 1 1 . . . . . . 1 .
#[11,] . . . . . . . . . .
#[12,] . . 1 . . . 1 1 . 1
#[13,] 1 1 1 . . . . . . .
#[14,] . . . 1 . . . 1 . .
#[15,] . . . . . . 1 . . 1
#[16,] . . . . 1 1 . . 1 .

4 upvotes Henrik 7/16/2022 #2

Using row, col, and matrix indexing:

m = matrix(0, nr = 4 * nrow(mat1), nc = ncol(mat1))
m[cbind(c(row(mat1) + seq(0, by = (4 - 1), len = nrow(mat1)) + (mat1 - 1)), 
        c(col(mat1)))] = 1

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0    0    0    0    0    0    1    1     1
 [2,]    0    0    0    0    0    1    1    0    0     0
 [3,]    0    1    1    1    1    0    0    0    0     0
 [4,]    1    0    0    0    0    0    0    0    0     0
 [5,]    0    0    0    0    0    0    1    0    0     0
 [6,]    0    0    1    0    1    0    0    0    0     1
 [7,]    1    0    0    0    0    1    0    1    1     0
 [8,]    0    1    0    1    0    0    0    0    0     0
 [9,]    0    0    0    1    1    1    0    0    0     0
[10,]    1    1    0    0    0    0    0    0    1     0
[11,]    0    0    0    0    0    0    0    0    0     0
[12,]    0    0    1    0    0    0    1    1    0     1
[13,]    1    1    1    0    0    0    0    0    0     0
[14,]    0    0    0    1    0    0    0    1    0     0
[15,]    0    0    0    0    0    0    1    0    0     1
[16,]    0    0    0    0    1    1    0    0    1     0
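
The row index in the code above can be read as 4 * (i - 1) + rank for an entry in input row i: row(mat1) contributes i, the seq() term contributes 3 * (i - 1), and mat1 - 1 contributes rank - 1. The sketch below checks that reading on the question's matrix; the names K, rows_henrik, and rows_simple are introduced here for illustration only.

```r
# Input matrix from the question.
mat1 <- matrix(c(4, 3, 3, 3, 3, 2, 2, 1, 1, 1,
                 3, 4, 2, 4, 2, 3, 1, 3, 3, 2,
                 2, 2, 4, 1, 1, 1, 4, 4, 2, 4,
                 1, 1, 1, 2, 4, 4, 3, 2, 4, 3),
               nrow = 4, byrow = TRUE)
K <- 4  # number of distinct ranks

# Row index as written in the answer: i + (K - 1) * (i - 1) + (rank - 1).
# The length-4 seq() vector is recycled down each column, so it lines up with i.
rows_henrik <- row(mat1) +
  seq(0, by = K - 1, length.out = nrow(mat1)) +
  (mat1 - 1)

# The same quantity written as "start of block i, plus rank".
rows_simple <- (row(mat1) - 1) * K + mat1

# Two-column (row, col) matrix indexing sets one cell per input element.
m <- matrix(0, nrow = K * nrow(mat1), ncol = ncol(mat1))
m[cbind(c(rows_simple), c(col(mat1)))] <- 1

print(all(rows_henrik == rows_simple))
```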

2 upvotes ThomasIsCoding 7/17/2022 #3

Perhaps we can benefit from using kronecker + rep, as shown below:

> +(kronecker(mat1, matrix(rep(1, 4))) == rep(1:4, nrow(mat1)))
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0    0    0    0    0    0    1    1     1
 [2,]    0    0    0    0    0    1    1    0    0     0
 [3,]    0    1    1    1    1    0    0    0    0     0
 [4,]    1    0    0    0    0    0    0    0    0     0
 [5,]    0    0    0    0    0    0    1    0    0     0
 [6,]    0    0    1    0    1    0    0    0    0     1
 [7,]    1    0    0    0    0    1    0    1    1     0
 [8,]    0    1    0    1    0    0    0    0    0     0
 [9,]    0    0    0    1    1    1    0    0    0     0
[10,]    1    1    0    0    0    0    0    0    1     0
[11,]    0    0    0    0    0    0    0    0    0     0
[12,]    0    0    1    0    0    0    1    1    0     1
[13,]    1    1    1    0    0    0    0    0    0     0
[14,]    0    0    0    1    0    0    0    1    0     0
[15,]    0    0    0    0    0    0    1    0    0     1
[16,]    0    0    0    0    1    1    0    0    1     0
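
To make the two ingredients visible, here is a small sketch on a made-up 2 x 2 input with three ranks. The names toy, K, stretched, and res are assumptions for illustration. kronecker with a K x 1 column of ones repeats every entry K times vertically, and the rep(1:K, ...) vector is recycled down each column during the comparison.

```r
# Made-up 2 x 2 input with ranks in 1..3; columns are (1, 3) and (2, 2).
toy <- matrix(c(1, 3, 2, 2), nrow = 2)
K <- 3

# Each entry is stretched into a vertical run of K copies: 6 x 2 result.
stretched <- kronecker(toy, matrix(rep(1, K)))

# rep(1:K, nrow(toy)) is 1 2 3 1 2 3; it has as many elements as `stretched`
# has rows, so the recycling restarts at the top of every column.
res <- +(stretched == rep(1:K, nrow(toy)))
print(res)
```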
