Specifying the variable names when converting a vector to a matrix in R

Asked by: Simon Harmel  Asked: 9/13/2023  Last edited by: Simon Harmel  Updated: 9/18/2023  Views: 74

Q:

I have a named numeric vector called Rs. The vector has 28 elements, like so: c(L2DA.L2DF= .637, L2DA.L2G= 0.553,...)

I can convert this named numeric vector to an 8x8 correlation matrix using metafor::vec2mat(Rs) (see below).

Question: How can I assign row and column names to this correlation matrix so that they reflect the names in my original named numeric vector?

For example, in the matrix, element [2,1] = 0.637 comes from the first element of my vector, so it should have colname L2DA and rowname L2DF, and so on.

library(metafor)

dat <- read.csv("https://raw.githubusercontent.com/ilzl/i/master/j.csv")

dat$var1.var2 <- apply(dat[c("var1","var2")],1,paste0,collapse=".")

res <- rma(ri~var1.var2+0,1, data=dat)

(Rs = setNames(coef(res),sub("var1.var2","",names(coef(res)))))

(R_matrix = vec2mat(Rs))

          [,1]  [,2]      [,3]      [,4]      [,5]      [,6]      [,7]      [,8]
[1,] 1.0000000 0.637 0.5533333 0.4180000 0.5550000 0.5678947 0.4781481 0.3675000
[2,] 0.6370000 1.000 0.2440000 0.2900000 0.4840000 0.3500000 0.4750000 0.5700000
[3,] 0.5533333 0.244 1.0000000 0.2933333 0.5100000 0.3300000 0.4775000 0.5765714
[4,] 0.4180000 0.290 0.2933333 1.0000000 0.4627778 0.5212121 0.5569565 0.4928571
[5,] 0.5550000 0.484 0.5100000 0.4627778 1.0000000 0.4695652 0.5140625 0.5313793
[6,] 0.5678947 0.350 0.3300000 0.5212121 0.4695652 1.0000000 0.5194118 0.5258333
[7,] 0.4781481 0.475 0.4775000 0.5569565 0.5140625 0.5194118 1.0000000 0.4240000
[8,] 0.3675000 0.570 0.5765714 0.4928571 0.5313793 0.5258333 0.4240000 1.0000000
Tags: r, string, dataframe, matrix, vector

A:

4 votes  r2evans  9/13/2023  #1

First, vec2mat is masking a problem:

Rs
# L2DA.L2DF  L2DA.L2G  L2DA.L2L  L2DA.L2M  L2DA.L2P  L2DA.L2V  L2DF.L2G  L2DF.L2L  L2DF.L2M  L2DF.L2P  L2DF.L2V   L2G.L2L   L2G.L2M   L2G.L2P   L2M.L2L 
# 0.6370000 0.5533333 0.4180000 0.5550000 0.5678947 0.4781481 0.3675000 0.2440000 0.2900000 0.4840000 0.3500000 0.4750000 0.5700000 0.2933333 0.5100000 
#   L2P.L2L   L2P.L2M  L2R.L2DA  L2R.L2DF   L2R.L2G   L2R.L2L   L2R.L2M   L2R.L2P   L2R.L2V   L2V.L2G   L2V.L2L   L2V.L2M   L2V.L2P 
# 0.3300000 0.4775000 0.5765714 0.4627778 0.5212121 0.5569565 0.4928571 0.4695652 0.5140625 0.5313793 0.5194118 0.5258333 0.4240000 

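Note that there are six L2DA.* values, which suggests that the first column of the matrix (inferred to be "L2DA") should span 0.637 through 0.478 ... yet it also includes the value from L2DF.L2G (0.3675):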

vec2mat(Rs)
#           [,1]  [,2]      [,3]      [,4]      [,5]      [,6]      [,7]      [,8]
# [1,] 1.0000000 0.637 0.5533333 0.4180000 0.5550000 0.5678947 0.4781481 0.3675000
# [2,] 0.6370000 1.000 0.2440000 0.2900000 0.4840000 0.3500000 0.4750000 0.5700000
# [3,] 0.5533333 0.244 1.0000000 0.2933333 0.5100000 0.3300000 0.4775000 0.5765714
# [4,] 0.4180000 0.290 0.2933333 1.0000000 0.4627778 0.5212121 0.5569565 0.4928571
# [5,] 0.5550000 0.484 0.5100000 0.4627778 1.0000000 0.4695652 0.5140625 0.5313793
# [6,] 0.5678947 0.350 0.3300000 0.5212121 0.4695652 1.0000000 0.5194118 0.5258333
# [7,] 0.4781481 0.475 0.4775000 0.5569565 0.5140625 0.5194118 1.0000000 0.4240000
# [8,] 0.3675000 0.570 0.5765714 0.4928571 0.5313793 0.5258333 0.4240000 1.0000000
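
The positional behaviour can be reproduced in base R. The following is a minimal sketch, not metafor's actual implementation: it assumes the values are filled column by column into the lower triangle, which is consistent with the output above. The toy vector x and its variable names A, B, C are made up for illustration.

```r
# Toy named vector for three hypothetical variables A, B, C.
x <- c(A.B = 0.1, A.C = 0.2, B.C = 0.3)

k <- 3                                  # number of variables: length(x) == k * (k - 1) / 2
m <- diag(k)                            # identity matrix, so the diagonal is already 1
m[lower.tri(m)] <- x                    # fill the lower triangle column by column
m[upper.tri(m)] <- t(m)[upper.tri(m)]   # mirror the values into the upper triangle

m
dimnames(m)                             # NULL: the names of x played no role in the fill
```

Because only the order of the values matters here, any vector whose names are not in exactly that column-wise order ends up with values in the wrong cells, and nothing signals it.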

For whatever reason, not all combinations are present in the coefficients; at a minimum, L2DA.L2R is missing:

nms <- unique(unlist(strsplit(names(Rs), "[.]")))
nms
# [1] "L2DA" "L2DF" "L2G"  "L2L"  "L2M"  "L2P"  "L2V"  "L2R" 
grep("L2DA", names(Rs), value=TRUE)
# [1] "L2DA.L2DF" "L2DA.L2G"  "L2DA.L2L"  "L2DA.L2M"  "L2DA.L2P"  "L2DA.L2V"  "L2R.L2DA" 
setdiff(nms, sub(".*\\.", "", grep("L2DA", names(Rs), value=TRUE)))
# [1] "L2R"

... though L2R.L2DA is present ... so vec2mat appears to make an assumption that the data does not share.
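
One way to check whether every pair of variables is covered in one orientation or the other is to compare the names against all unordered pairs. This is only a sketch: pair_coverage is a hypothetical helper, and it assumes every name has the form "var1.var2" with exactly one dot.

```r
# Hypothetical helper: report how the unordered pairs of variables are covered
# by the names of a vector whose names have the form "var1.var2".
pair_coverage <- function(x) {
  parts <- do.call(rbind, strsplit(names(x), ".", fixed = TRUE))
  vars  <- sort(unique(c(parts)))
  # Canonical key for a pair: the two variable names in sorted order.
  key_of <- function(a, b) paste(pmin(a, b), pmax(a, b), sep = ".")
  present  <- key_of(parts[, 1], parts[, 2])
  expected <- combn(vars, 2, FUN = function(p) key_of(p[1], p[2]))
  list(
    missing    = setdiff(expected, present),           # absent in both orientations
    duplicated = unique(present[duplicated(present)])  # supplied more than once
  )
}

x <- c(A.B = 0.1, C.A = 0.2, B.C = 0.3)  # toy data; "C.A" is stored in reverse
pair_coverage(x)
```

If both elements of the result are empty, each pair occurs exactly once, so the vector carries enough information to fill a full symmetric matrix by name.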

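While we could brute-force the matching, I think that inference step makes it less clear that every coefficient has been assigned to the correct row/column.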

Instead of vec2mat, let's do it ourselves, preserving the names.

tmp <- data.frame(nm = names(Rs), R = Rs) |>
  transform(row = sub("\\..*", "", nm), col = sub(".*\\.", "", nm))
head(tmp)
#                  nm         R  row  col
# L2DA.L2DF L2DA.L2DF 0.6370000 L2DA L2DF
# L2DA.L2G   L2DA.L2G 0.5533333 L2DA  L2G
# L2DA.L2L   L2DA.L2L 0.4180000 L2DA  L2L
# L2DA.L2M   L2DA.L2M 0.5550000 L2DA  L2M
# L2DA.L2P   L2DA.L2P 0.5678947 L2DA  L2P
# L2DA.L2V   L2DA.L2V 0.4781481 L2DA  L2V
reshape2::dcast(tmp, row ~ col, value.var = "R")
#    row      L2DA      L2DF       L2G       L2L       L2M       L2P       L2V
# 1 L2DA        NA 0.6370000 0.5533333 0.4180000 0.5550000 0.5678947 0.4781481
# 2 L2DF        NA        NA 0.3675000 0.2440000 0.2900000 0.4840000 0.3500000
# 3  L2G        NA        NA        NA 0.4750000 0.5700000 0.2933333        NA
# 4  L2M        NA        NA        NA 0.5100000        NA        NA        NA
# 5  L2P        NA        NA        NA 0.3300000 0.4775000        NA        NA
# 6  L2R 0.5765714 0.4627778 0.5212121 0.5569565 0.4928571 0.4695652 0.5140625
# 7  L2V        NA        NA 0.5313793 0.5194118 0.5258333 0.4240000        NA

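That is not surprising, since we know we do not have the pairs in both directions. Let's swap the two columns in tmp, row-bind the result to the original, and then dcast again.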

tmp2 <- transform(tmp, row2 = row, row = col) |>
  transform(col = row2, row2 = NULL)
head(tmp2)
#                  nm         R  row  col
# L2DA.L2DF L2DA.L2DF 0.6370000 L2DF L2DA
# L2DA.L2G   L2DA.L2G 0.5533333  L2G L2DA
# L2DA.L2L   L2DA.L2L 0.4180000  L2L L2DA
# L2DA.L2M   L2DA.L2M 0.5550000  L2M L2DA
# L2DA.L2P   L2DA.L2P 0.5678947  L2P L2DA
# L2DA.L2V   L2DA.L2V 0.4781481  L2V L2DA
out <- rbind(tmp, tmp2) |>
  reshape2::dcast(row ~ col, value.var = "R")
out
#    row      L2DA      L2DF       L2G       L2L       L2M       L2P       L2R       L2V
# 1 L2DA        NA 0.6370000 0.5533333 0.4180000 0.5550000 0.5678947 0.5765714 0.4781481
# 2 L2DF 0.6370000        NA 0.3675000 0.2440000 0.2900000 0.4840000 0.4627778 0.3500000
# 3  L2G 0.5533333 0.3675000        NA 0.4750000 0.5700000 0.2933333 0.5212121 0.5313793
# 4  L2L 0.4180000 0.2440000 0.4750000        NA 0.5100000 0.3300000 0.5569565 0.5194118
# 5  L2M 0.5550000 0.2900000 0.5700000 0.5100000        NA 0.4775000 0.4928571 0.5258333
# 6  L2P 0.5678947 0.4840000 0.2933333 0.3300000 0.4775000        NA 0.4695652 0.4240000
# 7  L2R 0.5765714 0.4627778 0.5212121 0.5569565 0.4928571 0.4695652        NA 0.5140625
# 8  L2V 0.4781481 0.3500000 0.5313793 0.5194118 0.5258333 0.4240000 0.5140625        NA
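
As a side note, the swap can also be written as a single call, because transform() evaluates all of its arguments against the original data frame rather than one after the other. A small sketch on made-up data (toy and swapped are illustrative names):

```r
toy <- data.frame(R = c(0.1, 0.2), row = c("A", "A"), col = c("B", "C"))

# Both right-hand sides are evaluated in the original `toy`,
# so this is a true swap rather than one column overwriting the other.
swapped <- transform(toy, row = col, col = row)
swapped
```

The two-step version above is equivalent; the one-call form just avoids the temporary row2 column.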

If you need it as a plain matrix, we can do this:

out2 <- as.matrix(out[,-1])
dimnames(out2) <- list(out$row, colnames(out)[-1])
out2
#           L2DA      L2DF       L2G       L2L       L2M       L2P       L2R       L2V
# L2DA        NA 0.6370000 0.5533333 0.4180000 0.5550000 0.5678947 0.5765714 0.4781481
# L2DF 0.6370000        NA 0.3675000 0.2440000 0.2900000 0.4840000 0.4627778 0.3500000
# L2G  0.5533333 0.3675000        NA 0.4750000 0.5700000 0.2933333 0.5212121 0.5313793
# L2L  0.4180000 0.2440000 0.4750000        NA 0.5100000 0.3300000 0.5569565 0.5194118
# L2M  0.5550000 0.2900000 0.5700000 0.5100000        NA 0.4775000 0.4928571 0.5258333
# L2P  0.5678947 0.4840000 0.2933333 0.3300000 0.4775000        NA 0.4695652 0.4240000
# L2R  0.5765714 0.4627778 0.5212121 0.5569565 0.4928571 0.4695652        NA 0.5140625
# L2V  0.4781481 0.3500000 0.5313793 0.5194118 0.5258333 0.4240000 0.5140625        NA

diag(out2) <- 1
out2
#           L2DA      L2DF       L2G       L2L       L2M       L2P       L2R       L2V
# L2DA 1.0000000 0.6370000 0.5533333 0.4180000 0.5550000 0.5678947 0.5765714 0.4781481
# L2DF 0.6370000 1.0000000 0.3675000 0.2440000 0.2900000 0.4840000 0.4627778 0.3500000
# L2G  0.5533333 0.3675000 1.0000000 0.4750000 0.5700000 0.2933333 0.5212121 0.5313793
# L2L  0.4180000 0.2440000 0.4750000 1.0000000 0.5100000 0.3300000 0.5569565 0.5194118
# L2M  0.5550000 0.2900000 0.5700000 0.5100000 1.0000000 0.4775000 0.4928571 0.5258333
# L2P  0.5678947 0.4840000 0.2933333 0.3300000 0.4775000 1.0000000 0.4695652 0.4240000
# L2R  0.5765714 0.4627778 0.5212121 0.5569565 0.4928571 0.4695652 1.0000000 0.5140625
# L2V  0.4781481 0.3500000 0.5313793 0.5194118 0.5258333 0.4240000 0.5140625 1.0000000
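
Since the point of doing this by hand is to be sure each coefficient lands in the right cell, a quick check can help. The following is a sketch under the assumption that every name has the form "var1.var2"; check_named_corr is a hypothetical helper and the toy matrix is made up.

```r
# Hypothetical checker: does the named matrix `mat` agree with the named vector `x`?
check_named_corr <- function(mat, x) {
  parts <- do.call(rbind, strsplit(names(x), ".", fixed = TRUE))
  stopifnot(
    isSymmetric(mat),            # values and dimnames must both be symmetric
    all(diag(mat) == 1),
    # Index by a two-column character matrix: one [row, col] lookup per name.
    isTRUE(all.equal(unname(mat[parts]), unname(x)))
  )
  invisible(TRUE)
}

vars <- c("A", "B", "C")
toy <- matrix(c(1.0, 0.1, 0.2,
                0.1, 1.0, 0.3,
                0.2, 0.3, 1.0), nrow = 3, dimnames = list(vars, vars))
check_named_corr(toy, c(A.B = 0.1, C.A = 0.2, B.C = 0.3))
```

With the objects above, the corresponding call would be check_named_corr(out2, Rs); it stops with an error if any cell disagrees with its named coefficient.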

As a function:

myfun <- function(Rs) {
  tmp <- data.frame(nm = names(Rs), R = Rs) |>
    transform(row = sub("\\..*", "", nm), col = sub(".*\\.", "", nm))
  tmp2 <- transform(tmp, row2 = row, row = col) |>
    transform(col = row2, row2 = NULL)
  out <- rbind(tmp, tmp2) |>
    tidyr::pivot_wider(id_cols = "row", names_from = "col", values_from = "R")
  out <- out[order(match(out$row, colnames(out))),]
  out2 <- as.matrix(out[,-1])
  dimnames(out2) <- list(out$row, colnames(out)[-1])
  diag(out2) <- 1
  out2
}
Rs
# L2DA.L2DF  L2DA.L2G  L2DA.L2L  L2DA.L2M  L2DA.L2P  L2DA.L2V  L2DF.L2G  L2DF.L2L  L2DF.L2M  L2DF.L2P  L2DF.L2V   L2G.L2L   L2G.L2M   L2G.L2P   L2M.L2L 
# 0.6370000 0.5533333 0.4180000 0.5550000 0.5678947 0.4781481 0.3675000 0.2440000 0.2900000 0.4840000 0.3500000 0.4750000 0.5700000 0.2933333 0.5100000 
#   L2P.L2L   L2P.L2M  L2R.L2DA  L2R.L2DF   L2R.L2G   L2R.L2L   L2R.L2M   L2R.L2P   L2R.L2V   L2V.L2G   L2V.L2L   L2V.L2M   L2V.L2P 
# 0.3300000 0.4775000 0.5765714 0.4627778 0.5212121 0.5569565 0.4928571 0.4695652 0.5140625 0.5313793 0.5194118 0.5258333 0.4240000 
myfun(Rs)
#           L2DF       L2G       L2L       L2M       L2P       L2V      L2DA       L2R
# L2DF 1.0000000 0.3675000 0.2440000 0.2900000 0.4840000 0.3500000 0.6370000 0.4627778
# L2G  0.3675000 1.0000000 0.4750000 0.5700000 0.2933333 0.5313793 0.5533333 0.5212121
# L2L  0.2440000 0.4750000 1.0000000 0.5100000 0.3300000 0.5194118 0.4180000 0.5569565
# L2M  0.2900000 0.5700000 0.5100000 1.0000000 0.4775000 0.5258333 0.5550000 0.4928571
# L2P  0.4840000 0.2933333 0.3300000 0.4775000 1.0000000 0.4240000 0.5678947 0.4695652
# L2V  0.3500000 0.5313793 0.5194118 0.5258333 0.4240000 1.0000000 0.4781481 0.5140625
# L2DA 0.6370000 0.5533333 0.4180000 0.5550000 0.5678947 0.4781481 1.0000000 0.5765714
# L2R  0.4627778 0.5212121 0.5569565 0.4928571 0.4695652 0.5140625 0.5765714 1.0000000

Comments

0 votes  r2evans  9/13/2023
(1) Sure, myfun <- function(Rs) { tmp <- ...; tmp2 <- ..., out <- ...; out2 <- ...; out2; }. (2) Sure, diag(out2) <- 1.
0 votes  r2evans  9/13/2023
See my edit, @SimonHarmel; I did the work for you (though there is nothing novel or impressive about it, it is just a wrapper).
0 votes  r2evans  9/14/2023
Base R? I'd recommend against it. Going from dcast to pivot_wider is trivial: ... |> tidyr::pivot_wider(id_cols = "row", names_from = "col", values_from = "R")
0 votes  Simon Harmel  9/14/2023
Hmm, but the output doesn't match; for example, the diagonal elements are no longer all NA before being set to 1?
0 votes  r2evans  9/14/2023
@SimonHarmel see my update.
2 votes  Robert Hacken  9/18/2023  #2

A more concise variant of @r2evans's solution:

# matrix with names of rows and columns in its two columns
row.col <- do.call(rbind, strsplit(names(Rs), '\\.'))
# vector of all names
nam <- sort(unique(c(row.col)))
# empty correlation matrix
corr <- matrix(NA, length(nam), length(nam), dimnames=list(nam, nam))
# fill it
diag(corr) <- 1
corr[rbind(row.col, row.col[, 2:1])] <- Rs
corr
#      L2DA L2DF  L2G  L2L  L2M  L2P  L2R  L2V
# L2DA 1.00 0.64 0.55 0.42 0.56 0.57 0.58 0.48
# L2DF 0.64 1.00 0.37 0.24 0.29 0.48 0.46 0.35
# L2G  0.55 0.37 1.00 0.48 0.57 0.29 0.52 0.53
# L2L  0.42 0.24 0.48 1.00 0.51 0.33 0.56 0.52
# L2M  0.56 0.29 0.57 0.51 1.00 0.48 0.49 0.53
# L2P  0.57 0.48 0.29 0.33 0.48 1.00 0.47 0.42
# L2R  0.58 0.46 0.52 0.56 0.49 0.47 1.00 0.51
# L2V  0.48 0.35 0.53 0.52 0.53 0.42 0.51 1.00
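
The key step above is indexing a matrix with a two-column character matrix, where each row of the index names one [row, column] cell via the dimnames. A minimal sketch of just that mechanism, using made-up names (x, idx, vars and m are illustrative):

```r
x <- c(A.B = 0.1, C.A = 0.2, B.C = 0.3)   # toy data; "C.A" is stored in reverse

idx  <- do.call(rbind, strsplit(names(x), ".", fixed = TRUE))
vars <- sort(unique(c(idx)))
m <- matrix(NA_real_, length(vars), length(vars), dimnames = list(vars, vars))
diag(m) <- 1

# Each row of the index matrix addresses one cell by its dimnames.
# Stacking idx on top of its column-swapped copy addresses both triangles;
# rep(x, 2) makes the recycling of the values explicit.
m[rbind(idx, idx[, 2:1])] <- rep(x, 2)
m
```

Because the cells are addressed by name, the order and orientation of the pairs in the vector do not matter.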

Comments

0 votes  Simon Harmel  9/20/2023
Thanks, +1, Robert. I asked a follow-up question about your great answer here. Could you take a look?