如何将矩阵列插入到匹配的列表列中?

How to insert matrix columns into matching list columns?

提问人:Village.Idyot 提问时间:9/8/2023 最后编辑:Village.Idyot 更新时间:9/9/2023 访问量:61

问:

我有使用列表以有组织的方式存储值的代码,以及执行困难计算的矩阵函数。我想将矩阵列注入相应列表的正确部分。列表和矩阵之间的列名不同(由于能够在列表中嵌套并在不同级别分配名称,这对于管理复杂的数据集非常方便和重要),但它们具有共性。如何通过匹配这些通用标识符,系统地将矩阵列注入到列表的正确部分,如下图所示?在更完整的代码中,此示例派生自 列表和 的等效项是动态的:因此,我正在寻找一种系统的方式来执行此操作。mat1

在下面的示例中,常见的标识符是“Series”、“Class”和“P_advance”。

用于生成示例函数和矩阵的代码位于本文的底部。

enter image description here

以下是函数和生成矩阵的代码:classPrin()mat1

seriesList <- list(
  Series_One = data.frame(
    Class = c('Class_A','Class_B','Class_C'),
    ClassPct = c(0.5,0.2,0.06)
  ),
  Series_Two = data.frame(
    Class = c('Class_A','Class_B'),
    ClassPct = c(0.7,0.3)
  )
)

classPrin <- function(periods) {
  series_names <- c("Series_One","Series_Two")
  bucket <- list()
  
  for (series in series_names) {
    series_elements <- seriesList[[series]][["Class"]]
    bucket[[series]] <- list()
  
    for (element in series_elements) {
      mat <- matrix(0, nrow = periods, 2)
      colnames(mat) <- c(
        'P_advance',
        'FC_cover_CO'
      )
      mat[,"P_advance"] <- 0
      bucket[[series]][[element]] <- mat
    } 
  } 
  return(bucket)
}

mat1 <- matrix(0, nrow = 5, ncol = 3)
colnames(mat1) <- c(
  "Series_One_Class_A_P_advance",
  "Series_One_Class_B_P_advance",
  "Series_Two_Class_A_P_advance"
)
mat1[,1] <- 1:5
mat1[,2] <- 11:15
mat1[,3] <- 21:25
R 数据帧 列表 矩阵 嵌套

评论


答:

0赞 M-- 9/8/2023 #1

这是我的两分钱:

## Inputs
classPrin(5) -> classPrin5 # function taken from the question

matrix(c(1:5, 11:15, 21:25), 5 , 3, FALSE,
      list(NULL, c("Series_One_Class_A_P_advance",
                   "Series_One_Class_B_P_advance",
                   "Series_Two_Class_A_P_advance"))) -> mat1

## Packages (R version 4.2.2)
library(dplyr)  # dplyr_1.1.2
library(tidyr)  # tidyr_1.3.0
library(tibble) # tibble_3.2.1
library(purrr)  # purrr_1.0.1
## Pre-Processing
map(classPrin5, ~map(.x, data.frame)) %>%
  map(., ~bind_rows(.x, .id = "Class")) %>% 
  bind_rows(.id = "Series") %>% 
  group_by(Series, Class) %>% 
  mutate(rn = row_number()) %>% 
  ungroup()-> flat_classPrin5

mat1 %>% 
  as.data.frame() %>% 
  gather(var, val, everything()) %>% 
  transmute(scp = strsplit(var,  "(?:[^_]*(?:_[^_]*){1})\\K_", perl = T), val) %>% 
  unnest_wider("scp", names_sep = "_") %>% 
  setNames(c("Series", "Class", "Par", "Val")) %>%
  group_by(Series, Class) %>% 
  mutate(rn = row_number()) %>% 
  ungroup() %>% 
  pivot_wider(id_cols =  c(Series, Class, rn), 
              names_from = "Par", values_from = "Val") -> pr_mat1
## Final output
flat_classPrin5 %>% 
  full_join(., pr_mat1, by = c("Series", "Class", "rn"), suffix = c(".l", ".m")) %>% 
  transmute(Series, Class, 
            P_advance = coalesce(P_advance.m, P_advance.l), FC_cover_CO) %>% 
  split(., .$Series, drop = T) %>% 
  map(., ~split(.x, .x$Class, drop =T))
#> $Series_One
#> $Series_One$Class_A
#> # A tibble: 5 × 4
#>   Series     Class   P_advance FC_cover_CO
#>   <chr>      <chr>       <dbl>       <dbl>
#> 1 Series_One Class_A         1           0
#> 2 Series_One Class_A         2           0
#> 3 Series_One Class_A         3           0
#> 4 Series_One Class_A         4           0
#> 5 Series_One Class_A         5           0
#> 
#> $Series_One$Class_B
#> # A tibble: 5 × 4
#>   Series     Class   P_advance FC_cover_CO
#>   <chr>      <chr>       <dbl>       <dbl>
#> 1 Series_One Class_B        11           0
#> 2 Series_One Class_B        12           0
#> 3 Series_One Class_B        13           0
#> 4 Series_One Class_B        14           0
#> 5 Series_One Class_B        15           0
#> 
#> $Series_One$Class_C
#> # A tibble: 5 × 4
#>   Series     Class   P_advance FC_cover_CO
#>   <chr>      <chr>       <dbl>       <dbl>
#> 1 Series_One Class_C         0           0
#> 2 Series_One Class_C         0           0
#> 3 Series_One Class_C         0           0
#> 4 Series_One Class_C         0           0
#> 5 Series_One Class_C         0           0
#> 
#> 
#> $Series_Two
#> $Series_Two$Class_A
#> # A tibble: 5 × 4
#>   Series     Class   P_advance FC_cover_CO
#>   <chr>      <chr>       <dbl>       <dbl>
#> 1 Series_Two Class_A        21           0
#> 2 Series_Two Class_A        22           0
#> 3 Series_Two Class_A        23           0
#> 4 Series_Two Class_A        24           0
#> 5 Series_Two Class_A        25           0
#> 
#> $Series_Two$Class_B
#> # A tibble: 5 × 4
#>   Series     Class   P_advance FC_cover_CO
#>   <chr>      <chr>       <dbl>       <dbl>
#> 1 Series_Two Class_B         0           0
#> 2 Series_Two Class_B         0           0
#> 3 Series_Two Class_B         0           0
#> 4 Series_Two Class_B         0           0
#> 5 Series_Two Class_B         0           0

创建于 2023-09-08 使用 reprex v2.0.2

1赞 Village.Idyot 9/8/2023 #2

这是一种方法,尽管我想知道我的 for-loop 方法是否幼稚。我认为这是一个常见的问题,并且必须有一种巧妙的 R 方法来做到这一点。也许不是。作为一个相对刚接触 R 的新手(我不知道我能坚持这个借口多久),我的第一个冲动是用 for 循环进行攻击。但以下方法有效。

classPrin5 <- classPrin(5)

# Function to insert values from mat1 into object generated by classPrin()
insertValues <- function(classPrinList, mat, prefix) {
  for (series_name in names(classPrinList)) {
    for (class_name in names(classPrinList[[series_name]])) {
      col_name <- paste(series_name, class_name, "P_advance", sep = "_")
      if (col_name %in% colnames(mat)) {
        values_to_insert <- mat[, col_name]
        classPrinList[[series_name]][[class_name]][,"P_advance"] <- values_to_insert
      }
    }
  }
  return(classPrinList)
}

# Run the function
insertValues(classPrin5, mat1)

这将给出以下正确的输出:

> insertValues(classPrin5, mat1)
$Series_One
$Series_One$Class_A
     P_advance FC_cover_CO
[1,]         1           0
[2,]         2           0
[3,]         3           0
[4,]         4           0
[5,]         5           0

$Series_One$Class_B
     P_advance FC_cover_CO
[1,]        11           0
[2,]        12           0
[3,]        13           0
[4,]        14           0
[5,]        15           0

$Series_One$Class_C
     P_advance FC_cover_CO
[1,]         0           0
[2,]         0           0
[3,]         0           0
[4,]         0           0
[5,]         0           0


$Series_Two
$Series_Two$Class_A
     P_advance FC_cover_CO
[1,]        21           0
[2,]        22           0
[3,]        23           0
[4,]        24           0
[5,]        25           0

$Series_Two$Class_B
     P_advance FC_cover_CO
[1,]         0           0
[2,]         0           0
[3,]         0           0
[4,]         0           0
[5,]         0           0

评论

1赞 M-- 9/9/2023
for 循环没有错。事实上,它可能比我建议的要快得多。虽然,我会做一些小的调整,以减少对硬编码值的依赖。+1