Subset dredge with variable names, inside a data frame (MuMIn)

Asked by: M. Riera | Asked: 9/29/2023 | Updated: 9/29/2023 | Views: 46

Q:

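I have fitted models and run dredge() on them. I created a data frame that stores the names of the variables used in the models together with the dredge objects themselves (in a list column). Using rowwise(), I would like to subset each dredge object by the variable name stored in a specific column, and keep the subset in a new column.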

My problem is that I cannot get this row-wise behaviour to work.

If I do not try to use the string stored in another column, I can subset the dredge objects just fine; see the reproducible example.

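I found related questions from 2019, 2018 and 2014 (here, here). For one thing, those questions use subset inside the dredge function, rather than subset on already generated dredge objects stored in a data frame. I tried the solutions outlined there without success, possibly also because of my lack of experience with parsing, expressions and so on.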

library(MuMIn)
#> Warning: package 'MuMIn' was built under R version 4.2.3
library(tidyverse)
#> Warning: package 'ggplot2' was built under R version 4.2.2
#> Warning: package 'tibble' was built under R version 4.2.3
#> Warning: package 'tidyr' was built under R version 4.2.3
#> Warning: package 'purrr' was built under R version 4.2.3
#> Warning: package 'dplyr' was built under R version 4.2.3
#> Warning: package 'stringr' was built under R version 4.2.3


data(Cement)


fm <- lm(y ~ X1*X2 + X1*X3 + X4, Cement)


options(na.action = "na.fail")


d <- MuMIn::dredge(fm)
#> Fixed term is "(Intercept)"


d
#> Global model call: lm(formula = y ~ X1 * X2 + X1 * X3 + X4, data = Cement)
#> ---
#> Model selection table 
#>     (Int)     X1      X2       X3      X4    X1:X2     X1:X3 df  logLik  AICc
#> 4   52.58 1.4680  0.6623                                      4 -28.156  69.3
#> 12  71.65 1.4520  0.4161          -0.2365                     5 -26.933  72.4
#> 8   48.19 1.6960  0.6569  0.25000                             5 -26.952  72.5
#> 10 103.10 1.4400                  -0.6140                     4 -29.817  72.6
#> 14 111.70 1.0520         -0.41000 -0.6428                     5 -27.310  73.2
#> 20  53.71 1.2750  0.6358                  0.004197            5 -28.072  74.7
#> 15 203.60        -0.9234 -1.44800 -1.5570                     5 -29.734  78.0
#> 28  72.43 1.2870  0.3959          -0.2343 0.003577            6 -26.860  79.7
#> 24  49.05 1.5580  0.6387  0.24590         0.002914            6 -26.904  79.8
#> 16  62.41 1.5510  0.5102  0.10190 -0.1441                     6 -26.918  79.8
#> 40  48.17 1.6690  0.6522  0.25180                   0.007277  6 -26.933  79.9
#> 46 112.00 1.0680         -0.41490 -0.6462          -0.005210  6 -27.301  80.6
#> 13 131.30                -1.20000 -0.7246                     4 -35.372  83.7
#> 32  66.63 1.3570  0.4556  0.06354 -0.1767 0.003398            7 -26.854  90.1
#> 56  49.02 1.5320  0.6342  0.24770         0.002888  0.007116  7 -26.885  90.2
#> 48  60.19 1.5560  0.5298  0.12600 -0.1218           0.004666  7 -26.911  90.2
#> 7   72.07         0.7313 -1.00800                             4 -40.965  94.9
#> 9  117.60                         -0.7382                     3 -45.872 100.4
#> 3   57.42         0.7891                                      3 -46.035 100.7
#> 11  94.16         0.3109          -0.4569                     4 -45.761 104.5
#> 2   81.48 1.8690                                              3 -48.206 105.1
#> 64  64.80 1.3650  0.4722  0.08329 -0.1585 0.003335  0.003693  8 -26.850 105.7
#> 6   72.35 2.3120          0.49450                             4 -48.005 109.0
#> 38  64.05 0.9528          0.49550                   0.311600  5 -45.678 109.9
#> 5  110.20                -1.25600                             3 -50.980 110.6
#> 1   95.42                                                     2 -53.168 111.5
#>    delta weight
#> 4   0.00  0.539
#> 12  3.13  0.113
#> 8   3.16  0.111
#> 10  3.32  0.102
#> 14  3.88  0.078
#> 20  5.40  0.036
#> 15  8.73  0.007
#> 28 10.41  0.003
#> 24 10.49  0.003
#> 16 10.52  0.003
#> 40 10.55  0.003
#> 46 11.29  0.002
#> 13 14.43  0.000
#> 32 20.80  0.000
#> 56 20.86  0.000
#> 48 20.91  0.000
#> 7  25.62  0.000
#> 9  31.10  0.000
#> 3  31.42  0.000
#> 11 35.21  0.000
#> 2  35.77  0.000
#> 64 36.39  0.000
#> 6  39.70  0.000
#> 38 40.61  0.000
#> 5  41.31  0.000
#> 1  42.22  0.000
#> Models ranked by AICc(x)


data.frame(test = c("X1", "X2")) %>% 
  rowwise() %>% 
  mutate(dredge = list(d)) %>% 
  mutate(subset = list(subset(dredge, delta <= 6)))
#> # A tibble: 2 × 3
#> # Rowwise: 
#>   test  dredge               subset             
#>   <chr> <list>               <list>             
#> 1 X1    <mdl.slct [26 × 12]> <mdl.slct [6 × 12]>
#> 2 X2    <mdl.slct [26 × 12]> <mdl.slct [6 × 12]>


data.frame(test = c("X1", "X2")) %>% 
  rowwise() %>% 
  mutate(dredge = list(d)) %>% 
  mutate(subset = list(subset(dredge, has("X1"))))
#> # A tibble: 2 × 3
#> # Rowwise: 
#>   test  dredge               subset              
#>   <chr> <list>               <list>              
#> 1 X1    <mdl.slct [26 × 12]> <mdl.slct [18 × 12]>
#> 2 X2    <mdl.slct [26 × 12]> <mdl.slct [18 × 12]>


data.frame(test = c("X1", "X2")) %>% 
  rowwise() %>% 
  mutate(dredge = list(d)) %>% 
  mutate(subset = list(subset(dredge, has(test))))
#> # A tibble: 2 × 3
#> # Rowwise: 
#>   test  dredge               subset              
#>   <chr> <list>               <list>              
#> 1 X1    <mdl.slct [26 × 12]> <mdl.slct [26 × 12]>
#> 2 X2    <mdl.slct [26 × 12]> <mdl.slct [26 × 12]>


data.frame(test = c("X1", "X2")) %>% 
  rowwise() %>% 
  mutate(dredge = list(d)) %>% 
  mutate(subset = list(subset(dredge, parse(text = sprintf("has(%s)", test)))))
#> Error in `mutate()`:
#> ℹ In argument: `subset = list(subset(dredge, parse(text =
#>   sprintf("has(%s)", test))))`.
#> ℹ In row 1.
#> Caused by error in `xj[i]`:
#> ! invalid subscript type 'expression'
#> Backtrace:
#>      ▆
#>   1. ├─... %>% ...
#>   2. ├─dplyr::mutate(...)
#>   3. ├─dplyr:::mutate.data.frame(...)
#>   4. │ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), by)
#>   5. │   ├─base::withCallingHandlers(...)
#>   6. │   └─dplyr:::mutate_col(dots[[i]], data, mask, new_columns)
#>   7. │     └─mask$eval_all_mutate(quo)
#>   8. │       └─dplyr (local) eval()
#>   9. ├─base::subset(dredge, parse(text = sprintf("has(%s)", test)))
#>  10. ├─MuMIn:::subset.model.selection(...)
#>  11. │ └─MuMIn:::`[.model.selection`(...)
#>  12. │   ├─MuMIn:::subset_model_selection(item(x, j, i, ...), origattrib <- attributes(x))
#>  13. │   └─MuMIn:::item(x, j, i, ...)
#>  14. │     └─base::`[.data.frame`(x, i, name, ...)
#>  15. └─base::.handleSimpleError(...)
#>  16.   └─dplyr (local) h(simpleError(msg, call))
#>  17.     └─rlang::abort(message, class = error_class, parent = parent, call = error_call)


data.frame(test = c("X1", "X2")) %>% 
  rowwise() %>% 
  mutate(dredge = list(d)) %>% 
  mutate(subset = list(subset(dredge, eval(parse(text = sprintf("has(%s)", test))))))
#> Error in `mutate()`:
#> ℹ In argument: `subset = list(subset(dredge, eval(parse(text =
#>   sprintf("has(%s)", test)))))`.
#> ℹ In row 1.
#> Caused by error in `has()`:
#> ! no se pudo encontrar la función "has"
#> Backtrace:
#>      ▆
#>   1. ├─... %>% ...
#>   2. ├─dplyr::mutate(...)
#>   3. ├─dplyr:::mutate.data.frame(...)
#>   4. │ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), by)
#>   5. │   ├─base::withCallingHandlers(...)
#>   6. │   └─dplyr:::mutate_col(dots[[i]], data, mask, new_columns)
#>   7. │     └─mask$eval_all_mutate(quo)
#>   8. │       └─dplyr (local) eval()
#>   9. ├─base::subset(dredge, eval(parse(text = sprintf("has(%s)", test))))
#>  10. ├─MuMIn:::subset.model.selection(...)
#>  11. │ ├─MuMIn:::`[.model.selection`(...)
#>  12. │ │ ├─MuMIn:::subset_model_selection(item(x, j, i, ...), origattrib <- attributes(x))
#>  13. │ │ └─MuMIn:::item(x, j, i, ...)
#>  14. │ │   └─base::`[.data.frame`(x, i, name, ...)
#>  15. │ └─MuMIn:::subset_eval(substitute(subset), x, parent.frame())
#>  16. │   └─base::eval(...)
#>  17. │     └─base::eval(...)
#>  18. │       └─base::eval(parse(text = sprintf("has(%s)", test)))
#>  19. │         └─base::eval(parse(text = sprintf("has(%s)", test)))
#>  20. └─base::.handleSimpleError(...)
#>  21.   └─dplyr (local) h(simpleError(msg, call))
#>  22.     └─rlang::abort(message, class = error_class, parent = parent, call = error_call)


data.frame(test = c("X1", "X2")) %>% 
  rowwise() %>% 
  mutate(dredge = list(d)) %>% 
  mutate(subset = list(subset(dredge, has(substitute(v, list(v = as.name("test")))))))
#> Error in `mutate()`:
#> ℹ In argument: `subset = list(subset(dredge, has(substitute(v, list(v =
#>   as.name("test"))))))`.
#> ℹ In row 1.
#> Caused by error:
#> ! objeto 'substitute(v, list(v = as.name("test")))' no encontrado
#> Backtrace:
#>      ▆
#>   1. ├─... %>% ...
#>   2. ├─dplyr::mutate(...)
#>   3. ├─dplyr:::mutate.data.frame(., subset = list(subset(dredge, has(substitute(v, list(v = as.name("test")))))))
#>   4. │ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), by)
#>   5. │   ├─base::withCallingHandlers(...)
#>   6. │   └─dplyr:::mutate_col(dots[[i]], data, mask, new_columns)
#>   7. │     └─mask$eval_all_mutate(quo)
#>   8. │       └─dplyr (local) eval()
#>   9. ├─base::subset(dredge, has(substitute(v, list(v = as.name("test")))))
#>  10. ├─MuMIn:::subset.model.selection(dredge, has(substitute(v, list(v = as.name("test")))))
#>  11. │ ├─MuMIn:::`[.model.selection`(...)
#>  12. │ │ ├─MuMIn:::subset_model_selection(item(x, j, i, ...), origattrib <- attributes(x))
#>  13. │ │ └─MuMIn:::item(x, j, i, ...)
#>  14. │ │   └─base::`[.data.frame`(x, i, name, ...)
#>  15. │ └─MuMIn:::subset_eval(substitute(subset), x, parent.frame())
#>  16. │   └─base::eval(...)
#>  17. │     └─base::eval(...)
#>  18. └─base::.handleSimpleError(...)
#>  19.   └─dplyr (local) h(simpleError(msg, call))
#>  20.     └─rlang::abort(message, class = error_class, parent = parent, call = error_call)


sessionInfo()
#> R version 4.2.0 (2022-04-22 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 22000)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=Spanish_Spain.utf8  LC_CTYPE=Spanish_Spain.utf8   
#> [3] LC_MONETARY=Spanish_Spain.utf8 LC_NUMERIC=C                  
#> [5] LC_TIME=Spanish_Spain.utf8    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] forcats_0.5.1   stringr_1.5.0   dplyr_1.1.2     purrr_1.0.1    
#>  [5] readr_2.1.2     tidyr_1.3.0     tibble_3.2.1    ggplot2_3.4.0  
#>  [9] tidyverse_1.3.1 MuMIn_1.47.5   
#> 
#> loaded via a namespace (and not attached):
#>  [1] tidyselect_1.2.0 xfun_0.40        haven_2.5.0      lattice_0.20-45 
#>  [5] colorspace_2.0-3 vctrs_0.6.3      generics_0.1.2   htmltools_0.5.6 
#>  [9] stats4_4.2.0     yaml_2.3.5       utf8_1.2.2       rlang_1.1.1     
#> [13] pillar_1.9.0     glue_1.6.2       withr_2.5.0      DBI_1.1.2       
#> [17] dbplyr_2.1.1     readxl_1.4.0     modelr_0.1.8     lifecycle_1.0.3 
#> [21] cellranger_1.1.0 munsell_0.5.0    gtable_0.3.0     rvest_1.0.2     
#> [25] evaluate_0.15    knitr_1.39       tzdb_0.3.0       fastmap_1.1.0   
#> [29] fansi_1.0.3      highr_0.9        broom_0.8.0      backports_1.4.1 
#> [33] scales_1.2.1     jsonlite_1.8.0   fs_1.5.2         hms_1.1.1       
#> [37] digest_0.6.29    stringi_1.7.6    grid_4.2.0       cli_3.6.1       
#> [41] tools_4.2.0      magrittr_2.0.3   crayon_1.5.1     pkgconfig_2.0.3 
#> [45] ellipsis_0.3.2   Matrix_1.4-1     xml2_1.3.3       lubridate_1.8.0 
#> [49] reprex_2.0.2     assertthat_0.2.1 rmarkdown_2.14   httr_1.4.3      
#> [53] rstudioapi_0.13  R6_2.5.1         nlme_3.1-157     compiler_4.2.0

Created on 2023-09-29 with reprex v2.0.2

Tags: r dataframe subset rowwise mumin


A:

1 vote | DaveArmstrong | 9/29/2023 | #1

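I'm not quite sure why that subset doesn't work in the usual way inside mutate(). I was able to get the desired result by using map() inside the mutate() call. The documentation for subset on dredge objects says that has("X1") is equivalent to !is.na(X1), so I used the latter instead (has() still doesn't work in this case).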

library(MuMIn)
library(tidyverse)
data(Cement)
fm <- lm(y ~ X1*X2 + X1*X3 + X4, Cement)
options(na.action = "na.fail")
d <- MuMIn::dredge(fm)
#> Fixed term is "(Intercept)"

data.frame(test = c("X1", "X2")) %>% 
  rowwise() %>% 
  mutate(dredge = list(d)) %>% 
  mutate(subset = map(test, \(x)subset(d, !is.na(d[[x]]))))
#> # A tibble: 2 × 3
#> # Rowwise: 
#>   test  dredge               subset              
#>   <chr> <list>               <list>              
#> 1 X1    <mdl.slct [26 × 12]> <mdl.slct [18 × 12]>
#> 2 X2    <mdl.slct [26 × 12]> <mdl.slct [16 × 12]>

Created on 2023-09-29 with reprex v2.0.2
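
The same idea can also be written without the tidyverse. The sketch below is only an illustration under stated assumptions: keep_models_with() and term_names are hypothetical names introduced here, not part of MuMIn, and the helper simply wraps the subset(d, !is.na(d[[x]])) call from the reprex above so that the variable name arrives as an ordinary string argument.

```r
library(MuMIn)

data(Cement)
options(na.action = "na.fail")  # dredge() refuses to run without this setting

fm <- lm(y ~ X1*X2 + X1*X3 + X4, Cement)
d <- dredge(fm)

# Hypothetical helper (not part of MuMIn): keep only the models whose
# coefficient column `term` is non-missing, i.e. the models containing it.
# `term` is a character string, so the column is looked up with [[ ]].
keep_models_with <- function(ms, term) {
  subset(ms, !is.na(ms[[term]]))
}

term_names <- c("X1", "X2")

# One subset per variable name, returned as a named list
subsets <- lapply(setNames(term_names, term_names), keep_models_with, ms = d)

# Number of models kept for each term
sapply(subsets, nrow)
```

If this behaves like the map() version, the row counts should match the 18 and 16 models reported in the tibble above.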

Comments

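0 votes | M. Riera | 9/29/2023
Your solution works, thank you! I realised that you can also do the subsetting with filter(), since dredge objects are also of class data.frame. I used get() inside filter() to use the variable name stored in a data frame column (filter(get(column_storing_variable_name))).
1 vote | DaveArmstrong | 9/29/2023
@M.Riera That's a good solution too.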
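
The filter() idea from the comments can be sketched on a small mock table. Everything here is an assumption made for illustration: mock_table is a hand-written stand-in, not real dredge() output, and the sketch swaps get() for dplyr's .data pronoun combined with !is.na(), which is a variation on the commenter's code rather than a copy of it. Whether filter() keeps the class and attributes of a real dredge object is not checked here.

```r
library(dplyr)

# Mock stand-in for a model selection table (NOT real dredge() output):
# an NA in a coefficient column means that term is absent from the model.
mock_table <- data.frame(
  X1    = c(1.47, NA, 1.70, NA),
  X2    = c(0.66, 0.79, NA, NA),
  delta = c(0.0, 31.4, 39.7, 42.2)
)

# The variable name is held as a string, as in the `test` column above
term_name <- "X1"

# .data[[term_name]] looks up the column named by the string,
# so only the rows (models) that contain that term are kept
with_term <- filter(mock_table, !is.na(.data[[term_name]]))
with_term
```

Looking the column up by string keeps the row filter tied to the table's own column, instead of testing the string itself for missingness.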