t_test from rstatix not working with nested data

Asked by: sbac  Asked: 9/15/2023  Updated: 10/2/2023  Views: 70

Question:

I have no problem running t.test() on nested data (df). But when I try to use t_test() from the rstatix package, I get an error.

library(tidyverse)
library(rstatix)
#> 
#> Attaching package: 'rstatix'
#> The following object is masked from 'package:stats':
#> 
#>     filter

df <- ToothGrowth
df$dose <- as.factor(df$dose)


nested_data <- df %>% 
  group_by(dose) %>% 
  nest()

# This works
nested_models1 <- nested_data %>%
  mutate(t_test1 = map(data, ~t.test(.x$len ~ .x$supp, paired = TRUE,
                                     detailed = TRUE)))
nested_models1
#> # A tibble: 3 × 3
#> # Groups:   dose [3]
#>   dose  data              t_test1
#>   <fct> <list>            <list> 
#> 1 0.5   <tibble [20 × 2]> <htest>
#> 2 1     <tibble [20 × 2]> <htest>
#> 3 2     <tibble [20 × 2]> <htest>

# This does not work
nested_models2 <- nested_data %>%
  mutate(t_test2 = map(data, ~rstatix::t_test(.x$len ~ .x$supp, paired = TRUE,
                                     detailed = TRUE)))
#> Error in `mutate()`:
#> ℹ In argument: `t_test2 = map(data, ~rstatix::t_test(.x$len ~ .x$supp,
#>   paired = TRUE, detailed = TRUE))`.
#> ℹ In group 1: `dose = 0.5`.
#> Caused by error in `map()`:
#> ℹ In index: 1.
#> Caused by error in `rstatix::t_test()`:
#> ! argument "formula" is missing, with no default
#> Backtrace:
#>      ▆
#>   1. ├─nested_data %>% ...
#>   2. ├─dplyr::mutate(...)
#>   3. ├─dplyr:::mutate.data.frame(...)
#>   4. │ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), by)
#>   5. │   ├─base::withCallingHandlers(...)
#>   6. │   └─dplyr:::mutate_col(dots[[i]], data, mask, new_columns)
#>   7. │     └─mask$eval_all_mutate(quo)
#>   8. │       └─dplyr (local) eval()
#>   9. ├─purrr::map(data, ~rstatix::t_test(.x$len ~ .x$supp, paired = TRUE, detailed = TRUE))
#>  10. │ └─purrr:::map_("list", .x, .f, ..., .progress = .progress)
#>  11. │   ├─purrr:::with_indexed_errors(...)
#>  12. │   │ └─base::withCallingHandlers(...)
#>  13. │   ├─purrr:::call_with_cleanup(...)
#>  14. │   └─.f(.x[[i]], ...)
#>  15. │     └─rstatix::t_test(.x$len ~ .x$supp, paired = TRUE, detailed = TRUE)
#>  16. │       └─rstatix:::get_formula_left_hand_side(formula)
#>  17. │         └─base::deparse(formula[[2]])
#>  18. └─base::.handleSimpleError(...)
#>  19.   └─purrr (local) h(simpleError(msg, call))
#>  20.     └─cli::cli_abort(...)
#>  21.       └─rlang::abort(...)
Created on 2023-09-15 with reprex v2.0.2
Tags: r, t-test, rstatix

Comments

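2 upvotes user20650 9/15/2023
I think the difference between t.test and t_test is the position of the function arguments: for t.test the formula comes first, but for t_test it comes second (after the data). I don't use tidy / pipes etc., so I can't show you the canonical way to do this, but adding the data placeholder seems to work, e.g. map(data, ~rstatix::t_test(., len ~ supp, paired = TRUE, detailed = TRUE))
0 upvotes sbac 9/16/2023
Yes, that is the answer.
0 upvotes Rui Barradas 9/16/2023
@user20650 Why don't you post that as an answer? Answers are more useful than comments.
0 upvotes user20650 9/16/2023
@RuiBarradas I didn't post it as an answer because I don't use this set of packages and I'm not sure what the best code is. My comment was just a quick fix.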
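
The quick fix from the comments can be sketched as a full pipeline. This is a minimal sketch, assuming the dplyr, tidyr, purrr and rstatix packages are installed; the names arg_order and first_result are made up for illustration. It first checks the argument order of rstatix::t_test(), then passes each nested tibble as the data argument with a bare formula:

```r
library(dplyr)
library(tidyr)
library(purrr)

df <- ToothGrowth
df$dose <- as.factor(df$dose)

# rstatix::t_test() takes the data frame first and the formula second,
# so a formula passed in the first position is matched to `data`
# and `formula` ends up missing.
arg_order <- names(formals(rstatix::t_test))[1:2]  # hypothetical helper name
print(arg_order)

# Pass each nested tibble (.x) as `data` and refer to columns by bare name
# in the formula instead of .x$len ~ .x$supp.
nested_models2 <- df %>%
  group_by(dose) %>%
  nest() %>%
  mutate(t_test2 = map(
    data,
    ~ rstatix::t_test(.x, len ~ supp, paired = TRUE, detailed = TRUE)
  ))

# Each element of the list column is a one-row tibble of test results.
first_result <- nested_models2$t_test2[[1]]  # hypothetical name
print(first_result)
```

Naming the arguments explicitly (data = .x, formula = len ~ supp) would also work and makes the call robust to argument order.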

Answer:

1 upvote margusl 10/2/2023 #1

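One of the perks of rstatix is how it handles grouped frames, so you might just want to try t_test on grouped data rather than nested data. You can then join the results to a nested tibble. Alternatively, you could switch to rowwise operations and apply t_test to the nested tibbles through mutate.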

library(tidyverse)

df <- ToothGrowth
df$dose <- as.factor(df$dose)

# rstatix on grouped frame:
df_ttest <- df %>%
  group_by(dose) %>% 
  rstatix::t_test(len ~ supp, paired = TRUE, detailed = TRUE)

df_ttest
#> # A tibble: 3 × 14
#>   dose  estimate .y.   group1 group2    n1    n2 statistic       p    df
#> * <fct>    <dbl> <chr> <chr>  <chr>  <int> <int>     <dbl>   <dbl> <dbl>
#> 1 0.5     5.25   len   OJ     VC        10    10    2.98   0.0155      9
#> 2 1       5.93   len   OJ     VC        10    10    3.37   0.00823     9
#> 3 2      -0.0800 len   OJ     VC        10    10   -0.0426 0.967       9
#> # ℹ 4 more variables: conf.low <dbl>, conf.high <dbl>, method <chr>,
#> #   alternative <chr>

# you could then join this to nested tibble by `dose`:
df %>% 
  nest(.by = dose) %>% 
  left_join(df_ttest, by = "dose")
#> # A tibble: 3 × 15
#>   dose  data     estimate .y.   group1 group2    n1    n2 statistic       p
#>   <fct> <list>      <dbl> <chr> <chr>  <chr>  <int> <int>     <dbl>   <dbl>
#> 1 0.5   <tibble>   5.25   len   OJ     VC        10    10    2.98   0.0155 
#> 2 1     <tibble>   5.93   len   OJ     VC        10    10    3.37   0.00823
#> 3 2     <tibble>  -0.0800 len   OJ     VC        10    10   -0.0426 0.967  
#> # ℹ 5 more variables: df <dbl>, conf.low <dbl>, conf.high <dbl>, method <chr>,
#> #   alternative <chr>

# you can also use t_test on nested tibbles by switching to rowwise 
# (nest_by is different from nest(.by), returns rowwise tibble)
df %>% 
  nest_by(dose) %>% 
  mutate(t_test2 = rstatix::t_test(data, len ~ supp, paired = TRUE, detailed = TRUE)) %>% 
  ungroup()
#> # A tibble: 3 × 3
#>   dose                data t_test2$estimate $.y.  $group1 $group2   $n1   $n2
#>   <fct> <list<tibble[,2]>>            <dbl> <chr> <chr>   <chr>   <int> <int>
#> 1 0.5             [20 × 2]           5.25   len   OJ      VC         10    10
#> 2 1               [20 × 2]           5.93   len   OJ      VC         10    10
#> 3 2               [20 × 2]          -0.0800 len   OJ      VC         10    10
#> # ℹ 7 more variables: t_test2$statistic <dbl>, $p <dbl>, $df <dbl>,
#> #   $conf.low <dbl>, $conf.high <dbl>, $method <chr>, $alternative <chr>

Created on 2023-10-02 with reprex v2.0.2