Is there a way to specify all possible Likert response options for a column, so that unused options show a 0, when using pivot_longer in R?

Asked by: sdS  Asked: 6/22/2022  Last edited by: sdS  Updated: 6/23/2022  Views: 45

Q:

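I have a number of Likert-type questions in my data and am using pivot_longer to get the percentage of how often each option was used. For some questions, though, respondents never chose certain options (e.g., nobody answered with a 1). I would still like to see every possible response for each item, with 0/0% when it was not used. For example, say I have a data frame d1.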

names(d1)

"Course" "likert_1" "likert_2" "likert_3" "likert_4"
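library(dplyr)
library(tidyr)

d1_long <- d1 %>% 
  pivot_longer(-Course, names_to = "items", values_to = "val") %>% 
  group_by(items, Course) %>% 
  # N = number of non-missing responses per item and course
  mutate(N = sum(!is.na(val)),
         val = as.character(val)) %>% 
  group_by(val, .add = TRUE) %>% 
  # percent has one value per row of the group, so distinct() collapses the duplicates
  summarise(n = n(),
            percent = round(n / N, digits = 2)) %>% 
  distinct()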
 

head(d1_long)
# A tibble: 6 × 5
# Groups:   items, Course, val [6]
  items             Course      val       n     percent
  <chr>             <chr>       <chr>    <int>   <dbl>
1 likert_1          A765           2         2    0.04
2 likert_1          A765           3         1    0.02
3 likert_1          A765           4         50   0.88
4 likert_1          B768           1         2    0.04
5 likert_1          B768           3         24   0.48
6 likert_1          B768           4         26   0.52

So we can see that response option 1 was not used in course "A765" and option 2 was not used in course "B768". What I would like to see is something like this:

head(d1_long)
# A tibble: 6 × 5
# Groups:   items, Course, val [6]
  items             Course      val       n     percent
  <chr>             <chr>       <chr>    <int>   <dbl>
1 likert_1          A765           1         0    0.00
2 likert_1          A765           2         2    0.04
3 likert_1          A765           3         1    0.02
4 likert_1          A765           4         50   0.88
5 likert_1          B768           1         2    0.04
6 likert_1          B768           2         0    0.00
7 likert_1          B768           3         24   0.48

Any help is much appreciated - thanks!

Edit:

dput(d1_long)
structure(list(items = c("likert_1", "likert_1", "likert_1", 
"likert_1", "likert_1", "likert_1"), Course = c("A765", "A765", 
"A765", "B768", "B768", "B768"), val = c(2L, 3L, 4L, 1L, 3L, 
4L), n = c(2L, 1L, 50L, 2L, 24L, 26L), percent = c(0.04, 0.02, 
0.88, 0.04, 0.48, 0.52)), class = c("grouped_df", "tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -6L), groups = structure(list(
    items = c("likert_1", "likert_1", "likert_1", "likert_1", 
    "likert_1", "likert_1"), Course = c("A765", "A765", "A765", 
    "B768", "B768", "B768"), val = c(2L, 3L, 4L, 1L, 3L, 4L), 
    .rows = structure(list(1L, 2L, 3L, 4L, 5L, 6L), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -6L), .drop = TRUE))

Edit 2: I should note that not all items use the same response scale. For example, some are 1-5 and some are 1-7. Thanks.

r

Comments

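1 upvote Rui Barradas 6/22/2022
Can you edit the question with the output of dput(head(d1_long))? The problem with copying the posted table is the groups attribute.
0 upvotes sdS 6/22/2022
Actually, my dataset is much larger than the one I created for this question. I'll make the edit in a moment. Thanks.
0 upvotes Rui Barradas 6/22/2022
The first 6 rows are enough, and that is what head will give, so it's fine: dput(head(d1_long))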

A:

0 upvotes Rui Barradas 6/23/2022 #1

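Here is a way. Group by items and Course, then call complete on val with a vector of all possible responses. The columns n and percent are filled with zeros for the added rows (the default fill is NA).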

suppressPackageStartupMessages(library(tidyverse))

all_possible_resp <- 1:4

d1_long %>%
  ungroup() %>%
  group_by(items, Course) %>% 
  complete(val = all_possible_resp, 
           fill = list(n = 0, percent = 0)) %>%
  ungroup()
#> # A tibble: 8 × 5
#>   items    Course   val     n percent
#>   <chr>    <chr>  <int> <int>   <dbl>
#> 1 likert_1 A765       1     0    0   
#> 2 likert_1 A765       2     2    0.04
#> 3 likert_1 A765       3     1    0.02
#> 4 likert_1 A765       4    50    0.88
#> 5 likert_1 B768       1     2    0.04
#> 6 likert_1 B768       2     0    0   
#> 7 likert_1 B768       3    24    0.48
#> 8 likert_1 B768       4    26    0.52

Created on 2022-06-22 by the reprex package (v2.0.1)
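
For the case raised in Edit 2, where items use different scales, a single all_possible_resp vector no longer fits every item. One possible extension, sketched here with a join against a per-item grid rather than complete(), is shown below. The lookup table scales, its column max_val, and the toy values in d1_long are all hypothetical and only illustrate the shape of the data; val is assumed to be an integer, as in the dput output.

```r
library(dplyr)
library(tidyr)

# Toy summary in the same shape as d1_long (illustrative values only)
d1_long <- tibble(
  items   = c("likert_1", "likert_1", "likert_2"),
  Course  = c("A765", "A765", "A765"),
  val     = c(2L, 4L, 5L),
  n       = c(3L, 7L, 10L),
  percent = c(0.3, 0.7, 1)
)

# Hypothetical lookup: the highest response option for each item
scales <- tibble(
  items   = c("likert_1", "likert_2"),
  max_val = c(4L, 5L)
)

# Build every item/Course pair crossed with that item's own options 1..max_val
full_grid <- d1_long %>%
  ungroup() %>%
  distinct(items, Course) %>%
  left_join(scales, by = "items") %>%
  mutate(val = lapply(max_val, seq_len)) %>%
  unnest(val) %>%
  select(-max_val)

# Attach the observed counts; options that never occurred get 0
result <- full_grid %>%
  left_join(ungroup(d1_long), by = c("items", "Course", "val")) %>%
  mutate(n       = coalesce(n, 0L),
         percent = coalesce(percent, 0))

result
```

The join keeps one row per option in each item's own scale, so a 1-4 item and a 1-5 item can sit in the same table. If val has been converted to character, as in the question's pipeline, it would need converting back to integer before the join.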