Asked by: sdS  Asked: 6/22/2022  Last edited by: sdS  Updated: 6/23/2022  Views: 45
A way to indicate all possible Likert response options for a particular column so that those not used have a 0 by them using pivot longer in R?
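Q:
I have a number of Likert-type questions in my data and am using pivot_longer to get the percentage of how often each option was used. For some questions, however, respondents never chose certain options (for example, they never answered 1). Even when an option is unused, I would still like to see every possible response for every item, with 0/0% for the unused ones. For example, say I have a data frame d1.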
names(d1)
"Course" "likert_1" "likert_2" "likert_3" "likert_4"
library(tidyverse)
d1_long <- d1 %>%
  pivot_longer(-Course, names_to = "items", values_to = "val") %>%
  group_by(items, Course) %>%
  # N = number of non-missing responses per item and course
  mutate(N = sum(!is.na(val)),
         val = as.character(val)) %>%
  group_by(val, .add = TRUE) %>%
  summarise(n = n(),
            percent = round(n / N, digits = 2)) %>%
  distinct()
head(d1_long)
# A tibble: 6 × 5
# Groups:   items, Course, val [6]
  items    Course val       n percent
  <chr>    <chr>  <chr> <int>   <dbl>
1 likert_1 A765   2         2    0.04
2 likert_1 A765   3         1    0.02
3 likert_1 A765   4        50    0.88
4 likert_1 B768   1         2    0.04
5 likert_1 B768   3        24    0.48
6 likert_1 B768   4        26    0.52
So we can see that response option 1 was not used in course "A765" and option 2 was not used in course "B768". What I would like to see is something like this:
head(d1_long)
# A tibble: 6 × 5
# Groups:   items, Course, val [6]
  items    Course val       n percent
  <chr>    <chr>  <chr> <int>   <dbl>
1 likert_1 A765   1         0    0.00
2 likert_1 A765   2         2    0.04
3 likert_1 A765   3         1    0.02
4 likert_1 A765   4        50    0.88
5 likert_1 B768   1         2    0.04
6 likert_1 B768   2         0    0.00
7 likert_1 B768   3        24    0.48
Any help is much appreciated - thank you!
Edit:
dput(d1_long)
structure(list(items = c("likert_1", "likert_1", "likert_1",
"likert_1", "likert_1", "likert_1"), Course = c("A765", "A765",
"A765", "B768", "B768", "B768"), val = c(2L, 3L, 4L, 1L, 3L,
4L), n = c(2L, 1L, 50L, 2L, 24L, 26L), percent = c(0.04, 0.02,
0.88, 0.04, 0.48, 0.52)), class = c("grouped_df", "tbl_df", "tbl",
"data.frame"), row.names = c(NA, -6L), groups = structure(list(
items = c("likert_1", "likert_1", "likert_1", "likert_1",
"likert_1", "likert_1"), Course = c("A765", "A765", "A765",
"B768", "B768", "B768"), val = c(2L, 3L, 4L, 1L, 3L, 4L),
.rows = structure(list(1L, 2L, 3L, 4L, 5L, 6L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -6L), .drop = TRUE))
Edit 2: I should note that not all items have the same response scheme. For example, some are 1-5 and some are 1-7. Thanks.
A:
0 votes
Rui Barradas
6/23/2022
#1
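Here is a way. Group by items and Course, then call complete() with a vector of all possible responses. The columns n and percent are filled with zeros (the default fill value is NA).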
suppressPackageStartupMessages(library(tidyverse))
all_possible_resp <- 1:4
d1_long %>%
  ungroup() %>%
  group_by(items, Course) %>%
  complete(val = all_possible_resp,
           fill = list(n = 0, percent = 0)) %>%
  ungroup()
#> # A tibble: 8 × 5
#>   items    Course   val     n percent
#>   <chr>    <chr>  <int> <int>   <dbl>
#> 1 likert_1 A765       1     0    0
#> 2 likert_1 A765       2     2    0.04
#> 3 likert_1 A765       3     1    0.02
#> 4 likert_1 A765       4    50    0.88
#> 5 likert_1 B768       1     2    0.04
#> 6 likert_1 B768       2     0    0
#> 7 likert_1 B768       3    24    0.48
#> 8 likert_1 B768       4    26    0.52
Created on 2022-06-22 by the reprex package (v2.0.1)
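For the situation described in Edit 2, where items use different response schemes (for example 1-5 and 1-7), a single all_possible_resp vector would add options that some items never offered. One possible extension is sketched below. It assumes a small lookup table, here called scale_max (a hypothetical name, not part of the original data), that records the highest response option of each item. The data in d_toy are made up for illustration and only mimic the shape of d1_long.

```r
library(dplyr)
library(tidyr)

# Made-up summary in the same shape as d1_long (not the asker's data).
d_toy <- tibble(
  items   = c("likert_1", "likert_1", "likert_1", "likert_2", "likert_2"),
  Course  = c("A765", "A765", "B768", "A765", "A765"),
  val     = c(2L, 4L, 3L, 1L, 5L),
  n       = c(3L, 7L, 5L, 4L, 6L),
  percent = c(0.3, 0.7, 1, 0.4, 0.6)
)

# Hypothetical lookup table: highest response option per item (assumption).
scale_max <- tibble(
  items   = c("likert_1", "likert_2"),
  max_val = c(4L, 5L)
)

result <- d_toy %>%
  ungroup() %>%
  # Expand every observed item/course pair to the widest scale in use ...
  complete(nesting(items, Course),
           val = seq_len(max(scale_max$max_val)),
           fill = list(n = 0L, percent = 0)) %>%
  # ... then drop the options that do not exist for a given item.
  left_join(scale_max, by = "items") %>%
  filter(val <= max_val) %>%
  select(-max_val)

print(result)
```

Completing to the widest scale first and filtering afterwards keeps the join many-to-one (one lookup row per item). If the real scales do not start at 1, the lookup table would also need a minimum column and the filter adjusted accordingly.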