Dealing with nuanced 'Select All That Apply' question in R
df <- data.frame(
  ID = 1:6,
  response_strength = c(
    "Language (L) Attention (A)",
    "Movement Control (MC)",
    "Language (L) Getting Along with Others (G) Attention (A) Memory (M)",
    "Memory (M) Complex Thinking (C) Spatial Thinking (S)",
    "Memory (M) Spatial Thinking (S)",
    "Language (L) Attention (A)"
  ),
  response_challenge = c(
    "Movement Control (MC)",
    "Language (L) Attention (A)",
    "Complex Thinking (C)",
    "Attention (A)",
    "Getting Along with Others (G) Keeping Track of Time/Order",
    "Keeping Track of Time/Order Movement Control (MC)"
  )
)
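My goal is to convert this to long format and produce an output table showing the percentage of respondents who selected each response option, like the following. (Please note: the code below was created for illustration purposes only, so the percentages are not accurate.)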
df2 <- data.frame(
  survey_question = c("response_strength", "response_strength", "response_strength", "response_strength",
                      "response_strength", "response_strength", "response_strength",
                      "response_challenge", "response_challenge", "response_challenge", "response_challenge",
                      "response_challenge", "response_challenge", "response_challenge"),
  response = c("Movement Control (MC)", "Language (L)", "Attention (A)", "Getting Along with Others (G)",
               "Complex Thinking (C)", "Spatial Thinking (S)", "Keeping Track of Time/Order",
               "Movement Control (MC)", "Language (L)", "Attention (A)", "Getting Along with Others (G)",
               "Complex Thinking (C)", "Spatial Thinking (S)", "Keeping Track of Time/Order"),
  n = c(1, 2, 4, 5, 3, 1, 2, 1, 2, 4, 5, 3, 1, 2),
  percent = c(.33, .67, 1.0, .33, .67, 1.0, .33, .67, 1.0, .33, .67, 1.0, .33, .67)
)
      survey_question                      response n percent
1   response_strength         Movement Control (MC) 1    0.33
2   response_strength                  Language (L) 2    0.67
3   response_strength                 Attention (A) 4    1.00
4   response_strength Getting Along with Others (G) 5    0.33
5   response_strength          Complex Thinking (C) 3    0.67
6   response_strength          Spatial Thinking (S) 1    1.00
7   response_strength   Keeping Track of Time/Order 2    0.33
8  response_challenge         Movement Control (MC) 1    0.67
9  response_challenge                  Language (L) 2    1.00
10 response_challenge                 Attention (A) 4    0.33
11 response_challenge Getting Along with Others (G) 5    0.67
12 response_challenge          Complex Thinking (C) 3    1.00
13 response_challenge          Spatial Thinking (S) 1    0.33
14 response_challenge   Keeping Track of Time/Order 2    0.67
# load packages (tidyr, dplyr and stringr are all attached by the tidyverse)
library(tidyverse)

# first, we make the response_strength and response_challenge columns longer,
# creating a "question" and a "response" column
# str_remove() drops the "response_" prefix, which serves no purpose
df |>
  pivot_longer(-ID, names_to = "question", values_to = "response",
               names_transform = \(x) str_remove(x, "response_")) |>
  # next we split the values on a space, looking behind for a closing bracket or the word "Order"
  # since this isn't your real data, you may have to edit the pattern to make it work with the real responses
  mutate(response = str_split(response, "(?<=\\)|Order) ")) |>
  # turn each response into its own row
  unnest_longer(response) |>
  # create the n and percent columns: within each ID and question, n is the number of
  # options that respondent selected and percent is each option's share of them (1 / n)
  mutate(n = n(), percent = n / sum(n), .by = c(ID, question))
# A tibble: 23 × 5
      ID question  response                          n percent
   <int> <chr>     <chr>                         <int>   <dbl>
 1     1 strength  Language (L)                      2    0.5
 2     1 strength  Attention (A)                     2    0.5
 3     1 challenge Movement Control (MC)             1    1
 4     2 strength  Movement Control (MC)             1    1
 5     2 challenge Language (L)                      2    0.5
 6     2 challenge Attention (A)                     2    0.5
 7     3 strength  Language (L)                      4    0.25
 8     3 strength  Getting Along with Others (G)     4    0.25
 9     3 strength  Attention (A)                     4    0.25
10     3 strength  Memory (M)                        4    0.25
# ℹ 13 more rows
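The table above counts selections within each respondent. If what you want is the table described in the question (one row per question and option, with the share of respondents who picked it), a possible sketch is below. It is not part of the original answer's code: the names `n_respondents` and `summary_tbl` are made up for illustration, and it assumes every respondent was shown both questions, so the denominator is the number of distinct IDs. The data frame is repeated so the block runs on its own.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Example data from the question
df <- data.frame(
  ID = 1:6,
  response_strength = c(
    "Language (L) Attention (A)",
    "Movement Control (MC)",
    "Language (L) Getting Along with Others (G) Attention (A) Memory (M)",
    "Memory (M) Complex Thinking (C) Spatial Thinking (S)",
    "Memory (M) Spatial Thinking (S)",
    "Language (L) Attention (A)"
  ),
  response_challenge = c(
    "Movement Control (MC)",
    "Language (L) Attention (A)",
    "Complex Thinking (C)",
    "Attention (A)",
    "Getting Along with Others (G) Keeping Track of Time/Order",
    "Keeping Track of Time/Order Movement Control (MC)"
  )
)

# Assumption: every respondent saw both questions, so the denominator
# for each percentage is the number of distinct respondents
n_respondents <- n_distinct(df$ID)

summary_tbl <- df |>
  pivot_longer(-ID, names_to = "survey_question", values_to = "response") |>
  # same split as above: a space preceded by ")" or by the word "Order"
  mutate(response = str_split(response, "(?<=\\)|Order) ")) |>
  unnest_longer(response) |>
  # guard against the same option appearing twice for one respondent
  distinct(ID, survey_question, response) |>
  # one row per question/option, with the number of respondents who chose it
  count(survey_question, response, name = "n") |>
  mutate(percent = n / n_respondents)

summary_tbl
```

With the example data, "Language (L)" under `response_strength` was chosen by respondents 1, 3 and 6, so its row has `n = 3` and `percent = 0.5`. Options that nobody selected for a question do not appear at all; if you need them listed with a count of zero, `tidyr::complete()` would be one way to add them.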
For more information on lookbehinds, you can read about them here.
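As a small illustration of what the lookbehind does, the sketch below splits one made-up string (the name `x` and its contents are just for demonstration, built from options in the example data). The pattern matches a single space, but only when that space comes right after a closing bracket or the word "Order"; the lookbehind itself is not consumed, so the bracket and the word stay attached to the option they belong to.

```r
library(stringr)

# Hypothetical response combining three options from the example data
x <- "Getting Along with Others (G) Keeping Track of Time/Order Movement Control (MC)"

# (?<=...) is a lookbehind: it checks what precedes the space without consuming it
parts <- str_split(x, "(?<=\\)|Order) ")[[1]]
parts
# [1] "Getting Along with Others (G)" "Keeping Track of Time/Order"
# [3] "Movement Control (MC)"
```

Spaces inside an option, such as the one between "Getting" and "Along", are left alone because they are not preceded by `)` or `Order`. If your real data contains other options without a bracketed code, you would likely need to add their final word to the alternation in the same way.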