Asked by: Catherine · Asked: 11/1/2023 · Updated: 11/1/2023 · Views: 20
Referencing a function argument as a column name in pivot_longer
Q:
I'm trying to write a function that uses pivot_longer, and I want to use my function argument as the value passed to the names_to argument of pivot_longer.
record <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
x214532 <- c("shirts, shoes",
"shoes, purses, hats",
"shirts, shoes, hats, heavy machinery",
"sponges, shoes",
"hats, heavy machinery",
"",
"heavy machinery, purses, shirts",
"heavy machinery, shoes, sponges",
"sponges",
"shoes")
screening_data_responses_char <- data.frame(record, x214532)
record x214532
1 1 shirts, shoes
2 2 shoes, purses, hats
3 3 shirts, shoes, hats, heavy machinery
4 4 sponges, shoes
5 5 hats, heavy machinery
6 6
7 7 heavy machinery, purses, shirts
8 8 heavy machinery, shoes, sponges
9 9 sponges
10 10 shoes
Ultimately, I'm trying to de-concatenate column x214532 and create a long dataset with one row per item listed in that column, like this:
record x214532
1 1 shirts
2 1 shoes
3 2 shoes
4 2 purses
5 2 hats
6 3 shirts
7 3 shoes
8 3 hats
9 3 heavy machinery
10 4 sponges
11 4 shoes
12 5 hats
13 5 heavy machinery
14 6
15 7 heavy machinery
16 7 purses
17 7 shirts
18 8 heavy machinery
19 8 shoes
20 8 sponges
21 9 sponges
22 10 shoes
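I want the column containing the data to still be called x214532, but I can't work out how to pass that through names_to in pivot_longer. Here's what I've got: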
remove_col_prefix <- function(x) {
  pattern <- "^[^_]+_"
  stringr::str_remove(x, pattern)
}
deconcatenate <- function(questionID) {
  screening_data_responses_questionID <- cSplit_e(data=screening_data_responses_char,split.col=questionID,sep=",",type="character")
  screening_data_responses_questionID <- screening_data_responses_questionID %>%
    select(-questionID) %>%
    pivot_longer(cols=c(starts_with(questionID)),
                 names_to="questionID",
                 values_to="questionID_resp") %>%
    drop_na(questionID_resp) %>%
    select(-questionID_resp) %>%
    mutate(questionID=remove_col_prefix(questionID)) %>%
    select(c(deid_pat_id, questionID))
  screening_data_responses_char <- screening_data_responses_char %>%
    select(-questionID)
  screening_data_responses_char <-merge(screening_data_responses_char,screening_data_responses_questionID,by="deid_pat_id",all=TRUE)
}
screening_data_responses_char <- deconcatenate(questionID="x214532")
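Things I've tried:
- {{ }} and the !! operator (using a function argument as a column name)
- enquo
- the cement function from (https://adv-r.hadley.nz/quasiquotation.html)
- deparse(substitute(x))
Things I've gotten:
- The column disappears from the output entirely
- The column is called questionID instead of x214532
- The column is named questionID, and all of the text becomes x214532
I'm sure I'm doing something wrong, or maybe I'm doing it right in pivot_longer but need further syntax changes elsewhere, but I can't quite figure it out. Any help would be greatly appreciated!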
A:
1 upvote
jpsmith
11/1/2023
#1
A simpler approach may be to use tidyr::separate_longer_delim() rather than a complicated pivot_longer() pipeline:
tidyr::separate_longer_delim(data = screening_data_responses_char,
cols = x214532,
delim = ",")
Output:
record x214532
1 1 shirts
2 1 shoes
3 2 shoes
4 2 purses
5 2 hats
6 3 shirts
7 3 shoes
8 3 hats
9 3 heavy machinery
10 4 sponges
11 4 shoes
12 5 hats
13 5 heavy machinery
14 6
15 7 heavy machinery
16 7 purses
17 7 shirts
18 8 heavy machinery
19 8 shoes
20 8 sponges
21 9 sponges
22 10 shoes
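On the original names_to question: names_to takes an ordinary character vector rather than a tidy-evaluated column reference, so {{ }}, !! and enquo should not be needed there. Passing the argument unquoted (names_to = questionID rather than names_to = "questionID") should use the string it holds. A minimal sketch, assuming a hypothetical already-split table `wide` and a hypothetical helper `to_long` (neither comes from the question):

```r
library(tidyr)

# Hypothetical wide table standing in for the output of the splitting step
wide <- data.frame(
  record = c(1, 2),
  x214532_shirts = c("shirts", NA),
  x214532_shoes = c("shoes", "shoes")
)

# Hypothetical helper: `questionID` is a string such as "x214532"
to_long <- function(df, questionID) {
  pivot_longer(
    df,
    cols = starts_with(questionID),  # the helper evaluates the variable's value
    names_to = questionID,           # unquoted: the new column is named "x214532"
    values_to = "resp",
    values_drop_na = TRUE            # drop NA responses while pivoting
  )
}

long <- to_long(wide, "x214532")
names(long)
```

Writing names_to = "questionID" with quotes creates a column literally named questionID, which matches one of the symptoms listed in the question.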
Comments
1 upvote
Catherine
11/1/2023
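Well, this is just perfect! Glad someone smarter than me has already made it a function :) I'm still curious about the answer to my question in case this function didn't exist, but this definitely gets the job done!
0 upvotes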
Catherine
11/1/2023
When I try to add the keep_empty argument, I hit a "'...' used in an incorrect context" error. Any ideas? `screening_data_responses_char <- screning_data_responses_char %>% separate_longer_delim(data = screening_data_responses_char, cols = x214532, delim = ",", ..., keep_empty=TRUE)`
0 upvotes
jpsmith
11/1/2023
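I'm not sure, but I don't think you need it here. If you need it in your real data, you may want to ask a new question (and provide the data needed). Also, FYI, if you pipe the dataset into the function (i.e., `screening_data_responses_cha %>% separate_longer_delim(...)`), you don't need the separate `data = screening_data_responses_cha`. Good luck!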
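A minimal sketch of the piped form described above, assuming a three-row subset of the question's data and a ", " delimiter (my assumption, so the pieces carry no leading space). The literal `...` in a call is the likely source of the error. As far as I know, keep_empty is documented for separate_longer_position() rather than separate_longer_delim(), so it is left out here:

```r
library(tidyr)
library(magrittr)  # provides %>%

# Subset of the question's example data
screening_data_responses_char <- data.frame(
  record = c(1, 2, 3),
  x214532 = c("shirts, shoes", "shoes, purses, hats", "sponges")
)

# The pipe supplies the data frame as the first argument, so no `data =`
# is needed, and no literal `...` appears in the call
long <- screening_data_responses_char %>%
  separate_longer_delim(cols = x214532, delim = ", ")

long
```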