Asked by: Marius  Asked: 11/24/2017  Updated: 7/5/2019  Views: 318
Passing column names through multiple functions with dplyr
Q:
I have written a simple function to create tables of percentages with dplyr:
library(dplyr)
library(tidyr)  # provides spread(), used further down

df = tibble(
  Gender = sample(c("Male", "Female"), 100, replace = TRUE),
  FavColour = sample(c("Red", "Blue"), 100, replace = TRUE)
)

quick_pct_tab = function(df, col) {
  col_quo = enquo(col)  # capture the bare column name passed by the caller
  df %>%
    count(!! col_quo) %>%
    mutate(Percent = (100 * n / sum(n)))
}
df %>% quick_pct_tab(FavColour)
# Output:
# A tibble: 2 x 3
#   FavColour     n Percent
#   <chr>     <int>   <dbl>
# 1 Blue         58      58
# 2 Red          42      42
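This works great. However, when I tried to build on top of it by writing a new function that calculates the same percentages within a grouping, I could not figure out how to use quick_pct_tab inside the new function, even after trying several combinations of quo(col), !! quo(col), enquo(col), etc.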
bygender_tab = function(df, col) {
  col_enquo = enquo(col)
  # Want to replace this with
  # df %>% quick_pct_tab(col)
  gender_tab = df %>%
    group_by(Gender) %>%
    count(!! col_enquo) %>%
    mutate(Percent = (100 * n / sum(n)))
  gender_tab %>%
    select(!! col_enquo, Gender, Percent) %>%
    spread(Gender, Percent)
}
df %>% bygender_tab(FavColour)
# A tibble: 2 x 3
#   FavColour   Female     Male
# * <chr>        <dbl>    <dbl>
# 1 Blue      52.08333 63.46154
# 2 Red       47.91667 36.53846
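As I understand it, the older style of non-standard evaluation in dplyr is deprecated, so it would be great to learn how to do this the dplyr > 0.7 way. How do I have to quote the col argument so that it can be passed through to a further dplyr function?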
A:
2 votes
akrun
11/24/2017
#1
We need to use !! to trigger evaluation of 'col_enquo', i.e. to unquote it, when passing it on to quick_pct_tab:
bygender_tab = function(df, col) {
  col_enquo = enquo(col)
  df %>%
    group_by(Gender) %>%
    quick_pct_tab(!!col_enquo) %>%  ## change
    select(!! col_enquo, Gender, Percent) %>%
    spread(Gender, Percent)
}
df %>%
bygender_tab(FavColour)
# A tibble: 2 x 3
# FavColour Female Male
#* <chr> <dbl> <dbl>
#1 Blue 54.54545 41.07143
#2 Red 45.45455 58.92857
Using the OP's function, the output is
# A tibble: 2 x 3
# FavColour Female Male
#* <chr> <dbl> <dbl>
#1 Blue 54.54545 41.07143
#2 Red 45.45455 58.92857
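Note that no seed was set when the dataset was created, so the sampled values, and therefore the percentages, differ from run to run.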
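Since the data are sampled without a seed, the tables above cannot be reproduced exactly. The following is a minimal sketch, assuming an arbitrary seed of 42 (not from the original post), that fixes the data and checks that the refactored wrapper returns the same table as the OP's hand-written version. The name bygender_tab_manual is hypothetical and only used here to keep the two versions apart.

```r
library(dplyr)
library(tidyr)  # spread()

set.seed(42)  # arbitrary seed, an assumption made only for reproducibility
df <- tibble(
  Gender = sample(c("Male", "Female"), 100, replace = TRUE),
  FavColour = sample(c("Red", "Blue"), 100, replace = TRUE)
)

quick_pct_tab <- function(df, col) {
  col_quo <- enquo(col)
  df %>%
    count(!!col_quo) %>%
    mutate(Percent = 100 * n / sum(n))
}

# OP's original version (hypothetical name): counting is written inline
bygender_tab_manual <- function(df, col) {
  col_enquo <- enquo(col)
  df %>%
    group_by(Gender) %>%
    count(!!col_enquo) %>%
    mutate(Percent = 100 * n / sum(n)) %>%
    select(!!col_enquo, Gender, Percent) %>%
    spread(Gender, Percent)
}

# Refactored version: the captured column is unquoted when passed on,
# so quick_pct_tab() sees FavColour rather than the symbol col_enquo
bygender_tab <- function(df, col) {
  col_enquo <- enquo(col)
  df %>%
    group_by(Gender) %>%
    quick_pct_tab(!!col_enquo) %>%
    select(!!col_enquo, Gender, Percent) %>%
    spread(Gender, Percent)
}

manual <- df %>% bygender_tab_manual(FavColour)
refactored <- df %>% bygender_tab(FavColour)
all.equal(manual, refactored)
```

With the same seed, both calls operate on identical data, so any difference between the two tables would point to a problem in the refactoring rather than to sampling noise.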
Update
With rlang version 0.4.0 (run with dplyr 0.8.2), we can also use {{...}} to do the quoting, unquoting and substitution in one step:
bygender_tabN = function(df, col) {
  df %>%
    group_by(Gender) %>%
    quick_pct_tab({{col}}) %>%  ## change
    select({{col}}, Gender, Percent) %>%
    spread(Gender, Percent)
}
df %>%
bygender_tabN(FavColour)
# A tibble: 2 x 3
# FavColour Female Male
# <chr> <dbl> <dbl>
#1 Blue 50 46.3
#2 Red 50 53.7
- checking the output with the previous function (set.seed was not provided)
df %>%
bygender_tab(FavColour)
# A tibble: 2 x 3
# FavColour Female Male
# <chr> <dbl> <dbl>
#1 Blue 50 46.3
#2 Red 50 53.7
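In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). Below is a minimal sketch of the same wrapper written with {{ }} throughout and pivot_wider() for the reshaping step, assuming rlang >= 0.4.0 and tidyr >= 1.0.0. The name bygender_tab_wide and the seed are assumptions for illustration, not part of the original answer.

```r
library(dplyr)
library(tidyr)  # pivot_wider(), assumes tidyr >= 1.0.0

set.seed(1)  # arbitrary seed, an assumption for reproducibility
df <- tibble(
  Gender = sample(c("Male", "Female"), 100, replace = TRUE),
  FavColour = sample(c("Red", "Blue"), 100, replace = TRUE)
)

# {{ col }} captures and unquotes the argument in a single step
quick_pct_tab <- function(df, col) {
  df %>%
    count({{ col }}) %>%
    mutate(Percent = 100 * n / sum(n))
}

# Hypothetical variant of bygender_tabN using pivot_wider()
bygender_tab_wide <- function(df, col) {
  df %>%
    group_by(Gender) %>%
    quick_pct_tab({{ col }}) %>%
    ungroup() %>%  # drop the grouping before reshaping
    select({{ col }}, Gender, Percent) %>%
    pivot_wider(names_from = Gender, values_from = Percent)
}

res <- df %>% bygender_tab_wide(FavColour)
res
```

The explicit ungroup() is a design choice: it makes clear that Gender is only needed as a grouping variable while the percentages are computed, not during the reshape.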
Comments
df %>% group_by(Gender) %>% quick_pct_tab(get(col)) inside the function seems to work, but I'm not sure whether it gives the desired output.
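get() looks an object up by name, so the form in this comment presumably expects col to hold the column name as a string. For that case, here is a minimal sketch that converts the string to a symbol with rlang::sym() and unquotes it with !!. The helper quick_pct_tab_chr is hypothetical and does not appear in the thread, and the seed is arbitrary.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, an assumption for reproducibility
df <- tibble(
  Gender = sample(c("Male", "Female"), 100, replace = TRUE),
  FavColour = sample(c("Red", "Blue"), 100, replace = TRUE)
)

# Hypothetical string-based variant: col is a character column name
quick_pct_tab_chr <- function(df, col) {
  col_sym <- rlang::sym(col)  # "FavColour" -> symbol FavColour
  df %>%
    count(!!col_sym) %>%
    mutate(Percent = 100 * n / sum(n))
}

res_chr <- df %>%
  group_by(Gender) %>%
  quick_pct_tab_chr("FavColour")
res_chr
```

Because the symbol is unquoted, the resulting column keeps the name FavColour, which lets later steps such as select() or spread() refer to it as usual.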