Asked by: Damien Dotta · Asked: 11/11/2023 · Last edited by: Damien Dotta · Updated: 11/11/2023 · Views: 51
How can I aggregate the data on the same line?
Q:
After pivot_wider(), I get the following data.frame.
How can I aggregate the data onto a single row?
CODE_C CODE_P LIB_COMPOSANT LIB_PRODUIT `2020-01-01` `2020-02-01` `2020-03-01` `2020-04-01` `2020-05-01` `2020-06-01`
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 FABR**01 FABR** Abricot, 82 Abricot 1.32 NA NA NA NA NA
2 FABR**01 FABR** Abricot, 82 Abricot NA 1.10 NA NA NA NA
3 FABR**01 FABR** Abricot, 82 Abricot NA NA 3.33 NA NA NA
4 FABR**01 FABR** Abricot, 82 Abricot NA NA NA 4.71 NA NA
5 FABR**01 FABR** Abricot, 82 Abricot NA NA NA NA 4.38 NA
6 FABR**01 FABR** Abricot, 82 Abricot NA NA NA NA NA 3.25
To reproduce the data frame:
structure(list(CODE_C = c("FABR**01", "FABR**01", "FABR**01",
"FABR**01", "FABR**01", "FABR**01"), CODE_P = c("FABR**", "FABR**",
"FABR**", "FABR**", "FABR**", "FABR**"), LIB_COMPOSANT = c("Abricot, 82",
"Abricot, 82", "Abricot, 82", "Abricot, 82", "Abricot, 82", "Abricot, 82"
), LIB_PRODUIT = c("Abricot", "Abricot", "Abricot", "Abricot",
"Abricot", "Abricot"), `2020-01-01` = c(1.32446153846154, NA,
NA, NA, NA, NA), `2020-02-01` = c(NA, 1.09984615384615, NA, NA,
NA, NA), `2020-03-01` = c(NA, NA, 3.33157894736842, NA, NA, NA
), `2020-04-01` = c(NA, NA, NA, 4.70916279069767, NA, NA), `2020-05-01` = c(NA,
NA, NA, NA, 4.37848648648649, NA), `2020-06-01` = c(NA, NA, NA,
NA, NA, 3.24713953488372)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
The initial pivot_wider() call looks like this:
pivot_wider(DONNEES_COMPOSANT,
names_from = date,
values_from = PRIX)
Expected output:
CODE_C CODE_P LIB_COMPOSANT LIB_PRODUIT `2020-01-01` `2020-02-01` `2020-03-01` `2020-04-01` `2020-05-01` `2020-06-01`
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 FABR**01 FABR** Abricot, 82 Abricot 1.32 1.10 3.33 4.71 4.38 3.25
A:
2 upvotes
r2evans
11/11/2023
#1
I'll infer that your input data is
DONNEES_COMPOSANT <- structure(list(CODE_C = c("FABR**01", "FABR**01", "FABR**01", "FABR**01", "FABR**01", "FABR**01"), CODE_P = c("FABR**", "FABR**", "FABR**", "FABR**", "FABR**", "FABR**"), LIB_COMPOSANT = c("Abricot, 82", "Abricot, 82", "Abricot, 82", "Abricot, 82", "Abricot, 82", "Abricot, 82"), LIB_PRODUIT = c("Abricot", "Abricot", "Abricot", "Abricot", "Abricot", "Abricot"), date = c("2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01", "2020-05-01", "2020-06-01"), PRIX = c(1.32446153846154, 1.09984615384615, 3.33157894736842, 4.70916279069767, 4.37848648648649, 3.24713953488372)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -6L))
DONNEES_COMPOSANT
# # A tibble: 6 × 6
# CODE_C CODE_P LIB_COMPOSANT LIB_PRODUIT date PRIX
# <chr> <chr> <chr> <chr> <chr> <dbl>
# 1 FABR**01 FABR** Abricot, 82 Abricot 2020-01-01 1.32
# 2 FABR**01 FABR** Abricot, 82 Abricot 2020-02-01 1.10
# 3 FABR**01 FABR** Abricot, 82 Abricot 2020-03-01 3.33
# 4 FABR**01 FABR** Abricot, 82 Abricot 2020-04-01 4.71
# 5 FABR**01 FABR** Abricot, 82 Abricot 2020-05-01 4.38
# 6 FABR**01 FABR** Abricot, 82 Abricot 2020-06-01 3.25
To get what you want, we need to specify your first four columns as id_cols:
pivot_wider(DONNEES_COMPOSANT, id_cols = c(CODE_C, CODE_P, LIB_COMPOSANT, LIB_PRODUIT), names_from = "date", values_from = "PRIX")
# # A tibble: 1 × 10
# CODE_C CODE_P LIB_COMPOSANT LIB_PRODUIT `2020-01-01` `2020-02-01` `2020-03-01` `2020-04-01` `2020-05-01` `2020-06-01`
# <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 FABR**01 FABR** Abricot, 82 Abricot 1.32 1.10 3.33 4.71 4.38 3.25
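Why id_cols matters here: by default, pivot_wider() treats every column not named in names_from or values_from as an identifier. The staircase of NA in the question therefore suggests that the real data has at least one more column that differs from row to row. That is an assumption, since the original data was not shown. The sketch below uses a hypothetical column, ID_LIGNE, to reproduce the effect on a shortened three-row sample with rounded prices, then shows the id_cols fix and a fallback for a table that has already been widened.

```r
library(dplyr)
library(tidyr)

# Hypothetical long data: the question's columns plus an ASSUMED extra
# column, ID_LIGNE, that takes a different value on every row.
long <- tibble(
  CODE_C        = "FABR**01",
  CODE_P        = "FABR**",
  LIB_COMPOSANT = "Abricot, 82",
  LIB_PRODUIT   = "Abricot",
  ID_LIGNE      = 1:3,
  date          = c("2020-01-01", "2020-02-01", "2020-03-01"),
  PRIX          = c(1.32, 1.10, 3.33)
)

# Default id_cols: every column not used in names_from/values_from.
# ID_LIGNE keeps the rows apart, so each price lands on its own row.
staircase <- pivot_wider(long, names_from = date, values_from = PRIX)

# Explicit id_cols: ID_LIGNE is dropped and the prices share one row.
one_row <- pivot_wider(
  long,
  id_cols     = c(CODE_C, CODE_P, LIB_COMPOSANT, LIB_PRODUIT),
  names_from  = date,
  values_from = PRIX
)

# Fallback for a table that is already wide: within each group of
# identifying columns, keep the first non-NA value of each date column.
collapsed <- staircase %>%
  select(-ID_LIGNE) %>%
  group_by(CODE_C, CODE_P, LIB_COMPOSANT, LIB_PRODUIT) %>%
  summarise(across(everything(), ~ .x[!is.na(.x)][1]), .groups = "drop")
```

The fallback assumes each date column holds at most one non-NA value per group. If several values can occur, an explicit aggregate such as mean(.x, na.rm = TRUE) would be a safer choice than taking the first one.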
Comments
1 upvote
RKeithL
11/11/2023
Ah, OK @r2evans! I was trying to figure this out, but specifying those 4 columns as id_cols did not occur to me. Good chops, sir!