Asked by: tchoup Asked: 5/17/2022 Last edited by: Maël, tchoup Updated: 5/17/2022 Views: 434
trying to group_by and then summarize max and min - running into error for unambiguous format
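Q:
I have duplicate information for addresses in Kingwood and Humble. I'm trying to combine these entries, keeping the minimum first reported date and the maximum last reported date, using the following code: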
df <- df %>% group_by(id, street) %>%
summarise(firstReportedDate = min(as.Date(firstReportedDate))) %>%
summarise(lastReportedDate = max(as.Date(lastReportedDate)))
However, for some reason, id == 1000 gives me an error:
Error: Problem with `summarise()` column `firstReportedDate`.
i `firstReportedDate = min(as.Date(firstReportedDate))`.
x character string is not in a standard unambiguous format
i The error occurred in group 3: id = "1000", street = "Po Box 203"
Can anyone help me understand this error? A sample of the data is below:
dput(df)
structure(list(street = c("2200 Lake Village Dr", "1040 Marina Dr",
"2200 Lake Village Dr", "1040 Marina Dr", "22302 Rustic Bridge Ln",
"22302 Rustic Bridge Ln", "1060 Marina Dr", "3211 Laurel Point Ct",
"Po Box 203", "19703 Highway 59 N", "6714 Dorylee Ln", "3511 Forest Row Dr",
"3511 Forest Row Dr", "Acorn Ln"), city = c("Kingwood", "Humble",
"Kingwood", "Kingwood", "Kingwood", "Humble", "Humble", "Kingwood",
"Humble", "Humble", "Humble", "Kingwood", "Humble", "Humble"),
state = c("TX", "TX", "TX", "TX", "TX", "TX", "TX", "TX",
"TX", "TX", "TX", "TX", "TX", "TX"), zip = c("77339", "77339",
"77339", "77339", "77339", "77339", "77339", "77339", "77347",
"77338", "77396", "77345", "77345", "77345"), firstReportedDate = c("5/25/2019",
"1/1/2015", "9/30/2017", "11/30/2015", "10/18/2017", "6/15/2017",
"9/30/2009", "10/12/2002", "9/22/2017", "1/1/2009", "3/5/2004",
"4/8/2012", "9/30/2009", "1/1/2009"), lastReportedDate = c("4/1/2022",
"1/1/2021", "9/30/2017", "11/30/2015", "4/1/2022", "6/15/2018",
"9/30/2009", "3/3/2004", "4/1/2022", "1/1/2011", "3/5/2004",
"4/1/2022", "9/30/2009", "1/1/2013"), id = c("357", "357",
"357", "357", "359", "359", "359", "359", "1000", "1000",
"1000", "1431", "1431", "1431")), row.names = c(NA, -14L), class = c("tbl_df",
"tbl", "data.frame"))
A:
3 upvotes
Maël
5/17/2022
#1
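Put everything in the same summarise call. Also, since your data is not in an international date format, you should specify the date format in the format argument of as.Date.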
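library(dplyr)

df %>%
  # parse every column whose name ends in "Date" as month/day/year
  mutate(across(ends_with("Date"), ~ as.Date(.x, format = "%m/%d/%Y"))) %>%
  group_by(id, street) %>%
  # compute both summaries in a single summarise() call
  summarise(firstReportedDate = min(firstReportedDate),
            lastReportedDate = max(lastReportedDate))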
Output
# A tibble: 10 × 4
# Groups:   id [4]
   id    street                 firstReportedDate lastReportedDate
   <chr> <chr>                  <date>            <date>
 1 1000  19703 Highway 59 N     2009-01-01        2011-01-01
 2 1000  6714 Dorylee Ln        2004-03-05        2004-03-05
 3 1000  Po Box 203             2017-09-22        2022-04-01
 4 1431  3511 Forest Row Dr     2009-09-30        2022-04-01
 5 1431  Acorn Ln               2009-01-01        2013-01-01
 6 357   1040 Marina Dr         2015-01-01        2021-01-01
 7 357   2200 Lake Village Dr   2017-09-30        2022-04-01
 8 359   1060 Marina Dr         2009-09-30        2009-09-30
 9 359   22302 Rustic Bridge Ln 2017-06-15        2022-04-01
10 359   3211 Laurel Point Ct   2002-10-12        2004-03-03
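As a possible explanation of why the error only shows up at group 3: when no format is given, as.Date() tries "%Y-%m-%d" and then "%Y/%m/%d" on the string. A value such as "1/1/2009" happens to fit year/month/day, so it parses without complaint (to an unintended date), while "9/22/2017" would need a month 22 and therefore fails. The sketch below uses two values from the question's data. The variable names are only illustrative.

```r
# With an explicit format, the month/day/year string parses as intended.
ok <- as.Date("9/22/2017", format = "%m/%d/%Y")

# Without a format, "9/22/2017" cannot be read as year/month/day
# (there is no month 22), so as.Date() raises an error.
err <- tryCatch(as.Date("9/22/2017"), error = function(e) conditionMessage(e))

# "1/1/2009" does fit year/month/day, so it parses silently,
# but not to the date that was meant.
wrong <- as.Date("1/1/2009")
right <- as.Date("1/1/2009", format = "%m/%d/%Y")

print(ok)
print(err)
print(wrong == right)
```

This suggests the earlier groups were not parsed correctly either; they simply did not raise an error, which is one more reason to always pass format for this kind of data.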
Comments
as.Date()
as.Date(firstReportedDate, '%m/%d/%Y')
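The as.Date(firstReportedDate, '%m/%d/%Y') form from the comment can also be used directly inside a single summarise() call. This is a minimal sketch on a small subset of the question's data, not the answerer's exact code. Chaining two summarise() calls, as in the question, would also fail for a second reason: the first call returns only the grouping columns plus firstReportedDate, so lastReportedDate no longer exists when the second call runs.

```r
library(dplyr)

# Small subset of the question's data, rebuilt here so the example is self-contained
df <- tibble(
  id = c("357", "357", "1000"),
  street = c("2200 Lake Village Dr", "2200 Lake Village Dr", "Po Box 203"),
  firstReportedDate = c("5/25/2019", "9/30/2017", "9/22/2017"),
  lastReportedDate = c("4/1/2022", "9/30/2017", "4/1/2022")
)

res <- df %>%
  group_by(id, street) %>%
  # both summaries in one call, converting with an explicit format
  summarise(
    firstReportedDate = min(as.Date(firstReportedDate, format = "%m/%d/%Y")),
    lastReportedDate  = max(as.Date(lastReportedDate, format = "%m/%d/%Y")),
    .groups = "drop"
  )

print(res)
```

Here .groups = "drop" is an optional choice that returns an ungrouped result; leaving it out keeps the grouping by id, as in the answer's output.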