Asked by: RPV  Asked: 5/16/2023  Updated: 5/16/2023  Views: 55
Data download problems in the R-package gtrendsR - error message for certain time periods
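Q:
I am trying to download data from Google Trends with gtrendsR for analysis. My keyword is the German word "Nachrichten", which corresponds to the English word "news". This already works quite well, but unfortunately there are some problems with the download.
I defined a series of three-month periods, and daily data should be downloaded for each of them. I also want to save each period separately as a .csv file. In addition, all the data should be combined and saved in one large file (here: trends_news).
The problem is that there are always individual three-month periods that fail to download, with the error message "Error message: NA/NaN argument". The error does not systematically hit the same period either: it changes if you use a different search term (for example "weather" instead of "news"). For my analysis, however, I need all the data between 2004-01-01 and 2012-06-30.
Here is my code: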
library(gtrendsR)
library(data.table)

# define time periods for the download
time = c("2004-01-01 2004-03-31", "2004-04-01 2004-06-30", "2004-07-01 2004-09-30", "2004-10-01 2004-12-31",
         "2005-01-01 2005-03-31", "2005-04-01 2005-06-30", "2005-07-01 2005-09-30", "2005-10-01 2005-12-31",
         "2006-01-01 2006-03-31", "2006-04-01 2006-06-30", "2006-07-01 2006-09-30", "2006-10-01 2006-12-31",
         "2007-01-01 2007-03-31", "2007-04-01 2007-06-30", "2007-07-01 2007-09-30", "2007-10-01 2007-12-31",
         "2008-01-01 2008-03-31", "2008-04-01 2008-06-30", "2008-07-01 2008-09-30", "2008-10-01 2008-12-31",
         "2009-01-01 2009-03-31", "2009-04-01 2009-06-30", "2009-07-01 2009-09-30", "2009-10-01 2009-12-31",
         "2010-01-01 2010-03-31", "2010-04-01 2010-06-30", "2010-07-01 2010-09-30", "2010-10-01 2010-12-31",
         "2011-01-01 2011-03-31", "2011-04-01 2011-06-30", "2011-07-01 2011-09-30", "2011-10-01 2011-12-31",
         "2012-01-01 2012-03-31", "2012-04-01 2012-06-30")
Sys.setenv(TZ = "Europe/Berlin") # Set the timezone to 'Europe/Berlin'

# download data for "Nachrichten"
trends_Nachrichten = data.table()
for (i in time) {
  tryCatch({
    trends <- gtrends(keyword = c("Nachrichten"),
                      time = i,
                      geo = "DE",
                      gprop = "web",
                      category = 0,
                      hl = "de-DE")
    trends_data <- as.data.frame(trends$interest_over_time)
    trends_data$date <- as.Date(trends_data$date)
    file_name = paste0("Nachrichten", i, ".csv")
    write.csv(trends_data,
              file = paste0('/Users/...', file_name),
              quote = TRUE,
              row.names = FALSE)
    trends_Nachrichten = rbind(trends_Nachrichten, trends_data)
  }, error = function(e) {
    cat("Error message:", conditionMessage(e), "\n")
  })
}
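As an aside, the quarterly ranges above do not have to be typed by hand. The following is a minimal sketch, assuming base R only, that builds the same 34 strings with `seq()`. The names `starts`, `ends`, and `time_generated` are illustrative and do not appear in the original code.

```r
# First day of each quarter from 2004-01-01 to 2012-04-01
starts <- seq(as.Date("2004-01-01"), as.Date("2012-04-01"), by = "3 months")

# Last day of each quarter: the day before the next quarter starts
ends <- seq(as.Date("2004-04-01"), as.Date("2012-07-01"), by = "3 months") - 1

# "start end" strings in the same form as the hand-written vector
time_generated <- paste(starts, ends)
```

The result could be used in place of `time` in the loop.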
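What could be the problem? Does anyone have a solution?
Thanks in advance!
I have already searched online for causes and solutions, but could not find anything that helped.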
A:
0 votes
Arthur Welle
5/16/2023
#1
Hmm, this looks weird... It does seem like it could be a package issue or an API parsing issue. The code fails for some specific date ranges.
# define time periods for the download
l_time <- c("2004-01-01 2004-03-31",
"2004-04-01 2004-06-30",
"2004-07-01 2004-09-30",
"2004-10-01 2004-12-31",
"2005-01-01 2005-03-31",
"2005-04-01 2005-06-30",
"2005-07-01 2005-09-30",
"2005-10-01 2005-12-31",
"2006-01-01 2006-03-31",
"2006-04-01 2006-06-30",
"2006-07-01 2006-09-30",
"2006-10-01 2006-12-31",
"2007-01-01 2007-03-31",
"2007-04-01 2007-06-30",
"2007-07-01 2007-09-30",
"2007-10-01 2007-12-31",
"2008-01-01 2008-03-31",
"2008-04-01 2008-06-30",
"2008-07-01 2008-09-30",
"2008-10-01 2008-12-31",
"2009-01-01 2009-03-31",
"2009-04-01 2009-06-30",
"2009-07-01 2009-09-30",
"2009-10-01 2009-12-31",
"2010-01-01 2010-03-31",
"2010-04-01 2010-06-30",
"2010-07-01 2010-09-30",
#"2010-10-01 2010-12-31",
"2011-01-01 2011-03-31",
"2011-04-01 2011-06-30",
"2011-07-01 2011-09-30",
"2011-10-01 2011-12-31",
"2012-01-01 2012-03-31",
"2012-04-01 2012-06-30")
I wrote a function that wraps the body of your for loop so I could test it.
f_gtrendsR <- function(v_time = "2004-01-01 2004-03-31",
                       v_keyword = "Nachrichten"){
  print(v_time)  # show which range is being requested
  trends <- gtrendsR::gtrends(keyword = v_keyword,
                              time = v_time,
                              geo = "DE",
                              gprop = "web",
                              category = 0,
                              hl = "de-DE")
  k <- data.table::as.data.table(trends$interest_over_time)
  k$date <- as.Date(k$date)
  return(k)
}
Then I call the function on every range with lapply and bind all the results together with do.call and rbind.
k <- do.call(rbind, lapply(l_time, f_gtrendsR, v_keyword = "Nachrichten"))
This works fine for me. But note that it only works without "2010-10-01 2010-12-31"!
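Rather than commenting out a failing range by hand, the failures can be recorded while the remaining ranges are still downloaded. The following is a minimal sketch, assuming base R only. `collect_ranges` and its `fetch` argument are hypothetical names; `fetch` stands for any function that takes one range string and returns a data frame, such as `f_gtrendsR` above.

```r
# Hypothetical helper (not part of gtrendsR): apply `fetch` to every range,
# keep the successful results, and record the ranges that raised an error.
collect_ranges <- function(ranges, fetch) {
  results <- lapply(ranges, function(r) {
    tryCatch(fetch(r), error = function(e) {
      message("Failed for ", r, ": ", conditionMessage(e))
      NULL  # mark this range as failed
    })
  })
  ok <- !vapply(results, is.null, logical(1))
  list(data = do.call(rbind, results[ok]),  # combined successful results
       failed = ranges[!ok])                # ranges to retry or inspect
}

# Possible usage with the function from this answer (needs network access):
# out <- collect_ranges(l_time, f_gtrendsR)
# out$failed
```

This way the full vector of ranges can be passed in, and `failed` shows which periods are affected for a given keyword.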
Strangely, if I split the problematic range, it also works fine:
k1 <- f_gtrendsR(v_time = "2010-10-01 2010-11-30", v_keyword = "Nachrichten")
k2 <- f_gtrendsR(v_time = "2010-11-01 2010-12-31", v_keyword = "Nachrichten")
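So this is not a proper answer, but I guess it gets you closer to one. You can always do the manual work and change the problematic range, but I am not happy with that approach.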
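The manual workaround could be automated along these lines. This is only a sketch under stated assumptions: `split_range` and `fetch_with_split` are hypothetical helpers, `fetch` is any function like `f_gtrendsR` above, and there is no guarantee that the two halves of a failing range will themselves succeed.

```r
# Hypothetical helper: split a "start end" range string at its midpoint
# into two non-overlapping sub-ranges.
split_range <- function(v_time) {
  d <- as.Date(strsplit(v_time, " ", fixed = TRUE)[[1]])
  mid <- d[1] + as.integer(d[2] - d[1]) %/% 2
  c(paste(d[1], mid), paste(mid + 1, d[2]))
}

# Hypothetical helper: try the full range first; if that raises an error,
# fetch the two halves instead and stack them.
fetch_with_split <- function(v_time, fetch) {
  tryCatch(fetch(v_time), error = function(e) {
    message("Splitting ", v_time, " after error: ", conditionMessage(e))
    do.call(rbind, lapply(split_range(v_time), fetch))
  })
}

# Possible usage with the function from this answer (needs network access):
# k <- do.call(rbind, lapply(l_time, fetch_with_split, fetch = f_gtrendsR))
```

Keep in mind that Google Trends scales each request relative to the peak within the requested window, so values from the two halves are not on the same scale as a single three-month request. The same caveat applies when stacking the quarterly downloads.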
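Comments
0 votes
RPV
5/16/2023
Hey, thank you very much! That is exactly the time range that does not work for me. I had also considered correcting it manually, but wanted to ask for help or an automatic solution first, because I am not happy with that approach either!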