Data download problems in the R-package gtrendsR - error message for certain time periods

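Asked by: RPV Asked: 5/16/2023 Updated: 5/16/2023 Views: 55

Q:

I am trying to use gtrendsR to download data from Google Trends for an analysis. My keyword is the German word "Nachrichten", which corresponds to the English word "news". It already works quite well, but unfortunately there are some problems when downloading the data.

I defined a series of three-month periods for which daily data should be downloaded. I also want to save each period separately as a .csv file. In addition, all the data should be merged and saved into one large file (here: trends_news).

The problem is that one of the three-month periods always fails to download, with the error message: "Error message: NA/NaN argument". The error does not systematically hit the same period either: it moves to a different period if you use another search term (e.g. weather instead of news). For my analysis, however, I need all the data between 2004-01-01 and 2012-06-30.

Here is my code: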

library(gtrendsR)
library(data.table)

# define time periods for the download
time = c("2004-01-01 2004-03-31", "2004-04-01 2004-06-30", "2004-07-01 2004-09-30", "2004-10-01 2004-12-31",
         "2005-01-01 2005-03-31", "2005-04-01 2005-06-30", "2005-07-01 2005-09-30", "2005-10-01 2005-12-31",
         "2006-01-01 2006-03-31", "2006-04-01 2006-06-30", "2006-07-01 2006-09-30", "2006-10-01 2006-12-31",
         "2007-01-01 2007-03-31", "2007-04-01 2007-06-30", "2007-07-01 2007-09-30", "2007-10-01 2007-12-31",
         "2008-01-01 2008-03-31", "2008-04-01 2008-06-30", "2008-07-01 2008-09-30", "2008-10-01 2008-12-31",
         "2009-01-01 2009-03-31", "2009-04-01 2009-06-30", "2009-07-01 2009-09-30", "2009-10-01 2009-12-31",
         "2010-01-01 2010-03-31", "2010-04-01 2010-06-30", "2010-07-01 2010-09-30", "2010-10-01 2010-12-31",
         "2011-01-01 2011-03-31", "2011-04-01 2011-06-30", "2011-07-01 2011-09-30", "2011-10-01 2011-12-31",
         "2012-01-01 2012-03-31", "2012-04-01 2012-06-30")
Sys.setenv(TZ = "Europe/Berlin")  # Set the timezone to 'Europe/Berlin'

#download data Nachrichten
trends_Nachrichten = data.table()

for (i in time) {
  
  tryCatch({
    trends <- gtrends(keyword = c("Nachrichten"), 
                      time = i,
                      geo = "DE",
                      gprop = "web",
                      category = 0,
                      hl = "de-DE")
    
    trends_data <- as.data.frame(trends$interest_over_time)
    trends_data$date <- as.Date(trends_data$date)
    
    file_name = paste0("Nachrichten", i, ".csv")
    write.csv(trends_data, 
              file = paste0('/Users/...', file_name), 
              quote = TRUE, 
              row.names = FALSE)
    
    trends_Nachrichten = rbind(trends_Nachrichten, trends_data)
  }, error = function(e) {
    cat("Error message:", conditionMessage(e), "\n")
  })
}

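What could be the problem? Does anyone have a solution?

Thanks in advance!

I have already searched online for causes and solutions, but I could not find anything that helped me.

r download error-message gtrendsr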


A:

0 votes Arthur Welle 5/16/2023 #1

Hmm, that looks odd... It does seem like it could be a package problem or an API parsing problem. The code fails for some specific date ranges.

# define time periods for the download
l_time <- c("2004-01-01 2004-03-31", 
           "2004-04-01 2004-06-30", 
           "2004-07-01 2004-09-30", 
           "2004-10-01 2004-12-31", 
           "2005-01-01 2005-03-31", 
           "2005-04-01 2005-06-30", 
           "2005-07-01 2005-09-30", 
           "2005-10-01 2005-12-31", 
           "2006-01-01 2006-03-31",
           "2006-04-01 2006-06-30",
           "2006-07-01 2006-09-30", 
           "2006-10-01 2006-12-31", 
           "2007-01-01 2007-03-31",
           "2007-04-01 2007-06-30", 
           "2007-07-01 2007-09-30",
           "2007-10-01 2007-12-31",
           "2008-01-01 2008-03-31",
           "2008-04-01 2008-06-30",
           "2008-07-01 2008-09-30",
           "2008-10-01 2008-12-31",
           "2009-01-01 2009-03-31", 
           "2009-04-01 2009-06-30", 
           "2009-07-01 2009-09-30", 
           "2009-10-01 2009-12-31",
           "2010-01-01 2010-03-31",
           "2010-04-01 2010-06-30", 
           "2010-07-01 2010-09-30", 
           # "2010-10-01 2010-12-31",  # excluded: this range fails
           "2011-01-01 2011-03-31", 
           "2011-04-01 2011-06-30", 
           "2011-07-01 2011-09-30",
           "2011-10-01 2011-12-31",
           "2012-01-01 2012-03-31", 
           "2012-04-01 2012-06-30")

I wrote a function that wraps the body of your for loop, so that each range can be tested on its own.

f_gtrendsR <- function(v_time = "2004-01-01 2004-03-31",
                       v_keyword = "Nachrichten"){
    print(v_time)
    trends <- gtrendsR::gtrends(keyword = v_keyword, 
                      time = v_time,
                      geo = "DE",
                      gprop = "web",
                      category = 0,
                      hl = "de-DE")
    
    k <- data.table::as.data.table(trends$interest_over_time)
    k$date <- as.Date(k$date)

    return(k)
}

Then I call the function on every range with lapply and combine all the results with do.call and rbind.

k <- do.call(rbind, lapply(l_time, f_gtrendsR, v_keyword = "Nachrichten"))

This works fine for me. But note: only without "2010-10-01 2010-12-31".

Oddly, if I split the problematic range, it also works fine:
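To find out which ranges fail without commenting them out by hand, the call can be wrapped in tryCatch so that failures are recorded instead of aborting the whole run. The following is only a minimal sketch: collect_ranges is a hypothetical helper name, not part of gtrendsR, and fetch stands for any function that takes one range string and returns a table (for example f_gtrendsR above).

```r
# Hypothetical helper (not part of gtrendsR): run `fetch` on every range,
# keep the successful results and record the ranges that raised an error.
collect_ranges <- function(ranges, fetch) {
  results <- list()
  failed <- character(0)
  for (r in ranges) {
    # Any error is turned into NULL so the loop can continue
    out <- tryCatch(fetch(r), error = function(e) NULL)
    if (is.null(out)) {
      failed <- c(failed, r)
    } else {
      results[[length(results) + 1]] <- out
    }
  }
  # do.call(rbind, ...) returns NULL if no range succeeded
  list(data = do.call(rbind, results), failed = failed)
}

# Usage with the function from this answer (needs network access):
# res <- collect_ranges(l_time, f_gtrendsR)
# res$failed   # ranges that raised an error
```

The failed ranges can then be inspected or retried separately, while the successful ones are already combined in res$data.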

k1 <- f_gtrendsR(v_time = "2010-10-01 2010-11-30", v_keyword = "Nachrichten")
k2 <- f_gtrendsR(v_time = "2010-11-01 2010-12-31", v_keyword = "Nachrichten")

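So this is not a proper answer, but I guess it gets you closer to one. You can always do the manual work and change the problematic ranges, but I am not happy with that approach.

Comments

0 votes RPV 5/16/2023
Hey, thank you very much! It is exactly this time range that does not work for me. I had also thought about correcting it manually, but I wanted to ask for help or an automatic solution first, because I am not happy with that workaround either!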
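One way the manual splitting could be automated is sketched below, purely as an illustration and not as a confirmed fix for the underlying error. split_range and fetch_with_split are hypothetical helper names; fetch is again any function that takes one range string (such as f_gtrendsR), and the choice of non-overlapping halves and max_depth = 2 are assumptions of this sketch.

```r
# Hypothetical helper: split "YYYY-MM-DD YYYY-MM-DD" into two adjacent,
# non-overlapping halves.
split_range <- function(range) {
  d <- as.Date(strsplit(range, " ", fixed = TRUE)[[1]])
  mid <- d[1] + as.integer(floor(as.numeric(d[2] - d[1]) / 2))
  c(paste(format(d[1]), format(mid)),
    paste(format(mid + 1), format(d[2])))
}

# Hypothetical helper: try the full range first; if `fetch` raises an error,
# split the range in two and retry each half, at most `max_depth` times.
fetch_with_split <- function(range, fetch, max_depth = 2) {
  tryCatch(
    fetch(range),
    error = function(e) {
      if (max_depth <= 0) stop(e)  # give up and re-raise the original error
      halves <- split_range(range)
      parts <- lapply(halves, fetch_with_split,
                      fetch = fetch, max_depth = max_depth - 1)
      do.call(rbind, parts)
    }
  )
}

# Usage (needs network access):
# k <- do.call(rbind, lapply(l_time, fetch_with_split, fetch = f_gtrendsR))
```

One caveat for the analysis: Google Trends scales the values of each request relative to the peak within that request, so the numbers from two halves are not on the same scale as each other or as a single three-month request. If the series need to be comparable, overlapping ranges (as in k1 and k2 above) and a rescaling step would be needed.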