API Call within a while loop not writing everything to txt file

Asked by: unmute-me Asked: 5/20/2023 Last edited by: unmute-me Updated: 5/20/2023 Views: 47

Question:

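The full program will call the NYT API to find articles from 1851 to the present. For testing, I am limiting it to 1851. That year has issues for only four months, so months 1-8 should just produce blank txt files. The problem is that only November and December are written to their files; September and October are not. To run this you need to register for a NYT API key and create a directory named txt_holder.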

#create a NYT api call to scrape every year and all its months into individual txt files so that I can do a sentiment analysis month by month


#imports request function
import requests
#gets the datetime so I can get the current year
import datetime

import time

API_KEY = ""  # placeholder: put your own NYT API key here

def files_and_count():
  #NYT data goes back to the year 1850
  year_increment = 1850
  today = datetime.date.today()
  year = today.year

  months = 0
  
  while year_increment != 1851:
    year_increment = year_increment + 1
    
    while months != 12:
      months = months+1
      query = str(year_increment) + "/"+ str(months)
      real_file_name = str(year_increment) + "_"+ str(months)
      file_name = str('txt_holder/'+ real_file_name +'.txt')
      file = open(file_name, 'w')
      
      search(query,API_KEY,file)

    
    months = 0

#Function that takes the API query and writes them into a file
def search(query,API_KEY,file):
  #here is the url with the query and API key as values so it is more flexible
  url = f'https://api.nytimes.com/svc/archive/v1/{query}.json?api-key={API_KEY}'
  
  print(url)

  #time between each API call
  time.sleep(12)
  
  try:
    #request the url
    response = requests.get(url)
  
    #find the JSON content
    content = response.json()
  
    #for loop to go through the text the request pulled from NYT
    for item in  content["response"]["docs"]:
      #searches in the JSON for the titles of the articles
      text = item["headline"]["main"]
      #for every article title there is a newline character added to the end of it so that each title is on a new line
      file.write(text+"\n")
      
  except:
    print('failed')
  
files_and_count()

This is the version that worked for me, thanks for your help:

#imports request function
import requests
#gets the datetime so I can get the current year
import datetime

import time

API_KEY = ""  # placeholder: put your own NYT API key here


def files_and_count():
  #NYT data goes back to the year 1850

  today = datetime.date.today()
  year = today.year

  for i in range(1851, year):

    for months in range(1,13):
      
      query = str(i) + "/"+ str(months)
      real_file_name = str(i) + "_"+ str(months)
      file_name = str('txt_folder/'+ real_file_name +'.txt')
      file = open(file_name, 'w')
      
      search(query,API_KEY,file)




#Function that takes the API query and writes them into a file
def search(query,API_KEY,file):
  #here is the url with the query and API key as values so it is more flexible
  url = f'https://api.nytimes.com/svc/archive/v1/{query}.json?api-key={API_KEY}'
  
  print(url)

  #time between each API call
  time.sleep(12)
  
  try:
    #request the url
    response = requests.get(url)
  
    #find the JSON content
    content = response.json()
  
    #for loop to go through the text the request pulled from NYT
    for item in  content["response"]["docs"]:
      #searches in the JSON for the titles of the articles
      text = item["headline"]["main"]
      #for every article title there is a newline character added to the end of it so that each title is on a new line
      file.write(text+"\n")
      
  except ValueError:
    print('Could not pull content from/connect to API')
  




files_and_count()
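
Both versions above open each month's file with open() and never close it explicitly. As a minimal sketch (not part of the original post), the file handling could be moved into a with block so that each file is flushed and closed before the next request starts. The helper name write_headlines and its out_dir parameter are hypothetical, introduced only for illustration.

```python
import os


def write_headlines(out_dir, year, month, headlines):
    """Write one headline per line to <out_dir>/<year>_<month>.txt; return the path."""
    # Create the output directory if it does not exist yet, so the caller
    # does not have to create it by hand first.
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"{year}_{month}.txt")
    # The with block flushes and closes the file even if a write raises.
    with open(path, "w", encoding="utf-8") as fh:
        for title in headlines:
            fh.write(title + "\n")
    return path
```

Called with an empty list, this still creates an empty file, which matches the expectation that months without issues produce blank txt files.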

python database api file text

Comments

0 votes Barmar 5/20/2023
Your code will be simpler if you use `for year_increment in range(year, 1852):` and `for months in range(1, 13):`
0 votes Barmar 5/20/2023
Does `print(url)` show the URL for those months? If it does, then the only reason you're not getting anything is that the API isn't returning any docs for those months. You say there should be 4 months in that year, but I think you're wrong.
0 votes John Gordon 5/20/2023
What, if anything, happens with the September and October files? Are they written as blank files? Are they not written at all?
1 vote Barmar 5/20/2023
Use `print(content)` to see what the API is actually returning.
0 votes John Gordon 5/20/2023
Does the "failed" error message ever appear?
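
The comments above all come down to making the failure visible. The bare except in the question catches every exception, so 'failed' does not say whether the request, the JSON decoding, or the headline lookup went wrong. Below is a rough sketch of a stricter parsing step, assuming the payload has the response/docs/headline/main layout that the question's loop reads. The function name extract_headlines is hypothetical, and the idea that an error payload would lack the "response" key is an assumption, not something confirmed in this thread.

```python
def extract_headlines(content):
    """Return (headlines, skipped) for a parsed Archive API payload."""
    # Assumption: a payload without a "response" key is an error object,
    # so show a short preview of it instead of failing silently.
    if not isinstance(content, dict) or "response" not in content:
        raise ValueError(f"unexpected payload: {repr(content)[:200]}")
    docs = content["response"].get("docs", [])
    headlines = []
    skipped = 0
    for item in docs:
        # Tolerate docs whose "headline" entry is missing or None.
        main = (item.get("headline") or {}).get("main")
        if main:
            headlines.append(main)
        else:
            skipped += 1  # headline missing or empty
    return headlines, skipped
```

Inside search, printing response.status_code before calling response.json() and then printing the lengths returned by extract_headlines(content) would show, month by month, whether the request succeeded and how many docs actually came back.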

Answers: No answers yet