Error at URL ...[WinError 3] The system cannot find the path specified

Asked by: Duc Nguyen  Asked: 11/16/2023  Last edited by: miken32, Duc Nguyen  Updated: 11/17/2023  Views: 18

Q:

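I am running this code to read a CSV file containing thousands of news article URLs and extract the body text of each article. It worked fine until around the 20,000th URL, when it started showing this error: [WinError 3] The system cannot find the path specified: C:\\Users\\Admin\\AppData\\Local\\Temp\\.newspaper_scraper\\article_resources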

I'm not sure what is happening there. Any help would be greatly appreciated, thanks!

import pandas as pd
from newspaper import Article

FILE = "output/thoisu_output"  # base name of the input CSV, without the extension
df = pd.read_csv(FILE + ".csv")

i = 0  # number of rows processed so far


def get_content_from_url_col(row):
    """Download and parse the article at row['Url'] if the row has no text yet."""
    global i
    i += 1
    if i % 50 == 0:
        print('Scrape den trang', i)  # progress message every 50 rows
    # Skip infographics and rows whose text has already been scraped
    if row['CatName'] != 'Infographics' and not isinstance(row['text'], str):
        try:
            article = Article(row['Url'], language='vi')
            article.download()
            article.parse()
            row['text'] = article.text
        except Exception as error:
            # Single quotes inside the f-string: reusing double quotes
            # here is only valid from Python 3.12 on
            print(f"Error at url: {row['Url']}", error)
    return row


df = df.apply(get_content_from_url_col, axis=1)
df.to_csv(f"{FILE}_2.csv")
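
One possible explanation, not confirmed anywhere in this thread, is that the temporary directory named in the error message was removed while the script was still running (for example by a temp-file cleanup tool), so later attempts to create subfolders inside it fail. A minimal sketch of a workaround under that assumption is to recreate the directory before each download. The helper name `ensure_dir` and the constant `RESOURCE_DIR` are hypothetical and only mirror the path from the error message; they are not part of the newspaper API.

```python
import os
import tempfile

# Assumption: this rebuilds the path shown in the error message
# (<temp dir>/.newspaper_scraper/article_resources).
RESOURCE_DIR = os.path.join(
    tempfile.gettempdir(), ".newspaper_scraper", "article_resources"
)


def ensure_dir(path):
    """Create path and any missing parent folders; do nothing if it exists."""
    os.makedirs(path, exist_ok=True)
    return path
```

If the assumption holds, calling `ensure_dir(RESOURCE_DIR)` as the first statement inside the `try` block, just before `Article(...)` is constructed, should keep the folder available for the rest of the run. It would not help if the error has a different cause, such as a permissions problem.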
Tags: python, web-scraping, url

Comments


A: No answers yet