I'm trying to scrape a saved HTML file of a page's source, but the code raises an error when I run it

Asked by: Omar Kabil  Asked: 6/19/2023  Updated: 6/19/2023  Views: 19

Q:

from bs4 import BeautifulSoup
import csv
from itertools import zip_longest

with open('dhsh.html', 'r') as main_page:
    src = main_page.read()

rod_length = []
rod_power = []
line_rating = []
number_of_pc = []
lure_rating = []
rod_handle_type = []
number_of_guides_with_tip = []
water_type = []
model_number = []

soup = BeautifulSoup(src, 'lxml')
# attrs is a dict mapping attribute name to value ({'class', 'x'} would be a set)
rod_lengths = soup.find_all('td', {'class': 'rod_lengths'})
rod_powers = soup.find_all('td', {'class': 'rod_powers'})
line_ratings = soup.find_all('td', {'class': 'line_ratings'})
number_of_pcs = soup.find_all('td', {'class': 'num_of_pcs'})
rod_handle_types = soup.find_all('td', {'class': 'rod_handle_type'})
lure_ratings = soup.find_all('td', {'class': 'lure_ratings'})
water_types = soup.find_all('td', {'class': 'water_type'})
model_numbers = soup.find_all('a', {'class': 'model_numbers'})
# ugly_tech = {'stik type':}
for i in range(len(rod_lengths)):
    rod_length.append(rod_lengths[i].text)
    rod_power.append(rod_powers[i].text)
    line_rating.append(line_ratings[i].text)
    number_of_pc.append(number_of_pcs[i].text)
    rod_handle_type.append(rod_handle_types[i].text)
    lure_rating.append(lure_ratings[i].text)
    water_type.append(water_types[i].text)
    model_number.append(model_numbers[i].text)
    
gx22s = [rod_length, rod_power, line_rating, number_of_pc, rod_handle_type, lure_rating, water_type, model_number]
unpacked = zip_longest(*gx22s)
# newline='' stops the csv module from writing blank lines between rows on Windows
with open('ugly_stik.csv', 'w', newline='') as ugly_stik:
    wr = csv.writer(ugly_stik)
    wr.writerow(['rod lengths', 'rod powers', 'line ratings', 'number of pcs', 'rod handle types', 'lure ratings', 'water types', 'model numbers'])
    wr.writerows(unpacked)

Traceback (most recent call last):
  File "c:\Users\Dkabil\web scrapping\gx2.py", line 36, in <module>
    model_number.append(model_numbers[i].text)
                        ~~~~~~~~~~~~~^^^
IndexError: list index out of range
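
One likely reading of the traceback, offered as an assumption rather than a confirmed diagnosis: the page has fewer `a.model_numbers` elements than `td.rod_lengths` cells, so `model_numbers[i]` runs out while the loop is still counting up to `len(rod_lengths)`. Below is a minimal sketch of how to check for that and avoid the error. The lists and their values are made up; they stand in for the text extracted from each `find_all()` result.

```python
import csv
import io
from itertools import zip_longest

# Hypothetical stand-ins for the scraped text, e.g.
# [tag.text for tag in rod_lengths]. All values are invented.
columns = {
    'rod lengths': ['6 ft', '7 ft', '8 ft'],
    'rod powers': ['Medium', 'Medium Heavy', 'Heavy'],
    'model numbers': ['MODEL-A', 'MODEL-B'],  # one entry short
}

# Print the length of every column so a short one is easy to spot.
lengths = {name: len(values) for name, values in columns.items()}
print(lengths)

# Build each column on its own and let zip_longest pad the short ones,
# instead of indexing every list with the length of the first.
rows = list(zip_longest(*columns.values(), fillvalue=''))

# An in-memory buffer keeps the sketch self-contained; a real file
# would be opened with open(path, 'w', newline='').
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(columns.keys())
writer.writerows(rows)
print(buffer.getvalue())
```

Padding only hides the mismatch. If the missing value belongs to a row in the middle of the table, every later value in that column shifts up by one, so comparing the lengths first and finding out why they differ is the safer step.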

python-3.x processing compiler-errors syntax-error


A: No answers yet