Asked by: Samuel Oluwapelumi · Asked: 7/16/2023 · Last edited by: Christoph Lütjen, Samuel Oluwapelumi · Updated: 7/16/2023 · Views: 38
What's wrong with this code? (Web scraping - BeautifulSoup)
Q:
from bs4 import BeautifulSoup
import requests
html_page = requests.get('https://ng.jooble.org/SearchResult?ukw=ict').text
soup = BeautifulSoup(html_page, 'lxml')
job_card = soup.find_all('article', attrs={"data-test-name": "_jobCard", "class_": "FxQpvm yKsady"}, limit=10)
job_title = job_card.find('h2', attrs={"class": "_15V35X"})
job_Requirment = job_card.find('div', attrs={"class": "_9jGwm1"})
company = job_card.find('p', attrs={"class": "Ya0gV9"})
post_date = job_card.find('div', class_='caption e0VAhp')
job_link = job_card.header.h2.a['href']
print(' ')
print(f'JOB TITLE: {job_title}')
print(f'COMPANY NAME: {company}')
print(f'JOB REQUIREMENT: {job_Requirment} (Read More...)')
print(f'POSTED: {post_date}')
print(f'MORE INFO: {job_link}')
print(' ')
This code is supposed to give me a list of printed lines, but this is the response I get:
Traceback (most recent call last):
  File "C:\Users\Samuel Oluwapelumi\Desktop\cs50web\jobs.py", line 8, in <module>
    job_title = job_card.find('h2', attrs={"class":"_15V35X"})
                ^^^^^^^^^^^^^
  File "C:\Users\Samuel Oluwapelumi\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\bs4\element.py", line 2428, in __getattr__
    raise AttributeError(
AttributeError: ResultSet object has no attribute 'find'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?
A:
0 votes
yashaswi k
7/16/2023
#1
@samuel Please follow the code below. I have changed it to get the job titles; the other fields can be collected the same way.
from bs4 import BeautifulSoup
import requests
html_page = requests.get('https://ng.jooble.org/SearchResult?ukw=ict')
soup = BeautifulSoup(html_page.content, 'lxml')
job_title = soup.find_all('h2',attrs={"class":"_15V35X"})
job_titlelist1=[]
for line in job_title:
    job_titlelist1.append(line.text)
print(f'JOB TITLE: {job_titlelist1}')
Output:
JOB TITLE: ['ICT Officer', 'ICT Coordinator', 'ICT Teacher', 'ICT/Tech Support Staff', 'Technician (ICT)', 'Senior ICT Specialist at Palladium Group', 'ICT Teacher', 'ICT Teacher', 'Computer (ICT) Teacher', 'Senior ICT Specialist at Palladium', 'Ict Professional Urgently Needed', 'ICT Head', 'ICT Technical Assistant at the Norwegian Refugee Council (NRC)', 'ICT Head at Walex Biz Nigeria Limited', 'ICT Manager at Ascentech Services Limited', 'An Ict Professional Urgently Needed', 'Credit Officers, ICT Officers and Marketing Officers at Reputable Microfinance Bank', 'ICT Officer at Engine Lubricant Manufacturing Company', 'ICT Technical Assistant Nigeria Jos, Abuja', 'ICT and Coding Instructor at Proxynet Communication']
Comments
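The error message says it: "You're probably treating a list of elements like a single element." job_card is a list (the result of find_all()), so iterate over it and call find() on each item: for job in job_card: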
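To make the loop suggestion concrete, here is a minimal sketch of the per-card approach. It assumes the class names from the question still match the page, and it parses a small hypothetical HTML sample (SAMPLE_HTML) with the built-in 'html.parser' instead of fetching the live site, so the real markup may differ. The helper names text_or_none and parse_jobs are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real markup may differ.
SAMPLE_HTML = """
<article data-test-name="_jobCard">
  <header><h2 class="_15V35X"><a href="/desc/1">ICT Officer</a></h2></header>
  <p class="Ya0gV9">Example Ltd</p>
</article>
<article data-test-name="_jobCard">
  <header><h2 class="_15V35X"><a href="/desc/2">ICT Teacher</a></h2></header>
</article>
"""


def text_or_none(tag):
    """Return the stripped text of a tag, or None if find() matched nothing."""
    return tag.get_text(strip=True) if tag is not None else None


def parse_jobs(html, limit=10):
    """Collect one dict per job card (illustrative helper, not from the post)."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all() returns a ResultSet (a list of tags), so it has no .find().
    cards = soup.find_all("article", attrs={"data-test-name": "_jobCard"}, limit=limit)
    jobs = []
    for card in cards:
        # Each card is a single Tag, so find() works here.
        title = card.find("h2", class_="_15V35X")
        link = title.find("a") if title is not None else None
        jobs.append({
            "title": text_or_none(title),
            "company": text_or_none(card.find("p", class_="Ya0gV9")),
            "link": link.get("href") if link is not None else None,
        })
    return jobs


for job in parse_jobs(SAMPLE_HTML):
    print(f"JOB TITLE: {job['title']}")
    print(f"COMPANY NAME: {job['company']}")
    print(f"MORE INFO: {job['link']}")
    print(" ")
```

Looping over the cards keeps each title, company, and link together, and the None checks avoid a second AttributeError when a card is missing one of the fields. For the live page, the same function could be given requests.get(url).text, provided the site still uses these class names.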