Asked by: motylas · Asked: 11/13/2023 · Last edited by: motylas · Updated: 11/13/2023 · Views: 38
Python - Webscraping - get data from grid and flex field
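Q:
I'm using Selenium, but I can't get data out of a DIV marked as flex on https://www.jpg.store/collection/hungrycowsbymuesliswap?tab=items
I need the value of the asset_id attribute (highlighted in yellow) from the page source.
My code only reaches the div above the value I'm looking for: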
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--disable-gpu")
driver = webdriver.Chrome(options=chrome_options)
url = 'https://www.jpg.store/collection/hungrycowsbymuesliswap?tab=items'
driver.set_page_load_timeout(4)
driver.get(url)
element = driver.find_element(By.XPATH, "//body/div/div[1]/main/div[2]/div/section/div/div[2]/div/div/div[1]")
print(element.get_attribute('outerHTML'))
If I add one more div to the XPath, the code fails with: selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//body/div/div[1]/main/div[2]/div/section/div/div[2]/div/div/div[1]/div"}
A:
0 votes
Yaroslavm
11/13/2023
#1
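You can search for the array of elements with the locator div[asset_id], wait for their presence using WebDriverWait, and then get the asset_id attribute of each element found.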
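To illustrate what a locator like div[asset_id] matches, here is a minimal sketch that runs without a browser. It uses Python's standard html.parser on a made-up HTML snippet; the markup, the AssetIdCollector class, and the asset-001/asset-002 values are assumptions for illustration, not the real page source. The point is that the match depends only on the tag and the attribute, not on how deeply the div is nested, which is what makes it less brittle than a long absolute XPath.

```python
from html.parser import HTMLParser


class AssetIdCollector(HTMLParser):
    """Collects the asset_id attribute of every <div> that carries one."""

    def __init__(self):
        super().__init__()
        self.asset_ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lower-cased.
        if tag == "div":
            for name, value in attrs:
                if name == "asset_id":
                    self.asset_ids.append(value)


# Hypothetical markup, loosely modelled on a grid of flex containers.
SAMPLE_HTML = """
<section>
  <div class="grid">
    <div class="flex"><div asset_id="asset-001"></div></div>
    <div class="flex"><span><div asset_id="asset-002"></div></span></div>
  </div>
</section>
"""

collector = AssetIdCollector()
collector.feed(SAMPLE_HTML)
asset_ids = collector.asset_ids
print(asset_ids)
```

Both divs are collected even though they sit at different depths in the tree.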
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--disable-gpu")
driver = webdriver.Chrome(options=chrome_options)
url = 'https://www.jpg.store/collection/hungrycowsbymuesliswap?tab=items'
driver.get(url)
wait = WebDriverWait(driver, 10)
asset_els = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div[asset_id]")))
for element in asset_els:
    print(element.get_attribute('asset_id'))
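The waiting step above is what matters when elements are rendered some time after the page loads. As a rough sketch of the polling idea behind an explicit wait (this is not Selenium's implementation; wait_until, WaitTimeout, and FakePage are hypothetical names made up for this example), a condition is called repeatedly until it returns something truthy or the timeout expires:

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never became truthy within the timeout."""


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakePage:
    """Stand-in for a page whose elements appear only after a few polls."""

    def __init__(self, ready_after_calls):
        self.calls = 0
        self.ready_after_calls = ready_after_calls

    def find_asset_ids(self):
        self.calls += 1
        if self.calls < self.ready_after_calls:
            return []  # nothing rendered yet
        return ["asset-001", "asset-002"]


page = FakePage(ready_after_calls=3)
found = wait_until(page.find_asset_ids, timeout=2.0, poll_interval=0.01)
print(found, "after", page.calls, "polls")
```

A single immediate lookup corresponds to calling page.find_asset_ids() once, which would return an empty list here; polling gives the page time to finish rendering.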