Python - Webscraping - get data from grid and flex field

Asked by motylas on 11/13/2023 · Last edited by motylas · Updated 11/13/2023 · Views: 38

Q:

I'm using Selenium, but I can't get data out of the DIV marked `flex` on https://www.jpg.store/collection/hungrycowsbymuesliswap?tab=items

I need the value of the `asset_id` attribute (highlighted in yellow in the page source).

My code only reaches the div above the value I'm looking for:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument("--headless")  
chrome_options.add_argument("--disable-gpu")  # run headless
driver = webdriver.Chrome(options=chrome_options)
url = 'https://www.jpg.store/collection/hungrycowsbymuesliswap?tab=items'
driver.set_page_load_timeout(4) 
driver.get(url)

element = driver.find_element(By.XPATH, "//body/div/div[1]/main/div[2]/div/section/div/div[2]/div/div/div[1]")

print(element.get_attribute('outerHTML'))

If I add one more `div` to the XPath, the code throws an error: `selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//body/div/div[1]/main/div[2]/div/section/div/div[2]/div/div/div[1]/div"}`

python selenium-webdriver web-scraping beautifulsoup flexbox


A:

0 votes · Yaroslavm · 11/13/2023 · #1

You can search for the array of elements with the locator `div[asset_id]`, wait for their presence with `WebDriverWait`, and then read the `asset_id` attribute of each element found.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--disable-gpu")
driver = webdriver.Chrome(options=chrome_options)
url = 'https://www.jpg.store/collection/hungrycowsbymuesliswap?tab=items'
driver.get(url)
wait = WebDriverWait(driver, 10)
asset_els = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div[asset_id]")))

for element in asset_els:
    print(element.get_attribute('asset_id'))
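
Since the question is also tagged beautifulsoup: once the page has rendered, an alternative to calling `get_attribute` per element is to parse `driver.page_source` yourself. Below is a minimal sketch of that attribute-extraction step using only the standard library's `html.parser` (the HTML snippet is a made-up stand-in for the real markup, which you would obtain from `driver.page_source` after the wait above):

```python
from html.parser import HTMLParser

class AssetIdCollector(HTMLParser):
    """Collects the asset_id attribute of every <div> that carries one."""
    def __init__(self):
        super().__init__()
        self.asset_ids = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            attrs = dict(attrs)  # attrs arrives as a list of (name, value) pairs
            if "asset_id" in attrs:
                self.asset_ids.append(attrs["asset_id"])

# Stand-in markup; in practice, feed driver.page_source once the page has rendered.
html = """
<div class="flex">
  <div asset_id="abc123"></div>
  <div asset_id="def456"></div>
</div>
"""

parser = AssetIdCollector()
parser.feed(html)
print(parser.asset_ids)  # ['abc123', 'def456']
```

The same extraction with BeautifulSoup would be `[d["asset_id"] for d in soup.select("div[asset_id]")]`; either way, the key point is that the attribute values only exist after the JavaScript-rendered page has loaded, which is why the `WebDriverWait` in the answer is needed first.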