Asked by: asma  Asked: 11/4/2023  Updated: 11/4/2023  Views: 21
Python Selenium: How to Stop While Loop When Reaching the End of a Scrollable div in Web Scraping
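Q:
I am writing a web scraping script with Python and Selenium. I have a while loop that scrolls a web page and collects restaurant data. I want to stop the loop when the end of the page is reached, but I am not sure how to detect that. Here is my code: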
try:
    # Locate the scrollable div element for restaurant results
    scrollable_main_div = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, '//div[@aria-label="نتائج عن restaurants in riyadh"]'))
    )
    i = 1
    while True:
        # Scroll down to the end of the div
        driver.execute_script('arguments[0].scrollTop = arguments[0].scrollHeight', scrollable_main_div)
        time.sleep(2)
        # Locate and store a single restaurant element
        restaurant = driver.find_element(By.XPATH, f'(//div[@aria-label="نتائج عن restaurants in riyadh"]//div[contains(@class,"Nv2PK THOPZb CpccDe ")])[{i}]')
        # Get the restaurant's name
        name = get_name_check(restaurant)
        # Check for duplicate restaurant names
        if name not in restaurant_names:
            i += 1
            restaurant_names.append(name)
            print(name)
            print("_____________________________")
        time.sleep(1)
        # This condition is intended to stop scrolling when the end of the page is reached
        if driver.execute_script('arguments[0].scrollTop >= arguments[0].scrollHeight', scrollable_main_div):
            break
except TimeoutException:
    print("Timeout Exception: Check the page or adjust the waiting time")
except Exception as e:
    print(f"An error occurred: {e}")

# Create a DataFrame and save it to a CSV file
df = pd.DataFrame(restaurant_names)
df.to_csv("names.csv", index=False)
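I have tried scrolling the div and collecting restaurant names, and the names are stored in the restaurant_names list successfully. However, I need help adding a condition that stops the while loop once there is no more data to collect in the div, so that I can create the DataFrame and save it to a CSV file.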
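Two details probably explain why the stop condition in the code above never fires. First, `execute_script` only returns a value when the script contains an explicit `return`, so the check evaluates to `None`. Second, `scrollTop` can never reach `scrollHeight`, because its maximum is `scrollHeight - clientHeight`. One common workaround, sketched below, is to keep scrolling until `scrollHeight` stops growing for several attempts in a row. The function name `scroll_to_end` and the parameters `pause`, `max_stalls`, and `sleep` are hypothetical names chosen for this illustration. The sketch assumes the page loads more results whenever the div is scrolled to the bottom, which has not been verified for this particular site.

```python
import time


def scroll_to_end(driver, container, pause=2.0, max_stalls=3, sleep=time.sleep):
    """Scroll `container` to the bottom until its scrollHeight stops growing.

    `driver` is anything with a Selenium-style execute_script(script, *args).
    Returns the number of scroll steps performed.
    """
    # Note the explicit `return`: without it execute_script yields None.
    last_height = driver.execute_script(
        "return arguments[0].scrollHeight", container
    )
    stalls = 0  # consecutive scrolls that loaded nothing new
    steps = 0
    while stalls < max_stalls:
        driver.execute_script(
            "arguments[0].scrollTop = arguments[0].scrollHeight", container
        )
        steps += 1
        sleep(pause)  # give lazily loaded results time to render
        new_height = driver.execute_script(
            "return arguments[0].scrollHeight", container
        )
        if new_height > last_height:
            # More content arrived, so reset the stall counter.
            last_height = new_height
            stalls = 0
        else:
            stalls += 1
    return steps
```

With this approach, the scrolling and the collecting can be separated: call `scroll_to_end(driver, scrollable_main_div)` first, then fetch every result card in one pass with `driver.find_elements(By.XPATH, ...)` and run `get_name_check` on each element. Requiring several stalled attempts instead of one is a hedge against slow network responses. The right values for `pause` and `max_stalls` depend on the page.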
A: No answers yet