Error when scraping with Selenium and Firefox

Asked by: Irina_Xena. Asked: 11/7/2023. Last edited by: Irina_Xena. Updated: 11/7/2023. Views: 38

Q:

I have a scraper in my Django project. It runs on Selenium + Firefox.

My Dockerfile:

# Install geckodriver
# Copied from: https://hub.docker.com/r/selenium/node-firefox/dockerfile
ARG GECKODRIVER_VERSION=0.26.0
RUN wget --no-verbose -O /tmp/geckodriver.tar.gz https://github.com/mozilla/geckodriver/releases/download/v$GECKODRIVER_VERSION/geckodriver-v$GECKODRIVER_VERSION-linux64.tar.gz \
  && rm -rf /opt/geckodriver \
  && tar -C /opt -zxf /tmp/geckodriver.tar.gz \
  && rm /tmp/geckodriver.tar.gz \
  && mv /opt/geckodriver /opt/geckodriver-$GECKODRIVER_VERSION \
  && chmod 755 /opt/geckodriver-$GECKODRIVER_VERSION \
  && ln -fs /opt/geckodriver-$GECKODRIVER_VERSION /usr/bin/geckodriver

The Selenium container in docker-compose:

  selenium:
    image: selenium/standalone-firefox:latest
    hostname: firefox
    ports:
      - "4444:4444/tcp"
    shm_size: "2gb"
    restart: unless-stopped

I get an error when initializing my Scrapper class:

from selenium import webdriver
from selenium.webdriver import DesiredCapabilities


class Scrapper:
    WEBDRIVER_TIMEOUT = 2
    WEBDRIVER_ARGUMENTS = (
        "--disable-dev-shm-usage",
        "--ignore-certificate-errors",
        "--headless",
    )

    def __init__(self, useragent=None):
        self.options = webdriver.FirefoxOptions()
        for argument in self.WEBDRIVER_ARGUMENTS:
            self.options.add_argument(argument)
        self.driver = webdriver.Remote(
            command_executor="http://firefox:4444/wd/hub",
            desired_capabilities=DesiredCapabilities.FIREFOX,
            options=self.options,
        )
        self.driver.set_page_load_timeout(self.WEBDRIVER_TIMEOUT)

The error:

raise MaxRetryError(_pool, url, error or ResponseError(cause))
| urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='127.0.0.1', port=4444): Max retries exceeded with url: /wd/hub/session (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2c4df553c0>: Failed to establish a new connection: [Errno 111] Connection refused'))

I don't know how to fix it.
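
One detail in the traceback stands out: it reports host='127.0.0.1', while the Scrapper class above points command_executor at http://firefox:4444/wd/hub. The process that failed may therefore not be running the code shown, or may not be able to reach the Grid at all. A possible first check, run from inside the Django container, is to query the Grid status endpoint before creating the driver. This is only a minimal sketch using the Python standard library. The helper names grid_status_url and grid_is_ready are hypothetical, and it assumes a Selenium Grid 4 image that serves /status with a JSON body containing value.ready.

```python
import json
import urllib.error
import urllib.request


def grid_status_url(executor_url):
    """Derive the Grid status endpoint from a command_executor URL."""
    base = executor_url.rstrip("/")
    # Assumption: the status endpoint lives at the Grid root, not under /wd/hub.
    if base.endswith("/wd/hub"):
        base = base[: -len("/wd/hub")]
    return base + "/status"


def grid_is_ready(executor_url, timeout=2.0):
    """Return True if the Grid answers and reports ready, otherwise False."""
    try:
        with urllib.request.urlopen(grid_status_url(executor_url), timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, DNS failure, timeout, or a non-JSON reply.
        return False
    value = payload.get("value") if isinstance(payload, dict) else None
    return isinstance(value, dict) and bool(value.get("ready"))


# Hypothetical usage, with the URL taken from the Scrapper class:
# print(grid_is_ready("http://firefox:4444/wd/hub"))
```

If this returns False for http://firefox:4444/wd/hub, the problem is likely in container networking or Grid startup rather than in the Scrapper class itself.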

Update. This is the log I get:

06:45:50.852 WARN [ExternalProcess$Builder.lambda$start$0] - failed to copy the output of process 168
java.io.IOException: Stream closed

django firefox docker-compose selenium-grid

A: No answers yet