How do I resolve this error when using flintrock to start Spark clusters on AWS?

Asked by: Eric Mariasis Asked: 11/16/2023 Updated: 11/16/2023 Views: 19

Q:

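I followed the instructions detailed here to try to create a Spark cluster on AWS EC2 instances using flintrock. For context, my end goal is to parallelize operations on Spark across 4 EC2 instances and collect the results on the master node.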

Below is what I have in config.yaml, which is used when I run flintrock launch cluster_name. When I run this command, I get the error: An error occurred (InvalidParameterCombination) when calling the RunInstances operation: The parameter iops is not supported for gp2 volumes. Operation aborted.

services:
  spark:
    version: 3.1.2
    # git-commit: latest  # if not 'latest', provide a full commit SHA; e.g. d6dc12ef0146ae409834c78737c116050961f350
    # git-repository:  # optional; defaults to https://github.com/apache/spark
    # optional; defaults to download from a dynamically selected Apache mirror
    #   - can be http, https, or s3 URL
    #   - must contain a {v} template corresponding to the version
    #   - Spark must be pre-built
    #   - files must be named according to the release pattern shown here: https://dist.apache.org/repos/dist/release/spark/
    # download-source: "https://www.example.com/files/spark/{v}/"
    # download-source: "s3://some-bucket/spark/{v}/"
    # executor-instances: 1
  hdfs:
    version: 3.3.0
    # optional; defaults to download from a dynamically selected Apache mirror
    #   - can be http, https, or s3 URL
    #   - must contain a {v} template corresponding to the version
    #   - files must be named according to the release pattern shown here: https://dist.apache.org/repos/dist/release/hadoop/common/
    # download-source: "https://www.example.com/files/hadoop/{v}/"
    # download-source: "http://www-us.apache.org/dist/hadoop/common/hadoop-{v}/"
    # download-source: "s3://some-bucket/hadoop/{v}/"

provider: ec2

providers:
  ec2:
    key-name: spark_cluster
    identity-file: /media/sf_linuxvm/spark_cluster.pem
    instance-type: t2.micro
    region: us-east-1
    # availability-zone: <name>
    ami: ami-0230bd60aa48260c6
    user: ec2-user
    # ami: ami-61bbf104  # CentOS 7, us-east-1
    # user: centos
    # spot-price: <price>
    # spot-request-duration: 7d  # duration a spot request is valid, supports d/h/m/s (e.g. 4d 3h 2m 1s)
    # vpc-id: <id>
    # subnet-id: <id>
    # placement-group: <name>
    # security-groups:
    #   - group-name1
    #   - group-name2
    # instance-profile-name:
    # tags:
    #   - key1,value1
    #   - key2, value2  # leading/trailing spaces are trimmed
    #   - key3,  # value will be empty
    # min-root-ebs-size-gb: <size-gb>
    tenancy: default  # default | dedicated
    ebs-optimized: no  # yes | no
    instance-initiated-shutdown-behavior: terminate  # terminate | stop
    # user-data: /path/to/userdata/script
    # authorize-access-from:
    #   - 10.0.0.42/32
    #   - sg-xyz4654564xyz

launch:
  num-slaves: 3
  # install-hdfs: True
  install-spark: True
  java-version: 8

debug: false

I searched online and on Stack Overflow for solutions to this error, but I am not sure how to apply them to my specific case.
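
For readers who hit the same message: the error text says the launch request paired an iops value with a gp2 volume. EC2 accepts an explicit IOPS value for io1, io2, and gp3 volumes, but not for gp2. The config above sets no iops, so the value presumably comes from somewhere else, such as the AMI's own block device mapping; that is an assumption, not something confirmed in this thread. The sketch below only illustrates the invalid combination using plain dictionaries shaped like EC2 block device mappings. The helper name strip_unsupported_iops and the sample mapping are hypothetical and are not part of flintrock or boto3.

```python
# Volume types for which EC2 accepts an explicit Iops value.
IOPS_CAPABLE_TYPES = {"io1", "io2", "gp3"}


def strip_unsupported_iops(block_device_mappings):
    """Return copies of the mappings with Iops dropped where the volume type rejects it.

    Hypothetical helper for illustration; the input list is not modified.
    """
    cleaned = []
    for mapping in block_device_mappings:
        mapping = dict(mapping)            # shallow copy of the outer dict
        ebs = dict(mapping.get("Ebs", {}))  # copy of the nested Ebs settings
        if ebs.get("VolumeType") not in IOPS_CAPABLE_TYPES:
            ebs.pop("Iops", None)
        if ebs:
            mapping["Ebs"] = ebs
        cleaned.append(mapping)
    return cleaned


# Hypothetical mapping that combines gp2 with Iops, the combination the error names.
bad = [
    {
        "DeviceName": "/dev/xvda",
        "Ebs": {"VolumeType": "gp2", "VolumeSize": 30, "Iops": 3000},
    }
]
fixed = strip_unsupported_iops(bad)
print(fixed)
```

Whether the fix in practice is to drop the iops value, switch the volume type, or pick a different AMI depends on where the value originates, which this thread does not establish.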

amazon-web-services apache-spark amazon-ec2 amazon-ami

Comments

0 upvotes Eric Mariasis 11/16/2023
The link to the flintrock GitHub repository is here: github.com/nchammas/flintrock

A: No answers yet