Cannot get ratio of metrics on GCP Monitoring: Error 400: the numerator is a delta metric but the denominator is not a delta metric

Asked by: reith Asked: 3/14/2023 Updated: 4/27/2023 Views: 214

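Q:

I am trying to create an alert policy based on the ratio of failed messages in a PubSub subscription. I would like to use pubsub.googleapis.com/subscription/dead_letter_message_count as the numerator and pubsub.googleapis.com/subscription/pull_ack_request_count as the denominator. The alignment periods match, and I use a cross-series reducer that eliminates all labels, which removes the additional labels on the denominator. The alert policy I intend to create looks like this: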

monitoring/alertPolicy:AlertPolicy:
        combiner   : "AND"
        conditions : [
            [0]: {
                conditionThreshold: {
                    aggregations           : [
                        [0]: {
                            alignmentPeriod   : "600s"
                            crossSeriesReducer: "REDUCE_SUM"
                            perSeriesAligner  : "ALIGN_SUM"
                        }
                    ]
                    comparison             : "COMPARISON_GT"
                    denominatorAggregations: [
                        [0]: {
                            alignmentPeriod   : "600s"
                            crossSeriesReducer: "REDUCE_SUM"
                            perSeriesAligner  : "ALIGN_SUM"
                        }
                    ]
                    denominatorFilter      : "resource.type = \"pubsub_subscription\" AND resource.labels.subscription_id = \"subscription\" AND metric.type = \"pubsub.googleapis.com/subscription/pull_ack_request_count\""
                    duration               : "1800s"
                    filter                 : "resource.type = \"pubsub_subscription\" AND resource.labels.subscription_id = \"subscription\" AND metric.type = \"pubsub.googleapis.com/subscription/dead_letter_message_count\""
                    thresholdValue         : 0.5
                }
            }
        ]

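But I get this error:

Error creating AlertPolicy: googleapi: Error 400: The numerator is a delta metric but the denominator is not a delta metric.

This is confusing, because both metrics are DELTA. I used the API Explorer to retrieve the time series. For the numerator I get: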

{
  "timeSeries": [
    {
      "metric": {
        "type": "pubsub.googleapis.com/subscription/dead_letter_message_count"
      },
      "resource": {
        "type": "pubsub_subscription",
        "labels": {
          "project_id": "redacted"
        }
      },
      "metricKind": "DELTA",
      "valueType": "INT64",
      "points": [
        {
          "interval": {
            "startTime": "2023-03-13T10:10:00Z",
            "endTime": "2023-03-13T10:20:00Z"
          },
          "value": {
            "int64Value": "0"
          }
        },
        ....,
        {
          "interval": {
            "startTime": "2023-03-13T09:10:00Z",
            "endTime": "2023-03-13T09:20:00Z"
          },
          "value": {
            "int64Value": "93"
          }
        },
        {
          "interval": {
            "startTime": "2023-03-13T09:00:00Z",
            "endTime": "2023-03-13T09:10:00Z"
          },
          "value": {
            "int64Value": "9"
          }
        },
        {
          "interval": {
            "startTime": "2023-03-13T08:50:00Z",
            "endTime": "2023-03-13T09:00:00Z"
          },
          "value": {
            "int64Value": "34"
          }
        }
      ]
    }
  ],
  "unit": "1"
}

For the denominator:

{
  "timeSeries": [
    {
      "metric": {
        "type": "pubsub.googleapis.com/subscription/pull_ack_request_count"
      },
      "resource": {
        "type": "pubsub_subscription",
        "labels": {
          "project_id": "redacted"
        }
      },
      "metricKind": "DELTA",
      "valueType": "INT64",
      "points": [
        {
          "interval": {
            "startTime": "2023-03-13T09:50:00Z",
            "endTime": "2023-03-13T10:00:00Z"
          },
          "value": {
            "int64Value": "6"
          }
        },
        ....,
        {
          "interval": {
            "startTime": "2023-03-13T08:20:00Z",
            "endTime": "2023-03-13T08:30:00Z"
          },
          "value": {
            "int64Value": "104"
          }
        },
        {
          "interval": {
            "startTime": "2023-03-13T08:10:00Z",
            "endTime": "2023-03-13T08:20:00Z"
          },
          "value": {
            "int64Value": "93"
          }
        },
        {
          "interval": {
            "startTime": "2023-03-13T08:00:00Z",
            "endTime": "2023-03-13T08:10:00Z"
          },
          "value": {
            "int64Value": "111"
          }
        }
      ]
    }
  ],
  "unit": "1"
}
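
As a quick sanity check on the two responses above, here is a minimal Python sketch that compares the metricKind reported for each metric. The dictionaries are trimmed copies of the responses shown (points and labels omitted), and the helper name metric_kinds is hypothetical, introduced only for illustration.

```python
# Trimmed copies of the two timeSeries responses shown above.
# Only the fields needed for the comparison are kept.
numerator = {
    "timeSeries": [
        {
            "metric": {
                "type": "pubsub.googleapis.com/subscription/dead_letter_message_count"
            },
            "metricKind": "DELTA",
            "valueType": "INT64",
        }
    ]
}
denominator = {
    "timeSeries": [
        {
            "metric": {
                "type": "pubsub.googleapis.com/subscription/pull_ack_request_count"
            },
            "metricKind": "DELTA",
            "valueType": "INT64",
        }
    ]
}


def metric_kinds(response):
    """Return the set of metricKind values found in a timeSeries response."""
    return {ts["metricKind"] for ts in response.get("timeSeries", [])}


# Both sides report DELTA, so the kinds returned by the API agree.
print(metric_kinds(numerator) == metric_kinds(denominator))  # prints True
```

This only confirms what the API returns for each series; it says nothing about how the alerting backend classifies the metrics internally.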
stackdriver google-cloud-monitoring

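Comments

0 votes Nestor 4/12/2023
I would suggest contacting Google Support (by creating a case), since they can look at your logs, including backend logs that are not normally shared publicly. cloud.google.com/contact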

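A:

1 vote reith 4/27/2023 #1

Due to some implementation details, this ratio-based alert cannot be defined with the JSON-based (filter and denominatorFilter) alerting conditions. From Google:

We got an update from the product team stating that the issue is due to an inconsistency in the delta fields. Apparently the reason is that pull_ack_request_count has a delta window operation with an explicit window. This explicit window prevents the pre-computation from being marked as delta.

Ratio is a feature of Google's internal querying. We have no control over the implementation and cannot guarantee that no bugs show up. The recommendation is, in general, to use MQL instead of denominator filters.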
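
As a rough illustration of the MQL route recommended above, the following Python sketch builds an alert policy body that uses a conditionMonitoringQueryLanguage condition in place of filter and denominatorFilter. The MQL query is an assumption: it was written from the metric names, the 10-minute alignment, and the 0.5 threshold in the question, and has not been validated against the Monitoring API. The display names are hypothetical.

```python
import json

# Assumed MQL equivalent of the ratio in the question (not validated):
# fetch both delta metrics, align each to 10 minutes, sum across all
# series, join the two tables, and divide numerator by denominator.
MQL_QUERY = """
fetch pubsub_subscription
| { metric 'pubsub.googleapis.com/subscription/dead_letter_message_count'
    | filter resource.subscription_id == 'subscription'
    | align delta(10m)
    | group_by [], sum(val())
  ; metric 'pubsub.googleapis.com/subscription/pull_ack_request_count'
    | filter resource.subscription_id == 'subscription'
    | align delta(10m)
    | group_by [], sum(val()) }
| join
| div
| condition val() > 0.5
""".strip()

# Alert policy body with an MQL condition; display names are hypothetical.
policy = {
    "displayName": "dead-letter ratio (hypothetical name)",
    "combiner": "AND",
    "conditions": [
        {
            "displayName": "dead-lettered / acked > 0.5 (hypothetical name)",
            "conditionMonitoringQueryLanguage": {
                "query": MQL_QUERY,
                "duration": "1800s",  # same duration as in the question
            },
        }
    ],
}

print(json.dumps(policy, indent=2))
```

The threshold and comparison live inside the query (the condition operation), so the policy carries no thresholdValue or comparison fields. Treat the query as a starting point and test it in the Metrics Explorer query editor before relying on it.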