Asked by: x89 | Asked: 11/17/2023 | Last edited by: x89 | Updated: 11/20/2023 | Views: 99
Auto scaling for an ECS service
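Q:
I have set up server-side GTM by following this guide: https://aws-solutions-library-samples.github.io/advertising-marketing/using-google-tag-manager-for-server-side-website-analytics-on-aws.html
I am using an AWS ECS task definition and service. On top of that, I use Snowbridge to send data from AWS Kinesis to GTM (Snowplow client) via HTTP POST requests.
When the data volume is large, I occasionally get 502 errors from GTM. If I filter the data and reduce the amount forwarded to GTM, the errors go away. What can I change on the GTM side to make sure it can handle a large volume of data? Can I use auto scaling with ECS?
I have already tried parameters like these: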
deployment_maximum_percent = 200
deployment_minimum_healthy_percent = 50
but the problem persists.
This is roughly what my GTM configuration looks like:
resource "aws_ecs_cluster" "gtm" {
  name = "gtm"

  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}
resource "aws_ecs_task_definition" "PrimaryServerSideContainer" {
  family                   = "PrimaryServerSideContainer"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = 2048
  memory                   = 4096
  execution_role_arn       = aws_iam_role.gtm_container_exec_role.arn
  task_role_arn            = aws_iam_role.gtm_container_role.arn

  runtime_platform {
    operating_system_family = "LINUX"
    cpu_architecture        = "X86_64"
  }

  container_definitions = <<TASK_DEFINITION
[
  {
    "name": "primary",
    "image": "gcr.io/cloud-tagging-10302018/gtm-cloud-image",
    "environment": [
      {
        "name": "PORT",
        "value": "80"
      },
      {
        "name": "PREVIEW_SERVER_URL",
        "value": "${var.PREVIEW_SERVER_URL}"
      },
      {
        "name": "CONTAINER_CONFIG",
        "value": "${var.CONTAINER_CONFIG}"
      }
    ],
    "cpu": 1024,
    "memory": 2048,
    "essential": true,
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "gtm-primary",
        "awslogs-create-group": "true",
        "awslogs-region": "eu-central-1",
        "awslogs-stream-prefix": "ecs"
      }
    },
    "portMappings": [
      {
        "containerPort": 80,
        "hostPort": 80
      }
    ]
  }
]
TASK_DEFINITION
}
resource "aws_ecs_service" "PrimaryServerSideService" {
  name                               = var.primary_service_name
  cluster                            = aws_ecs_cluster.gtm.id
  task_definition                    = aws_ecs_task_definition.PrimaryServerSideContainer.id
  desired_count                      = var.primary_service_desired_count
  launch_type                        = "FARGATE"
  platform_version                   = "LATEST"
  scheduling_strategy                = "REPLICA"
  deployment_maximum_percent         = 200
  deployment_minimum_healthy_percent = 50

  network_configuration {
    assign_public_ip = true
    security_groups  = [aws_security_group.gtm-security-group.id]
    subnets          = data.aws_subnets.private.ids
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.PrimaryServerSideTarget.arn
    container_name   = "primary"
    container_port   = 80
  }

  lifecycle {
    ignore_changes = [task_definition]
  }
}
resource "aws_lb" "PrimaryServerSideLoadBalancer" {
  name                       = "PrimaryServerSideLoadBalancer"
  internal                   = false
  load_balancer_type         = "application"
  security_groups            = [aws_security_group.gtm-security-group.id]
  subnets                    = data.aws_subnets.public.ids
  enable_deletion_protection = false
}
....
I also tried adding these:
resource "aws_appautoscaling_target" "ecs_target" {
  max_capacity       = 4
  min_capacity       = 1
  resource_id        = "service/${aws_ecs_cluster.gtm.name}/${aws_ecs_service.PrimaryServerSideService.name}"
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"
}

resource "aws_appautoscaling_policy" "ecs_policy" {
  name               = "scale-down"
  policy_type        = "StepScaling"
  resource_id        = aws_appautoscaling_target.ecs_target.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs_target.service_namespace

  step_scaling_policy_configuration {
    adjustment_type         = "ChangeInCapacity"
    cooldown                = 60
    metric_aggregation_type = "Maximum"

    step_adjustment {
      metric_interval_upper_bound = 0
      scaling_adjustment          = -1
    }
  }
}
But the 502 errors are still there.
A:
2 upvotes
Dmytro Sirant
11/20/2023
#1
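You are heading in the right direction. There are only two things left to do:
- Identify the metric that tells you when the service needs to scale out (most likely CPU usage).
- Update your scaling policy, resource "aws_appautoscaling_policy" "ecs_policy", based on the metric from step 1.
At the moment, your ecs_policy has no metric to scale on: it is a step scaling policy that only removes a task, and nothing in the configuration triggers it.
Here is an example: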
resource "aws_appautoscaling_policy" "ecs_target_cpu" {
  name               = "application-scaling-policy-cpu"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.ecs_service_target.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs_service_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs_service_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
    target_value = 80
  }

  depends_on = [aws_appautoscaling_target.ecs_service_target]
}
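Adapted to the resource names used in the question, a target-tracking policy might look like the sketch below. It assumes the scaling target is the aws_appautoscaling_target.ecs_target from the question; the policy name, the target value of 60 and both cooldown values are illustrative assumptions, not tuned recommendations.

```hcl
# Sketch only: target-tracking policy attached to the question's scaling target.
resource "aws_appautoscaling_policy" "ecs_target_cpu" {
  name               = "gtm-cpu-target-tracking" # hypothetical name
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.ecs_target.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }

    # Average CPU percentage the service should hover around (assumed value).
    target_value = 60

    # Seconds to wait after a scaling activity before the next one (assumed values).
    scale_out_cooldown = 60
    scale_in_cooldown  = 300
  }
}
```

Once Application Auto Scaling manages the task count, the service's lifecycle block usually also lists desired_count in ignore_changes, so that a later terraform apply does not reset the count chosen by the policy.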
Comments
1 upvote
x89
11/20/2023
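I have tried these settings, with a target value of 300, but I still occasionally get 502 errors. They may be less frequent than before, but they are still there. For example: MsgSent: 2421, MsgFailed: 3. Besides target_value and the min-max capacity (which is 15 in my setup atm), what other parameters can I use?
1 upvote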
Dmytro Sirant
11/20/2023
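What is the reason for a target value of 300? It is the utilization percentage at which the ECS service has to scale out. With this value, it waits for the load to reach 300% (which is impossible) before creating additional tasks. docs.aws.amazon.com/autoscaling/application/APIReference/...... Another thing I just noticed: why does the container have CPU = 1024 while the task has CPU = 2048? If the container only uses 1024 CPU units, the task will only ever be loaded to 50%.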
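One way to act on the sizing remark is to derive the task-level and container-level values from a single place, so they cannot drift apart. The fragment below is a sketch under assumptions: the locals names are hypothetical, the container definition is written with jsonencode instead of the question's heredoc, and the environment and logConfiguration entries from the question are omitted for brevity and would stay as they were.

```hcl
# Sketch only: container sized to match the task (hypothetical locals).
locals {
  gtm_task_cpu    = 2048 # CPU units, same as the task-level value in the question
  gtm_task_memory = 4096 # MiB, same as the task-level value in the question
}

resource "aws_ecs_task_definition" "PrimaryServerSideContainer" {
  family                   = "PrimaryServerSideContainer"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = local.gtm_task_cpu
  memory                   = local.gtm_task_memory
  execution_role_arn       = aws_iam_role.gtm_container_exec_role.arn
  task_role_arn            = aws_iam_role.gtm_container_role.arn

  container_definitions = jsonencode([
    {
      name      = "primary"
      image     = "gcr.io/cloud-tagging-10302018/gtm-cloud-image"
      cpu       = local.gtm_task_cpu    # container reservation equals the task size
      memory    = local.gtm_task_memory
      essential = true
      portMappings = [
        { containerPort = 80, hostPort = 80 }
      ]
    }
  ])
}
```

Whether this changes the reported utilization depends on how the container-level value is enforced on the launch type in use, so it is worth comparing the service's CPU metric before and after the change.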