Is it possible to create complex row data based on column data using AWS Athena?

Asked by: Siddarth Patil  Asked: 11/13/2023  Updated: 11/13/2023  Views: 35

Q:

Problem:

I have a table structured as follows:

+----+------+-------------------+-------+
| id | code | size              | count |
+----+------+-------------------+-------+
| 1  | CA   | TwentyToFortyNine | 65    |
+----+------+-------------------+-------+
| 1  | CA   | FiveToNinteen     | 385   |
+----+------+-------------------+-------+
| 1  | CA   | OneToFour         | 492   |
+----+------+-------------------+-------+
| 1  | DK   | OneToFour         | 38    |
+----+------+-------------------+-------+
| 1  | DK   | TwentyToFortyNine | 1     |
+----+------+-------------------+-------+
| 2  | CA   | FiveToNinteen     | 389   |
+----+------+-------------------+-------+
| 2  | CA   | OneToFour         | 494   |
+----+------+-------------------+-------+
| 2  | DK   | FiveToNinteen     | 10    |
+----+------+-------------------+-------+
| 2  | DK   | OneToFour         | 38    |
+----+------+-------------------+-------+

The distinct values in the size column are OneToFour, FiveToNinteen, and TwentyToFortyNine. However, not every code entry includes all of these values. For example, id: 2 has no data for size: TwentyToFortyNine.

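Objective:

My goal is to generate new rows for each id, labeled code: TOTAL, that show every distinct value from the size column together with its total count across all codes. A size with no data for a given id should appear with a count of 0.

Sample output: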

+----+-------+-------------------+-------+
| id | code  | size              | count |
+----+-------+-------------------+-------+
| 1  | TOTAL | OneToFour         | 530   |
+----+-------+-------------------+-------+
| 1  | TOTAL | FiveToNinteen     | 385   |
+----+-------+-------------------+-------+
| 1  | TOTAL | TwentyToFortyNine | 66    |
+----+-------+-------------------+-------+
| 1  | CA    | TwentyToFortyNine | 65    |
+----+-------+-------------------+-------+
| 1  | CA    | FiveToNinteen     | 385   |
+----+-------+-------------------+-------+
| 1  | CA    | OneToFour         | 492   |
+----+-------+-------------------+-------+
| 1  | DK    | OneToFour         | 38    |
+----+-------+-------------------+-------+
| 1  | DK    | TwentyToFortyNine | 1     |
+----+-------+-------------------+-------+
| 2  | TOTAL | OneToFour         | 532   |
+----+-------+-------------------+-------+
| 2  | TOTAL | FiveToNinteen     | 399   |
+----+-------+-------------------+-------+
| 2  | TOTAL | TwentyToFortyNine | 0     |
+----+-------+-------------------+-------+
| 2  | CA    | FiveToNinteen     | 389   |
+----+-------+-------------------+-------+
| 2  | CA    | OneToFour         | 494   |
+----+-------+-------------------+-------+
| 2  | DK    | FiveToNinteen     | 10    |
+----+-------+-------------------+-------+
| 2  | DK    | OneToFour         | 38    |
+----+-------+-------------------+-------+

Is this achievable with AWS Athena? If so, could you show how? Or do I need to use Pandas for this?

Tags: SQL, Amazon Web Services, Amazon Athena

A:

1 upvote Siddarth Patil 11/13/2023 #1

I was finally able to figure it out. If anyone comes looking for an answer, this may help:

WITH all_sizes AS (
    -- Total count per id, code and size
    SELECT id, code, size, SUM(count) AS total_count
    FROM your_table_name
    GROUP BY id, code, size
),
all_combinations AS (
    -- TOTAL rows: sum across all codes for each id and size
    SELECT id, 'TOTAL' AS code, size, SUM(total_count) AS count
    FROM all_sizes
    GROUP BY id, size
    UNION
    -- Per-code rows, unchanged
    SELECT id, code, size, total_count AS count
    FROM all_sizes
)
SELECT * FROM all_combinations
ORDER BY id, code, size;
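
One caveat: the query above only produces a TOTAL row for a size that actually occurs under an id, so the (2, TOTAL, TwentyToFortyNine, 0) row from the sample output would not appear. Below is a minimal sketch of one way to fill in those zero rows, by pairing every id with every size and left-joining the totals. It runs against Python's built-in sqlite3 as a local stand-in for Athena. The SQL uses only CTEs, CROSS JOIN, LEFT JOIN, COALESCE and UNION ALL, which Athena should also accept, but it has not been run there. The helper run_query and the constants QUERY and ROWS are hypothetical names for this illustration, and the sketch assumes one row per (id, code, size), as in the sample data.

```python
import sqlite3

# Sketch only: SQLite stands in for Athena so this can be run locally.
# your_table_name is the placeholder table name used in the answer above.
QUERY = """
WITH totals AS (
    -- Total count per id and size, summed across all codes
    SELECT id, size, SUM(count) AS total_count
    FROM your_table_name
    GROUP BY id, size
),
all_pairs AS (
    -- Every id paired with every size seen anywhere in the table
    SELECT i.id AS id, s.size AS size
    FROM (SELECT DISTINCT id FROM your_table_name) AS i
    CROSS JOIN (SELECT DISTINCT size FROM your_table_name) AS s
)
-- TOTAL rows; pairs with no data get 0 via COALESCE
SELECT p.id AS id, 'TOTAL' AS code, p.size AS size,
       COALESCE(t.total_count, 0) AS count
FROM all_pairs AS p
LEFT JOIN totals AS t ON t.id = p.id AND t.size = p.size
UNION ALL
-- Original per-code rows (assumes one row per id, code, size)
SELECT id, code, size, count
FROM your_table_name
ORDER BY id, code, size
"""

# Sample rows copied from the question's input table.
ROWS = [
    (1, "CA", "TwentyToFortyNine", 65),
    (1, "CA", "FiveToNinteen", 385),
    (1, "CA", "OneToFour", 492),
    (1, "DK", "OneToFour", 38),
    (1, "DK", "TwentyToFortyNine", 1),
    (2, "CA", "FiveToNinteen", 389),
    (2, "CA", "OneToFour", 494),
    (2, "DK", "FiveToNinteen", 10),
    (2, "DK", "OneToFour", 38),
]


def run_query(rows):
    """Load rows into an in-memory table and return the query result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE your_table_name "
        "(id INTEGER, code TEXT, size TEXT, count INTEGER)"
    )
    conn.executemany("INSERT INTO your_table_name VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


result = run_query(ROWS)
for row in result:
    print(row)
```

Because the final ORDER BY sorts code alphabetically, the TOTAL rows come after CA and DK within each id rather than first as in the sample output. If the order matters, a CASE expression in the ORDER BY could move them to the top.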