Asked by: Siddarth Patil Asked: 11/13/2023 Updated: 11/13/2023 Views: 35
Is it possible to create complex row data based on column data using AWS Athena?
Q:
Problem:
I have a table with the following structure:
+----+------+-------------------+-------+
| id | code | size              | count |
+----+------+-------------------+-------+
| 1  | CA   | TwentyToFortyNine | 65    |
+----+------+-------------------+-------+
| 1  | CA   | FiveToNinteen     | 385   |
+----+------+-------------------+-------+
| 1  | CA   | OneToFour         | 492   |
+----+------+-------------------+-------+
| 1  | DK   | OneToFour         | 38    |
+----+------+-------------------+-------+
| 1  | DK   | TwentyToFortyNine | 1     |
+----+------+-------------------+-------+
| 2  | CA   | FiveToNinteen     | 389   |
+----+------+-------------------+-------+
| 2  | CA   | OneToFour         | 494   |
+----+------+-------------------+-------+
| 2  | DK   | FiveToNinteen     | 10    |
+----+------+-------------------+-------+
| 2  | DK   | OneToFour         | 38    |
+----+------+-------------------+-------+
The distinct values in the size column are OneToFour, FiveToNinteen, and TwentyToFortyNine. However, not every id has rows for all of these values. For example, id: 2 has no data for size: TwentyToFortyNine.
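Objective:
For each id, I want to generate new rows labeled code: TOTAL, one for every distinct value in the size column, each showing the total count for that size across all codes.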
Example output:
+----+-------+-------------------+-------+
| id | code  | size              | count |
+----+-------+-------------------+-------+
| 1  | TOTAL | OneToFour         | 530   |
+----+-------+-------------------+-------+
| 1  | TOTAL | FiveToNinteen     | 385   |
+----+-------+-------------------+-------+
| 1  | TOTAL | TwentyToFortyNine | 66    |
+----+-------+-------------------+-------+
| 1  | CA    | TwentyToFortyNine | 65    |
+----+-------+-------------------+-------+
| 1  | CA    | FiveToNinteen     | 385   |
+----+-------+-------------------+-------+
| 1  | CA    | OneToFour         | 492   |
+----+-------+-------------------+-------+
| 1  | DK    | OneToFour         | 38    |
+----+-------+-------------------+-------+
| 1  | DK    | TwentyToFortyNine | 1     |
+----+-------+-------------------+-------+
| 2  | TOTAL | OneToFour         | 532   |
+----+-------+-------------------+-------+
| 2  | TOTAL | FiveToNinteen     | 399   |
+----+-------+-------------------+-------+
| 2  | TOTAL | TwentyToFortyNine | 0     |
+----+-------+-------------------+-------+
| 2  | CA    | FiveToNinteen     | 389   |
+----+-------+-------------------+-------+
| 2  | CA    | OneToFour         | 494   |
+----+-------+-------------------+-------+
| 2  | DK    | FiveToNinteen     | 10    |
+----+-------+-------------------+-------+
| 2  | DK    | OneToFour         | 38    |
+----+-------+-------------------+-------+
Is this achievable with AWS Athena? If so, could you show how? Or do I need to use Pandas for this?
A:
1 upvote
Siddarth Patil
11/13/2023
#1
I was finally able to figure it out. If anyone comes looking for an answer, this might help:
-- Aggregate counts per (id, code, size) in case the source has duplicate rows.
WITH all_sizes AS (
    SELECT id, code, size, SUM(count) AS total_count
    FROM your_table_name
    GROUP BY id, code, size
),
all_combinations AS (
    -- One TOTAL row per (id, size), summed across all codes.
    SELECT id, 'TOTAL' AS code, size, SUM(total_count) AS count
    FROM all_sizes
    GROUP BY id, size
    UNION
    -- The original per-code rows.
    SELECT id, code, size, total_count AS count
    FROM all_sizes
)
SELECT * FROM all_combinations
ORDER BY id, code, size;
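One caveat with the query above: it only emits a TOTAL row for sizes that actually occur for a given id, so the `2 | TOTAL | TwentyToFortyNine | 0` row from the example output is not produced. Its ORDER BY also sorts TOTAL after CA and DK. The following is a minimal sketch of one way to cover both points, not a tested Athena query. It cross joins the distinct ids with the distinct sizes and left joins the data back, so missing pairs come out as 0. To keep it runnable, it is wrapped in a small Python harness that uses SQLite as a stand-in for Athena. That substitution, the function name `totals_with_zero_fill`, and the CTE names are assumptions for illustration. The SQL sticks to standard constructs (CTEs, CROSS JOIN, LEFT JOIN, COALESCE, UNION ALL), but it should be checked against your own table before relying on it.

```python
import sqlite3

# Sample rows copied from the question: (id, code, size, count).
ROWS = [
    (1, "CA", "TwentyToFortyNine", 65),
    (1, "CA", "FiveToNinteen", 385),
    (1, "CA", "OneToFour", 492),
    (1, "DK", "OneToFour", 38),
    (1, "DK", "TwentyToFortyNine", 1),
    (2, "CA", "FiveToNinteen", 389),
    (2, "CA", "OneToFour", 494),
    (2, "DK", "FiveToNinteen", 10),
    (2, "DK", "OneToFour", 38),
]

QUERY = """
WITH ids AS (
    SELECT DISTINCT id FROM your_table_name
),
sizes AS (
    SELECT DISTINCT size FROM your_table_name
),
totals AS (
    -- Every (id, size) pair, so a size missing for an id still gets a row.
    SELECT i.id, 'TOTAL' AS code, s.size,
           COALESCE(SUM(t."count"), 0) AS "count"
    FROM ids AS i
    CROSS JOIN sizes AS s
    LEFT JOIN your_table_name AS t
           ON t.id = i.id AND t.size = s.size
    GROUP BY i.id, s.size
),
combined AS (
    SELECT id, code, size, "count" FROM totals
    UNION ALL
    -- The original per-code rows, aggregated as in the answer above.
    SELECT id, code, size, SUM("count") AS "count"
    FROM your_table_name
    GROUP BY id, code, size
)
SELECT id, code, size, "count"
FROM combined
-- Put TOTAL rows first within each id, then sort by code and size.
ORDER BY id, CASE WHEN code = 'TOTAL' THEN 0 ELSE 1 END, code, size
"""


def totals_with_zero_fill(rows):
    """Load rows into an in-memory SQLite table and run QUERY against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE your_table_name '
        '(id INTEGER, code TEXT, size TEXT, "count" INTEGER)'
    )
    conn.executemany("INSERT INTO your_table_name VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in totals_with_zero_fill(ROWS):
        print(row)
```

Within each id the sizes are sorted alphabetically, which differs from the OneToFour, FiveToNinteen, TwentyToFortyNine order shown in the example output. If that order matters, one option is a second CASE expression on size in the ORDER BY.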