Asked by: Siddarth Patil · Asked: 10/31/2023 · Updated: 11/1/2023 · Views: 44
How do I create an external table in Athena from a CSV file with a JSON column in it?
Q:
I have a CSV file in S3 that looks like this:
id,name,secondary_id,created_at,last_modified_at,tags,report
2a-4c-4d-b0,foo1,103776194,2021-10-23 13:28:02.837511,2021-10-23 13:34:55.781556,"{""reports"": {""risk"": {""status"": ""ACTIVE""}, ""analysis"": {""status"": ""ACTIVE""}}}",health
2a-4c-4d-b0,bar1,103776194,2021-10-23 13:28:02.837511,2021-10-23 13:34:55.781556,"{""reports"": {""risk"": {""status"": ""ACTIVE""}, ""analysis"": {""status"": ""ACTIVE""}}}",risk
fc-ab-4a-8b,foo2,103101839,2021-10-23 12:54:25.662775,2021-10-23 12:56:54.53149,"{""reports"": {""risk"": {""status"": ""ACTIVE""}, ""analysis"": {""status"": ""ACTIVE""}}}",health
a9-2e-4e-b3,bar2,103776194,2021-10-23 13:23:35.286249,2021-10-23 13:35:22.340411,"{""reports"": {""risk"": {""status"": ""ACTIVE""}, ""analysis"": {""status"": ""ACTIVE""}}}",risk
I tried to create the table with this query:
CREATE EXTERNAL TABLE IF NOT EXISTS `test_table`
(
id STRING,
name STRING,
secondary_id STRING,
created_at TIMESTAMP,
last_modified_at TIMESTAMP,
tags STRING,
report STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION 's3://location/'
TBLPROPERTIES (
'skip.header.line.count' = '1'
);
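However, because the tags value contains commas (,), the table is not populated correctly: the JSON is split at each comma and treated as separate columns.
Does anyone know how I can fix this? Thanks.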
A:
1 upvote
Siddarth Patil
11/1/2023
#1
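I was able to figure it out. I used the OpenCSVSerde library with the following properties:
WITH SERDEPROPERTIES ('separatorChar' = ',', 'quoteChar' = '"')
Here separatorChar separates the columns, and quoteChar defines the start and end of a column value. So even though none of my other columns are wrapped in double quotes ("), the file is still interpreted correctly. Hope this helps.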
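For reference, here is a sketch of what the full statement might look like with those properties applied to the table from the question. The SerDe class name is the standard Hive one for OpenCSVSerde. Declaring created_at and last_modified_at as STRING is an assumption on my part, not something stated above: as far as I know, OpenCSVSerde in Athena only recognizes TIMESTAMP values given as UNIX time in milliseconds, so text timestamps like the ones in this file are safer to read as strings and convert at query time.

```sql
-- Sketch only: same columns and location as in the question.
-- Assumption: timestamp columns are declared STRING (see note above).
CREATE EXTERNAL TABLE IF NOT EXISTS `test_table`
(
  id STRING,
  name STRING,
  secondary_id STRING,
  created_at STRING,
  last_modified_at STRING,
  tags STRING,
  report STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  'separatorChar' = ',',
  'quoteChar' = '"'
)
LOCATION 's3://location/'
TBLPROPERTIES (
  'skip.header.line.count' = '1'
);
```

Once the table loads, the tags column should hold the whole JSON document as a single string, so it can be queried with Athena's JSON functions. The alias risk_status below is just an illustrative name:

```sql
-- Pull one nested field out of the JSON stored in the tags column.
SELECT
  id,
  name,
  json_extract_scalar(tags, '$.reports.risk.status') AS risk_status
FROM test_table;
```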
Comments
{""reports"": {""risk"": {""status"": ""ACTIVE""}, ""analysis"": {""status"": ""ACTIVE""}}}