大查询 SQL 中的数组拆分

Array Split in Big Query SQL

提问人:user2458552 提问时间:11/14/2023 更新时间:11/14/2023 访问量:42

问:

我想从嵌套的列字符串中获取列string_1和string_2。数组和拆分的混合不起作用。任何帮助都是值得赞赏的,TIA!当我尝试过这个但不起作用时:

ARRAY(SELECT DISTINCT(SPLIT(r, '/abc')[safe_ordinal(1)]), FROM UNNEST(table) r)  as string_1,
ARRAY(SELECT DISTINCT(SPLIT(r, '/abc')[safe_ordinal(2)]), FROM UNNEST(table) r)  as string_2,

enter image description here

sql google-bigquery

评论


答:

0赞 Mikhail Berlyant 11/14/2023 #1

使用以下方法

select num, any_value(strings) as strings, 
  array_agg(part[0]) as string_1, 
  array_agg(part[1]) as string_2
from your_data,
unnest(strings) string,
unnest([struct(split(string, '/abc/') as part)]) 
group by num      

如果应用于问题中的示例数据 - 输出为

enter image description here

0赞 Martin Weitzmann 11/14/2023 #2

我认为你走在正确的轨道上 - 这可以在没有分组/连接的情况下解决。我认为具有最少计算步骤的性能最高(可并行化)的解决方案是数组上的子查询,该子查询创建包含给定子行解决方案的新结构数组:

with test_data as (
  select * from unnest([
    struct(1 as num, ['/a/abc/1','/a/abc/2','/a/abc/3'] as strings),
    (2,['/b/abc/1','/b/abc/2','/b/abc/3']),
    (3,['/c/abc/1','/c/abc/2','/c/abc/3'])
  ])
)

-- select * from test_data
select 
  num,
  array( -- create new array of structs with solution
    select as struct
      s as str,
      split(s,'/abc/')[0] as string_1,
      split(s,'/abc/')[1] as string_2
    from unnest(strings) as s
  ) as strings
from 
  test_data

导致query result