Asked by: amestrian Asked: 11/16/2023 Last edited by: amestrian Updated: 11/16/2023 Views: 80
How to keep unmatched rows as nulls + rightly matched rows - bigquery sql
Q:
I have the following tables (for obvious reasons, the real user_id differs from what is shown):
Table A
user_id | acc_activation_date | generated_Date
1 | 13-04-2018 | 13-04-2018
1 | 13-04-2018 | 14-04-2018
1 | 13-04-2018 | 15-04-2018
1 | 13-04-2018 | 16-04-2018
1 | 13-04-2018 | 17-04-2018
1 | 13-04-2018 | 18-04-2018
1 | 13-04-2018 | 19-04-2018
1 | 13-04-2018 | 20-04-2018
1 | 13-04-2018 | 21-04-2018
1 | 13-04-2018 | 22-04-2018
1 | 13-04-2018 | 23-04-2018
1 | 13-04-2018 | 24-04-2018
1 | 13-04-2018 | 25-04-2018
Table B
user_id | acc_activation_date | joined_date | engagement_l1
1 | 13-04-2018 | 13-04-2018 | 1
1 | 13-04-2018 | 22-04-2018 | 1
1 | 13-04-2018 | 25-04-2018 | 0
Expected output:
user_id | generated_date | acc_activation_date | joined_date | engagement_l1
1 | 13-04-2018 | 13-04-2018 | 13-04-2018 | 1
1 | 14-04-2018 | 13-04-2018 | |
1 | 15-04-2018 | 13-04-2018 | |
1 | 16-04-2018 | 13-04-2018 | |
1 | 17-04-2018 | 13-04-2018 | |
1 | 18-04-2018 | 13-04-2018 | |
1 | 19-04-2018 | 13-04-2018 | |
1 | 20-04-2018 | 13-04-2018 | |
1 | 21-04-2018 | 13-04-2018 | |
1 | 22-04-2018 | 13-04-2018 | 22-04-2018 | 1
1 | 23-04-2018 | 13-04-2018 | |
1 | 24-04-2018 | 13-04-2018 | |
1 | 25-04-2018 | 13-04-2018 | 25-04-2018 | 0
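I have tried every combination of left join, right join, and full join I can think of, but none of them works in BigQuery SQL. This is my first time using BigQuery SQL, so it may behave differently from other IDEs.
This is my latest query: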
select distinct a.generated_Date,
a.user_id,
a.acc_activation_date,
b.engagement_l1,
b.joined_date
from TableB b
full join TableA a
on a.user_id=b.user_id
order by generated_date
And if I also join on the date:
select distinct a.generated_Date,
a.user_id,
a.acc_activation_date,
b.engagement_l1,
b.joined_Date
from TableB b
full join TableA a
on a.user_id=g.user_id and b.generated_date=a.joined_date
order by generated_date
A:
0 votes
MatBailie
11/16/2023
#1
You only need a left join to get the last column.
The activation date can be derived with a window function.
SELECT
a.user_id,
a.generated_date,
MAX(b.activation_date)
OVER (
PARTITION BY a.user_id
)
AS activation_date,
b.transaction_date
FROM
TableA a
LEFT JOIN
TableB b
ON a.user_id = b.user_id
AND a.generated_date = b.transaction_date
ORDER BY
a.user_id,
a.generated_date
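As a general tip, if you are using DISTINCT in a query to "fix" the results, there is probably a bug somewhere. (It's a code smell.)
Also, please make sure your sample data and your queries use the same table and column names.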
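To make the window-function idea concrete, here is a minimal sketch that runs the same query shape against a tiny in-memory table. Several things are assumptions for illustration only: SQLite (via Python's `sqlite3`, version 3.25 or later for window functions) stands in for BigQuery, the tables are named `table_a` and `table_b`, the columns use the question's names (`acc_activation_date`, `joined_date`), dates are stored as ISO strings so they sort correctly, and `table_a` is trimmed to three days without its own activation column.

```python
import sqlite3

# SQLite is a stand-in for BigQuery here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (user_id INTEGER, generated_date TEXT);
CREATE TABLE table_b (
    user_id INTEGER,
    acc_activation_date TEXT,
    joined_date TEXT,
    engagement_l1 INTEGER
);
INSERT INTO table_a VALUES
    (1, '2018-04-13'), (1, '2018-04-14'), (1, '2018-04-15');
INSERT INTO table_b VALUES
    (1, '2018-04-13', '2018-04-13', 1),
    (1, '2018-04-13', '2018-04-15', 0);
""")

# After the LEFT JOIN, b.acc_activation_date is NULL on unmatched days.
# MAX(...) OVER (PARTITION BY user_id) copies the user's single non-NULL
# value onto every row of that user.
rows = conn.execute("""
SELECT
    a.user_id,
    a.generated_date,
    MAX(b.acc_activation_date) OVER (PARTITION BY a.user_id) AS acc_activation_date,
    b.joined_date
FROM table_a a
LEFT JOIN table_b b
    ON a.user_id = b.user_id
   AND a.generated_date = b.joined_date
ORDER BY a.user_id, a.generated_date
""").fetchall()

for row in rows:
    print(row)
# (1, '2018-04-13', '2018-04-13', '2018-04-13')
# (1, '2018-04-14', '2018-04-13', None)
# (1, '2018-04-15', '2018-04-13', '2018-04-15')
```

The unmatched day (14-04-2018) keeps a NULL `joined_date` but still gets the activation date, which is the shape of the expected output in the question.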
Comments
0 votes
amestrian
11/16/2023
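Hi, thanks for your answer, but this doesn't solve the problem. I have just fixed the original post, so hopefully it is clearer now. Sorry about that.
0 votes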
MatBailie
11/16/2023
@amestrian What results do you get from that query? Did you change the column names correctly? It works here: dbfiddle.uk/VmOS2xqR
0 votes
GoonerForLife
11/16/2023
#2
A simple left join will do. The full join is the part that causes the duplicates.
select a.user_id,
a.generated_date,
a.acc_activation_date,
b.joined_date,
b.engagement_l1
from table_a a
LEFT JOIN table_b b
ON a.user_id = b.user_id
AND a.generated_date = b.joined_date
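As a rough check on the question's sample data, the sketch below counts the rows produced by different ON conditions. It rests on a few assumptions: SQLite (through Python's `sqlite3`) stands in for BigQuery, dates are stored as ISO strings, and `count_rows` is a hypothetical helper written only for this example. A LEFT JOIN is used throughout, so the comparison isolates the effect of the ON condition rather than the join type.

```python
import sqlite3

# SQLite is a stand-in for BigQuery here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_a (user_id INTEGER, acc_activation_date TEXT, generated_date TEXT)"
)
conn.execute(
    "CREATE TABLE table_b (user_id INTEGER, acc_activation_date TEXT, "
    "joined_date TEXT, engagement_l1 INTEGER)"
)

# Table A: the 13 consecutive days 13-04-2018 .. 25-04-2018 from the question.
conn.executemany(
    "INSERT INTO table_a VALUES (1, '2018-04-13', ?)",
    [(f"2018-04-{day:02d}",) for day in range(13, 26)],
)
# Table B: the three rows from the question.
conn.executemany(
    "INSERT INTO table_b VALUES (1, '2018-04-13', ?, ?)",
    [("2018-04-13", 1), ("2018-04-22", 1), ("2018-04-25", 0)],
)

def count_rows(on_clause, distinct=False):
    """Hypothetical helper: count the rows a LEFT JOIN yields for an ON clause."""
    keyword = "DISTINCT" if distinct else ""
    sql = f"""
        SELECT COUNT(*) FROM (
            SELECT {keyword} a.generated_date, a.user_id, a.acc_activation_date,
                   b.engagement_l1, b.joined_date
            FROM table_a a
            LEFT JOIN table_b b ON {on_clause}
        )
    """
    return conn.execute(sql).fetchone()[0]

# Joining on user_id alone pairs every A row with every B row of that user.
user_only = count_rows("a.user_id = b.user_id")
# DISTINCT does not collapse them, because joined_date differs per pairing.
distinct_user_only = count_rows("a.user_id = b.user_id", distinct=True)
# Adding the date to the ON clause leaves one output row per generated day.
user_and_date = count_rows(
    "a.user_id = b.user_id AND a.generated_date = b.joined_date"
)

# Days that actually found a partner row in Table B.
matched = conn.execute("""
    SELECT a.generated_date, b.joined_date, b.engagement_l1
    FROM table_a a
    LEFT JOIN table_b b
        ON a.user_id = b.user_id
       AND a.generated_date = b.joined_date
    WHERE b.joined_date IS NOT NULL
    ORDER BY a.generated_date
""").fetchall()

print(user_only, distinct_user_only, user_and_date)  # 39 39 13
print(matched)
```

On this data the join on `user_id` alone yields 13 × 3 = 39 rows, while adding the date condition yields the 13 rows of the expected output, 3 of them with non-NULL `joined_date`.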
Comments