提问人:user22073463 提问时间:8/30/2023 最后编辑:user22073463 更新时间:8/31/2023 访问量:38
具有记录限制的循环
Round robin with record limitation
问:
我必须为以下要求构建一个 hive SQL 查询
我有一个客户表。我需要将总表记录除以 6(即假设表每月包含 600 条记录,最多 6 个月包含 100 条记录),每个月都有一个针对客户的括号。假设括号限制为 4,那么我需要从 5 个唯一帐户中选择 5 个唯一的电子邮件 ID。以防万一 10 然后来自 10 个唯一帐户的 10 个唯一电子邮件 ID
注意:我正在使用 mod 操作分发 6 个月的记录。
帐户 | 电子邮件 | 模组,6 |
---|---|---|
帐户 1 | email@acc1 | 1 |
ACC2 | email1@acc2 | 1 |
ACC2 | email2@acc2 | 2 |
ACC2 | email3@acc2 | 3 |
ACC2 | email4@acc2 | 4 |
ACC2 | email5@acc2 | 5 |
ACC2 | email6@acc2 | 6 |
ACC2 | email7@acc2 | 1 |
ACC3 | email1@acc3 | 1 |
ACC3 | email2@acc3 | 2 |
ACC3 | email3@acc3 | 3 |
ACC4 | email@acc4 | 1 |
ACC5 | email1@acc5 | 1 |
ACC5 | email2@acc5 | 2 |
预期输出 - 括号为 4(不需要低于输出 acc5,因为记录计数已达到括号范围 4)
帐户 | 电子邮件 | 模组,6 |
---|---|---|
帐户 1 | email@acc1 | 1 |
ACC2 | email1@acc2 | 1 |
ACC3 | email1@acc3 | 1 |
ACC4 | email@acc4 | 1 |
如果括号是 8(我必须先选择所有唯一帐户,然后再选择其他序列才能达到括号范围)
预期输出
帐户 | 电子邮件 | 模组,6 |
---|---|---|
帐户 1 | email@acc1 | 1 |
ACC2 | email1@acc2 | 1 |
ACC3 | email1@acc3 | 1 |
ACC4 | email@acc4 | 1 |
ACC5 | email1@acc5 | 1 |
ACC2 | email7@acc2 | 1 |
ACC2 | email2@acc2 | 2 |
ACC3 | email2@acc3 | 2 |
如果括号为 10
帐户 | 电子邮件 | 模组,6 |
---|---|---|
帐户 1 | email@acc1 | 1 |
ACC2 | email1@acc2 | 1 |
ACC3 | email1@acc3 | 1 |
ACC4 | email@acc4 | 1 |
ACC5 | email1@acc5 | 1 |
ACC2 | email7@acc2 | 1 |
ACC2 | email2@acc2 | 2 |
ACC3 | email2@acc3 | 2 |
ACC5 | email2@acc5 | 2 |
ACC2 | email3@acc2 | 3 |
我尝试了下面的查询。但它首先获取所有 1 条记录。我不确定如何先使用mod_seq_value 1获取唯一的帐户记录,然后从mod seq -1开始剩余的记录。
select * from (
select *, Row_number() over(order by mod_num_seq,acc_count) as rnk
select account,email,
count(*) over(partition by account) as acc_count
,case
when mod(row_number() over(partition by account),6)=0 then 6
else mod(row_number() over(partition by account),6)=0
end as mod_num_seq
from
customer
)a
)b where rnk<={:bracket}
答:
0赞
Dr Y Wit
8/31/2023
#1
不知道为什么输出中的第 6 行和第 7 行是 .email7@acc2, email2@acc2
email7@acc2, email2@acc3
with customer (account, email, x) as
(
select 'acc1','email@acc1', 1 from dual
union all select 'acc2','email1@acc2', 1 from dual
union all select 'acc2','email2@acc2', 2 from dual
union all select 'acc2','email3@acc2', 3 from dual
union all select 'acc2','email4@acc2', 4 from dual
union all select 'acc2','email5@acc2', 5 from dual
union all select 'acc2','email6@acc2', 6 from dual
union all select 'acc2','email7@acc2', 1 from dual
union all select 'acc3','email1@acc3', 1 from dual
union all select 'acc3','email2@acc3', 2 from dual
union all select 'acc3','email3@acc3', 3 from dual
union all select 'acc4','email@acc4', 1 from dual
union all select 'acc5','email1@acc5', 1 from dual
union all select 'acc5','email2@acc5', 2 from dual
)
, t as
(
select c.*, mod(row_number() over (partition by account order by email) - 1, 6) + 1 rn
from customer c
)
select t.*, row_number() over (partition by account order by rn, email) pick_up_order
from t
order by pick_up_order, account;
结果:
ACCOUNT EMAIL X RN PICK_UP_ORDER
------- ----------- ---------- ---------- -------------
acc1 email@acc1 1 1 1
acc2 email1@acc2 1 1 1
acc3 email1@acc3 1 1 1
acc4 email@acc4 1 1 1
acc5 email1@acc5 1 1 1
acc2 email7@acc2 1 1 2
acc3 email2@acc3 2 2 2
acc5 email2@acc5 2 2 2
acc2 email2@acc2 2 2 3
acc3 email3@acc3 3 3 3
acc2 email3@acc2 3 3 4
acc2 email4@acc2 4 4 5
acc2 email5@acc2 5 5 6
acc2 email6@acc2 6 6 7
14 rows selected.
评论