具有记录限制的循环

Round robin with record limitation

提问人:user22073463 提问时间:8/30/2023 最后编辑:user22073463 更新时间:8/31/2023 访问量:38

问:

我必须为以下要求构建一个 hive SQL 查询

我有一个客户表。我需要将总表记录除以 6(即假设表每月包含 600 条记录,最多 6 个月包含 100 条记录),每个月都有一个针对客户的括号。假设括号限制为 4,那么我需要从 5 个唯一帐户中选择 5 个唯一的电子邮件 ID。以防万一 10 然后来自 10 个唯一帐户的 10 个唯一电子邮件 ID

注意:我正在使用 mod 操作分发 6 个月的记录。

帐户 电子邮件 模组,6
帐户 1 email@acc1 1
ACC2 email1@acc2 1
ACC2 email2@acc2 2
ACC2 email3@acc2 3
ACC2 email4@acc2 4
ACC2 email5@acc2 5
ACC2 email6@acc2 6
ACC2 email7@acc2 1
ACC3 email1@acc3 1
ACC3 email2@acc3 2
ACC3 email3@acc3 3
ACC4 email@acc4 1
ACC5 email1@acc5 1
ACC5 email2@acc5 2

预期输出 - 括号为 4(不需要低于输出 acc5,因为记录计数已达到括号范围 4)

帐户 电子邮件 模组,6
帐户 1 email@acc1 1
ACC2 email1@acc2 1
ACC3 email1@acc3 1
ACC4 email@acc4 1

如果括号是 8(我必须先选择所有唯一帐户,然后再选择其他序列才能达到括号范围)

预期输出

帐户 电子邮件 模组,6
帐户 1 email@acc1 1
ACC2 email1@acc2 1
ACC3 email1@acc3 1
ACC4 email@acc4 1
ACC5 email1@acc5 1
ACC2 email7@acc2 1
ACC2 email2@acc2 2
ACC3 email2@acc3 2

如果括号为 10

帐户 电子邮件 模组,6
帐户 1 email@acc1 1
ACC2 email1@acc2 1
ACC3 email1@acc3 1
ACC4 email@acc4 1
ACC5 email1@acc5 1
ACC2 email7@acc2 1
ACC2 email2@acc2 2
ACC3 email2@acc3 2
ACC5 email2@acc5 2
ACC2 email3@acc2 3

我尝试了下面的查询。但它首先获取所有 1 条记录。我不确定如何先使用mod_seq_value 1获取唯一的帐户记录,然后从mod seq -1开始剩余的记录。

select * from (
select *, Row_number() over(order by mod_num_seq,acc_count) as rnk
select account,email,
count(*) over(partition by account) as acc_count
,case 
when mod(row_number() over(partition by account),6)=0 then 6
else mod(row_number() over(partition by account),6)=0 
end as mod_num_seq
from 
customer
)a
)b where rnk<={:bracket}
Oracle Sequence HQL 分析 循环

评论

0赞 Adrian Maxwell 8/30/2023
MOD,6 基于什么?它是基于 ID 还是类似的东西?(在示例数据中显示)您是否正在寻找一种方法来随机化所选电子邮件?我不完全清楚你真正想要什么。
0赞 p3consulting 8/30/2023
NTILE() 函数可以帮助您,缺点是它需要一个表达式作为参数(而不是作为语句),因此您必须在之前计算它并作为参数传递。
0赞 user22073463 8/31/2023
Mod - 6 基于行号 - mod(row_number() over(partition by account),6)。我的要求是每个帐户至少携带一个电子邮件 ID。有些帐户包含多个电子邮件 ID。所以我正在使用 mod 操作。首先,以循环方式,我需要拉取 mod 值 - 1 条记录,然后是 2 条,依此类推。

答:

0赞 Dr Y Wit 8/31/2023 #1

不知道为什么输出中的第 6 行和第 7 行是 .email7@acc2, email2@acc2email7@acc2, email2@acc3

with customer (account, email, x) as
(
select 'acc1','email@acc1', 1 from dual
union all select 'acc2','email1@acc2',  1 from dual
union all select 'acc2','email2@acc2',  2 from dual
union all select 'acc2','email3@acc2',  3 from dual
union all select 'acc2','email4@acc2',  4 from dual
union all select 'acc2','email5@acc2',  5 from dual
union all select 'acc2','email6@acc2',  6 from dual
union all select 'acc2','email7@acc2',  1 from dual
union all select 'acc3','email1@acc3',  1 from dual
union all select 'acc3','email2@acc3',  2 from dual
union all select 'acc3','email3@acc3',  3 from dual
union all select 'acc4','email@acc4',   1 from dual
union all select 'acc5','email1@acc5',  1 from dual
union all select 'acc5','email2@acc5',  2 from dual
)
, t as 
(
select c.*, mod(row_number() over (partition by account order by email) - 1, 6) + 1 rn
from customer c
)
select t.*, row_number() over (partition by account order by rn, email) pick_up_order
from t
order by pick_up_order, account;

结果:

ACCOUNT EMAIL                X         RN PICK_UP_ORDER
------- ----------- ---------- ---------- -------------
acc1    email@acc1           1          1             1
acc2    email1@acc2          1          1             1
acc3    email1@acc3          1          1             1
acc4    email@acc4           1          1             1
acc5    email1@acc5          1          1             1
acc2    email7@acc2          1          1             2
acc3    email2@acc3          2          2             2
acc5    email2@acc5          2          2             2
acc2    email2@acc2          2          2             3
acc3    email3@acc3          3          3             3
acc2    email3@acc2          3          3             4
acc2    email4@acc2          4          4             5
acc2    email5@acc2          5          5             6
acc2    email6@acc2          6          6             7

14 rows selected.