Asked by: Spam  Asked: 10/18/2023  Last edited by: philipxy, Spam  Updated: 10/20/2023  Views: 53
Is there an alternative to rank() over (partition by ... order by ...)?
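Q:
I am trying to replace rank() over (partition by ... order by ...) in a subquery with GROUP BY and min/max, because rank() is slowing the query down. The slowdown happens because the subquery partitions every row and then ranks it. I only want the first working date for each applicant, without having to go through ranking and partitioning.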
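As a minimal sketch of the idea above, the snippet below runs both forms against a toy table in an in-memory SQLite database (SQLite 3.25 or later is needed for window functions). The table `people` and its rows are made up for illustration and are not the real `applicants.people` data. It shows that min() with group by returns the same earliest date per person as rank() = 1, with one difference worth knowing: rank() keeps tied rows, so it can return more than one row per person.

```python
import sqlite3

# Toy data (hypothetical): person 2 has two rows with the same earliest date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table people (person_id integer, working_date text);
    insert into people values
        (1, '2022-03-01'), (1, '2022-01-15'), (1, '2022-06-30'),
        (2, '2022-02-10'), (2, '2022-02-10'),
        (3, '2022-09-05');
""")

# Original style: rank every row inside its partition, keep rank 1.
ranked = conn.execute("""
    select person_id, working_date
    from (select person_id, working_date,
                 rank() over (partition by person_id
                              order by working_date asc) as list
          from people) c
    where c.list = 1
    order by person_id
""").fetchall()

# Alternative: one aggregate row per person, no window function.
grouped = conn.execute("""
    select person_id, min(working_date) as working_date
    from people
    group by person_id
    order by person_id
""").fetchall()

print(len(ranked), len(grouped))
print(set(ranked) == set(grouped))
```

On this toy data the ranked form returns an extra row for the tie, while the grouped form returns exactly one row per person. Whether the grouped form is actually faster depends on the database engine and the data, so it is worth checking the query plan.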
I have commented out the original parts of the query to show what I tried.
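- I replaced the where clause that used a subselect with a join: join applicants_cte apps on apps.person_key = apps.person_key
- I group by all selected fields. I want to replace partition by ap.person_id with group by ap.person_id, so that I don't have to partition every row of data.
- I replaced order by ap.work_date asc with min(ap.working_date).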
These steps gave me the expected count (1,500,000), but once I removed select * from (select rank() over (partition by ap.person_id order by ap.working_date asc) ..., I got a very large count (34,000,000).
with applicants_cte as
(select person_key
from applicants.people
where date between '01-01-2022' and '01-01-2023'
group by person_key
)
,
list_of_applicants as
-- Removed subquery and rank()
/*
(select *
from (select rank() over (partition by ap.person_id order by
ap.working_date asc) list,
*/
(select min(ap.working_date),
ap.person_id,
date(applicants.start_date) as start_date,
a.first_name,
a.last_name,
b.email_address
from applicants.people ap
join information a on a.person_key = ap.id_key
join contact_information b on b.person_info_id = ap.id_key
-- replaced where statement with join
join applicants_cte apps on apps.person_key = apps.person_key
where date(applicants.start_date) IS NOT NULL
/*
where date(applicants.start_date) IS NOT NULL
and ap.person_id in (select applicant_id
from applicants_cte)) c
where c.list = 1
*/
group by ap.person_id,
start_date,
a.first_name,
a.last_name,
b.email_address
)
select count(*) from list_of_applicants
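One way a gap like the one between the two counts can arise is that the number of rows a group by returns depends on every column in the grouping list and on how many rows each join contributes. The sketch below is an illustration on made-up tables (`people` and `contact` here are hypothetical, simplified stand-ins), not a diagnosis of the real data.

```python
import sqlite3

# Toy data (hypothetical): person 1 has two start dates and two e-mail rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table people (person_id integer, start_date text, working_date text);
    create table contact (person_id integer, email_address text);
    insert into people values
        (1, '2022-01-01', '2022-01-15'),
        (1, '2022-05-01', '2022-06-30'),
        (2, '2022-02-01', '2022-02-10');
    insert into contact values
        (1, 'a@example.com'), (1, 'b@example.com'), (2, 'c@example.com');
""")

# Grouping by person only: one row per person.
per_person = conn.execute("""
    select count(*) from (
        select ap.person_id, min(ap.working_date) as first_date
        from people ap
        group by ap.person_id) t
""").fetchone()[0]

# Grouping by person plus other selected columns, after a join:
# one row per distinct (person, start_date, email) combination.
wide_sql = """
    select ap.person_id, ap.start_date, b.email_address,
           min(ap.working_date) as first_date
    from people ap
    join contact b on b.person_id = ap.person_id
    group by ap.person_id, ap.start_date, b.email_address
"""
per_combination = conn.execute(
    "select count(*) from (" + wide_sql + ") t").fetchone()[0]

# Counting distinct people over the wide result recovers the per-person figure.
distinct_people = conn.execute(
    "select count(distinct person_id) from (" + wide_sql + ") t").fetchone()[0]

print(per_person, per_combination, distinct_people)
```

Here the wide grouping returns more rows than there are people, while counting distinct person_id values over the same result matches the per-person count.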
A:
0 votes
Spam
10/20/2023
#1
I needed to add a DISTINCT to my first CTE, applicants_cte, and I split my second CTE, list_of_applicants, into two subqueries.
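The final query is not shown, so the following is only a guess at its general shape, run on made-up tables in an in-memory SQLite database. The step name `first_dates`, the simplified columns, and the join condition apps.person_key = ap.person_key are assumptions for illustration. The first step lists distinct applicants, the second computes one earliest date per applicant, and the descriptive columns are joined only afterwards.

```python
import sqlite3

# Toy data (hypothetical), much simpler than the real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table people (person_key integer, person_id integer, working_date text);
    create table information (person_key integer, first_name text);
    insert into people values
        (10, 1, '2022-03-01'), (10, 1, '2022-01-15'),
        (20, 2, '2022-02-10'), (20, 2, '2022-04-01');
    insert into information values (10, 'Ada'), (20, 'Grace');
""")

rows = conn.execute("""
    with applicants_cte as (
        -- step 1: each applicant key once
        select distinct person_key
        from people
        where working_date between '2022-01-01' and '2023-01-01'
    ),
    first_dates as (
        -- step 2: earliest working date per applicant, no window function
        select ap.person_key, ap.person_id,
               min(ap.working_date) as first_working_date
        from people ap
        join applicants_cte apps on apps.person_key = ap.person_key
        group by ap.person_key, ap.person_id
    )
    -- step 3: attach descriptive columns to the already-reduced rows
    select f.person_id, a.first_name, f.first_working_date
    from first_dates f
    join information a on a.person_key = f.person_key
    order by f.person_id
""").fetchall()

print(rows)
```

Aggregating before joining the descriptive tables keeps the grouping list short, so the extra columns cannot split one applicant into several groups. This assumes each applicant has a single matching row in the joined tables.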
Comments
select count(distinct person_id)