Asked by: Lara19 | Asked: 11/14/2023 | Last edited by: Thom A, Lara19 | Updated: 11/15/2023 | Views: 70
Keep newest value by duplicate dates
Q:
I have a table named "staff":
Account  Score  UpdateTime               UpdateTime_order
K1897    A      2023-09-08 14:57:58.113  1
K1897    B      2023-09-08 14:57:57.896  2
K1897    B      2023-08-01 10:07:57.487  3
K1897    B      2023-06-28 07:23:57.696  4
K1897    B      2023-06-05 14:20:13.789  5
K1898    C      2023-06-04 14:20:13.789  1
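Each staff member can have only one score per day, so the score for account K1897 on 2023-09-08 should be A. (The score changed from B to A that day.)
To solve this, I decided to convert the datetime to a date and, where dates are duplicated, keep the row with the latest UpdateTime.
For example, I dropped the row with UpdateTime_order = 2 for account K1897, because its original update time 2023-09-08 14:57:57.896 < 2023-09-08 14:57:58.113: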
Account  Score  UpdateTime  UpdateTime_order
K1897    A      2023-09-08  1
K1897    B      2023-08-01  3
K1897    B      2023-06-28  4
K1897    B      2023-06-05  5
K1898    C      2023-06-04  1
Then UpdateTime_order is renumbered based on the new result.
What I expect:
Account  Score  UpdateTime  UpdateTime_order
K1897    A      2023-09-08  1
K1897    B      2023-08-01  2
K1897    B      2023-06-28  3
K1897    B      2023-06-05  4
K1898    C      2023-06-04  1
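To reproduce the problem, the sample rows can be loaded with a script along the following lines. The column types are assumptions, since the question does not state them; datetime2(3) is used here so that the milliseconds shown in the table are stored exactly (the older datetime type would round them to increments of .000, .003 or .007 seconds).

```sql
-- Assumed schema: the question does not give the column types.
CREATE TABLE staff (
    Account          varchar(10)  NOT NULL,
    Score            char(1)      NOT NULL,
    UpdateTime       datetime2(3) NOT NULL,  -- keeps milliseconds exactly
    UpdateTime_order int          NOT NULL
);

-- Sample rows copied from the table in the question.
INSERT INTO staff (Account, Score, UpdateTime, UpdateTime_order)
VALUES
    ('K1897', 'A', '2023-09-08 14:57:58.113', 1),
    ('K1897', 'B', '2023-09-08 14:57:57.896', 2),
    ('K1897', 'B', '2023-08-01 10:07:57.487', 3),
    ('K1897', 'B', '2023-06-28 07:23:57.696', 4),
    ('K1897', 'B', '2023-06-05 14:20:13.789', 5),
    ('K1898', 'C', '2023-06-04 14:20:13.789', 1);
```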
My code:
;WITH CTE_staff AS (
    SELECT
        Account,
        Score,
        CAST([UpdateTime] AS date) AS UpdateDate,
        UpdateTime,
        UpdateTime_order
    FROM staff
)
SELECT
    Account,
    Score,
    UpdateDate,
    -- renumber the surviving rows per account, newest first
    ROW_NUMBER() OVER (PARTITION BY Account ORDER BY UpdateTime DESC) AS UpdateTime_order3
FROM (
    SELECT *,
        -- rank rows within each account and calendar day, newest first
        ROW_NUMBER() OVER (PARTITION BY Account, UpdateDate ORDER BY UpdateTime DESC) AS UpdateTime_order2
    FROM CTE_staff
) jj
WHERE jj.UpdateTime_order2 = 1
It runs successfully, but I think I wrote it in a complicated way by creating new columns. Is there a simpler way to do this?
A:
Fiddle: https://dbfiddle.uk/_NZfe5JW
WITH RankedStaff AS (
    SELECT
        Account,
        Score,
        UpdateTime,
        ROW_NUMBER() OVER (PARTITION BY Account, CONVERT(DATE, UpdateTime) ORDER BY UpdateTime DESC) AS RowNum
    FROM
        staff
)
SELECT
    Account,
    Score,
    UpdateTime
FROM
    RankedStaff
WHERE
    RowNum = 1;
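This query uses a common table expression (CTE) named RankedStaff to assign a row number to each row within a partition, where each partition is defined by the combination of Account and the date part of UpdateTime. The ROW_NUMBER() function orders the rows in each partition by UpdateTime in descending order, so the row with the latest UpdateTime gets row number 1.
The final SELECT statement then filters the result to the rows where RowNum equals 1, which gives you only the row with the latest UpdateTime for each account on each day.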
Here is how to do it with a single ROW_NUMBER() to get the most recently updated row for each account and day:
SELECT Account, Score, UpdateTime, UpdateTime_order
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Account, CAST(UpdateTime AS date) ORDER BY UpdateTime DESC) AS rn
    FROM staff
) AS s
WHERE rn = 1
ORDER BY UpdateTime DESC
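Note that the query above returns the stored UpdateTime_order values, so K1897 would still show 1, 3, 4, 5 rather than the renumbered 1 to 4 the question asks for. One possible way to renumber in the same pass, sketched here against the staff table from the question, is to add DENSE_RANK() over the date: because exactly one row per account and day survives the rn = 1 filter, the dense rank of the date within each account equals the new sequence number. The alias UpdateDate is taken from the question's own code; this is an illustrative variant, not part of the original answer.

```sql
-- Sketch: keep the newest row per account and day, and renumber in one derived table.
SELECT Account, Score, UpdateDate, UpdateTime_order
FROM (
    SELECT
        Account,
        Score,
        CAST(UpdateTime AS date) AS UpdateDate,
        -- 1 = newest row within each account and calendar day
        ROW_NUMBER() OVER (
            PARTITION BY Account, CAST(UpdateTime AS date)
            ORDER BY UpdateTime DESC
        ) AS rn,
        -- rows sharing a date share a rank, so no gaps remain after filtering
        DENSE_RANK() OVER (
            PARTITION BY Account
            ORDER BY CAST(UpdateTime AS date) DESC
        ) AS UpdateTime_order
    FROM staff
) AS s
WHERE rn = 1
ORDER BY Account, UpdateTime_order;
```

With the sample data, both 2023-09-08 rows for K1897 receive rank 1 and only the 14:57:58.113 row (score A) passes the filter, so the remaining rows are numbered 1 through 4 from newest to oldest.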
It can also be done using GROUP BY and the aggregate function MAX():
First we need to identify the latest UpdateTime for each account and each day:
SELECT Account, MAX(UpdateTime) AS MAX_UpdateTime
FROM staff
GROUP BY Account, CAST(UpdateTime AS DATE)
Then join this dataset back to the table, as follows:
SELECT s.*
FROM staff s
INNER JOIN (
    SELECT Account, MAX(UpdateTime) AS MAX_UpdateTime
    FROM staff
    GROUP BY Account, CAST(UpdateTime AS DATE)
) AS t ON s.Account = t.Account AND s.UpdateTime = t.MAX_UpdateTime
ORDER BY s.UpdateTime DESC
This should return the expected result you described:
with cte_staff as (
    select Account, Score, CAST([UpdateTime] AS Date) UpdateDate, UpdateTime_order,
        row_number() over (partition by Account, cast([UpdateTime] as Date) order by UpdateTime desc) rn
    from staff
)
select Account, Score, UpdateDate,
    row_number() over (partition by Account order by Score, rn) UpdateTime_order
from cte_staff
where rn = 1
order by Account, UpdateTime_order;
Account  Score  UpdateDate  UpdateTime_order
K1897    A      2023-09-08  1
K1897    B      2023-06-05  2
K1897    B      2023-06-28  3
K1897    B      2023-08-01  4
K1898    C      2023-06-04  1
See the fiddle.
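In the output of this last query, the B rows for K1897 are numbered oldest first, whereas the question's expected output numbers them newest first. The cause is the final ordering by Score, rn: every surviving row has rn = 1, so rows with the same Score tie and their relative order is left to the engine. A possible variant, assuming the same staff table, orders the final ROW_NUMBER() by UpdateDate DESC instead. The CTE name latest_per_day is a hypothetical name chosen for illustration.

```sql
-- Sketch: same two-step idea, but the renumbering is driven by the date.
WITH latest_per_day AS (
    SELECT
        Account,
        Score,
        CAST(UpdateTime AS date) AS UpdateDate,
        -- 1 = newest row within each account and calendar day
        ROW_NUMBER() OVER (
            PARTITION BY Account, CAST(UpdateTime AS date)
            ORDER BY UpdateTime DESC
        ) AS rn
    FROM staff
)
SELECT
    Account,
    Score,
    UpdateDate,
    -- UpdateDate is unique per account after the rn = 1 filter, so this order is deterministic
    ROW_NUMBER() OVER (PARTITION BY Account ORDER BY UpdateDate DESC) AS UpdateTime_order
FROM latest_per_day
WHERE rn = 1
ORDER BY Account, UpdateTime_order;
```

On the sample data this numbers K1897 as 2023-09-08 = 1, 2023-08-01 = 2, 2023-06-28 = 3, 2023-06-05 = 4, and K1898 as 2023-06-04 = 1, which matches the expected table in the question.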