Group records by continuous date periods in SQL

Q:

I have a table like this:

t104f005_employee_no	t104f040_position_no	t104f025_date_effective	t104f030_date_to
11354	89043	01/07/1999	01/04/2012
11354	89043	02/04/2012	31/08/2014
11354	89043	01/09/2014	21/09/2014
11354	89043	22/09/2014	27/09/2015
11354	23273	04/05/2015	27/09/2015
11354	94040	28/09/2015	10/01/2016
11354	94040	11/01/2016	08/05/2017
11354	94040	09/05/2017	24/03/2019
11354	99406	26/02/2018	01/07/2018
11354	99406	02/07/2018	30/09/2018
11354	99406	01/10/2018	01/11/2018
11354	97293	02/11/2018	25/11/2018
11354	99406	26/11/2018	10/03/2019
11354	97293	11/03/2019	24/03/2019
11354	99406	25/03/2019	30/09/2019
11354	99406	01/10/2019	15/03/2020
11354	97293	11/11/2019	24/11/2019
11354	99406	16/03/2020	10/05/2020
11354	101808	11/05/2020	14/06/2020
11354	99406	11/05/2020	14/06/2020
11354	101808	15/06/2020	04/10/2020
11354	99406	15/06/2020	04/10/2020
11354	101808	05/10/2020	25/10/2020
11354	99406	05/10/2020	21/11/2021
11354	101808	26/10/2020	31/01/2021
11354	101808	01/02/2021	28/02/2021
11354	101808	01/03/2021	30/06/2021
11354	101806	01/07/2021	18/07/2021
11354	101808	19/07/2021	21/11/2021
11354	101808	22/11/2021	28/11/2021
11354	99406	22/11/2021	28/11/2021
11354	101808	29/11/2021	20/03/2022
11354	101806	21/03/2022	17/04/2022
11354	101808	21/03/2022	22/05/2022
11354	101808	23/05/2022	09/04/2023
11354	101903	15/08/2022	12/02/2023
11354	105009	13/02/2023	28/02/2023
11354	105009	01/03/2023	09/04/2023
11354	101808	10/04/2023	NULL
11354	80717	15/05/2023	12/05/2024

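I need to identify the consecutive periods for each position that have no breaks between them, and output the minimum start date and maximum end date for each continuous period. The expected result looks like this: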

t104f005_employee_no	t104f040_position_no	t104f025_date_effective	t104f030_date_to
11354	89043	01/07/1999	27/09/2015
11354	23273	04/05/2015	27/09/2015
11354	94040	28/09/2015	24/03/2019
11354	99406	26/02/2018	01/11/2018
11354	97293	02/11/2018	25/11/2018
11354	99406	26/11/2018	10/03/2019
11354	97293	11/03/2019	24/03/2019
11354	99406	25/03/2019	28/11/2021
11354	97293	11/11/2019	24/11/2019
11354	101808	11/05/2020	30/06/2021
11354	101806	01/07/2021	18/07/2021
11354	101808	19/07/2021	NULL
11354	101806	21/03/2022	17/04/2022
11354	101903	15/08/2022	12/02/2023
11354	105009	13/02/2023	28/02/2023
11354	105009	01/03/2023	09/04/2023
11354	80717	15/05/2023	12/05/2024
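
One way to read "continuous" in the expected result, assuming a period continues the previous one when it starts exactly one day after the previous period's end date for the same employee and position, is sketched below in Python. The helper names `parse_dmy` and `continues_previous` are hypothetical, and dates are parsed as dd/mm/yyyy to match the sample rows.

```python
from datetime import date, datetime, timedelta


def parse_dmy(text: str) -> date:
    """Parse a dd/mm/yyyy string such as '02/04/2012'."""
    return datetime.strptime(text, "%d/%m/%Y").date()


def continues_previous(prev_end: str, next_start: str) -> bool:
    """True when next_start is exactly one day after prev_end (assumed rule)."""
    return parse_dmy(next_start) - parse_dmy(prev_end) == timedelta(days=1)


# Position 89043: a period ending 01/04/2012 is followed by one starting 02/04/2012.
print(continues_previous("01/04/2012", "02/04/2012"))  # True
# Position 99406: a period ending 10/03/2019 is followed by one starting 25/03/2019.
print(continues_previous("10/03/2019", "25/03/2019"))  # False
```

Under this reading, the first pair belongs to one output row and the second pair starts a new one, which matches the separate 99406 rows in the expected result.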

I have tried the following without success, and I'm a bit stumped now.

WITH RankedPositions AS (
  SELECT
    [t104f005_employee_no],
    [t104f040_position_no],
    [t104f025_date_effective],
    [t104f030_date_to],
    ROW_NUMBER() OVER (PARTITION BY [t104f005_employee_no] ORDER BY [t104f025_date_effective]) -
    ROW_NUMBER() OVER (PARTITION BY [t104f005_employee_no], [t104f040_position_no] ORDER BY [t104f025_date_effective]) AS grp
  FROM [AUR11PROD].[dbo].[t104_employment_history] with (nolock)
  WHERE [t104f005_employee_no] = '11354'
)    
SELECT
  [t104f005_employee_no],
  [t104f040_position_no],
  MIN([t104f025_date_effective]) AS min_startdate,
  MAX([t104f030_date_to]) AS max_enddate
FROM RankedPositions
GROUP BY [t104f005_employee_no], [t104f040_position_no], grp
ORDER BY [t104f005_employee_no], min_startdate;

And also this:

WITH RECURSIVE ContinuousPositions AS (
  SELECT
    employee_no,
    position_no,
    startdate,
    enddate
  FROM your_table_name
  WHERE NOT EXISTS (
    SELECT 1
    FROM your_table_name AS t2
    WHERE t2.employee_no = your_table_name.employee_no
      AND t2.position_no = your_table_name.position_no
      AND t2.startdate < your_table_name.startdate
  )
  
  UNION ALL
  
  SELECT
    cp.employee_no,
    cp.position_no,
    cp.startdate,
    t.enddate
  FROM ContinuousPositions AS cp
  JOIN your_table_name AS t ON (
    cp.employee_no = t.employee_no
    AND cp.position_no = t.position_no
    AND cp.enddate = DATEADD(day, -1, t.startdate)
  )
)    
SELECT
  employee_no,
  position_no,
  MIN(startdate) AS min_startdate,
  MAX(enddate) AS max_enddate
FROM ContinuousPositions
GROUP BY employee_no, position_no
ORDER BY employee_no, min_startdate;

sql-server t-sql gaps-and-islands

A:

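with
cte as
(
    -- g = 1 when this row does not start the day after the previous row's
    -- end date (same employee and position), i.e. a new period begins
    select employee_no, position_no, date_effective, date_expiry,
           g = case when dateadd(day, -1, date_effective)
                    <>   lag(date_expiry) over (partition by employee_no, position_no
                                                    order by date_effective)
                    then 1
                    else 0
                    end
    from   employment_history
),
cte2 as
(
   -- the running total of g numbers the continuous periods; a NULL end date
   -- is swapped for a sentinel so that max() does not ignore it
   select employee_no, position_no, date_effective, 
          date_expiry = isnull(date_expiry, '99991231'), 
          grp = sum(g) over (partition by employee_no, position_no
                                 order by date_effective)  
    from  cte
)
-- one row per continuous period; nullif turns the sentinel back into NULL
select employee_no, position_no, 
       date_effective = min(date_effective), 
       date_expiry    = nullif(max(date_expiry), '99991231')
from   cte2
group by employee_no, position_no, grp
order by employee_no, date_effective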

db<>fiddle demo
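
For readers who want to trace the logic outside the database, here is a minimal Python sketch of the same idea: flag each row that does not start the day after the previous row ended, then collapse the rows between flags. It is not the answer's code. The names `merge_periods` and `d`, the tuple layout, and the use of `None` for an open-ended period are assumptions made for illustration, and the sample rows are a handful copied from the question.

```python
from datetime import date, datetime, timedelta
from itertools import groupby


def d(text):
    """Parse a dd/mm/yyyy string into a date."""
    return datetime.strptime(text, "%d/%m/%Y").date()


def merge_periods(rows):
    """Merge day-adjacent periods per (employee_no, position_no).

    rows: iterable of (employee_no, position_no, start, end), where end may
    be None for an open-ended period (assumed layout).
    """
    open_end = date.max  # sentinel standing in for a missing end date
    ordered = sorted(rows, key=lambda r: (r[0], r[1], r[2]))
    merged = []
    for (emp, pos), group in groupby(ordered, key=lambda r: (r[0], r[1])):
        prev_end = None
        for _, _, start, end in group:
            end = open_end if end is None else end
            # A new period starts unless this row begins the day after
            # the previous row for the same position ended.
            if prev_end is None or start - prev_end != timedelta(days=1):
                merged.append([emp, pos, start, end])
            else:
                merged[-1][3] = max(merged[-1][3], end)
            prev_end = end
    merged.sort(key=lambda r: (r[0], r[2]))  # employee, then start date
    return [(e, p, s, None if x == open_end else x) for e, p, s, x in merged]


rows = [
    (11354, 99406, d("26/02/2018"), d("01/07/2018")),
    (11354, 99406, d("02/07/2018"), d("30/09/2018")),
    (11354, 99406, d("01/10/2018"), d("01/11/2018")),
    (11354, 97293, d("02/11/2018"), d("25/11/2018")),
    (11354, 99406, d("26/11/2018"), d("10/03/2019")),
    (11354, 101808, d("23/05/2022"), d("09/04/2023")),
    (11354, 101808, d("10/04/2023"), None),
]
result = merge_periods(rows)
for row in result:
    print(row)
```

With these rows the three adjacent 99406 periods collapse into one running from 26/02/2018 to 01/11/2018, the later 99406 period stays separate, and the 101808 periods merge into a single open-ended one.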

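Hey @squirrel, thanks! I tried this but I get this error. Msg 8127, Level 16, State 1, Line 27 Column "cte2.t104f025_date_effective" is invalid in the ORDER BY clause because it is not contained in either an aggregate function or the GROUP BY clause. I'm using this version - Microsoft SQL Server 2016 (SP3-GDR) (KB5021129) - 13.0.6430.49 (X64) Jan 22 2023 17:38:22 Copyright (c) Microsoft Corporation Enterprise Edition: Core-based Licensing (64-bit) on Windows Server 2016 Datacenter 10.0 <X64> (Build 14393: ) (Hypervisor)

0 upvotes Squirrel 9/11/2023

I didn't use exactly the same column names as yours because my brain can't handle those extra t104... prefixes. Just compare the DDL, then find and replace with your actual column names, and it will work.

0 upvotes Carl Blunck 9/11/2023

Cheers @squirrell. I must have edited something incorrectly the first time.

0 upvotes Carl Blunck 9/11/2023

@DaleK - I did, still learning.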
