Asked by: Ivan Lopatkin Asked: 4/21/2023 Last edited by: Ivan Lopatkin Updated: 4/24/2023 Views: 75
ROW_NUMBER or other sequence depending on date (SQL)
Q:
I'm having trouble creating a sequence with row_number and still can't work it out.
I have a table:
bc | io | date |
---|---|---|
1a | 11 | 2022-01-01 |
1a | 11 | 2022-01-02 |
1a | 12 | 2022-01-03 |
1a | 11 | 2022-01-04 |
When I use a simple row_number with partition by bc, io and order by date, I get this result:
bc | io | date | rn |
---|---|---|---|
1a | 11 | 2022-01-01 | 1 |
1a | 11 | 2022-01-02 | 2 |
1a | 12 | 2022-01-03 | 1 |
1a | 11 | 2022-01-04 | 3 |
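But I need the result below: whenever io changes, the numbering for the next io should start from 1 again, even if that io value has already appeared earlier.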
bc | io | date | rn |
---|---|---|---|
1a | 11 | 2022-01-01 | 1 |
1a | 11 | 2022-01-02 | 2 |
1a | 12 | 2022-01-03 | 1 |
1a | 11 | 2022-01-04 | 1 |
I tried this SQL, but it does not give the correct result:
select tt.*, row_number() over (partition by tt.bc, tt.io order by tt.date) as rn
from (
    select '1a' as bc, 11 as io, '2021-01-01' as date
    union all
    select '1a' as bc, 11 as io, '2021-01-02' as date
    union all
    select '1a' as bc, 12 as io, '2021-01-03' as date
    union all
    select '1a' as bc, 11 as io, '2021-01-04' as date
) as tt
A:
0 votes
markalex
4/21/2023
#1
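You can manually compare each row with the "previous" row and build a change indicator from it. A running sum of that indicator then gives you a partition number that identifies each uninterrupted block.
This partition number can be used in the partition clause of your row_number.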
with tt as (
    select '1a' as bc, 11 as io, '2021-01-01' as date
    union all
    select '1a' as bc, 11 as io, '2021-01-02' as date
    union all
    select '1a' as bc, 12 as io, '2021-01-03' as date
    union all
    select '1a' as bc, 11 as io, '2021-01-04' as date
), t2 as (
    -- ind = 1 when bc or io differs from the previous row (or there is no previous row)
    select
        tt.*,
        case
            when bc = lag(bc) over (order by date)
             and io = lag(io) over (order by date) then 0
            else 1
        end as ind
    from tt
), t3 as (
    -- the running sum of ind numbers each uninterrupted block
    select
        t2.*,
        sum(ind) over (order by date) as pid
    from t2
)
select
    bc,
    io,
    date,
    row_number() over (partition by pid order by date) as rn
from t3
Demo (in MySQL) here.
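To make the change indicator and its running sum visible, here is a minimal sketch that runs the same CTE chain through Python's sqlite3 module and also returns ind and pid. It assumes an SQLite build with window functions (3.25 or later) rather than the MySQL used in the demo; the column is renamed to dt, and the names QUERY, conn and rows are illustrative only.

```python
import sqlite3

# Same logic as the query above, with the intermediate columns kept in the output.
# Assumption: the bundled SQLite is 3.25+ (window functions available).
QUERY = """
with tt(bc, io, dt) as (
    values ('1a', 11, '2021-01-01'),
           ('1a', 11, '2021-01-02'),
           ('1a', 12, '2021-01-03'),
           ('1a', 11, '2021-01-04')
), t2 as (
    -- ind = 1 on the first row and whenever bc or io differs from the previous row
    select tt.*,
           case when bc = lag(bc) over (order by dt)
                 and io = lag(io) over (order by dt)
                then 0 else 1 end as ind
    from tt
), t3 as (
    -- running sum of ind: one pid per uninterrupted block
    select t2.*, sum(ind) over (order by dt) as pid
    from t2
)
select bc, io, dt, ind, pid,
       row_number() over (partition by pid order by dt) as rn
from t3
order by dt
"""

conn = sqlite3.connect(":memory:")
rows = conn.execute(QUERY).fetchall()
conn.close()

for row in rows:
    print(row)
```

For the four sample rows, ind comes out as 1, 0, 1, 1, so pid is 1, 1, 2, 3 and rn is 1, 2, 1, 1, which matches the result the question asks for.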
Edit: to ignore changes in bc, leave out the part of the case condition that mentions bc:
with tt as (
    select '1a' as bc, 11 as io, '2021-01-01' as date
    union all
    select '1b' as bc, 11 as io, '2021-01-02' as date
    union all
    select '1a' as bc, 12 as io, '2021-01-03' as date
    union all
    select '1a' as bc, 11 as io, '2021-01-04' as date
), t2 as (
    -- ind = 1 only when io differs from the previous row; changes in bc are ignored
    select
        tt.*,
        case when io = lag(io) over (order by date) then 0 else 1 end as ind
    from tt
), t3 as (
    -- the running sum of ind numbers each uninterrupted block
    select
        t2.*,
        sum(ind) over (order by date) as pid
    from t2
)
select
    bc,
    io,
    date,
    row_number() over (partition by pid order by date) as rn
from t3
Comments
0 votes
markalex
4/24/2023
@andexte, sorry, I missed the OP's comment about this. Added a query for that case.
0 votes
astentx
4/21/2023
#2
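This is a common gaps-and-islands problem: group consecutive values of an attribute per key (given some "time-like" dimension). The approach is:
- Compute row_number per key, ordered by the time dimension.
- Compute row_number per key and attribute of interest, ordered by the time dimension.
- Take their difference: this groups consecutive identical attribute values (when the attribute changes, the second row_number falls behind the first, restarting at 1 for a new value, so the difference increases).
Here is a query: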
with src as (
    select inline(array(
        struct('1a', 11, date '2022-01-01'),
        struct('1a', 11, date '2022-01-02'),
        struct('1a', 12, date '2022-01-03'),
        struct('1a', 11, date '2022-01-04')
    )) as (bc, io, dt)
)
, prepared as (
    select
        src.*
        /*Partition by keys*/
        , row_number() over(partition by bc order by dt asc)
        /*Partition by keys AND attributes to track changes and create groups*/
        - row_number() over(partition by bc, io order by dt asc) as rn_diff
    from src
)
select
    bc, io, dt
    /*Partition by keys AND attributes to track changes AND group number*/
    , row_number() over(partition by bc, io, rn_diff order by dt asc) as rn
from prepared
order by dt asc
bc | io | dt | rn |
---|---|---|---|
1a | 11 | 2022-01-01 | 1 |
1a | 11 | 2022-01-02 | 2 |
1a | 12 | 2022-01-03 | 1 |
1a | 11 | 2022-01-04 | 1 |
dbfiddle on Postgres (with more attributes added).
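The three steps can also be traced column by column. The sketch below is only an illustration: it runs an equivalent query through Python's sqlite3 module (assuming SQLite 3.25 or later for window functions, not the engine of the original query) and keeps both row numbers and their difference in the output. The aliases rn_key and rn_attr and the variable names are hypothetical.

```python
import sqlite3

# Row-number difference technique with every intermediate value exposed.
# Assumption: the bundled SQLite is 3.25+ (window functions available).
QUERY = """
with src(bc, io, dt) as (
    values ('1a', 11, '2022-01-01'),
           ('1a', 11, '2022-01-02'),
           ('1a', 12, '2022-01-03'),
           ('1a', 11, '2022-01-04')
), prepared as (
    select src.*,
           -- step 1: row number per key
           row_number() over (partition by bc order by dt) as rn_key,
           -- step 2: row number per key and attribute
           row_number() over (partition by bc, io order by dt) as rn_attr
    from src
)
select bc, io, dt, rn_key, rn_attr,
       -- step 3: the difference is constant within one consecutive run
       rn_key - rn_attr as rn_diff,
       row_number() over (partition by bc, io, rn_key - rn_attr order by dt) as rn
from prepared
order by dt
"""

conn = sqlite3.connect(":memory:")
trace = conn.execute(QUERY).fetchall()
conn.close()

for row in trace:
    print(row)
```

On the sample data rn_key is 1, 2, 3, 4 and rn_attr is 1, 2, 1, 3, so rn_diff is 0, 0, 2, 1: the two runs of io = 11 get different rn_diff values (0 and 1), which is what lets the final row_number restart at 1 for the last row.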
Comments
1 vote
markalex
4/21/2023
It may produce incorrect output on larger inputs: dbfiddle.uk/jPMBxpSJ Note the last row.
1 vote
astentx
4/21/2023
@markalex Thanks, I missed the key columns and the attribute in the final partition by. Fixed.
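As a rough illustration of why the final partition by needs the key and the attribute as well as rn_diff, the sketch below compares both variants on a small input constructed for this purpose (it is not the data from the linked dbfiddle). It uses Python's sqlite3 module, assuming SQLite 3.25 or later; TEMPLATE, run, diff_only and full_key are hypothetical names.

```python
import sqlite3

# Illustrative input: io goes 11, 12, 11, 11, so two different runs share rn_diff = 1.
# Assumption: the bundled SQLite is 3.25+ (window functions available).
TEMPLATE = """
with src(bc, io, dt) as (
    values ('1a', 11, '2022-01-01'),
           ('1a', 12, '2022-01-02'),
           ('1a', 11, '2022-01-03'),
           ('1a', 11, '2022-01-04')
), prepared as (
    select src.*,
           row_number() over (partition by bc order by dt)
         - row_number() over (partition by bc, io order by dt) as rn_diff
    from src
)
select dt, io, rn_diff,
       row_number() over (partition by {keys} order by dt) as rn
from prepared
order by dt
"""


def run(keys):
    """Run the query with the given final partition-by column list."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(TEMPLATE.format(keys=keys)).fetchall()
    finally:
        conn.close()


diff_only = run("rn_diff")          # partition by the difference alone
full_key = run("bc, io, rn_diff")   # partition by key, attribute and difference

print([r[3] for r in diff_only])
print([r[3] for r in full_key])
```

Here rn_diff is 0, 1, 1, 1, so the io = 12 row and the second run of io = 11 collide when partitioning by rn_diff alone and rn comes out as 1, 1, 2, 3. Adding bc and io to the partition separates them and gives the intended 1, 1, 1, 2.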