列定义的范围内的总和

Sum in range defined by column

提问人:IttayD 提问时间:11/27/2022 最后编辑:IttayD 更新时间:11/29/2022 访问量:118

问:

我有一个表格是这样的:

地位 时间戳
10 一个 0
10 B 1
15 B 2
10 C 3
12 D 4
20 一个 5
25 B 6
17 C 7
19 D 8

金额没有限制(除了是一个数字)。状态行可以有重复项(示例中的“B”)。

我想要的是将“A”状态之间的所有内容相加。所以结果应该是

时间戳
57 1
81 5

我需要这个用于ansi-sql(Spark)

ANSI-SQL

评论

1赞 Stu 11/27/2022
您的数据必须比您提供的要多,没有任何内容可以为您的数据提供任何排序。
1赞 nbk 11/27/2022
对于 SQL 来说,A 表本质上是未排序的,因此数据库无法区分第二行和第 7 行的 B,因此没有办法编写算法来汇总它们
0赞 IttayD 11/27/2022
@Stu,没有意识到这很重要。添加了时间戳
0赞 Jaytiger 11/27/2022
@IttayD,结果的时间戳为 1、5。不是 0、5 ?

答:

0赞 p3consulting 11/27/2022 #1

一旦您决定了“订单”列,一种可能的解决方案:

with data(ord, amount, status) as (
    select 1, 10, 'A' from dual union all
    select 2, 10, 'B' from dual union all
    select 3, 15, 'B' from dual union all
    select 4, 10, 'C' from dual union all
    select 5, 12, 'D' from dual union all
    select 6, 20, 'A' from dual union all
    select 7, 25, 'B' from dual union all
    select 8, 17, 'C' from dual union all
    select 9, 19, 'D' from dual 
),
pdata as (
    select d.*, case status when 'A' then lv else last_value(lv) ignore nulls over(order by ord) end as llv
    from (
        select d.*, 
            nvl(last_value(case status when 'A' then ord end) over(partition by status order by ord
                RANGE BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
            ), case status when 'A' then ord - 1 
            end) as lv
        from data d
    ) d
)
select sum(amount) from pdata
where llv is not null
group by llv
;

sum(amount)
57
81

请注意,将“when 'A' then lv else”替换为“when 'A' then null else”将为您提供 2 个 'A' 之间的行数总和(不包括第一个行)。

评论

0赞 IttayD 11/28/2022
谢谢。p_ord和n_ord有什么用?
0赞 p3consulting 11/28/2022
nothing:测试期间使用的剩余内容,我在发布时忘记删除。
0赞 p3consulting 11/28/2022
请注意,如果使用 TIMESTAMP 进行排序,您可能会在“RANGE BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING”上得到“ORA-00902”,然后不是直接从源表中进行选择,而是从子查询中进行选择,您将添加“row_number() over(order by the_timestamp_column) as ord”,然后它将再次工作。
0赞 Rodrigue 11/27/2022 #2

这是另一种可能性,假设表名是:Tmp


DROP PROCEDURE IF EXISTS summing;
DELIMITER |
CREATE PROCEDURE summing()
    BEGIN
        DECLARE _begin INT DEFAULT NULL;
        DECLARE _end INT DEFAULT NULL;
        
        SELECT `timestamp` INTO _begin FROM Tmp WHERE status='A' ORDER BY `timestamp` LIMIT 1;
        
        IF _begin IS NOT NULL THEN
            
            SELECT `timestamp` INTO _end FROM Tmp WHERE  status='A' AND `timestamp` > _begin ORDER BY `timestamp` LIMIT 1;
            WHILE _end IS NOT NULL DO
                SELECT SUM(amount) FROM Tmp WHERE `timestamp` > _begin AND `timestamp` < _end;
                SET _begin = _end;
                SET _end = NULL;
                SELECT `timestamp` INTO _end FROM Tmp WHERE  status='A' AND `timestamp` > _begin ORDER BY `timestamp` LIMIT 1;
            END WHILE;
        END IF;
    END|
DELIMITER ;

CALL summing ();

我已经在MySQL服务器上测试了它。

1赞 p3consulting 11/28/2022 #3

另一种解决方案:

with data(ts, amount, status) as (
    select to_timestamp('27-11-2022 12:00:00.00', 'DD-MM-YYYY HH24:MI:SS.FF'), 10, 'A' from dual union all
    select to_timestamp('27-11-2022 12:00:00.00', 'DD-MM-YYYY HH24:MI:SS.FF')+1/24, 10, 'B' from dual union all
    select to_timestamp('27-11-2022 12:00:00.00', 'DD-MM-YYYY HH24:MI:SS.FF')+2/24, 15, 'B' from dual union all
    select to_timestamp('27-11-2022 12:00:00.00', 'DD-MM-YYYY HH24:MI:SS.FF')+3/24, 10, 'C' from dual union all
    select to_timestamp('27-11-2022 12:00:00.00', 'DD-MM-YYYY HH24:MI:SS.FF')+4/24, 12, 'D' from dual union all
    select to_timestamp('27-11-2022 12:00:00.00', 'DD-MM-YYYY HH24:MI:SS.FF')+5/24, 20, 'A' from dual union all
    select to_timestamp('27-11-2022 12:00:00.00', 'DD-MM-YYYY HH24:MI:SS.FF')+6/24, 25, 'B' from dual union all
    select to_timestamp('27-11-2022 12:00:00.00', 'DD-MM-YYYY HH24:MI:SS.FF')+7/24, 17, 'C' from dual union all
    select to_timestamp('27-11-2022 12:00:00.00', 'DD-MM-YYYY HH24:MI:SS.FF')+8/24, 19, 'D' from dual 
)
select res from (
    select 
        status, ts, sum(amount) over(partition by s) as res
    from (
        select 
            d.*, sum(flag) over(order by ts) as s
        from (select d.*, decode(status,'A',1,0) as flag from data d) d
    ) d
)
where status = 'A'
order by ts
;