Asked by: Drum  Asked: 11/16/2023  Updated: 11/16/2023  Views: 48
Querying a sub-group of data in MySQL
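Q:
I'm trying to put together a query to get an answer out of my data. The data is sales data: a customer may place several orders in one day, and each invoice may consist of several lines. The data looks roughly like this: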
Customer | Invoice Date | Invoice Number | Stock | Total Value
---------------------------------------------------------------
Acme | 01/01/2023 | 1234 | Cod | £20
Acme | 01/01/2023 | 1234 | Hake | £15
Acme | 01/01/2023 | 2468 | Cod | £10
Acme | 01/01/2023 | 2468 | Hake | £12
Acme | 02/01/2023 | 3699 | Cod | £20
Acme | 02/01/2023 | 4567 | Hake | £15
Acme | 03/01/2023 | 9876 | Cod | £10
Acme | 03/01/2023 | 9876 | Hake | £1
Beta | 01/01/2023 | 8976 | Cod | £10
Beta | 01/01/2023 | 8976 | Hake | £15
Beta | 01/01/2023 | 5432 | Cod | £5
Beta | 01/01/2023 | 5432 | Hake | £12
Beta | 02/01/2023 | 2233 | Cod | £20
Beta | 02/01/2023 | 2233 | Hake | £15
Beta | 02/01/2023 | 1590 | Cod | £10
Beta | 02/01/2023 | 1590 | Hake | £15
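What I'm trying to find is the total sales, the number of days ordered, the number of invoices, the average spend per invoice, and a count of invoices below a given threshold (say £25), like this: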
Customer | Sales Total | Days Ordered Count | Invoice Count | Average per Invoice | Invoice < £25
-------------------------------------------------------------------------------------------------
Acme | £103 | 3 | 5 | £20.60 | 4
Beta | £102 | 2 | 4 | £25.50 | 1
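I can get the first five columns, but I'm stumped on how to approach the last one. This is the SQL I'm using for the first five columns, and it works.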
select Customer, sum(Total_Value) AS Sales_Total,
count(distinct Invoice_Date) AS Days_Ordered_Count,
COUNT(distinct Invoice_Number) AS Invoice_Count,
sum(Total_Value)/count(distinct Invoice_Number) AS Average_per_Invoice
from sales_2023
group by Customer
order by Customer;
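I've tried window functions, but I got some rather unexpected results, certainly not what I was after. I've also tried a subquery, but I couldn't work out how to aggregate the separate lines of an invoice together inside the subquery, so that didn't work either.
Any pointers on how to approach this would be very welcome.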
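For anyone who wants to run the queries in the answers below, here is a minimal setup sketch for the sample data. The table name `sales_2023` and the column names come from the queries in this thread; the column types are assumptions, and the dates are read as DD/MM/YYYY, which is also an assumption.

```sql
-- Hypothetical schema: names follow the queries in this thread, types are assumed.
CREATE TABLE sales_2023 (
    Customer       VARCHAR(50)   NOT NULL,
    Invoice_Date   DATE          NOT NULL,
    Invoice_Number INT           NOT NULL,
    Stock          VARCHAR(50)   NOT NULL,
    Total_Value    DECIMAL(10,2) NOT NULL
);

-- Sample rows from the question (dates assumed to be DD/MM/YYYY).
INSERT INTO sales_2023 (Customer, Invoice_Date, Invoice_Number, Stock, Total_Value) VALUES
    ('Acme', '2023-01-01', 1234, 'Cod',  20),
    ('Acme', '2023-01-01', 1234, 'Hake', 15),
    ('Acme', '2023-01-01', 2468, 'Cod',  10),
    ('Acme', '2023-01-01', 2468, 'Hake', 12),
    ('Acme', '2023-01-02', 3699, 'Cod',  20),
    ('Acme', '2023-01-02', 4567, 'Hake', 15),
    ('Acme', '2023-01-03', 9876, 'Cod',  10),
    ('Acme', '2023-01-03', 9876, 'Hake',  1),
    ('Beta', '2023-01-01', 8976, 'Cod',  10),
    ('Beta', '2023-01-01', 8976, 'Hake', 15),
    ('Beta', '2023-01-01', 5432, 'Cod',   5),
    ('Beta', '2023-01-01', 5432, 'Hake', 12),
    ('Beta', '2023-01-02', 2233, 'Cod',  20),
    ('Beta', '2023-01-02', 2233, 'Hake', 15),
    ('Beta', '2023-01-02', 1590, 'Cod',  10),
    ('Beta', '2023-01-02', 1590, 'Hake', 15);
```

The day/month reading of the dates does not affect the distinct-day counts for this sample, since each customer's dates remain distinct either way.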
Answers:
3 votes
Amit Mohanty
11/16/2023
#1
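You can use a subquery to calculate the total for each invoice by grouping on Customer and Invoice_Number (and Invoice_Date, so it remains available to the outer query). The outer query then aggregates per customer and counts the invoices whose Invoice_Total is less than £25.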
SELECT
Customer,
SUM(Sales_Total) AS Sales_Total,
COUNT(DISTINCT Invoice_Date) AS Days_Ordered_Count,
COUNT(DISTINCT Invoice_Number) AS Invoice_Count,
SUM(Sales_Total) / COUNT(DISTINCT Invoice_Number) AS Average_per_Invoice,
SUM(CASE WHEN Invoice_Total < 25 THEN 1 ELSE 0 END) AS Invoice_Less_Than_25
FROM (
SELECT
Customer,
Invoice_Number,
Invoice_Date,
SUM(Total_Value) AS Sales_Total,
SUM(Total_Value) AS Invoice_Total
FROM sales_2023
GROUP BY Customer, Invoice_Number, Invoice_Date
) AS subquery
GROUP BY Customer
ORDER BY Customer;
Comments
0 votes
Drum
11/21/2023
I eventually got this one working too, many thanks!
1 vote
bi_noob
11/16/2023
#2
MySQL code with two subqueries, for readability. You can merge them into one if you want.
select subset_one.customer,subset_one.Sales_Total,
subset_one.Days_ordered_Count,
subset_one.Invoice_Count,
subset_one.Average_per_Invoice,
subset_two.count_less_25
from
(
select customer,sum(total_value) AS Sales_Total,
count(distinct Invoice_Date) AS Days_ordered_Count,
COUNT(distinct Invoice_Number) AS Invoice_Count,
sum(total_value)/count(distinct Invoice_Number) AS Average_per_Invoice
from sales_2023 group by Customer
) subset_one
inner join
(
select subset_lt.customer, sum(subset_lt.less_than_25) as count_less_25
from
(
select customer,Invoice_Number, (case when sum(total_value) < 25 then 1 else 0 end) as less_than_25
from sales_2023 group by Customer,Invoice_Number
) subset_lt group by subset_lt.Customer
) subset_two
on
subset_one.customer = subset_two.customer
Added a DB Fiddle link: Less than 25
Comments
0 votes
Drum
11/16/2023
This worked, thank you. I'm not sure I fully understand why, but now that I have a working example I'll get there eventually!
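As a possible aid to seeing why it works, the innermost subquery of this answer can be run on its own. The sketch below assumes the `sales_2023` table and column names used in this thread; the `Invoice_Total` column and the `ORDER BY` are additions for illustration only.

```sql
-- Step 1 in isolation: collapse the invoice lines into one row per invoice
-- and flag each invoice whose total is under the threshold.
SELECT
    Customer,
    Invoice_Number,
    SUM(Total_Value) AS Invoice_Total,  -- added for illustration
    CASE WHEN SUM(Total_Value) < 25 THEN 1 ELSE 0 END AS less_than_25
FROM sales_2023
GROUP BY Customer, Invoice_Number
ORDER BY Customer, Invoice_Number;

-- With the sample data from the question (decimal formatting omitted):
--   Acme | 1234 | 35 | 0
--   Acme | 2468 | 22 | 1
--   Acme | 3699 | 20 | 1
--   Acme | 4567 | 15 | 1
--   Acme | 9876 | 11 | 1
--   Beta | 1590 | 25 | 0
--   Beta | 2233 | 35 | 0
--   Beta | 5432 | 17 | 1
--   Beta | 8976 | 25 | 0
```

Because each invoice is now a single row carrying a 0/1 flag, summing the flag per customer in the next level up counts the invoices under £25: 4 for Acme and 1 for Beta. Flagging the raw rows of `sales_2023` instead would test individual invoice lines rather than whole invoices.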
0 votes
mandy8055
11/16/2023
#3
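For this, you can use a subquery to first calculate the total value of each invoice, then use conditional aggregation in the main query to count the invoices that fall below the given threshold. Something like this: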
SELECT
Customer,
SUM(Invoice_Total) AS Sales_Total,
COUNT(DISTINCT Invoice_Date) AS Days_Ordered_Count,
COUNT(DISTINCT Invoice_Number) AS Invoice_Count,
ROUND(SUM(Invoice_Total) / COUNT(DISTINCT Invoice_Number), 2) AS Average_per_Invoice,
COUNT(CASE WHEN Invoice_Total < 25 THEN Invoice_Number ELSE NULL END) AS Invoice_Less_Than_25
FROM (
SELECT
Customer,
Invoice_Date,
Invoice_Number,
SUM(Total_Value) AS Invoice_Total
FROM sales_2023
GROUP BY Customer, Invoice_Date, Invoice_Number
) AS invoice_totals
GROUP BY Customer
ORDER BY Customer;
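The same idea can also be sketched with a common table expression and a configurable threshold. This is a variant, not part of the original answer: it assumes MySQL 8.0 or later (for `WITH`), the user variable name `@threshold` is made up for illustration, and it relies on MySQL evaluating a comparison to 1 or 0 so that `SUM()` can count the matches.

```sql
-- Hypothetical user variable holding the threshold (the name is an assumption).
SET @threshold = 25;

WITH invoice_totals AS (
    -- One row per invoice, carrying that invoice's total.
    SELECT
        Customer,
        Invoice_Date,
        Invoice_Number,
        SUM(Total_Value) AS Invoice_Total
    FROM sales_2023
    GROUP BY Customer, Invoice_Date, Invoice_Number
)
SELECT
    Customer,
    SUM(Invoice_Total) AS Sales_Total,
    COUNT(DISTINCT Invoice_Date) AS Days_Ordered_Count,
    COUNT(DISTINCT Invoice_Number) AS Invoice_Count,
    ROUND(SUM(Invoice_Total) / COUNT(DISTINCT Invoice_Number), 2) AS Average_per_Invoice,
    -- In MySQL a comparison yields 1 or 0, so summing it counts the matching invoices.
    SUM(Invoice_Total < @threshold) AS Invoice_Below_Threshold
FROM invoice_totals
GROUP BY Customer
ORDER BY Customer;
```

With the sample data from the question and a threshold of 25, this should reproduce the desired figures: Acme with 103, 3, 5, 20.60 and 4, and Beta with 102, 2, 4, 25.50 and 1. The `SUM(comparison)` shorthand is MySQL-specific; the `CASE` form in the answers above is the more portable choice.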