使用多个 WITH 子查询的替代方法

Alternative to using multiple WITH subqueries

提问人:David Eler 提问时间:10/5/2023 最后编辑:philipxyDavid Eler 更新时间:10/6/2023 访问量:48

问:

我正在通过 Google 数据分析专业证书的 Cyclistic 案例研究来练习数据分析。在我的分析过程中,我开发了这个查询来获取按季节共享的自行车数据。

什么是不那么口头/更聪明的方法,因为从子查询到子查询唯一变化的是季节名称和月份间隔?

WITH summer AS (
    SELECT
        'summer' AS season,
        avg(ride_duration) AS avg_ride_duration,
        max(ride_duration) AS max_ride_duration,
        count(ride_id) AS ride_count,
        mode() WITHIN GROUP (ORDER BY day_of_week) AS mode_day_of_week
    FROM trip_data_total
    WHERE
        extract(month FROM started_at) BETWEEN 06 AND 08
),
autumn AS (
    SELECT
        'autumn' AS season,
        avg(ride_duration) AS avg_ride_duration,
        max(ride_duration) AS max_ride_duration,
        count(ride_id) AS ride_count,
        mode() WITHIN GROUP (ORDER BY day_of_week) AS mode_day_of_week
    FROM trip_data_total
    WHERE
        extract(month FROM started_at) BETWEEN 09 AND 11
),
winter AS (
    SELECT
        'winter' AS season,
        avg(ride_duration) AS avg_ride_duration,
        max(ride_duration) AS max_ride_duration,
        count(ride_id) AS ride_count,
        mode() WITHIN GROUP (ORDER BY day_of_week) AS mode_day_of_week
    FROM trip_data_total
    WHERE
        extract(month FROM started_at) IN (12,01,02)
),
spring AS (
    SELECT
        'spring' AS season,
        avg(ride_duration) AS avg_ride_duration,
        max(ride_duration) AS max_ride_duration,
        count(ride_id) AS ride_count,
        mode() WITHIN GROUP (ORDER BY day_of_week) AS mode_day_of_week
    FROM trip_data_total
    WHERE
        extract(month FROM started_at) BETWEEN 03 AND 05
)
SELECT * FROM summer
UNION ALL
SELECT * FROM autumn
UNION ALL
SELECT * FROM winter
UNION ALL
SELECT * FROM spring
ORDER BY ride_count DESC
SQL PostgreSQL 子查询

评论

0赞 NickW 10/5/2023
是的,只需使用 CASE 语句来计算季节并按它进行分组 - 那么您只需要一个 SELECT 语句

答:

1赞 eshirvana 10/5/2023 #1

您可以改用并按此进行分组,以保持组的分隔:case statement

 SELECT
        case when extract(month FROM started_at) BETWEEN 06 AND 08 then 'summer' 
             when extract(month FROM started_at) BETWEEN 09 AND 11 then 'fall'
             when extract(month FROM started_at) IN (12,01,02) then 'winter'
             when extract(month FROM started_at) BETWEEN 03 AND 05 then 'spring' 
        end AS season,
        avg(ride_duration) AS avg_ride_duration,
        max(ride_duration) AS max_ride_duration,
        count(ride_id) AS ride_count,
        mode() WITHIN GROUP (ORDER BY day_of_week) AS mode_day_of_week
FROM trip_data_total
group by case when extract(month FROM started_at) BETWEEN 06 AND 08 then 'summer' 
             when extract(month FROM started_at) BETWEEN 09 AND 11 then 'fall'
             when extract(month FROM started_at) IN (12,01,02) then 'winter'
             when extract(month FROM started_at) BETWEEN 03 AND 05 then 'spring' 
        end
ORDER BY count(ride_id) DESC

评论

0赞 David Eler 10/5/2023
成功了!不过,我注意到您从 ORDER BY 子句中删除了别名。这不是一个好做法吗?它没有为我返回错误
0赞 eshirvana 10/5/2023
@DavidEler没有区别,只是有些数据库无法识别 order by 子句中的别名