提问人:Della 提问时间:11/17/2023 最后编辑:Della 更新时间:11/17/2023 访问量:53
什么是此 SQL 查询的纯文本翻译?
What is a Plain Text Translation of this SQL Query?
问:
我的 SQL 知识非常基础,需要一些帮助才能将这个相当长的查询(在查询一些 AWS athena 数据库的 python 脚本中运行,并嵌入一些 f 字符串)转换为简单的英语或流程图。我理解 SELECT * WHERE、JOINS 等概念,但这似乎真的很高级,重命名多个表并处理多种情况。
SELECT usc_customer_id, escape_customer_id, use_sms,
start_date, least(date('{segment_date_str}'), max(booking_date)) as
"max_booking_date",
date('{segment_date_str}') as segment_date,
count(t.job_no) as bookings, count(a.trip_no) as trips,
sum(total_trip_fare) as total_fare, sum(promo_amt) as "promo_amount",
sum(case when promo_amt > 0 then 1 else 0 end) as promo_times
FROM
(
SELECT m.usc_customer_id, m.escape_customer_id, use_sms,
case when date(user_create_time) > date('{start_date}') then date(
user_create_time)
else date('{start_date}') end as start_date,
date(b."req_pickup_dt") as booking_date,
j.job_no, j.status
FROM (SELECT m.*,
CASE WHEN news_sms = 1 OR third_party_sms = 1 THEN True
ELSE False END AS use_sms,
greatest(COALESCE(c.create_time,
m.usc_customer_create_time,m.escape_customer_create_time),
COALESCE(m.usc_customer_create_time,
m.escape_customer_create_time, c.create_time),
COALESCE(m.escape_customer_create_time, c.create_time,
m.usc_customer_create_time)) as user_create_time
FROM "publish_cab"."usc_customer" c join
customer_mapping m
on c.customer_id = m.usc_customer_id where country_code =
'65') m
join "publish_cab"."escape_booking" as b
on m.escape_customer_id = b.cust_id
join "publish_cab"."escape_job" as j
on b.booking_id = j.booking_id
where b.bookingdate between '{start_date_short}' and '
{segment_date_str_short}'
and j.createddate between '{start_date_short}' and '
{segment_date_str_short}'
) t
left join (
select * from "publish_cab"."escape_trip_details"
where createddate between '{start_date_short}' and '
{segment_date_str_short}'
) a
on t.job_no = a.job_no
group by usc_customer_id, escape_customer_id, start_date, use_sms
如果它可以转换为某种框图流程图,或者为对 SQL 相对较新但总体上对编程和数据结构非常了解的人进行相应的 pandas 数据帧操作,将不胜感激。
答:
你问“我怎么能理解SELECT的含义......?
只需查看行即可。 (你有数据,我们没有。 这是一个洋葱。 从内部开始,然后努力走出来。
这创造了一个新的关系,
基于 的新表。
因此,它可以与 和 进行 JOINed,这反过来又会产生另一个关系。SELECT m.*, ...
usc_customer
escape_booking
escape_job
如前所述,这些关系具有一堆复杂的语法
而且看起来有点大而可怕。
所以驯服这种复杂性。
用于将所有这些表达式封装到命名关系中
您可以方便地查询和 JOIN 针对。CREATE VIEW usc_customer_report_v AS SELECT ...
该 3 路 JOIN 可以进入视图
出于同样的原因。
最后,可能会进入另一种观点。escape_v
SELECT m.usc_customer_id, ...
在这一点上,你看到的是一个更简单的 整体查询。 你可以对每种成分进行推理 独立审查和调试它们 如果您觉得他们没有捕获正确的行 为您的业务问题。
生活充满了复杂的关系。 通常,描述性地命名这些关系的行为是第一步 抽象出它们的含义, 并从更高的角度看待它们。
想想图书馆。 它包含字母, 当近距离观察时,很多信件。 退后一步,我们看到单词、句子、 段落、章节、书籍、 每个都有自己的主题。 配备杜威十进制系统 我们可能会退后一步 并调查一个专门用于 化学,另一个是物理学。 利用抽象的力量 缩小到大局, 然后放大到您当前找到的内容 是感兴趣的详细信息。 给事物起名字 帮助我们进行抽象。
如果不了解您的数据模型和这些字段的含义,就很难用比 sql 更清晰的方式描述它。实际上,SQL 是专为口述转换而设计的。
...但是,重写以使用 CTE(通用表表达式,“with”子句部分,您在下面看到)可能会更容易看到名称与每个部分的对应关系。
with
/*
Define chunks of the query.
<name> as (<query body>)
You can think of that as assigning the result of the query body to a named variable.
Note the indentation, each named block starts at the beginning of the line. Don't get confused by the column aliases ex case when yada then yada end AS something. That just assigns the result of that case statement to column name.
See main body lower
*/
customer_mapping_for_country_65_and_dates as (
SELECT m.*,
CASE WHEN news_sms = 1
OR third_party_sms = 1 THEN True
ELSE False
END AS use_sms,
greatest(
COALESCE(
c.create_time,
m.usc_customer_create_time,
m.escape_customer_create_time
),
COALESCE(
m.usc_customer_create_time,
m.escape_customer_create_time, c.create_time),
COALESCE(m.escape_customer_create_time, c.create_time,
m.usc_customer_create_time)
) as user_create_time
FROM "publish_cab"."usc_customer" c
join customer_mapping m
on c.customer_id = m.usc_customer_id
where country_code = '65'),
customer_jobs_constrained_by_parameter_dates as (
SELECT cmcd.usc_customer_id,
cmcd.escape_customer_id, use_sms,
case when date(user_create_time) > date('{start_date}') then
date(user_create_time)
else date('{start_date}')
end as start_date,
date(b."req_pickup_dt") as booking_date,
j.job_no,
j.status
FROM customer_mapping_for_country_65_and_dates cmcd
join "publish_cab"."escape_booking" as b
on cmcd.escape_customer_id = b.cust_id
join "publish_cab"."escape_job" as j
on b.booking_id = j.booking_id
where b.bookingdate
between '{start_date_short}' and '{segment_date_str_short}'
and j.createddate
between '{start_date_short}' and '{segment_date_str_short}')
,
escape_trips_between_constraint_dates as (
select *
from "publish_cab"."escape_trip_details"
where createddate
between '{start_date_short}' and '{segment_date_str_short}')
/*
Main body.
Use all of your defined chunks together.
*/
SELECT usc_customer_id,
escape_customer_id,
use_sms,
start_date,
least(
date('{segment_date_str}'), max(booking_date)) as "max_booking_date",
date('{segment_date_str}') as segment_date,
count(t.job_no) as bookings, count(a.trip_no) as trips,
sum(total_trip_fare) as total_fare, sum(promo_amt) as "promo_amount",
sum(case when promo_amt > 0 then 1
else 0
end) as promo_times
from customer_jobs_constrained_by_parameter_dates t
left
join escape_trips_between_constraint_dates a
on t.job_no = a.job_no
group
by usc_customer_id,
escape_customer_id,
start_date,
use_sms
评论