PG 里两条一样的语句,时间跨度不同,结果优化成完全不同的执行方式....
语句语句用于计算:当天注册的人的登陆次数,一张注册表日志,一张登陆日志表 group by 用于合并渠道和游戏 id. 为了避免无法去重没用 distinct
select one_day, count() login_count, ROUND((count() / sum(count(*)) over(PARTITION BY one_day, app_id order by app_id))*100, 2) app_percent, sub_channel, app_id from (select reg.reg_day as one_day, login.app_id as app_id, login.sub_channel as sub_channel, reg.user_id from (select count_time, date_trunc('day', count_time) as reg_day, user_id from t_l_register_log where count_time >= '2019-03-03 00:00:00' and count_time < '2019-04-04 00:00:00' order by reg_day) as reg inner join (select date_trunc('day', count_time) as login_day ,sub_channel, app_id, user_id from t_l_platform_login_log where count_time >= '2019-03-03 00:00:00' and count_time < '2019-04-04 00:00:00' group by login_day, app_id, sub_channel, user_id order by login_day) as login on date_trunc('day', reg.count_time) = login_day and reg.user_id = login.user_id group by one_day, app_id, sub_channel, reg.user_id) as tb group by one_day, app_id, sub_channel order by one_day, app_id, login_count desc, app_percent desc;
这个是能这却执行的 sql,时间跨度 1 年
------------------------------------------------------------------------------------------
Sort (cost=97561067.23..97581067.23 rows=8000000 width=56) (actual time=34585.894..34586.519 rows=18517 loops=1)
Output: reg.reg_day, (count(*)), (round(((((count(*)))::numeric / sum((count(*))) OVER (?)) * '100'::numeric), 2)), login.sub_channel, login.app_id
Sort Key: reg.reg_day, login.app_id, (count(*)) DESC, (round(((((count(*)))::numeric / sum((count(*))) OVER (?)) * '100'::numeric), 2)) DESC
Sort Method: quicksort Memory: 2215kB
-> WindowAgg (cost=83444220.45..96096929.48 rows=8000000 width=56) (actual time=33708.198..34578.031 rows=18517 loops=1)
Output: reg.reg_day, (count(*)), round(((((count(*)))::numeric / sum((count(*))) OVER (?)) * '100'::numeric), 2), login.sub_channel, login.app_id
-> GroupAggregate (cost=83444220.45..95856929.48 rows=8000000 width=24) (actual time=33708.188..34562.551 rows=18517 loops=1)
Output: reg.reg_day, login.sub_channel, login.app_id, count(*)
Group Key: reg.reg_day, login.app_id, login.sub_channel
-> Group (cost=83444220.45..87555123.46 rows=328872241 width=28) (actual time=33708.179..34341.267 rows=3198877 loops=1)
Output: reg.reg_day, login.app_id, login.sub_channel, reg.user_id, login.app_id, login.sub_channel
Group Key: reg.reg_day, login.app_id, login.sub_channel, reg.user_id
-> Sort (cost=83444220.45..84266401.05 rows=328872241 width=20) (actual time=33708.175..33958.476 rows=3198877 loops=1)
Output: reg.reg_day, reg.user_id, login.app_id, login.sub_channel
Sort Key: reg.reg_day, login.app_id, login.sub_channel, reg.user_id
Sort Method: external merge Disk: 93944kB
-> Merge Join (cost=10070881.07..16687034.80 rows=328872241 width=20) (actual time=28498.899..31904.531 rows=3198877 loops=1)
Output: reg.reg_day, reg.user_id, login.app_id, login.sub_channel
Merge Cond: (((date_trunc('day'::text, reg.count_time)) = login.login_day) AND (reg.user_id = login.user_id))
-> Sort (cost=876809.36..883338.44 rows=2611632 width=20) (actual time=2665.030..2929.604 rows=3204477 loops=1)
Output: reg.reg_day, reg.user_id, reg.count_time, (date_trunc('day'::text, reg.count_time))
Sort Key: (date_trunc('day'::text, reg.count_time)), reg.user_id
Sort Method: external merge Disk: 131744kB
-> Subquery Scan on reg (cost=142257.54..491333.83 rows=2611632 width=20) (actual time=376.086..1283.740 rows=3204477 loops=1)
Output: reg.reg_day, reg.user_id, reg.count_time, date_trunc('day'::text, reg.count_time)
-> Gather Merge (cost=142257.54..465217.51 rows=2611632 width=20) (actual time=376.081..815.735 rows=3204477 loops=1)
Output: _hyper_8_974_chunk.count_time, (date_trunc('day'::text, _hyper_8_974_chunk.count_time)), _hyper_8_974_chunk.user_id
Workers Planned: 8
Workers Launched: 8
-> Sort (cost=141257.40..142073.54 rows=326454 width=20) (actual time=346.654..379.105 rows=356053 loops=9)
Output: _hyper_8_974_chunk.count_time, (date_trunc('day'::text, _hyper_8_974_chunk.count_time)), _hyper_8_974_chunk.user_id
Sort Key: (date_trunc('day'::text, _hyper_8_974_chunk.count_time))
Sort Method: external merge Disk: 10176kB
这个是没有正确执行的 sql,时间跨度一个月,不知道要跑多少时间
---------------------------------------------------------------------------
Sort (cost=1378360.26..1378362.88 rows=1046 width=56)
Output: reg.reg_day, (count(*)), (round(((((count(*)))::numeric / sum((count(*))) OVER (?)) * '100'::numeric), 2)), _hyper_4_1229_chunk.sub_channel, _hyper_4_1229_chunk.app_id
Sort Key: reg.reg_day, _hyper_4_1229_chunk.app_id, (count(*)) DESC, (round(((((count(*)))::numeric / sum((count(*))) OVER (?)) * '100'::numeric), 2)) DESC
-> WindowAgg (cost=1377873.71..1378307.80 rows=1046 width=56)
Output: reg.reg_day, (count(*)), round(((((count(*)))::numeric / sum((count(*))) OVER (?)) * '100'::numeric), 2), _hyper_4_1229_chunk.sub_channel, _hyper_4_1229_chunk.app_id
-> GroupAggregate (cost=1377873.71..1378276.42 rows=1046 width=24)
Output: reg.reg_day, _hyper_4_1229_chunk.sub_channel, _hyper_4_1229_chunk.app_id, count(*)
Group Key: reg.reg_day, _hyper_4_1229_chunk.app_id, _hyper_4_1229_chunk.sub_channel
-> Group (cost=1377873.71..1378004.46 rows=10460 width=28)
Output: reg.reg_day, _hyper_4_1229_chunk.app_id, _hyper_4_1229_chunk.sub_channel, reg.user_id, _hyper_4_1229_chunk.app_id, _hyper_4_1229_chunk.sub_channel
Group Key: reg.reg_day, _hyper_4_1229_chunk.app_id, _hyper_4_1229_chunk.sub_channel, reg.user_id
-> Sort (cost=1377873.71..1377899.86 rows=10460 width=20)
Output: reg.reg_day, reg.user_id, _hyper_4_1229_chunk.app_id, _hyper_4_1229_chunk.sub_channel
Sort Key: reg.reg_day, _hyper_4_1229_chunk.app_id, _hyper_4_1229_chunk.sub_channel, reg.user_id
-> Merge Join (cost=652242.46..1377175.37 rows=10460 width=20)
Output: reg.reg_day, reg.user_id, _hyper_4_1229_chunk.app_id, _hyper_4_1229_chunk.sub_channel
Merge Cond: ((date_trunc('day'::text, _hyper_4_1229_chunk.count_time)) = (date_trunc('day'::text, reg.count_time)))
Join Filter: (reg.user_id = _hyper_4_1229_chunk.user_id)
-> Group (cost=650558.24..1318952.53 rows=979872 width=20)
Output: (date_trunc('day'::text, _hyper_4_1229_chunk.count_time)), _hyper_4_1229_chunk.sub_channel, _hyper_4_1229_chunk.app_id, _hyper_4_1229_chunk.user_id
Group Key: (date_trunc('day'::text, _hyper_4_1229_chunk.count_time)), _hyper_4_1229_chunk.app_id, _hyper_4_1229_chunk.sub_channel, _hyper_4_1229_chunk.user_id
-> Gather Merge (cost=650558.24..1267509.25 rows=4899360 width=20)
Output: (date_trunc('day'::text, _hyper_4_1229_chunk.count_time)), _hyper_4_1229_chunk.sub_channel, _hyper_4_1229_chunk.app_id, _hyper_4_1229_chunk.user_id
Workers Planned: 5
-> Group (cost=649558.16..676504.67 rows=979872 width=20)
Output: (date_trunc('day'::text, _hyper_4_1229_chunk.count_time)), _hyper_4_1229_chunk.sub_channel, _hyper_4_1229_chunk.app_id, _hyper_4_1229_chunk.user_id
Group Key: (date_trunc('day'::text, _hyper_4_1229_chunk.count_time)), _hyper_4_1229_chunk.app_id, _hyper_4_1229_chunk.sub_channel, _hyper_4_1229_chunk.user_id
-> Sort (cost=649558.16..654457.53 rows=1959746 width=20)
Output: (date_trunc('day'::text, _hyper_4_1229_chunk.count_time)), _hyper_4_1229_chunk.sub_channel, _hyper_4_1229_chunk.app_id, _hyper_4_1229_chunk.user_id
Sort Key: (date_trunc('day'::text, _hyper_4_1229_chunk.count_time)), _hyper_4_1229_chunk.app_id, _hyper_4_1229_chunk.sub_channel, _hyper_4_1229_chunk.user_id
-> Result (cost=0.00..404552.30 rows=1959746 width=20)
Output: date_trunc('day'::text, _hyper_4_1229_chunk.count_time), _hyper_4_1229_chunk.sub_channel, _hyper_4_1229_chunk.app_id, _hyper_4_1229_chunk.user_id
-> Parallel Append (cost=0.00..380055.48 rows=1959746 width=20)
正确语句用了日期和 userid 排序, 然后正确的双排序 join 了 不正确的语句没用日期排序,最后导致了 M*N?
有人能帮我解释下是不是这个理啊 orz
DBA 脑子要要怎么长才能了了解数据库的各种坑啊......
这 DBA 要怎么干啊 orz,我光测一条语句就要疯了...
1
CallMeReznov 2019-07-11 10:36:37 +08:00
DBA 是另外一个物种
|