Re: effect on planner of turning a subquery to a table, sql function returning table

Поиск

Список

Период

Сортировка

От	Thierry Henrio
Тема	Re: effect on planner of turning a subquery to a table, sql function returning table
Дата	12 апреля 17:58:23
Msg-id	CAMPYKo0FouB-HZ1k-_Ur2v+kK71q0T5icQGrp+SPbQJGq0H2Rw@mail.gmail.com обсуждение исходный текст
Ответ на	Re: effect on planner of turning a subquery to a table, sql function returning table (David Rowley <dgrowleyml@gmail.com>)
Список	pgsql-general

Дерево обсуждения

Thanks David!

Setting jit to 0:

GroupAggregate (cost=401037.82..503755.82 rows=1467400 width=124) (actual time=56.603..68.908 rows=4670 loops=1)
Group Key: t.device_id, t.date, t.start_time, t.end_time, (timerange(((t_1.value ->> 0))::time without time zone, ((t_1.value ->> 1))::time without time zone))
Buffers: shared hit=3067
-> Sort (cost=401037.82..404706.32 rows=1467400 width=96) (actual time=56.583..57.847 rows=22908 loops=1)
Sort Key: t.device_id, t.date, t.start_time, t.end_time, (timerange(((t_1.value ->> 0))::time without time zone, ((t_1.value ->> 1))::time without time zone)), t.rank
Sort Method: quicksort Memory: 3229kB
Buffers: shared hit=3067
-> Nested Loop (cost=2.99..100268.62 rows=1467400 width=96) (actual time=0.076..31.219 rows=22908 loops=1)
Buffers: shared hit=3067
-> Nested Loop (cost=2.98..55962.20 rows=14674 width=64) (actual time=0.055..6.589 rows=11454 loops=1)
Buffers: shared hit=3067
-> Nested Loop (cost=2.54..39.31 rows=500 width=40) (actual time=0.041..0.047 rows=1 loops=1)
Buffers: shared hit=3
-> Index Scan using campaigns_pkey on campaigns c (cost=0.28..8.30 rows=1 width=355) (actual time=0.011..0.012 rows=1 loops=1)
Index Cond: (id = 11870)
Buffers: shared hit=3
-> Hash Join (cost=2.26..26.01 rows=500 width=40) (actual time=0.028..0.031 rows=1 loops=1)
Hash Cond: (EXTRACT(dow FROM d.date) = ((j.dow)::integer)::numeric)
-> Function Scan on generate_series d (cost=0.01..10.01 rows=1000 width=8) (actual time=0.007..0.008 rows=1 loops=1)
-> Hash (cost=1.00..1.00 rows=100 width=64) (actual time=0.013..0.014 rows=7 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Function Scan on jsonb_each j (cost=0.00..1.00 rows=100 width=64) (actual time=0.007..0.008 rows=7 loops=1)
-> Index Scan using device_timeslots_date_index on device_timeslots t (cost=0.43..111.56 rows=29 width=32) (actual time=0.013..4.282 rows=11454 loops=1)
Index Cond: ((date = d.date) AND (date >= '2024-04-26'::date) AND (date <= '2024-04-26'::date))
Buffers: shared hit=3064
-> Memoize (cost=0.01..1.01 rows=100 width=32) (actual time=0.000..0.000 rows=2 loops=11454)
Cache Key: j.times
Cache Mode: binary
Hits: 11453 Misses: 1 Evictions: 0 Overflows: 0 Memory Usage: 1kB
-> Function Scan on jsonb_array_elements t_1 (cost=0.00..1.00 rows=100 width=32) (actual time=0.002..0.002 rows=2 loops=1)
Planning:
Buffers: shared hit=4
Planning Time: 0.512 ms
Execution Time: 69.476 ms

So as far as I understand, correct me when I'm wrong:

1) EXPLAIN (ANALYZE, BUFFERS) shows the jit timing, which mere EXPLAIN did not thank!
I found the https://www.postgresql.org/docs/current/jit-decision.html#JIT-DECISION

2) the subquery is overestimated, estimated 500, actual 1 in the Nested Loop.

And it amplified by its outer Nested Loop

-> Nested Loop (cost=2.99..100268.62 rows=1467400 width=96) (actual time=0.067..27.102 rows=22908 loops=1)
Buffers: shared hit=3067
-> Nested Loop (cost=2.98..55962.20 rows=14674 width=64) (actual time=0.055..5.750 rows=11454 loops=1)
Buffers: shared hit=3067
-> Nested Loop (cost=2.54..39.31 rows=500 width=40) (actual time=0.039..0.044 rows=1 loops=1)

The scan of the temp table was overestimated too, but it is mitigated by the Merge Join

-> Merge Join (cost=67127.99..67631.11 rows=33165 width=64) (actual time=7.642..13.455 rows=22908 loops=1)
Merge Cond: (z.date = t.date)
Buffers: shared hit=3064, local hit=1
-> Sort (cost=78.60..81.43 rows=1130 width=40) (actual time=0.011..0.012 rows=2 loops=1)
Sort Key: z.date
Sort Method: quicksort Memory: 25kB
Buffers: local hit=1
-> Seq Scan on z11870 z (cost=0.00..21.30 rows=1130 width=40) (actual time=0.006..0.007 rows=2 loops=1)
Buffers: local hit=1

The cost estimation is better in (B) than (A) (even when the estimated rows of z11870 are off).

So I have options : O1) set jit=0, O2) temp table, O3) rework schema design (a variation on O2).

Thanks!

On Fri, Apr 12, 2024 at 1:40 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Fri, 12 Apr 2024 at 23:27, Thierry Henrio <thierry.henrio@gmail.com> wrote:
> JIT:
> Functions: 36
> Options: Inlining true, Optimization true, Expressions true, Deforming true
> Timing: Generation 1.949 ms, Inlining 28.891 ms, Optimization 207.481 ms, Emission 134.907 ms, Total 373.228 ms
> Execution Time: 429.037 ms

It looks very much like the majority of the extra time is being spent
doing JIT compilation. This triggers for plan A but not plan B. You
can see from:

> GroupAggregate (cost=401037.82..503755.82 rows=1467400 width=124) (actual time=416.851..426.534 rows=4670 loops=1)

that the top-level row estimates are off. This makes the estimated
cost higher than it actually is. The planner opts to have tuple
deforming and expression evaluation JIT compiled to try to speed up
the plan thinking it's worthwhile. It's not in this case.

You can switch JIT off to try without with:

SET jit=0;

You might want to consider editing postgresql.conf and raising the
jit_above_cost, jit_inline_above_cost and jit_optimize_above_cost
values to some higher value or disable JIT completely.

SELECT pg_reload_conf(); -- to reload the config file afterwards.

David

В списке pgsql-general по дате отправления:

Предыдущее

От: David Rowley
Дата: 12 апреля, 14:39:52
Сообщение: Re: effect on planner of turning a subquery to a table, sql function returning table

Следующее

От: veem v
Дата: 13 апреля, 10:03:47
Сообщение: Re: Question on trigger

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: effect on planner of turning a subquery to a table, sql function returning table

Предыдущее

Следующее