Re: effect on planner of turning a subquery to a table, sql function returning table

Поиск
Список
Период
Сортировка
От Thierry Henrio
Тема Re: effect on planner of turning a subquery to a table, sql function returning table
Дата
Msg-id CAMPYKo0FouB-HZ1k-_Ur2v+kK71q0T5icQGrp+SPbQJGq0H2Rw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: effect on planner of turning a subquery to a table, sql function returning table  (David Rowley <dgrowleyml@gmail.com>)
Список pgsql-general
Thanks David!

Setting jit to 0:

 GroupAggregate  (cost=401037.82..503755.82 rows=1467400 width=124) (actual time=56.603..68.908 rows=4670 loops=1)
   Group Key: t.device_id, t.date, t.start_time, t.end_time, (timerange(((t_1.value ->> 0))::time without time zone, ((t_1.value ->> 1))::time without time zone))
   Buffers: shared hit=3067
   ->  Sort  (cost=401037.82..404706.32 rows=1467400 width=96) (actual time=56.583..57.847 rows=22908 loops=1)
         Sort Key: t.device_id, t.date, t.start_time, t.end_time, (timerange(((t_1.value ->> 0))::time without time zone, ((t_1.value ->> 1))::time without time zone)), t.rank
         Sort Method: quicksort  Memory: 3229kB
         Buffers: shared hit=3067
         ->  Nested Loop  (cost=2.99..100268.62 rows=1467400 width=96) (actual time=0.076..31.219 rows=22908 loops=1)
               Buffers: shared hit=3067
               ->  Nested Loop  (cost=2.98..55962.20 rows=14674 width=64) (actual time=0.055..6.589 rows=11454 loops=1)
                     Buffers: shared hit=3067
                     ->  Nested Loop  (cost=2.54..39.31 rows=500 width=40) (actual time=0.041..0.047 rows=1 loops=1)
                           Buffers: shared hit=3
                           ->  Index Scan using campaigns_pkey on campaigns c  (cost=0.28..8.30 rows=1 width=355) (actual time=0.011..0.012 rows=1 loops=1)
                                 Index Cond: (id = 11870)
                                 Buffers: shared hit=3
                           ->  Hash Join  (cost=2.26..26.01 rows=500 width=40) (actual time=0.028..0.031 rows=1 loops=1)
                                 Hash Cond: (EXTRACT(dow FROM d.date) = ((j.dow)::integer)::numeric)
                                 ->  Function Scan on generate_series d  (cost=0.01..10.01 rows=1000 width=8) (actual time=0.007..0.008 rows=1 loops=1)
                                 ->  Hash  (cost=1.00..1.00 rows=100 width=64) (actual time=0.013..0.014 rows=7 loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                       ->  Function Scan on jsonb_each j  (cost=0.00..1.00 rows=100 width=64) (actual time=0.007..0.008 rows=7 loops=1)
                     ->  Index Scan using device_timeslots_date_index on device_timeslots t  (cost=0.43..111.56 rows=29 width=32) (actual time=0.013..4.282 rows=11454 loops=1)
                           Index Cond: ((date = d.date) AND (date >= '2024-04-26'::date) AND (date <= '2024-04-26'::date))
                           Buffers: shared hit=3064
               ->  Memoize  (cost=0.01..1.01 rows=100 width=32) (actual time=0.000..0.000 rows=2 loops=11454)
                     Cache Key: j.times
                     Cache Mode: binary
                     Hits: 11453  Misses: 1  Evictions: 0  Overflows: 0  Memory Usage: 1kB
                     ->  Function Scan on jsonb_array_elements t_1  (cost=0.00..1.00 rows=100 width=32) (actual time=0.002..0.002 rows=2 loops=1)
 Planning:
   Buffers: shared hit=4
 Planning Time: 0.512 ms
 Execution Time: 69.476 ms


So as far as I understand, correct me when I'm wrong:

1) EXPLAIN (ANALYZE, BUFFERS) shows the jit timing, which mere EXPLAIN did not thank!
I found the https://www.postgresql.org/docs/current/jit-decision.html#JIT-DECISION

2) the subquery is overestimated, estimated 500, actual 1 in the Nested Loop.
And it amplified by its outer Nested Loop

         ->  Nested Loop  (cost=2.99..100268.62 rows=1467400 width=96) (actual time=0.067..27.102 rows=22908 loops=1)
               Buffers: shared hit=3067
               ->  Nested Loop  (cost=2.98..55962.20 rows=14674 width=64) (actual time=0.055..5.750 rows=11454 loops=1)
                     Buffers: shared hit=3067
                     ->  Nested Loop  (cost=2.54..39.31 rows=500 width=40) (actual time=0.039..0.044 rows=1 loops=1)

The scan of the temp table was overestimated too, but it is mitigated by the Merge Join

         ->  Merge Join  (cost=67127.99..67631.11 rows=33165 width=64) (actual time=7.642..13.455 rows=22908 loops=1)
               Merge Cond: (z.date = t.date)
               Buffers: shared hit=3064, local hit=1
               ->  Sort  (cost=78.60..81.43 rows=1130 width=40) (actual time=0.011..0.012 rows=2 loops=1)
                     Sort Key: z.date
                     Sort Method: quicksort  Memory: 25kB
                     Buffers: local hit=1
                     ->  Seq Scan on z11870 z  (cost=0.00..21.30 rows=1130 width=40) (actual time=0.006..0.007 rows=2 loops=1)
                           Buffers: local hit=1

The cost estimation is better in (B) than (A) (even when the estimated rows of z11870 are off).

So I have options : O1) set jit=0, O2) temp table, O3) rework schema design (a variation on O2).
Thanks!


On Fri, Apr 12, 2024 at 1:40 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 12 Apr 2024 at 23:27, Thierry Henrio <thierry.henrio@gmail.com> wrote:
>  JIT:
>    Functions: 36
>    Options: Inlining true, Optimization true, Expressions true, Deforming true
>    Timing: Generation 1.949 ms, Inlining 28.891 ms, Optimization 207.481 ms, Emission 134.907 ms, Total 373.228 ms
>  Execution Time: 429.037 ms

It looks very much like the majority of the extra time is being spent
doing JIT compilation.  This triggers for plan A but not plan B.  You
can see from:

> GroupAggregate  (cost=401037.82..503755.82 rows=1467400 width=124) (actual time=416.851..426.534 rows=4670 loops=1)

that the top-level row estimates are off.  This makes the estimated
cost higher than it actually is.  The planner opts to have tuple
deforming and expression evaluation JIT compiled to try to speed up
the plan thinking it's worthwhile. It's not in this case.

You can switch JIT off to try without with:

SET jit=0;

You might want to consider editing postgresql.conf and raising the
jit_above_cost, jit_inline_above_cost and jit_optimize_above_cost
values to some higher value or disable JIT completely.

SELECT pg_reload_conf(); -- to reload the config file afterwards.

David

В списке pgsql-general по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Re: effect on planner of turning a subquery to a table, sql function returning table
Следующее
От: veem v
Дата:
Сообщение: Re: Question on trigger