Discussion: effect on planner of turning a subquery to a table, sql function returning table


Hi there,

I work on a booking system.
Below is a query showing reservations of devices that overlap a campaign.
A campaign has a duration and time intervals by day of week.

Here is a query (A):

select
  device_id,
  t.date,
  timerange(t.start_time, t.end_time) * g.times as times
from device_timeslots t
join (
  select date, dow, timerange(start, stop) as times
  from campaigns c
  cross join generate_series(c.start_date, c.end_date, '1 day') d(date)
  join unnest_timeslots(c.timeslots) t(dow, start, stop) on t.dow=extract(dow from date)
  where id=11870
) g on g.date=t.date
where t.date between '2024-04-26' and '2024-04-26'
group by device_id, t.date, t.start_time, t.end_time, g.times;


The time intervals of the campaign are stored as jsonb (c.timeslots) and expanded by an IMMUTABLE SQL function, unnest_timeslots, which returns a table.
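For context, a minimal sketch of what unnest_timeslots could look like. The exact jsonb layout is an assumption, inferred from plan (A) below, where the function is inlined and shows up directly as jsonb_each and jsonb_array_elements scans:

create or replace function unnest_timeslots(timeslots jsonb)
returns table (dow int, start time, stop time)
language sql immutable
as $$
  -- assumed layout: {"<dow>": [["08:00", "12:00"], ...], ...}
  select j.dow::int,
         (s.value ->> 0)::time,
         (s.value ->> 1)::time
  from jsonb_each(timeslots) j(dow, times)
  cross join jsonb_array_elements(j.times) s(value)
$$;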

The query (A) runs in 440.497 ms.

When I replace the subquery with a temp table (B):

create temp table z11870 as (
  select date, dow, timerange(start, stop) as times
  from campaigns c
  cross join generate_series(c.start_date, c.end_date, '1 day') d(date)
  join unnest_timeslots(c.timeslots) t(dow, start, stop) on t.dow=extract(dow from date)
  where id=11870
);

select
  device_id,
  t.date,
  timerange(t.start_time, t.end_time) * z.times as times
from device_timeslots t
join z11870 z on z.date=t.date
where t.date between '2024-04-26' and '2024-04-26'
group by device_id, t.date, t.start_time, t.end_time, z.times;



The query (B) runs in 48.160 ms.

Here is (B) execution plan:

 GroupAggregate  (cost=70121.37..71282.14 rows=33165 width=124)
   Group Key: t.device_id, t.date, t.start_time, t.end_time, z.times
   ->  Sort  (cost=70121.37..70204.28 rows=33165 width=64)
         Sort Key: t.device_id, t.date, t.start_time, t.end_time, z.times, t.rank
         ->  Merge Join  (cost=67127.99..67631.11 rows=33165 width=64)
               Merge Cond: (z.date = t.date)
               ->  Sort  (cost=78.60..81.43 rows=1130 width=40)
                     Sort Key: z.date
                     ->  Seq Scan on z11870 z  (cost=0.00..21.30 rows=1130 width=40)
               ->  Sort  (cost=67049.39..67109.04 rows=23861 width=32)
                     Sort Key: t.date
                     ->  Bitmap Heap Scan on device_timeslots t  (cost=329.01..65314.41 rows=23861 width=32)
                           Recheck Cond: ((date >= '2024-04-26'::date) AND (date <= '2024-04-26'::date))
                           ->  Bitmap Index Scan on device_timeslots_date_index  (cost=0.00..323.05 rows=23861 width=0)
                                 Index Cond: ((date >= '2024-04-26'::date) AND (date <= '2024-04-26'::date))

, whereas the plan of (A) is:

 GroupAggregate  (cost=401037.82..503755.82 rows=1467400 width=124)
   Group Key: t.device_id, t.date, t.start_time, t.end_time, (timerange(((t_1.value ->> 0))::time without time zone, ((t_1.value ->> 1))::time without time zone))
   ->  Sort  (cost=401037.82..404706.32 rows=1467400 width=96)
         Sort Key: t.device_id, t.date, t.start_time, t.end_time, (timerange(((t_1.value ->> 0))::time without time zone, ((t_1.value ->> 1))::time without time zone)), t.rank
         ->  Nested Loop  (cost=2.99..100268.62 rows=1467400 width=96)
               ->  Nested Loop  (cost=2.98..55962.20 rows=14674 width=64)
                     ->  Nested Loop  (cost=2.54..39.31 rows=500 width=40)
                           ->  Index Scan using campaigns_pkey on campaigns c  (cost=0.28..8.30 rows=1 width=355)
                                 Index Cond: (id = 11870)
                           ->  Hash Join  (cost=2.26..26.01 rows=500 width=40)
                                 Hash Cond: (EXTRACT(dow FROM d.date) = ((j.dow)::integer)::numeric)
                                 ->  Function Scan on generate_series d  (cost=0.01..10.01 rows=1000 width=8)
                                 ->  Hash  (cost=1.00..1.00 rows=100 width=64)
                                       ->  Function Scan on jsonb_each j  (cost=0.00..1.00 rows=100 width=64)
                     ->  Index Scan using device_timeslots_date_index on device_timeslots t  (cost=0.43..111.56 rows=29 width=32)
                           Index Cond: ((date = d.date) AND (date >= '2024-04-26'::date) AND (date <= '2024-04-26'::date))
               ->  Memoize  (cost=0.01..1.01 rows=100 width=32)
                     Cache Key: j.times
                     Cache Mode: binary
                     ->  Function Scan on jsonb_array_elements t_1  (cost=0.00..1.00 rows=100 width=32)


The Merge Join in (B) gives far better timing than the Nested Loops in (A)...

The options I can think of:

O1) change the design: add a permanent table much like z11870

O2) Is there a way to hint the planner to materialize a subquery? (see the sketch after this list)

O3) other?
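
For O2, the closest thing I found is a MATERIALIZED CTE (PostgreSQL 12 and later), which fences off the subquery so it is computed once and materialized; a sketch, though I don't know yet whether it also fixes the row estimates:

with g as materialized (
  select date, dow, timerange(start, stop) as times
  from campaigns c
  cross join generate_series(c.start_date, c.end_date, '1 day') d(date)
  join unnest_timeslots(c.timeslots) t(dow, start, stop) on t.dow=extract(dow from date)
  where id=11870
)
select
  device_id,
  t.date,
  timerange(t.start_time, t.end_time) * g.times as times
from device_timeslots t
join g on g.date=t.date
where t.date between '2024-04-26' and '2024-04-26'
group by device_id, t.date, t.start_time, t.end_time, g.times;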

Cheers!

Thierry

Re: effect on planner of turning a subquery to a table, sql function returning table

From: David Rowley
On Fri, 12 Apr 2024 at 22:33, Thierry Henrio <thierry.henrio@gmail.com> wrote:
> Here is (B) execution plan:

To be able to determine if a plan is good or bad, we'd really need to
see the EXPLAIN (ANALYZE) output rather than just the EXPLAIN output.

To save a possible roundtrip:

SET track_io_timing = ON;
EXPLAIN (ANALYZE, BUFFERS) <the query>

David



Re: effect on planner of turning a subquery to a table, sql function returning table

From: Thierry Henrio
Thanks David!

Here are the plans with (ANALYZE, BUFFERS) and track_io_timing ON:

(A)

 GroupAggregate  (cost=401037.82..503755.82 rows=1467400 width=124) (actual time=416.851..426.534 rows=4670 loops=1)
   Group Key: t.device_id, t.date, t.start_time, t.end_time, (timerange(((t_1.value ->> 0))::time without time zone, ((t_1.value ->> 1))::time without time zone))
   Buffers: shared hit=3067
   ->  Sort  (cost=401037.82..404706.32 rows=1467400 width=96) (actual time=416.812..417.855 rows=22908 loops=1)
         Sort Key: t.device_id, t.date, t.start_time, t.end_time, (timerange(((t_1.value ->> 0))::time without time zone, ((t_1.value ->> 1))::time without time zone)), t.rank
         Sort Method: quicksort  Memory: 3229kB
         Buffers: shared hit=3067
         ->  Nested Loop  (cost=2.99..100268.62 rows=1467400 width=96) (actual time=371.380..395.324 rows=22908 loops=1)
               Buffers: shared hit=3067
               ->  Nested Loop  (cost=2.98..55962.20 rows=14674 width=64) (actual time=371.362..376.059 rows=11454 loops=1)
                     Buffers: shared hit=3067
                     ->  Nested Loop  (cost=2.54..39.31 rows=500 width=40) (actual time=371.332..371.337 rows=1 loops=1)
                           Buffers: shared hit=3
                           ->  Index Scan using campaigns_pkey on campaigns c  (cost=0.28..8.30 rows=1 width=355) (actual time=371.243..371.244 rows=1 loops=1)
                                 Index Cond: (id = 11870)
                                 Buffers: shared hit=3
                           ->  Hash Join  (cost=2.26..26.01 rows=500 width=40) (actual time=0.083..0.086 rows=1 loops=1)
                                 Hash Cond: (EXTRACT(dow FROM d.date) = ((j.dow)::integer)::numeric)
                                 ->  Function Scan on generate_series d  (cost=0.01..10.01 rows=1000 width=8) (actual time=0.025..0.026 rows=1 loops=1)
                                 ->  Hash  (cost=1.00..1.00 rows=100 width=64) (actual time=0.040..0.040 rows=7 loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                       ->  Function Scan on jsonb_each j  (cost=0.00..1.00 rows=100 width=64) (actual time=0.015..0.016 rows=7 loops=1)
                     ->  Index Scan using device_timeslots_date_index on device_timeslots t  (cost=0.43..111.56 rows=29 width=32) (actual time=0.023..3.477 rows=11454 loops=1)
                           Index Cond: ((date = d.date) AND (date >= '2024-04-26'::date) AND (date <= '2024-04-26'::date))
                           Buffers: shared hit=3064
               ->  Memoize  (cost=0.01..1.01 rows=100 width=32) (actual time=0.000..0.000 rows=2 loops=11454)
                     Cache Key: j.times
                     Cache Mode: binary
                     Hits: 11453  Misses: 1  Evictions: 0  Overflows: 0  Memory Usage: 1kB
                     ->  Function Scan on jsonb_array_elements t_1  (cost=0.00..1.00 rows=100 width=32) (actual time=0.004..0.005 rows=2 loops=1)
 Planning:
   Buffers: shared hit=4
 Planning Time: 0.455 ms
 JIT:
   Functions: 36
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 1.949 ms, Inlining 28.891 ms, Optimization 207.481 ms, Emission 134.907 ms, Total 373.228 ms
 Execution Time: 429.037 ms


And (B)

 GroupAggregate  (cost=70121.37..71282.14 rows=33165 width=124) (actual time=37.895..48.577 rows=4670 loops=1)
   Group Key: t.device_id, t.date, t.start_time, t.end_time, z.times
   Buffers: shared hit=3064, local hit=1
   ->  Sort  (cost=70121.37..70204.28 rows=33165 width=64) (actual time=37.874..38.979 rows=22908 loops=1)
         Sort Key: t.device_id, t.date, t.start_time, t.end_time, z.times, t.rank
         Sort Method: quicksort  Memory: 2737kB
         Buffers: shared hit=3064, local hit=1
         ->  Merge Join  (cost=67127.99..67631.11 rows=33165 width=64) (actual time=9.587..15.797 rows=22908 loops=1)
               Merge Cond: (z.date = t.date)
               Buffers: shared hit=3064, local hit=1
               ->  Sort  (cost=78.60..81.43 rows=1130 width=40) (actual time=0.030..0.031 rows=2 loops=1)
                     Sort Key: z.date
                     Sort Method: quicksort  Memory: 25kB
                     Buffers: local hit=1
                     ->  Seq Scan on z11870 z  (cost=0.00..21.30 rows=1130 width=40) (actual time=0.016..0.018 rows=2 loops=1)
                           Buffers: local hit=1
               ->  Sort  (cost=67049.39..67109.04 rows=23861 width=32) (actual time=9.549..10.720 rows=22907 loops=1)
                     Sort Key: t.date
                     Sort Method: quicksort  Memory: 1100kB
                     Buffers: shared hit=3064
                     ->  Bitmap Heap Scan on device_timeslots t  (cost=329.01..65314.41 rows=23861 width=32) (actual time=2.952..7.534 rows=11454 loops=1)
                           Recheck Cond: ((date >= '2024-04-26'::date) AND (date <= '2024-04-26'::date))
                           Heap Blocks: exact=3052
                           Buffers: shared hit=3064
                           ->  Bitmap Index Scan on device_timeslots_date_index  (cost=0.00..323.05 rows=23861 width=0) (actual time=1.648..1.648 rows=11454 loops=1)
                                 Index Cond: ((date >= '2024-04-26'::date) AND (date <= '2024-04-26'::date))
                                 Buffers: shared hit=12
 Planning:
   Buffers: shared hit=4
 Planning Time: 0.587 ms
 Execution Time: 49.163 ms



Do the buffer hits look correct in both plans? Is 11453 (the Memoize hits in the first) too many?

Re: effect on planner of turning a subquery to a table, sql function returning table

From: David Rowley
On Fri, 12 Apr 2024 at 23:27, Thierry Henrio <thierry.henrio@gmail.com> wrote:
>  JIT:
>    Functions: 36
>    Options: Inlining true, Optimization true, Expressions true, Deforming true
>    Timing: Generation 1.949 ms, Inlining 28.891 ms, Optimization 207.481 ms, Emission 134.907 ms, Total 373.228 ms
>  Execution Time: 429.037 ms

It looks very much like the majority of the extra time is being spent
doing JIT compilation.  This triggers for plan A but not plan B.  You
can see from:

> GroupAggregate  (cost=401037.82..503755.82 rows=1467400 width=124) (actual time=416.851..426.534 rows=4670 loops=1)

that the top-level row estimates are off.  This makes the estimated
cost higher than it actually is.  The planner opts to have tuple
deforming and expression evaluation JIT compiled to try to speed up
the plan thinking it's worthwhile. It's not in this case.

You can switch JIT off to try the query without it:

SET jit=0;

You might want to consider editing postgresql.conf and raising the
jit_above_cost, jit_inline_above_cost and jit_optimize_above_cost
values to something higher, or disabling JIT completely.
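
For example (illustrative values only, an order of magnitude above the defaults; tune to your workload):

# postgresql.conf
jit_above_cost = 1000000            # default 100000
jit_inline_above_cost = 5000000     # default 500000
jit_optimize_above_cost = 5000000   # default 500000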

SELECT pg_reload_conf(); -- to reload the config file afterwards.

David



Re: effect on planner of turning a subquery to a table, sql function returning table

From: Thierry Henrio
Thanks David!

Setting jit to 0:

 GroupAggregate  (cost=401037.82..503755.82 rows=1467400 width=124) (actual time=56.603..68.908 rows=4670 loops=1)
   Group Key: t.device_id, t.date, t.start_time, t.end_time, (timerange(((t_1.value ->> 0))::time without time zone, ((t_1.value ->> 1))::time without time zone))
   Buffers: shared hit=3067
   ->  Sort  (cost=401037.82..404706.32 rows=1467400 width=96) (actual time=56.583..57.847 rows=22908 loops=1)
         Sort Key: t.device_id, t.date, t.start_time, t.end_time, (timerange(((t_1.value ->> 0))::time without time zone, ((t_1.value ->> 1))::time without time zone)), t.rank
         Sort Method: quicksort  Memory: 3229kB
         Buffers: shared hit=3067
         ->  Nested Loop  (cost=2.99..100268.62 rows=1467400 width=96) (actual time=0.076..31.219 rows=22908 loops=1)
               Buffers: shared hit=3067
               ->  Nested Loop  (cost=2.98..55962.20 rows=14674 width=64) (actual time=0.055..6.589 rows=11454 loops=1)
                     Buffers: shared hit=3067
                     ->  Nested Loop  (cost=2.54..39.31 rows=500 width=40) (actual time=0.041..0.047 rows=1 loops=1)
                           Buffers: shared hit=3
                           ->  Index Scan using campaigns_pkey on campaigns c  (cost=0.28..8.30 rows=1 width=355) (actual time=0.011..0.012 rows=1 loops=1)
                                 Index Cond: (id = 11870)
                                 Buffers: shared hit=3
                           ->  Hash Join  (cost=2.26..26.01 rows=500 width=40) (actual time=0.028..0.031 rows=1 loops=1)
                                 Hash Cond: (EXTRACT(dow FROM d.date) = ((j.dow)::integer)::numeric)
                                 ->  Function Scan on generate_series d  (cost=0.01..10.01 rows=1000 width=8) (actual time=0.007..0.008 rows=1 loops=1)
                                 ->  Hash  (cost=1.00..1.00 rows=100 width=64) (actual time=0.013..0.014 rows=7 loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                       ->  Function Scan on jsonb_each j  (cost=0.00..1.00 rows=100 width=64) (actual time=0.007..0.008 rows=7 loops=1)
                     ->  Index Scan using device_timeslots_date_index on device_timeslots t  (cost=0.43..111.56 rows=29 width=32) (actual time=0.013..4.282 rows=11454 loops=1)
                           Index Cond: ((date = d.date) AND (date >= '2024-04-26'::date) AND (date <= '2024-04-26'::date))
                           Buffers: shared hit=3064
               ->  Memoize  (cost=0.01..1.01 rows=100 width=32) (actual time=0.000..0.000 rows=2 loops=11454)
                     Cache Key: j.times
                     Cache Mode: binary
                     Hits: 11453  Misses: 1  Evictions: 0  Overflows: 0  Memory Usage: 1kB
                     ->  Function Scan on jsonb_array_elements t_1  (cost=0.00..1.00 rows=100 width=32) (actual time=0.002..0.002 rows=2 loops=1)
 Planning:
   Buffers: shared hit=4
 Planning Time: 0.512 ms
 Execution Time: 69.476 ms


So, as far as I understand (correct me if I'm wrong):

1) EXPLAIN (ANALYZE, BUFFERS) shows the JIT timing, which plain EXPLAIN did not. Thanks!
I also found https://www.postgresql.org/docs/current/jit-decision.html#JIT-DECISION

2) the subquery is overestimated: 500 rows estimated vs 1 actual in the inner Nested Loop.
The error is then amplified by the outer Nested Loops (14674 estimated one level up, and 14674 * 100 = 1467400 at the top):

         ->  Nested Loop  (cost=2.99..100268.62 rows=1467400 width=96) (actual time=0.067..27.102 rows=22908 loops=1)
               Buffers: shared hit=3067
               ->  Nested Loop  (cost=2.98..55962.20 rows=14674 width=64) (actual time=0.055..5.750 rows=11454 loops=1)
                     Buffers: shared hit=3067
                     ->  Nested Loop  (cost=2.54..39.31 rows=500 width=40) (actual time=0.039..0.044 rows=1 loops=1)

The scan of the temp table was overestimated too (1130 estimated vs 2 actual), but the Merge Join copes with it:

         ->  Merge Join  (cost=67127.99..67631.11 rows=33165 width=64) (actual time=7.642..13.455 rows=22908 loops=1)
               Merge Cond: (z.date = t.date)
               Buffers: shared hit=3064, local hit=1
               ->  Sort  (cost=78.60..81.43 rows=1130 width=40) (actual time=0.011..0.012 rows=2 loops=1)
                     Sort Key: z.date
                     Sort Method: quicksort  Memory: 25kB
                     Buffers: local hit=1
                     ->  Seq Scan on z11870 z  (cost=0.00..21.30 rows=1130 width=40) (actual time=0.006..0.007 rows=2 loops=1)
                           Buffers: local hit=1

The cost estimation is better in (B) than in (A), even though the estimated row count for z11870 is off.

So I have options: O1) set jit=0, O2) temp table, O3) rework the schema design (a variation on O2); a rough sketch of O3 below.
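
For O3, a minimal sketch of what I have in mind, assuming a new campaign_days table (hypothetical name, untested) holding the expanded slots for all campaigns, kept in sync when a campaign changes, so the planner sees real rows and statistics:

create table campaign_days as
  select c.id as campaign_id, d.date, timerange(t.start, t.stop) as times
  from campaigns c
  cross join generate_series(c.start_date, c.end_date, '1 day') d(date)
  join unnest_timeslots(c.timeslots) t(dow, start, stop)
    on t.dow = extract(dow from d.date);

create index campaign_days_campaign_id_date_index
  on campaign_days (campaign_id, date);

analyze campaign_days;

-- plus a trigger or application code to refresh a campaign's rows
-- whenever its dates or timeslots change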
Thanks!

