Обсуждение: BUG #17871: JIT during postgresql_fdw remote_estimates EXPLAIN have very negatively effect on planning time

Поиск
Список
Период
Сортировка

BUG #17871: JIT during postgresql_fdw remote_estimates EXPLAIN have very negatively effect on planning time

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      17871
Logged by:          Maxim Boguk
Email address:      maxim.boguk@gmail.com
PostgreSQL version: 13.9
Operating system:   Linux
Description:

Issue:
postgresql_fdw remote estimated  explain calls could trigger JIT compilation
on the remote side (why explain without analyze trying use JIT at all???),
and with partitioned tables it will lead to very slow planning.

In my case simple query over FDW table have planning time over 150ms with
jit=on on remote side:
explain analyze select * from cold_replica_fdw.interview_review_info_archive
order by topic_id limit 1;
                                                          QUERY PLAN
                                                 

------------------------------------------------------------------------------------------------------------------------------
 Foreign Scan on interview_review_info_archive  (cost=14.10..14.14 rows=1
width=40) (actual time=3.942..4.526 rows=1 loops=1)
 Planning Time: 162.721 ms
 Execution Time: 5.226 ms

And only 15ms with jit=off on remote side:
explain analyze select * from cold_replica_fdw.interview_review_info_archive
order by topic_id limit 1;
                                                          QUERY PLAN
                                                 

------------------------------------------------------------------------------------------------------------------------------
 Foreign Scan on interview_review_info_archive  (cost=14.10..14.14 rows=1
width=40) (actual time=3.724..4.381 rows=1 loops=1)
 Planning Time: 14.655 ms
 Execution Time: 5.048 ms

Checking what's going on the remote side leads to the following results:
remote estimate explain calls from fdw with jit=on
[EXPLAIN] LOG:  duration: 97.050 ms  statement: EXPLAIN SELECT topic_id,
review_id, move_to_invitation_state_time, no_interview_reply_time,
review_suggestion_chat_message_creation_time FROM
public.interview_review_info_archive
vs
remote estimate explain calls from fdw with jit=off
[EXPLAIN] LOG:  duration: 3.343 ms  statement: EXPLAIN SELECT topic_id,
review_id, move_to_invitation_state_time, no_interview_reply_time,
review_suggestion_chat_message_creation_time FROM
public.interview_review_info_archive

Reason with jit=on the database spent a lot of time on the:
 JIT:
   Functions: 200
   Options: Inlining true, Optimization true, Expressions true, Deforming
true

Possible solutions:
band aid:                       postgresql fdw should invoke set jit to
'off';  when doing remote estimates via explain calls.
probably more correct:  explain (without analyze) should not invoke JIT code
path at all (because the database not going to execute query anyway, so
there are no profit from JIT could be gained).


PG Bug reporting form <noreply@postgresql.org> writes:
> Checking what's going on the remote side leads to the following results:
> remote estimate explain calls from fdw with jit=on
> [EXPLAIN] LOG:  duration: 97.050 ms  statement: EXPLAIN SELECT topic_id,
> review_id, move_to_invitation_state_time, no_interview_reply_time,
> review_suggestion_chat_message_creation_time FROM
> public.interview_review_info_archive
> vs
> remote estimate explain calls from fdw with jit=off
> [EXPLAIN] LOG:  duration: 3.343 ms  statement: EXPLAIN SELECT topic_id,
> review_id, move_to_invitation_state_time, no_interview_reply_time,
> review_suggestion_chat_message_creation_time FROM
> public.interview_review_info_archive

Do you see the same discrepancy when you execute EXPLAIN manually
on the remote side?  If so, I wouldn't blame postgres_fdw for it.

I suppose interview_review_info_archive is a view not a plain table?
In either case, could we see the DDL definition for it?

            regards, tom lane



Hi,

On 2023-03-27 08:04:59 +0000, PG Bug reporting form wrote:
> postgresql_fdw remote estimated  explain calls could trigger JIT compilation
> on the remote side (why explain without analyze trying use JIT at all???),
> and with partitioned tables it will lead to very slow planning.

It should not trigger all of JIT, just generating the bitcode, but not
optimizing / emitting it.

> Checking what's going on the remote side leads to the following results:
> remote estimate explain calls from fdw with jit=on
> [EXPLAIN] LOG:  duration: 97.050 ms  statement: EXPLAIN SELECT topic_id,
> review_id, move_to_invitation_state_time, no_interview_reply_time,
> review_suggestion_chat_message_creation_time FROM
> public.interview_review_info_archive
> vs
> remote estimate explain calls from fdw with jit=off
> [EXPLAIN] LOG:  duration: 3.343 ms  statement: EXPLAIN SELECT topic_id,
> review_id, move_to_invitation_state_time, no_interview_reply_time,
> review_suggestion_chat_message_creation_time FROM
> public.interview_review_info_archive

Can you show the explain plan for the remote queries?

Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> On 2023-03-27 08:04:59 +0000, PG Bug reporting form wrote:
>> postgresql_fdw remote estimated  explain calls could trigger JIT compilation
>> on the remote side (why explain without analyze trying use JIT at all???),
>> and with partitioned tables it will lead to very slow planning.

> It should not trigger all of JIT, just generating the bitcode, but not
> optimizing / emitting it.

I was wondering if we could skip doing even that much when
EXEC_FLAG_EXPLAIN_ONLY is set.  Without a test case, though,
I'm not very sure where the jitter is getting reached.

            regards, tom lane





On Tue, Mar 28, 2023 at 1:53 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
> Checking what's going on the remote side leads to the following results:
> remote estimate explain calls from fdw with jit=on
> [EXPLAIN] LOG:  duration: 97.050 ms  statement: EXPLAIN SELECT topic_id,
> review_id, move_to_invitation_state_time, no_interview_reply_time,
> review_suggestion_chat_message_creation_time FROM
> public.interview_review_info_archive
> vs
> remote estimate explain calls from fdw with jit=off
> [EXPLAIN] LOG:  duration: 3.343 ms  statement: EXPLAIN SELECT topic_id,
> review_id, move_to_invitation_state_time, no_interview_reply_time,
> review_suggestion_chat_message_creation_time FROM
> public.interview_review_info_archive

Do you see the same discrepancy when you execute EXPLAIN manually
on the remote side?  If so, I wouldn't blame postgres_fdw for it.

I suppose interview_review_info_archive is a view not a plain table?
In either case, could we see the DDL definition for it?

                        regards, tom lane

Hi,

Yes interview_review_info_archive is not a normal table by natively partitioned by range table with 100 partition:

negotiation_chat_archive=# \d+ interview_review_info_archive
                                              Partitioned table "public.interview_review_info_archive"
                    Column                    |            Type             | Collation | Nullable | Default | Storage | Stats target | Description
----------------------------------------------+-----------------------------+-----------+----------+---------+---------+--------------+-------------
 topic_id                                     | bigint                      |           | not null |         | plain   |              |
 review_id                                    | bigint                      |           |          |         | plain   |              |
 move_to_invitation_state_time                | timestamp without time zone |           |          |         | plain   |              |
 no_interview_reply_time                      | timestamp without time zone |           |          |         | plain   |              |
 review_suggestion_chat_message_creation_time | timestamp without time zone |           |          |         | plain   |              |
Partition key: RANGE (topic_id)
Indexes:
    "interview_review_info_archive_pkey" PRIMARY KEY, btree (topic_id)
Partitions: interview_review_info_archive_p001 FOR VALUES FROM ('0') TO ('100000000'),
...
            interview_review_info_archive_p100 FOR VALUES FROM ('9900000000') TO ('10000000000')

When I run EXPLAIN locally there are the same differences.
Under normal circumstances this behavior is not an issue because EXPLAIN is only executed by DBA, but with postgresql_fdw issues 2-3 EXPLAIN per each basic fdw query this behavior (triggering JIT on explain) has a very negative effect on performance.


Simplest case show huge difference in theEXPLAIN performance with and without JIT:
negotiation_chat_archive=# set jit to on;
SET
negotiation_chat_archive=# explain select * FROM public.interview_review_info_archive;
                                                                     QUERY PLAN                                                                      
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=100.00..4602434.79 rows=354847702 width=40)
   Workers Planned: 7
   ->  Parallel Append  (cost=0.00..1053857.77 rows=50692498 width=40)
         ->  Parallel Seq Scan on interview_review_info_archive_p030 interview_review_info_archive_30  (cost=0.00..200959.41 rows=15759281 width=40)
...
         ->  Parallel Seq Scan on interview_review_info_archive_p005 interview_review_info_archive_5  (cost=0.00..0.11 rows=1 width=40)
         ->  Parallel Seq Scan on interview_review_info_archive_p009 interview_review_info_archive_9  (cost=0.00..0.11 rows=1 width=40)
(103 rows)
 JIT:
   Functions: 200
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(106 rows)
Time: 33.664 ms
PS: this sample contradicts Andreas' statement about "It should not trigger all of JIT, just generating the bitcode, but not optimizing / emitting it." (or I read EXPLAIN output wrong.)

vs
negotiation_chat_archive=# explain select * FROM public.interview_review_info_archive;
...
         ->  Parallel Seq Scan on interview_review_info_archive_p005 interview_review_info_archive_5  (cost=0.00..0.11 rows=1 width=40)
         ->  Parallel Seq Scan on interview_review_info_archive_p009 interview_review_info_archive_9  (cost=0.00..0.11 rows=1 width=40)
(103 rows)
Time: 3.392 ms


--
Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/

Phone UA: +380 99 143 0000
Phone AU: +61  45 218 5678

On Tue, 28 Mar 2023 at 10:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I was wondering if we could skip doing even that much when
> EXEC_FLAG_EXPLAIN_ONLY is set.  Without a test case, though,
> I'm not very sure where the jitter is getting reached.

It seems to be coming from the qual initialisation in nodeSeqscan.c:
ie. ExecInitSeqScan -> ExecInitQual -> ExecReadyExpr ->
jit_compile_expr

I guess that means just add a check for EXEC_FLAG_EXPLAIN_ONLY in
jit_compile_expr, or only set the es_jit_flags bits in
standard_ExecutorStart() when not in EXEC_FLAG_EXPLAIN_ONLY. I'm not
really sure which of those is better, but the attached does the former
and seems to fix the issue.

David

Вложения
David Rowley <dgrowleyml@gmail.com> writes:
> I guess that means just add a check for EXEC_FLAG_EXPLAIN_ONLY in
> jit_compile_expr, or only set the es_jit_flags bits in
> standard_ExecutorStart() when not in EXEC_FLAG_EXPLAIN_ONLY. I'm not
> really sure which of those is better, but the attached does the former
> and seems to fix the issue.

Hmm, offhand I'd vote for the latter, because it adds cycles only
once per query not once per expression.  But Andres may have a
better idea about how to do this.

            regards, tom lane





On Tue, Mar 28, 2023 at 1:53 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
> Checking what's going on the remote side leads to the following results:
> remote estimate explain calls from fdw with jit=on
> [EXPLAIN] LOG:  duration: 97.050 ms  statement: EXPLAIN SELECT topic_id,
> review_id, move_to_invitation_state_time, no_interview_reply_time,
> review_suggestion_chat_message_creation_time FROM
> public.interview_review_info_archive
> vs
> remote estimate explain calls from fdw with jit=off
> [EXPLAIN] LOG:  duration: 3.343 ms  statement: EXPLAIN SELECT topic_id,
> review_id, move_to_invitation_state_time, no_interview_reply_time,
> review_suggestion_chat_message_creation_time FROM
> public.interview_review_info_archive

Do you see the same discrepancy when you execute EXPLAIN manually
on the remote side?  If so, I wouldn't blame postgres_fdw for it.

I suppose interview_review_info_archive is a view not a plain table?
In either case, could we see the DDL definition for it?

                        regards, tom lane

Hi,

During an attempt to create a minimal test case I discovered an effect for which I have no explanation:
(all tests with jit=on):

Run EXPLAIN on the single partition, so far expected results...
explain select * from interview_review_info_archive_p028;
                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
 Seq Scan on interview_review_info_archive_p028  (cost=0.00..361819.08 rows=34546848 width=40)
 JIT:
   Functions: 2
   Options: Inlining true, Optimization true, Expressions true, Deforming true

Prepare stand alone copy table with all data:
create table tables_to_drop.test as select * from interview_review_info_archive_p028;
vacuum ANALYZE tables_to_drop.test;
explain select * from tables_to_drop.test;
                           QUERY PLAN                            
-----------------------------------------------------------------
 Seq Scan on test  (cost=0.00..361908.72 rows=34562312 width=40)

The same cost/same data/same query/same structure but no JIT triggered?

Add the same indexes as exists on interview_review_info_archive_p028:
alter table tables_to_drop.test add constraint test_pk primary key (topic_id);
create unique index test_uniq on tables_to_drop.test(review_id, topic_id) WHERE review_id IS NOT NULL);
vacuum ANALYZE tables_to_drop.test;
And still no JIT triggered during EXPLAIN.

Making sure that the both tables contain exactly the same data in exactly the same order:
cluster tables_to_drop.test using test_pk;
analyze verbose tables_to_drop.test;
cluster interview_review_info_archive_p028 using interview_review_info_archive_p028_pkey;
analyze verbose interview_review_info_archive_p028;
No changes:

negotiation_chat_archive=# explain select * from interview_review_info_archive_p028;
                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
 Seq Scan on interview_review_info_archive_p028  (cost=0.00..361921.00 rows=34563080 width=40)
 JIT:
   Functions: 2
   Options: Inlining true, Optimization true, Expressions true, Deforming true

negotiation_chat_archive=# explain select * from tables_to_drop.test;
                           QUERY PLAN                            
-----------------------------------------------------------------
 Seq Scan on test  (cost=0.00..361922.08 rows=34563208 width=40)


Add check constraint on test table similar to exists on partition interview_review_info_archive_p028:
alter table tables_to_drop.test add constraint test_chk check ((topic_id IS NOT NULL) AND (topic_id >= '2700000000'::bigint) AND (topic_id < '2800000000'::bigint));
And still not JIT on standalone table.

Final structure of tables:
negotiation_chat_archive=# \d+ interview_review_info_archive_p028
                                                 Table "public.interview_review_info_archive_p028"
                    Column                    |            Type             | Collation | Nullable | Default | Storage | Stats target | Description
----------------------------------------------+-----------------------------+-----------+----------+---------+---------+--------------+-------------
 topic_id                                     | bigint                      |           | not null |         | plain   |              |
 review_id                                    | bigint                      |           |          |         | plain   |              |
 move_to_invitation_state_time                | timestamp without time zone |           |          |         | plain   |              |
 no_interview_reply_time                      | timestamp without time zone |           |          |         | plain   |              |
 review_suggestion_chat_message_creation_time | timestamp without time zone |           |          |         | plain   |              |
Partition of: interview_review_info_archive FOR VALUES FROM ('2700000000') TO ('2800000000')
Partition constraint: ((topic_id IS NOT NULL) AND (topic_id >= '2700000000'::bigint) AND (topic_id < '2800000000'::bigint))
Indexes:
    "interview_review_info_archive_p028_pkey" PRIMARY KEY, btree (topic_id) CLUSTER
    "interview_review_info_archive_p028_review_id_topic_id_idx" UNIQUE, btree (review_id, topic_id) WHERE review_id IS NOT NULL
Access method: heap

vs

negotiation_chat_archive=# \d+  tables_to_drop.test
                                                            Table "tables_to_drop.test"
                    Column                    |            Type             | Collation | Nullable | Default | Storage | Stats target | Description
----------------------------------------------+-----------------------------+-----------+----------+---------+---------+--------------+-------------
 topic_id                                     | bigint                      |           | not null |         | plain   |              |
 review_id                                    | bigint                      |           |          |         | plain   |              |
 move_to_invitation_state_time                | timestamp without time zone |           |          |         | plain   |              |
 no_interview_reply_time                      | timestamp without time zone |           |          |         | plain   |              |
 review_suggestion_chat_message_creation_time | timestamp without time zone |           |          |         | plain   |              |
Indexes:
    "test_pk" PRIMARY KEY, btree (topic_id) CLUSTER
    "test_uniq" UNIQUE, btree (review_id, topic_id) WHERE review_id IS NOT NULL
Check constraints:
    "test_chk" CHECK (topic_id IS NOT NULL AND topic_id >= '2700000000'::bigint AND topic_id < '2800000000'::bigint)
Access method: heap

Only difference is
Partition of: interview_review_info_archive FOR VALUES FROM ('2700000000') TO ('2800000000')

Somehow being part of partition structure triggering JIT during EXPLAIN even if table used as standalone (e.g. called directly not via partition root) in query.


--
Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/

Phone UA: +380 99 143 0000
Phone AU: +61  45 218 5678

On Tue, 28 Mar 2023 at 17:54, Maxim Boguk <maxim.boguk@gmail.com> wrote:
> During an attempt to create a minimal test case I discovered an effect for which I have no explanation:
> (all tests with jit=on):
>
> Run EXPLAIN on the single partition, so far expected results...
> explain select * from interview_review_info_archive_p028;
>                                           QUERY PLAN
> -----------------------------------------------------------------------------------------------
>  Seq Scan on interview_review_info_archive_p028  (cost=0.00..361819.08 rows=34546848 width=40)
>  JIT:
>    Functions: 2
>    Options: Inlining true, Optimization true, Expressions true, Deforming true

It could be something like a dropped column forcing projection of the
SELECT *.  You might want to try without SELECT * using some subset of
columns instead.

Check if columns have been dropped with:

select attnum,attname from pg_Attribute where
attrelid='interview_review_info_archive_p028'::Regclass;

David





On Tue, Mar 28, 2023 at 5:13 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Tue, 28 Mar 2023 at 17:54, Maxim Boguk <maxim.boguk@gmail.com> wrote:
> During an attempt to create a minimal test case I discovered an effect for which I have no explanation:
> (all tests with jit=on):
>
> Run EXPLAIN on the single partition, so far expected results...
> explain select * from interview_review_info_archive_p028;
>                                           QUERY PLAN
> -----------------------------------------------------------------------------------------------
>  Seq Scan on interview_review_info_archive_p028  (cost=0.00..361819.08 rows=34546848 width=40)
>  JIT:
>    Functions: 2
>    Options: Inlining true, Optimization true, Expressions true, Deforming true

It could be something like a dropped column forcing projection of the
SELECT *.  You might want to try without SELECT * using some subset of
columns instead.

Check if columns have been dropped with:

select attnum,attname from pg_Attribute where
attrelid='interview_review_info_archive_p028'::Regclass;

David

Hi David,

You 100% right
negotiation_chat_archive=# explain select topic_id,review_id,move_to_invitation_state_time,no_interview_reply_time,review_suggestion_chat_message_creation_time from interview_review_info_archive_p028;
                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
 Seq Scan on interview_review_info_archive_p028  (cost=0.00..361921.00 rows=34563080 width=40)

Exceptionally curious case when select * not equivalent select list_all_columns from table...
Thank you.

--
Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/

Phone UA: +380 99 143 0000
Phone AU: +61  45 218 5678



On Tue, Mar 28, 2023 at 5:21 PM Maxim Boguk <maxim.boguk@gmail.com> wrote:


On Tue, Mar 28, 2023 at 5:13 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Tue, 28 Mar 2023 at 17:54, Maxim Boguk <maxim.boguk@gmail.com> wrote:
> During an attempt to create a minimal test case I discovered an effect for which I have no explanation:
> (all tests with jit=on):
>
> Run EXPLAIN on the single partition, so far expected results...
> explain select * from interview_review_info_archive_p028;
>                                           QUERY PLAN
> -----------------------------------------------------------------------------------------------
>  Seq Scan on interview_review_info_archive_p028  (cost=0.00..361819.08 rows=34546848 width=40)
>  JIT:
>    Functions: 2
>    Options: Inlining true, Optimization true, Expressions true, Deforming true

It could be something like a dropped column forcing projection of the
SELECT *.  You might want to try without SELECT * using some subset of
columns instead.

Check if columns have been dropped with:

select attnum,attname from pg_Attribute where
attrelid='interview_review_info_archive_p028'::Regclass;

David

Hi David,

You 100% right
negotiation_chat_archive=# explain select topic_id,review_id,move_to_invitation_state_time,no_interview_reply_time,review_suggestion_chat_message_creation_time from interview_review_info_archive_p028;
                                          QUERY PLAN                                          
-----------------------------------------------------------------------------------------------
 Seq Scan on interview_review_info_archive_p028  (cost=0.00..361921.00 rows=34563080 width=40)

Exceptionally curious case when select * not equivalent select list_all_columns from table...
Thank you.

The small update - previous explain was wrong (I forgot enable jit)
but general explanation correct
dropping column from test table forcing jit during explain:

set JIT to on;
alter table tables_to_drop.test drop column move_to_invitation_state_time;
vacuum ANALYZE tables_to_drop.test;

negotiation_chat_archive=# explain select * from tables_to_drop.test;
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..361966.20 rows=34567620 width=32)
 JIT:
   Functions: 2
   Options: Inlining false, Optimization false, Expressions true, Deforming true
(4 rows)

--
Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/

Phone UA: +380 99 143 0000
Phone AU: +61  45 218 5678

Hi,

On 2023-03-28 13:27:51 +1100, Maxim Boguk wrote:
> Simplest case show huge difference in theEXPLAIN performance with and
> without JIT:
> negotiation_chat_archive=# set jit to on;
> SET
> negotiation_chat_archive=# explain select * FROM
> public.interview_review_info_archive;
>                                                                      QUERY
> PLAN
>
-----------------------------------------------------------------------------------------------------------------------------------------------------
>  Gather  (cost=100.00..4602434.79 rows=354847702 width=40)
>    Workers Planned: 7
>    ->  Parallel Append  (cost=0.00..1053857.77 rows=50692498 width=40)
>          ->  Parallel Seq Scan on interview_review_info_archive_p030
> interview_review_info_archive_30  (cost=0.00..200959.41 rows=15759281
> width=40)
> ...
>          ->  Parallel Seq Scan on interview_review_info_archive_p005
> interview_review_info_archive_5  (cost=0.00..0.11 rows=1 width=40)
>          ->  Parallel Seq Scan on interview_review_info_archive_p009
> interview_review_info_archive_9  (cost=0.00..0.11 rows=1 width=40)
> (103 rows)
>  JIT:
>    Functions: 200
>    Options: Inlining true, Optimization true, Expressions true, Deforming
> true
> (106 rows)
> Time: 33.664 ms
> PS: this sample contradicts Andreas' statement about "It should not trigger
> all of JIT, just generating the bitcode, but not optimizing / emitting it."
> (or I read EXPLAIN output wrong.)

I think you might be misreading it. The output just tells you that we would
perform optimization etc, not that we did.

Upthread JIT caused the explain to take 100ms, here it's "just" 33ms. What's
the difference?

Greetings,

Andres Freund



Hi,

On 2023-03-28 17:28:04 +1100, Maxim Boguk wrote:
> The small update - previous explain was wrong (I forgot enable jit)
> but general explanation correct
> dropping column from test table forcing jit during explain:
>
> set JIT to on;
> alter table tables_to_drop.test drop column move_to_invitation_state_time;
> vacuum ANALYZE tables_to_drop.test;
>
> negotiation_chat_archive=# explain select * from tables_to_drop.test;
>                                    QUERY PLAN
>
> ---------------------------------------------------------------------------------
>  Seq Scan on test  (cost=0.00..361966.20 rows=34567620 width=32)
>  JIT:
>    Functions: 2
>    Options: Inlining false, Optimization false, Expressions true, Deforming
> true
> (4 rows)

That's not really related to JIT - when there are no dropped columns we return
"physical tuples" from e.g. a sequential scan. Physical tuples being tuples
just as they are in the table. But if there are dropped columns, we need to
project, to "filter out" the dropped column.


postgres[3877072][1]=# SHOW max_parallel_workers_per_gather ;
┌─────────────────────────────────┐
│ max_parallel_workers_per_gather │
├─────────────────────────────────┤
│ 0                               │
└─────────────────────────────────┘
(1 row)

postgres[3877072][1]=# ;DROP TABLE large;CREATE TABLE large(id int8, dropme int default 0, data text);INSERT INTO
large(id,data) SELECT g.i, g.i FROM generate_series(1, 5000000) g(i);
 


postgres[3877072][1]=# EXPLAIN (ANALYZE, TIMING OFF) SELECT count(*) FROM large ;
┌───────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                            QUERY PLAN                                             │
├───────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Aggregate  (cost=76830.89..76830.90 rows=1 width=8) (actual rows=1 loops=1)                       │
│   ->  Seq Scan on large  (cost=0.00..67834.11 rows=3598711 width=0) (actual rows=5000000 loops=1) │
│ Planning Time: 0.076 ms                                                                           │
│ Execution Time: 325.636 ms                                                                        │
└───────────────────────────────────────────────────────────────────────────────────────────────────┘
(4 rows)
(repeat a couple times)

ALTER TABLE large DROP COLUMN dropme;


postgres[3877072][1]=# EXPLAIN (ANALYZE, TIMING OFF) SELECT count(*) FROM large ;
┌───────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                            QUERY PLAN                                             │
├───────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Aggregate  (cost=79617.50..79617.51 rows=1 width=8) (actual rows=1 loops=1)                       │
│   ->  Seq Scan on large  (cost=0.00..70063.40 rows=3821640 width=0) (actual rows=5000000 loops=1) │
│ Planning Time: 0.079 ms                                                                           │
│ Execution Time: 365.530 ms                                                                        │
└───────────────────────────────────────────────────────────────────────────────────────────────────┘
(4 rows)


After the DROP COLUMN the time is consistently higher.


JIT is used for projections. When we just use physical tuples, there's no
projection. Hence no JIT for a SELECT * FROM tbl; when there's no dropped
columns.

Greetings,

Andres Freund



Hi,

On 2023-03-28 16:05:00 +1300, David Rowley wrote:
> On Tue, 28 Mar 2023 at 10:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I was wondering if we could skip doing even that much when
> > EXEC_FLAG_EXPLAIN_ONLY is set.  Without a test case, though,
> > I'm not very sure where the jitter is getting reached.
> 
> It seems to be coming from the qual initialisation in nodeSeqscan.c:
> ie. ExecInitSeqScan -> ExecInitQual -> ExecReadyExpr ->
> jit_compile_expr
> 
> I guess that means just add a check for EXEC_FLAG_EXPLAIN_ONLY in
> jit_compile_expr, or only set the es_jit_flags bits in
> standard_ExecutorStart() when not in EXEC_FLAG_EXPLAIN_ONLY. I'm not
> really sure which of those is better, but the attached does the former
> and seems to fix the issue.

I'm a bit worried about either of these making EXPLAIN less informative. Right
now we can see whether and how much would be JITed without ANALYZE. After this
not anymore.

Greetings,

Andres Freund



Hi,

On 2023-03-28 11:33:41 -0700, Andres Freund wrote:
> On 2023-03-28 16:05:00 +1300, David Rowley wrote:
> > On Tue, 28 Mar 2023 at 10:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > I was wondering if we could skip doing even that much when
> > > EXEC_FLAG_EXPLAIN_ONLY is set.  Without a test case, though,
> > > I'm not very sure where the jitter is getting reached.
> > 
> > It seems to be coming from the qual initialisation in nodeSeqscan.c:
> > ie. ExecInitSeqScan -> ExecInitQual -> ExecReadyExpr ->
> > jit_compile_expr
> > 
> > I guess that means just add a check for EXEC_FLAG_EXPLAIN_ONLY in
> > jit_compile_expr, or only set the es_jit_flags bits in
> > standard_ExecutorStart() when not in EXEC_FLAG_EXPLAIN_ONLY. I'm not
> > really sure which of those is better, but the attached does the former
> > and seems to fix the issue.
> 
> I'm a bit worried about either of these making EXPLAIN less informative. Right
> now we can see whether and how much would be JITed without ANALYZE. After this
> not anymore.

I wonder if we should address the postgres_fdw case of potentially doing lots
of EXPLAINs by adding an option geared towards that?

Or perhaps even a dedicated function?  Generating an entire executor tree,
just to get to estimates that the planner already determined, seems a bit
questionable... Explain output can be quite large. And we are just looking at
the top-level cost + rows anyway?

Greetings,

Andres Freund



On Wed, 29 Mar 2023 at 07:33, Andres Freund <andres@anarazel.de> wrote:
>
> On 2023-03-28 16:05:00 +1300, David Rowley wrote:
> > I guess that means just add a check for EXEC_FLAG_EXPLAIN_ONLY in
> > jit_compile_expr, or only set the es_jit_flags bits in
> > standard_ExecutorStart() when not in EXEC_FLAG_EXPLAIN_ONLY. I'm not
> > really sure which of those is better, but the attached does the former
> > and seems to fix the issue.
>
> I'm a bit worried about either of these making EXPLAIN less informative. Right
> now we can see whether and how much would be JITed without ANALYZE. After this
> not anymore.

If you're worried about that, then wouldn't it just be better to do
something like have jit_compile_expr() increment some other counter
when bailing on EXEC_FLAG_EXPLAIN_ONLY then have EXPLAIN add that
counter to created_functions? With that, we don't need to invent any
new syntax and also remove the overhead from EXPLAIN and also keep the
JIT information.

David



Andres Freund <andres@anarazel.de> writes:
> I wonder if we should address the postgres_fdw case of potentially doing lots
> of EXPLAINs by adding an option geared towards that?

> Or perhaps even a dedicated function?  Generating an entire executor tree,
> just to get to estimates that the planner already determined, seems a bit
> questionable... Explain output can be quite large. And we are just looking at
> the top-level cost + rows anyway?

I experimented with the idea of adding an EXPLAIN option designed for
postgres_fdw's needs.  It seems quite simple to skip executor startup
(and hence jitting) as well as most of the cost of constructing the
EXPLAIN output.  You could imagine going further and having a way for
the planner to skip create_plan() and just return the Path tree, but
I'm skeptical that that would be worth the mess.  It would be a lot
harder to shoehorn into the existing EXPLAIN APIs, for one thing,
and I'm not sure what the planner side would look like either.

The attached just implements the actual printout without the
infrastructure of a new EXPLAIN option, but that's enough for rough
timing tests.  Using a case similar to Maxim's, with a 100-partition
table with a dropped column, this gives me something like

=# explain select * from tpart;
                   QUERY PLAN
------------------------------------------------
 Plan  (cost=0.00..4390.00 rows=226000 width=8)
(1 row)

Time: 0.795 ms

instead of:

=# explain select * from tpart;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Append  (cost=0.00..4390.00 rows=226000 width=8)
   ->  Seq Scan on tpart1 tpart_1  (cost=0.00..32.60 rows=2260 width=8)
   ->  Seq Scan on tpart2 tpart_2  (cost=0.00..32.60 rows=2260 width=8)
   ...
   ->  Seq Scan on tpart100 tpart_100  (cost=0.00..32.60 rows=2260 width=8)
 JIT:
   Functions: 200
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(104 rows)

Time: 7.219 ms

Obviously plenty of bikeshedding to be done here about the option name
and the exact output, but that's the basic idea.  It would take only
a couple of lines in postgres_fdw.c to invoke this option if the
remote server is new enough, and its code for parsing the EXPLAIN
output wouldn't need to change at all.

BTW, I'm still wondering how Maxim got to 100-plus-msec overheads;
it's around 10ms at most for me, with what seems like a similar case.

            regards, tom lane

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 7e48e9c210..a979243669 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -538,6 +538,40 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,

     Assert(plannedstmt->commandType != CMD_UTILITY);

+    /*
+     * If option XXX is specified, skip executor setup and just print the cost
+     * estimates from the top-level plan node.
+     */
+    if (1)
+    {
+        Plan       *plan = plannedstmt->planTree;
+
+        if (es->format == EXPLAIN_FORMAT_TEXT)
+        {
+            appendStringInfo(es->str, "Plan  (cost=%.2f..%.2f rows=%.0f width=%d)",
+                             plan->startup_cost, plan->total_cost,
+                             plan->plan_rows, plan->plan_width);
+        }
+        else
+        {
+            ExplainOpenGroup("Query", NULL, true, es);
+            ExplainOpenGroup("Plan", "Plan", true, es);
+
+            ExplainPropertyFloat("Startup Cost", NULL, plan->startup_cost,
+                                 2, es);
+            ExplainPropertyFloat("Total Cost", NULL, plan->total_cost,
+                                 2, es);
+            ExplainPropertyFloat("Plan Rows", NULL, plan->plan_rows,
+                                 0, es);
+            ExplainPropertyInteger("Plan Width", NULL, plan->plan_width,
+                                   es);
+
+            ExplainCloseGroup("Plan", "Plan", true, es);
+            ExplainCloseGroup("Query", NULL, true, es);
+        }
+        return;
+    }
+
     if (es->analyze && es->timing)
         instrument_option |= INSTRUMENT_TIMER;
     else if (es->analyze)


BTW, I'm still wondering how Maxim got to 100-plus-msec overheads;
it's around 10ms at most for me, with what seems like a similar case.

                        regards, tom lane


Hi Tom,

100+ms on FDW result of two factors
1)FDW call explain 3 times in my case and two of them surely fire JIT
2)locally single EXPLAIN work 35ms on 100 partitions on almost idle server (after multiple runs)
negotiation_chat_archive=# explain select * from interview_review_info_archive;
...
 JIT:
   Functions: 200
   Options: Inlining false, Optimization false, Expressions true, Deforming true
(106 rows)
Time: 34.076 ms
negotiation_chat_archive=# explain select * from interview_review_info_archive;
Time: 34.437 ms
negotiation_chat_archive=# explain select * from interview_review_info_archive;
Time: 34.015 ms
pretty stable... not even close to the 7ms that you show in your test.
Table consists of 2 bigint + 3 timestamptz field + single dropped boolean.

I run
explain select * from interview_review_info_archive
\watch 1
for few minutes with corresponding perf record -p recording
and there is perf report result:  
   4.04%  postgres  libLLVM-10.so.1      [.] llvm::FoldingSetNodeID::AddString
   2.94%  postgres  libc-2.31.so         [.] malloc
   2.58%  postgres  libLLVM-10.so.1      [.] llvm::AttributeImpl::operator<
   2.44%  postgres  libLLVM-10.so.1      [.] llvm::AttributeSetNode::get
   2.18%  postgres  libLLVM-10.so.1      [.] llvm::BasicBlock::dropAllReferences
   1.87%  postgres  libLLVM-10.so.1      [.] llvm::Value::setValueName
   1.80%  postgres  libLLVM-10.so.1      [.] llvm::StringMapImpl::LookupBucketFor
   1.67%  postgres  libLLVM-10.so.1      [.] llvm::FoldingSetBase::FindNodeOrInsertPos
   1.50%  postgres  libLLVM-10.so.1      [.] llvm::FoldingSetNodeID::AddPointer
   1.49%  postgres  libLLVM-10.so.1      [.] llvm::StringMapImpl::RemoveKey
   1.48%  postgres  libLLVM-10.so.1      [.] llvm::BasicBlock::~BasicBlock
   1.46%  postgres  libLLVM-10.so.1      [.] llvm::AttrBuilder::addAttribute
   1.30%  postgres  libLLVM-10.so.1      [.] std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > c
   1.23%  postgres  libLLVM-10.so.1      [.] llvm::Value::setNameImpl
   1.17%  postgres  libstdc++.so.6.0.28  [.] std::_Rb_tree_insert_and_rebalance
   1.13%  postgres  libc-2.31.so         [.] cfree
   1.07%  postgres  libLLVM-10.so.1      [.] llvm::Value::destroyValueName
   1.06%  postgres  libLLVM-10.so.1      [.] llvm::User::operator new
   1.04%  postgres  postgres             [.] hash_search_with_hash_value
   0.97%  postgres  libLLVM-10.so.1      [.] std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > c
   0.95%  postgres  libLLVM-10.so.1      [.] llvm::hashing::detail::hash_combine_range_impl<unsigned int const>
   0.91%  postgres  libLLVM-10.so.1      [.] llvm::ConstantInt::get
   0.84%  postgres  libLLVM-10.so.1      [.] std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > c
   0.80%  postgres  libLLVM-10.so.1      [.] std::__sort<llvm::Attribute*, __gnu_cxx::__ops::_Iter_less_iter>
   0.73%  postgres  libLLVM-10.so.1      [.] llvm::Value::getValueName
   0.73%  postgres  libc-2.31.so         [.] memcpy
   0.69%  postgres  postgres             [.] dopr.constprop.0
   0.63%  postgres  libLLVM-10.so.1      [.] llvm::AttributeList::addAttributes
   0.63%  postgres  libLLVM-10.so.1      [.] llvm::AttrBuilder::addAttribute
   0.62%  postgres  libLLVM-10.so.1      [.] llvm::SymbolTableListTraits<llvm::Instruction>::setSymTabObject<llvm::Function*>
   0.60%  postgres  libstdc++.so.6.0.28  [.] std::_Rb_tree_increment

Maybe it provides some clues.

PS: I run all tests on the postgresql 13 version, maybe there are differences between JIT implementation/performance between 13 version and 15/head.

Regards,
Maxim


--
Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/

Phone UA: +380 99 143 0000
Phone AU: +61  45 218 5678