Re: Custom explain options

Поиск
Список
Период
Сортировка
От Pavel Stehule
Тема Re: Custom explain options
Дата
Msg-id CAFj8pRDXQX9F+wDVPuBdM2UGDvxfLY5Pf1=dtk2x2=7GUBPsqA@mail.gmail.com
обсуждение исходный текст
Ответ на Custom explain options  (Konstantin Knizhnik <knizhnik@garret.ru>)
Список pgsql-hackers
Hi

so 25. 11. 2023 v 8:23 odesílatel Konstantin Knizhnik <knizhnik@garret.ru> napsal:

Hi hackers,

EXPLAIN statement has a list of options (i.e. ANALYZE, BUFFERS, COST,...) which help to provide useful details of query execution.
In Neon we have added PREFETCH option which shows information about page prefetching during query execution (prefetching is more critical for Neon
architecture because of separation of compute and storage, so it is implemented not only for bitmap heap scan as in Vanilla Postgres, but also for seqscan, indexscan and indexonly scan). Another possible candidate  for explain options is local file cache (extra caching layer above shared buffers which is used to somehow replace file system cache in standalone Postgres).

I think that it will be nice to have a generic mechanism which allows extensions to add its own options to EXPLAIN.
I have attached the patch with implementation of such mechanism (also available as PR:  https://github.com/knizhnik/postgres/pull/1 )

I have demonstrated this mechanism using Bloom extension - just to report number of Bloom matches.

Not sure that it is really useful information but it is used mostly as example:

explain (analyze,bloom) select * from t where pk=2000;                                                       QUERY PLAN                                                        
------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on t  (cost=15348.00..15352.01 rows=1 width=4) (actual time=25.244..25.939 rows=1 loops=1)   Recheck Cond: (pk = 2000)   Rows Removed by Index Recheck: 292   Heap Blocks: exact=283   Bloom: matches=293   ->  Bitmap Index Scan on t_pk_idx  (cost=0.00..15348.00 rows=1 width=0) (actual time=25.147..25.147 rows=293 loops=1)         Index Cond: (pk = 2000)         Bloom: matches=293 Planning:   Bloom: matches=0 Planning Time: 0.387 ms Execution Time: 26.053 ms
(12 rows)

There are two known issues with this proposal:

1. I have to limit total size of all custom metrics - right now it is limited by 128 bytes. It is done to keep Instrumentation and some other data structures fixes size. Otherwise maintaining varying parts of this structure is ugly, especially in shared memory

2. Custom extension is added by means of RegisterCustomInsrumentation function which is called from _PG_init
But _PG_init is called when extension is loaded and it is loaded on demand when some of extension functions is called (except when extension is included
in shared_preload_libraries list), Bloom extension doesn't require it. So if your first statement executed in your session is:
 explain (analyze,bloom) select * from t where pk=2000;
...you will get error:
ERROR:  unrecognized EXPLAIN option "bloom"
LINE 1: explain (analyze,bloom) select * from t where pk=2000;
It happens because at the moment when explain statement parses options, Bloom index is not yet selected and so bloom extension is not loaded and RegisterCustomInsrumentation is not yet called. If we repeat the query, then proper result will be displayed (see above).


This patch has a lot of whitespaces and formatting issues. I fixed some

I don't understand how selecting some custom instrumentation can be safe.

List *pgCustInstr is a global variable. The attribute selected is set by NewExplainState routine

+   /* Reset custom instrumentations selection flag */
+   foreach (lc, pgCustInstr)
+   {
+       CustomInstrumentation *ci = (CustomInstrumentation*) lfirst(lc);
+
+       ci->selected = false;
+   }

and this attribute is used more times. But the queries can be nested. Theoretically EXPLAIN ANALYZE can run another EXPLAIN ANALYZE, and then this attribute of the global list can be rewritten. The list of selected custom instrumentations should be part of explain state, I think.

Regards

Pavel

 
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: meson: Stop using deprecated way getting path of files
Следующее
От: Andrew Dunstan
Дата:
Сообщение: Re: remaining sql/json patches