Re: Find query characters in respect of optimizer for develop purpose

Поиск
Список
Период
Сортировка
От Melanie Plageman
Тема Re: Find query characters in respect of optimizer for develop purpose
Дата
Msg-id CAAKRu_Y64GmVLxjs0RTaUVLz7m5+D=6h=yM4Rw9GhiLhSNbkKg@mail.gmail.com
обсуждение исходный текст
Ответ на Find query characters in respect of optimizer for develop purpose  (Andy Fan <zhihui.fan1213@gmail.com>)
Список pgsql-hackers

On Mon, May 18, 2020 at 1:30 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
Hello:

Before I want to pay attention to some optimizer features, I want to
estimate how much benefits it can create for customers, at least for our current
running customer. So I want to have some basic idea what kind of the query is
running now in respect of optimizer.  


You are imagining this to be collected during planning on a live
customer system as a form of telemetry?
I was inspired to search the hackers mailing list archive for the word
"telemetry" and didn't get many hits, which surprised me.
 

My basic is we can track it with the below struct(every backend has one global 
variable to record it).

+typedef struct
+{
+       int     subplan_count;
+       int     subquery_count;
+       int join_count;
+       bool hasagg;
+       bool hasgroup;
+} QueryCharacters;

it will be reset at the beginning of standard_planner, and the values are
increased at  make_subplan, set_subquery_pathlist, make_one_rel,
create_grouping_paths. later it can be tracked and viewed in
pg_stat_statements.


I think the natural reaction to this idea is: isn't there a 3rd party
tool that does this? Or can't you use one of the hooks and write an
extension, to, for example, examine the parse,query,and plan trees?

However, it does seem like keeping track of this information would be
much easier during planning since planner will be examining the query
tree and making the plan anyway.

On the other hand, I think that depends a lot on what specific
information you want to collect. Out of the fields you listed, it is
unclear what some of them would mean.
Does join_count count the number of explicit joins in the original query
or does it count the number of joins in the final plan? Does
subquery_count count all sub-selects in the original query or does it
only count subqueries that become SubqueryScans or SubPlans? What about
subqueries that become InitPlans?

One concern I have is that it seems like this struct would have to be
updated throughout planning and that it would be easy to break it with
the addition of new code. Couldn't every new optimization added to
planner potentially affect the accuracy of the information in the
struct?

--
Melanie Plageman

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: Re: Recording test runtimes with the buildfarm
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Recording test runtimes with the buildfarm