Re: Partial aggregates pushdown

Поиск
Список
Период
Сортировка
От Alexander Pyhalov
Тема Re: Partial aggregates pushdown
Дата
Msg-id 37c9d93c2dca786f534639136d1da8f3@postgrespro.ru
обсуждение исходный текст
Ответ на RE: Partial aggregates pushdown  ("Fujii.Yuki@df.MitsubishiElectric.co.jp" <Fujii.Yuki@df.MitsubishiElectric.co.jp>)
Список pgsql-hackers
Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2023-09-25 06:18:
> Hi Mr.Bruce, Mr.Pyhalov, Mr.Finnerty, hackers.
> 
> Thank you for your valuable comments. I sincerely apologize for the
> very late reply.
> Here is a response to your comments or a fix to the patch.
> 
> Tuesday, August 8, 2023 at 3:31 Bruce Momjian
>> > I have modified the program except for the point "if the version of the remote server is less than PG17".
>> > Instead, we have addressed the following.
>> > "If check_partial_aggregate_support is true and the remote server version is older than the local server
>> > version, postgres_fdw does not assume that the partial aggregate function is on the remote server unless
>> > the partial aggregate function and the aggregate function match."
>> > The reason for this is to maintain compatibility with any aggregate function that does not support partial
>> > aggregate in one version of V1 (V1 is PG17 or higher), even if the next version supports partial aggregate.
>> > For example, string_agg does not support partial aggregation in PG15, but it will support partial aggregation
>> > in PG16.
>> 
>> Just to clarify, I think you are saying:
>> 
>>         If check_partial_aggregate_support is true and the remote 
>> server
>>         version is older than the local server version, postgres_fdw
>>         checks if the partial aggregate function exists on the remote
>>         server during planning and only uses it if it does.
>> 
>> I tried to phrase it in a positive way, and mentioned the plan time
>> distinction.  Also, I am sorry I was away for most of July and am just
>> getting to this.
> Thanks for your comment. In the documentation, the description of
> check_partial_aggregate_support is as follows
> (please see postgres-fdw.sgml).
> --
> check_partial_aggregate_support (boolean)
> Only if this option is true, during query planning, postgres_fdw
> connects to the remote server and check if the remote server version
> is older than the local server version. If so, postgres_fdw assumes
> that for each built-in aggregate function, the partial aggregate
> function is not defined on the remote server unless the partial
> aggregate function and the aggregate function match. The default is
> false.
> --
> 
> Thursday, 20 July 2023 19:23 Alexander Pyhalov 
> <a.pyhalov@postgrespro.ru>:
>> Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2023-07-19 03:43:
>> > Hi Mr.Pyhalov, hackers.
>> 
>> > 3)
>> > I modified the patch to safely do a partial aggregate pushdown for
>> > queries which contain having clauses.
>> >
>> 
>> Hi.
>> Sorry, but I don't see how it could work.
> We apologize for any inconvenience caused.
> Thanks to Pyhalov's and Jim's comments, I have realized that I have
> made a fundamental mistake regarding the pushdown of the HAVING clause
> and the difficulty of achieving it performing Partial aggregate
> pushdown.
> So, I removed the codes about pushdown of the HAVING clause performing
> Partial aggregate pushdown.
> 
> Thursday, 20 July 2023 19:23 Alexander Pyhalov 
> <a.pyhalov@postgrespro.ru>:
>> As for changes in planner.c (setGroupClausePartial()) I have several
>> questions.
>> 
>> 1) Why don't we add non_group_exprs to pathtarget->exprs when
>> partial_target->exprs is not set?
>> 
>> 2) We replace extra->partial_target->exprs with partial_target->exprs
>> after processing. Why are we sure that after this tleSortGroupRef is
>> correct?
> Response to 1)
> The code you pointed out was unnecessary. I have removed this code.
> Also, the process of adding PlaceHolderVar's expr to partial_target was 
> missing.
> So I fixed this.
> 
> Response to 2)
> The making procedures extra->groupClausePartial and 
> extra->partial_target
> in make_partial_grouping_target for this patch is as follows.
> STEP1. From grouping_target->exprs, extract Aggref, Var and
> Placeholdervar that are not included in Aggref.
> STEP2. setGroupClausePartial sets the copy of original groupClause to
> extra->groupClausePartial
> and sets the copy of original partial_target to extra->partial_target.
> STEP3. setGroupClausePartial adds Var and Placeholdervar in STEP1 to
> partial_target.
> The sortgroupref of partial_target->sortgrouprefs to be added to value 
> is set to
> (the maximum value of the existing sortgroupref) + 1.
> setGroupClausePartial adds data sgc of sortgroupclause type where
> sgc->tlesortgroupref
> matches the sortgroupref to GroupClause.
> STEP4. add_new_columns_to_pathtarget adds STEP1's Aggref to 
> partial_target.
> 
> Due to STEP2, the list of tlesortgrouprefs set in
> extra->groupClausePartial is not duplicated.

Do you mean that extra->partial_target->sortgrouprefs is not replaced, 
and so we preserve tlesortgroupref numbers?
I'm suspicious about rewriting extra->partial_target->exprs with 
partial_target->exprs - I'm still not sure why we
  don't we loose information, added by add_column_to_pathtarget() to 
extra->partial_target->exprs?

Also look at the following example.

EXPLAIN VERBOSE SELECT  count(*) , (b/2)::numeric FROM pagg_tab GROUP BY 
b/2 ORDER BY 1;
                                             QUERY PLAN
---------------------------------------------------------------------------------------------------
  Sort  (cost=511.35..511.47 rows=50 width=44)
    Output: (count(*)), ((((pagg_tab.b / 2)))::numeric), ((pagg_tab.b / 
2))
    Sort Key: (count(*))
    ->  Finalize HashAggregate  (cost=509.06..509.94 rows=50 width=44)
          Output: count(*), (((pagg_tab.b / 2)))::numeric, ((pagg_tab.b / 
2))
          Group Key: ((pagg_tab.b / 2))
          ->  Append  (cost=114.62..506.06 rows=600 width=16)
                ->  Foreign Scan  (cost=114.62..167.69 rows=200 width=16)
                      Output: ((pagg_tab.b / 2)), (PARTIAL count(*)), 
pagg_tab.b
                      Relations: Aggregate on (public.fpagg_tab_p1 
pagg_tab)
                      Remote SQL: SELECT (b / 2), count(*), b FROM 
public.pagg_tab_p1 GROUP BY 1, 2
                ->  Foreign Scan  (cost=114.62..167.69 rows=200 width=16)
                      Output: ((pagg_tab_1.b / 2)), (PARTIAL count(*)), 
pagg_tab_1.b
                      Relations: Aggregate on (public.fpagg_tab_p2 
pagg_tab_1)
                      Remote SQL: SELECT (b / 2), count(*), b FROM 
public.pagg_tab_p2 GROUP BY 1, 2
                ->  Foreign Scan  (cost=114.62..167.69 rows=200 width=16)
                      Output: ((pagg_tab_2.b / 2)), (PARTIAL count(*)), 
pagg_tab_2.b
                      Relations: Aggregate on (public.fpagg_tab_p3 
pagg_tab_2)
                      Remote SQL: SELECT (b / 2), count(*), b FROM 
public.pagg_tab_p3 GROUP BY 1, 2

Note that group by is still deparsed incorrectly.
-- 
Best regards,
Alexander Pyhalov,
Postgres Professional



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Langote
Дата:
Сообщение: Re: generic plans and "initial" pruning
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: [PATCH] Add inline comments to the pg_hba_file_rules view