RE: Partial aggregates pushdown

Поиск
Список
Период
Сортировка
От Fujii.Yuki@df.MitsubishiElectric.co.jp"
Тема RE: Partial aggregates pushdown
Дата
Msg-id TYAPR01MB30884F8E4D94038CD661FFEF95F02@TYAPR01MB3088.jpnprd01.prod.outlook.com
обсуждение исходный текст
Ответ на Re: Partial aggregates pushdown  (Alexander Pyhalov <a.pyhalov@postgrespro.ru>)
Ответы Re: Partial aggregates pushdown
Re: Partial aggregates pushdown
Re: Partial aggregates pushdown
Список pgsql-hackers
Hi Mr. Pyhalov.

Sorry for the late reply.
Thank you for your modification and detailed review.
I attach a fixed patch, have been not yet rebased.

Monday, 25 March 2024 16:01 Alexander Pyhalov <a.pyhalov@postgrespro.ru>:.
> Comment in nodeAgg.c seems to be strange:
>
> 1079 /*
> 1080 * If the agg's finalfn is provided and PARTIAL_AGGREGATE
> keyword is
> 1081 * not specified, apply the agg's finalfn.
> 1082 * If PARTIAL_AGGREGATE keyword is specified and the
> transValue type
> 1083 * is internal, apply the agg's serialfn. In this case, if
> the agg's
> 1084 * serialfn must not be invalid. Otherwise return
> transValue.
> 1085 */
>
> Likely, you mean:
>
> ... In this case the agg'ss serialfn must not be invalid...
Fixed.

> Lower, in the same file, please, correct error message:
>
> 1136 if(!OidIsValid(peragg->serialfn_oid))
> 1137  elog(ERROR, "serialfunc is note provided
> for partial aggregate");
>
> it should be "serialfunc is not provided for partial aggregate"
Fixed.

> Also something is wrong with the following test :
>
> SELECT /* aggregate <> partial aggregate */
>  array_agg(c_int4array), array_agg(b),
>  avg(b::int2), avg(b::int4), avg(b::int8), avg(c_interval),
>  avg(b::float4), avg(b::float8),
>  corr(b::float8, (b * b)::float8),
>  covar_pop(b::float8, (b * b)::float8),
>  covar_samp(b::float8, (b * b)::float8),
>  regr_avgx((2 * b)::float8, b::float8),
> .....
>
> Its results have changed since last patch. Do they depend on daylight
> saving time?
You are right. In my environment, TimeZone is set to 'PST8PDT'
with which timetz values depends on daylight saving time.
Changed TimeZone to 'UTC' in this test.

> You can see that filter is applied before append. The result is correct
> only by chance, as sum in every partition is actually < 700. If you
> lower this bound, let's say, to 200, you'll start getting wrong results
> as data is filtered prior to aggregation.
>
> It seems, however, that in partial case you should just avoid pulling
> conditions from having qual at all, all filters will be applied on upper
> level. Something like
Thank you for your modification.

> Found one more problem. You can fire partial aggregate over partitioned
> table, but convert_combining_aggrefs() will make non-partial copy, which
> leads to
> 'variable not found in subplan target list' error.
Thanks for the correction as well.
As you pointed out,
the original patch certainly had the potential to cause problems.
However, I could not actually reproduce the problem in cases such as the following.

  Settings:
    t(c1, c2) is a patitioned table whose partition key is c1.
    t1, t2 are patitions of t and are partitioned table.
    t11, t12: partitions of t1 and foreign table of postgres_fdw.
    t21, t22: partitions of t2 and foreign table of postgres_fdw.
  Query:
    select c2 / 2, sum(c1) from t group by c2 / 2 order by 1

If you have a reproducible example, I would like to add it to
the regression test.
Do you have a reproducible example?

> Also denied partial agregates pushdown on server version mismatch.
> Should check_partial_aggregate_support be 'true' by default?
Could we discuss this point after we determine how to transfer state values?
If we determine this point, we can easly determine whether check_partial_aggregate_support shold be 'true' by default.

> I'm not sure what to do with current grammar - it precludes partial
> distinct aggregates. I understand that it's currently impossible to have
> partial aggregation for distinct agregates -but does it worth to have
> such restriction at grammar level?
If partial aggregation for distinct agregates becomes possible in the future,
I see no problem with the policy of accepting new SQL keywords,
such as "PARTIL_AGGREGATE DISTINCT".

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ashutosh Bapat
Дата:
Сообщение: Re: apply_scanjoin_target_to_paths and partitionwise join
Следующее
От: "Tristan Partin"
Дата:
Сообщение: Re: Comments on Custom RMGRs