Re: Partial aggregates pushdown

Поиск
Список
Период
Сортировка
От Alexander Pyhalov
Тема Re: Partial aggregates pushdown
Дата
Msg-id fa5f3f5548330e512ae151a5ac79091f@postgrespro.ru
обсуждение исходный текст
Ответ на RE: Partial aggregates pushdown  ("Fujii.Yuki@df.MitsubishiElectric.co.jp" <Fujii.Yuki@df.MitsubishiElectric.co.jp>)
Список pgsql-hackers
Fujii.Yuki@df.MitsubishiElectric.co.jp писал(а) 2024-05-28 00:30:
> Hi Mr. Pyhalov.
> 
> Sorry for the late reply.
> Thank you for your modification and detailed review.
> I attach a fixed patch, have been not yet rebased.
> 
> Monday, 25 March 2024 16:01 Alexander Pyhalov 
> <a.pyhalov@postgrespro.ru>:.
>> Comment in nodeAgg.c seems to be strange:
>> 
>> 1079 /*
>> 1080 * If the agg's finalfn is provided and PARTIAL_AGGREGATE
>> keyword is
>> 1081 * not specified, apply the agg's finalfn.
>> 1082 * If PARTIAL_AGGREGATE keyword is specified and the
>> transValue type
>> 1083 * is internal, apply the agg's serialfn. In this case, if
>> the agg's
>> 1084 * serialfn must not be invalid. Otherwise return
>> transValue.
>> 1085 */
>> 
>> Likely, you mean:
>> 
>> ... In this case the agg'ss serialfn must not be invalid...
> Fixed.
> 
>> Lower, in the same file, please, correct error message:
>> 
>> 1136 if(!OidIsValid(peragg->serialfn_oid))
>> 1137  elog(ERROR, "serialfunc is note provided
>> for partial aggregate");
>> 
>> it should be "serialfunc is not provided for partial aggregate"
> Fixed.
> 
>> Also something is wrong with the following test :
>> 
>> SELECT /* aggregate <> partial aggregate */
>>  array_agg(c_int4array), array_agg(b),
>>  avg(b::int2), avg(b::int4), avg(b::int8), avg(c_interval),
>>  avg(b::float4), avg(b::float8),
>>  corr(b::float8, (b * b)::float8),
>>  covar_pop(b::float8, (b * b)::float8),
>>  covar_samp(b::float8, (b * b)::float8),
>>  regr_avgx((2 * b)::float8, b::float8),
>> .....
>> 
>> Its results have changed since last patch. Do they depend on daylight
>> saving time?
> You are right. In my environment, TimeZone is set to 'PST8PDT'
> with which timetz values depends on daylight saving time.
> Changed TimeZone to 'UTC' in this test.
> 
>> You can see that filter is applied before append. The result is 
>> correct
>> only by chance, as sum in every partition is actually < 700. If you
>> lower this bound, let's say, to 200, you'll start getting wrong 
>> results
>> as data is filtered prior to aggregation.
>> 
>> It seems, however, that in partial case you should just avoid pulling
>> conditions from having qual at all, all filters will be applied on 
>> upper
>> level. Something like
> Thank you for your modification.
> 
>> Found one more problem. You can fire partial aggregate over 
>> partitioned
>> table, but convert_combining_aggrefs() will make non-partial copy, 
>> which
>> leads to
>> 'variable not found in subplan target list' error.
> Thanks for the correction as well.
> As you pointed out,
> the original patch certainly had the potential to cause problems.
> However, I could not actually reproduce the problem in cases such as 
> the following.
> 
>   Settings:
>     t(c1, c2) is a patitioned table whose partition key is c1.
>     t1, t2 are patitions of t and are partitioned table.
>     t11, t12: partitions of t1 and foreign table of postgres_fdw.
>     t21, t22: partitions of t2 and foreign table of postgres_fdw.
>   Query:
>     select c2 / 2, sum(c1) from t group by c2 / 2 order by 1
> 
> If you have a reproducible example, I would like to add it to
> the regression test.
> Do you have a reproducible example?
> 
>> Also denied partial agregates pushdown on server version mismatch.
>> Should check_partial_aggregate_support be 'true' by default?
> Could we discuss this point after we determine how to transfer state 
> values?
> If we determine this point, we can easly determine whether 
> check_partial_aggregate_support shold be 'true' by default.
> 
>> I'm not sure what to do with current grammar - it precludes partial
>> distinct aggregates. I understand that it's currently impossible to 
>> have
>> partial aggregation for distinct agregates -but does it worth to have
>> such restriction at grammar level?
> If partial aggregation for distinct agregates becomes possible in the 
> future,
> I see no problem with the policy of accepting new SQL keywords,
> such as "PARTIL_AGGREGATE DISTINCT".

BTW, there's I have an issue with test results in the last version of 
the patch. Attaching regression diffs.
I have partial sum over c_interval instead of sum(c_interval).


-- 
Best regards,
Alexander Pyhalov,
Postgres Professional
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Pyhalov
Дата:
Сообщение: Re: Partial aggregates pushdown
Следующее
От: Alexander Pyhalov
Дата:
Сообщение: Re: CREATE INDEX CONCURRENTLY on partitioned index