Обсуждение: Partial aggregates pushdown

Поиск

Список

Период

Сортировка

Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

15 октября 2021 г., 16:15:33

Hi.

One of the issues when we try to use sharding in PostgreSQL is absence 
of partial aggregates pushdown.

I see several opportunities to alleviate this issue.
If we look at Citus, it implements aggregate, calculating internal state 
of an arbitrary agregate function and exporting it as text. So we could 
calculate internal states independently on all data sources and then 
finalize it, which allows to compute arbitrary aggregate.

But, as mentioned in [1] thread, for some functions (like 
count/max/min/sum) we can just push down them. It seems easy and covers 
a lot of cases.
For now there are still issues - for example you can't handle functions 
as avg() as we should somehow get its internal state or sum() variants, 
which need aggserialfn/aggdeserialfn. Preliminary version is attached.

Is someone else working on the issue? Does suggested approach make 
sense?

[1] 
https://www.postgresql.org/message-id/flat/9998c3af9fdb5f7d62a6c7ad0fcd9142%40postgrespro.ru

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Вложения

0001-Partial-aggregates-push-down.patch

Re: Partial aggregates pushdown

От

Tomas Vondra

Дата:

15 октября 2021 г., 17:56:27

Hi Alexander,

On 10/15/21 15:15, Alexander Pyhalov wrote:
> Hi.
> 
> One of the issues when we try to use sharding in PostgreSQL is absence 
> of partial aggregates pushdown.
> 
> I see several opportunities to alleviate this issue.
> If we look at Citus, it implements aggregate, calculating internal state 
> of an arbitrary agregate function and exporting it as text. So we could 
> calculate internal states independently on all data sources and then 
> finalize it, which allows to compute arbitrary aggregate.
> 
> But, as mentioned in [1] thread, for some functions (like 
> count/max/min/sum) we can just push down them. It seems easy and covers 
> a lot of cases.
> For now there are still issues - for example you can't handle functions 
> as avg() as we should somehow get its internal state or sum() variants, 
> which need aggserialfn/aggdeserialfn. Preliminary version is attached.
> 
> Is someone else working on the issue? Does suggested approach make sense?
> 

I think a couple people worked on this (or something similar/related) in 
the past, but I don't recall any recent patches.

IMHO being able to push-down parts of an aggregation to other nodes is a 
very desirable feature, that might result in huge improvements for some 
analytical workloads.

As for the proposed approach, it's probably good enough for the first 
version to restrict this to aggregates where the aggregate result is 
sufficient, i.e. we don't need any new export/import procedures.

But it's very unlikely we'd want to restrict it the way the patch does 
it, i.e. based on aggregate name. That's both fragile (people can create 
new aggregates with such name) and against the PostgreSQL extensibility 
(people may implement custom aggregates, but won't be able to benefit 
from this just because of name).

So for v0 maybe, but I think there neeeds to be a way to relax this in 
some way, for example we could add a new flag to pg_aggregate to mark 
aggregates supporting this.

And then we should extend this for aggregates with more complex internal 
states (e.g. avg), by supporting a function that "exports" the aggregate 
state - similar to serial/deserial functions, but needs to be portable.

I think the trickiest thing here is rewriting the remote query to call 
this export function, but maybe we could simply instruct the remote node 
to use a different final function for the top-level node?

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

15 октября 2021 г., 18:05:27

Tomas Vondra писал 2021-10-15 17:56:
> Hi Alexander,
> 

Hi.

> And then we should extend this for aggregates with more complex
> internal states (e.g. avg), by supporting a function that "exports"
> the aggregate state - similar to serial/deserial functions, but needs
> to be portable.
> 
> I think the trickiest thing here is rewriting the remote query to call
> this export function, but maybe we could simply instruct the remote
> node to use a different final function for the top-level node?
> 
> 

If we have some special export function, how should we find out that 
remote server supports this? Should it be server property or should it 
somehow find out it while connecting to the server?

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Re: Partial aggregates pushdown

От

Tomas Vondra

Дата:

15 октября 2021 г., 18:26:37

On 10/15/21 17:05, Alexander Pyhalov wrote:
> Tomas Vondra писал 2021-10-15 17:56:
>> Hi Alexander,
>>
> 
> Hi.
> 
>> And then we should extend this for aggregates with more complex
>> internal states (e.g. avg), by supporting a function that "exports"
>> the aggregate state - similar to serial/deserial functions, but needs
>> to be portable.
>>
>> I think the trickiest thing here is rewriting the remote query to call
>> this export function, but maybe we could simply instruct the remote
>> node to use a different final function for the top-level node?
>>
>>
> 
> If we have some special export function, how should we find out that 
> remote server supports this? Should it be server property or should it 
> somehow find out it while connecting to the server?
> 

Good question. I guess there could be some initial negotiation based on 
remote node version etc. And we could also disable this pushdown for 
older server versions, etc.

But after that, I think we can treat this just like other definitions 
between local/remote node - we'd assume they match (i.e. the remote 
server has the export function), and then we'd get an error if it does 
not. If you need to use remote nodes without an export function, you'd 
have to disable the pushdown.

AFAICS this works both for case with explicit query rewrite (i.e. we 
send SQL with calls to the export function) and implicit query rewrite 
(where the remote node uses a different finalize function based on mode, 
specified by GUC).

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Partial aggregates pushdown

От

Tomas Vondra

Дата:

16 октября 2021 г., 00:59:47

On 10/15/21 21:31, Stephen Frost wrote:
> Greetings,
> 
> * Tomas Vondra (tomas.vondra@enterprisedb.com) wrote:
>> On 10/15/21 17:05, Alexander Pyhalov wrote:
>>> Tomas Vondra писал 2021-10-15 17:56:
>>>> And then we should extend this for aggregates with more complex
>>>> internal states (e.g. avg), by supporting a function that "exports"
>>>> the aggregate state - similar to serial/deserial functions, but needs
>>>> to be portable.
>>>>
>>>> I think the trickiest thing here is rewriting the remote query to call
>>>> this export function, but maybe we could simply instruct the remote
>>>> node to use a different final function for the top-level node?
>>>
>>> If we have some special export function, how should we find out that
>>> remote server supports this? Should it be server property or should it
>>> somehow find out it while connecting to the server?
>>
>> Good question. I guess there could be some initial negotiation based on
>> remote node version etc. And we could also disable this pushdown for older
>> server versions, etc.
> 
> Yeah, I'd think we would just only support it on versions where we know
> it's available.  That doesn't seem terribly difficult.
> 

Yeah.

But maybe Alexander was concerned about cases where the nodes disagree 
on the aggregate definition, so one node might have the export function 
and the other would not. E.g. the remote node may have older version of 
an extension implementing the aggregate, without the export function 
(although the server version supports it). I don't think we can do much 
about that, it's just one of many issues that may be caused by 
mismatching schemas.

I wonder if this might get more complex, though. Imagine for example a 
partitioned table on node A with a FDW partition, pointing to a node B. 
But on B, the object is partitioned again, with one partition placed on 
C. So it's like

   A -> partition on B -> partition on C

When planning on A, we can consider server version on B. But what if C 
is an older version, not supporting the export function?

Bot sure if this makes any difference, though ... in the worst case it 
will error out, and we should have a way to disable the feature on A.

>> But after that, I think we can treat this just like other definitions
>> between local/remote node - we'd assume they match (i.e. the remote server
>> has the export function), and then we'd get an error if it does not. If you
>> need to use remote nodes without an export function, you'd have to disable
>> the pushdown.
>>
>> AFAICS this works both for case with explicit query rewrite (i.e. we send
>> SQL with calls to the export function) and implicit query rewrite (where the
>> remote node uses a different finalize function based on mode, specified by
>> GUC).
> 
> Not quite sure where to drop this, but I've always figured we'd find a
> way to use the existing PartialAgg / FinalizeAggregate bits which are
> used for parallel query when it comes to pushing down to foreign servers
> to perform aggregates.  That also gives us how to serialize the results,
> though we'd have to make sure that works across different
> architectures.. I've not looked to see if that's the case today.
> 

It sure is similar to what serial/deserial functions do for partial 
aggs, but IIRC the functions were not designed to be portable. I think 
we don't even require compatibility across minor releases, because we 
only use this to copy data between workers running at the same time. Not 
saying it can't be made to work, of course.

> Then again, being able to transform an aggregate into a partial
> aggregate that runs as an actual SQL query would mean we do partial
> aggregate push-down against non-PG FDWs and that'd be pretty darn neat,
> so maybe that's a better way to go, if we can figure out how.
> 
> (I mean, for avg it's pretty easy to just turn that into a SELECT that
> grabs the sum and the count and use that..  other aggregates are more
> complicated though and that doesn't work, maybe we need both?)
> 

Maybe, but that seems like a very different concept - transforming the 
SQL so that it calculates different set of aggregates that we know can 
be pushed down easily. But I don't recall any other practical example 
beyond the AVG() -> SUM()/COUNT(). Well, VAR() can be translated into 
SUM(X), SUM(X^2).

Another thing is how many users would actually benefit from this. I 
mean, for this to matter you need partitioned table with partitions 
placed on a non-PG FDW, right? Seems like a pretty niche use case.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

19 октября 2021 г., 09:56:45

Hi.

Tomas Vondra писал 2021-10-15 17:56:
> As for the proposed approach, it's probably good enough for the first
> version to restrict this to aggregates where the aggregate result is
> sufficient, i.e. we don't need any new export/import procedures.
> 
> But it's very unlikely we'd want to restrict it the way the patch does
> it, i.e. based on aggregate name. That's both fragile (people can
> create new aggregates with such name) and against the PostgreSQL
> extensibility (people may implement custom aggregates, but won't be
> able to benefit from this just because of name).
> 
> So for v0 maybe, but I think there neeeds to be a way to relax this in
> some way, for example we could add a new flag to pg_aggregate to mark
> aggregates supporting this.
> 

Updated patch to mark aggregates as pushdown-safe in pg_aggregates.

So far have no solution for aggregates with internal aggtranstype.
-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Вложения

0001-Partial-aggregates-push-down-v02.patch

Re: Partial aggregates pushdown

От

Tomas Vondra

Дата:

19 октября 2021 г., 16:25:52

On 10/19/21 08:56, Alexander Pyhalov wrote:
> Hi.
> 
> Tomas Vondra писал 2021-10-15 17:56:
>> As for the proposed approach, it's probably good enough for the first
>> version to restrict this to aggregates where the aggregate result is
>> sufficient, i.e. we don't need any new export/import procedures.
>>
>> But it's very unlikely we'd want to restrict it the way the patch does
>> it, i.e. based on aggregate name. That's both fragile (people can
>> create new aggregates with such name) and against the PostgreSQL
>> extensibility (people may implement custom aggregates, but won't be
>> able to benefit from this just because of name).
>>
>> So for v0 maybe, but I think there neeeds to be a way to relax this in
>> some way, for example we could add a new flag to pg_aggregate to mark
>> aggregates supporting this.
>>
> 
> Updated patch to mark aggregates as pushdown-safe in pg_aggregates.
> 
> So far have no solution for aggregates with internal aggtranstype.

Thanks. Please add it to the next CF, so that we don't lose track of it.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

21 октября 2021 г., 13:55:04

Tomas Vondra писал 2021-10-19 16:25:
> On 10/19/21 08:56, Alexander Pyhalov wrote:
>> Hi.
>> 
>> Tomas Vondra писал 2021-10-15 17:56:
>>> As for the proposed approach, it's probably good enough for the first
>>> version to restrict this to aggregates where the aggregate result is
>>> sufficient, i.e. we don't need any new export/import procedures.
>>> 
>>> But it's very unlikely we'd want to restrict it the way the patch 
>>> does
>>> it, i.e. based on aggregate name. That's both fragile (people can
>>> create new aggregates with such name) and against the PostgreSQL
>>> extensibility (people may implement custom aggregates, but won't be
>>> able to benefit from this just because of name).
>>> 
>>> So for v0 maybe, but I think there neeeds to be a way to relax this 
>>> in
>>> some way, for example we could add a new flag to pg_aggregate to mark
>>> aggregates supporting this.
>>> 
>> 
>> Updated patch to mark aggregates as pushdown-safe in pg_aggregates.
>> 
>> So far have no solution for aggregates with internal aggtranstype.

Hi. Updated patch.
Now aggregates with internal states can be pushed down, if they are 
marked as pushdown safe (this flag is set to true for min/max/sum),
have internal states and associated converters. Converters are called 
locally, they transform aggregate result to serialized internal 
representation.
As converters don't have access to internal aggregate state, partial 
aggregates like avg() are still not pushable.

For now the overall logic is quite simple. We now also call 
add_foreign_grouping_paths() for partial aggregation.  In 
foreign_expr_walker() we check if aggregate is pushable (which means 
that it is simple, marked as pushable and if has 'internal' as 
aggtranstype, has associated converter).
If it is pushable, we proceed as with usual aggregates (but forbid 
having pushdown). During postgresGetForeignPlan() we produce list of 
converters for aggregates. As converters has different input argument 
type from their result (bytea), we have to generate alternative 
metadata, which is used by make_tuple_from_result_row().
If make_tuple_from_result_row() encounters field with converter, it 
calls converter and returns result. For now we expect converter to have 
only one input and output argument. Existing converters just transform 
input value to internal representation and return its serialized form.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Вложения

0001-Partial-aggregates-push-down-v03.patch

Re: Partial aggregates pushdown

От

Zhihong Yu

Дата:

22 октября 2021 г., 00:43:38

Hi,

w.r.t. 0001-Partial-aggregates-push-down-v03.patch

For partial_agg_ok(),

+ if (agg->aggdistinct || agg->aggvariadic || agg->aggkind != AGGKIND_NORMAL || agg->aggorder != NIL)
+ ok = false;

Since SearchSysCache1() is not called yet, you can return false directly.

+ if (aggform->aggpartialpushdownsafe != true)

The above can be written as:

if (!aggform->aggpartialpushdownsafe)

For build_conv_list():

+ Oid converter_oid = InvalidOid;
+
+ if (IsA(tlentry->expr, Aggref))

...

+ }
+ convlist = lappend_oid(convlist, converter_oid);

Do you intend to append InvalidOid to convlist (when tlentry->expr is not Aggref) ?

Cheers

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

22 октября 2021 г., 09:26:50

Zhihong Yu писал 2021-10-22 00:43:
> Hi,
> w.r.t. 0001-Partial-aggregates-push-down-v03.patch
> 

Hi.

> For partial_agg_ok(),
> 
> +   if (agg->aggdistinct || agg->aggvariadic || agg->aggkind !=
> AGGKIND_NORMAL || agg->aggorder != NIL)
> +       ok = false;
> 
> Since SearchSysCache1() is not called yet, you can return false
> directly.

Fixed.

> 
> +       if (aggform->aggpartialpushdownsafe != true)
> 
> The above can be written as:
> 
>        if (!aggform->aggpartialpushdownsafe)

Fixed.

> 
> For build_conv_list():
> 
> +           Oid         converter_oid = InvalidOid;
> +
> +           if (IsA(tlentry->expr, Aggref))
> 
> ...
> +           }
> +           convlist = lappend_oid(convlist, converter_oid);
> 
> Do you intend to append InvalidOid to convlist (when tlentry->expr is
> not Aggref) ?

Yes, for each tlist member (which matches fpinfo->grouped_tlist in case 
when foreignrel is UPPER_REL) we need to find corresponding converter.
If we don't append InvalidOid, we can't find convlist member, 
corresponding to tlist member. Added comments to build_conv_list.

Also fixed error in pg_dump.c (we selected '0' when 
aggpartialconverterfn was not defined in schema, but checked for '-').

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Вложения

0001-Partial-aggregates-push-down-v04.patch

Re: Partial aggregates pushdown

От

Peter Eisentraut

Дата:

01 ноября 2021 г., 12:47:42

On 21.10.21 12:55, Alexander Pyhalov wrote:
> Now aggregates with internal states can be pushed down, if they are 
> marked as pushdown safe (this flag is set to true for min/max/sum),
> have internal states and associated converters. Converters are called 
> locally, they transform aggregate result to serialized internal 
> representation.
> As converters don't have access to internal aggregate state, partial 
> aggregates like avg() are still not pushable.

It seems to me that the system should be able to determine from the 
existing aggregate catalog entry whether an aggregate can be pushed 
down.  For example, it could check aggtranstype != internal and similar. 
  A separate boolean flag should not be necessary.  Or if it is, the 
patch should provide some guidance about how an aggregate function 
author should set it.

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

01 ноября 2021 г., 13:30:27

Peter Eisentraut писал 2021-11-01 12:47:
> On 21.10.21 12:55, Alexander Pyhalov wrote:
>> Now aggregates with internal states can be pushed down, if they are 
>> marked as pushdown safe (this flag is set to true for min/max/sum),
>> have internal states and associated converters. Converters are called 
>> locally, they transform aggregate result to serialized internal 
>> representation.
>> As converters don't have access to internal aggregate state, partial 
>> aggregates like avg() are still not pushable.
> 
> It seems to me that the system should be able to determine from the
> existing aggregate catalog entry whether an aggregate can be pushed
> down.  For example, it could check aggtranstype != internal and
> similar.  A separate boolean flag should not be necessary.

Hi.
I think we can't infer this property from existing flags. For example, 
if I have avg() with bigint[] argtranstype, it doesn't mean we can push 
down it. We couldn't also decide if partial aggregete is safe to push 
down based on aggfinalfn presence (for example, it is defined for 
sum(numeric), but we can push it down.

> Or if it
> is, the patch should provide some guidance about how an aggregate
> function author should set it.

Where should it be provided?

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Re: Partial aggregates pushdown

От

Ilya Gladyshev

Дата:

02 ноября 2021 г., 00:31:35

Hi,

On 21.10.2021 13:55, Alexander Pyhalov wrote:

Hi. Updated patch.
Now aggregates with internal states can be pushed down, if they are marked as pushdown safe (this flag is set to true for min/max/sum),
have internal states and associated converters.

I don't quite understand why this is restricted only to aggregates that have 'internal' state, I feel like that should be possible for any aggregate that has a function to convert its final result back to aggregate state to be pushed down. While I couldn't come up with a useful example for this, except maybe for an aggregate whose aggfinalfn is used purely for cosmetic purposes (e.g. format the result into a string), I still feel that it is an unnecessary restriction.

A few minor review notes to the patch:

+static List *build_conv_list(RelOptInfo *foreignrel);

this should probably be up top among other declarations.

@@ -1433,6 +1453,48 @@ postgresGetForeignPlan(PlannerInfo *root,
outer_plan);
}

+/*
+ * Generate attinmeta if there are some converters:
+ * they are expecxted to return BYTEA, but real input type is likely different.
+ */

typo in word "expecxted".

@@ -139,10 +147,13 @@ typedef struct PgFdwScanState
                                 * for a foreign join scan. */
     TupleDesc    tupdesc;        /* tuple descriptor of scan */
     AttInMetadata *attinmeta;    /* attribute datatype conversion metadata */
+    AttInMetadata *rcvd_attinmeta;    /* metadata for received tuples, NULL if
+                                    * there's no converters */

Looks like rcvd_attinmeta is redundant and you could use attinmeta for conversion metadata.

The last thing - the patch needs to be rebased, it doesn't apply cleanly on top of current master.

Thanks,

Ilya Gladyshev

Re: Partial aggregates pushdown

От

Ilya Gladyshev

Дата:

02 ноября 2021 г., 00:53:54

On 01.11.2021 13:30, Alexander Pyhalov wrote:
> Peter Eisentraut писал 2021-11-01 12:47:
>> On 21.10.21 12:55, Alexander Pyhalov wrote:
>>> Now aggregates with internal states can be pushed down, if they are 
>>> marked as pushdown safe (this flag is set to true for min/max/sum),
>>> have internal states and associated converters. Converters are 
>>> called locally, they transform aggregate result to serialized 
>>> internal representation.
>>> As converters don't have access to internal aggregate state, partial 
>>> aggregates like avg() are still not pushable.
>>
>> It seems to me that the system should be able to determine from the
>> existing aggregate catalog entry whether an aggregate can be pushed
>> down.  For example, it could check aggtranstype != internal and
>> similar.  A separate boolean flag should not be necessary.
>
> Hi.
> I think we can't infer this property from existing flags. For example, 
> if I have avg() with bigint[] argtranstype, it doesn't mean we can 
> push down it. We couldn't also decide if partial aggregete is safe to 
> push down based on aggfinalfn presence (for example, it is defined for 
> sum(numeric), but we can push it down.

I think one potential way to do it would be to allow pushing down 
aggregates that EITHER have state of the same type as their return type, 
OR have a conversion function that converts their return value to the 
type of their state.

Re: Partial aggregates pushdown

От

Tomas Vondra

Дата:

02 ноября 2021 г., 00:57:48

On 11/1/21 22:31, Ilya Gladyshev wrote:
> Hi,
> 
> On 21.10.2021 13:55, Alexander Pyhalov wrote:
>> Hi. Updated patch.
>> Now aggregates with internal states can be pushed down, if they are 
>> marked as pushdown safe (this flag is set to true for min/max/sum),
>> have internal states and associated converters.
> 
> I don't quite understand why this is restricted only to aggregates that 
> have 'internal' state, I feel like that should be possible for any 
> aggregate that has a function to convert its final result back to 
> aggregate state to be pushed down. While I couldn't come up with a 
> useful example for this, except maybe for an aggregate whose aggfinalfn 
> is used purely for cosmetic purposes (e.g. format the result into a 
> string), I still feel that it is an unnecessary restriction.
> 

But it's *not* restricted to aggregates with internal state. The patch 
merely requires aggregates with "internal" state to have an extra 
"converter" function.

That being said, I don't think the approach used to deal with internal 
state is the right one. AFAICS it simply runs the aggregate on the 
remote node, finalizes is there, and then uses the converter function to 
"expand" the partial result back into the internal state.

Unfortunately that only works for aggregates like "sum" where the result 
is enough to rebuild the internal state, but it fails for anything more 
complex (like "avg" or "var").

Earlier in this thread I mentioned this to serial/deserial functions, 
and I think we need to do something like that for internal state. I.e. 
we need to call the "serial" function on the remote node, and which 
dumps the whole internal state, and then "deserial" on the local node.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Partial aggregates pushdown

От

Tomas Vondra

Дата:

02 ноября 2021 г., 02:49:08

On 11/1/21 22:53, Ilya Gladyshev wrote:
> 
> On 01.11.2021 13:30, Alexander Pyhalov wrote:
>> Peter Eisentraut писал 2021-11-01 12:47:
>>> On 21.10.21 12:55, Alexander Pyhalov wrote:
>>>> Now aggregates with internal states can be pushed down, if they are 
>>>> marked as pushdown safe (this flag is set to true for min/max/sum),
>>>> have internal states and associated converters. Converters are 
>>>> called locally, they transform aggregate result to serialized 
>>>> internal representation.
>>>> As converters don't have access to internal aggregate state, partial 
>>>> aggregates like avg() are still not pushable.
>>>
>>> It seems to me that the system should be able to determine from the
>>> existing aggregate catalog entry whether an aggregate can be pushed
>>> down.  For example, it could check aggtranstype != internal and
>>> similar.  A separate boolean flag should not be necessary.
>>
>> Hi.
>> I think we can't infer this property from existing flags. For example, 
>> if I have avg() with bigint[] argtranstype, it doesn't mean we can 
>> push down it. We couldn't also decide if partial aggregete is safe to 
>> push down based on aggfinalfn presence (for example, it is defined for 
>> sum(numeric), but we can push it down.
> 
> I think one potential way to do it would be to allow pushing down 
> aggregates that EITHER have state of the same type as their return type, 
> OR have a conversion function that converts their return value to the 
> type of their state.
> 

IMO just checking (aggtranstype == result type) entirely ignores the 
issue of portability - we've never required the aggregate state to be 
portable in any meaningful way (between architectures, minor/major 
versions, ...) and it seems foolish to just start relying on it here.

Imagine for example an aggregate using bytea state, storing some complex 
C struct in it. You can't just copy that between architectures.

It's a bit like why we don't simply copy data types to network, but pass 
them through input/output or send/receive functions. The new flag is a 
way to mark aggregates where this is safe, and I don't think we can do 
away without it.

The more I think about this, the more I'm convinced the proper way to do 
this would be adding export/import functions, similar to serial/deserial 
functions, with the extra portability guarantees. And we'd need to do 
that for all aggregates, not just those with (aggtranstype == internal).

I get it - the idea of the patch is that keeping the data types the same 
makes it much simpler to pass the aggregate state (compared to having to 
export/import it). But I'm not sure it's the right approach.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

02 ноября 2021 г., 12:12:07

Hi.

Updated and rebased patch.

Ilya Gladyshev писал 2021-11-02 00:31:
> Hi,
> On 21.10.2021 13:55, Alexander Pyhalov wrote:
> 
>> Hi. Updated patch.
>> Now aggregates with internal states can be pushed down, if they are
>> marked as pushdown safe (this flag is set to true for min/max/sum),
>> have internal states and associated converters.
> 
> I don't quite understand why this is restricted only to aggregates
> that have 'internal' state, I feel like that should be possible for
> any aggregate that has a function to convert its final result back to
> aggregate state to be pushed down. While I couldn't come up with a
> useful example for this, except maybe for an aggregate whose
> aggfinalfn is used purely for cosmetic purposes (e.g. format the
> result into a string), I still feel that it is an unnecessary
> restriction.
> 

I don't feel comfortable with it for the following reasons.
- Now partial converters translate aggregate result to serialized 
internal representation.
In case when aggregate type is different from internal state,
we'd have to translate it to non-serialized internal representation,
so converters should skip serialization step. This seems like 
introducing two
kind of converters.
- I don't see any system aggregates which would benefit from this.

However, it doesn't seem to be complex, and if it seems to be desirable,
it can be done.
For now introduced check that transtype matches aggregate type (or is 
internal)
in partial_agg_ok().


> A few minor review notes to the patch:
> 
> +static List *build_conv_list(RelOptInfo *foreignrel);
> 
> this should probably be up top among other declarations.
> 

Moved it upper.


> @@ -1433,6 +1453,48 @@ postgresGetForeignPlan(PlannerInfo *root,
>                              outer_plan);
>  }
> 
> +/*
> + * Generate attinmeta if there are some converters:
> + * they are expecxted to return BYTEA, but real input type is likely
> different.
> + */
> 
> typo in word "expecxted".

Fixed.

> 
> @@ -139,10 +147,13 @@ typedef struct PgFdwScanState
>                                   * for a foreign join scan. */
>      TupleDesc    tupdesc;        /* tuple descriptor of scan */
>      AttInMetadata *attinmeta;    /* attribute datatype conversion
> metadata */
> +    AttInMetadata *rcvd_attinmeta;    /* metadata for received
> tuples, NULL if
> +                                     * there's no converters */
> 
> Looks like rcvd_attinmeta is redundant and you could use attinmeta for
> conversion metadata.

Seems so, removed it.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Вложения

0001-Partial-aggregates-push-down-v05.patch

Re: Partial aggregates pushdown

От

Daniel Gustafsson

Дата:

03 ноября 2021 г., 16:45:38

> On 2 Nov 2021, at 10:12, Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:

> Updated and rebased patch.

+    state = (Int128AggState *) palloc0(sizeof(Int128AggState));
+    state->calcSumX2 = false;
+
+    if (!PG_ARGISNULL(0))
+    {
+#ifdef HAVE_INT128
+        do_int128_accum(state, (int128) PG_GETARG_INT64(0));
+#else
+        do_numeric_accum(state, int64_to_numeric(PG_GETARG_INT64(0)));
+#endif

This fails on non-INT128 platforms as state cannot be cast to Int128AggState
outside of HAVE_INT128; it's not defined there.  This needs to be a
PolyNumAggState no?

--
Daniel Gustafsson        https://vmware.com/

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

03 ноября 2021 г., 17:50:19

Daniel Gustafsson писал 2021-11-03 16:45:
>> On 2 Nov 2021, at 10:12, Alexander Pyhalov <a.pyhalov@postgrespro.ru> 
>> wrote:
> 
>> Updated and rebased patch.
> 
> +    state = (Int128AggState *) palloc0(sizeof(Int128AggState));
> +    state->calcSumX2 = false;
> +
> +    if (!PG_ARGISNULL(0))
> +    {
> +#ifdef HAVE_INT128
> +        do_int128_accum(state, (int128) PG_GETARG_INT64(0));
> +#else
> +        do_numeric_accum(state, int64_to_numeric(PG_GETARG_INT64(0)));
> +#endif
> 
> This fails on non-INT128 platforms as state cannot be cast to 
> Int128AggState
> outside of HAVE_INT128; it's not defined there.  This needs to be a
> PolyNumAggState no?

Hi.
Thank you for noticing this. It's indeed fails with 
pgac_cv__128bit_int=no.
Updated patch.
-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Вложения

0001-Partial-aggregates-push-down-v06.patch

Re: Partial aggregates pushdown

От

Daniel Gustafsson

Дата:

15 ноября 2021 г., 13:16:24

> On 3 Nov 2021, at 15:50, Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
>
> Daniel Gustafsson писал 2021-11-03 16:45:
>>> On 2 Nov 2021, at 10:12, Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
>>> Updated and rebased patch.
>> +    state = (Int128AggState *) palloc0(sizeof(Int128AggState));
>> +    state->calcSumX2 = false;
>> +
>> +    if (!PG_ARGISNULL(0))
>> +    {
>> +#ifdef HAVE_INT128
>> +        do_int128_accum(state, (int128) PG_GETARG_INT64(0));
>> +#else
>> +        do_numeric_accum(state, int64_to_numeric(PG_GETARG_INT64(0)));
>> +#endif
>> This fails on non-INT128 platforms as state cannot be cast to Int128AggState
>> outside of HAVE_INT128; it's not defined there.  This needs to be a
>> PolyNumAggState no?
>
> Hi.
> Thank you for noticing this. It's indeed fails with pgac_cv__128bit_int=no.
> Updated patch.

The updated patch also fails to apply now, but on the catversion.h bump.  To
avoid having to rebase for that I recommend to skip that part in the patch and
just mention the need in the thread, any committer picking this up for commit
will know to bump the catversion so there is no use in risking unneccesary
conflicts.

--
Daniel Gustafsson        https://vmware.com/

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

15 ноября 2021 г., 16:01:51

Daniel Gustafsson писал 2021-11-15 13:16:
>> On 3 Nov 2021, at 15:50, Alexander Pyhalov <a.pyhalov@postgrespro.ru> 
>> wrote:
>> 
>> Daniel Gustafsson писал 2021-11-03 16:45:
>>>> On 2 Nov 2021, at 10:12, Alexander Pyhalov 
>>>> <a.pyhalov@postgrespro.ru> wrote:
>>>> Updated and rebased patch.
>>> +    state = (Int128AggState *) palloc0(sizeof(Int128AggState));
>>> +    state->calcSumX2 = false;
>>> +
>>> +    if (!PG_ARGISNULL(0))
>>> +    {
>>> +#ifdef HAVE_INT128
>>> +        do_int128_accum(state, (int128) PG_GETARG_INT64(0));
>>> +#else
>>> +        do_numeric_accum(state, int64_to_numeric(PG_GETARG_INT64(0)));
>>> +#endif
>>> This fails on non-INT128 platforms as state cannot be cast to 
>>> Int128AggState
>>> outside of HAVE_INT128; it's not defined there.  This needs to be a
>>> PolyNumAggState no?
>> 
>> Hi.
>> Thank you for noticing this. It's indeed fails with 
>> pgac_cv__128bit_int=no.
>> Updated patch.
> 
> The updated patch also fails to apply now, but on the catversion.h 
> bump.  To
> avoid having to rebase for that I recommend to skip that part in the 
> patch and
> just mention the need in the thread, any committer picking this up for 
> commit
> will know to bump the catversion so there is no use in risking 
> unneccesary
> conflicts.

I've updated patch - removed catversion dump.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Вложения

0001-Partial-aggregates-push-down-v07.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

17 января 2022 г., 10:46:55

Julien Rouhaud писал 2022-01-14 15:16:
> Hi,
> 
> On Mon, Nov 15, 2021 at 04:01:51PM +0300, Alexander Pyhalov wrote:
>> 
>> I've updated patch - removed catversion dump.
> 
> This version of the patchset doesn't apply anymore:
> 
> http://cfbot.cputube.org/patch_36_3369.log
> === Applying patches on top of PostgreSQL commit ID
> 025b920a3d45fed441a0a58fdcdf05b321b1eead ===
> === applying patch ./0001-Partial-aggregates-push-down-v07.patch
> patching file src/bin/pg_dump/pg_dump.c
> Hunk #1 succeeded at 13111 (offset -965 lines).
> Hunk #2 FAILED at 14167.
> Hunk #3 succeeded at 13228 (offset -961 lines).
> Hunk #4 succeeded at 13319 (offset -966 lines).
> 1 out of 4 hunks FAILED -- saving rejects to file 
> src/bin/pg_dump/pg_dump.c.rej
> 
> Could you send a rebased version?  In the meantime I will switch the cf 
> entry
> to Waiting on Author.

Hi. Attaching rebased patch.
-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Вложения

0001-Partial-aggregates-push-down-v08.patch

Re: Partial aggregates pushdown

От

Zhihong Yu

Дата:

17 января 2022 г., 11:43:27

On Sun, Jan 16, 2022 at 11:47 PM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:

Julien Rouhaud писал 2022-01-14 15:16:
> Hi,
>
> On Mon, Nov 15, 2021 at 04:01:51PM +0300, Alexander Pyhalov wrote:
>>
>> I've updated patch - removed catversion dump.
>
> This version of the patchset doesn't apply anymore:
>
> http://cfbot.cputube.org/patch_36_3369.log
> === Applying patches on top of PostgreSQL commit ID
> 025b920a3d45fed441a0a58fdcdf05b321b1eead ===
> === applying patch ./0001-Partial-aggregates-push-down-v07.patch
> patching file src/bin/pg_dump/pg_dump.c
> Hunk #1 succeeded at 13111 (offset -965 lines).
> Hunk #2 FAILED at 14167.
> Hunk #3 succeeded at 13228 (offset -961 lines).
> Hunk #4 succeeded at 13319 (offset -966 lines).
> 1 out of 4 hunks FAILED -- saving rejects to file
> src/bin/pg_dump/pg_dump.c.rej
>
> Could you send a rebased version? In the meantime I will switch the cf
> entry
> to Waiting on Author.

Hi. Attaching rebased patch.
--
Best regards,
Alexander Pyhalov,
Postgres Professional

Hi,

+ FdwScanPrivateConvertors

+ * Generate attinmeta if there are some converters:

I think it would be better if converter is spelled the same way across the patch.

For build_conv_list():

+ if (IS_UPPER_REL(foreignrel))

You can return NIL for !IS_UPPER_REL(foreignrel) - this would save indentation for the body of the func.

Cheers

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

17 января 2022 г., 15:26:55

Zhihong Yu писал 2022-01-17 11:43:
> Hi,
> +   FdwScanPrivateConvertors
> 
> + * Generate attinmeta if there are some converters:
> 
> I think it would be better if converter is spelled the same way across
> the patch.
> 
> For build_conv_list():
> 
> +   if (IS_UPPER_REL(foreignrel))
> 
> You can return NIL for !IS_UPPER_REL(foreignrel) - this would save
> indentation for the body of the func.

Hi.
Updated patch.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

17 января 2022 г., 15:27:53

Alexander Pyhalov писал 2022-01-17 15:26:
> Zhihong Yu писал 2022-01-17 11:43:
>> Hi,
>> +   FdwScanPrivateConvertors
>> 
>> + * Generate attinmeta if there are some converters:
>> 
>> I think it would be better if converter is spelled the same way across
>> the patch.
>> 
>> For build_conv_list():
>> 
>> +   if (IS_UPPER_REL(foreignrel))
>> 
>> You can return NIL for !IS_UPPER_REL(foreignrel) - this would save
>> indentation for the body of the func.
> 
> Hi.
> Updated patch.

Sorry, missed attachment.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Вложения

0001-Partial-aggregates-push-down-v09.patch

Re: Partial aggregates pushdown

От

Andres Freund

Дата:

22 марта 2022 г., 03:49:14

On 2022-01-17 15:27:53 +0300, Alexander Pyhalov wrote:
> Alexander Pyhalov писал 2022-01-17 15:26:
> > Updated patch.
> 
> Sorry, missed attachment.

Needs another update: http://cfbot.cputube.org/patch_37_3369.log

Marked as waiting on author.

- Andres

Re: Partial aggregates pushdown

От

Tomas Vondra

Дата:

22 марта 2022 г., 15:28:46

On 3/22/22 01:49, Andres Freund wrote:
> On 2022-01-17 15:27:53 +0300, Alexander Pyhalov wrote:
>> Alexander Pyhalov писал 2022-01-17 15:26:
>>> Updated patch.
>>
>> Sorry, missed attachment.
> 
> Needs another update: http://cfbot.cputube.org/patch_37_3369.log
> 
> Marked as waiting on author.
> 

TBH I'm still not convinced this is the right approach. I've voiced this
opinion before, but to reiterate the main arguments:

1) It's not clear to me how could this get extended to aggregates with
more complex aggregate states, to support e.g. avg() and similar fairly
common aggregates.

2) I'm not sure relying on aggpartialpushdownsafe without any version
checks etc. is sufficient. I mean, how would we know the remote node has
the same idea of representing the aggregate state. I wonder how this
aligns with assumptions we do e.g. for functions etc.

Aside from that, there's a couple review comments:

1) should not remove the comment in foreign_expr_walker

2) comment in deparseAggref is obsolete/inaccurate

3) comment for partial_agg_ok should probably explain when we consider
aggregate OK to be pushed down

4) I'm not sure why get_rcvd_attinmeta comment talks about "return type
bytea" and "real input type".

5) Talking about "partial" aggregates is a bit confusing, because that
suggests this is related to actual "partial aggregates". But it's not.

6) Can add_foreign_grouping_paths do without the new 'partial'
parameter? Clearly, it can be deduced from extra->patype, no?

7) There's no docs for PARTIALCONVERTERFUNC / PARTIAL_PUSHDOWN_SAFE in
CREATE AGGREGATE sgml docs.

8) I don't think "serialize" in the converter functions is the right
term, considering those functions are not "serializing" anything. If
anything, it's the remote node that is serializing the agg state and the
local not is deserializing it. Or maybe I just misunderstand where are
the converter functions executed?

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

22 марта 2022 г., 19:15:10

Tomas Vondra писал 2022-03-22 15:28:
> On 3/22/22 01:49, Andres Freund wrote:
>> On 2022-01-17 15:27:53 +0300, Alexander Pyhalov wrote:
>>> Alexander Pyhalov писал 2022-01-17 15:26:
>>>> Updated patch.
>>> 
>>> Sorry, missed attachment.
>> 
>> Needs another update: http://cfbot.cputube.org/patch_37_3369.log
>> 
>> Marked as waiting on author.
>> 
> 
> TBH I'm still not convinced this is the right approach. I've voiced 
> this
> opinion before, but to reiterate the main arguments:
> 
> 1) It's not clear to me how could this get extended to aggregates with
> more complex aggregate states, to support e.g. avg() and similar fairly
> common aggregates.

Hi.
Yes, I'm also not sure how to proceed with aggregates with complex 
state.
Likely it needs separate function to export their state, but then we 
should
somehow ensure that this function exists and our 'importer' can handle 
its result.
Note that for now we have no mechanics in postgres_fdw to find out 
remote server version
on planning stage.

> 2) I'm not sure relying on aggpartialpushdownsafe without any version
> checks etc. is sufficient. I mean, how would we know the remote node 
> has
> the same idea of representing the aggregate state. I wonder how this
> aligns with assumptions we do e.g. for functions etc.

It seems to be not a problem for me, as for now we don't care about 
remote node internal aggregate state representation.
We currently get just aggregate result from remote node. For aggregates
with 'internal' stype we call converter locally, and it converts 
external result from
aggregate return type to local node internal representation.

> 
> Aside from that, there's a couple review comments:
> 
> 1) should not remove the comment in foreign_expr_walker

Fixed.

> 
> 2) comment in deparseAggref is obsolete/inaccurate

Fixed.

> 
> 3) comment for partial_agg_ok should probably explain when we consider
> aggregate OK to be pushed down

Expanded comment.
> 
> 4) I'm not sure why get_rcvd_attinmeta comment talks about "return type
> bytea" and "real input type".

Expanded comment.  Tupdesc can be retrieved from 
node->ss.ss_ScanTupleSlot,
and so we expect to see bytea (as should be produced by partial 
aggregation).
But when we scan data, we get aggregate
output type (which matches converter input type), so attinmeta should
be fixed.
If we deal with aggregate which doesn't have converter, partial_agg_ok()
ensures that agg->aggfnoid return type matches agg->aggtranstype.

> 5) Talking about "partial" aggregates is a bit confusing, because that
> suggests this is related to actual "partial aggregates". But it's not.

How should we call them? It's about pushing "Partial count()" or  
"Partial sum()" to the remote server,
why it's not related to partial aggregates? Do you mean that it's not 
about parallel aggregate processing?

> 6) Can add_foreign_grouping_paths do without the new 'partial'
> parameter? Clearly, it can be deduced from extra->patype, no?

Fixed this.

> 
> 7) There's no docs for PARTIALCONVERTERFUNC / PARTIAL_PUSHDOWN_SAFE in
> CREATE AGGREGATE sgml docs.

Added documentation. I'd appreciate advice on how it should be extended.

> 
> 8) I don't think "serialize" in the converter functions is the right
> term, considering those functions are not "serializing" anything. If
> anything, it's the remote node that is serializing the agg state and 
> the
> local not is deserializing it. Or maybe I just misunderstand where are
> the converter functions executed?

Converter function transforms aggregate result to serialized internal 
representation,
which is expected from partial aggregate. I mean, it converts aggregate
result type to internal representation and then efficiently executes
serialization code (i.e. converter(x) == serialize(to_internal(x))).

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Вложения

0001-Partial-aggregates-push-down-v10.patch

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

01 августа 2022 г., 08:55:31

Hi Mr.Vondra, Mr.Pyhalov.

I'm interesied in Mr.Pyhalov's patch due to the following background.
--Background
I develop postgresql's extension such as fdw in my work. 
I'm interested in using postgresql for OLAP. 
I think the function of a previous patch "Push aggregation down to base relations and joins"[1] is desiable. I rebased
theprevious patch and register the rebased patch on the next commitfest[2].
 
And I think it would be more useful if the previous patch works on a foreign table of postgres_fdw.
I realized the function of partial aggregation pushdown is necessary  to make the previous patch work on a foreign
tableof postgres_fdw.
 
--

So I reviewed Mr.Pyhalov's patch and discussions on this thread.
I made a draft of approach to respond to Mr.Vondra's comments.
Would you check whether my draft is right or not?

--My draft
> 1) It's not clear to me how could this get extended to aggregates with 
> more complex aggregate states, to support e.g. avg() and similar 
> fairly common aggregates.
We add a special aggregate function every aggregate function (hereafter we call this src)  which supports partial
aggregation.
The followings are differences between the src and the special aggregate function.
difference1) result type
The result type is same with the src's transtype if the src's transtype is not internal.
Otherwise the result type is bytea.

difference2) final func
The final func does not exist if the src's transtype is not internal.
Otherwize the final func returns serialized value.

For example, let me call the special aggregate function of avg(float8) avg_p(float8).
The result value of avg_p is a float8 array which consists of count and summation.
avg_p does not have finalfunc.

We pushdown the special aggregate function instead of a src.
For example, we issue "select avg_p(c) from t" instead of "select avg(c) from t"
in the above example.

We add a new column partialaggfn to pg_aggregate to get the oid of  the special aggregate function from the the src's
oid.
This column is the oid of the special aggregate function which corresponds to the src.

If an aggregate function does not have any special aggregate function,  then we does not pushdown any partial
aggregationof the aggregate function.
 

> 2) I'm not sure relying on aggpartialpushdownsafe without any version 
> checks etc. is sufficient. I mean, how would we know the remote node 
> has the same idea of representing the aggregate state. I wonder how 
> this aligns with assumptions we do e.g. for functions etc.
We add compatible server versions infomation to pg_aggregate and  the set of options of postgres_fdw's foreign server.
We check compatibility of an aggregate function using this infomation.

An additional column of pg_aggregate is compatibleversonrange.
This column is a range of postgresql server versions which  has compatible aggregate function.
An additional options of postgres_fdw's foreign server are serverversion and bwcompatibleverson.
serverversion is remote postgresql server version.
bwcompatibleverson is the maximum version in which any aggregate function is compatible with local noed's one.
Our version check passes if and only if at least one of the following conditions is true.
condition1) the option value of serverversion is in compatibleversonrange.
condition2) the local postgresql server version is between bwcompatibleverson and the option value of serverversion.

We can get the local postgresql server version from PG_VERSION_NUM macro.
We use condition1 if the local postgresql server version is not more than the remote one.
and use condition2 if the local postgresql server version is greater than the remote one.
--

Sincerely yours,
Yuuki Fujii

[1] https://commitfest.postgresql.org/32/1247/
[2] https://commitfest.postgresql.org/39/3764/

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

22 ноября 2022 г., 04:01:55

Hi Mr.Vondra, Mr.Pyhalov, Everyone.

I discussed with Mr.Pyhalov about the above draft by directly sending mail to
 him(outside of pgsql-hackers). Mr.Pyhalov allowed me to update his patch
along with the above draft. So I update Mr.Pyhalov's patch v10.

I wrote my patch for discussion.
My patch passes regression tests which contains additional basic postgres_fdw tests
for my patch's feature. But my patch doesn't contain sufficient documents and tests.
If reviewers accept my approach, I will add documents and tests to my patch.

The following is a my patch's readme.
# I simplified the above draft.

--readme of my patch
1. interface
1) pg_aggregate
There are the following additional columns.
a) partialaggfn
  data type    : regproc.
  default value: zero(means invalid).
  description  : This field refers to the special aggregate function(then we call
     this partialaggfunc)
    corresponding to aggregation function(then we call src) which has aggfnoid.
    partialaggfunc is used for partial aggregation pushdown by postgres_fdw.
    The followings are differences between the src and the special aggregate function.
      difference1) result type
        The result type is same as the src's transtype if the src's transtype
        is not internal.
        Otherwise the result type is bytea.
      difference2) final func
        The final func does not exist if the src's transtype is not internal.
        Otherwize the final func returns serialized value.
    For example, there is a partialaggfunc avg_p_int4 which corresponds to avg(int4)
    whose aggtranstype is _int4.
    The result value of avg_p_int4 is a float8 array which consists of count and
    summation. avg_p_int4 does not have finalfunc.
    For another example, there is a partialaggfunc avg_p_int8 which corresponds to
    avg(int8) whose aggtranstype is internal.
    The result value of avg_p_int8 is a bytea serialized array which consists of count
    and summation. avg_p_int8 has finalfunc int8_avg_serialize which is serialize function
    of avg(int8). This field is zero if there is no partialaggfunc.

b) partialagg_minversion
  data type    : int4.
  default value: zero(means current version).
  description  : This field is the minimum PostgreSQL server version which has
    partialaggfunc. This field is used for checking compatibility of partialaggfunc.

The above fields are valid in tuples for builtin avg, sum, min, max, count.
There are additional records which correspond to partialaggfunc for avg, sum, min, max,
count.

2) pg_proc
There are additional records which correspond to partialaggfunc for avg, sum, min, max,
count.

3) postgres_fdw
postgres_fdw has an additional foreign server option server_version. server_version is
integer value which means remote server version number. Default value of server_version
is zero. server_version is used for checking compatibility of partialaggfunc.

2. feature
postgres_fdw can pushdown partial aggregation of avg, sum, min, max, count.
Partial aggregation pushdown is fine when the following two conditions are both true.
  condition1) partialaggfn is valid.
  condition2) server_version is not less than partialagg_minversion
postgres_fdw executes pushdown the patialaggfunc instead of a src.
For example, we issue "select avg_p_int4(c) from t" instead of "select avg(c) from t"
in the above example.

postgres_fdw can pushdown every aggregate function which supports partial aggregation
if you add a partialaggfunc corresponding to the aggregate function by create aggregate
command.

3. difference between my patch and Mr.Pyhalov's v10 patch.
1) In my patch postgres_fdw can pushdown partial aggregation of avg
2) In my patch postgres_fdw can pushdown every aggregate function which supports partial
  aggregation if you add a partialaggfunc corresponding to the aggregate function.

4. sample commands in psql
\c postgres
drop database tmp;
create database tmp;
\c tmp
create extension postgres_fdw;
create server server_01 foreign data wrapper postgres_fdw options(host 'localhost', dbname 'tmp', server_version
'160000',async_capable 'true'); 
create user mapping for postgres server server_01 options(user 'postgres', password 'postgres');
create server server_02 foreign data wrapper postgres_fdw options(host 'localhost', dbname 'tmp', server_version
'160000',async_capable 'true'); 
create user mapping for postgres server server_02 options(user 'postgres', password 'postgres');

create table t(dt timestamp, id int4, name text, total int4, val float4, type int4, span interval) partition by list
(type);

create table t1(dt timestamp, id int4, name text, total int4, val float4, type int4, span interval);
create table t2(dt timestamp, id int4, name text, total int4, val float4, type int4, span interval);

truncate table t1;
truncate table t2;
insert into t1 select timestamp'2020-01-01' + cast(t || ' seconds' as interval), t % 100, 'hoge' || t, 1, 1.1, 1,
cast('1seconds' as interval) from generate_series(1, 100000, 1) t; 
insert into t1 select timestamp'2020-01-01' + cast(t || ' seconds' as interval), t % 100, 'hoge' || t, 2, 2.1, 1,
cast('2seconds' as interval) from generate_series(1, 100000, 1) t; 
insert into t2 select timestamp'2020-01-01' + cast(t || ' seconds' as interval), t % 100, 'hoge' || t, 1, 1.1, 2,
cast('1seconds' as interval) from generate_series(1, 100000, 1) t; 
insert into t2 select timestamp'2020-01-01' + cast(t || ' seconds' as interval), t % 100, 'hoge' || t, 2, 2.1, 2,
cast('2seconds' as interval) from generate_series(1, 100000, 1) t; 

create foreign table f_t1 partition of t for values in (1) server server_01 options(table_name 't1');
create foreign table f_t2 partition of t for values in (2) server server_02 options(table_name 't2');

set enable_partitionwise_aggregate = on;
explain (verbose, costs off) select avg(total::int4), avg(total::int8) from t;
select avg(total::int4), avg(total::int8) from t;

Sincerely yours,
Yuuki Fujii
--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v11.patch

Re: Partial aggregates pushdown

От

Ted Yu

Дата:

22 ноября 2022 г., 08:00:01

On Mon, Nov 21, 2022 at 5:02 PM Fujii.Yuki@df.MitsubishiElectric.co.jp <Fujii.Yuki@df.mitsubishielectric.co.jp> wrote:

Hi Mr.Vondra, Mr.Pyhalov, Everyone.

I discussed with Mr.Pyhalov about the above draft by directly sending mail to
him(outside of pgsql-hackers). Mr.Pyhalov allowed me to update his patch
along with the above draft. So I update Mr.Pyhalov's patch v10.

I wrote my patch for discussion.
My patch passes regression tests which contains additional basic postgres_fdw tests
for my patch's feature. But my patch doesn't contain sufficient documents and tests.
If reviewers accept my approach, I will add documents and tests to my patch.

The following is a my patch's readme.
# I simplified the above draft.

--readme of my patch
1. interface
1) pg_aggregate
There are the following additional columns.
a) partialaggfn
data type : regproc.
default value: zero(means invalid).
description : This field refers to the special aggregate function(then we call
this partialaggfunc)
corresponding to aggregation function(then we call src) which has aggfnoid.
partialaggfunc is used for partial aggregation pushdown by postgres_fdw.
The followings are differences between the src and the special aggregate function.
difference1) result type
The result type is same as the src's transtype if the src's transtype
is not internal.
Otherwise the result type is bytea.
difference2) final func
The final func does not exist if the src's transtype is not internal.
Otherwize the final func returns serialized value.
For example, there is a partialaggfunc avg_p_int4 which corresponds to avg(int4)
whose aggtranstype is _int4.
The result value of avg_p_int4 is a float8 array which consists of count and
summation. avg_p_int4 does not have finalfunc.
For another example, there is a partialaggfunc avg_p_int8 which corresponds to
avg(int8) whose aggtranstype is internal.
The result value of avg_p_int8 is a bytea serialized array which consists of count
and summation. avg_p_int8 has finalfunc int8_avg_serialize which is serialize function
of avg(int8). This field is zero if there is no partialaggfunc.

b) partialagg_minversion
data type : int4.
default value: zero(means current version).
description : This field is the minimum PostgreSQL server version which has
partialaggfunc. This field is used for checking compatibility of partialaggfunc.

The above fields are valid in tuples for builtin avg, sum, min, max, count.
There are additional records which correspond to partialaggfunc for avg, sum, min, max,
count.

2) pg_proc
There are additional records which correspond to partialaggfunc for avg, sum, min, max,
count.

3) postgres_fdw
postgres_fdw has an additional foreign server option server_version. server_version is
integer value which means remote server version number. Default value of server_version
is zero. server_version is used for checking compatibility of partialaggfunc.

2. feature
postgres_fdw can pushdown partial aggregation of avg, sum, min, max, count.
Partial aggregation pushdown is fine when the following two conditions are both true.
condition1) partialaggfn is valid.
condition2) server_version is not less than partialagg_minversion
postgres_fdw executes pushdown the patialaggfunc instead of a src.
For example, we issue "select avg_p_int4(c) from t" instead of "select avg(c) from t"
in the above example.

postgres_fdw can pushdown every aggregate function which supports partial aggregation
if you add a partialaggfunc corresponding to the aggregate function by create aggregate
command.

3. difference between my patch and Mr.Pyhalov's v10 patch.
1) In my patch postgres_fdw can pushdown partial aggregation of avg
2) In my patch postgres_fdw can pushdown every aggregate function which supports partial
aggregation if you add a partialaggfunc corresponding to the aggregate function.

4. sample commands in psql
\c postgres
drop database tmp;
create database tmp;
\c tmp
create extension postgres_fdw;
create server server_01 foreign data wrapper postgres_fdw options(host 'localhost', dbname 'tmp', server_version '160000', async_capable 'true');
create user mapping for postgres server server_01 options(user 'postgres', password 'postgres');
create server server_02 foreign data wrapper postgres_fdw options(host 'localhost', dbname 'tmp', server_version '160000', async_capable 'true');
create user mapping for postgres server server_02 options(user 'postgres', password 'postgres');

create table t(dt timestamp, id int4, name text, total int4, val float4, type int4, span interval) partition by list (type);

create table t1(dt timestamp, id int4, name text, total int4, val float4, type int4, span interval);
create table t2(dt timestamp, id int4, name text, total int4, val float4, type int4, span interval);

truncate table t1;
truncate table t2;
insert into t1 select timestamp'2020-01-01' + cast(t || ' seconds' as interval), t % 100, 'hoge' || t, 1, 1.1, 1, cast('1 seconds' as interval) from generate_series(1, 100000, 1) t;
insert into t1 select timestamp'2020-01-01' + cast(t || ' seconds' as interval), t % 100, 'hoge' || t, 2, 2.1, 1, cast('2 seconds' as interval) from generate_series(1, 100000, 1) t;
insert into t2 select timestamp'2020-01-01' + cast(t || ' seconds' as interval), t % 100, 'hoge' || t, 1, 1.1, 2, cast('1 seconds' as interval) from generate_series(1, 100000, 1) t;
insert into t2 select timestamp'2020-01-01' + cast(t || ' seconds' as interval), t % 100, 'hoge' || t, 2, 2.1, 2, cast('2 seconds' as interval) from generate_series(1, 100000, 1) t;

create foreign table f_t1 partition of t for values in (1) server server_01 options(table_name 't1');
create foreign table f_t2 partition of t for values in (2) server server_02 options(table_name 't2');

set enable_partitionwise_aggregate = on;
explain (verbose, costs off) select avg(total::int4), avg(total::int8) from t;
select avg(total::int4), avg(total::int8) from t;

Sincerely yours,
Yuuki Fujii
--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Hi,

For partial_agg_compatible :

+ * Check that partial aggregate agg has compatibility

If the `agg` refers to func parameter, the parameter name is aggform

+ int32 partialagg_minversion = PG_VERSION_NUM;
+ if (aggform->partialagg_minversion == PARTIALAGG_MINVERSION_DEFAULT) {
+ partialagg_minversion = PG_VERSION_NUM;

I am curious why the same variable is assigned the same value twice. It seems the if block is redundant.

+ if ((fpinfo->server_version >= partialagg_minversion)) {
+ compatible = true;

The above can be simplified as: return fpinfo->server_version >= partialagg_minversion;

Cheers

RE: [CAUTION!! freemail] Re: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

22 ноября 2022 г., 12:11:13

Hi Mr.Yu.

Thank you for comments.

> + * Check that partial aggregate agg has compatibility
> 
> If the `agg` refers to func parameter, the parameter name is aggform
I fixed the above typo and made the above comment easy to understand
New comment is "Check that partial aggregate function of aggform exsits in remote"

> +       int32  partialagg_minversion = PG_VERSION_NUM;
> +       if (aggform->partialagg_minversion ==
> PARTIALAGG_MINVERSION_DEFAULT) {
> +               partialagg_minversion = PG_VERSION_NUM;
> 
> 
> I am curious why the same variable is assigned the same value twice. It seems
> the if block is redundant.
> 
> +       if ((fpinfo->server_version >= partialagg_minversion)) {
> +               compatible = true;
> 
> 
> The above can be simplified as: return fpinfo->server_version >=
> partialagg_minversion;
I fixed according to your comment.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

> -----Original Message-----
> From: Ted Yu <yuzhihong@gmail.com>
> Sent: Tuesday, November 22, 2022 2:00 PM
> To: Fujii Yuki/藤井 雄規(MELCO/情報総研 ＤＭ最適Ｇ)
> <Fujii.Yuki@df.MitsubishiElectric.co.jp>
> Cc: Alexander Pyhalov <a.pyhalov@postgrespro.ru>; Tomas Vondra
> <tomas.vondra@enterprisedb.com>; PostgreSQL-development
> <pgsql-hackers@postgresql.org>; Andres Freund <andres@anarazel.de>;
> Zhihong Yu <zyu@yugabyte.com>; Julien Rouhaud <rjuju123@gmail.com>;
> Daniel Gustafsson <daniel@yesql.se>; Ilya Gladyshev
> <i.gladyshev@postgrespro.ru>
> Subject: [CAUTION!! freemail] Re: Partial aggregates pushdown
> 
> 
> 
> On Mon, Nov 21, 2022 at 5:02 PM Fujii.Yuki@df.MitsubishiElectric.co.jp
> <mailto:Fujii.Yuki@df.MitsubishiElectric.co.jp>
> <Fujii.Yuki@df.mitsubishielectric.co.jp
> <mailto:Fujii.Yuki@df.mitsubishielectric.co.jp> > wrote:
> 
> 
>     Hi Mr.Vondra, Mr.Pyhalov, Everyone.
> 
>     I discussed with Mr.Pyhalov about the above draft by directly sending
> mail to
>      him(outside of pgsql-hackers). Mr.Pyhalov allowed me to update his
> patch
>     along with the above draft. So I update Mr.Pyhalov's patch v10.
> 
>     I wrote my patch for discussion.
>     My patch passes regression tests which contains additional basic
> postgres_fdw tests
>     for my patch's feature. But my patch doesn't contain sufficient
> documents and tests.
>     If reviewers accept my approach, I will add documents and tests to my
> patch.
> 
>     The following is a my patch's readme.
>     # I simplified the above draft.
> 
>     --readme of my patch
>     1. interface
>     1) pg_aggregate
>     There are the following additional columns.
>     a) partialaggfn
>       data type    : regproc.
>       default value: zero(means invalid).
>       description  : This field refers to the special aggregate
> function(then we call
>          this partialaggfunc)
>         corresponding to aggregation function(then we call src) which has
> aggfnoid.
>         partialaggfunc is used for partial aggregation pushdown by
> postgres_fdw.
>         The followings are differences between the src and the special
> aggregate function.
>           difference1) result type
>             The result type is same as the src's transtype if the src's
> transtype
>             is not internal.
>             Otherwise the result type is bytea.
>           difference2) final func
>             The final func does not exist if the src's transtype is not
> internal.
>             Otherwize the final func returns serialized value.
>         For example, there is a partialaggfunc avg_p_int4 which
> corresponds to avg(int4)
>         whose aggtranstype is _int4.
>         The result value of avg_p_int4 is a float8 array which consists of
> count and
>         summation. avg_p_int4 does not have finalfunc.
>         For another example, there is a partialaggfunc avg_p_int8 which
> corresponds to
>         avg(int8) whose aggtranstype is internal.
>         The result value of avg_p_int8 is a bytea serialized array which
> consists of count
>         and summation. avg_p_int8 has finalfunc int8_avg_serialize
> which is serialize function
>         of avg(int8). This field is zero if there is no partialaggfunc.
> 
>     b) partialagg_minversion
>       data type    : int4.
>       default value: zero(means current version).
>       description  : This field is the minimum PostgreSQL server version
> which has
>         partialaggfunc. This field is used for checking compatibility of
> partialaggfunc.
> 
>     The above fields are valid in tuples for builtin avg, sum, min, max,
> count.
>     There are additional records which correspond to partialaggfunc for
> avg, sum, min, max,
>     count.
> 
>     2) pg_proc
>     There are additional records which correspond to partialaggfunc for
> avg, sum, min, max,
>     count.
> 
>     3) postgres_fdw
>     postgres_fdw has an additional foreign server option server_version.
> server_version is
>     integer value which means remote server version number. Default
> value of server_version
>     is zero. server_version is used for checking compatibility of
> partialaggfunc.
> 
>     2. feature
>     postgres_fdw can pushdown partial aggregation of avg, sum, min, max,
> count.
>     Partial aggregation pushdown is fine when the following two
> conditions are both true.
>       condition1) partialaggfn is valid.
>       condition2) server_version is not less than partialagg_minversion
>     postgres_fdw executes pushdown the patialaggfunc instead of a src.
>     For example, we issue "select avg_p_int4(c) from t" instead of "select
> avg(c) from t"
>     in the above example.
> 
>     postgres_fdw can pushdown every aggregate function which supports
> partial aggregation
>     if you add a partialaggfunc corresponding to the aggregate function by
> create aggregate
>     command.
> 
>     3. difference between my patch and Mr.Pyhalov's v10 patch.
>     1) In my patch postgres_fdw can pushdown partial aggregation of avg
>     2) In my patch postgres_fdw can pushdown every aggregate function
> which supports partial
>       aggregation if you add a partialaggfunc corresponding to the
> aggregate function.
> 
>     4. sample commands in psql
>     \c postgres
>     drop database tmp;
>     create database tmp;
>     \c tmp
>     create extension postgres_fdw;
>     create server server_01 foreign data wrapper postgres_fdw
> options(host 'localhost', dbname 'tmp', server_version '160000', async_capable
> 'true');
>     create user mapping for postgres server server_01 options(user
> 'postgres', password 'postgres');
>     create server server_02 foreign data wrapper postgres_fdw
> options(host 'localhost', dbname 'tmp', server_version '160000', async_capable
> 'true');
>     create user mapping for postgres server server_02 options(user
> 'postgres', password 'postgres');
> 
>     create table t(dt timestamp, id int4, name text, total int4, val float4, type
> int4, span interval) partition by list (type);
> 
>     create table t1(dt timestamp, id int4, name text, total int4, val float4,
> type int4, span interval);
>     create table t2(dt timestamp, id int4, name text, total int4, val float4,
> type int4, span interval);
> 
>     truncate table t1;
>     truncate table t2;
>     insert into t1 select timestamp'2020-01-01' + cast(t || ' seconds' as
> interval), t % 100, 'hoge' || t, 1, 1.1, 1, cast('1 seconds' as interval) from
> generate_series(1, 100000, 1) t;
>     insert into t1 select timestamp'2020-01-01' + cast(t || ' seconds' as
> interval), t % 100, 'hoge' || t, 2, 2.1, 1, cast('2 seconds' as interval) from
> generate_series(1, 100000, 1) t;
>     insert into t2 select timestamp'2020-01-01' + cast(t || ' seconds' as
> interval), t % 100, 'hoge' || t, 1, 1.1, 2, cast('1 seconds' as interval) from
> generate_series(1, 100000, 1) t;
>     insert into t2 select timestamp'2020-01-01' + cast(t || ' seconds' as
> interval), t % 100, 'hoge' || t, 2, 2.1, 2, cast('2 seconds' as interval) from
> generate_series(1, 100000, 1) t;
> 
>     create foreign table f_t1 partition of t for values in (1) server server_01
> options(table_name 't1');
>     create foreign table f_t2 partition of t for values in (2) server server_02
> options(table_name 't2');
> 
>     set enable_partitionwise_aggregate = on;
>     explain (verbose, costs off) select avg(total::int4), avg(total::int8) from
> t;
>     select avg(total::int4), avg(total::int8) from t;
> 
>     Sincerely yours,
>     Yuuki Fujii
>     --
>     Yuuki Fujii
>     Information Technology R&D Center Mitsubishi Electric Corporation
> 
> 
> 
> Hi,
> For partial_agg_compatible :
> 
> + * Check that partial aggregate agg has compatibility
> 
> If the `agg` refers to func parameter, the parameter name is aggform
> 
> +       int32  partialagg_minversion = PG_VERSION_NUM;
> +       if (aggform->partialagg_minversion ==
> PARTIALAGG_MINVERSION_DEFAULT) {
> +               partialagg_minversion = PG_VERSION_NUM;
> 
> 
> I am curious why the same variable is assigned the same value twice. It seems
> the if block is redundant.
> 
> +       if ((fpinfo->server_version >= partialagg_minversion)) {
> +               compatible = true;
> 
> 
> The above can be simplified as: return fpinfo->server_version >=
> partialagg_minversion;
> 
> Cheers

Вложения

0001-Partial-aggregates-push-down-v12.patch

Re: [CAUTION!! freemail] Re: Partial aggregates pushdown

От

Ted Yu

Дата:

22 ноября 2022 г., 13:51:44

On Tue, Nov 22, 2022 at 1:11 AM Fujii.Yuki@df.MitsubishiElectric.co.jp <Fujii.Yuki@df.mitsubishielectric.co.jp> wrote:

Hi Mr.Yu.

Thank you for comments.

> + * Check that partial aggregate agg has compatibility
>
> If the `agg` refers to func parameter, the parameter name is aggform
I fixed the above typo and made the above comment easy to understand
New comment is "Check that partial aggregate function of aggform exsits in remote"

> + int32 partialagg_minversion = PG_VERSION_NUM;
> + if (aggform->partialagg_minversion ==
> PARTIALAGG_MINVERSION_DEFAULT) {
> + partialagg_minversion = PG_VERSION_NUM;
>
>
> I am curious why the same variable is assigned the same value twice. It seems
> the if block is redundant.
>
> + if ((fpinfo->server_version >= partialagg_minversion)) {
> + compatible = true;
>
>
> The above can be simplified as: return fpinfo->server_version >=
> partialagg_minversion;
I fixed according to your comment.

Sincerely yours,
Yuuki Fujii

Hi,

Thanks for the quick response.

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

22 ноября 2022 г., 19:05:29

Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2022-11-22 04:01:
> Hi Mr.Vondra, Mr.Pyhalov, Everyone.
> 
> I discussed with Mr.Pyhalov about the above draft by directly sending 
> mail to
>  him(outside of pgsql-hackers). Mr.Pyhalov allowed me to update his 
> patch
> along with the above draft. So I update Mr.Pyhalov's patch v10.
> 

Hi, Yuki. Thank you for your work on this.

I've looked through the patch. Overall I like this approach, but have 
the following comments.

1) Why should we require partialaggfn for min()/max()/count()? We could 
just use original functions for a lot of aggregates, and so it would be 
possible to push down some partial aggregates to older servers. I'm not 
sure that it's a strict requirement, but a nice thing to think about. 
Can we use the function itself as partialaggfn, for example, for 
sum(int4)? For functions with internal aggtranstype (like sum(int8) it 
would be more difficult).

2) fpinfo->server_version is not aggregated, for example, when we form 
fpinfo in foreign_join_ok(), it seems we should spread it in more places 
in postgres_fdw.c.

3) In add_foreign_grouping_paths() it seems there's no need for 
additional argument, we can look at extra->patype. Also Assert() in 
add_foreign_grouping_paths() will fire in --enable-cassert build.

4) Why do you modify lookup_agg_function() signature? I don't see tests, 
showing that it's neccessary. Perhaps, more precise function naming 
should be used instead?

5) In tests:
          - Why version_num does have "name" type in 
f_alter_server_version() function?
          - You modify server_version option of 'loopback' server, but 
don't reset it after test. This could affect further tests.
          - "It's unsafe to push down partial aggregates with distinct" 
in postgres_fdw.sql:3002 seems to be misleading.
3001
3002 -- It's unsafe to push down partial aggregates with distinct
3003 SELECT f_alter_server_version('loopback', 'set', -1);
3004 EXPLAIN (VERBOSE, COSTS OFF)
3005 SELECT avg(d) FROM pagg_tab;
3006 SELECT avg(d) FROM pagg_tab;
3007 select * from pg_foreign_server;

6) While looking at it, could cause a crash with something like

CREATE TYPE COMPLEX AS (re FLOAT, im FLOAT);

CREATE OR REPLACE FUNCTION
sum_complex (sum complex, el complex)
RETURNS complex AS
$$
DECLARE
s complex;
BEGIN
if el is not null and sum is not null then
sum.re:=coalesce(sum.re,0)+el.re;
sum.im:=coalesce(sum.im,0)+el.im;
end if;
RETURN sum;
END;
$$ LANGUAGE plpgSQL;

CREATE AGGREGATE SUM(COMPLEX) (
SFUNC=sum_complex,
STYPE=complex,
partialaggfunc=aaaa,
partialagg_minversion=1400
);

where aaaa - something nonexisting


enforce_generic_type_consistency (actual_arg_types=0x56269873d200, 
declared_arg_types=0x0, nargs=1, rettype=0, allow_poly=true) at 
parse_coerce.c:2132
2132                    Oid                     decl_type = 
declared_arg_types[j];
(gdb) bt
#0  enforce_generic_type_consistency (actual_arg_types=0x56269873d200, 
declared_arg_types=0x0, nargs=1, rettype=0, allow_poly=true) at 
parse_coerce.c:2132
#1  0x00005626960072de in lookup_agg_function (fnName=0x5626986715a0, 
nargs=1, input_types=0x56269873d200, variadicArgType=0, 
rettype=0x7ffd1a4045d8, only_normal=false) at pg_aggregate.c:916
#2  0x00005626960064ba in AggregateCreate (aggName=0x562698671000 "sum", 
aggNamespace=2200, replace=false, aggKind=110 'n', numArgs=1, 
numDirectArgs=0, parameterTypes=0x56269873d1e8, allParameterTypes=0, 
parameterModes=0,
     parameterNames=0, parameterDefaults=0x0, variadicArgType=0, 
aggtransfnName=0x5626986712c0, aggfinalfnName=0x0, aggcombinefnName=0x0, 
aggserialfnName=0x0, aggdeserialfnName=0x0, aggmtransfnName=0x0, 
aggminvtransfnName=0x0,
     aggmfinalfnName=0x0, partialaggfnName=0x5626986715a0, 
finalfnExtraArgs=false, mfinalfnExtraArgs=false, finalfnModify=114 'r', 
mfinalfnModify=114 'r', aggsortopName=0x0, aggTransType=16390, 
aggTransSpace=0, aggmTransType=0,
     aggmTransSpace=0, partialaggMinversion=1400, agginitval=0x0, 
aggminitval=0x0, proparallel=117 'u') at pg_aggregate.c:582
#3  0x00005626960a1e1c in DefineAggregate (pstate=0x56269869ab48, 
name=0x562698671038, args=0x5626986711b0, oldstyle=false, 
parameters=0x5626986713b0, replace=false) at aggregatecmds.c:450
#4  0x000056269643061f in ProcessUtilitySlow (pstate=0x56269869ab48, 
pstmt=0x562698671a68,
     queryString=0x5626986705d8 "CREATE AGGREGATE SUM(COMPLEX) 
(\nSFUNC=sum_complex,\nSTYPE=COMPLEX,\npartialaggfunc=scomplex,\npartialagg_minversion=1400\n);", 
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
     dest=0x562698671b48, qc=0x7ffd1a4053c0) at utility.c:1407
#5  0x000056269642fbb4 in standard_ProcessUtility (pstmt=0x562698671a68, 
queryString=0x5626986705d8 "CREATE AGGREGATE SUM(COMPLEX) 
(\nSFUNC=sum_complex,\nSTYPE=COMPLEX,\npartialaggfunc=scomplex,\npartialagg_minversion=1400\n);",
     readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, 
queryEnv=0x0, dest=0x562698671b48, qc=0x7ffd1a4053c0) at utility.c:1074


Later will look at it again.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

30 ноября 2022 г., 06:10:22

Hi Mr.Pyhalov.

Thank you for comments.

> I've looked through the patch. Overall I like this approach, but have
> the following comments.
> 
> 1) Why should we require partialaggfn for min()/max()/count()? We could
> just use original functions for a lot of aggregates, and so it would be
> possible to push down some partial aggregates to older servers. I'm not
> sure that it's a strict requirement, but a nice thing to think about.
> Can we use the function itself as partialaggfn, for example, for
> sum(int4)?
> For functions with internal aggtranstype (like sum(int8) it
> would be more difficult).
Thank you. I realized that partial aggregate pushdown is fine 
without partialaggfn if original function has no aggfinalfn and 
aggtranstype of it is not internal. So I have improved v12 by
this realization.
However, v13 requires partialaggfn for aggregate if it has aggfinalfn or 
aggtranstype of it is internal such as sum(int8). 

> 2) fpinfo->server_version is not aggregated, for example, when we form
> fpinfo in foreign_join_ok(), it seems we should spread it in more places
> in postgres_fdw.c.
I have responded to your comment by adding copy of server_version in 
merge_fdw_options.
 
> 3) In add_foreign_grouping_paths() it seems there's no need for
> additional argument, we can look at extra->patype. Also Assert() in
> add_foreign_grouping_paths() will fire in --enable-cassert build.
I have fixed according to your comment.

> 4) Why do you modify lookup_agg_function() signature? I don't see tests,
> showing that it's neccessary. Perhaps, more precise function naming
> should be used instead?
I realized that there is no need of modification lookup_agg_function().
Instead, I use LookupFuncName().

> 5) In tests:
>           - Why version_num does have "name" type in
> f_alter_server_version() function?
>           - You modify server_version option of 'loopback' server, but
> don't reset it after test. This could affect further tests.
>           - "It's unsafe to push down partial aggregates with distinct"
> in postgres_fdw.sql:3002 seems to be misleading.
> 3001
> 3002 -- It's unsafe to push down partial aggregates with distinct
> 3003 SELECT f_alter_server_version('loopback', 'set', -1);
I have fixed according to your comment.

> 6) While looking at it, could cause a crash with something like
I have fixed this problem by using LookupFuncName() instead of lookup_agg_function.

The following is readme of v13.
--readme of Partial aggregates push down v13
1. interface
1) pg_aggregate
There are the following additional columns.
a) partialaggfn
  data type    : regproc.
  default value: zero(means invalid).
  description  : This field refers to the special aggregate function(then we call
     this partialaggfunc)
    corresponding to aggregation function(then we call src) which has aggfnoid.
    partialaggfunc is used for partial aggregation pushdown by postgres_fdw.
    The followings are differences between the src and the special aggregate function.
      difference1) result type
        The result type is same as the src's transtype if the src's transtype
        is not internal.
        Otherwise the result type is bytea.
      difference2) final func
        The final func does not exist if the src's transtype is not internal.
        Otherwize the final func returns serialized value.
    For example, there is a partialaggfunc avg_p_int4 which corresponds to avg(int4)
    whose aggtranstype is _int4.
    The result value of avg_p_int4 is a float8 array which consists of count and 
    summation. avg_p_int4 does not have finalfunc.
    For another example, there is a partialaggfunc avg_p_int8 which corresponds to 
    avg(int8) whose aggtranstype is internal.
    The result value of avg_p_int8 is a bytea serialized array which consists of count 
    and summation. avg_p_int8 has finalfunc int8_avg_serialize which is serialize function
    of avg(int8). This field is zero if there is no partialaggfunc.

b) partialagg_minversion
  data type    : int4.
  default value: zero(means current version).
  description  : This field is the minimum PostgreSQL server version which has 
    partialaggfunc. This field is used for checking compatibility of partialaggfunc.

The above fields are valid in tuples for builtin avg, sum, min, max, count.
There are additional records which correspond to partialaggfunc for avg, sum, min, max, count.

2) pg_proc
There are additional records which correspond to partialaggfunc for avg, sum, min, max, count.

3) postgres_fdw
postgres_fdw has an additional foreign server option server_version. server_version is 
integer value which means remote server version number. Default value of server_version 
is zero. server_version is used for checking compatibility of partialaggfunc.

2. feature
Partial aggregation pushdown is fine when either of the following conditions is true.
  condition1) aggregate function has not internal aggtranstype and has no aggfinalfn.
  condition2) the following two conditions are both true.
    condition2-1) partialaggfn is valid.
    condition2-2) server_version is not less than partialagg_minversion postgres_fdw executes 
      pushdown the patialaggfunc instead of a src.
postgres_fdw can pushdown partial aggregation of aggregate function which has internal 
aggtranstype or has aggfinalfn if the function is one of avg, sum(int8), sum(numeric).

For example, we issue "select avg_p_int4(c) from t" instead of "select avg(c) from t"
in the above example.

postgres_fdw can pushdown every aggregate function which supports partial aggregation 
if you add a partialaggfunc corresponding to the aggregate function by create aggregate command.

3. sample commands in psql
\c postgres
drop database tmp;
create database tmp;
\c tmp
create extension postgres_fdw;
create server server_01 foreign data wrapper postgres_fdw options(host 'localhost', dbname 'tmp', server_version
'160000',async_capable 'true'); 
 
create user mapping for postgres server server_01 options(user 'postgres', password 'postgres'); 
create server server_02 foreign data wrapper postgres_fdw options(host 'localhost', dbname 'tmp', server_version
'160000',async_capable 'true'); 
 
create user mapping for postgres server server_02 options(user 'postgres', password 'postgres');

create table t(dt timestamp, id int4, name text, total int4, val float4, type int4, span interval) partition by list
(type);

create table t1(dt timestamp, id int4, name text, total int4, val float4, type int4, span interval); 
create table t2(dt timestamp, id int4, name text, total int4, val float4, type int4, span interval);

truncate table t1;
truncate table t2;
insert into t1 select timestamp'2020-01-01' + cast(t || ' seconds' as interval), t % 100, 'hoge' || t, 1, 1.1, 1,
cast('1seconds' as interval) from generate_series(1, 100000, 1) t; 
 
insert into t1 select timestamp'2020-01-01' + cast(t || ' seconds' as interval), t % 100, 'hoge' || t, 2, 2.1, 1,
cast('2seconds' as interval) from generate_series(1, 100000, 1) t; 
 
insert into t2 select timestamp'2020-01-01' + cast(t || ' seconds' as interval), t % 100, 'hoge' || t, 1, 1.1, 2,
cast('1seconds' as interval) from generate_series(1, 100000, 1) t; 
 
insert into t2 select timestamp'2020-01-01' + cast(t || ' seconds' as interval), t % 100, 'hoge' || t, 2, 2.1, 2,
cast('2seconds' as interval) from generate_series(1, 100000, 1) t;
 

create foreign table f_t1 partition of t for values in (1) server server_01 options(table_name 't1'); 
create foreign table f_t2 partition of t for values in (2) server server_02 options(table_name 't2');

set enable_partitionwise_aggregate = on; 
explain (verbose, costs off) select avg(total::int4), avg(total::int8) from t; 
select avg(total::int4), avg(total::int8) from t;
--

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v13.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

30 ноября 2022 г., 11:12:24

Hi, Yuki.

1) In previous version of the patch aggregates, which had partialaggfn, 
were ok to push down. And it was a definite sign that aggregate can be 
pushed down. Now we allow pushing down an aggregate, which prorettype is 
not internal and aggfinalfn is not defined. Is it safe for all 
user-defined (or builtin) aggregates, even if they are generally 
shippable? Aggcombinefn is executed locally and we check that aggregate 
function itself is shippable. Is it enough? Perhaps, we could use 
partialagg_minversion (like aggregates with partialagg_minversion == -1 
should not be pushed down) or introduce separate explicit flag?

2) Do we really have to look at pg_proc in partial_agg_ok() and 
deparseAggref()? Perhaps, looking at aggtranstype is enough?

3) I'm not sure if CREATE AGGREGATE tests with invalid 
PARTIALAGGFUNC/PARTIALAGG_MINVERSION should be in postgres_fdw tests or 
better should be moved to src/test/regress/sql/create_aggregate.sql, as 
they are not specific to postgres_fdw

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

30 ноября 2022 г., 13:01:41

Hi Mr.Pyhalov.

> 1) In previous version of the patch aggregates, which had partialaggfn, were ok
> to push down. And it was a definite sign that aggregate can be pushed down.
> Now we allow pushing down an aggregate, which prorettype is not internal and
> aggfinalfn is not defined. Is it safe for all user-defined (or builtin) aggregates,
> even if they are generally shippable? Aggcombinefn is executed locally and we
> check that aggregate function itself is shippable. Is it enough? Perhaps, we
> could use partialagg_minversion (like aggregates with partialagg_minversion
> == -1 should not be pushed down) or introduce separate explicit flag?
In what case partial aggregate pushdown is unsafe for aggregate which has not internal aggtranstype
 and has no aggfinalfn?
By reading [1], I believe that if aggcombinefn of such aggregate recieves return values of original
 aggregate functions in each remote then it must produce same value that would have resulted
 from scanning all the input in a single operation.

> 2) Do we really have to look at pg_proc in partial_agg_ok() and
> deparseAggref()? Perhaps, looking at aggtranstype is enough?
You are right. I fixed according to your comment.

> 3) I'm not sure if CREATE AGGREGATE tests with invalid
> PARTIALAGGFUNC/PARTIALAGG_MINVERSION should be in postgres_fdw
> tests or better should be moved to src/test/regress/sql/create_aggregate.sql,
> as they are not specific to postgres_fdw
Thank you. I moved these tests to src/test/regress/sql/create_aggregate.sql.

[1] https://www.postgresql.org/docs/15/xaggr.html#XAGGR-PARTIAL-AGGREGATES

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v14.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

30 ноября 2022 г., 16:30:24

Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2022-11-30 13:01:
> Hi Mr.Pyhalov.
> 
>> 1) In previous version of the patch aggregates, which had 
>> partialaggfn, were ok
>> to push down. And it was a definite sign that aggregate can be pushed 
>> down.
>> Now we allow pushing down an aggregate, which prorettype is not 
>> internal and
>> aggfinalfn is not defined. Is it safe for all user-defined (or 
>> builtin) aggregates,
>> even if they are generally shippable? Aggcombinefn is executed locally 
>> and we
>> check that aggregate function itself is shippable. Is it enough? 
>> Perhaps, we
>> could use partialagg_minversion (like aggregates with 
>> partialagg_minversion
>> == -1 should not be pushed down) or introduce separate explicit flag?
> In what case partial aggregate pushdown is unsafe for aggregate which
> has not internal aggtranstype
>  and has no aggfinalfn?
> By reading [1], I believe that if aggcombinefn of such aggregate
> recieves return values of original
>  aggregate functions in each remote then it must produce same value
> that would have resulted
>  from scanning all the input in a single operation.
> 

One more issue I started to think about - now we don't check 
partialagg_minversion for "simple" aggregates at all. Is it correct? It 
seems that , for example, we could try to pushdown bit_or(int8) to old 
servers, but it didn't exist, for example, in 8.4.  I think it's a 
broader issue (it would be also the case already if we push down 
aggregates) and shouldn't be fixed here. But there is an issue - 
is_shippable() is too optimistic.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

30 ноября 2022 г., 17:30:45

Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2022-11-30 13:01:

>> 2) Do we really have to look at pg_proc in partial_agg_ok() and
>> deparseAggref()? Perhaps, looking at aggtranstype is enough?
> You are right. I fixed according to your comment.
> 

partial_agg_ok() still looks at pg_proc.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

01 декабря 2022 г., 05:23:28

Hi Mr.Pyhalov.

> One more issue I started to think about - now we don't check
> partialagg_minversion for "simple" aggregates at all. Is it correct? It seems that ,
> for example, we could try to pushdown bit_or(int8) to old servers, but it didn't
> exist, for example, in 8.4.  I think it's a broader issue (it would be also the case
> already if we push down
> aggregates) and shouldn't be fixed here. But there is an issue -
> is_shippable() is too optimistic.
I think it is correct for now.
F.38.7 of [1] says "A limitation however is that postgres_fdw generally assumes that
immutable built-in functions and operators are safe to send to the remote server for
execution, if they appear in a WHERE clause for a foreign table." and says that we can
avoid this limitation by rewriting query.
It looks that postgres_fdw follows this policy in case of UPPERREL_GROUP_AGG aggregate pushdown.
If a aggreagate has not internal aggtranstype and has not aggfinalfn ,
partialaggfn of it is equal to itself.
So I think that it is adequate to follow this policy in case of partial aggregate pushdown
 for such aggregates.

> >> 2) Do we really have to look at pg_proc in partial_agg_ok() and
> >> deparseAggref()? Perhaps, looking at aggtranstype is enough?
> > You are right. I fixed according to your comment.
> >
>
> partial_agg_ok() still looks at pg_proc.
Sorry for taking up your time. I fixed it.

[1] https://www.postgresql.org/docs/current/postgres-fdw.html

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v15.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

01 декабря 2022 г., 19:36:12

Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2022-12-01 05:23:
> Hi Mr.Pyhalov.
> 
Hi.

Attaching minor fixes. I haven't proof-read all comments (but perhaps, 
they need attention from some native speaker).

Tested it with queries from 
https://github.com/swarm64/s64da-benchmark-toolkit, works as expected.
-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Вложения

1.patch

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

05 декабря 2022 г., 05:03:49

Hi Mr.Pyhalov.

> Attaching minor fixes. I haven't proof-read all comments (but perhaps, they
> need attention from some native speaker).
Thank you. I fixed according to your patch.
And I fixed have proof-read all comments and messages.

> Tested it with queries from
> https://github.com/swarm64/s64da-benchmark-toolkit, works as expected.
Thank you for additional tests.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v16.patch

Re: Partial aggregates pushdown

От

Andres Freund

Дата:

07 декабря 2022 г., 21:59:43

Hi,

On 2022-12-05 02:03:49 +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> > Attaching minor fixes. I haven't proof-read all comments (but perhaps, they
> > need attention from some native speaker).
> Thank you. I fixed according to your patch.
> And I fixed have proof-read all comments and messages.

cfbot complains about some compiler warnings when building with clang:
https://cirrus-ci.com/task/6606268580757504

deparse.c:3459:22: error: equality comparison with extraneous parentheses [-Werror,-Wparentheses-equality]
        if ((node->aggsplit == AGGSPLIT_SIMPLE)) {
             ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~
deparse.c:3459:22: note: remove extraneous parentheses around the comparison to silence this warning
        if ((node->aggsplit == AGGSPLIT_SIMPLE)) {
            ~               ^                 ~
deparse.c:3459:22: note: use '=' to turn this equality comparison into an assignment
        if ((node->aggsplit == AGGSPLIT_SIMPLE)) {
                            ^~
                            =

Greetings,

Andres Freund

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

16 декабря 2022 г., 01:23:05

Hi Mr.Freund.

> cfbot complains about some compiler warnings when building with clang:
> https://cirrus-ci.com/task/6606268580757504
>
> deparse.c:3459:22: error: equality comparison with extraneous parentheses
> [-Werror,-Wparentheses-equality]
>         if ((node->aggsplit == AGGSPLIT_SIMPLE)) {
>              ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~
> deparse.c:3459:22: note: remove extraneous parentheses around the
> comparison to silence this warning
>         if ((node->aggsplit == AGGSPLIT_SIMPLE)) {
>             ~               ^                 ~
> deparse.c:3459:22: note: use '=' to turn this equality comparison into an
> assignment
>         if ((node->aggsplit == AGGSPLIT_SIMPLE)) {
>                             ^~
>                             =
I fixed this error.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v17.patch

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

31 марта 2023 г., 02:41:21

On Thu, Dec 15, 2022 at 10:23:05PM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> Hi Mr.Freund.
> 
> > cfbot complains about some compiler warnings when building with clang:
> > https://cirrus-ci.com/task/6606268580757504
> > 
> > deparse.c:3459:22: error: equality comparison with extraneous parentheses
> > [-Werror,-Wparentheses-equality]
> >         if ((node->aggsplit == AGGSPLIT_SIMPLE)) {
> >              ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~
> > deparse.c:3459:22: note: remove extraneous parentheses around the
> > comparison to silence this warning
> >         if ((node->aggsplit == AGGSPLIT_SIMPLE)) {
> >             ~               ^                 ~
> > deparse.c:3459:22: note: use '=' to turn this equality comparison into an
> > assignment
> >         if ((node->aggsplit == AGGSPLIT_SIMPLE)) {
> >                             ^~
> >                             =
> I fixed this error.

Considering we only have a week left before feature freeze, I wanted to
review the patch from this commitfest item:

    https://commitfest.postgresql.org/42/4019/

The most recent patch attached.

This feature has been in development since 2021, and it is something
that will allow new workloads for Postgres, specifically data warehouse
sharding workloads.

We currently allow parallel aggregates when the table is on the same
machine, and we allow partitonwise aggregates on FDWs only with GROUP BY
keys matching partition keys.  The first is possible since we can share
data structures between background workers, and the second is possible
because if the GROUP BY includes the partition key, we are really just
appending aggregate rows, not combining aggregate computations.

What we can't do without this patch is to push aggregates that require
partial aggregate computations (no partition key GROUP BY) to FDW
partitions because we don't have a clean way to pass such information
from the remote FDW server to the finalize backend.  I think that is
what this patch does.

First, am I correct?  Second, how far away is this from being committable
and/or what work needs to be done to get it committable, either for PG 16
or 17?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.

Вложения

0001-Partial-aggregates-push-down-v17.patch

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

31 марта 2023 г., 08:49:21

Hi Mr.Momjian

> First, am I correct?
Yes, you are correct. This patch uses new special aggregate functions for partial aggregate
(then we call this partialaggfunc).

> Second, how far away is this from being committable
> and/or what work needs to be done to get it committable, either for PG 16 or 17?
I believe there are three: 1. and 2. are not clear if they are necessary or not; 3. are clearly necessary.
I would like to hear the opinions of the development community on whether or not 1. and 2. need to be addressed.

1. Making partialaggfunc user-defined function
In v17, I make partialaggfuncs as built-in functions.
Because of this approach, v17 changes specification of BKI file and pg_aggregate.
For now, partialaggfuncs are needed by only postgres_fdw which is just an extension of PostgreSQL.
In the future, when revising the specifications for BKI files and pg_aggregate when modifying existing PostgreSQL
functions,
It is necessary to align them with this patch's changes.
I am concerned that this may be undesirable.
So I am thinking that v17 should be modified to making partialaggfunc as user defined function.

2. Automation of creating definition of partialaggfuncs
In development of v17, I manually create definition of partialaggfuncs for avg, min, max, sum, count.
I am concerned that this may be undesirable.
So I am thinking that v17 should be modified to automate creating definition of partialaggfuncs
for all built-in aggregate functions.

3. Documentation
I need add explanation of partialaggfunc to documents on postgres_fdw and other places.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

07 апреля 2023 г., 04:59:39

On Fri, Mar 31, 2023 at 05:49:21AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> Hi Mr.Momjian
> 
> > First, am I correct?
> Yes, you are correct. This patch uses new special aggregate functions for partial aggregate
> (then we call this partialaggfunc).

First, my apologies for not addressing this sooner.  I was so focused on
my own tasks that I didn't realize this very important patch was not
getting attention.  I will try my best to get it into PG 17.

What amazes me is that you didn't need to create _any_ actual aggregate
functions.  Rather, you just needed to hook existing functions into the
aggregate tables for partial FDW execution.

> > Second, how far away is this from being committable
> > and/or what work needs to be done to get it committable, either for PG 16 or 17?
> I believe there are three: 1. and 2. are not clear if they are necessary or not; 3. are clearly necessary.
> I would like to hear the opinions of the development community on whether or not 1. and 2. need to be addressed.
> 
> 1. Making partialaggfunc user-defined function
> In v17, I make partialaggfuncs as built-in functions.
> Because of this approach, v17 changes specification of BKI file and pg_aggregate.
> For now, partialaggfuncs are needed by only postgres_fdw which is just an extension of PostgreSQL.
> In the future, when revising the specifications for BKI files and pg_aggregate when modifying existing PostgreSQL
functions,
> It is necessary to align them with this patch's changes.
> I am concerned that this may be undesirable.
> So I am thinking that v17 should be modified to making partialaggfunc as user defined function.

I think we have three possible cases for aggregates pushdown to FDWs:

1)  Postgres built-in aggregate functions
2)  Postgres user-defined & extension aggregate functions
3)  aggregate functions calls to non-PG FDWs

Your patch handles #1 by checking that the FDW Postgres version is the
same as the calling Postgres version.  However, it doesn't check for
extension versions, and frankly, I don't see how we could implement that
cleanly without significant complexity.

I suggest we remove the version check requirement --- instead just
document that the FDW Postgres version should be the same or newer than
the calling Postgres server --- that way, we can assume that whatever is
in the system catalogs of the caller is in the receiving side.  We
should add a GUC to turn off this optimization for cases where the FDW
Postgres version is older than the caller.  This handles case 1-2.

For case 3, I don't even know how much pushdown those do of _any_
aggregates to non-PG servers, let along parallel FDW ones.  Does anyone
know the details?

> 2. Automation of creating definition of partialaggfuncs
> In development of v17, I manually create definition of partialaggfuncs for avg, min, max, sum, count.
> I am concerned that this may be undesirable.
> So I am thinking that v17 should be modified to automate creating definition of partialaggfuncs
> for all built-in aggregate functions.

Are there any other builtin functions that need this?  I think we can
just provide documention for extensions on how to do this.

> 3. Documentation
> I need add explanation of partialaggfunc to documents on postgres_fdw and other places.

I can help with that once we decide on the above.

I think 'partialaggfn' should be named 'aggpartialfn' to match other
columns in pg_aggregate.

I am confused by these changes to pg_aggegate:

+{ aggfnoid => 'sum_p_int8', aggtransfn => 'int8_avg_accum',
+  aggfinalfn => 'int8_avg_serialize', aggcombinefn => 'int8_avg_combine',
+  aggserialfn => 'int8_avg_serialize', aggdeserialfn => 'int8_avg_deserialize',
+  aggtranstype => 'internal', aggtransspace => '48' },

...

+{ aggfnoid => 'sum_p_numeric', aggtransfn => 'numeric_avg_accum',
+  aggfinalfn => 'numeric_avg_serialize', aggcombinefn => 'numeric_avg_combine',
+  aggserialfn => 'numeric_avg_serialize',
+  aggdeserialfn => 'numeric_avg_deserialize',
+  aggtranstype => 'internal', aggtransspace => '128' },

Why are these marked as 'sum' but use 'avg' functions?

It would be good to explain exactly how this is diffent from background
worker parallelism.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.

Вложения

aggpush.diff

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

07 апреля 2023 г., 12:25:52

Hi Mr.Momjian

> First, my apologies for not addressing this sooner.  I was so focused on my
> own tasks that I didn't realize this very important patch was not getting
> attention.  I will try my best to get it into PG 17.
Thank you very much for your comments.
I will improve this patch for PG17.
I believe that this patch will help us use PostgreSQL's built-in sharding for OLAP.

> What amazes me is that you didn't need to create _any_ actual aggregate
> functions.  Rather, you just needed to hook existing functions into the
> aggregate tables for partial FDW execution.
Yes. This patch enables partial aggregate pushdown using
only existing functions which belong to existing aggregate function
and are needed by parallel query(such as state transition function and serialization function).
This patch does not need new types of function belonging to aggregate functions
and does not need new functions belonging to existing aggregate functions.

> I suggest we remove the version check requirement --- instead just document
> that the FDW Postgres version should be the same or newer than the calling
> Postgres server --- that way, we can assume that whatever is in the system
> catalogs of the caller is in the receiving side.
Thanks for the comment. I will modify this patch according to your comment.

> We should add a GUC to turn off
> this optimization for cases where the FDW Postgres version is older than the
> caller.  This handles case 1-2.
Thanks for the advice here too.
I thought it would be more appropriate to add a foregin server option of
postgres_fdw rather than adding GUC.
Would you mind if I ask you what you think about it?

> > 2. Automation of creating definition of partialaggfuncs In development
> > of v17, I manually create definition of partialaggfuncs for avg, min, max, sum,
> count.
> > I am concerned that this may be undesirable.
> > So I am thinking that v17 should be modified to automate creating
> > definition of partialaggfuncs for all built-in aggregate functions.
>
> Are there any other builtin functions that need this?  I think we can just
> provide documention for extensions on how to do this.
For practical purposes, it is sufficient
if partial aggregate for the above functions can be pushed down.
I think you are right, it would be sufficient to document how to achieve
 partial aggregate pushdown for other built-in functions.

> > 3. Documentation
> > I need add explanation of partialaggfunc to documents on postgres_fdw and
> other places.
>
> I can help with that once we decide on the above.
Thank you. In the next verion of this patch, I will add documents on postgres_fdw
and other places.

> I think 'partialaggfn' should be named 'aggpartialfn' to match other columns in
> pg_aggregate.
Thanks for the comment. I will modify this patch according to your comment.

> For case 3, I don't even know how much pushdown those do of _any_
> aggregates to non-PG servers, let along parallel FDW ones.  Does anyone
> know the details?
To allow partial aggregate pushdown for non-PG FDWs,
I think we need to add pushdown logic to their FDWs for each function.
For example, we need to add logic avg() -> sum()/count() to their FDWs for avg.
To allow parallel partial aggregate by non-PG FDWs,
I think we need to add FDW Routines for Asynchronous Execution to their FDWs[1].

> I am confused by these changes to pg_aggegate:
>
> +{ aggfnoid => 'sum_p_int8', aggtransfn => 'int8_avg_accum',
> +  aggfinalfn => 'int8_avg_serialize', aggcombinefn =>
> +'int8_avg_combine',
> +  aggserialfn => 'int8_avg_serialize', aggdeserialfn =>
> +'int8_avg_deserialize',
> +  aggtranstype => 'internal', aggtransspace => '48' },
>
> ...
>
> +{ aggfnoid => 'sum_p_numeric', aggtransfn => 'numeric_avg_accum',
> +  aggfinalfn => 'numeric_avg_serialize', aggcombinefn =>
> +'numeric_avg_combine',
> +  aggserialfn => 'numeric_avg_serialize',
> +  aggdeserialfn => 'numeric_avg_deserialize',
> +  aggtranstype => 'internal', aggtransspace => '128' },
>
> Why are these marked as 'sum' but use 'avg' functions?
This reason is that sum(int8)/sum(numeric) shares some functions with avg(int8)/avg(numeric),
and sum_p_int8 is aggpartialfn of sum(int8) and sum_p_numeric is aggpartialfn of sum(numeric).

--Part of avg(int8) in BKI file in PostgreSQL15.0[2].
{ aggfnoid => 'avg(int8)', aggtransfn => 'int8_avg_accum',
  aggfinalfn => 'numeric_poly_avg', aggcombinefn => 'int8_avg_combine',
  aggserialfn => 'int8_avg_serialize', aggdeserialfn => 'int8_avg_deserialize',
  aggmtransfn => 'int8_avg_accum', aggminvtransfn => 'int8_avg_accum_inv',
  aggmfinalfn => 'numeric_poly_avg', aggtranstype => 'internal',
  aggtransspace => '48', aggmtranstype => 'internal', aggmtransspace => '48' },
--

--Part of sum(int8) in BKI file in PostgreSQL15.0[2].
{ aggfnoid => 'sum(int8)', aggtransfn => 'int8_avg_accum',
  aggfinalfn => 'numeric_poly_sum', aggcombinefn => 'int8_avg_combine',
  aggserialfn => 'int8_avg_serialize', aggdeserialfn => 'int8_avg_deserialize',
  aggmtransfn => 'int8_avg_accum', aggminvtransfn => 'int8_avg_accum_inv',
  aggmfinalfn => 'numeric_poly_sum', aggtranstype => 'internal',
  aggtransspace => '48', aggmtranstype => 'internal', aggmtransspace => '48' },
--

[1] https://www.postgresql.org/docs/15/fdw-callbacks.html#FDW-CALLBACKS-ASYNC
[2] https://github.com/postgres/postgres/blob/REL_15_0/src/include/catalog/pg_aggregate.dat

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

08 апреля 2023 г., 04:50:11

On Fri, Apr  7, 2023 at 09:25:52AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> Hi Mr.Momjian
> 
> > First, my apologies for not addressing this sooner.  I was so focused on my
> > own tasks that I didn't realize this very important patch was not getting
> > attention.  I will try my best to get it into PG 17.
> Thank you very much for your comments. 
> I will improve this patch for PG17.
> I believe that this patch will help us use PostgreSQL's built-in sharding for OLAP.

Agreed!  Again, my apologies for not helping with this _much_ sooner. 
You have done amazing work here.

> > What amazes me is that you didn't need to create _any_ actual aggregate
> > functions.  Rather, you just needed to hook existing functions into the
> > aggregate tables for partial FDW execution.
> Yes. This patch enables partial aggregate pushdown using 
> only existing functions which belong to existing aggregate function
> and are needed by parallel query(such as state transition function and serialization function).
> This patch does not need new types of function belonging to aggregate functions
> and does not need new functions belonging to existing aggregate functions.

Very nice.

> > I suggest we remove the version check requirement --- instead just document
> > that the FDW Postgres version should be the same or newer than the calling
> > Postgres server --- that way, we can assume that whatever is in the system
> > catalogs of the caller is in the receiving side.  
> Thanks for the comment. I will modify this patch according to your comment.
> 
> > We should add a GUC to turn off
> > this optimization for cases where the FDW Postgres version is older than the
> > caller.  This handles case 1-2.
> Thanks for the advice here too.
> I thought it would be more appropriate to add a foregin server option of 
> postgres_fdw rather than adding GUC. 
> Would you mind if I ask you what you think about it?

I like the GUC idea because it gives administrators a single place to
check if the feature is enabled.  However, I can imagine cases where you
might have multiple remote FDW servers and some might be older than the
sending server.

What I don't want is an error-prone setup where administrators have to
remember what the per-server settings are.  Based on your suggestion,
let's allow CREATE SERVER to have an option 'enable_async_aggregate' (is
that the right name?), which defaults to 'true' for _all_ servers, even
those that don't support async aggregates.  With that, all FDW servers
are enabled by default, and if the FDW extension supports async
aggregates, they will automatically be pushed down and will report an
error only if the remote FDW is too old to support it.

> > > 2. Automation of creating definition of partialaggfuncs In development
> > > of v17, I manually create definition of partialaggfuncs for avg, min, max, sum,
> > count.
> > > I am concerned that this may be undesirable.
> > > So I am thinking that v17 should be modified to automate creating
> > > definition of partialaggfuncs for all built-in aggregate functions.
> > 
> > Are there any other builtin functions that need this?  I think we can just
> > provide documention for extensions on how to do this.
> For practical purposes, it is sufficient 
> if partial aggregate for the above functions can be pushed down.
> I think you are right, it would be sufficient to document how to achieve
>  partial aggregate pushdown for other built-in functions.

Uh, we actually want the patch to implement partial aggregate pushdown
for all builtin data types that can support it.  Is that done?  I think
it is only extension aggregates, which we do not control, that need this
documentation.

> > > 3. Documentation
> > > I need add explanation of partialaggfunc to documents on postgres_fdw and
> > other places.
> > 
> > I can help with that once we decide on the above.
> Thank you. In the next verion of this patch, I will add documents on postgres_fdw
> and other places. 

Good.

> > I think 'partialaggfn' should be named 'aggpartialfn' to match other columns in
> > pg_aggregate.
> Thanks for the comment. I will modify this patch according to your comment.
> 
> > For case 3, I don't even know how much pushdown those do of _any_
> > aggregates to non-PG servers, let along parallel FDW ones.  Does anyone
> > know the details?
> To allow partial aggregate pushdown for non-PG FDWs,
> I think we need to add pushdown logic to their FDWs for each function.
> For example, we need to add logic avg() -> sum()/count() to their FDWs for avg.
> To allow parallel partial aggregate by non-PG FDWs,
> I think we need to add FDW Routines for Asynchronous Execution to their FDWs[1].

Okay, I think we can just implement this for 1-2 and let extensions
worry about 3.

> > I am confused by these changes to pg_aggegate:
> > 
> > +{ aggfnoid => 'sum_p_int8', aggtransfn => 'int8_avg_accum',
> > +  aggfinalfn => 'int8_avg_serialize', aggcombinefn =>
> > +'int8_avg_combine',
> > +  aggserialfn => 'int8_avg_serialize', aggdeserialfn =>
> > +'int8_avg_deserialize',
> > +  aggtranstype => 'internal', aggtransspace => '48' },
> > 
> > ...
> > 
> > +{ aggfnoid => 'sum_p_numeric', aggtransfn => 'numeric_avg_accum',
> > +  aggfinalfn => 'numeric_avg_serialize', aggcombinefn =>
> > +'numeric_avg_combine',
> > +  aggserialfn => 'numeric_avg_serialize',
> > +  aggdeserialfn => 'numeric_avg_deserialize',
> > +  aggtranstype => 'internal', aggtransspace => '128' },
> > 
> > Why are these marked as 'sum' but use 'avg' functions?
> This reason is that sum(int8)/sum(numeric) shares some functions with avg(int8)/avg(numeric),
> and sum_p_int8 is aggpartialfn of sum(int8) and sum_p_numeric is aggpartialfn of sum(numeric).

Ah, I see this now, thanks.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.

Re: Partial aggregates pushdown

От

Tom Lane

Дата:

08 апреля 2023 г., 04:55:00

Bruce Momjian <bruce@momjian.us> writes:
> What I don't want is an error-prone setup where administrators have to
> remember what the per-server settings are.  Based on your suggestion,
> let's allow CREATE SERVER to have an option 'enable_async_aggregate' (is
> that the right name?), which defaults to 'true' for _all_ servers, even
> those that don't support async aggregates.

Uh, what?  Why would we not be able to tell from the remote server's
version number whether it has this ability?

            regards, tom lane

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

08 апреля 2023 г., 05:34:08

On Fri, Apr  7, 2023 at 09:55:00PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > What I don't want is an error-prone setup where administrators have to
> > remember what the per-server settings are.  Based on your suggestion,
> > let's allow CREATE SERVER to have an option 'enable_async_aggregate' (is
> > that the right name?), which defaults to 'true' for _all_ servers, even
> > those that don't support async aggregates.
> 
> Uh, what?  Why would we not be able to tell from the remote server's
> version number whether it has this ability?

That was covered here:

    https://www.postgresql.org/message-id/ZC95C0%2BPVhVP3iax%40momjian.us

    I think we have three possible cases for aggregate pushdown to FDWs:

    1)  Postgres built-in aggregate functions
    2)  Postgres user-defined & extension aggregate functions
    3)  aggregate functions calls to non-PG FDWs

    Your patch handles #1 by checking that the FDW Postgres version is the
-->    same as the calling Postgres version.  However, it doesn't check for
-->    extension versions, and frankly, I don't see how we could implement that
-->    cleanly without significant complexity.

The issue is not a mismatch of postgres_fdw versions but the extension
versions and whether the partial aggregate functions exist on the remote
side, e.g., something like a PostGIS upgrade.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.

Re: Partial aggregates pushdown

От

Tom Lane

Дата:

08 апреля 2023 г., 05:44:09

Bruce Momjian <bruce@momjian.us> writes:
> On Fri, Apr  7, 2023 at 09:55:00PM -0400, Tom Lane wrote:
>> Uh, what?  Why would we not be able to tell from the remote server's
>> version number whether it has this ability?

> The issue is not a mismatch of postgres_fdw versions but the extension
> versions and whether the partial aggregate functions exist on the remote
> side, e.g., something like a PostGIS upgrade.

postgres_fdw has no business pushing down calls to non-builtin functions
unless the user has explicitly authorized that via the existing
whitelisting mechanism.  I think you're reinventing the wheel,
and not very well.

            regards, tom lane

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

08 апреля 2023 г., 05:53:53

On Fri, Apr  7, 2023 at 10:44:09PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Fri, Apr  7, 2023 at 09:55:00PM -0400, Tom Lane wrote:
> >> Uh, what?  Why would we not be able to tell from the remote server's
> >> version number whether it has this ability?
> 
> > The issue is not a mismatch of postgres_fdw versions but the extension
> > versions and whether the partial aggregate functions exist on the remote
> > side, e.g., something like a PostGIS upgrade.
> 
> postgres_fdw has no business pushing down calls to non-builtin functions
> unless the user has explicitly authorized that via the existing
> whitelisting mechanism.  I think you're reinventing the wheel,
> and not very well.

The patch has you assign an option at CREATE AGGREGATE time if it can do
push down, and postgres_fdw checks that.  What whitelisting mechanism
are you talking about?  async_capable?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

08 апреля 2023 г., 06:04:23

On Fri, Apr  7, 2023 at 10:53:53PM -0400, Bruce Momjian wrote:
> On Fri, Apr  7, 2023 at 10:44:09PM -0400, Tom Lane wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> > > On Fri, Apr  7, 2023 at 09:55:00PM -0400, Tom Lane wrote:
> > >> Uh, what?  Why would we not be able to tell from the remote server's
> > >> version number whether it has this ability?
> > 
> > > The issue is not a mismatch of postgres_fdw versions but the extension
> > > versions and whether the partial aggregate functions exist on the remote
> > > side, e.g., something like a PostGIS upgrade.
> > 
> > postgres_fdw has no business pushing down calls to non-builtin functions
> > unless the user has explicitly authorized that via the existing
> > whitelisting mechanism.  I think you're reinventing the wheel,
> > and not very well.
> 
> The patch has you assign an option at CREATE AGGREGATE time if it can do
> push down, and postgres_fdw checks that.  What whitelisting mechanism
> are you talking about?  async_capable?

FYI, in the patch the CREATE AGGREGATE option is called PARTIALAGGFUNC.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.

Re: Partial aggregates pushdown

От

Andres Freund

Дата:

08 апреля 2023 г., 07:16:14

On 2023-04-07 22:53:53 -0400, Bruce Momjian wrote:
> On Fri, Apr  7, 2023 at 10:44:09PM -0400, Tom Lane wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> > > On Fri, Apr  7, 2023 at 09:55:00PM -0400, Tom Lane wrote:
> > >> Uh, what?  Why would we not be able to tell from the remote server's
> > >> version number whether it has this ability?
> > 
> > > The issue is not a mismatch of postgres_fdw versions but the extension
> > > versions and whether the partial aggregate functions exist on the remote
> > > side, e.g., something like a PostGIS upgrade.
> > 
> > postgres_fdw has no business pushing down calls to non-builtin functions
> > unless the user has explicitly authorized that via the existing
> > whitelisting mechanism.  I think you're reinventing the wheel,
> > and not very well.
> 
> The patch has you assign an option at CREATE AGGREGATE time if it can do
> push down, and postgres_fdw checks that.  What whitelisting mechanism
> are you talking about?  async_capable?

extensions (string)

    This option is a comma-separated list of names of PostgreSQL extensions that are installed, in compatible versions,
onboth the local and remote servers. Functions and operators that are immutable and belong to a listed extension will
beconsidered shippable to the remote server. This option can only be specified for foreign servers, not per-table.
 

    When using the extensions option, it is the user's responsibility that the listed extensions exist and behave
identicallyon both the local and remote servers. Otherwise, remote queries may fail or behave unexpectedly.

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

08 апреля 2023 г., 17:15:40

On Fri, Apr  7, 2023 at 09:16:14PM -0700, Andres Freund wrote:
> On 2023-04-07 22:53:53 -0400, Bruce Momjian wrote:
> > > postgres_fdw has no business pushing down calls to non-builtin functions
> > > unless the user has explicitly authorized that via the existing
> > > whitelisting mechanism.  I think you're reinventing the wheel,
> > > and not very well.
> > 
> > The patch has you assign an option at CREATE AGGREGATE time if it can do
> > push down, and postgres_fdw checks that.  What whitelisting mechanism
> > are you talking about?  async_capable?
> 
> extensions (string)
> 
>     This option is a comma-separated list of names of PostgreSQL extensions that are installed, in compatible
versions,on both the local and remote servers. Functions and operators that are immutable and belong to a listed
extensionwill be considered shippable to the remote server. This option can only be specified for foreign servers, not
per-table.
> 
>     When using the extensions option, it is the user's responsibility that the listed extensions exist and behave
identicallyon both the local and remote servers. Otherwise, remote queries may fail or behave unexpectedly.

Okay, this is very helpful --- it is exactly the issue we are dealing
with --- how can we know if partial aggregate functions exists on the
remote server.  (I knew I was going to need API help on this.)

So, let's remove the PARTIALAGG_MINVERSION option from the patch and
just make it automatic --- we push down builtin partial aggregates if
the remote server is the same or newer _major_ version than the sending
server.  For extensions, if people have older extensions on the same or
newer foreign servers, they can adjust 'extensions' above.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

08 апреля 2023 г., 19:18:21

On Sat, Apr  8, 2023 at 10:15:40AM -0400, Bruce Momjian wrote:
> On Fri, Apr  7, 2023 at 09:16:14PM -0700, Andres Freund wrote:
> > extensions (string)
> > 
> >     This option is a comma-separated list of names of PostgreSQL extensions that are installed, in compatible
versions,on both the local and remote servers. Functions and operators that are immutable and belong to a listed
extensionwill be considered shippable to the remote server. This option can only be specified for foreign servers, not
per-table.
> > 
> >     When using the extensions option, it is the user's responsibility that the listed extensions exist and behave
identicallyon both the local and remote servers. Otherwise, remote queries may fail or behave unexpectedly.
 
> 
> Okay, this is very helpful --- it is exactly the issue we are dealing
> with --- how can we know if partial aggregate functions exists on the
> remote server.  (I knew I was going to need API help on this.)
> 
> So, let's remove the PARTIALAGG_MINVERSION option from the patch and
> just make it automatic --- we push down builtin partial aggregates if
> the remote server is the same or newer _major_ version than the sending
> server.  For extensions, if people have older extensions on the same or
> newer foreign servers, they can adjust 'extensions' above.

Looking further, I don't see any cases where we check if a builtin
function added in a major release also exists on the foreign server, so
maybe we don't need any checks but just need a mention in the release
notes.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

10 апреля 2023 г., 04:18:37

Hi Mr.Momjian, Mr.Lane, Mr.Freund.

Thank you for advices.

> From: Bruce Momjian <bruce@momjian.us>
> > > > 2. Automation of creating definition of partialaggfuncs In
> > > > development of v17, I manually create definition of
> > > > partialaggfuncs for avg, min, max, sum,
> > > count.
> > > > I am concerned that this may be undesirable.
> > > > So I am thinking that v17 should be modified to automate creating
> > > > definition of partialaggfuncs for all built-in aggregate functions.
> > >
> > > Are there any other builtin functions that need this?  I think we
> > > can just provide documention for extensions on how to do this.
> > For practical purposes, it is sufficient if partial aggregate for the
> > above functions can be pushed down.
> > I think you are right, it would be sufficient to document how to
> > achieve  partial aggregate pushdown for other built-in functions.
>
> Uh, we actually want the patch to implement partial aggregate pushdown for all
> builtin data types that can support it.  Is that done?  I think it is only extension
> aggregates, which we do not control, that need this documentation.
The last version of this patch can't pushdown partial aggregate for all builtin aggregate functions that can support
it.
I will improve this patch to pushdown partial aggregate for all builtin aggregate functions
that can support it.

There is one more thing I would like your opinion on.
As the major version of PostgreSQL increase, it is possible that
the new builtin aggregate functions are added to the newer PostgreSQL.
This patch assume that aggpartialfns definitions exist in BKI files.
Due to this assumption, someone should add aggpartialfns definitions of new builtin aggregate functions to BKI files.
There are two possible ways to address this issue. Would the way1 be sufficient?
Or would way2 be preferable?
  way1) Adding documentaion for how to add these definitions to BKI files
  way2) Improving this patch to automatically add these definitions to BKI files by some tool such as initdb.

> From: Bruce Momjian <bruce@momjian.us>
> On Fri, Apr  7, 2023 at 09:16:14PM -0700, Andres Freund wrote:
> > On 2023-04-07 22:53:53 -0400, Bruce Momjian wrote:
> > > > postgres_fdw has no business pushing down calls to non-builtin
> > > > functions unless the user has explicitly authorized that via the
> > > > existing whitelisting mechanism.  I think you're reinventing the
> > > > wheel, and not very well.
> > >
> > > The patch has you assign an option at CREATE AGGREGATE time if it
> > > can do push down, and postgres_fdw checks that.  What whitelisting
> > > mechanism are you talking about?  async_capable?
> >
> > extensions (string)
> >
> >     This option is a comma-separated list of names of PostgreSQL
> extensions that are installed, in compatible versions, on both the local and
> remote servers. Functions and operators that are immutable and belong to a
> listed extension will be considered shippable to the remote server. This option
> can only be specified for foreign servers, not per-table.
> >
> >     When using the extensions option, it is the user's responsibility that the
> listed extensions exist and behave identically on both the local and remote
> servers. Otherwise, remote queries may fail or behave unexpectedly.
>
> Okay, this is very helpful --- it is exactly the issue we are dealing with --- how
> can we know if partial aggregate functions exists on the remote server.  (I
> knew I was going to need API help on this.)
>
> So, let's remove the PARTIALAGG_MINVERSION option from the patch and just
> make it automatic --- we push down builtin partial aggregates if the remote
> server is the same or newer _major_ version than the sending server.  For
> extensions, if people have older extensions on the same or newer foreign
> servers, they can adjust 'extensions' above.
Okay, I understand. I will remove PARTIALAGG_MINVERSION option from the patch
and I will add check whether aggpartialfn depends on some extension which
is containd in extensions list of the postgres_fdw's foreign server.
In the next version of this patch,
we can pushdown partial aggregate for an user-defined aggregate function only
when the function pass through this check.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

13 апреля 2023 г., 09:12:44

On Mon, Apr 10, 2023 at 01:18:37AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> > Uh, we actually want the patch to implement partial aggregate pushdown for all
> > builtin data types that can support it.  Is that done?  I think it is only extension
> > aggregates, which we do not control, that need this documentation.
> The last version of this patch can't pushdown partial aggregate for all builtin aggregate functions that can support
it.
> I will improve this patch to pushdown partial aggregate for all builtin aggregate functions
> that can support it.
> 
> There is one more thing I would like your opinion on.
> As the major version of PostgreSQL increase, it is possible that
> the new builtin aggregate functions are added to the newer PostgreSQL.
> This patch assume that aggpartialfns definitions exist in BKI files.
> Due to this assumption, someone should add aggpartialfns definitions of new builtin aggregate functions to BKI
files.
> There are two possible ways to address this issue. Would the way1 be sufficient?
> Or would way2 be preferable?
>   way1) Adding documentaion for how to add these definitions to BKI files
>   way2) Improving this patch to automatically add these definitions to BKI files by some tool such as initdb.

I think documentation is sufficient.  You already showed that someone
can do this with CREATE AGGREGATE for non-builtin aggregates.

> > So, let's remove the PARTIALAGG_MINVERSION option from the patch and just
> > make it automatic --- we push down builtin partial aggregates if the remote
> > server is the same or newer _major_ version than the sending server.  For
> > extensions, if people have older extensions on the same or newer foreign
> > servers, they can adjust 'extensions' above.
> Okay, I understand. I will remove PARTIALAGG_MINVERSION option from the patch
> and I will add check whether aggpartialfn depends on some extension which
> is containd in extensions list of the postgres_fdw's foreign server.

Yes, good.  Did we never push down aggregates before?  I thought we
pushed down partitionwise aggregates already, and such a check should
already be done.  If the check isn't there, it should be.

> In the next version of this patch,
> we can pushdown partial aggregate for an user-defined aggregate function only 
> when the function pass through this check.

Understood.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

13 апреля 2023 г., 09:50:36

On Thu, Apr 13, 2023 at 02:12:44AM -0400, Bruce Momjian wrote:
> > In the next version of this patch,
> > we can pushdown partial aggregate for an user-defined aggregate function only 
> > when the function pass through this check.
> 
> Understood.

In summary, we don't do any version check for built-in function
pushdown, so we don't need it for aggregates either.  We check extension
functions against the extension pushdown list, so we should be checking
this for partial aggregate pushdown, and for partition-wise aggregate
pushdown.  If we don't do that last check already, it is a bug.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

13 апреля 2023 г., 13:56:26

Hi Mr.Momjian.

> > There is one more thing I would like your opinion on.
> > As the major version of PostgreSQL increase, it is possible that the
> > new builtin aggregate functions are added to the newer PostgreSQL.
> > This patch assume that aggpartialfns definitions exist in BKI files.
> > Due to this assumption, someone should add aggpartialfns definitions of
> new builtin aggregate functions to BKI files.
> > There are two possible ways to address this issue. Would the way1 be
> sufficient?
> > Or would way2 be preferable?
> >   way1) Adding documentaion for how to add these definitions to BKI files
> >   way2) Improving this patch to automatically add these definitions to BKI
> files by some tool such as initdb.
>
> I think documentation is sufficient.  You already showed that someone can do
> this with CREATE AGGREGATE for non-builtin aggregates.
Thank you for your opinion. I will modify this patch according to the way1.

> > > So, let's remove the PARTIALAGG_MINVERSION option from the patch and
> > > just make it automatic --- we push down builtin partial aggregates
> > > if the remote server is the same or newer _major_ version than the
> > > sending server.  For extensions, if people have older extensions on
> > > the same or newer foreign servers, they can adjust 'extensions' above.
> > Okay, I understand. I will remove PARTIALAGG_MINVERSION option from
> > the patch and I will add check whether aggpartialfn depends on some
> > extension which is containd in extensions list of the postgres_fdw's foreign
> server.
>
> Yes, good.  Did we never push down aggregates before?  I thought we
> pushed down partitionwise aggregates already, and such a check should
> already be done.  If the check isn't there, it should be.
Yes. The last version of this patch(and original postgres_fdw) checks if
user-defined aggregate depends some extension which is contained in 'extensions'.
But, in the last version of this patch, there is no such check for
aggpartialfn of user-defined aggregate. So, I will add such check to this patch.
I think that this modification is easy to do .

> In summary, we don't do any version check for built-in function pushdown, so
> we don't need it for aggregates either.  We check extension functions against
> the extension pushdown list, so we should be checking this for partial
> aggregate pushdown, and for partition-wise aggregate pushdown.  If we don't
> do that last check already, it is a bug.
I understood.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Re: Partial aggregates pushdown

От

Robert Haas

Дата:

13 апреля 2023 г., 19:01:59

On Wed, Nov 30, 2022 at 3:12 AM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:
> 1) In previous version of the patch aggregates, which had partialaggfn,
> were ok to push down. And it was a definite sign that aggregate can be
> pushed down. Now we allow pushing down an aggregate, which prorettype is
> not internal and aggfinalfn is not defined. Is it safe for all
> user-defined (or builtin) aggregates, even if they are generally
> shippable?

I think that this is exactly the correct test. Here's how to think
about it: to perform an aggregate, you merge all the values into the
transition state, and then you apply the final function once at the
end. So the process looks like this:

TRANSITION_STATE_0 + VALUE_1 = TRANSITION_STATE_1
TRANSITION_STATE_1 + VALUE_2 = TRANSITION_STATE_2
...
TRANSITION_STATE_N => RESULT

Here, + represents applying the transition function and => represents
applying the final function.

In the case of parallel query, we want every worker to be able to
incorporate values into its own transition states and then merge all
the transition states at the end. That's a problem, because the
transition function expects a transition state and a value, not two
transition states. So we invented the idea of a "combine" function to
solve this problem.  A combine function takes two transition states
and produces a new transition state. That allows each worker to create
an initially empty transition state, merge a bunch of values into it,
and then pass the result back to the leader, which can combine all the
transition states using the combine function, and then apply the final
function at the end.

The same kind of idea works here. If we want to push down an entire
aggregate, there's no problem, provided the remote side supports it:
just push down the whole operation and get the result. But if we want
to push down part of the aggregate, then what we want to get back is a
transition value that we can then combine with other values (using the
transition function) or other transition states (using the combine
function) locally. That's tricky, because there's no SQL syntax to ask
the remote side to give us the transition value rather than the final
value. I think we would need to add that to solve this problem in its
full generality. However, in the special case where there's no final
function, the problem goes away, because then a transition value and a
result are identical. If we ask for a result, we can treat it as a
transition value, and there's no problem.

Internal values are a problem. Generally, you don't see internal as
the return type for an aggregate, because then the aggregate couldn't
be called by the user. An internal value can't be returned. However,
it's pretty common to see an aggregate that has an internal value as a
transition type, and something else as the result type. In such cases,
even if we had some syntax telling the remote side to send the
transition value rather than the final value, it would not be
sufficient, because the internal value still couldn't be transmitted.
This problem also arises for parallel query, where we want to move
transition values between processes within a single database cluster.
We solved that problem using aggserialfn and aggdeserialfn.
aggserialfn converts an internal transition value (which can't be
moved between processes) into a bytea, and aggdeserialfn does the
reverse. Maybe we would adopt the same solution here: our syntax that
tells the remote side to give us the transition value rather than the
final value could also tell the remote side to serialize it to bytea
if it's an internal type. However, if we did this, we'd have to be
sure that our deserialization functions were pretty well hardened
against unexpected or even malicious input, because who knows whether
that remote server is really going to send us a bytea in the format
that we're expecting to get?

Anyway, for the present patch, I think that testing whether there's a
final function is the right thing, and testing whether the return type
is internal doesn't hurt. If we want to extend this to other cases in
the future, then I think we need syntax to ask the remote side for the
unfinalized aggregate, like SELECT UNFINALIZED MAX(a) FROM t1, or
whatever. I'm not sure what the best concrete SQL syntax is - probably
not that.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

14 апреля 2023 г., 09:54:23

On Thu, Apr 13, 2023 at 10:56:26AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> > Yes, good.  Did we never push down aggregates before?  I thought we
> > pushed down partitionwise aggregates already, and such a check should
> > already be done.  If the check isn't there, it should be.
> Yes. The last version of this patch(and original postgres_fdw) checks if
> user-defined aggregate depends some extension which is contained in 'extensions'.
> But, in the last version of this patch, there is no such check for 
> aggpartialfn of user-defined aggregate. So, I will add such check to this patch. 
> I think that this modification is easy to do . 

Good, so our existing code is correct and the patch just needs
adjustment.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

02 июня 2023 г., 06:54:06

Hi Mr.Bruce, hackers.

I updated the patch.
The following is a list of comments received on the previous version of the patch
and my update to them in this version of the patch.

[comment1]
> From: Bruce Momjian <bruce@momjian.us>
> Sent: Saturday, April 8, 2023 10:50 AM
> Uh, we actually want the patch to implement partial aggregate pushdown for all
> builtin data types that can support it.

I improved the patch to push partial aggregate down for all builtin aggregates that
support it.

[comment2]
> From: Bruce Momjian <bruce@momjian.us>
> Sent: Thursday, April 13, 2023 3:51 PM
> In summary, we don't do any version check for built-in function pushdown, so
> we don't need it for aggregates either.  We check extension functions against
> the extension pushdown list, so we should be checking this for partial
> aggregate pushdown, and for partition-wise aggregate pushdown.

I removed partialagg_minversion from pg_aggregate and removed the version
check for partial aggregate pushdown by it. postgres_fdw assumes that every
built-in aggregate function has its aggpartialfunc on remote server.
postgres_fdw assumes that a user-defined aggregate function has its
aggpartialfunc on remote server only when the user-defined aggregate function
and the aggpartialfunc of it belong to some extension that's listed in the
foreign server's extensions option.

[comment3]
> From: Bruce Momjian <bruce@momjian.us>
> Sent: Saturday, April 8, 2023 10:50 AM
> > > > 3. Documentation
> > > > I need add explanation of partialaggfunc to documents on
> > > > postgres_fdw and
> > > other places.
> > >
> > > I can help with that once we decide on the above.
> > Thank you. In the next verion of this patch, I will add documents on
> > postgres_fdw and other places.
>
> Good.

I appended description for partial aggregate pushdown feature by postgres_fdw
to existing documents.
The following is a list of sgml files that have been appended and the
contents of the additions.
  postgres-fdw.sgml    : Description about this partial aggregate pushdown feature
                         and definition of aggpartialfunc.
      * Unlike existing aggregate pushdown feature, this partial aggregate
        pushdown feature is a one of the built-in sharding features in PostgreSQL.
        So I added a section about built-in sharding feature in PostgreSQL, and in that
        section I added a description of this partial aggregate pushdown feature.
        In this document, a description of the built-in sharding feature in PostgreSQL
        is based on [1].
  xaggr.sgml           :Partial aggregate pushdown feature for user-defined
                        aggregate functions.
  create_aggregate.sgml:Description about additional parameters for
                        partial aggregate pushdown feature.
  catalogs.sgml        :Description about aggpartialfn column.

[comment4]
> From: Bruce Momjian <bruce@momjian.us>
> Sent: Thursday, April 13, 2023 3:13 PM
> > There is one more thing I would like your opinion on.
> > As the major version of PostgreSQL increase, it is possible that the
> > new builtin aggregate functions are added to the newer PostgreSQL.
> > This patch assume that aggpartialfns definitions exist in BKI files.
> > Due to this assumption, someone should add aggpartialfns definitions of
> new builtin aggregate functions to BKI files.
> > There are two possible ways to address this issue. Would the way1 be
> sufficient?
> > Or would way2 be preferable?
> >   way1) Adding documentaion for how to add these definitions to BKI files
> >   way2) Improving this patch to automatically add these definitions to BKI
> files by some tool such as initdb.
>
> I think documentation is sufficient.  You already showed that someone can do
> this with CREATE AGGREGATE for non-builtin aggregates.

The update addressesing to comment3 also addresses this comment.
If a new aggregate function is added in future,
definition of aggpartialfunc in postgres-fdw.sgml
helps postgres_fdw developer add a new aggpartialfunc corresponding to the aggregate
function to existing BKI files.
For user-defined aggregate functions, description in xaggr.sgml and
create_aggregate.sgml help a user create an aggpartialfunc corresponding to
a user-defined aggregate function.

In addition, I added a validation mechanism to determine whether an
aggpartialfunc corresponding to an aggregate function is correctly created.
The aggpartialfunc information for the built-in aggregate functions is registered
in the system catalog pg_aggregate at the time the database cluster is created
from BKI files.
So I added this validation process to the regression test of postgres_fdw.

However, I am concerned that it might be more appropriate to add this validation
process to the build process or initdb, rather than to the regression test.
I would appreciate comments from the PostgreSQL community on this point.
For aggpartialfunc for user-defined functions,
I added this validation process to pg_aggregate.c.

[comment5]
> From: Bruce Momjian <bruce@momjian.us>
> Sent: Friday, April 7, 2023 11:00 AM
> I think 'partialaggfn' should be named 'aggpartialfn' to match other columns in
> pg_aggregate.

Fixed.

[1] PostgreSQL wiki, WIP PostgreSQL Sharding
https://wiki.postgresql.org/wiki/WIP_PostgreSQL_Sharding

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v18.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

05 июня 2023 г., 12:00:27

Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2023-06-02 06:54:
> Hi Mr.Bruce, hackers.
> 
> I updated the patch.
> The following is a list of comments received on the previous version
> of the patch
> and my update to them in this version of the patch.
> 

Hi.

I've looked through the last version of the patch.

Have found one issue -

src/backend/catalog/pg_aggregate.c

585                 if(strcmp(strVal(linitial(aggpartialfnName)), 
aggName) == 0){
586                         if(((aggTransType != INTERNALOID) && 
(finalfn != InvalidOid))
587                                         || ((aggTransType == 
INTERNALOID) && (finalfn != serialfn)))
588                                 elog(ERROR, "%s is not its own 
aggpartialfunc", aggName);
589                 } else {

Here string comparison of aggName and aggpartialfnName looks very 
suspicios - it seems you should compare oids, not names (in this case,
likely oids of transition function and partial aggregation function). 
The fact that aggregate name matches partial aggregation function name
is not a enough to make any decisions.


In documentation

doc/src/sgml/postgres-fdw.sgml:

  930    <filename>postgres_fdw</filename> attempts to optimize remote 
queries to reduce
  931    the amount of data transferred from foreign servers.  This is 
done by
  932    sending query <literal>WHERE</literal> clauses and ggregate 
expressions
  933    to the remote server for execution, and by not retrieving table 
columns that
  934    are not needed for the current query.
  935    To reduce the risk of misexecution of queries,
  936    <literal>WHERE</literal> clauses and ggregate expressions are 
not sent to
  937    the remote server unless they use only data types, operators, 
and functions
  938    that are built-in or belong to an extension that's listed in the 
foreign
  939    server's <literal>extensions</literal> option.
  940    Operators and functions in such clauses must
  941    be <literal>IMMUTABLE</literal> as well.

there are misprints in lines 932 and 936 - missing "a" in "aggregate" 
expressions.

Note that after these changes "select sum()" will fail for certain 
cases, when remote server version is not the latest. In other cases we 
tried
to preserve compatibility. Should we have a switch for a foreign server 
to turn this optimization off? Or do we just state that users
should use different workarounds if remote server version doesn't match 
local one?

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

05 июня 2023 г., 19:26:05

On Mon, Jun  5, 2023 at 12:00:27PM +0300, Alexander Pyhalov wrote:
> Note that after these changes "select sum()" will fail for certain cases,
> when remote server version is not the latest. In other cases we tried
> to preserve compatibility. Should we have a switch for a foreign server to
> turn this optimization off? Or do we just state that users
> should use different workarounds if remote server version doesn't match
> local one?

We covered this in April in this and previous emails:

    https://www.postgresql.org/message-id/ZDGTza4rovCa%2BN3d%40momjian.us

We don't check the version number for _any_ builtin functions so why
would we need to check for aggregate pushdown?  Yes, these will be new
functions in PG 17, we have added functions regularly in major releases
and have never heard reports of problems about that.

This patch will filter pushdown based on the FDW extension whitelist:

    https://www.postgresql.org/message-id/20230408041614.wfasmdm46bupbif4%40awork3.anarazel.de

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

05 июня 2023 г., 21:14:46

Bruce Momjian писал 2023-06-05 19:26:
> On Mon, Jun  5, 2023 at 12:00:27PM +0300, Alexander Pyhalov wrote:
>> Note that after these changes "select sum()" will fail for certain 
>> cases,
>> when remote server version is not the latest. In other cases we tried
>> to preserve compatibility. Should we have a switch for a foreign 
>> server to
>> turn this optimization off? Or do we just state that users
>> should use different workarounds if remote server version doesn't 
>> match
>> local one?
> 
> We covered this in April in this and previous emails:
> 
>     https://www.postgresql.org/message-id/ZDGTza4rovCa%2BN3d%40momjian.us
> 
> We don't check the version number for _any_ builtin functions so why
> would we need to check for aggregate pushdown?  Yes, these will be new
> functions in PG 17, we have added functions regularly in major releases
> and have never heard reports of problems about that.
> 
Hi.

I've seen this message. But introduction of new built-in function will 
break requests to old servers
only if this new function is used in the request (this means that query 
changes). However, this patch
changes the behavior of old queries, which worked prior to update. This 
seems to be different to me.
Also I see that in connection.c (configure_remote_session()), we care 
about old PostgreSQL versions.
And now we make querying them more tricky. Is it consistent? Do you 
think that
enable_partitionwise_aggregate is a good enough protection in this 
cases?

In documentation I see

"F.38.7. Cross-Version Compatibility
postgres_fdw can be used with remote servers dating back to PostgreSQL 
8.3. Read-only capability is available back to 8.1. A limitation however 
is that postgres_fdw generally assumes that immutable built-in functions 
and operators are safe to send to the remote server for execution, if 
they appear in a WHERE clause for a foreign table. Thus, a built-in 
function that was added since the remote server's release might be sent 
to it for execution, resulting in “function does not exist” or a similar 
error. This type of failure can be worked around by rewriting the query, 
for example by embedding the foreign table reference in a sub-SELECT 
with OFFSET 0 as an optimization fence, and placing the problematic 
function or operator outside the sub-SELECT."

Likely, this paragraph should be expanded to state that partition-wise 
aggregation for many functions can fail to work with old foreign 
servers.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

06 июня 2023 г., 03:10:18

On Fri, Jun  2, 2023 at 03:54:06AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> Hi Mr.Bruce, hackers.
> 
> I updated the patch.
> The following is a list of comments received on the previous version of the patch
> and my update to them in this version of the patch.

This thread started in October 2021 so I would like to explain what this
feature adds.

Basically for partitions made up of postgres_fdw tables, there are four
possible optimizations:

1.  Pruning, 3 stages, see slide 30 here:

    https://momjian.us/main/writings/pgsql/partitioning.pdf#page=30

2.  Parallelism across partitions, see slide 38 here:

    https://momjian.us/main/writings/pgsql/beyond.pdf#page=38

3.  Pushdown of partition-wise joins and aggregates, see slide 43 here:

    https://momjian.us/main/writings/pgsql/partitioning.pdf#page=43

4.  Pushdown of aggregates that aren't partition-wise

As far as I know, over the years we have accomplished all of these
items, except for #4.  #3 involves aggregates where the GROUP BY or
JOINed tables match the partition keys.

Number 4 involves things like a SUM our COUNT that does not match the
partition key, or has no groupings at all.

#3 is easier than #4 since we just need to pass _rows_ back from the
foreign servers.  #4 is more complex because _partial_ count/sum, or
even average values must be passed from the foreign servers to the
requesting server.

The good news is that we already have partial aggregate support as part
of our parallel aggregate feature, see:

    https://momjian.us/main/writings/pgsql/beyond.pdf#page=38

What the patch does is to expand the existing partial aggregate code to
allow partial aggregate results to pass from the foreign servers to the
requesting server.  This feature will be very useful for data warehouse
queries that need to compute aggregate across partitions.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

06 июня 2023 г., 06:08:50

Hi Mr.Pyhalov.

Thank you for your always thoughtful review.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Monday, June 5, 2023 6:00 PM
> Have found one issue -
> 
> src/backend/catalog/pg_aggregate.c
> 
> 585                 if(strcmp(strVal(linitial(aggpartialfnName)),
> aggName) == 0){
> 586                         if(((aggTransType != INTERNALOID) &&
> (finalfn != InvalidOid))
> 587                                         || ((aggTransType ==
> INTERNALOID) && (finalfn != serialfn)))
> 588                                 elog(ERROR, "%s is not its own
> aggpartialfunc", aggName);
> 589                 } else {
> 
> Here string comparison of aggName and aggpartialfnName looks very
> suspicios - it seems you should compare oids, not names (in this case,
> likely oids of transition function and partial aggregation function).
> The fact that aggregate name matches partial aggregation function name
> is not a enough to make any decisions.

I see no problem with this string comparison. Here is the reason.
The intent of this code is, to determine whether the user specifies 
the new aggregate function whose aggpartialfunc is itself.
For two aggregate functions,
If the argument list and function name match, then the two aggregate functions must match.
By definition of aggpartialfunc,
every aggregate function and its aggpartialfn must have the same argument list.
Thus, if aggpartialfnName and aggName are equal as strings,
I think it is correct to determine that the user is specifying 
the new aggregate function whose aggpartialfunc is itself.

However, since the document does not state these intentions
I think your suspicions are valid.
Therefore, I have added a specification to the document reflecting the above intentions.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Monday, June 5, 2023 6:00 PM
> In documentation
> 
> doc/src/sgml/postgres-fdw.sgml:
> 
>   930    <filename>postgres_fdw</filename> attempts to optimize remote
> queries to reduce
>   931    the amount of data transferred from foreign servers.  This is
> done by
>   932    sending query <literal>WHERE</literal> clauses and ggregate
> expressions
>   933    to the remote server for execution, and by not retrieving table
> columns that
>   934    are not needed for the current query.
>   935    To reduce the risk of misexecution of queries,
>   936    <literal>WHERE</literal> clauses and ggregate expressions are
> not sent to
>   937    the remote server unless they use only data types, operators,
> and functions
>   938    that are built-in or belong to an extension that's listed in the
> foreign
>   939    server's <literal>extensions</literal> option.
>   940    Operators and functions in such clauses must
>   941    be <literal>IMMUTABLE</literal> as well.
> 
> there are misprints in lines 932 and 936 - missing "a" in "aggregate"
> expressions.

Fixed.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Monday, June 5, 2023 6:00 PM
> Note that after these changes "select sum()" will fail for certain
> cases, when remote server version is not the latest. In other cases we
> tried
> to preserve compatibility. Should we have a switch for a foreign server
> to turn this optimization off? Or do we just state that users
> should use different workarounds if remote server version doesn't match
> local one?

It is the latter.
I added description about the above limitation to F.38.6. Built-in sharding in PostgreSQL and
F.38.8 Cross-Version Compatibility of doc/src/sgml/postgres-fdw.sgml.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Tuesday, June 6, 2023 3:15 AM
> Bruce Momjian писал 2023-06-05 19:26:
> > On Mon, Jun  5, 2023 at 12:00:27PM +0300, Alexander Pyhalov wrote:
> >> Note that after these changes "select sum()" will fail for certain
> >> cases, when remote server version is not the latest. In other cases
> >> we tried to preserve compatibility. Should we have a switch for a
> >> foreign server to turn this optimization off? Or do we just state
> >> that users should use different workarounds if remote server version
> >> doesn't match local one?
> >
> > We covered this in April in this and previous emails:
> >
> >
>     https://www.postgresql.org/message-id/ZDGTza4rovCa%2BN3d%40
> momjian.us
> >
> > We don't check the version number for _any_ builtin functions so why
> > would we need to check for aggregate pushdown?  Yes, these will be new
> > functions in PG 17, we have added functions regularly in major
> > releases and have never heard reports of problems about that.
> >
> Hi.
> 
> I've seen this message. But introduction of new built-in function will break
> requests to old servers only if this new function is used in the request (this
> means that query changes). However, this patch changes the behavior of old
> queries, which worked prior to update. This seems to be different to me.

You are right.
However, for now, partial aggregates pushdown is mainly used when using built-in sharding in PostgreSQL.
I believe when using built-in sharding in PostgreSQL, the version of the primary node server and
the version of the remote server will usually be the same.
So I think it would be sufficient to include in the documentation a note about such problem
and a workaround for them.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v19.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

06 июня 2023 г., 07:19:01

Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2023-06-06 06:08:
> Hi Mr.Pyhalov.
> 
> Thank you for your always thoughtful review.
> 
>> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
>> Sent: Monday, June 5, 2023 6:00 PM
>> Have found one issue -
>> 
>> src/backend/catalog/pg_aggregate.c
>> 
>> 585                 if(strcmp(strVal(linitial(aggpartialfnName)),
>> aggName) == 0){
>> 586                         if(((aggTransType != INTERNALOID) &&
>> (finalfn != InvalidOid))
>> 587                                         || ((aggTransType ==
>> INTERNALOID) && (finalfn != serialfn)))
>> 588                                 elog(ERROR, "%s is not its own
>> aggpartialfunc", aggName);
>> 589                 } else {
>> 
>> Here string comparison of aggName and aggpartialfnName looks very
>> suspicios - it seems you should compare oids, not names (in this case,
>> likely oids of transition function and partial aggregation function).
>> The fact that aggregate name matches partial aggregation function name
>> is not a enough to make any decisions.
> 
> I see no problem with this string comparison. Here is the reason.
> The intent of this code is, to determine whether the user specifies
> the new aggregate function whose aggpartialfunc is itself.
> For two aggregate functions,
> If the argument list and function name match, then the two aggregate
> functions must match.
> By definition of aggpartialfunc,
> every aggregate function and its aggpartialfn must have the same 
> argument list.
> Thus, if aggpartialfnName and aggName are equal as strings,
> I think it is correct to determine that the user is specifying
> the new aggregate function whose aggpartialfunc is itself.
> 
> However, since the document does not state these intentions
> I think your suspicions are valid.
> Therefore, I have added a specification to the document reflecting the
> above intentions.
> 

Hi. Let me explain.

Look at this example, taken from test.

CREATE AGGREGATE udf_avg_p_int4(int4) (
        sfunc = int4_avg_accum,
        stype = _int8,
        combinefunc = int4_avg_combine,
        initcond = '{0,0}'
);
CREATE AGGREGATE udf_sum(int4) (
        sfunc = int4_avg_accum,
        stype = _int8,
        finalfunc = int8_avg,
        combinefunc = int4_avg_combine,
        initcond = '{0,0}',
        aggpartialfunc = udf_avg_p_int4
);

Now, let's create another aggregate.

# create schema test ;
create aggregate test.udf_avg_p_int4(int4) (
        sfunc = int4_avg_accum,
        stype = _int8,
        finalfunc = int8_avg,
        combinefunc = int4_avg_combine,
        initcond = '{0,0}',
        aggpartialfunc = udf_avg_p_int4
);
ERROR:  udf_avg_p_int4 is not its own aggpartialfunc

What's the difference between test.udf_avg_p_int4(int4) aggregate and 
udf_sum(int4)? They are essentially the same, but second one can't be 
defined.

Also note difference:

# CREATE AGGREGATE udf_sum(int4) (
        sfunc = int4_avg_accum,
        stype = _int8,
        finalfunc = int8_avg,
        combinefunc = pg_catalog.int4_avg_combine,
        initcond = '{0,0}',
        aggpartialfunc = udf_avg_p_int4
);
CREATE AGGREGATE

# CREATE AGGREGATE udf_sum(int4) (
        sfunc = int4_avg_accum,
        stype = _int8,
        finalfunc = int8_avg,
        combinefunc = pg_catalog.int4_avg_combine,
        initcond = '{0,0}',
        aggpartialfunc = public.udf_avg_p_int4
);
ERROR:  aggpartialfnName is invalid

It seems that assumption about aggpartialfnName - that it's a 
non-qualified name is incorrect. And if we use qualified names, we can't 
compare just names, likely we should compare oids.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

06 июня 2023 г., 15:31:20

Hi Mr.Pyhalov.

Thank you for comments.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Tuesday, June 6, 2023 1:19 PM
> >> Have found one issue -
> >>
> >> src/backend/catalog/pg_aggregate.c
> >>
> >> 585                 if(strcmp(strVal(linitial(aggpartialfnName)),
> >> aggName) == 0){
> >> 586                         if(((aggTransType != INTERNALOID) &&
> >> (finalfn != InvalidOid))
> >> 587                                         || ((aggTransType ==
> >> INTERNALOID) && (finalfn != serialfn)))
> >> 588                                 elog(ERROR, "%s is not its own
> >> aggpartialfunc", aggName);
> >> 589                 } else {
> >>
> >> Here string comparison of aggName and aggpartialfnName looks very
> >> suspicios - it seems you should compare oids, not names (in this
> >> case, likely oids of transition function and partial aggregation function).
> >> The fact that aggregate name matches partial aggregation function
> >> name is not a enough to make any decisions.
> >
> > I see no problem with this string comparison. Here is the reason.
> > The intent of this code is, to determine whether the user specifies
> > the new aggregate function whose aggpartialfunc is itself.
> > For two aggregate functions,
> > If the argument list and function name match, then the two aggregate
> > functions must match.
> > By definition of aggpartialfunc,
> > every aggregate function and its aggpartialfn must have the same
> > argument list.
> > Thus, if aggpartialfnName and aggName are equal as strings, I think it
> > is correct to determine that the user is specifying the new aggregate
> > function whose aggpartialfunc is itself.
> >
> > However, since the document does not state these intentions I think
> > your suspicions are valid.
> > Therefore, I have added a specification to the document reflecting the
> > above intentions.
> >
> 
> Hi. Let me explain.
> 
> Look at this example, taken from test.
> 
> CREATE AGGREGATE udf_avg_p_int4(int4) (
>         sfunc = int4_avg_accum,
>         stype = _int8,
>         combinefunc = int4_avg_combine,
>         initcond = '{0,0}'
> );
> CREATE AGGREGATE udf_sum(int4) (
>         sfunc = int4_avg_accum,
>         stype = _int8,
>         finalfunc = int8_avg,
>         combinefunc = int4_avg_combine,
>         initcond = '{0,0}',
>         aggpartialfunc = udf_avg_p_int4
> );
> 
> Now, let's create another aggregate.
> 
> # create schema test ;
> create aggregate test.udf_avg_p_int4(int4) (
>         sfunc = int4_avg_accum,
>         stype = _int8,
>         finalfunc = int8_avg,
>         combinefunc = int4_avg_combine,
>         initcond = '{0,0}',
>         aggpartialfunc = udf_avg_p_int4
> );
> ERROR:  udf_avg_p_int4 is not its own aggpartialfunc
> 
> What's the difference between test.udf_avg_p_int4(int4) aggregate and
> udf_sum(int4)? They are essentially the same, but second one can't be
> defined.
> 
> Also note difference:
> 
> # CREATE AGGREGATE udf_sum(int4) (
>         sfunc = int4_avg_accum,
>         stype = _int8,
>         finalfunc = int8_avg,
>         combinefunc = pg_catalog.int4_avg_combine,
>         initcond = '{0,0}',
>         aggpartialfunc = udf_avg_p_int4
> );
> CREATE AGGREGATE
> 
> # CREATE AGGREGATE udf_sum(int4) (
>         sfunc = int4_avg_accum,
>         stype = _int8,
>         finalfunc = int8_avg,
>         combinefunc = pg_catalog.int4_avg_combine,
>         initcond = '{0,0}',
>         aggpartialfunc = public.udf_avg_p_int4 );
> ERROR:  aggpartialfnName is invalid
> 
> It seems that assumption about aggpartialfnName - that it's a non-qualified
> name is incorrect. And if we use qualified names, we can't compare just names,
> likely we should compare oids.

Thanks for the explanation.
I understand that the method of comparing two function name strings is incorrect.
Instead, I added the parameter isaggpartialfunc indicating whether the aggregate
function and its aggpartialfunc are the same or different.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v20.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

07 июня 2023 г., 12:47:01

Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2023-06-06 15:31:
> Thanks for the explanation.
> I understand that the method of comparing two function name strings is
> incorrect.
> Instead, I added the parameter isaggpartialfunc indicating whether the 
> aggregate
> function and its aggpartialfunc are the same or different.

Hi.

This seems to be more robust, but the interface became more strange.
I'm not sure what to do with it. Some ideas I had to avoid introducing 
this parameter. Not sure I like any of them.

1) You can use QualifiedNameGetCreationNamespace() for aggpartialfnName 
and still compare namespace and function name  for it and  aggName, 
aggNamespace.
Seems to be not ideal, but avoids introducing new parameters.

2) You can lookup for partial aggregate function after ProcedureCreate() 
in AggregateCreate(), if it wasn't found at earlier stages. If it is the 
aggregate itself - check it. If it's still not found, error out. Also 
seems to be a bit ugly - you leave uncommitted garbage for vacuum in 
catalogue.


Another issue - the patch misses recording dependency between 
aggpartialfn and aggregate procedure.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

08 июня 2023 г., 02:08:54

Hi Mr.Pyhalov.

Thank you for comments.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Wednesday, June 7, 2023 6:47 PM
> This seems to be more robust, but the interface became more strange.
> I'm not sure what to do with it. Some ideas I had to avoid introducing this
> parameter. Not sure I like any of them.
>
> 1) You can use QualifiedNameGetCreationNamespace() for aggpartialfnName
> and still compare namespace and function name  for it and  aggName,
> aggNamespace.
> Seems to be not ideal, but avoids introducing new parameters.
>
> 2) You can lookup for partial aggregate function after ProcedureCreate() in
> AggregateCreate(), if it wasn't found at earlier stages. If it is the aggregate itself
> - check it. If it's still not found, error out. Also seems to be a bit ugly - you leave
> uncommitted garbage for vacuum in catalogue.
Thank you for suggesting alternatives.
The disadvantages of alternative 2) appear to be undesirable,
I have modified it according to alternative 1)

> Another issue - the patch misses recording dependency between aggpartialfn
> and aggregate procedure.
I added code to record dependencys between aggpartialfn
and aggregate procedure, similar to the code for functions such as combinefunc.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v21.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

08 июня 2023 г., 10:39:30

Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2023-06-08 02:08:
>> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
>> Sent: Wednesday, June 7, 2023 6:47 PM
>> This seems to be more robust, but the interface became more strange.
>> I'm not sure what to do with it. Some ideas I had to avoid introducing 
>> this
>> parameter. Not sure I like any of them.
>> 
>> 1) You can use QualifiedNameGetCreationNamespace() for 
>> aggpartialfnName
>> and still compare namespace and function name  for it and  aggName,
>> aggNamespace.
>> Seems to be not ideal, but avoids introducing new parameters.
>> 
>> 2) You can lookup for partial aggregate function after 
>> ProcedureCreate() in
>> AggregateCreate(), if it wasn't found at earlier stages. If it is the 
>> aggregate itself
>> - check it. If it's still not found, error out. Also seems to be a bit 
>> ugly - you leave
>> uncommitted garbage for vacuum in catalogue.
> Thank you for suggesting alternatives.
> The disadvantages of alternative 2) appear to be undesirable,
> I have modified it according to alternative 1)
> 
>> Another issue - the patch misses recording dependency between 
>> aggpartialfn
>> and aggregate procedure.
> I added code to record dependencys between aggpartialfn
> and aggregate procedure, similar to the code for functions such as 
> combinefunc.
> 

Hi.

Looks better. The only question I have is should we record dependency 
between procOid and aggpartialfn if aggpartialfn == procOid.

Also it seems new code likely should be run through pgindent.

doc/src/sgml/postgres-fdw.sgml:

+   For <literal>WHERE</literal> clauses,
+   <literal>JOIN</literal> clauses, this sending is active if
+   conditions in <xref 
linkend="postgres-fdw-remote-query-optimization"/>
+   hold and <varname>enable_partitionwise_join</varname> is true(this 
condition
+   is need for only <literal>JOIN</literal> clauses).
+   For aggregate expressions, this sending is active if conditions in

No space between "true" and "(" in "is true(this condition".

Some sentences in documentation, like one starting with
"For aggregate expressions, this sending is active if conditions in..."
seem to be too long, but I'm not the best man to read out documentation.

In "Built-in sharding in PostgreSQL" term "shard" doesn't have a 
definition.

By the way, I'm not sure that "sharding" documentation belongs to this 
patch,
at least it needs a review from native speaker.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

09 июня 2023 г., 14:38:37

Hi Mr.Pyhalov.

Thank you for comments.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Thursday, June 8, 2023 4:40 PM
> Looks better. The only question I have is should we record dependency
> between procOid and aggpartialfn if aggpartialfn == procOid.
Fixed.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Thursday, June 8, 2023 4:40 PM
> Also it seems new code likely should be run through pgindent.
Done.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Thursday, June 8, 2023 4:40 PM
> doc/src/sgml/postgres-fdw.sgml:
> 
> +   For <literal>WHERE</literal> clauses,
> +   <literal>JOIN</literal> clauses, this sending is active if
> +   conditions in <xref
> linkend="postgres-fdw-remote-query-optimization"/>
> +   hold and <varname>enable_partitionwise_join</varname> is true(this
> condition
> +   is need for only <literal>JOIN</literal> clauses).
> +   For aggregate expressions, this sending is active if conditions in
> 
> No space between "true" and "(" in "is true(this condition".
Fixed.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Thursday, June 8, 2023 4:40 PM
> Some sentences in documentation, like one starting with "For aggregate
> expressions, this sending is active if conditions in..."
> seem to be too long, but I'm not the best man to read out documentation.
Fixed.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Thursday, June 8, 2023 4:40 PM
> In "Built-in sharding in PostgreSQL" term "shard" doesn't have a definition.
I have removed the sentence you pointed out.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Thursday, June 8, 2023 4:40 PM
> By the way, I'm not sure that "sharding" documentation belongs to this patch, at
> least it needs a review from native speaker.
I removed general description of sharding.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v22.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

09 июня 2023 г., 17:09:42

Hi.

+       An aggregate function, called the partial aggregate function for 
partial aggregate
+       that corresponding to the aggregate function, is defined on the 
primary node and
+       the <filename>postgres_fdw</filename> node.

Something is clearly wrong here.

+   When using built-in sharding feature in PostgreSQL is used,

And here.

Overall the code looks good to me, but I suppose that documentation 
needs further review from some native speaker.


-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

09 июня 2023 г., 19:44:03

On Tue, Jun  6, 2023 at 03:08:50AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> > I've seen this message. But introduction of new built-in function will break
> > requests to old servers only if this new function is used in the request (this
> > means that query changes). However, this patch changes the behavior of old
> > queries, which worked prior to update. This seems to be different to me.
> 
> You are right.
> However, for now, partial aggregates pushdown is mainly used when using built-in sharding in PostgreSQL.
> I believe when using built-in sharding in PostgreSQL, the version of the primary node server and
> the version of the remote server will usually be the same.
> So I think it would be sufficient to include in the documentation a note about such problem
> and a workaround for them.

I agree that this feature is designed for built-in sharding, but it is
possible people could be using aggregates on partitions backed by
foreign tables without sharding.  Adding a requirement for non-sharding
setups to need PG 17+ servers might be unreasonable.

Looking at previous release note incompatibilities, we don't normally
change non-administrative functions in a way that causes errors if run
on older servers.  Based on Alexander's observations, I wonder if we
need to re-add the postgres_fdw option to control partial aggregate
pushdown, and default it to enabled.

If we ever add more function breakage we might need more postgres_fdw
options.  Fortunately, such changes are rare.

Yuki, thank you for writing and updating this patch, and Alexander,
thank you for helping with this patch.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

12 июня 2023 г., 11:51:30

Hi Mr.Bruce, Mr.Pyhalov, hackers.

Thank you for comments. I will try to respond to both of your comments as follows.
I plan to start revising the patch next week. If you have any comments on the following
respondences, I would appreciate it if you could give them to me this week.

> From: Bruce Momjian <bruce@momjian.us>
> Sent: Saturday, June 10, 2023 1:44 AM
> I agree that this feature is designed for built-in sharding, but it is possible people could be using aggregates on
partitions
> backed by foreign tables without sharding.  Adding a requirement for non-sharding setups to need PG 17+ servers might
> be unreasonable.
Indeed, it is possible to use partial aggregate pushdown feature for purposes other than sharding.
The description of the section "F.38.6. Built-in sharding in PostgreSQL" assumes the use of
Built-in sharding and will be modified to eliminate this assumption.
The title of this section should be changed to something like "Aggregate on partitioned table".

> From: Bruce Momjian <bruce@momjian.us>
> Sent: Saturday, June 10, 2023 1:44 AM
> Looking at previous release note incompatibilities, we don't normally change non-administrative functions in a way
that
> causes errors if run on older servers.  Based on Alexander's observations, I wonder if we need to re-add the
postgres_fdw
> option to control partial aggregate pushdown, and default it to enabled.
>
> If we ever add more function breakage we might need more postgres_fdw options.  Fortunately, such changes are rare.

I understand what the problem is. I will put a mechanism maintaining compatibility into the patch.
I believe there are three approaches.
Approach 1-1 is preferable because it does not require additional options for postgres_fdw.
I will revise the patch according to Approach 1-1, unless otherwise commented.

Approach1:
I ensure that postgres_fdw retrieves the version of each remote server
and does not partial aggregate pushd down if the server version is less than 17.
There are two approaches to obtaining remote server versions.
Approach1-1: postgres_fdw connects a remote server and use PQserverVersion().
Approach1-2: Adding a postgres_fdw option about a remote server version (like "server_version").

Approach2:
Adding a postgres_fdw option for partial aggregate pushdown is enable or not
(like enable_partial_aggregate_pushdown).

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

12 июня 2023 г., 16:37:35

On Mon, Jun 12, 2023 at 08:51:30AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> Hi Mr.Bruce, Mr.Pyhalov, hackers.
> 
> Thank you for comments. I will try to respond to both of your comments as follows.
> I plan to start revising the patch next week. If you have any comments on the following
> respondences, I would appreciate it if you could give them to me this week.
> 
> > From: Bruce Momjian <bruce@momjian.us>
> > Sent: Saturday, June 10, 2023 1:44 AM
> > I agree that this feature is designed for built-in sharding, but it is possible people could be using aggregates on
partitions
> > backed by foreign tables without sharding.  Adding a requirement for non-sharding setups to need PG 17+ servers
might
> > be unreasonable.
> Indeed, it is possible to use partial aggregate pushdown feature for purposes other than sharding.
> The description of the section "F.38.6. Built-in sharding in PostgreSQL" assumes the use of
> Built-in sharding and will be modified to eliminate this assumption.
> The title of this section should be changed to something like "Aggregate on partitioned table".

Sounds good.

> > From: Bruce Momjian <bruce@momjian.us>
> > Sent: Saturday, June 10, 2023 1:44 AM
> > Looking at previous release note incompatibilities, we don't normally change non-administrative functions in a way
that
> > causes errors if run on older servers.  Based on Alexander's observations, I wonder if we need to re-add the
postgres_fdw
> > option to control partial aggregate pushdown, and default it to enabled.
> > 
> > If we ever add more function breakage we might need more postgres_fdw options.  Fortunately, such changes are
rare.
> 
> I understand what the problem is. I will put a mechanism maintaining compatibility into the patch.
> I believe there are three approaches.
> Approach 1-1 is preferable because it does not require additional options for postgres_fdw.
> I will revise the patch according to Approach 1-1, unless otherwise commented.
> 
> Approach1:
> I ensure that postgres_fdw retrieves the version of each remote server
> and does not partial aggregate pushd down if the server version is less than 17.
> There are two approaches to obtaining remote server versions.
> Approach1-1: postgres_fdw connects a remote server and use PQserverVersion().
> Approach1-2: Adding a postgres_fdw option about a remote server version (like "server_version").
> 
> Approach2:
> Adding a postgres_fdw option for partial aggregate pushdown is enable or not
> (like enable_partial_aggregate_pushdown).

These are good questions.  Adding a postgres_fdw option called
enable_partial_aggregate_pushdown helps make the purpose of the option
clear, but remote_version can be used for future breakage as well.

I think remote_version is the best idea, and in the documention for the
option, let's explcitly say it is useful to disable partial aggreates
pushdown on pre-PG 17 servers.  If we need to use the option for other
cases, we can just update the documentation.  When the option is blank,
the default, everything is pushed down.

I see remote_version a logical addition to match our "extensions" option
that controls what extension functions can be pushed down.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

13 июня 2023 г., 05:18:15

Hi Mr.Momjian.

Thank you for advises.

> From: Bruce Momjian <bruce@momjian.us>
> Sent: Monday, June 12, 2023 10:38 PM
> > I understand what the problem is. I will put a mechanism maintaining compatibility into the patch.
> > I believe there are three approaches.
> > Approach 1-1 is preferable because it does not require additional options for postgres_fdw.
> > I will revise the patch according to Approach 1-1, unless otherwise commented.
> >
> > Approach1:
> > I ensure that postgres_fdw retrieves the version of each remote server
> > and does not partial aggregate pushd down if the server version is less than 17.
> > There are two approaches to obtaining remote server versions.
> > Approach1-1: postgres_fdw connects a remote server and use PQserverVersion().
> > Approach1-2: Adding a postgres_fdw option about a remote server version (like "server_version").
> >
> > Approach2:
> > Adding a postgres_fdw option for partial aggregate pushdown is enable
> > or not (like enable_partial_aggregate_pushdown).
>
> These are good questions.  Adding a postgres_fdw option called enable_partial_aggregate_pushdown helps make the
> purpose of the option clear, but remote_version can be used for future breakage as well.
>
> I think remote_version is the best idea, and in the documention for the option, let's explcitly say it is useful to
disable
> partial aggreates pushdown on pre-PG 17 servers.  If we need to use the option for other cases, we can just update
the
> documentation.  When the option is blank, the default, everything is pushed down.
>
> I see remote_version a logical addition to match our "extensions" option that controls what extension functions can
be
> pushed down.

Thank you for your perspective.
So, of the approaches I have presented, you think that approach 1-2 is
preferable and that the option name remote_server is preferable?
Indeed, the option of a remote version may have other uses.
However, this information can be obtained by connecting to a remote server,
I'm concerned that some people may find this option redundant.

Is the problem with approach 1-1 because the user cannot decide whether to include the compatibility check in the
decisionto do partial aggregate pushdown or not? 
# If Approach 1-1 is taken, the problem is that this feature cannot be used for all buit-in aggregate functions
# when the remote server is older than PG17.
If so, Approache1-3 below seem more desirable.
Would it be possible for us to hear your thoughts?

Approache1-3：We add a postgres_fdw option about a compatibility check for partial aggregate pushdown
(like "enable_aggpartialfunc_compatibility_check"). This option is false, the default.
When this option is true, postgres_fdw obtains the remote server version by connecting the remote server and using
PQserverVersion(). 
And if the remote server version is older than PG17, then the partial aggregate pushdown feature is enable for all
buit-inaggregate functions. 
Otherwise the partial aggregate pushdown feature is disable for all buit-in aggregate functions.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

20 июня 2023 г., 03:42:45

On Tue, Jun 13, 2023 at 02:18:15AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> Hi Mr.Momjian.
> 
> Thank you for advises.
> 
> > From: Bruce Momjian <bruce@momjian.us>
> > Sent: Monday, June 12, 2023 10:38 PM
> > > I understand what the problem is. I will put a mechanism maintaining compatibility into the patch.
> > > I believe there are three approaches.
> > > Approach 1-1 is preferable because it does not require additional options for postgres_fdw.
> > > I will revise the patch according to Approach 1-1, unless otherwise commented.
> > >
> > > Approach1:
> > > I ensure that postgres_fdw retrieves the version of each remote server
> > > and does not partial aggregate pushd down if the server version is less than 17.
> > > There are two approaches to obtaining remote server versions.
> > > Approach1-1: postgres_fdw connects a remote server and use PQserverVersion().
> > > Approach1-2: Adding a postgres_fdw option about a remote server version (like "server_version").
> > >
> > > Approach2:
> > > Adding a postgres_fdw option for partial aggregate pushdown is enable
> > > or not (like enable_partial_aggregate_pushdown).
> > 
> > These are good questions.  Adding a postgres_fdw option called enable_partial_aggregate_pushdown helps make the
> > purpose of the option clear, but remote_version can be used for future breakage as well.
> > 
> > I think remote_version is the best idea, and in the documentation for the option, let's explicitly say it is useful
todisable
 
> > partial aggregates pushdown on pre-PG 17 servers.  If we need to use the option for other cases, we can just update
the
> > documentation.  When the option is blank, the default, everything is pushed down.
> > 
> > I see remote_version a logical addition to match our "extensions" option that controls what extension functions can
be
> > pushed down.
> 
> Thank you for your perspective.
> So, of the approaches I have presented, you think that approach 1-2 is
> preferable and that the option name remote_server is preferable?
> Indeed, the option of a remote version may have other uses.
> However, this information can be obtained by connecting to a remote server, 
> I'm concerned that some people may find this option redundant.
> 
> Is the problem with approach 1-1 because the user cannot decide whether to include the compatibility check in the
decisionto do partial aggregate pushdown or not?
 
> # If Approach 1-1 is taken, the problem is that this feature cannot be used for all bait-in aggregate functions
> # when the remote server is older than PG17.
> If so, Approache1-3 below seem more desirable.
> Would it be possible for us to hear your thoughts?
> 
> Approache1-3：We add a postgres_fdw option about a compatibility check for partial aggregate pushdown
> (like "enable_aggpartialfunc_compatibility_check"). This option is false, the default.
> When this option is true, postgres_fdw obtains the remote server version by connecting the remote server and using
PQserverVersion().
 
> And if the remote server version is older than PG17, then the partial aggregate pushdown feature is enable for all
buit-inaggregate functions.
 
> Otherwise the partial aggregate pushdown feature is disable for all buit-in aggregate functions.

Apologies for the delay in my reply to this email.  I looked into the
existing code and I found three things:

1)  PQserverVersion() just pulls the conn->sversion value from the
existing connection because pqSaveParameterStatus() pulls the
server_version sent by the backend;  no need to issue SELECT version().

2)  postgres_fdw already has nine calls to GetConnection(), and only
opens a connection if it already doesn't have one.  Here is an example:

    /* Get the remote estimate */
    conn = GetConnection(fpinfo->user, false, NULL);
    get_remote_estimate(sql.data, conn, &rows, &width,
                &startup_cost, &total_cost);
    ReleaseConnection(conn);

Therefore, it seems like it would be near-zero cost to just call conn =
GetConnection() and then PQserverVersion(conn), and ReleaseConnection().
You can then use the return value of PQserverVersion() to determine if
you can push down partial aggregates.

3)  Looking at postgresAcquireSampleRowsFunc(), I see this exact method
used:

    conn = GetConnection(user, false, NULL);

    /* We'll need server version, so fetch it now. */
    server_version_num = PQserverVersion(conn);

    ...

    if ((server_version_num < 95000) &&
    (method == ANALYZE_SAMPLE_SYSTEM ||
     method == ANALYZE_SAMPLE_BERNOULLI))
    ereport(ERROR,
        (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
         errmsg("remote server does not support TABLESAMPLE feature")));

I am sorry if you already knew all this, but I didn't.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

20 июня 2023 г., 09:59:11

Bruce Momjian писал 2023-06-20 03:42:
> Apologies for the delay in my reply to this email.  I looked into the
> existing code and I found three things:
> 
> 1)  PQserverVersion() just pulls the conn->sversion value from the
> existing connection because pqSaveParameterStatus() pulls the
> server_version sent by the backend;  no need to issue SELECT version().
> 
> 2)  postgres_fdw already has nine calls to GetConnection(), and only
> opens a connection if it already doesn't have one.  Here is an example:
> 
>     /* Get the remote estimate */
>     conn = GetConnection(fpinfo->user, false, NULL);
>     get_remote_estimate(sql.data, conn, &rows, &width,
>                 &startup_cost, &total_cost);
>     ReleaseConnection(conn);
> 
> Therefore, it seems like it would be near-zero cost to just call conn =
> GetConnection() and then PQserverVersion(conn), and 
> ReleaseConnection().
> You can then use the return value of PQserverVersion() to determine if
> you can push down partial aggregates.
> 

Hi.
Currently we don't get remote connection while planning if 
use_remote_estimate is not set.
Such change would require to get remote connection in planner, not in 
executor.
This can lead to change of behavior (like errors in explain when user 
mapping is wrong - e.g. bad password is specified).
Also this potentially can lead to establishing connections even when 
plan node is not actually used
(like extreme example - select sum(score) from t limit 0).
I'm not saying we shouldn't do it - just hint at possible consequences.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

21 июня 2023 г., 18:43:57

On Tue, Jun 20, 2023 at 09:59:11AM +0300, Alexander Pyhalov wrote:
> > Therefore, it seems like it would be near-zero cost to just call conn =
> > GetConnection() and then PQserverVersion(conn), and ReleaseConnection().
> > You can then use the return value of PQserverVersion() to determine if
> > you can push down partial aggregates.
> 
> Hi.
> Currently we don't get remote connection while planning if
> use_remote_estimate is not set.
> Such change would require to get remote connection in planner, not in
> executor.
> This can lead to change of behavior (like errors in explain when user
> mapping is wrong - e.g. bad password is specified).
> Also this potentially can lead to establishing connections even when plan
> node is not actually used
> (like extreme example - select sum(score) from t limit 0).
> I'm not saying we shouldn't do it - just hint at possible consequences.

Agreed.  I noticed it was doing FDW connections during optimization, but
didn't see the postgres_fdw option that would turn it off. 
Interestingly, it is disabled by default.

After considering the options, I think we should have a postgres_fdw
option called "planner_version_check", and default that false.  When
false, a remote server version check will not be performed during
planning and partial aggregates will be always be considered.  When
true, a version check will be performed during planning and partial
aggregate pushdown disabled for pre-PG 17 foreign servers during the
query.

If we want to be more specific, we can call it
"check_partial_aggregate_support".

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

22 июня 2023 г., 08:23:33

Hi Mr.Momjian, Mr.Pyhalov, hackers.

> From: Bruce Momjian <bruce@momjian.us>
> Sent: Thursday, June 22, 2023 12:44 AM
> On Tue, Jun 20, 2023 at 09:59:11AM +0300, Alexander Pyhalov wrote:
> > > Therefore, it seems like it would be near-zero cost to just call
> > > conn =
> > > GetConnection() and then PQserverVersion(conn), and ReleaseConnection().
> > > You can then use the return value of PQserverVersion() to determine
> > > if you can push down partial aggregates.
> >
> > Hi.
> > Currently we don't get remote connection while planning if
> > use_remote_estimate is not set.
> > Such change would require to get remote connection in planner, not in
> > executor.
> > This can lead to change of behavior (like errors in explain when user
> > mapping is wrong - e.g. bad password is specified).
> > Also this potentially can lead to establishing connections even when
> > plan node is not actually used (like extreme example - select
> > sum(score) from t limit 0).
> > I'm not saying we shouldn't do it - just hint at possible consequences.
>
> Agreed.  I noticed it was doing FDW connections during optimization, but didn't see the postgres_fdw option that
would
> turn it off.
> Interestingly, it is disabled by default.
>
> After considering the options, I think we should have a postgres_fdw option called "planner_version_check", and
default
> that false.  When false, a remote server version check will not be performed during planning and partial aggregates
willbe 
> always be considered.  When true, a version check will be performed during planning and partial aggregate pushdown
> disabled for pre-PG 17 foreign servers during the query.
>
> If we want to be more specific, we can call it "check_partial_aggregate_support".
Thank you both for your advice.
We will address the compatibility issues as follows.

Approach1-3:
I will add a postgres_fdw option "check_partial_aggregate_support".
This option is false, default.
Only if this option is true, postgres_fdw connect to the remote server and get the version of the remote server.
And if the version of the remote server is less than PG17, then partial aggregate push down to the remote server is
disable.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

> -----Original Message-----
> From: Bruce Momjian <bruce@momjian.us>
> Sent: Thursday, June 22, 2023 12:44 AM
> To: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Cc: Fujii Yuki/藤井 雄規(MELCO/情報総研 ＤＭ最適Ｇ) <Fujii.Yuki@df.MitsubishiElectric.co.jp>;
> PostgreSQL-development <pgsql-hackers@postgresql.org>; Andres Freund <andres@anarazel.de>; Tom Lane
> <tgl@sss.pgh.pa.us>; Tomas Vondra <tomas.vondra@enterprisedb.com>; Julien Rouhaud <rjuju123@gmail.com>;
> Daniel Gustafsson <daniel@yesql.se>; Ilya Gladyshev <i.gladyshev@postgrespro.ru>
> Subject: Re: Partial aggregates pushdown
>
> On Tue, Jun 20, 2023 at 09:59:11AM +0300, Alexander Pyhalov wrote:
> > > Therefore, it seems like it would be near-zero cost to just call
> > > conn =
> > > GetConnection() and then PQserverVersion(conn), and ReleaseConnection().
> > > You can then use the return value of PQserverVersion() to determine
> > > if you can push down partial aggregates.
> >
> > Hi.
> > Currently we don't get remote connection while planning if
> > use_remote_estimate is not set.
> > Such change would require to get remote connection in planner, not in
> > executor.
> > This can lead to change of behavior (like errors in explain when user
> > mapping is wrong - e.g. bad password is specified).
> > Also this potentially can lead to establishing connections even when
> > plan node is not actually used (like extreme example - select
> > sum(score) from t limit 0).
> > I'm not saying we shouldn't do it - just hint at possible consequences.
>
> Agreed.  I noticed it was doing FDW connections during optimization, but didn't see the postgres_fdw option that
would
> turn it off.
> Interestingly, it is disabled by default.
>
> After considering the options, I think we should have a postgres_fdw option called "planner_version_check", and
default
> that false.  When false, a remote server version check will not be performed during planning and partial aggregates
willbe 
> always be considered.  When true, a version check will be performed during planning and partial aggregate pushdown
> disabled for pre-PG 17 foreign servers during the query.
>
> If we want to be more specific, we can call it "check_partial_aggregate_support".
>
> --
>   Bruce Momjian  <bruce@momjian.us>        https://momjian.us
>   EDB                                      https://enterprisedb.com
>
>   Only you can decide what is important to you.

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

22 июня 2023 г., 14:39:14

On Thu, Jun 22, 2023 at 05:23:33AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> Approach1-3:
> I will add a postgres_fdw option "check_partial_aggregate_support".
> This option is false, default.
> Only if this option is true, postgres_fdw connect to the remote server and get the version of the remote server.
> And if the version of the remote server is less than PG17, then partial aggregate push down to the remote server is
disable.

Great!

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

10 июля 2023 г., 10:35:27

Hi Mr.Bruce, Mr.Pyhalov, hackers.

> From: Bruce Momjian <bruce@momjian.us>
> Sent: Monday, June 12, 2023 10:38 PM
>
> On Mon, Jun 12, 2023 at 08:51:30AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> > Hi Mr.Bruce, Mr.Pyhalov, hackers.
> >
> > Thank you for comments. I will try to respond to both of your comments as follows.
> > I plan to start revising the patch next week. If you have any comments
> > on the following respondences, I would appreciate it if you could give them to me this week.
> >
> > > From: Bruce Momjian <bruce@momjian.us>
> > > Sent: Saturday, June 10, 2023 1:44 AM I agree that this feature is
> > > designed for built-in sharding, but it is possible people could be
> > > using aggregates on partitions backed by foreign tables without
> > > sharding.  Adding a requirement for non-sharding setups to need PG 17+ servers might be unreasonable.
> > Indeed, it is possible to use partial aggregate pushdown feature for purposes other than sharding.
> > The description of the section "F.38.6. Built-in sharding in
> > PostgreSQL" assumes the use of Built-in sharding and will be modified to eliminate this assumption.
> > The title of this section should be changed to something like "Aggregate on partitioned table".
>
> Sounds good.
I have modified documents according to the above policy.

> From: Bruce Momjian <bruce@momjian.us>
> Sent: Thursday, June 22, 2023 8:39 PM
> On Thu, Jun 22, 2023 at 05:23:33AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> > Approach1-3:
> > I will add a postgres_fdw option "check_partial_aggregate_support".
> > This option is false, default.
> > Only if this option is true, postgres_fdw connect to the remote server and get the version of the remote server.
> > And if the version of the remote server is less than PG17, then partial aggregate push down to the remote server is
> disable.
>
> Great!
I have modified the program except for the point "if the version of the remote server is less than PG17".
Instead, we have addressed the following.
"If check_partial_aggregate_support is true and the remote server version is older than the local server
version, postgres_fdw does not assume that the partial aggregate function is on the remote server unless
the partial aggregate function and the aggregate function match."
The reason for this is to maintain compatibility with any aggregate function that does not support partial
aggregate in one version of V1 (V1 is PG17 or higher), even if the next version supports partial aggregate.
For example, string_agg does not support partial aggregation in PG15, but it will support partial aggregation
in PG16.

We have not been able to add a test for the case where the remote server version is older than the
local server version to the regression test. Is there any way to add such tests to the existing regression
tests?

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v23.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

14 июля 2023 г., 16:40:16

Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2023-07-10 10:35:
> I have modified the program except for the point "if the version of
> the remote server is less than PG17".
> Instead, we have addressed the following.
> "If check_partial_aggregate_support is true and the remote server
> version is older than the local server
> version, postgres_fdw does not assume that the partial aggregate
> function is on the remote server unless
> the partial aggregate function and the aggregate function match."
> The reason for this is to maintain compatibility with any aggregate
> function that does not support partial
> aggregate in one version of V1 (V1 is PG17 or higher), even if the
> next version supports partial aggregate.
> For example, string_agg does not support partial aggregation in PG15,
> but it will support partial aggregation
> in PG16.
> 

Hi.

1) In foreign_join_ok() should we set fpinfo->user if 
fpinfo->check_partial_aggregate_support is set like it's done for 
fpinfo->use_remote_estimate? It seems we can end up with fpinfo->user = 
NULL if use_remote_estimate is not set.

2) It seeems we found an additional issue with original patch, which is 
present in current one. I'm attaching a patch which seems to fix it, but 
I'm not quite sure in it.


> We have not been able to add a test for the case where the remote
> server version is older than the
> local server version to the regression test. Is there any way to add
> such tests to the existing regression
> tests?
> 

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Вложения

0001-For-partial-aggregation-we-can-t-rely-on-the-fact-th.diff

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

18 июля 2023 г., 04:35:53

Hi Mr.Pyhalov.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Friday, July 14, 2023 10:40 PM
> 1) In foreign_join_ok() should we set fpinfo->user if
> fpinfo->check_partial_aggregate_support is set like it's done for 
> fpinfo->use_remote_estimate? It seems we can end up with fpinfo->user 
> fpinfo->=
> NULL if use_remote_estimate is not set.
You are right. I will modify this patch according to your advice.
Thank you for advice.

> 2) It seeems we found an additional issue with original patch, which 
> is present in current one. I'm attaching a patch which seems to fix 
> it, but I'm not quite sure in it.
Thank you for pointing out the issue.
If a query's group-by clause contains variable based expression(not variable)
and the query's select clause contains another expression,
the partial aggregate could be unsafe to push down.

An example of such queries:
SELECT (b/2)::numeric, avg(a), max(a), count(*) FROM pagg_tab GROUP BY b/2

Your patch disables partial aggregate pushdown for such queries.
I'll see if we can modify the patch to safely do a partial aggregate pushdown for such queries as well.
Such a query expects the variable in the select clause expression to be included in the target of the grouped rel
(let see make_partial_grouping_target), 
but the original groupby clause has no reference to this variable,
this seems to be the direct cause(let see foreign_grouping_ok). 
I will examine whether a safe pushdown can be achieved by matching the
groupby clause information referenced by foreign_grouping_ok with the grouped rel target information.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

19 июля 2023 г., 03:43:38

Hi Mr.Pyhalov, hackers.

I have made the following three modifications about this patch.

1)
> <Fujii.Yuki@df.MitsubishiElectric.co.jp>
> Sent: Tuesday, July 18, 2023 10:36 AM
> > From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> > Sent: Friday, July 14, 2023 10:40 PM
> > 1) In foreign_join_ok() should we set fpinfo->user if
> > fpinfo->check_partial_aggregate_support is set like it's done for
> > fpinfo->use_remote_estimate? It seems we can end up with fpinfo->user
> > fpinfo->=
> > NULL if use_remote_estimate is not set.
> You are right. I will modify this patch according to your advice.
> Thank you for advice.
Done.

2)
> <Fujii.Yuki@df.MitsubishiElectric.co.jp>
> Sent: Tuesday, July 18, 2023 10:36 AM
> > 2) It seeems we found an additional issue with original patch, which
> > is present in current one. I'm attaching a patch which seems to fix
> > it, but I'm not quite sure in it.
> Thank you for pointing out the issue.
> If a query's group-by clause contains variable based expression(not variable)
> and the query's select clause contains another expression,
> the partial aggregate could be unsafe to push down.
>
> An example of such queries:
> SELECT (b/2)::numeric, avg(a), max(a), count(*) FROM pagg_tab GROUP BY b/2
>
> Your patch disables partial aggregate pushdown for such queries.
> I'll see if we can modify the patch to safely do a partial aggregate pushdown for such queries as well.
> Such a query expects the variable in the select clause expression to be included in the target of the grouped rel
> (let see make_partial_grouping_target),
> but the original groupby clause has no reference to this variable,
> this seems to be the direct cause(let see foreign_grouping_ok).
> I will examine whether a safe pushdown can be achieved by matching the
> groupby clause information referenced by foreign_grouping_ok with the grouped rel target information.
I modified the patch to safely do a partial aggregate pushdown for such queries as well
 by matching the groupby clause information referenced by foreign_grouping_ok with the grouped rel target information.

3)
I modified the patch to safely do a partial aggregate pushdown for queries which contain having clauses.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v24.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

20 июля 2023 г., 13:23:44

Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2023-07-19 03:43:
> Hi Mr.Pyhalov, hackers.

> 3)
> I modified the patch to safely do a partial aggregate pushdown for
> queries which contain having clauses.
> 

Hi.
Sorry, but I don't see how it could work.
For example, the attached test returns wrong result:

CREATE FUNCTION f() RETURNS INT AS $$
begin
   return 10;
end
$$ LANGUAGE PLPGSQL;

SELECT b, sum(a) FROM pagg_tab GROUP BY b HAVING sum(a) < f() ORDER BY 
1;
  b  | sum
----+-----
   0 |   0
  10 |   0
  20 |   0
  30 |   0
  40 |   0
+(5 rows)

In fact the above query should have returned 0 rows, as

SELECT b, sum(a) FROM pagg_tab GROUP BY b ORDER BY 1;
  b  | sum
----+------
   0 |  600
   1 |  660
   2 |  720
   3 |  780
   4 |  840
   5 |  900
   6 |  960
   7 | 1020
   8 | 1080
   9 | 1140
  10 |  600
  11 |  660
  12 |  720
....
shows no such rows.

Or, on the same data

SELECT b, sum(a) FROM pagg_tab GROUP BY b HAVING sum(a) > 660 ORDER BY 
1;

You'll get 0 rows.

But
SELECT b, sum(a) FROM pagg_tab GROUP BY b;
  b  | sum
----+------
  42 |  720
  29 | 1140
   4 |  840
  34 |  840
  41 |  660
   0 |  600
  40 |  600
gives.

The issue is that you can't calculate "partial" having. You should 
compare full aggregate in filter, but it's not possible on the level of 
one partition.
And you have this in plans

  Finalize GroupAggregate
    Output: pagg_tab.b, avg(pagg_tab.a), max(pagg_tab.a), count(*)
    Group Key: pagg_tab.b
    Filter: (sum(pagg_tab.a) < 700)
    ->  Sort
          Output: pagg_tab.b, (PARTIAL avg(pagg_tab.a)), (PARTIAL 
max(pagg_tab.a)), (PARTIAL count(*)), (PARTIAL sum(pagg_tab.a))
          Sort Key: pagg_tab.b
          ->  Append
                ->  Foreign Scan
                      Output: pagg_tab.b, (PARTIAL avg(pagg_tab.a)), 
(PARTIAL max(pagg_tab.a)), (PARTIAL count(*)), (PARTIAL sum(pagg_tab.a))
                      Filter: ((PARTIAL sum(pagg_tab.a)) < 700)   !!!! 
<--- here we can't compare anything yet, sum is incomplete.
                      Relations: Aggregate on (public.fpagg_tab_p1 
pagg_tab)
                      Remote SQL: SELECT b, avg_p_int4(a), max(a), 
count(*), sum(a) FROM public.pagg_tab_p1 GROUP BY 1
                ->  Foreign Scan
                      Output: pagg_tab_1.b, (PARTIAL avg(pagg_tab_1.a)), 
(PARTIAL max(pagg_tab_1.a)), (PARTIAL count(*)), (PARTIAL 
sum(pagg_tab_1.a))
                      Filter: ((PARTIAL sum(pagg_tab_1.a)) < 700)
                      Relations: Aggregate on (public.fpagg_tab_p2 
pagg_tab_1)
                      Remote SQL: SELECT b, avg_p_int4(a), max(a), 
count(*), sum(a) FROM public.pagg_tab_p2 GROUP BY 1
                ->  Foreign Scan
                      Output: pagg_tab_2.b, (PARTIAL avg(pagg_tab_2.a)), 
(PARTIAL max(pagg_tab_2.a)), (PARTIAL count(*)), (PARTIAL 
sum(pagg_tab_2.a))
                      Filter: ((PARTIAL sum(pagg_tab_2.a)) < 700)
                      Relations: Aggregate on (public.fpagg_tab_p3 
pagg_tab_2)
                      Remote SQL: SELECT b, avg_p_int4(a), max(a), 
count(*), sum(a) FROM public.pagg_tab_p3 GROUP BY 1

In foreign_grouping_ok()
6586                         if (IsA(expr, Aggref))
6587                         {
6588                                 if (partial)
6589                                 {
6590                                         mark_partial_aggref((Aggref 
*) expr, AGGSPLIT_INITIAL_SERIAL);
6591                                         continue;
6592                                 }
6593                                 else if (!is_foreign_expr(root, 
grouped_rel, expr))
6594                                         return false;
6595
6596                                 tlist = add_to_flat_tlist(tlist, 
list_make1(expr));
6597                         }

at least you shouldn't do anything with expr, if is_foreign_expr() 
returned false. If we restrict pushing down queries with havingQuals, 
I'm not quite sure how Aggref can appear in local_conds.

As for changes in planner.c (setGroupClausePartial()) I have several 
questions.

1) Why don't we add non_group_exprs to pathtarget->exprs when 
partial_target->exprs is not set?

2) We replace extra->partial_target->exprs with partial_target->exprs 
after processing. Why are we sure that after this tleSortGroupRef is 
correct?

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Вложения

check.patch

Re: Partial aggregates pushdown

От

"Finnerty, Jim"

Дата:

01 августа 2023 г., 23:25:55

When it is valid to filter based on a HAVING clause predicate, it should already have been converted into a WHERE
clausepredicate, except in the special case of an LIMIT TO .k .. ORDER BY case where the HAVING clause predicate can be
determinedapproximately after having found k fully qualified tuples and then that predicate is successively tightened
asmore qualified records are found.

*that*, by the way, is a very powerful optimization.

    /Jim F

On 7/20/23, 6:24 AM, "Alexander Pyhalov" <a.pyhalov@postgrespro.ru <mailto:a.pyhalov@postgrespro.ru>> wrote:

CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can
confirmthe sender and know the content is safe.

Fujii.Yuki@df.MitsubishiElectric.co.jp <mailto:Fujii.Yuki@df.MitsubishiElectric.co.jp> писал 2023-07-19 03:43:
> Hi Mr.Pyhalov, hackers.

> 3)
> I modified the patch to safely do a partial aggregate pushdown for
> queries which contain having clauses.
>

Hi.
Sorry, but I don't see how it could work.
For example, the attached test returns wrong result:

CREATE FUNCTION f() RETURNS INT AS $$
begin
return 10;
end
$$ LANGUAGE PLPGSQL;

SELECT b, sum(a) FROM pagg_tab GROUP BY b HAVING sum(a) < f() ORDER BY
1;
b | sum
----+-----
0 | 0
10 | 0
20 | 0
30 | 0
40 | 0
+(5 rows)

In fact the above query should have returned 0 rows, as

SELECT b, sum(a) FROM pagg_tab GROUP BY b ORDER BY 1;
b | sum
----+------
0 | 600
1 | 660
2 | 720
3 | 780
4 | 840
5 | 900
6 | 960
7 | 1020
8 | 1080
9 | 1140
10 | 600
11 | 660
12 | 720
....
shows no such rows.

Or, on the same data

SELECT b, sum(a) FROM pagg_tab GROUP BY b HAVING sum(a) > 660 ORDER BY
1;

You'll get 0 rows.

But
SELECT b, sum(a) FROM pagg_tab GROUP BY b;
b | sum
----+------
42 | 720
29 | 1140
4 | 840
34 | 840
41 | 660
0 | 600
40 | 600
gives.

The issue is that you can't calculate "partial" having. You should
compare full aggregate in filter, but it's not possible on the level of
one partition.
And you have this in plans

Finalize GroupAggregate
Output: pagg_tab.b, avg(pagg_tab.a), max(pagg_tab.a), count(*)
Group Key: pagg_tab.b
Filter: (sum(pagg_tab.a) < 700)
-> Sort
Output: pagg_tab.b, (PARTIAL avg(pagg_tab.a)), (PARTIAL
max(pagg_tab.a)), (PARTIAL count(*)), (PARTIAL sum(pagg_tab.a))
Sort Key: pagg_tab.b
-> Append
-> Foreign Scan
Output: pagg_tab.b, (PARTIAL avg(pagg_tab.a)),
(PARTIAL max(pagg_tab.a)), (PARTIAL count(*)), (PARTIAL sum(pagg_tab.a))
Filter: ((PARTIAL sum(pagg_tab.a)) < 700) !!!!
<--- here we can't compare anything yet, sum is incomplete.
Relations: Aggregate on (public.fpagg_tab_p1
pagg_tab)
Remote SQL: SELECT b, avg_p_int4(a), max(a),
count(*), sum(a) FROM public.pagg_tab_p1 GROUP BY 1
-> Foreign Scan
Output: pagg_tab_1.b, (PARTIAL avg(pagg_tab_1.a)),
(PARTIAL max(pagg_tab_1.a)), (PARTIAL count(*)), (PARTIAL
sum(pagg_tab_1.a))
Filter: ((PARTIAL sum(pagg_tab_1.a)) < 700)
Relations: Aggregate on (public.fpagg_tab_p2
pagg_tab_1)
Remote SQL: SELECT b, avg_p_int4(a), max(a),
count(*), sum(a) FROM public.pagg_tab_p2 GROUP BY 1
-> Foreign Scan
Output: pagg_tab_2.b, (PARTIAL avg(pagg_tab_2.a)),
(PARTIAL max(pagg_tab_2.a)), (PARTIAL count(*)), (PARTIAL
sum(pagg_tab_2.a))
Filter: ((PARTIAL sum(pagg_tab_2.a)) < 700)
Relations: Aggregate on (public.fpagg_tab_p3
pagg_tab_2)
Remote SQL: SELECT b, avg_p_int4(a), max(a),
count(*), sum(a) FROM public.pagg_tab_p3 GROUP BY 1

In foreign_grouping_ok()
6586 if (IsA(expr, Aggref))
6587 {
6588 if (partial)
6589 {
6590 mark_partial_aggref((Aggref
*) expr, AGGSPLIT_INITIAL_SERIAL);
6591 continue;
6592 }
6593 else if (!is_foreign_expr(root,
grouped_rel, expr))
6594 return false;
6595
6596 tlist = add_to_flat_tlist(tlist,
list_make1(expr));
6597 }

at least you shouldn't do anything with expr, if is_foreign_expr()
returned false. If we restrict pushing down queries with havingQuals,
I'm not quite sure how Aggref can appear in local_conds.

As for changes in planner.c (setGroupClausePartial()) I have several
questions.

1) Why don't we add non_group_exprs to pathtarget->exprs when
partial_target->exprs is not set?

2) We replace extra->partial_target->exprs with partial_target->exprs
after processing. Why are we sure that after this tleSortGroupRef is
correct?

--
Best regards,
Alexander Pyhalov,
Postgres Professional

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

07 августа 2023 г., 21:30:54

On Mon, Jul 10, 2023 at 07:35:27AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> > > I will add a postgres_fdw option "check_partial_aggregate_support".
> > > This option is false, default.
> > > Only if this option is true, postgres_fdw connect to the remote server and get the version of the remote server.
> > > And if the version of the remote server is less than PG17, then partial aggregate push down to the remote server
is
> > disable.
> > 
> > Great!
> I have modified the program except for the point "if the version of the remote server is less than PG17".
> Instead, we have addressed the following.
> "If check_partial_aggregate_support is true and the remote server version is older than the local server
> version, postgres_fdw does not assume that the partial aggregate function is on the remote server unless
> the partial aggregate function and the aggregate function match."
> The reason for this is to maintain compatibility with any aggregate function that does not support partial
> aggregate in one version of V1 (V1 is PG17 or higher), even if the next version supports partial aggregate.
> For example, string_agg does not support partial aggregation in PG15, but it will support partial aggregation
> in PG16.

Just to clarify, I think you are saying:

    If check_partial_aggregate_support is true and the remote server
    version is older than the local server version, postgres_fdw
    checks if the partial aggregate function exists on the remote
    server during planning and only uses it if it does.

I tried to phrase it in a positive way, and mentioned the plan time
distinction.  Also, I am sorry I was away for most of July and am just
getting to this.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

25 сентября 2023 г., 06:18:13

Hi Mr.Bruce, Mr.Pyhalov, Mr.Finnerty, hackers.

Thank you for your valuable comments. I sincerely apologize for the very late reply.
Here is a response to your comments or a fix to the patch.

Tuesday, August 8, 2023 at 3:31 Bruce Momjian
> > I have modified the program except for the point "if the version of the remote server is less than PG17".
> > Instead, we have addressed the following.
> > "If check_partial_aggregate_support is true and the remote server version is older than the local server
> > version, postgres_fdw does not assume that the partial aggregate function is on the remote server unless
> > the partial aggregate function and the aggregate function match."
> > The reason for this is to maintain compatibility with any aggregate function that does not support partial
> > aggregate in one version of V1 (V1 is PG17 or higher), even if the next version supports partial aggregate.
> > For example, string_agg does not support partial aggregation in PG15, but it will support partial aggregation
> > in PG16.
>
> Just to clarify, I think you are saying:
>
>         If check_partial_aggregate_support is true and the remote server
>         version is older than the local server version, postgres_fdw
>         checks if the partial aggregate function exists on the remote
>         server during planning and only uses it if it does.
>
> I tried to phrase it in a positive way, and mentioned the plan time
> distinction.  Also, I am sorry I was away for most of July and am just
> getting to this.
Thanks for your comment. In the documentation, the description of check_partial_aggregate_support is as follows
(please see postgres-fdw.sgml).
--
check_partial_aggregate_support (boolean)
Only if this option is true, during query planning, postgres_fdw connects to the remote server and check if the remote
serverversion is older than the local server version. If so, postgres_fdw assumes that for each built-in aggregate
function,the partial aggregate function is not defined on the remote server unless the partial aggregate function and
theaggregate function match. The default is false. 
--

Thursday, 20 July 2023 19:23 Alexander Pyhalov <a.pyhalov@postgrespro.ru>:
> Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2023-07-19 03:43:
> > Hi Mr.Pyhalov, hackers.
>
> > 3)
> > I modified the patch to safely do a partial aggregate pushdown for
> > queries which contain having clauses.
> >
>
> Hi.
> Sorry, but I don't see how it could work.
We apologize for any inconvenience caused.
Thanks to Pyhalov's and Jim's comments, I have realized that I have made a fundamental mistake regarding the pushdown
ofthe HAVING clause and the difficulty of achieving it performing Partial aggregate pushdown. 
So, I removed the codes about pushdown of the HAVING clause performing Partial aggregate pushdown.

Thursday, 20 July 2023 19:23 Alexander Pyhalov <a.pyhalov@postgrespro.ru>:
> As for changes in planner.c (setGroupClausePartial()) I have several
> questions.
>
> 1) Why don't we add non_group_exprs to pathtarget->exprs when
> partial_target->exprs is not set?
>
> 2) We replace extra->partial_target->exprs with partial_target->exprs
> after processing. Why are we sure that after this tleSortGroupRef is
> correct?
Response to 1)
The code you pointed out was unnecessary. I have removed this code.
Also, the process of adding PlaceHolderVar's expr to partial_target was missing.
So I fixed this.

Response to 2)
The making procedures extra->groupClausePartial and extra->partial_target
in make_partial_grouping_target for this patch is as follows.
STEP1. From grouping_target->exprs, extract Aggref, Var and Placeholdervar that are not included in Aggref.
STEP2. setGroupClausePartial sets the copy of original groupClause to extra->groupClausePartial
and sets the copy of original partial_target to extra->partial_target.
STEP3. setGroupClausePartial adds Var and Placeholdervar in STEP1 to partial_target.
The sortgroupref of partial_target->sortgrouprefs to be added to value is set to
(the maximum value of the existing sortgroupref) + 1.
setGroupClausePartial adds data sgc of sortgroupclause type where sgc->tlesortgroupref
matches the sortgroupref to GroupClause.
STEP4. add_new_columns_to_pathtarget adds STEP1's Aggref to partial_target.

Due to STEP2, the list of tlesortgrouprefs set in extra->groupClausePartial is not duplicated.
Also, sortgrouprefs added to extra->partial_target matches with corresponding
tlesortgrouprefs added to extra->groupClausePartial.
So these tlesortgrouprefs are correct.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v25.patch

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

26 сентября 2023 г., 01:30:32

On Mon, Sep 25, 2023 at 03:18:13AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> Hi Mr.Bruce, Mr.Pyhalov, Mr.Finnerty, hackers.
> 
> Thank you for your valuable comments. I sincerely apologize for the very late reply.
> Here is a response to your comments or a fix to the patch.
> 
> Tuesday, August 8, 2023 at 3:31 Bruce Momjian
> > > I have modified the program except for the point "if the version of the remote server is less than PG17".
> > > Instead, we have addressed the following.
> > > "If check_partial_aggregate_support is true and the remote server version is older than the local server
> > > version, postgres_fdw does not assume that the partial aggregate function is on the remote server unless
> > > the partial aggregate function and the aggregate function match."
> > > The reason for this is to maintain compatibility with any aggregate function that does not support partial
> > > aggregate in one version of V1 (V1 is PG17 or higher), even if the next version supports partial aggregate.
> > > For example, string_agg does not support partial aggregation in PG15, but it will support partial aggregation
> > > in PG16.
> >
> > Just to clarify, I think you are saying:
> >
> >         If check_partial_aggregate_support is true and the remote server
> >         version is older than the local server version, postgres_fdw
> >         checks if the partial aggregate function exists on the remote
> >         server during planning and only uses it if it does.
> >
> > I tried to phrase it in a positive way, and mentioned the plan time
> > distinction.  Also, I am sorry I was away for most of July and am just
> > getting to this.
> Thanks for your comment. In the documentation, the description of check_partial_aggregate_support is as follows
> (please see postgres-fdw.sgml).
> --
> check_partial_aggregate_support (boolean)
> Only if this option is true, during query planning, postgres_fdw connects to the remote server and check if the
remoteserver version is older than the local server version. If so, postgres_fdw assumes that for each built-in
aggregatefunction, the partial aggregate function is not defined on the remote server unless the partial aggregate
functionand the aggregate function match. The default is false.
 
> --

My point is that there are three behaviors:

*  false - no check
*  true, remote version >= sender - no check
*  true, remove version < sender - check

Here is your code:

    + * Check that a buit-in aggpartialfunc exists on the remote server. If
    + * check_partial_aggregate_support is false, we assume the partial aggregate
    + * function exsits on the remote server. Otherwise we assume the partial
    + * aggregate function exsits on the remote server only if the remote server
    + * version is not less than the local server version.
    + */
    +static bool
    +is_builtin_aggpartialfunc_shippable(Oid aggpartialfn, PgFdwRelationInfo *fpinfo)
    +{
    +       bool            shippable = true;
    +
    +       if (fpinfo->check_partial_aggregate_support)
    +       {
    +               if (fpinfo->remoteversion == 0)
    +               {
    +                       PGconn     *conn = GetConnection(fpinfo->user, false, NULL);
    +
    +                       fpinfo->remoteversion = PQserverVersion(conn);
    +               }
    +               if (fpinfo->remoteversion < PG_VERSION_NUM)
    +                       shippable = false;
    +       }
    +       return shippable;
    +}

I think this needs to be explained in the docs.  I am ready to adjust
the patch to improve the wording whenever you are ready.  Should I do it
now and post an updated version for you to use?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

26 сентября 2023 г., 09:26:25

Hi Mr.Bruce.

Tuesday, September 26, 2023 7:31 Bruce Momjian
> On Mon, Sep 25, 2023 at 03:18:13AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> > Hi Mr.Bruce, Mr.Pyhalov, Mr.Finnerty, hackers.
> >
> > Thank you for your valuable comments. I sincerely apologize for the very late reply.
> > Here is a response to your comments or a fix to the patch.
> >
> > Tuesday, August 8, 2023 at 3:31 Bruce Momjian
> > > > I have modified the program except for the point "if the version of the remote server is less than PG17".
> > > > Instead, we have addressed the following.
> > > > "If check_partial_aggregate_support is true and the remote server
> > > > version is older than the local server version, postgres_fdw does
> > > > not assume that the partial aggregate function is on the remote server unless the partial aggregate function
andthe 
> aggregate function match."
> > > > The reason for this is to maintain compatibility with any
> > > > aggregate function that does not support partial aggregate in one version of V1 (V1 is PG17 or higher), even if
the
> next version supports partial aggregate.
> > > > For example, string_agg does not support partial aggregation in
> > > > PG15, but it will support partial aggregation in PG16.
> > >
> > > Just to clarify, I think you are saying:
> > >
> > >         If check_partial_aggregate_support is true and the remote server
> > >         version is older than the local server version, postgres_fdw
> > >         checks if the partial aggregate function exists on the remote
> > >         server during planning and only uses it if it does.
> > >
> > > I tried to phrase it in a positive way, and mentioned the plan time
> > > distinction.  Also, I am sorry I was away for most of July and am
> > > just getting to this.
> > Thanks for your comment. In the documentation, the description of
> > check_partial_aggregate_support is as follows (please see postgres-fdw.sgml).
> > --
> > check_partial_aggregate_support (boolean) Only if this option is true,
> > during query planning, postgres_fdw connects to the remote server and check if the remote server version is older
than
> the local server version. If so, postgres_fdw assumes that for each built-in aggregate function, the partial
aggregate
> function is not defined on the remote server unless the partial aggregate function and the aggregate function match.
The
> default is false.
> > --
>
> My point is that there are three behaviors:
>
> *  false - no check
> *  true, remote version >= sender - no check
> *  true, remove version < sender - check
>
> Here is your code:
>
>     + * Check that a buit-in aggpartialfunc exists on the remote server. If
>     + * check_partial_aggregate_support is false, we assume the partial aggregate
>     + * function exsits on the remote server. Otherwise we assume the partial
>     + * aggregate function exsits on the remote server only if the remote server
>     + * version is not less than the local server version.
>     + */
>     +static bool
>     +is_builtin_aggpartialfunc_shippable(Oid aggpartialfn, PgFdwRelationInfo *fpinfo)
>     +{
>     +       bool            shippable = true;
>     +
>     +       if (fpinfo->check_partial_aggregate_support)
>     +       {
>     +               if (fpinfo->remoteversion == 0)
>     +               {
>     +                       PGconn     *conn = GetConnection(fpinfo->user, false, NULL);
>     +
>     +                       fpinfo->remoteversion = PQserverVersion(conn);
>     +               }
>     +               if (fpinfo->remoteversion < PG_VERSION_NUM)
>     +                       shippable = false;
>     +       }
>     +       return shippable;
>     +}
>
> I think this needs to be explained in the docs.  I am ready to adjust the patch to improve the wording whenever you
are
> ready.  Should I do it now and post an updated version for you to use?
The following explanation was omitted from the documentation, so I added it.
> *  false - no check
> *  true, remove version < sender - check
I have responded to your comment, but if there is a problem with the wording, could you please suggest a correction?

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

> -----Original Message-----
> From: Bruce Momjian <bruce@momjian.us>
> Sent: Tuesday, September 26, 2023 7:31 AM
> To: Fujii Yuki/藤井 雄規(MELCO/情報総研 ＤＭ最適Ｇ) <Fujii.Yuki@df.MitsubishiElectric.co.jp>
> Cc: Alexander Pyhalov <a.pyhalov@postgrespro.ru>; Finnerty, Jim <jfinnert@amazon.com>; PostgreSQL-development
> <pgsql-hackers@postgresql.org>; Andres Freund <andres@anarazel.de>; Tom Lane <tgl@sss.pgh.pa.us>; Tomas
> Vondra <tomas.vondra@enterprisedb.com>; Julien Rouhaud <rjuju123@gmail.com>; Daniel Gustafsson
> <daniel@yesql.se>; Ilya Gladyshev <i.gladyshev@postgrespro.ru>
> Subject: Re: Partial aggregates pushdown
>
> On Mon, Sep 25, 2023 at 03:18:13AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> > Hi Mr.Bruce, Mr.Pyhalov, Mr.Finnerty, hackers.
> >
> > Thank you for your valuable comments. I sincerely apologize for the very late reply.
> > Here is a response to your comments or a fix to the patch.
> >
> > Tuesday, August 8, 2023 at 3:31 Bruce Momjian
> > > > I have modified the program except for the point "if the version of the remote server is less than PG17".
> > > > Instead, we have addressed the following.
> > > > "If check_partial_aggregate_support is true and the remote server
> > > > version is older than the local server version, postgres_fdw does
> > > > not assume that the partial aggregate function is on the remote server unless the partial aggregate function
andthe 
> aggregate function match."
> > > > The reason for this is to maintain compatibility with any
> > > > aggregate function that does not support partial aggregate in one version of V1 (V1 is PG17 or higher), even if
the
> next version supports partial aggregate.
> > > > For example, string_agg does not support partial aggregation in
> > > > PG15, but it will support partial aggregation in PG16.
> > >
> > > Just to clarify, I think you are saying:
> > >
> > >         If check_partial_aggregate_support is true and the remote server
> > >         version is older than the local server version, postgres_fdw
> > >         checks if the partial aggregate function exists on the remote
> > >         server during planning and only uses it if it does.
> > >
> > > I tried to phrase it in a positive way, and mentioned the plan time
> > > distinction.  Also, I am sorry I was away for most of July and am
> > > just getting to this.
> > Thanks for your comment. In the documentation, the description of
> > check_partial_aggregate_support is as follows (please see postgres-fdw.sgml).
> > --
> > check_partial_aggregate_support (boolean) Only if this option is true,
> > during query planning, postgres_fdw connects to the remote server and check if the remote server version is older
than
> the local server version. If so, postgres_fdw assumes that for each built-in aggregate function, the partial
aggregate
> function is not defined on the remote server unless the partial aggregate function and the aggregate function match.
The
> default is false.
> > --
>
> My point is that there are three behaviors:
>
> *  false - no check
> *  true, remote version >= sender - no check
> *  true, remove version < sender - check
>
> Here is your code:
>
>     + * Check that a buit-in aggpartialfunc exists on the remote server. If
>     + * check_partial_aggregate_support is false, we assume the partial aggregate
>     + * function exsits on the remote server. Otherwise we assume the partial
>     + * aggregate function exsits on the remote server only if the remote server
>     + * version is not less than the local server version.
>     + */
>     +static bool
>     +is_builtin_aggpartialfunc_shippable(Oid aggpartialfn, PgFdwRelationInfo *fpinfo)
>     +{
>     +       bool            shippable = true;
>     +
>     +       if (fpinfo->check_partial_aggregate_support)
>     +       {
>     +               if (fpinfo->remoteversion == 0)
>     +               {
>     +                       PGconn     *conn = GetConnection(fpinfo->user, false, NULL);
>     +
>     +                       fpinfo->remoteversion = PQserverVersion(conn);
>     +               }
>     +               if (fpinfo->remoteversion < PG_VERSION_NUM)
>     +                       shippable = false;
>     +       }
>     +       return shippable;
>     +}
>
> I think this needs to be explained in the docs.  I am ready to adjust the patch to improve the wording whenever you
are
> ready.  Should I do it now and post an updated version for you to use?
>
> --
>   Bruce Momjian  <bruce@momjian.us>        https://momjian.us
>   EDB                                      https://enterprisedb.com
>
>   Only you can decide what is important to you.

Вложения

0001-Partial-aggregates-push-down-v26.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

26 сентября 2023 г., 16:15:14

Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2023-09-25 06:18:
> Hi Mr.Bruce, Mr.Pyhalov, Mr.Finnerty, hackers.
> 
> Thank you for your valuable comments. I sincerely apologize for the
> very late reply.
> Here is a response to your comments or a fix to the patch.
> 
> Tuesday, August 8, 2023 at 3:31 Bruce Momjian
>> > I have modified the program except for the point "if the version of the remote server is less than PG17".
>> > Instead, we have addressed the following.
>> > "If check_partial_aggregate_support is true and the remote server version is older than the local server
>> > version, postgres_fdw does not assume that the partial aggregate function is on the remote server unless
>> > the partial aggregate function and the aggregate function match."
>> > The reason for this is to maintain compatibility with any aggregate function that does not support partial
>> > aggregate in one version of V1 (V1 is PG17 or higher), even if the next version supports partial aggregate.
>> > For example, string_agg does not support partial aggregation in PG15, but it will support partial aggregation
>> > in PG16.
>> 
>> Just to clarify, I think you are saying:
>> 
>>         If check_partial_aggregate_support is true and the remote 
>> server
>>         version is older than the local server version, postgres_fdw
>>         checks if the partial aggregate function exists on the remote
>>         server during planning and only uses it if it does.
>> 
>> I tried to phrase it in a positive way, and mentioned the plan time
>> distinction.  Also, I am sorry I was away for most of July and am just
>> getting to this.
> Thanks for your comment. In the documentation, the description of
> check_partial_aggregate_support is as follows
> (please see postgres-fdw.sgml).
> --
> check_partial_aggregate_support (boolean)
> Only if this option is true, during query planning, postgres_fdw
> connects to the remote server and check if the remote server version
> is older than the local server version. If so, postgres_fdw assumes
> that for each built-in aggregate function, the partial aggregate
> function is not defined on the remote server unless the partial
> aggregate function and the aggregate function match. The default is
> false.
> --
> 
> Thursday, 20 July 2023 19:23 Alexander Pyhalov 
> <a.pyhalov@postgrespro.ru>:
>> Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2023-07-19 03:43:
>> > Hi Mr.Pyhalov, hackers.
>> 
>> > 3)
>> > I modified the patch to safely do a partial aggregate pushdown for
>> > queries which contain having clauses.
>> >
>> 
>> Hi.
>> Sorry, but I don't see how it could work.
> We apologize for any inconvenience caused.
> Thanks to Pyhalov's and Jim's comments, I have realized that I have
> made a fundamental mistake regarding the pushdown of the HAVING clause
> and the difficulty of achieving it performing Partial aggregate
> pushdown.
> So, I removed the codes about pushdown of the HAVING clause performing
> Partial aggregate pushdown.
> 
> Thursday, 20 July 2023 19:23 Alexander Pyhalov 
> <a.pyhalov@postgrespro.ru>:
>> As for changes in planner.c (setGroupClausePartial()) I have several
>> questions.
>> 
>> 1) Why don't we add non_group_exprs to pathtarget->exprs when
>> partial_target->exprs is not set?
>> 
>> 2) We replace extra->partial_target->exprs with partial_target->exprs
>> after processing. Why are we sure that after this tleSortGroupRef is
>> correct?
> Response to 1)
> The code you pointed out was unnecessary. I have removed this code.
> Also, the process of adding PlaceHolderVar's expr to partial_target was 
> missing.
> So I fixed this.
> 
> Response to 2)
> The making procedures extra->groupClausePartial and 
> extra->partial_target
> in make_partial_grouping_target for this patch is as follows.
> STEP1. From grouping_target->exprs, extract Aggref, Var and
> Placeholdervar that are not included in Aggref.
> STEP2. setGroupClausePartial sets the copy of original groupClause to
> extra->groupClausePartial
> and sets the copy of original partial_target to extra->partial_target.
> STEP3. setGroupClausePartial adds Var and Placeholdervar in STEP1 to
> partial_target.
> The sortgroupref of partial_target->sortgrouprefs to be added to value 
> is set to
> (the maximum value of the existing sortgroupref) + 1.
> setGroupClausePartial adds data sgc of sortgroupclause type where
> sgc->tlesortgroupref
> matches the sortgroupref to GroupClause.
> STEP4. add_new_columns_to_pathtarget adds STEP1's Aggref to 
> partial_target.
> 
> Due to STEP2, the list of tlesortgrouprefs set in
> extra->groupClausePartial is not duplicated.

Do you mean that extra->partial_target->sortgrouprefs is not replaced, 
and so we preserve tlesortgroupref numbers?
I'm suspicious about rewriting extra->partial_target->exprs with 
partial_target->exprs - I'm still not sure why we
  don't we loose information, added by add_column_to_pathtarget() to 
extra->partial_target->exprs?

Also look at the following example.

EXPLAIN VERBOSE SELECT  count(*) , (b/2)::numeric FROM pagg_tab GROUP BY 
b/2 ORDER BY 1;
                                             QUERY PLAN
---------------------------------------------------------------------------------------------------
  Sort  (cost=511.35..511.47 rows=50 width=44)
    Output: (count(*)), ((((pagg_tab.b / 2)))::numeric), ((pagg_tab.b / 
2))
    Sort Key: (count(*))
    ->  Finalize HashAggregate  (cost=509.06..509.94 rows=50 width=44)
          Output: count(*), (((pagg_tab.b / 2)))::numeric, ((pagg_tab.b / 
2))
          Group Key: ((pagg_tab.b / 2))
          ->  Append  (cost=114.62..506.06 rows=600 width=16)
                ->  Foreign Scan  (cost=114.62..167.69 rows=200 width=16)
                      Output: ((pagg_tab.b / 2)), (PARTIAL count(*)), 
pagg_tab.b
                      Relations: Aggregate on (public.fpagg_tab_p1 
pagg_tab)
                      Remote SQL: SELECT (b / 2), count(*), b FROM 
public.pagg_tab_p1 GROUP BY 1, 2
                ->  Foreign Scan  (cost=114.62..167.69 rows=200 width=16)
                      Output: ((pagg_tab_1.b / 2)), (PARTIAL count(*)), 
pagg_tab_1.b
                      Relations: Aggregate on (public.fpagg_tab_p2 
pagg_tab_1)
                      Remote SQL: SELECT (b / 2), count(*), b FROM 
public.pagg_tab_p2 GROUP BY 1, 2
                ->  Foreign Scan  (cost=114.62..167.69 rows=200 width=16)
                      Output: ((pagg_tab_2.b / 2)), (PARTIAL count(*)), 
pagg_tab_2.b
                      Relations: Aggregate on (public.fpagg_tab_p3 
pagg_tab_2)
                      Remote SQL: SELECT (b / 2), count(*), b FROM 
public.pagg_tab_p3 GROUP BY 1, 2

Note that group by is still deparsed incorrectly.
-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

26 сентября 2023 г., 17:16:30

On Tue, Sep 26, 2023 at 06:26:25AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> Hi Mr.Bruce.
> > I think this needs to be explained in the docs.  I am ready to adjust the patch to improve the wording whenever you
are
> > ready.  Should I do it now and post an updated version for you to use?
> The following explanation was omitted from the documentation, so I added it.
> > *  false - no check
> > *  true, remove version < sender - check
> I have responded to your comment, but if there is a problem with the wording, could you please suggest a correction?

I like your new wording, thanks.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

27 сентября 2023 г., 01:35:28

Hi Mr.Momjian, Mr.Pyhalov.

Tuesday, 26 September 2023 22:15 Alexander Pyhalov <a.pyhalov@postgrespro.ru>:
> Do you mean that extra->partial_target->sortgrouprefs is not replaced,
> and so we preserve tlesortgroupref numbers?
Yes, that is correct.

> I'm suspicious about rewriting extra->partial_target->exprs with
> partial_target->exprs - I'm still not sure why we
>   don't we loose information, added by add_column_to_pathtarget() to
> extra->partial_target->exprs?
>
> Also look at the following example.
>
> EXPLAIN VERBOSE SELECT  count(*) , (b/2)::numeric FROM pagg_tab GROUP BY
> b/2 ORDER BY 1;
>                                              QUERY PLAN
> ---------------------------------------------------------------------------------------------------
>   Sort  (cost=511.35..511.47 rows=50 width=44)
>     Output: (count(*)), ((((pagg_tab.b / 2)))::numeric), ((pagg_tab.b /
> 2))
>     Sort Key: (count(*))
>     ->  Finalize HashAggregate  (cost=509.06..509.94 rows=50 width=44)
>           Output: count(*), (((pagg_tab.b / 2)))::numeric, ((pagg_tab.b /
> 2))
>           Group Key: ((pagg_tab.b / 2))
>           ->  Append  (cost=114.62..506.06 rows=600 width=16)
>                 ->  Foreign Scan  (cost=114.62..167.69 rows=200 width=16)
>                       Output: ((pagg_tab.b / 2)), (PARTIAL count(*)),
> pagg_tab.b
>                       Relations: Aggregate on (public.fpagg_tab_p1
> pagg_tab)
>                       Remote SQL: SELECT (b / 2), count(*), b FROM
> public.pagg_tab_p1 GROUP BY 1, 2
>                 ->  Foreign Scan  (cost=114.62..167.69 rows=200 width=16)
>                       Output: ((pagg_tab_1.b / 2)), (PARTIAL count(*)),
> pagg_tab_1.b
>                       Relations: Aggregate on (public.fpagg_tab_p2
> pagg_tab_1)
>                       Remote SQL: SELECT (b / 2), count(*), b FROM
> public.pagg_tab_p2 GROUP BY 1, 2
>                 ->  Foreign Scan  (cost=114.62..167.69 rows=200 width=16)
>                       Output: ((pagg_tab_2.b / 2)), (PARTIAL count(*)),
> pagg_tab_2.b
>                       Relations: Aggregate on (public.fpagg_tab_p3
> pagg_tab_2)
>                       Remote SQL: SELECT (b / 2), count(*), b FROM
> public.pagg_tab_p3 GROUP BY 1, 2
>
> Note that group by is still deparsed incorrectly.
Thank you for comments. You are right.
It is a mistake to rewrite extra->partial_target->exprs with partial_target->exprs.
I fixed this point.

September 26, 2023 (Fire) 23:16 Bruce Momjian <bruce@momjian.us>.
> On Tue, Sep 26, 2023 at 06:26:25AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> > Hi Mr.Bruce.
> > > I think this needs to be explained in the docs.  I am ready to adjust the patch to improve the wording whenever
youare 
> > > ready.  Should I do it now and post an updated version for you to use?
> > The following explanation was omitted from the documentation, so I added it.
> > > *  false - no check
> > > *  true, remove version < sender - check
> > I have responded to your comment, but if there is a problem with the wording, could you please suggest a
correction?
>
> I like your new wording, thanks.
Thanks.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v27.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

27 сентября 2023 г., 15:36:34

Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2023-09-27 01:35:
> Hi Mr.Momjian, Mr.Pyhalov.
> 
> Tuesday, 26 September 2023 22:15 Alexander Pyhalov 
> <a.pyhalov@postgrespro.ru>:
>> Do you mean that extra->partial_target->sortgrouprefs is not replaced,
>> and so we preserve tlesortgroupref numbers?
> Yes, that is correct.
> 
>> I'm suspicious about rewriting extra->partial_target->exprs with
>> partial_target->exprs - I'm still not sure why we
>>   don't we loose information, added by add_column_to_pathtarget() to
>> extra->partial_target->exprs?
>> 

Hi.

In postgres_fdw.sql

"Partial aggregates are unsafe to push down having clause when there are 
partial aggregates" - this comment likely should be fixed.

Some comments should be added to setGroupClausePartial() and to 
make_partial_grouping_target() - especially why setGroupClausePartial()
is called prior to add_new_columns_to_pathtarget().

I'm not sure that I like this mechanics of adding sort group clauses - 
it seems we do in core additional work, which is of use only for
one extension, but at least it seems to be working.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

28 сентября 2023 г., 07:40:40

Hi Mr.Pyhalov.

Thank you for comments.
> In postgres_fdw.sql
>
> "Partial aggregates are unsafe to push down having clause when there are
> partial aggregates" - this comment likely should be fixed.
Fixed.

> Some comments should be added to setGroupClausePartial() and to
> make_partial_grouping_target() - especially why setGroupClausePartial()
> is called prior to add_new_columns_to_pathtarget().
I have added comments to setGroupClausePartial() and to make_partial_grouping_target().

> I'm not sure that I like this mechanics of adding sort group clauses -
> it seems we do in core additional work, which is of use only for
> one extension, but at least it seems to be working.
We cannot deparse the original sort group clauses and pathtarget
when performing partial aggreggate pushdown by any FDWs.
So I think the additional sort group clauses and pathtarget are
needed by any FDWs, not only postgres_fdw.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v28.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

28 сентября 2023 г., 09:15:43

Fujii.Yuki@df.MitsubishiElectric.co.jp писал 2023-09-28 07:40:

>> I'm not sure that I like this mechanics of adding sort group clauses -
>> it seems we do in core additional work, which is of use only for
>> one extension, but at least it seems to be working.
> We cannot deparse the original sort group clauses and pathtarget
> when performing partial aggreggate pushdown by any FDWs.
> So I think the additional sort group clauses and pathtarget are
> needed by any FDWs, not only postgres_fdw.
> 

Hi.
It seems to me that *fdw postfixes don't clarify things, but just make 
naming more ugly.

+ * Adding these Vars and PlaceHolderVars to PathTarget,
+ * FDW cannot deparse this by the original List of SortGroupClauses.
+ * So, before this adding process,
+ * setGroupClausePartial generates another Pathtarget and another
+ * List of SortGroupClauses for FDW.

It seems that something like:

/*
  * Modified PathTarget cannot be used by FDW as-is to deparse this 
statement.
  * So, before modifying PathTarget, setGroupClausePartial generates
  * another Pathtarget and another list List of SortGroupClauses
  * to make deparsing possible.
  */

sounds better.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

28 сентября 2023 г., 10:21:10

Hi Mr.Pyhalov.

Alexander Pyhalov <a.pyhalov@postgrespro.ru>
Thursday, September 28, 2023 3:16 PM
> It seems to me that *fdw postfixes don't clarify things, but just make naming more ugly.
I have removed *fdw postfixes.

> + * Adding these Vars and PlaceHolderVars to PathTarget,
> + * FDW cannot deparse this by the original List of SortGroupClauses.
> + * So, before this adding process,
> + * setGroupClausePartial generates another Pathtarget and another
> + * List of SortGroupClauses for FDW.
> 
> It seems that something like:
> 
> /*
>   * Modified PathTarget cannot be used by FDW as-is to deparse this statement.
>   * So, before modifying PathTarget, setGroupClausePartial generates
>   * another Pathtarget and another list List of SortGroupClauses
>   * to make deparsing possible.
>   */
> 
> sounds better.
Thank you for the suggested modifications. I have modified it according to your suggestion.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v29.patch

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

18 октября 2023 г., 08:22:34

Hi hackers.

Because there is a degrade in pg_dump.c, I fixed it.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v30.patch

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

23 октября 2023 г., 19:55:46

On Wed, Oct 18, 2023 at 05:22:34AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> Hi hackers.
> 
> Because there is a degrade in pg_dump.c, I fixed it.

Fujii-san, to get this patch closer to finished, can I modify this
version of the patch to improve some wording and post an updated version
you can use for future changes?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

24 октября 2023 г., 03:12:41

Hi Mr.Momjian.

> Fujii-san, to get this patch closer to finished, can I modify this version of the patch to improve some wording and
postan 
> updated version you can use for future changes?
Yes, I greatly appreciate your offer.
I would very much appreciate your modifications.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

26 октября 2023 г., 01:08:23

On Tue, Oct 24, 2023 at 12:12:41AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> Hi Mr.Momjian.
> 
> > Fujii-san, to get this patch closer to finished, can I modify this version of the patch to improve some wording and
postan
 
> > updated version you can use for future changes?
> Yes, I greatly appreciate your offer.
> I would very much appreciate your modifications.

I am almost done updating the patch, but I got stuck on how the feature
is supposed to work.  This documentation sentence is where I got
confused:

    <varlistentry>
     <term><literal>check_partial_aggregate_support</literal> (<type>boolean</type>)</term>
     <listitem>
      <para>
       If this option is false, <filename>postgres_fdw</filename> assumes
       that for each built-in aggregate function,
       the partial aggregate function is defined on the remote server
       without checking the remote server version.
       If this option is true, during query planning,
       <filename>postgres_fdw</filename> connects to the remote server
       and checks if the remote server version is older than the local server version.
       If so,
       <filename>postgres_fdw</filename>
-->       assumes that for each built-in aggregate function, the partial aggregate function is not defined
-->       on the remote server unless the partial aggregate function and the aggregate
-->       function match.
       Otherwise <filename>postgres_fdw</filename> assumes that for each built-in aggregate function,
       the partial aggregate function is defined on the remote server.
       The default is <literal>false</literal>.
      </para>
     </listitem>
    </varlistentry>

What does that marked sentence mean?  What is match?  Are one or both of
these remote?  It sounds like you are checking the local aggregate
against the remote partial aggregate, but I don't see any code that does
this in the patch.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

26 октября 2023 г., 14:11:09

Hi Mr.Momjian

> From: Bruce Momjian <bruce@momjian.us>
> Sent: Thursday, October 26, 2023 7:08 AM
> I am almost done updating the patch, but I got stuck on how the feature is supposed to work.  This documentation
> sentence is where I got
> confused:
>
>     <varlistentry>
>      <term><literal>check_partial_aggregate_support</literal> (<type>boolean</type>)</term>
>      <listitem>
>       <para>
>        If this option is false, <filename>postgres_fdw</filename> assumes
>        that for each built-in aggregate function,
>        the partial aggregate function is defined on the remote server
>        without checking the remote server version.
>        If this option is true, during query planning,
>        <filename>postgres_fdw</filename> connects to the remote server
>        and checks if the remote server version is older than the local server version.
>        If so,
>        <filename>postgres_fdw</filename>
> -->       assumes that for each built-in aggregate function, the partial aggregate function is not defined
> -->       on the remote server unless the partial aggregate function and the aggregate
> -->       function match.
>        Otherwise <filename>postgres_fdw</filename> assumes that for each built-in aggregate function,
>        the partial aggregate function is defined on the remote server.
>        The default is <literal>false</literal>.
>       </para>
>      </listitem>
>     </varlistentry>
>
> What does that marked sentence mean?  What is match?  Are one or both of these remote?  It sounds like you are
> checking the local aggregate against the remote partial aggregate, but I don't see any code that does this in the
patch.
This sentence means that
"If the partial aggregate function has the same OID as the aggregate function,
then postgres_fdw assumes that for each built-in aggregate function, the partial aggregate function is not defined
 on the remote server."
"Match" means that the partial aggregate function has the same OID as the aggregate function in local server.
But, in v30, there is no code which checks the partial aggregate function has the same OID as the aggregate function in
localserver. 
So I modified the code of is_builtin_aggpartialfunc_shippable().
Also, I modified wording postgres-fdw.sgml.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v31.patch

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

26 октября 2023 г., 22:43:29

On Thu, Oct 26, 2023 at 11:11:09AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> >        and checks if the remote server version is older than the local server version.
> >        If so,
> >        <filename>postgres_fdw</filename>
> > -->       assumes that for each built-in aggregate function, the partial aggregate function is not defined
> > -->       on the remote server unless the partial aggregate function and the aggregate
> > -->       function match.
> >        Otherwise <filename>postgres_fdw</filename> assumes that for each built-in aggregate function,
> >        the partial aggregate function is defined on the remote server.
> >        The default is <literal>false</literal>.
> >       </para>
> >      </listitem>
> >     </varlistentry>
> > 
> > What does that marked sentence mean?  What is match?  Are one or both of these remote?  It sounds like you are
> > checking the local aggregate against the remote partial aggregate, but I don't see any code that does this in the
patch.
> This sentence means that
> "If the partial aggregate function has the same OID as the aggregate function,
> then postgres_fdw assumes that for each built-in aggregate function, the partial aggregate function is not defined
>  on the remote server."
> "Match" means that the partial aggregate function has the same OID as the aggregate function in local server.
> But, in v30, there is no code which checks the partial aggregate function has the same OID as the aggregate function
inlocal server.
 
> So I modified the code of is_builtin_aggpartialfunc_shippable().
> Also, I modified wording postgres-fdw.sgml.

Yes, that is what I needed.  Attached is a modification of your v31
patch (the most recent) that mostly improves the documentation and
comments.  What else needs to be done before committers start to review
this?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Вложения

fdw_partial.diff.gz

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

27 октября 2023 г., 05:44:42

Hi Momjian.

Thank you for your improvement.
As a matter of detail, I think that the areas marked below are erroneous.

--
+   Pushdown causes aggregate function cals to send partial aggregate
                                         ^
+   function calls to the remote server. If the partial aggregate
+   function doesn't doesn't exist on the remote server, it causes
                     ^^^^^^^
--

> What else needs to be done before committers start to review
> this?
There are no others. May I make a new version of v31 with your
suggested improvements for the committer's review?

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

27 октября 2023 г., 06:06:17

On Fri, Oct 27, 2023 at 02:44:42AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> Hi Momjian.
> 
> Thank you for your improvement.
> As a matter of detail, I think that the areas marked below are erroneous.
> 
> --
> +   Pushdown causes aggregate function cals to send partial aggregate
>                                          ^
> +   function calls to the remote server. If the partial aggregate
> +   function doesn't doesn't exist on the remote server, it causes
>                      ^^^^^^^
> --

Agreed.  Do you want to fix that on your vesion?  I don't have any more
improvements to make.

> > What else needs to be done before committers start to review
> > this?
> There are no others. May I make a new version of v31 with your
> suggested improvements for the committer's review?

Yes, please.  I think the updated docs will help people understand how
the patch works.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

27 октября 2023 г., 08:32:48

Hi Mr.Momjian.

> From: Bruce Momjian <bruce@momjian.us>
> Sent: Friday, October 27, 2023 12:06 PM
> > Thank you for your improvement.
> > As a matter of detail, I think that the areas marked below are erroneous.
> >
> > --
> > +   Pushdown causes aggregate function cals to send partial aggregate
> >                                          ^
> > +   function calls to the remote server. If the partial aggregate
> > +   function doesn't doesn't exist on the remote server, it causes
> >                      ^^^^^^^
> > --
>
> Agreed.  Do you want to fix that on your vesion?  I don't have any more improvements to make.
Yes, I have fixed that in v32(Attached).

> > > What else needs to be done before committers start to review this?
> > There are no others. May I make a new version of v31 with your
> > suggested improvements for the committer's review?
>
> Yes, please.  I think the updated docs will help people understand how the patch works.
Thank you. v32(Attached) is tha updated version.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v32.patch

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

13 ноября 2023 г., 08:48:47

Hi Hackers.

I have rebased this patch.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v33.patch

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

13 ноября 2023 г., 11:25:48

Hi Hackers.

In postgres_fdw.sql, I have corrected the output format for floating point numbers
by extra_float_digits.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-v34.patch

Re: Partial aggregates pushdown

От

Robert Haas

Дата:

20 ноября 2023 г., 23:51:33

On Mon, Nov 13, 2023 at 3:26 AM Fujii.Yuki@df.MitsubishiElectric.co.jp
<Fujii.Yuki@df.mitsubishielectric.co.jp> wrote:
> In postgres_fdw.sql, I have corrected the output format for floating point numbers
> by extra_float_digits.

Looking at this, I find that it's not at all clear to me how the
partial aggregate function is defined. Let's look at what we have for
documentation:

+  <para>
+   Paraemter <literal>AGGPARTIALFUNC</literal> optionally defines a
+   partial aggregate function used for partial aggregate pushdown; see
+   <xref linkend="xaggr-partial-aggregates"/> for details.
+  </para>

+       Partial aggregate function (zero if none).
+       See <xref linkend="partial-aggregate-pushdown"/> for the definition
+       of partial aggregate function.

+   Partial aggregate pushdown is an optimization for queries that contains
+   aggregate expressions for a partitioned table across one or more remote
+   servers. If multiple conditions are met, partial aggregate function

+   When partial aggregate pushdown is used for aggregate expressions,
+   remote queries replace aggregate function calls with partial
+   aggregate function calls.  If the data type of the state value is not

But there's no definition of what the behavior of the function is
anywhere that I can see, not even in <sect2
id="partial-aggregate-pushdown">. Everywhere it only describes how the
partial aggregate function is used, not what it is supposed to do.

Looking at the changes in pg_aggregate.dat, it seems like the partial
aggregate function is a second aggregate defined in a way that mostly
matches the original, except that (1) if the original final function
would have returned a data type other than internal, then the final
function is removed; and (2) if the original final function would have
returned a value of internal type, then the final function is the
serialization function of the original aggregate. I think that's a
reasonable definition, but the documentation and code comments need to
be a lot clearer.

I do have a concern about this, though. It adds a lot of bloat. It
adds a whole lot of additional entries to pg_aggregate, and every new
aggregate we add in the future will require a bonus entry for this,
and it needs a bunch of new pg_proc entries as well. One idea that
I've had in the past is to instead introduce syntax that just does
this, without requiring a separate aggregate definition in each case.
For example, maybe instead of changing string_agg(whatever) to
string_agg_p_text_text(whatever), you can say PARTIAL_AGGREGATE
string_agg(whatever) or string_agg(PARTIAL_AGGREGATE whatever) or
something. Then all aggregates could be treated in a generic way. I'm
not completely sure that's better, but I think it's worth considering.

I think that the control mechanism needs some thought. Right now,
there are two possible behaviors: either we assume that the local and
remote sides are the same unconditionally, or we assume that they're
the same if the remote side is a new enough version. I do like having
those behaviors available, but I wonder if we need to do something
better or different. What if somebody wants to push down a
non-built-in aggregate, for example? I realize that we don't have
great solutions to the problem of knowing which functions are
push-downable in general, and I don't know that partial aggregation
needs to be any better than anything else, but it's probably worth
comparing and contrasting the approach we take here with the
approaches we've taken in other, similar cases. From that point of
view, I think check_partial_aggregate_support is a novelty: we don't
do those kinds of checks in other cases, AFAIK. But on the other hand,
there is the 'extensions' argument to postgres_fdw.

I don't think the patch does a good job explaining why HAVING,
DISTINCT, and ORDER BY are a problem. It seems to me that HAVING
shouldn't really be a problem, because HAVING is basically a WHERE
clause that occurs after aggregation is complete, and whether or not
the aggregation is safe shouldn't depend on what we're going to do
with the value afterward. The HAVING clause can't necessarily be
pushed to the remote side, but I don't see how or why it could make
the aggregate itself unsafe to push down. DISTINCT and ORDER BY are a
little trickier: if we pushed down DISTINCT, we'd still have to
re-DISTINCT-ify when combining locally, and if we pushed down ORDER
BY, we'd have to do a merge pass to combine the returned values unless
we could prove that the partitions were non-overlapping ranges that
would be visited in the correct order. Although that all sounds
doable, I think it's probably a good thing that the current patch
doesn't try to handle it -- this is complicated already. But it should
explain why it's not handling it and maybe even a bit about how it
could be handling in the future, rather than just saying "well, this
kind of thing is not safe." The trouble with that explanation is that
it does nothing to help the reader understand whether the thing in
question is *fundamentally* unsafe or whether we just don't have the
right code to make it work.

Typo: Paraemter

I'm so sorry to keep complaining about comments, but I think the
comments in src/backend/optimizer are very far from being adequate.
They are strictly formulaic and don't really explain anything. For
example, I see that the patch adds a partial_target to
GroupPathExtraData, but how do I understand the reason why we now need
a second pathtarget beside the one that already exists? Certainly not
from the comments in setGroupClausePartial, because there basically
aren't any. True, there's a header comment, but it just says we
generate this thing, not WHY we generate this thing. There's nothing
meaningful to be found in src/include/nodes/pathnodes.h about why
we're doing this, either.

And this problem really extends throughout the patch: comments are
mostly short and just describe what the code does, not WHY it does
that. And the WHY is really the important part. Otherwise we will not
be able to maintain this code going forward.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

21 ноября 2023 г., 01:48:42

On Mon, Nov 20, 2023 at 03:51:33PM -0500, Robert Haas wrote:
> On Mon, Nov 13, 2023 at 3:26 AM Fujii.Yuki@df.MitsubishiElectric.co.jp
> <Fujii.Yuki@df.mitsubishielectric.co.jp> wrote:
> > In postgres_fdw.sql, I have corrected the output format for floating point numbers
> > by extra_float_digits.
> 
> Looking at this, I find that it's not at all clear to me how the
> partial aggregate function is defined. Let's look at what we have for
> documentation:
> 
> +  <para>
> +   Paraemter <literal>AGGPARTIALFUNC</literal> optionally defines a
> +   partial aggregate function used for partial aggregate pushdown; see
> +   <xref linkend="xaggr-partial-aggregates"/> for details.
> +  </para>
> 
> +       Partial aggregate function (zero if none).
> +       See <xref linkend="partial-aggregate-pushdown"/> for the definition
> +       of partial aggregate function.
> 
> +   Partial aggregate pushdown is an optimization for queries that contains
> +   aggregate expressions for a partitioned table across one or more remote
> +   servers. If multiple conditions are met, partial aggregate function
> 
> +   When partial aggregate pushdown is used for aggregate expressions,
> +   remote queries replace aggregate function calls with partial
> +   aggregate function calls.  If the data type of the state value is not
> 
> But there's no definition of what the behavior of the function is
> anywhere that I can see, not even in <sect2
> id="partial-aggregate-pushdown">. Everywhere it only describes how the
> partial aggregate function is used, not what it is supposed to do.

Yes, I had to figure that out myself, and I was wondering how much
detail to have in our docs vs README files vs. C comments.  I think we
should put more details somewhere.

> Looking at the changes in pg_aggregate.dat, it seems like the partial
> aggregate function is a second aggregate defined in a way that mostly
> matches the original, except that (1) if the original final function
> would have returned a data type other than internal, then the final
> function is removed; and (2) if the original final function would have
> returned a value of internal type, then the final function is the
> serialization function of the original aggregate. I think that's a
> reasonable definition, but the documentation and code comments need to
> be a lot clearer.

Agreed.  I wasn't sure enough about this to add it when I was reviewing
the patch.

> I do have a concern about this, though. It adds a lot of bloat. It
> adds a whole lot of additional entries to pg_aggregate, and every new
> aggregate we add in the future will require a bonus entry for this,
> and it needs a bunch of new pg_proc entries as well. One idea that
> I've had in the past is to instead introduce syntax that just does
> this, without requiring a separate aggregate definition in each case.
> For example, maybe instead of changing string_agg(whatever) to
> string_agg_p_text_text(whatever), you can say PARTIAL_AGGREGATE
> string_agg(whatever) or string_agg(PARTIAL_AGGREGATE whatever) or
> something. Then all aggregates could be treated in a generic way. I'm
> not completely sure that's better, but I think it's worth considering.

So use an SQL keyword to indicates a pushdown call?  We could then
automate the behavior rather than requiring special catalog functions?

> I think that the control mechanism needs some thought. Right now,
> there are two possible behaviors: either we assume that the local and
> remote sides are the same unconditionally, or we assume that they're
> the same if the remote side is a new enough version. I do like having
> those behaviors available, but I wonder if we need to do something
> better or different. What if somebody wants to push down a
> non-built-in aggregate, for example? I realize that we don't have

It does allow specification of extensions that can be pushed down.

> great solutions to the problem of knowing which functions are
> push-downable in general, and I don't know that partial aggregation
> needs to be any better than anything else, but it's probably worth
> comparing and contrasting the approach we take here with the
> approaches we've taken in other, similar cases. From that point of
> view, I think check_partial_aggregate_support is a novelty: we don't
> do those kinds of checks in other cases, AFAIK. But on the other hand,
> there is the 'extensions' argument to postgres_fdw.

Right.  I am not sure how to improve what the patch does.

> I don't think the patch does a good job explaining why HAVING,
> DISTINCT, and ORDER BY are a problem. It seems to me that HAVING
> shouldn't really be a problem, because HAVING is basically a WHERE
> clause that occurs after aggregation is complete, and whether or not
> the aggregation is safe shouldn't depend on what we're going to do
> with the value afterward. The HAVING clause can't necessarily be
> pushed to the remote side, but I don't see how or why it could make
> the aggregate itself unsafe to push down. DISTINCT and ORDER BY are a
> little trickier: if we pushed down DISTINCT, we'd still have to
> re-DISTINCT-ify when combining locally, and if we pushed down ORDER
> BY, we'd have to do a merge pass to combine the returned values unless
> we could prove that the partitions were non-overlapping ranges that
> would be visited in the correct order. Although that all sounds
> doable, I think it's probably a good thing that the current patch
> doesn't try to handle it -- this is complicated already. But it should
> explain why it's not handling it and maybe even a bit about how it
> could be handling in the future, rather than just saying "well, this
> kind of thing is not safe." The trouble with that explanation is that
> it does nothing to help the reader understand whether the thing in
> question is *fundamentally* unsafe or whether we just don't have the
> right code to make it work.

Makes sense.

> Typo: Paraemter
> 
> I'm so sorry to keep complaining about comments, but I think the
> comments in src/backend/optimizer are very far from being adequate.
> They are strictly formulaic and don't really explain anything. For
> example, I see that the patch adds a partial_target to
> GroupPathExtraData, but how do I understand the reason why we now need
> a second pathtarget beside the one that already exists? Certainly not
> from the comments in setGroupClausePartial, because there basically
> aren't any. True, there's a header comment, but it just says we
> generate this thing, not WHY we generate this thing. There's nothing
> meaningful to be found in src/include/nodes/pathnodes.h about why
> we're doing this, either.
> 
> And this problem really extends throughout the patch: comments are
> mostly short and just describe what the code does, not WHY it does
> that. And the WHY is really the important part. Otherwise we will not
> be able to maintain this code going forward.

Understood.  I wish I knew enough to add them myself.  I can help if
someone can supply the details.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Re: Partial aggregates pushdown

От

Robert Haas

Дата:

21 ноября 2023 г., 20:16:41

On Mon, Nov 20, 2023 at 5:48 PM Bruce Momjian <bruce@momjian.us> wrote:
> > I do have a concern about this, though. It adds a lot of bloat. It
> > adds a whole lot of additional entries to pg_aggregate, and every new
> > aggregate we add in the future will require a bonus entry for this,
> > and it needs a bunch of new pg_proc entries as well. One idea that
> > I've had in the past is to instead introduce syntax that just does
> > this, without requiring a separate aggregate definition in each case.
> > For example, maybe instead of changing string_agg(whatever) to
> > string_agg_p_text_text(whatever), you can say PARTIAL_AGGREGATE
> > string_agg(whatever) or string_agg(PARTIAL_AGGREGATE whatever) or
> > something. Then all aggregates could be treated in a generic way. I'm
> > not completely sure that's better, but I think it's worth considering.
>
> So use an SQL keyword to indicates a pushdown call?  We could then
> automate the behavior rather than requiring special catalog functions?

Right. It would require more infrastructure in the parser, planner,
and executor, but it would be infinitely reusable instead of needing a
new thing for every aggregate. I think that might be better, but to be
honest I'm not totally sure.

> > I don't think the patch does a good job explaining why HAVING,
> > DISTINCT, and ORDER BY are a problem. It seems to me that HAVING
> > shouldn't really be a problem, because HAVING is basically a WHERE
> > clause that occurs after aggregation is complete, and whether or not
> > the aggregation is safe shouldn't depend on what we're going to do
> > with the value afterward. The HAVING clause can't necessarily be
> > pushed to the remote side, but I don't see how or why it could make
> > the aggregate itself unsafe to push down. DISTINCT and ORDER BY are a
> > little trickier: if we pushed down DISTINCT, we'd still have to
> > re-DISTINCT-ify when combining locally, and if we pushed down ORDER
> > BY, we'd have to do a merge pass to combine the returned values unless
> > we could prove that the partitions were non-overlapping ranges that
> > would be visited in the correct order. Although that all sounds
> > doable, I think it's probably a good thing that the current patch
> > doesn't try to handle it -- this is complicated already. But it should
> > explain why it's not handling it and maybe even a bit about how it
> > could be handling in the future, rather than just saying "well, this
> > kind of thing is not safe." The trouble with that explanation is that
> > it does nothing to help the reader understand whether the thing in
> > question is *fundamentally* unsafe or whether we just don't have the
> > right code to make it work.
>
> Makes sense.

Actually, I think I was wrong about this. We can't handle ORDER BY or
DISTINCT because we can't distinct-ify or order after we've already
partially aggregated. At least not in general, and not without
additional aggregate support functions. So what I said above was wrong
with respect to those. Or so I believe, anyway. But I still don't see
why HAVING should be a problem.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

21 ноября 2023 г., 23:34:19

On Tue, Nov 21, 2023 at 12:16:41PM -0500, Robert Haas wrote:
> On Mon, Nov 20, 2023 at 5:48 PM Bruce Momjian <bruce@momjian.us> wrote:
> > > I do have a concern about this, though. It adds a lot of bloat. It
> > > adds a whole lot of additional entries to pg_aggregate, and every new
> > > aggregate we add in the future will require a bonus entry for this,
> > > and it needs a bunch of new pg_proc entries as well. One idea that
> > > I've had in the past is to instead introduce syntax that just does
> > > this, without requiring a separate aggregate definition in each case.
> > > For example, maybe instead of changing string_agg(whatever) to
> > > string_agg_p_text_text(whatever), you can say PARTIAL_AGGREGATE
> > > string_agg(whatever) or string_agg(PARTIAL_AGGREGATE whatever) or
> > > something. Then all aggregates could be treated in a generic way. I'm
> > > not completely sure that's better, but I think it's worth considering.
> >
> > So use an SQL keyword to indicates a pushdown call?  We could then
> > automate the behavior rather than requiring special catalog functions?
> 
> Right. It would require more infrastructure in the parser, planner,
> and executor, but it would be infinitely reusable instead of needing a
> new thing for every aggregate. I think that might be better, but to be
> honest I'm not totally sure.

It would make it automatic.  I guess we need to look at how big the
patch is to do it.

> > > I don't think the patch does a good job explaining why HAVING,
> > > DISTINCT, and ORDER BY are a problem. It seems to me that HAVING
> > > shouldn't really be a problem, because HAVING is basically a WHERE
> > > clause that occurs after aggregation is complete, and whether or not
> > > the aggregation is safe shouldn't depend on what we're going to do
> > > with the value afterward. The HAVING clause can't necessarily be
> > > pushed to the remote side, but I don't see how or why it could make
> > > the aggregate itself unsafe to push down. DISTINCT and ORDER BY are a
> > > little trickier: if we pushed down DISTINCT, we'd still have to
> > > re-DISTINCT-ify when combining locally, and if we pushed down ORDER
> > > BY, we'd have to do a merge pass to combine the returned values unless
> > > we could prove that the partitions were non-overlapping ranges that
> > > would be visited in the correct order. Although that all sounds
> > > doable, I think it's probably a good thing that the current patch
> > > doesn't try to handle it -- this is complicated already. But it should
> > > explain why it's not handling it and maybe even a bit about how it
> > > could be handling in the future, rather than just saying "well, this
> > > kind of thing is not safe." The trouble with that explanation is that
> > > it does nothing to help the reader understand whether the thing in
> > > question is *fundamentally* unsafe or whether we just don't have the
> > > right code to make it work.
> >
> > Makes sense.
> 
> Actually, I think I was wrong about this. We can't handle ORDER BY or
> DISTINCT because we can't distinct-ify or order after we've already
> partially aggregated. At least not in general, and not without
> additional aggregate support functions. So what I said above was wrong
> with respect to those. Or so I believe, anyway. But I still don't see
> why HAVING should be a problem.

This should probably be documented in the patch.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

22 ноября 2023 г., 09:32:58

Robert Haas писал 2023-11-21 20:16:

>> > I don't think the patch does a good job explaining why HAVING,
>> > DISTINCT, and ORDER BY are a problem. It seems to me that HAVING
>> > shouldn't really be a problem, because HAVING is basically a WHERE
>> > clause that occurs after aggregation is complete, and whether or not
>> > the aggregation is safe shouldn't depend on what we're going to do
>> > with the value afterward. The HAVING clause can't necessarily be
>> > pushed to the remote side, but I don't see how or why it could make
>> > the aggregate itself unsafe to push down. DISTINCT and ORDER BY are a
>> > little trickier: if we pushed down DISTINCT, we'd still have to
>> > re-DISTINCT-ify when combining locally, and if we pushed down ORDER
>> > BY, we'd have to do a merge pass to combine the returned values unless
>> > we could prove that the partitions were non-overlapping ranges that
>> > would be visited in the correct order. Although that all sounds
>> > doable, I think it's probably a good thing that the current patch
>> > doesn't try to handle it -- this is complicated already. But it should
>> > explain why it's not handling it and maybe even a bit about how it
>> > could be handling in the future, rather than just saying "well, this
>> > kind of thing is not safe." The trouble with that explanation is that
>> > it does nothing to help the reader understand whether the thing in
>> > question is *fundamentally* unsafe or whether we just don't have the
>> > right code to make it work.
>> 
>> Makes sense.
> 
> Actually, I think I was wrong about this. We can't handle ORDER BY or
> DISTINCT because we can't distinct-ify or order after we've already
> partially aggregated. At least not in general, and not without
> additional aggregate support functions. So what I said above was wrong
> with respect to those. Or so I believe, anyway. But I still don't see
> why HAVING should be a problem.

Hi. HAVING is also a problem. Consider the following query

SELECT count(a) FROM t HAVING count(a) > 10 - we can't push it down to
foreign server as HAVING needs full aggregate result, but foreign server
don't know it.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

22 ноября 2023 г., 13:16:16

Hi Mr. Haas, hackers.

Thank you for your thoughtful comments.

> From: Robert Haas <robertmhaas@gmail.com>
> Sent: Tuesday, November 21, 2023 5:52 AM
> I do have a concern about this, though. It adds a lot of bloat. It adds a whole lot of additional entries to
pg_aggregate,and
 
> every new aggregate we add in the future will require a bonus entry for this, and it needs a bunch of new pg_proc
entries
> as well. One idea that I've had in the past is to instead introduce syntax that just does this, without requiring a
separate
> aggregate definition in each case.
> For example, maybe instead of changing string_agg(whatever) to string_agg_p_text_text(whatever), you can say
> PARTIAL_AGGREGATE
> string_agg(whatever) or string_agg(PARTIAL_AGGREGATE whatever) or something. Then all aggregates could be treated
> in a generic way. I'm not completely sure that's better, but I think it's worth considering.
I believe this comment addresses a fundamental aspect of the approach.
So, firstly, could we discuss whether we should fundamentally reconsider the approach?

The approach adopted in this patch is as follows.
Approach 1: Adding partial aggregation functions to the catalogs(pg_aggregate, pg_proc)

The approach proposed by Mr.Haas is as follows.
Approach 2: Adding a keyword to the SQL syntax to indicate partial aggregation requests

The amount of code required to implement Approach 2 has not been investigated,
but comparing Approach 1 and Approach 2 in other aspects, 
I believe they each have the following advantages and disadvantages. 

1. Approach 1
(1) Advantages
(a) No need to change the SQL syntax
(2) Disadvantages
(a) Catalog bloat
As Mr.Haas pointed out, the catalog will bloat by adding partial aggregation functions (e.g. avg_p_int8(int8)) 
for each individual aggregate function (e.g. avg(int8)) in pg_aggregate and pg_proc (theoretically doubling the size).
Some PostgreSQL developers and users may find this uncomfortable.
(b) Increase in manual procedures
Developers of new aggregate functions (both built-in and user-defined) need to manually add the partial aggregation
functions when defining the aggregate functions.
However, the procedure for adding partial aggregation functions for a certain aggregate function can be automated,
so this problem can be resolved by improving the patch.
The automation method involves the core part (AggregateCreate() and related functions) that executes
the CREATE AGGREGATE command for user-defined functions.
For built-in functions, it involves generating the initial data for the pg_aggregate catalog and pg_proc catalog from
pg_aggregate.datand pg_proc.dat
 
(using the genbki.pl script and related scripts).

2. Approach 2
(1) Advantages
(a) No need to add partial aggregate functions to the catalogs for each aggregation
(2) Disadvantages
(a) Need to add non-standard keywords to the SQL syntax.

I did not choose Approach2 because I was not confident that the disadvantage mentioned in 2.(2)(a)
would be accepted by the PostgreSQL development community.
If it is accepted, I think Approach 2 is smarter.
Could you please provide your opinion on which
approach is preferable after comparing these two approaches?
If we cannot say anything without comparing the amount of source code, as Mr.Momjian mentioned,
we need to estimate the amount of source code required to implement Approach2.

Sincerely yours,
Yuuki Fujii
 
--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

23 ноября 2023 г., 00:16:02

On Wed, Nov 22, 2023 at 10:16:16AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> 2. Approach 2
> (1) Advantages
> (a) No need to add partial aggregate functions to the catalogs for each aggregation
> (2) Disadvantages
> (a) Need to add non-standard keywords to the SQL syntax.
> 
> I did not choose Approach2 because I was not confident that the disadvantage mentioned in 2.(2)(a)
> would be accepted by the PostgreSQL development community.
> If it is accepted, I think Approach 2 is smarter.
> Could you please provide your opinion on which
> approach is preferable after comparing these two approaches?

I didn't know #2 was possible, but given the great number of catalog
entries, doing it in the SQL grammar seems cleaner to me.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

27 ноября 2023 г., 10:04:07

Hi Mr.Momjian, Mr.Haas, hackers.

> From: Bruce Momjian <bruce@momjian.us>
> Sent: Thursday, November 23, 2023 6:16 AM
> On Wed, Nov 22, 2023 at 10:16:16AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> > 2. Approach 2
> > (1) Advantages
> > (a) No need to add partial aggregate functions to the catalogs for
> > each aggregation
> > (2) Disadvantages
> > (a) Need to add non-standard keywords to the SQL syntax.
> >
> > I did not choose Approach2 because I was not confident that the
> > disadvantage mentioned in 2.(2)(a) would be accepted by the PostgreSQL development community.
> > If it is accepted, I think Approach 2 is smarter.
> > Could you please provide your opinion on which approach is preferable
> > after comparing these two approaches?
> 
> I didn't know #2 was possible, but given the great number of catalog entries, doing it in the SQL grammar seems
cleaner
> to me.
Thank you for comments. Yes, I understand.

> From: Bruce Momjian <bruce@momjian.us>
> Sent: Wednesday, November 22, 2023 5:34 AM
> On Tue, Nov 21, 2023 at 12:16:41PM -0500, Robert Haas wrote:
> > On Mon, Nov 20, 2023 at 5:48 PM Bruce Momjian <bruce@momjian.us> wrote:
> > > > I do have a concern about this, though. It adds a lot of bloat. It
> > > > adds a whole lot of additional entries to pg_aggregate, and every
> > > > new aggregate we add in the future will require a bonus entry for
> > > > this, and it needs a bunch of new pg_proc entries as well. One
> > > > idea that I've had in the past is to instead introduce syntax that
> > > > just does this, without requiring a separate aggregate definition in each case.
> > > > For example, maybe instead of changing string_agg(whatever) to
> > > > string_agg_p_text_text(whatever), you can say PARTIAL_AGGREGATE
> > > > string_agg(whatever) or string_agg(PARTIAL_AGGREGATE whatever) or
> > > > something. Then all aggregates could be treated in a generic way.
> > > > I'm not completely sure that's better, but I think it's worth considering.
> > >
> > > So use an SQL keyword to indicates a pushdown call?  We could then
> > > automate the behavior rather than requiring special catalog functions?
> >
> > Right. It would require more infrastructure in the parser, planner,
> > and executor, but it would be infinitely reusable instead of needing a
> > new thing for every aggregate. I think that might be better, but to be
> > honest I'm not totally sure.
> 
> It would make it automatic.  I guess we need to look at how big the patch is to do it.
I will investigate specifically which parts of the PostgreSQL source code need to be modified and how big the patch
willbe if you take this approach.
 

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Re: Partial aggregates pushdown

От

Robert Haas

Дата:

27 ноября 2023 г., 22:07:40

On Wed, Nov 22, 2023 at 1:32 AM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:
> Hi. HAVING is also a problem. Consider the following query
>
> SELECT count(a) FROM t HAVING count(a) > 10 - we can't push it down to
> foreign server as HAVING needs full aggregate result, but foreign server
> don't know it.

I don't see it that way. What we would push to the foreign server
would be something like SELECT count(a) FROM t. Then, after we get the
results back and combine the various partial counts locally, we would
locally evaluate the HAVING clause afterward. That is, partial
aggregation is a barrier to pushing down HAVING clause itself, but it
doesn't preclude pushing down the aggregation.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Partial aggregates pushdown

От

Robert Haas

Дата:

27 ноября 2023 г., 23:03:22

On Wed, Nov 22, 2023 at 5:16 AM Fujii.Yuki@df.MitsubishiElectric.co.jp
<Fujii.Yuki@df.mitsubishielectric.co.jp> wrote:
> I did not choose Approach2 because I was not confident that the disadvantage mentioned in 2.(2)(a)
> would be accepted by the PostgreSQL development community.
> If it is accepted, I think Approach 2 is smarter.
> Could you please provide your opinion on which
> approach is preferable after comparing these two approaches?
> If we cannot say anything without comparing the amount of source code, as Mr.Momjian mentioned,
> we need to estimate the amount of source code required to implement Approach2.

I've had the same concern, that approach #2 would draw objections, so
I think you were right to be cautious about it. I don't think it is a
wonderful approach in all ways, but I do think that it is superior to
approach #1. If we add dedicated support to the grammar, it is mostly
a one-time effort, and after that, there should not be much need for
anyone to be concerned about it. If we instead add extra aggregates,
then that generates extra work every time someone writes a patch that
adds a new aggregate to core. I have a difficult time believing that
anyone will prefer an approach that involves an ongoing maintenance
effort of that type over one that doesn't.

One point that seems to me to be of particular importance is that if
we add new aggregates, there is a risk that some future aggregate
might do that incorrectly, so that the main aggregate works, but the
secondary aggregate created for this feature does not work. That seems
like it would be very frustrating for future code authors so I'd like
to avoid the risk as much as we can.

Also, I want to make one other point here about security and
reliability. Right now, there is no way for a user to feed arbitrary
data to a deserialization function. Since serialization and
deserialization functions are only used in the context of parallel
query, we always know that the data fed to the deserialization
function must have come from the serialization function on the same
machine. Nor can users call the deserialization function directly with
arbitrary data of their own choosing, because users cannot call
functions that take or return internal. But with this system, it
becomes possible to feed arbitrary data to a deserialization function.
The user could redefine the function on the remote side so that it
produces arbitrary data of their choosing, and the local
deserialization function will ingest it.

That's potentially quite a significant problem. Consider for example
that numericvar_deserialize() does no validity checking on any of the
weight, sign, or dscale, but not all values for those fields are
legal. Right now that doesn't matter, but if you can feed arbitrary
data to that function, then it is. I don't know exactly what the
consequences are if you can get that function to spit out a NumericVar
with values outside the normal legal range. What can you do then?
Store a bogus numeric on disk? Crash the server? Worst case, some
problem like this could be a security issue allowing for escalation to
superuser; more likely, it would be a crash bug, corrupt your
database, or lead to unexpected and strange error messages.

Unfortunately, I have the unpleasant suspicion that most internal-type
aggregates will be affected by this problem. Consider, for example,
string_agg_deserialize(). Generally, strings are one of the
least-constrained data types, so you might hope that this function
would be OK. But it doesn't look very promising. The first int4 in the
serialized representation is the cursor, which would have to be
bounds-checked, lest someone provide a cursor that falls outside the
bounds of the StringInfo and, maybe, cause a reference to an arbitrary
memory location. Then the rest of the representation is the actual
data, which could be anything. This function is used for both bytea
and for text, and for bytea, letting the payload be anything is OK.
But for text, the supplied data shouldn't contain an embedded zero
byte, or otherwise be invalid in the server encoding. If it were, that
would provide a vector to inject invalidly encoded data into the
database. This feature can't be allowed to do that.

What could be a solution to this class of problems? One option is to
just give up on supporting this feature for internal-type aggregates
for now. That's easy enough to do, and just means we have less
functionality, but it's sad because that's functionality we'd like to
have. Another approach is to add necessary sanity checks to the
relevant deserialization functions, but that seems a little hard to
get right, and it would slow down parallel query cases which are
probably going to be more common than the use of this feature. I think
the slowdown might be significant, too. A third option is to change
those aggregates in some way, like giving them a transition function
that operates on some data type other than internal, but there again
we have to be careful of slowdowns. A final option is to rethink the
infrastructure in some way, like having a way to serialize to
something other than bytea, for which we already have input functions
with adequate checks. For instance, if string_agg_serialize() produced
a record containing an integer column and a text or bytea column, we
could attempt to ingest that record on the other side and presumably
the right things would happen in the case of any invalid data. But I'm
not quite sure what infrastructure would be required to make this kind
of idea work.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Partial aggregates pushdown

От

Tom Lane

Дата:

27 ноября 2023 г., 23:59:29

Robert Haas <robertmhaas@gmail.com> writes:
> Also, I want to make one other point here about security and
> reliability. Right now, there is no way for a user to feed arbitrary
> data to a deserialization function. Since serialization and
> deserialization functions are only used in the context of parallel
> query, we always know that the data fed to the deserialization
> function must have come from the serialization function on the same
> machine. Nor can users call the deserialization function directly with
> arbitrary data of their own choosing, because users cannot call
> functions that take or return internal. But with this system, it
> becomes possible to feed arbitrary data to a deserialization function.

Ouch.  That is absolutely horrid --- we have a lot of stuff that
depends on users not being able to get at "internal" values, and
it sounds like the current proposal breaks all of that.

Quite aside from security concerns, there is no justification for
assuming that the "internal" values used on one platform/PG version
are identical to those used on another.  So if the idea is to
ship back "internal" values from the remote server to the local one,
I think it's basically impossible to make that work.

Even if the partial-aggregate serialization value isn't "internal"
but some more-narrowly-defined type, it is still an internal
implementation detail of the aggregate.  You have no right to assume
that the remote server implements the aggregate the same way the
local one does.  If we start making such an assumption then we'll
be unable to revise the implementation of an aggregate ever again.

TBH, I think this entire proposal is dead in the water.  Which is
sad from a performance standpoint, but I can't see any way that
we would not regret shipping a feature that makes such assumptions.

            regards, tom lane

Re: Partial aggregates pushdown

От

Robert Haas

Дата:

28 ноября 2023 г., 00:10:02

On Mon, Nov 27, 2023 at 3:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Even if the partial-aggregate serialization value isn't "internal"
> but some more-narrowly-defined type, it is still an internal
> implementation detail of the aggregate.  You have no right to assume
> that the remote server implements the aggregate the same way the
> local one does.  If we start making such an assumption then we'll
> be unable to revise the implementation of an aggregate ever again.
>
> TBH, I think this entire proposal is dead in the water.  Which is
> sad from a performance standpoint, but I can't see any way that
> we would not regret shipping a feature that makes such assumptions.

I think it's ridiculous to just hold our breath and pretend like this
feature isn't needed -- it's at least half a decade overdue. We engage
in endless hand-wringing over local-remote symmetry in cases where
other systems seem to effortlessly make that assumption and then get
on with building new features. It's not that I disagree with the
concern; we're *already* doing stuff that is unprincipled in a bunch
of different areas and that could and occasionally does cause queries
that push things to the remote side to return wrong answers, and I
hate that. But the response to that can't be to refuse to add new
features and maybe rip out the features we already have. Users don't
like it when pushdown causes queries to return wrong answers, but they
like it even less when the pushdown doesn't happen in the first place
and the query runs until the heat death of the universe. I'm not
entirely sure what the right design ideas are here, but giving up and
refusing to add features ought to be completely off the table.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Partial aggregates pushdown

От

Tom Lane

Дата:

28 ноября 2023 г., 00:23:24

Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Nov 27, 2023 at 3:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> TBH, I think this entire proposal is dead in the water.  Which is
>> sad from a performance standpoint, but I can't see any way that
>> we would not regret shipping a feature that makes such assumptions.

> I think it's ridiculous to just hold our breath and pretend like this
> feature isn't needed -- it's at least half a decade overdue. We engage
> in endless hand-wringing over local-remote symmetry in cases where
> other systems seem to effortlessly make that assumption and then get
> on with building new features.

Well, one of the founding principles of postgres_fdw was to be able
to talk to PG servers that are not of the same version as yours.
If we break that in the name of performance, we are going to have
a lot of unhappy users.  Even the ones who do get the benefit of
the speedup are going to be unhappy when it breaks because they
didn't upgrade local and remote at exactly the same time.

Just because we'd like to have it doesn't make the patch workable
in the real world.

            regards, tom lane

Re: Partial aggregates pushdown

От

Robert Haas

Дата:

28 ноября 2023 г., 02:50:34

First of all, that last email of mine was snippy, and I apologize for it. Sorry.

That said:

On Mon, Nov 27, 2023 at 4:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Well, one of the founding principles of postgres_fdw was to be able
> to talk to PG servers that are not of the same version as yours.
> If we break that in the name of performance, we are going to have
> a lot of unhappy users.  Even the ones who do get the benefit of
> the speedup are going to be unhappy when it breaks because they
> didn't upgrade local and remote at exactly the same time.

I agree with this.

> Just because we'd like to have it doesn't make the patch workable
> in the real world.

And also with this in concept - I'd like to plan arbitrarily
complicated queries perfectly and near-instantly, and then execute
them at faster-than-light speed, but we can't. However, I don't
understand the fatalism with respect to the feature at hand. As I said
before, it's not like no other product has made this work. Sure, some
of those products may not have the extensible system of data types
that we do, or may not care about cross-version communication, but
those don't seem like good enough reasons to just immediately give up.

TBH, I suspect even some PG forks have made this work, like maybe PGXC
or PGXL, although I don't know for certain. We might not like the
trade-offs they made to get there, but we haven't even talked through
possible design ideas yet, so it seems way too early to give up.

One of the things that I think is a problem in this area is that the
ways we have to configure FDW connections are just not very rich.
We're trying to cram everything into a set of strings that can be
attached to the foreign server or the user mapping, but that's not a
very good fit for something like how all the local SQL functions that
might exist map onto all of the remote SQL functions that might exist.
Now you might well say that we don't want the act of configuring a
foreign data wrapper to be insanely complicated, and I would agree
with that. But, on the other hand, as Larry Wall once said, a good
programming language makes simple things simple and complicated things
possible. I think our current configuration system is only
accomplishing the first of those goals.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Partial aggregates pushdown

От

Ashutosh Bapat

Дата:

28 ноября 2023 г., 13:23:58

On Tue, Nov 28, 2023 at 5:21 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> TBH, I suspect even some PG forks have made this work, like maybe PGXC
> or PGXL, although I don't know for certain. We might not like the
> trade-offs they made to get there, but we haven't even talked through
> possible design ideas yet, so it seems way too early to give up.

If my memory serves me right, PGXC implemented partial aggregation
only when the output of partial aggregate was a SQL data type
(non-Internal, non-Unknown). But I may be wrong. But at that time,
JSONB wasn't there or wasn't that widespread.

Problem with Internal is it's just a binary string whose content can
change across version and which can be interpreted differently across
different versions. There is no metadata in it to know how to
interpret it. We can add that metadata to JSONB. The result of partial
aggregate can be sent as a JSONB. If the local server finds the JSONB
familiar it will construct the right partial aggregate value otherwise
it will throw an error. If there's a way to even avoid that error (by
looking at server version etc.) the error can be avoided too. But
JSONB leaves very very less chance that the value will be interpreted
wrong. Downside is we are tying PARTIAL's output to be JSONB thus
tying SQL syntax with a data type.

Does that look acceptable?

--
Best Wishes,
Ashutosh Bapat

Re: Partial aggregates pushdown

От

Robert Haas

Дата:

28 ноября 2023 г., 15:44:51

On Tue, Nov 28, 2023 at 5:24 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
> If my memory serves me right, PGXC implemented partial aggregation
> only when the output of partial aggregate was a SQL data type
> (non-Internal, non-Unknown). But I may be wrong. But at that time,
> JSONB wasn't there or wasn't that widespread.
>
> Problem with Internal is it's just a binary string whose content can
> change across version and which can be interpreted differently across
> different versions. There is no metadata in it to know how to
> interpret it. We can add that metadata to JSONB. The result of partial
> aggregate can be sent as a JSONB. If the local server finds the JSONB
> familiar it will construct the right partial aggregate value otherwise
> it will throw an error. If there's a way to even avoid that error (by
> looking at server version etc.) the error can be avoided too. But
> JSONB leaves very very less chance that the value will be interpreted
> wrong. Downside is we are tying PARTIAL's output to be JSONB thus
> tying SQL syntax with a data type.
>
> Does that look acceptable?

If somebody had gone to the trouble of making this work, and had done
a good job, I wouldn't vote against it, but in a vacuum, I'm not sure
it's the best design. The problem in my view is that working with JSON
is not actually very pleasant. It's super-easy to generate, and
super-easy for humans to read. But parsing and validating it is a
pain. You basically have to have two parsers, one to do syntactical
validation and then a second one to ensure that the structure of the
document and the contents of each item are as expected. See
parse_manifest.c for an example of what I mean by that. Now, if we add
new code, it can reuse the JSON parser we've already got, so it's not
that you need to write a new JSON parser for every new application of
JSON, but the semantic validator (a la parse_manifest.c) isn't
necessarily any less code than a whole new parser for a bespoke
format.

To make that a bit more concrete, for something like string_agg(), is
it easier to write a validator for the existing deserialization
function that accepts a bytea blob, or to write a validator for a JSON
blob that we could be passing instead? My suspicion is that the former
is less work and easier to verify, but it's possible I'm wrong about
that and they're more or less equal. I don't really see any way that
the JSON thing is straight-up better; at best it's a toss-up in terms
of amount of code. Now somebody could still make an argument that they
would like JSON better because it would be more useful for some
purpose other than this feature, and that is fine, but here I'm just
thinking about this feature in particular.

My personal suspicion is that the easiest way to support internal-type
aggregates here is to convert them to use an array or record type as a
transition state instead, or maybe serialize the internal state to one
of those things instead of to bytea. I suspect that would allow us to
leverage more of our existing validation infrastructure than using
JSON or sticking with bytea. But I'm certainly amenable to other
points of view. I'm not trying to pretend that my gut feeling is
necessarily correct; I'm just explaining what I currently think.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Partial aggregates pushdown

От

Stephen Frost

Дата:

28 ноября 2023 г., 17:24:31

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Mon, Nov 27, 2023 at 4:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Well, one of the founding principles of postgres_fdw was to be able
> > to talk to PG servers that are not of the same version as yours.
> > If we break that in the name of performance, we are going to have
> > a lot of unhappy users.  Even the ones who do get the benefit of
> > the speedup are going to be unhappy when it breaks because they
> > didn't upgrade local and remote at exactly the same time.
>
> I agree with this.

+1.  We do want to continue to make this work- to the extent possible.
I don't think there's any problem with saying that when talking to an
older server, you don't get the same capabilities as you do when talking
to a newer server.

> > Just because we'd like to have it doesn't make the patch workable
> > in the real world.
>
> And also with this in concept - I'd like to plan arbitrarily
> complicated queries perfectly and near-instantly, and then execute
> them at faster-than-light speed, but we can't. However, I don't
> understand the fatalism with respect to the feature at hand. As I said
> before, it's not like no other product has made this work. Sure, some
> of those products may not have the extensible system of data types
> that we do, or may not care about cross-version communication, but
> those don't seem like good enough reasons to just immediately give up.

Certainly there are other projects out there which are based on PG that
have managed to make this work and work really quite well.

> TBH, I suspect even some PG forks have made this work, like maybe PGXC
> or PGXL, although I don't know for certain. We might not like the
> trade-offs they made to get there, but we haven't even talked through
> possible design ideas yet, so it seems way too early to give up.

Yes, Citus[1] and Greenplum[2], to just name two.

I certainly understand the concern around the security of this and would
have thought the approach we'd use would be to not just take internal
state and pass it along but rather to provide a way for aggregates to
opt-in to supporting this and have them serialize/deserialize with
new dedicated functions that have appropriate checks to avoid bad things
happening.  That could also be versioned, perhaps, if we feel that's
necessary (I'm a bit skeptical, but it would hopefully address the
concern about different versions having different data that they want to
pass along).

> One of the things that I think is a problem in this area is that the
> ways we have to configure FDW connections are just not very rich.

Agreed.

> We're trying to cram everything into a set of strings that can be
> attached to the foreign server or the user mapping, but that's not a
> very good fit for something like how all the local SQL functions that
> might exist map onto all of the remote SQL functions that might exist.
> Now you might well say that we don't want the act of configuring a
> foreign data wrapper to be insanely complicated, and I would agree
> with that. But, on the other hand, as Larry Wall once said, a good
> programming language makes simple things simple and complicated things
> possible. I think our current configuration system is only
> accomplishing the first of those goals.

We've already got issues in this area with extensions- there's no way
for a user to say what version of an extension exists on the remote side
and no way for an extension to do anything different based on that
information.  Perhaps we could work on a solution to both of these
issues, but at the least I don't see holding back on this effort for a
problem that already exists but which we've happily accepted because of
the benefit it provides, like being able to push-down postgis bounding
box conditionals to allow for indexed lookups.

Thanks,

Stephen

[1]: https://docs.citusdata.com/en/v11.1/develop/reference_sql.html
[2]:
https://postgresconf.org/conferences/Beijing/program/proposals/implementation-of-distributed-aggregation-in-greenplum

Вложения

signature.asc

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

06 декабря 2023 г., 11:41:21

Hi Mr.Haas, hackers.

> From: Robert Haas <robertmhaas@gmail.com>
> Sent: Tuesday, November 28, 2023 5:03 AM
> Also, I want to make one other point here about security and reliability. Right now, there is no way for a user to
feed
> arbitrary data to a deserialization function. Since serialization and deserialization functions are only used in the
contextof
 
> parallel query, we always know that the data fed to the deserialization function must have come from the
serialization
> function on the same machine. Nor can users call the deserialization function directly with arbitrary data of their
own
> choosing, because users cannot call functions that take or return internal. But with this system, it becomes possible
to
> feed arbitrary data to a deserialization function.
> The user could redefine the function on the remote side so that it produces arbitrary data of their choosing, and the
local
> deserialization function will ingest it.
> 
> That's potentially quite a significant problem. Consider for example that numericvar_deserialize() does no validity
> checking on any of the weight, sign, or dscale, but not all values for those fields are legal. Right now that doesn't
matter,
> but if you can feed arbitrary data to that function, then it is. I don't know exactly what the consequences are if
youcan get
 
> that function to spit out a NumericVar with values outside the normal legal range. What can you do then?
> Store a bogus numeric on disk? Crash the server? Worst case, some problem like this could be a security issue
allowingfor
 
> escalation to superuser; more likely, it would be a crash bug, corrupt your database, or lead to unexpected and
strange
> error messages.
> 
> Unfortunately, I have the unpleasant suspicion that most internal-type aggregates will be affected by this problem.
> Consider, for example, string_agg_deserialize(). Generally, strings are one of the least-constrained data types, so
you
> might hope that this function would be OK. But it doesn't look very promising. The first int4 in the serialized
representation
> is the cursor, which would have to be bounds-checked, lest someone provide a cursor that falls outside the bounds of
the
> StringInfo and, maybe, cause a reference to an arbitrary memory location. Then the rest of the representation is the
actual
> data, which could be anything. This function is used for both bytea and for text, and for bytea, letting the payload
be
> anything is OK.
> But for text, the supplied data shouldn't contain an embedded zero byte, or otherwise be invalid in the server
encoding.If
 
> it were, that would provide a vector to inject invalidly encoded data into the database. This feature can't be
allowedto do
 
> that.
I completely overlooked this issue. I should have considered the risks of sending raw state values or serialized state
data directly from remote to local. I apologize.

> What could be a solution to this class of problems? One option is to just give up on supporting this feature for
internal-type
> aggregates for now. That's easy enough to do, and just means we have less functionality, but it's sad because that's
> functionality we'd like to have. Another approach is to add necessary sanity checks to the relevant deserialization
> functions, but that seems a little hard to get right, and it would slow down parallel query cases which are probably
goingto
 
> be more common than the use of this feature. I think the slowdown might be significant, too. A third option is to
change
> those aggregates in some way, like giving them a transition function that operates on some data type other than
internal,
> but there again we have to be careful of slowdowns. A final option is to rethink the infrastructure in some way, like
having
> a way to serialize to something other than bytea, for which we already have input functions with adequate checks.
For
> instance, if string_agg_serialize() produced a record containing an integer column and a text or bytea column, we
could
> attempt to ingest that record on the other side and presumably the right things would happen in the case of any
invalid
> data. But I'm not quite sure what infrastructure would be required to make this kind of idea work.
Thank you very much for providing a direction towards resolving this issue.
As you have suggested as the last option, it seems that expanding the current mechanism of the aggregation
function is the only choice. It may take some time, but I will consider specific solutions.

> From: Robert Haas <robertmhaas@gmail.com>
> Sent: Tuesday, November 28, 2023 4:08 AM
> On Wed, Nov 22, 2023 at 1:32 AM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
> > Hi. HAVING is also a problem. Consider the following query
> >
> > SELECT count(a) FROM t HAVING count(a) > 10 - we can't push it down to
> > foreign server as HAVING needs full aggregate result, but foreign
> > server don't know it.
> 
> I don't see it that way. What we would push to the foreign server would be something like SELECT count(a) FROM t.
Then,
> after we get the results back and combine the various partial counts locally, we would locally evaluate the HAVING
clause
> afterward. That is, partial aggregation is a barrier to pushing down HAVING clause itself, but it doesn't preclude
pushing
> down the aggregation.
I understand what the problem is. I will try to fix it in the next version.

> From: Robert Haas <robertmhaas@gmail.com>
> Sent: Tuesday, November 28, 2023 5:03 AM
> On Wed, Nov 22, 2023 at 5:16 AM Fujii.Yuki@df.MitsubishiElectric.co.jp
> <Fujii.Yuki@df.mitsubishielectric.co.jp> wrote:
> > I did not choose Approach2 because I was not confident that the
> > disadvantage mentioned in 2.(2)(a) would be accepted by the PostgreSQL development community.
> > If it is accepted, I think Approach 2 is smarter.
> > Could you please provide your opinion on which approach is preferable
> > after comparing these two approaches?
> > If we cannot say anything without comparing the amount of source code,
> > as Mr.Momjian mentioned, we need to estimate the amount of source code required to implement Approach2.
> 
> I've had the same concern, that approach #2 would draw objections, so I think you were right to be cautious about it.
I
> don't think it is a wonderful approach in all ways, but I do think that it is superior to approach #1. If we add
dedicated
> support to the grammar, it is mostly a one-time effort, and after that, there should not be much need for anyone to
be
> concerned about it. If we instead add extra aggregates, then that generates extra work every time someone writes a
patch
> that adds a new aggregate to core. I have a difficult time believing that anyone will prefer an approach that
involvesan
 
> ongoing maintenance effort of that type over one that doesn't.
> 
> One point that seems to me to be of particular importance is that if we add new aggregates, there is a risk that
some
> future aggregate might do that incorrectly, so that the main aggregate works, but the secondary aggregate created for
this
> feature does not work. That seems like it would be very frustrating for future code authors so I'd like to avoid the
riskas
 
> much as we can.
Are you concerned about the hassle and potential human errors of manually adding new partial
aggregation functions, rather than the catalog becoming bloated?
The process of creating partial aggregation functions from aggregation functions can be automated,
so I believe this issue can be resolved. However, automating it may increase the size of the patch
even more, so overall, approach#2 might be better.
To implement approach #2, it would be necessary to investigate how much additional code is required.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Re: Partial aggregates pushdown

От

Robert Haas

Дата:

06 декабря 2023 г., 16:25:16

On Wed, Dec 6, 2023 at 3:41 AM Fujii.Yuki@df.MitsubishiElectric.co.jp
<Fujii.Yuki@df.mitsubishielectric.co.jp> wrote:
> Are you concerned about the hassle and potential human errors of manually adding new partial
> aggregation functions, rather than the catalog becoming bloated?

I'm concerned about both.

> The process of creating partial aggregation functions from aggregation functions can be automated,
> so I believe this issue can be resolved. However, automating it may increase the size of the patch
> even more, so overall, approach#2 might be better.
> To implement approach #2, it would be necessary to investigate how much additional code is required.

Yes. Unfortunately I fear that there is quite a lot of work left to do
here in order to produce a committable feature. To me it seems
necessary to conduct an investigation of approach #2. If the result of
that investigation is that nothing major stands in the way of approach
#2, then I think we should adopt it, which is more work. In addition,
the problems around transmitting serialized bytea blobs between
machines that can't be assumed to fully trust each other will need to
be addressed in some way, which seems like it will require a good deal
of design work, forming some kind of consensus, and then
implementation work to follow. In addition to that there may be some
small problems that need to be solved at a detail level, such as the
HAVING issue. I think the last category won't be too hard to sort out,
but that still leaves two really major areas to address.

--
Robert Haas
EDB: http://www.enterprisedb.com

RE: [CAUTION!! freemail] Re: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

07 декабря 2023 г., 03:10:58

Hi Mr.Haas.

> -----Original Message-----
> From: Robert Haas <robertmhaas@gmail.com>
> Sent: Wednesday, December 6, 2023 10:25 PM
> On Wed, Dec 6, 2023 at 3:41 AM Fujii.Yuki@df.MitsubishiElectric.co.jp
> <Fujii.Yuki@df.mitsubishielectric.co.jp> wrote:
> > Are you concerned about the hassle and potential human errors of
> > manually adding new partial aggregation functions, rather than the catalog becoming bloated?
> 
> I'm concerned about both.
Understood. Thank you for your response.

> > The process of creating partial aggregation functions from aggregation
> > functions can be automated, so I believe this issue can be resolved.
> > However, automating it may increase the size of the patch even more, so overall, approach#2 might be better.
> > To implement approach #2, it would be necessary to investigate how much additional code is required.
> 
> Yes. Unfortunately I fear that there is quite a lot of work left to do here in order to produce a committable
feature.To me it
 
> seems necessary to conduct an investigation of approach #2. If the result of that investigation is that nothing
major
> stands in the way of approach #2, then I think we should adopt it, which is more work. In addition, the problems
around
> transmitting serialized bytea blobs between machines that can't be assumed to fully trust each other will need to be
> addressed in some way, which seems like it will require a good deal of design work, forming some kind of consensus,
and
> then implementation work to follow. In addition to that there may be some small problems that need to be solved at a
> detail level, such as the HAVING issue. I think the last category won't be too hard to sort out, but that still
leavestwo really
 
> major areas to address.
Yes, I agree with you. It is clear that further investigation and discussion are still needed. 
I would be grateful if we can resolve this issue gradually. I would also like to continue the discussion if possible in
thefuture.
 

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Re: [CAUTION!! freemail] Re: Partial aggregates pushdown

От

Robert Haas

Дата:

07 декабря 2023 г., 17:56:08

On Wed, Dec 6, 2023 at 7:11 PM Fujii.Yuki@df.MitsubishiElectric.co.jp
<Fujii.Yuki@df.mitsubishielectric.co.jp> wrote:
> I would be grateful if we can resolve this issue gradually. I would also like to continue the discussion if possible
inthe future. 

I think that would be good. Thanks for your work on this. It is a hard problem.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: [CAUTION!! freemail] Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

08 декабря 2023 г., 00:12:18

On Thu, Dec  7, 2023 at 09:56:08AM -0500, Robert Haas wrote:
> On Wed, Dec 6, 2023 at 7:11 PM Fujii.Yuki@df.MitsubishiElectric.co.jp
> <Fujii.Yuki@df.mitsubishielectric.co.jp> wrote:
> > I would be grateful if we can resolve this issue gradually. I would also like to continue the discussion if
possiblein the future.

> 
> I think that would be good. Thanks for your work on this. It is a hard problem.

Agreed.  First, Robert is right that this feature is long overdue.  It
might not help many of our existing workloads, but it opens us up to
handling new, larger workloads.

Second, the patch already has a mechanism to check the remote server
version to see if it is the same or newer.   Here is the version check
documentation patch:

    check_partial_aggregate_support (boolean)

    If this option is false, <filename>postgres_fdw</filename> always
    uses partial aggregate pushdown by assuming that each built-in
    aggregate function has a partial aggregate function defined on
    the remote server.  If this option is true, local aggregates
    whose partial computation function references itself are assumed
    to exist on the remote server.    If not, during query planning,
    <filename>postgres_fdw</filename> will connect to the remote
    server and retrieve the remote server version.    If the remote
    version is the same or newer, partial aggregate functions will be
    assumed to exist.  If older, <filename>postgres_fdw</filename>
    checks that the remote server has a matching partial aggregate
    function before performing partial aggregate pushdown.    The default
    is <literal>false</literal>.

There is also an extension list that specifies which extension-owned
functions can be pushed down;  from the doc patch:

    To reduce the risk of misexecution of queries, WHERE clauses and
    aggregate expressions are not sent to the remote server unless they
    only use data types, operators, and functions that are built-in
    or belong to an extension that is listed in the foreign server's
    <literal>extensions</literal> option.

Third, we already have a way of creating records for tables:

    SELECT pg_language FROM pg_language;
                    pg_language
    -------------------------------------------
     (12,internal,10,f,f,0,0,2246,)
     (13,c,10,f,f,0,0,2247,)
     (14,sql,10,f,t,0,0,2248,)
     (13576,plpgsql,10,t,t,13573,13574,13575,)

And we do have record input functionality:

    CREATE TABLE test (x int, language pg_language);

    INSERT INTO test SELECT 0, pg_language FROM pg_language;

    SELECT * FROM test;
     x |                 language
    ---+-------------------------------------------
     0 | (12,internal,10,f,f,0,0,2246,)
     0 | (13,c,10,f,f,0,0,2247,)
     0 | (14,sql,10,f,t,0,0,2248,)
     0 | (13576,plpgsql,10,t,t,13573,13574,13575,)
    (4 rows)

However, functions don't have pre-created records, and internal
functions don't see to have an SQL-defined structure, but as I remember
the internal aggregate functions all take the same internal structure,
so I guess we only need one fixed input and one output that would
output/input such records.  Performance might be an issue, but at this
point let's just implement this and measure the overhead since there are
few/any(?) other viable options.

Fourth, going with #2 where we do the pushdown using an SQL keyword also
allows extensions to automatically work, while requiring partial
aggregate functions for every non-partial aggregate will require work
for extensions, and potentially lead to more version mismatch issues.

Finally, I am now concerned that this will not be able to be in PG 17,
which I was hoping for.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Re: [CAUTION!! freemail] Re: Partial aggregates pushdown

От

Robert Haas

Дата:

11 декабря 2023 г., 17:24:25

On Thu, Dec 7, 2023 at 4:12 PM Bruce Momjian <bruce@momjian.us> wrote:
> Second, the patch already has a mechanism to check the remote server
> version to see if it is the same or newer.   Here is the version check
> documentation patch:

Right. This feature can certainly be implemented in a
backward-compatible way. I'm not sure that we have as much control
over what does and does not get pushed down as we really want here,
but it's completely possible to do this in a way that doesn't break
other use cases.

> However, functions don't have pre-created records, and internal
> functions don't see to have an SQL-defined structure, but as I remember
> the internal aggregate functions all take the same internal structure,
> so I guess we only need one fixed input and one output that would
> output/input such records.  Performance might be an issue, but at this
> point let's just implement this and measure the overhead since there are
> few/any(?) other viable options.

IMHO records will be the easiest approach, but it will be some work to try it.

> Fourth, going with #2 where we do the pushdown using an SQL keyword also
> allows extensions to automatically work, while requiring partial
> aggregate functions for every non-partial aggregate will require work
> for extensions, and potentially lead to more version mismatch issues.

Yeah.

> Finally, I am now concerned that this will not be able to be in PG 17,
> which I was hoping for.

Getting it ready to ship by March seems very difficult. I'm not saying
it couldn't happen, but I think more likely it won't.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: [CAUTION!! freemail] Re: Partial aggregates pushdown

От

vignesh C

Дата:

27 января, 04:57:34

On Thu, 7 Dec 2023 at 05:41, Fujii.Yuki@df.MitsubishiElectric.co.jp
<Fujii.Yuki@df.mitsubishielectric.co.jp> wrote:
>
> Hi Mr.Haas.
>
> > -----Original Message-----
> > From: Robert Haas <robertmhaas@gmail.com>
> > Sent: Wednesday, December 6, 2023 10:25 PM
> > On Wed, Dec 6, 2023 at 3:41 AM Fujii.Yuki@df.MitsubishiElectric.co.jp
> > <Fujii.Yuki@df.mitsubishielectric.co.jp> wrote:
> > > Are you concerned about the hassle and potential human errors of
> > > manually adding new partial aggregation functions, rather than the catalog becoming bloated?
> >
> > I'm concerned about both.
> Understood. Thank you for your response.
>
> > > The process of creating partial aggregation functions from aggregation
> > > functions can be automated, so I believe this issue can be resolved.
> > > However, automating it may increase the size of the patch even more, so overall, approach#2 might be better.
> > > To implement approach #2, it would be necessary to investigate how much additional code is required.
> >
> > Yes. Unfortunately I fear that there is quite a lot of work left to do here in order to produce a committable
feature.To me it 
> > seems necessary to conduct an investigation of approach #2. If the result of that investigation is that nothing
major
> > stands in the way of approach #2, then I think we should adopt it, which is more work. In addition, the problems
around
> > transmitting serialized bytea blobs between machines that can't be assumed to fully trust each other will need to
be
> > addressed in some way, which seems like it will require a good deal of design work, forming some kind of consensus,
and
> > then implementation work to follow. In addition to that there may be some small problems that need to be solved at
a
> > detail level, such as the HAVING issue. I think the last category won't be too hard to sort out, but that still
leavestwo really 
> > major areas to address.
> Yes, I agree with you. It is clear that further investigation and discussion are still needed.
> I would be grateful if we can resolve this issue gradually. I would also like to continue the discussion if possible
inthe future. 

Thanks for all the efforts on this patch. I have changed the status of
the commitfest entry to "Returned with Feedback" as there is still
some work to get this patch out. Feel free to continue the discussion
and add a new entry when the patch is in a reviewable shape.

Regards,
Vignesh

Re: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

22 февраля, 10:20:45

Hi. Mr.Haas, hackers.

I apologize for the significant delay since my last post.
I have conducted investigations and considerations regarding the remaining tasks as follows.
Would it be possible for you to review them?
In particular, could you please confirm if the approach mentioned in 1. is acceptable? 
If there are no issues with the direction outlined in 1., I plan to make a simple prototype based on this approach.

1. Transmitting state value safely between machines
> From: Robert Haas <robertmhaas@gmail.com>
> Sent: Wednesday, December 6, 2023 10:25 PM
> the problems around transmitting
> serialized bytea blobs between machines that can't be assumed to fully trust each other will need to be addressed in
some
> way, which seems like it will require a good deal of design work, forming some kind of consensus, and then
implementation
> work to follow. 
I have considered methods for safely transmitting state values between different machines.
I have taken into account the version policy of PostgreSQL (5 years of support) and the major version release cycle
overthe past 10 years (1 year), and as a result, I have made the assumption that transmission is allowed only when the
differencebetween the local version and the remote version is 5 or less.

I believe that by adding new components, "export function" and "import function", to the aggregate functions, and
furtherintroducing a new SQL keyword to the query syntax of aggregate expressions, we can address this issue.

If the version of the local server is higher than or equal to the version of the remote server, the proposed method can
besimplified. The export version mentioned later in (1) would not be necessary. Furthermore, if the version of the
localserver matches the version of the remote server, the proposed method can be further simplified.

I would appreciate your input on reasonable assumptions regarding the differences in versions between the local server
andthe remote server.

I will explain the specifications of the export function, import function, the new SQL keyword for aggregate
expressions,and the behavior of query processing for partial aggregation separately.

(1) Export Function Specification
This function is another final function for partial aggregate.
This function converts the state value that represents the result of partial aggregation into a format that can be read
bythe local server. 

This function is called instead of the existing finalfunc during the final stage of aggregation when performing partial
aggregation.
The conversion process described above will be referred to as "export".
The argument of an export function is the version of the server that will receive the return value.
Hereafter, this version will be referred to as the export version.
The concept of an export version is necessary to handle cases where the version of the local server is smaller than the
versionof the remote server.

The return value of the export function is the transformed state value, and its data type is bytea.
For backward compatibility, the developer of the export function must ensure that the export can be performed for major
versionsup to five versions prior to the major version of PostgreSQL that the export function is being developed for.

For built-in functions, I believe it is necessary to allow for the possibility of not developing the export
functionalityfor specific versions in the future (due to reasons such as development burden) after the export function
isdeveloped for a certain version.

To achieve this, for built-in functions, we will add a column to the pg_aggregate catalog that indicates the presence
orabsence of export functionality for each major version, including the major version being developed and the previous
fivemajor versions. This column will be named safety_export_versions and will have a data type of boolean[6].

For user-defined functions, we will refer to the extensions option and add an external server option called
safety_export_extensions,which will maintain a list of extensions that include only the aggregate functions that can be
exportedto the local server version.

(2) Import Function Specification
The import function is a function that performs validity checks on the exported data and converts it into a state
value.The process of this conversion is referred to as importing.

The import function is called from postgres_fdw in the local server.
The arguments of the import function are the exported data and the export version.
The return value of the import function is a state value that can be read on the local server.
The import function will terminate with an error if the validity check determines that the exported result cannot be
readon the local server.

For backward compatibility, developers of the import function must ensure that it can be imported if the export version
isup to five versions prior to their own version.

(3) The new SQL keyword for aggregate expressions
The local server passes the instructions for partial aggregation and the export version to the remote server using SQL
keywords.The syntax specification is as follows:

aggregate_function(PARTIAL_AGGREGATE(export_version) expr)

Here, PARTIAL_AGGREGATE is a keyword that indicates partial aggregation, and export_version is a string constant that
indicatesthe export version.

(4) The behavior of query processing for partial aggregation
I will illustrate the flow of query processing using the example query "select aggfunc(c) from t".
In the following explanation, the major version of the remote server will be referred to as remote_version, and the
majorversion of the local server will be referred to as local_version.

STEP1. Checking the feasibility of partial aggregation pushdown on the local server
(i) Retrieving the remote_version
The postgres_fdw connects to the remote server and retrieves the remote_version.
(ii) Checking the versions
The postgres_fdw determines whether the difference between local_version and remote_version is within 5. If the
differenceis 6 or more, it is determined that partial aggregation pushdown is not possible.

(iii) Checking the import function
The postgres_fdw checks the pg_aggregate catalog to see if there is an import function for aggfunc. If there is none,
itis determined that partial aggregation pushdown is not possible.

(iv) Checking the export function
If aggfunc is a built-in function, the postgres_fdw checks the pg_aggregate catalog. It checks if there is a version
numberexport_version that satisfies the conditions local_version >= export_version >= local_version-5 and if there is
anexport function available for that version. If the version number export_version does not exist, it is determined
thatpartial aggregation pushdown is not possible. This check is only performed if local_version >= remote_version.

If aggfunc is a user-defined function, the postgres_fdw checks if the extension on which aggfunc depends is included in
export_safety_extensions.If it is not included, it is determined that partial aggregation pushdown is not possible.

STEP2. Sending a remote query on the local server
The query containing the keyword indicating partial aggregation is sent to the remote server. The remote query for the
samplequery would be as follows: 

"select aggfunc(PARTIAL_AGGREGATE(export_version) c) from t"

STEP3. Executing the remote query on the remote server
The remote server performs partial aggregation for aggfunc. Instead of calling the final function at the last stage of
aggregation,the remote server calls the export function with export_version and generates the return value of the
partialaggregation.

STEP4. Receiving the result of the remote query on the local server
The postgres_fdw passes the export_version and the return value of STEP3 to the import function of aggfunc and receives
thestate value. The postgres_fdw then passes the received state value to the executor of the local server.

2. The approach of adding SQL keywords
> From: Robert Haas <robertmhaas@gmail.com>
> Sent: Tuesday, November 21, 2023 5:52 AM
> I do have a concern about this, though. It adds a lot of bloat. It adds a whole lot of additional entries to
pg_aggregate,and

> every new aggregate we add in the future will require a bonus entry for this, and it needs a bunch of new pg_proc
entriesas

> well. One idea that I've had in the past is to instead introduce syntax that just does this, without requiring a
separate
> aggregate definition in each case.
> For example, maybe instead of changing string_agg(whatever) to string_agg_p_text_text(whatever), you can say
> PARTIAL_AGGREGATE
> string_agg(whatever) or string_agg(PARTIAL_AGGREGATE whatever) or something. Then all aggregates could be treated
> in a generic way. I'm not completely sure that's better, but I think it's worth considering.
I have prototyped an approach using SQL keywords for the patch that does not include the functionality of Step 1.
Pleasefind the prototype attached as a file.

# I apologize for not including sufficient comments, documentation, and tests in the prototype. Please understand.
Mainly, it seems that we can address this by adding handling for the new SQL keywords in the parser and making
modificationsto the finalize process for aggregation in the executor.

As pointed out by Mr.Haas, it has been realized that the code can be significantly simplified.
The additional lines of code, excluding documentation and tests, are as follows.
Adding new aggregate functions approach(approach #1): 1,069
Adding new SQL keyword approach(approach #2): 318
As mentioned in 1., I plan to modify the patch by adding SQL keywords in the future.

3. Fixing the behavior when the HAVING clause is present
> From: Robert Haas <robertmhaas@gmail.com>
> Sent: Tuesday, November 28, 2023 4:08 AM
> 
> On Wed, Nov 22, 2023 at 1:32 AM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
> > Hi. HAVING is also a problem. Consider the following query
> >
> > SELECT count(a) FROM t HAVING count(a) > 10 - we can't push it down to
> > foreign server as HAVING needs full aggregate result, but foreign
> > server don't know it.
> 
> I don't see it that way. What we would push to the foreign server would be something like SELECT count(a) FROM t.
Then,
> after we get the results back and combine the various partial counts locally, we would locally evaluate the HAVING
clause
> afterward. That is, partial aggregation is a barrier to pushing down HAVING clause itself, but it doesn't preclude
pushing
> down the aggregation.
I have made modifications in the attached patch to ensure that when the HAVING clause is present, the HAVING clause is
executedlocally while the partial aggregations are pushed down.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-SQL-keyword.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

28 февраля, 16:43:07

Hi.

Fujii.Yuki@df.MitsubishiElectric.co.jp писал(а) 2024-02-22 10:20:
> Hi. Mr.Haas, hackers.
> 
> I apologize for the significant delay since my last post.
> I have conducted investigations and considerations regarding the 
> remaining tasks as follows.
> Would it be possible for you to review them?
> In particular, could you please confirm if the approach mentioned in 1. 
> is acceptable?
> If there are no issues with the direction outlined in 1., I plan to 
> make a simple prototype based on this approach.
> 
> 1. Transmitting state value safely between machines
>> From: Robert Haas <robertmhaas@gmail.com>
>> Sent: Wednesday, December 6, 2023 10:25 PM
>> the problems around transmitting
>> serialized bytea blobs between machines that can't be assumed to fully 
>> trust each other will need to be addressed in some
>> way, which seems like it will require a good deal of design work, 
>> forming some kind of consensus, and then implementation
>> work to follow.
> I have considered methods for safely transmitting state values between 
> different machines.
> I have taken into account the version policy of PostgreSQL (5 years of 
> support) and the major version release cycle over the past 10 years (1 
> year), and as a result, I have made the assumption that transmission is 
> allowed only when the difference between the local version and the 
> remote version is 5 or less.
> I believe that by adding new components, "export function" and "import 
> function", to the aggregate functions, and further introducing a new 
> SQL keyword to the query syntax of aggregate expressions, we can 
> address this issue.
> If the version of the local server is higher than or equal to the 
> version of the remote server, the proposed method can be simplified. 
> The export version mentioned later in (1) would not be necessary. 
> Furthermore, if the version of the local server matches the version of 
> the remote server, the proposed method can be further simplified.
> I would appreciate your input on reasonable assumptions regarding the 
> differences in versions between the local server and the remote server.
> I will explain the specifications of the export function, import 
> function, the new SQL keyword for aggregate expressions, and the 
> behavior of query processing for partial aggregation separately.
> (1) Export Function Specification
> This function is another final function for partial aggregate.
> This function converts the state value that represents the result of 
> partial aggregation into a format that can be read by the local server.
> This function is called instead of the existing finalfunc during the 
> final stage of aggregation when performing partial aggregation.
> The conversion process described above will be referred to as "export".
> The argument of an export function is the version of the server that 
> will receive the return value.
> Hereafter, this version will be referred to as the export version.
> The concept of an export version is necessary to handle cases where the 
> version of the local server is smaller than the version of the remote 
> server.
> The return value of the export function is the transformed state value, 
> and its data type is bytea.
> For backward compatibility, the developer of the export function must 
> ensure that the export can be performed for major versions up to five 
> versions prior to the major version of PostgreSQL that the export 
> function is being developed for.
> For built-in functions, I believe it is necessary to allow for the 
> possibility of not developing the export functionality for specific 
> versions in the future (due to reasons such as development burden) 
> after the export function is developed for a certain version.
> To achieve this, for built-in functions, we will add a column to the 
> pg_aggregate catalog that indicates the presence or absence of export 
> functionality for each major version, including the major version being 
> developed and the previous five major versions. This column will be 
> named safety_export_versions and will have a data type of boolean[6].
> For user-defined functions, we will refer to the extensions option and 
> add an external server option called safety_export_extensions, which 
> will maintain a list of extensions that include only the aggregate 
> functions that can be exported to the local server version.
> ...

I honestly think that achieving cross-version compatibility in this way 
puts a significant burden on developers. Can we instead always use the 
more or less universal export and import function to fix possible issues 
with binary representations on different architectures and just refuse 
to push down partial aggregates on server version mismatch? At least at 
the first step?

> 
> 3. Fixing the behavior when the HAVING clause is present
>> From: Robert Haas <robertmhaas@gmail.com>
>> Sent: Tuesday, November 28, 2023 4:08 AM
>> 
>> On Wed, Nov 22, 2023 at 1:32 AM Alexander Pyhalov 
>> <a.pyhalov@postgrespro.ru> wrote:
>> > Hi. HAVING is also a problem. Consider the following query
>> >
>> > SELECT count(a) FROM t HAVING count(a) > 10 - we can't push it down to
>> > foreign server as HAVING needs full aggregate result, but foreign
>> > server don't know it.
>> 
>> I don't see it that way. What we would push to the foreign server 
>> would be something like SELECT count(a) FROM t. Then,
>> after we get the results back and combine the various partial counts 
>> locally, we would locally evaluate the HAVING clause
>> afterward. That is, partial aggregation is a barrier to pushing down 
>> HAVING clause itself, but it doesn't preclude pushing
>> down the aggregation.
> I have made modifications in the attached patch to ensure that when the 
> HAVING clause is present, the HAVING clause is executed locally while 
> the partial aggregations are pushed down.
> 
> 

Sorry, I don't see how it works. When we have partial aggregates and 
having clause, foreign_grouping_ok() returns false and 
add_foreign_grouping_paths() adds no paths.
I'm not saying it's necessary to fix this in the first patch version.

explain verbose select sum(a) from pagg_tab having sum(a)>10;
                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------
  Finalize Aggregate  (cost=2282.49..2282.50 rows=1 width=8)
    Output: sum(pagg_tab.a)
    Filter: (sum(pagg_tab.a) > 10)
    ->  Append  (cost=760.81..2282.48 rows=3 width=8)
          ->  Partial Aggregate  (cost=760.81..760.82 rows=1 width=8)
                Output: PARTIAL sum(pagg_tab.a)
                ->  Foreign Scan on public.fpagg_tab_p1 pagg_tab  
(cost=100.00..753.50 rows=2925 width=4)
                      Output: pagg_tab.a
                      Remote SQL: SELECT a FROM public.pagg_tab_p1
          ->  Partial Aggregate  (cost=760.81..760.82 rows=1 width=8)
                Output: PARTIAL sum(pagg_tab_1.a)
                ->  Foreign Scan on public.fpagg_tab_p2 pagg_tab_1  
(cost=100.00..753.50 rows=2925 width=4)
                      Output: pagg_tab_1.a
                      Remote SQL: SELECT a FROM public.pagg_tab_p2
          ->  Partial Aggregate  (cost=760.81..760.82 rows=1 width=8)
                Output: PARTIAL sum(pagg_tab_2.a)
                ->  Foreign Scan on public.fpagg_tab_p3 pagg_tab_2  
(cost=100.00..753.50 rows=2925 width=4)
                      Output: pagg_tab_2.a
                      Remote SQL: SELECT a FROM public.pagg_tab_p3


Also I have some minor notices on the code.

contrib/postgres_fdw/deparse.c: comment before appendFunctionName() has 
gone, this seems to be wrong.

In finalize_aggregate()

1079         /*
1080          * Apply the agg's finalfn if one is provided, else return 
transValue.
1081          */

Comment should be updated to note behavior for agg_partial aggregates.

1129         else if (peragg->aggref->agg_partial
1130                         && (peragg->aggref->aggtranstype ==
INTERNALOID)
1131                         && OidIsValid(peragg->serialfn_oid))

In this if branch, should we check just for peragg->aggref->agg_partial 
and peragg->aggref->aggtranstype == INTERNALOID? It seems that if 
peragg->aggref->aggtranstype == INTERNALOID and there's no
serialfn_oid, it's likely an error (and one should be generated).

Overall patch seems nicer. Will look at it more this week.
-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

16 марта, 05:28:50

Hi. Mr.Pyhalov.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Wednesday, February 28, 2024 10:43 PM
> > 1. Transmitting state value safely between machines
> >> From: Robert Haas <robertmhaas@gmail.com>
> >> Sent: Wednesday, December 6, 2023 10:25 PM the problems around
> >> transmitting serialized bytea blobs between machines that can't be
> >> assumed to fully trust each other will need to be addressed in some
> >> way, which seems like it will require a good deal of design work,
> >> forming some kind of consensus, and then implementation work to
> >> follow.
> > I have considered methods for safely transmitting state values between
> > different machines.
> > I have taken into account the version policy of PostgreSQL (5 years of
> > support) and the major version release cycle over the past 10 years (1
> > year), and as a result, I have made the assumption that transmission
> > is allowed only when the difference between the local version and the
> > remote version is 5 or less.
> > I believe that by adding new components, "export function" and "import
> > function", to the aggregate functions, and further introducing a new
> > SQL keyword to the query syntax of aggregate expressions, we can
> > address this issue.
> >
 ...
> 
> I honestly think that achieving cross-version compatibility in this way puts a significant burden on developers. Can
we
> instead always use the more or less universal export and import function to fix possible issues with binary
representations
> on different architectures and just refuse to push down partial aggregates on server version mismatch? At least at
thefirst
 
> step?
Thank you for your comment. I agree with your point that the proposed method would impose a significant burden on
developers.In order to ensure cross-version compatibility, it is necessary to impose constraints on the format of the
statevalues exchanged between servers, which would indeed burden developers. As you mentioned, I think that it is
realisticto allow partial aggregation pushdown only when coordinating between the same versions in the first step.
 

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Wednesday, February 28, 2024 10:43 PM
> > 3. Fixing the behavior when the HAVING clause is present
> >> From: Robert Haas <robertmhaas@gmail.com>
> >> Sent: Tuesday, November 28, 2023 4:08 AM
> >>
> >> On Wed, Nov 22, 2023 at 1:32 AM Alexander Pyhalov
> >> <a.pyhalov@postgrespro.ru> wrote:
> >> > Hi. HAVING is also a problem. Consider the following query
> >> >
> >> > SELECT count(a) FROM t HAVING count(a) > 10 - we can't push it down
> >> > to foreign server as HAVING needs full aggregate result, but
> >> > foreign server don't know it.
> >>
> >> I don't see it that way. What we would push to the foreign server
> >> would be something like SELECT count(a) FROM t. Then, after we get
> >> the results back and combine the various partial counts locally, we
> >> would locally evaluate the HAVING clause afterward. That is, partial
> >> aggregation is a barrier to pushing down HAVING clause itself, but it
> >> doesn't preclude pushing down the aggregation.
> > I have made modifications in the attached patch to ensure that when
> > the HAVING clause is present, the HAVING clause is executed locally
> > while the partial aggregations are pushed down.
> >
> >
> 
> Sorry, I don't see how it works. When we have partial aggregates and having clause, foreign_grouping_ok() returns
falseand
 
> add_foreign_grouping_paths() adds no paths.
> I'm not saying it's necessary to fix this in the first patch version.
Our sincere apologies. I had attached an older version before this modification.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Wednesday, February 28, 2024 10:43 PM
> contrib/postgres_fdw/deparse.c: comment before appendFunctionName() has gone, this seems to be wrong.
Fixed.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Wednesday, February 28, 2024 10:43 PM
> In finalize_aggregate()
> 
> 1079         /*
> 1080          * Apply the agg's finalfn if one is provided, else return
> transValue.
> 1081          */
> 
> Comment should be updated to note behavior for agg_partial aggregates.
Fixed.

> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
> Sent: Wednesday, February 28, 2024 10:43 PM
> In this if branch, should we check just for peragg->aggref->agg_partial and peragg->aggref->aggtranstype ==
> INTERNALOID? It seems that if
> peragg->aggref->aggtranstype == INTERNALOID and there's no
> serialfn_oid, it's likely an error (and one should be generated).
As you pointed out, I have made modifications to the source code so that it terminates with an error if serialfn is
invalid.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-SQL-keyword-v2.patch

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

19 марта, 23:29:42

On Sat, Mar 16, 2024 at 02:28:50AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> Hi. Mr.Pyhalov.
>
> > From: Alexander Pyhalov <a.pyhalov@postgrespro.ru> Sent: Wednesday,
> > February 28, 2024 10:43 PM
> > > 1. Transmitting state value safely between machines
> > >> From: Robert Haas <robertmhaas@gmail.com> Sent: Wednesday,
> > >> December 6, 2023 10:25 PM the problems around transmitting
> > >> serialized bytea blobs between machines that can't be assumed to
> > >> fully trust each other will need to be addressed in some way,
> > >> which seems like it will require a good deal of design work,
> > >> forming some kind of consensus, and then implementation work to
> > >> follow.
> > > I have considered methods for safely transmitting state values
> > > between different machines.  I have taken into account the version
> > > policy of PostgreSQL (5 years of support) and the major version
> > > release cycle over the past 10 years (1 year), and as a result, I
> > > have made the assumption that transmission is allowed only when
> > > the difference between the local version and the remote version
> > > is 5 or less.  I believe that by adding new components, "export
> > > function" and "import function", to the aggregate functions, and
> > > further introducing a new SQL keyword to the query syntax of
> > > aggregate expressions, we can address this issue.
> >
> > I honestly think that achieving cross-version compatibility in
> > this way puts a significant burden on developers. Can we instead
> > always use the more or less universal export and import function
> > to fix possible issues with binary representations on different
> > architectures and just refuse to push down partial aggregates on
> > server version mismatch? At least at the first step?
>
> Thank you for your comment. I agree with your point that the proposed
> method would impose a significant burden on developers. In order
> to ensure cross-version compatibility, it is necessary to impose
> constraints on the format of the state values exchanged between
> servers, which would indeed burden developers. As you mentioned, I
> think that it is realistic to allow partial aggregation pushdown only
> when coordinating between the same versions in the first step.

The current patch has:

      if ((OidIsValid(aggform->aggfinalfn) ||
              (aggform->aggtranstype == INTERNALOID)) &&
              fpinfo->check_partial_aggregate_support)
      {
              if (fpinfo->remoteversion == 0)
              {
                      PGconn     *conn = GetConnection(fpinfo->user, false, NULL);

                      fpinfo->remoteversion = PQserverVersion(conn);
              }

              if (fpinfo->remoteversion < PG_VERSION_NUM)
                      partial_agg_ok = false;
      }

It uses check_partial_aggregate_support, which defaults to false,
meaning partial aggregates will be pushed down with no version check by
default.  If set to true, pushdown will happen if the remote server is
the same version or newer, which seems acceptable to me.

FYI, the patch is much smaller now.  :-)

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Re: Partial aggregates pushdown

От

Tom Lane

Дата:

20 марта, 00:29:07

Bruce Momjian <bruce@momjian.us> writes:
> The current patch has:

>       if ((OidIsValid(aggform->aggfinalfn) ||
>               (aggform->aggtranstype == INTERNALOID)) &&
>               fpinfo->check_partial_aggregate_support)
>       {
>               if (fpinfo->remoteversion == 0)
>               {
>                       PGconn     *conn = GetConnection(fpinfo->user, false, NULL);

>                       fpinfo->remoteversion = PQserverVersion(conn);
>               }

>               if (fpinfo->remoteversion < PG_VERSION_NUM)
>                       partial_agg_ok = false;
>       }

> It uses check_partial_aggregate_support, which defaults to false,
> meaning partial aggregates will be pushed down with no version check by
> default.  If set to true, pushdown will happen if the remote server is
> the same version or newer, which seems acceptable to me.

I'd like to vociferously protest both of those decisions.

"No version check by default" means "unsafe by default", which is not
project style in general and is especially not so for postgres_fdw.
We have tried very hard for years to ensure that postgres_fdw will
work with a wide range of remote server versions, and generally been
extremely conservative about what we think will work (example:
collations); but this patch seems ready to throw that principle away.

Also, surely "remoteversion < PG_VERSION_NUM" is backwards.  What
this would mean is that nobody can ever change a partial aggregate's
implementation, because that would break queries issued from older
servers (that couldn't know about the change) to newer ones.

Realistically, I think it's fairly unsafe to try aggregate pushdown
to anything but the same PG major version; otherwise, you're buying
into knowing which aggregates have partial support in which versions,
as well as predicting the future about incompatible state changes.
Even that isn't bulletproof --- e.g, maybe somebody wasn't careful
about endianness-independence of the serialized partial state, making
it unsafe to ship --- so there had better be a switch whereby the user
can disable it.

Maybe we could define a three-way setting:

* default: push down partial aggs only to same major PG version
* disable: don't push down, period
* force: push down regardless of remote version

With the "force" setting, it's the user's responsibility not to
issue any remote-able aggregation that would be unsafe to push
down.  This is still a pretty crude tool: I can foresee people
wanting to have per-aggregate control over things, especially
extension-supplied aggregates.  But it'd do for starters.

I'm not super thrilled by the fact that the patch contains zero
user-facing documentation, even though it's created new SQL syntax,
not to mention a new postgres_fdw option.  I assume this means that
nobody thinks it's anywhere near ready to commit.

            regards, tom lane

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

20 марта, 02:09:14

On Tue, Mar 19, 2024 at 05:29:07PM -0400, Tom Lane wrote:
> I'd like to vociferously protest both of those decisions.
> 
> "No version check by default" means "unsafe by default", which is not
> project style in general and is especially not so for postgres_fdw.
> We have tried very hard for years to ensure that postgres_fdw will
> work with a wide range of remote server versions, and generally been
> extremely conservative about what we think will work (example:
> collations); but this patch seems ready to throw that principle away.
> 
> Also, surely "remoteversion < PG_VERSION_NUM" is backwards.  What
> this would mean is that nobody can ever change a partial aggregate's
> implementation, because that would break queries issued from older
> servers (that couldn't know about the change) to newer ones.

Well it is the origin server that is issuing the PUSHDOWN syntax, so an
older origin server should be able to push to a newer remote server.

> Realistically, I think it's fairly unsafe to try aggregate pushdown
> to anything but the same PG major version; otherwise, you're buying
> into knowing which aggregates have partial support in which versions,
> as well as predicting the future about incompatible state changes.

Yes, incompatible state changes would be a problem with an older origin
server with a newer remote server setup.

If we require matching versions, we must accept that upgrades will
require more downtime.

> Even that isn't bulletproof --- e.g, maybe somebody wasn't careful
> about endianness-independence of the serialized partial state, making
> it unsafe to ship --- so there had better be a switch whereby the user
> can disable it.

Makes sense.  I was also wondering how a user would know whether the
pushdown is happening, or not.

> Maybe we could define a three-way setting:
> 
> * default: push down partial aggs only to same major PG version
> * disable: don't push down, period
> * force: push down regardless of remote version

What would be the default?  If it is the first one, it requires a
remote version check on first in the session.

> With the "force" setting, it's the user's responsibility not to
> issue any remote-able aggregation that would be unsafe to push
> down.  This is still a pretty crude tool: I can foresee people
> wanting to have per-aggregate control over things, especially
> extension-supplied aggregates.  But it'd do for starters.

We have the postgres_fdw extensions option to control function pushdown
to extensions.

> I'm not super thrilled by the fact that the patch contains zero
> user-facing documentation, even though it's created new SQL syntax,
> not to mention a new postgres_fdw option.  I assume this means that
> nobody thinks it's anywhere near ready to commit.

Previous versions of the patch had docs since I know I worked on
improving them.  I am not sure what happened to them.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

21 марта, 14:37:50

Hi. Mr.Momjian, Mr.Lane, Mr.Haas, hackers.

I apologize for any misunderstanding regarding the context of the attached patch and
the points on which I requested a review. Could you please allow me to clarify?

In the review around early December 2023, I received the following three issues pointed out by Mr.Haas[1].
1. Transmitting state value safely between machines
2. Making the patch clearer by adding SQL keywords
3. Fixing the behavior when the HAVING clause is present

In the email sent on February 22, 2024[2], I provided an update on the progress made in addressing these issues.
Regarding issue 1, I have only provided a proposed solution in the email and have not started the programming.
Therefore, the latest patch is not in a commit-ready state. As mentioned later, we have also temporarily reverted the
changesmade to the documentation.
Before proceeding with the programming, I would like to discuss the proposed solution with the community and seek
consensus.
If it is necessary to have source code in order to discuss, I can create a simple prototype so that I can receive your
feedback.
Would you be able to provide your opinions on it?

Regarding issue 2., I have confirmed that creating a prototype allows us to address the issue and clear the patch.
In this prototype creation, the main purpose was to verify if the patch can be cleared and significant revisions were
madeto the previous version.
Therefore, I have removed all the document differences.
I have submitted a patch [3] that includes the fixes for issue 3. to the patch that was posted in [2].
Regarding the proposed solution for issue 1, unlike the patch posted in [3],
we have a policy of not performing partial aggregation pushdown if we cannot guarantee compatibility and safety.
The latest patch in [3] is a POC patch. The patch that Mr. Momjian reviewed is this.
If user-facing documentation is needed for this POC patch, it can be added.

I apologize for the lack of explanation regarding this positioning, which may have caused misunderstandings regarding
thepatch posted in [3].

[1] https://www.postgresql.org/message-id/CA%2BTgmoYCrtOvk2f32qQKZV%3DjNL35tandf2A2Dp_2F5ASuiG1BA%40mail.gmail.com
[2]
https://www.postgresql.org/message-id/TYAPR01MB5514F0CBD9CD4F84A261198195562%40TYAPR01MB5514.jpnprd01.prod.outlook.com
[3]
https://www.postgresql.org/message-id/TYAPR01MB55141D18188AC86ADCE35FCB952F2%40TYAPR01MB5514.jpnprd01.prod.outlook.com

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

22 марта, 01:01:00

On Thu, Mar 21, 2024 at 11:37:50AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> Hi. Mr.Momjian, Mr.Lane, Mr.Haas, hackers.
> 
> I apologize for any misunderstanding regarding the context of the attached patch and
> the points on which I requested a review. Could you please allow me to clarify?
> 
> In the review around early December 2023, I received the following three issues pointed out by Mr.Haas[1].
> 1. Transmitting state value safely between machines
> 2. Making the patch clearer by adding SQL keywords
> 3. Fixing the behavior when the HAVING clause is present
> 
> In the email sent on February 22, 2024[2], I provided an update on the progress made in addressing these issues.
> Regarding issue 1, I have only provided a proposed solution in the email and have not started the programming. 
> Therefore, the latest patch is not in a commit-ready state. As mentioned later, we have also temporarily reverted the
changesmade to the documentation.
 
> Before proceeding with the programming, I would like to discuss the proposed solution with the community and seek
consensus.
> If it is necessary to have source code in order to discuss, I can create a simple prototype so that I can receive
yourfeedback.
 
> Would you be able to provide your opinions on it?
> 
> Regarding issue 2., I have confirmed that creating a prototype allows us to address the issue and clear the patch.
> In this prototype creation, the main purpose was to verify if the patch can be cleared and significant revisions were
madeto the previous version.
 
> Therefore, I have removed all the document differences.
> I have submitted a patch [3] that includes the fixes for issue 3. to the patch that was posted in [2].
> Regarding the proposed solution for issue 1, unlike the patch posted in [3], 
> we have a policy of not performing partial aggregation pushdown if we cannot guarantee compatibility and safety.
> The latest patch in [3] is a POC patch. The patch that Mr. Momjian reviewed is this.
> If user-facing documentation is needed for this POC patch, it can be added.
> 
> I apologize for the lack of explanation regarding this positioning, which may have caused misunderstandings regarding
thepatch posted in [3].
 

That makes sense.  Let's get you answers to those questions first before
you continue.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

25 марта, 10:00:51

Fujii.Yuki@df.MitsubishiElectric.co.jp писал(а) 2024-03-16 05:28:
> Hi. Mr.Pyhalov.
>> >>
>> >> I don't see it that way. What we would push to the foreign server
>> >> would be something like SELECT count(a) FROM t. Then, after we get
>> >> the results back and combine the various partial counts locally, we
>> >> would locally evaluate the HAVING clause afterward. That is, partial
>> >> aggregation is a barrier to pushing down HAVING clause itself, but it
>> >> doesn't preclude pushing down the aggregation.
>> > I have made modifications in the attached patch to ensure that when
>> > the HAVING clause is present, the HAVING clause is executed locally
>> > while the partial aggregations are pushed down.
>> >
>> >
>> 
>> Sorry, I don't see how it works. When we have partial aggregates and 
>> having clause, foreign_grouping_ok() returns false and
>> add_foreign_grouping_paths() adds no paths.
>> I'm not saying it's necessary to fix this in the first patch version.
> Our sincere apologies. I had attached an older version before this 
> modification.
> 

Hi.

In foreign_grouping_ok() having qual is added to local conds here:

6635                         if (is_foreign_expr(root, grouped_rel, 
expr) && !partial)
6636                                 fpinfo->remote_conds = 
lappend(fpinfo->remote_conds, rinfo);
6637                         else
6638                                 fpinfo->local_conds = 
lappend(fpinfo->local_conds, rinfo);
6639                 }
6640         }


This is incorrect. If you look at plan for query in postgres_fdw.sql


-- Partial aggregates are safe to push down when there is a HAVING 
clause
EXPLAIN (VERBOSE, COSTS OFF)
SELECT b, avg(a), max(a), count(*) FROM pagg_tab GROUP BY b HAVING 
sum(a) < 700 ORDER BY 1;
                                                                      
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------
  Finalize GroupAggregate
    Output: pagg_tab.b, avg(pagg_tab.a), max(pagg_tab.a), count(*)
    Group Key: pagg_tab.b
    Filter: (sum(pagg_tab.a) < 700)
    ->  Sort
          Output: pagg_tab.b, (PARTIAL avg(pagg_tab.a)), (PARTIAL 
max(pagg_tab.a)), (PARTIAL count(*)), (PARTIAL sum(pagg_tab.a))
          Sort Key: pagg_tab.b
          ->  Append
                ->  Foreign Scan
                      Output: pagg_tab.b, (PARTIAL avg(pagg_tab.a)), 
(PARTIAL max(pagg_tab.a)), (PARTIAL count(*)), (PARTIAL sum(pagg_tab.a))
                      Filter: ((sum(pagg_tab.a)) < 700)
                      Relations: Aggregate on (public.fpagg_tab_p1 
pagg_tab)
                      Remote SQL: SELECT b, avg(PARTIAL_AGGREGATE a), 
max(a), count(*), sum(a), sum(a) FROM public.pagg_tab_p1 GROUP BY 1
                ->  Foreign Scan
                      Output: pagg_tab_1.b, (PARTIAL avg(pagg_tab_1.a)), 
(PARTIAL max(pagg_tab_1.a)), (PARTIAL count(*)), (PARTIAL 
sum(pagg_tab_1.a))
                      Filter: ((sum(pagg_tab_1.a)) < 700)
                      Relations: Aggregate on (public.fpagg_tab_p2 
pagg_tab_1)
                      Remote SQL: SELECT b, avg(PARTIAL_AGGREGATE a), 
max(a), count(*), sum(a), sum(a) FROM public.pagg_tab_p2 GROUP BY 1
                ->  Foreign Scan
                      Output: pagg_tab_2.b, (PARTIAL avg(pagg_tab_2.a)), 
(PARTIAL max(pagg_tab_2.a)), (PARTIAL count(*)), (PARTIAL 
sum(pagg_tab_2.a))
                      Filter: ((sum(pagg_tab_2.a)) < 700)
                      Relations: Aggregate on (public.fpagg_tab_p3 
pagg_tab_2)
                      Remote SQL: SELECT b, avg(PARTIAL_AGGREGATE a), 
max(a), count(*), sum(a), sum(a) FROM public.pagg_tab_p3 GROUP BY 1


You can see that filter is applied before append. The result is correct 
only by chance, as sum in every partition is actually < 700. If you 
lower this bound, let's say, to 200, you'll start getting wrong results 
as data is filtered prior to aggregation.

It seems, however, that in partial case you should just avoid pulling 
conditions from having qual at all, all filters will be applied on upper 
level. Something like

diff --git a/contrib/postgres_fdw/postgres_fdw.c 
b/contrib/postgres_fdw/postgres_fdw.c
index 42eb17ae7c0..54918b9f1a4 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -6610,7 +6610,7 @@ foreign_grouping_ok(PlannerInfo *root, RelOptInfo 
*grouped_rel,
          * Classify the pushable and non-pushable HAVING clauses and 
save them in
          * remote_conds and local_conds of the grouped rel's fpinfo.
          */
-       if (extra->havingQual)
+       if (extra->havingQual && !partial)
         {
                 foreach(lc, (List *) extra->havingQual)
                 {
@@ -6632,7 +6632,7 @@ foreign_grouping_ok(PlannerInfo *root, RelOptInfo 
*grouped_rel,
                                                                          
  grouped_rel->relids,
                                                                          
  NULL,
                                                                          
  NULL);
-                       if (is_foreign_expr(root, grouped_rel, expr) && 
!partial)
+                       if (is_foreign_expr(root, grouped_rel, expr))
                                 fpinfo->remote_conds = 
lappend(fpinfo->remote_conds, rinfo);
                         else
                                 fpinfo->local_conds = 
lappend(fpinfo->local_conds, rinfo);

>> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
>> Sent: Wednesday, February 28, 2024 10:43 PM
>> contrib/postgres_fdw/deparse.c: comment before appendFunctionName() 
>> has gone, this seems to be wrong.
> Fixed.
> 
>> From: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
>> Sent: Wednesday, February 28, 2024 10:43 PM
>> In finalize_aggregate()
>> 
>> 1079         /*
>> 1080          * Apply the agg's finalfn if one is provided, else 
>> return
>> transValue.
>> 1081          */
>> 
>> Comment should be updated to note behavior for agg_partial aggregates.
> Fixed.

Comment in nodeAgg.c seems to be strange:

1079         /*
1080          * If the agg's finalfn is provided and PARTIAL_AGGREGATE 
keyword is
1081          * not specified, apply the agg's finalfn.
1082          * If PARTIAL_AGGREGATE keyword is specified and the 
transValue type
1083          * is internal, apply the agg's serialfn. In this case, if 
the agg's
1084          * serialfn must not be invalid. Otherwise return 
transValue.
1085          */

Likely, you mean:

... In this case the agg'ss serialfn must not be invalid...


Lower, in the same file, please, correct error message:

1136                 if(!OidIsValid(peragg->serialfn_oid))
1137                         elog(ERROR, "serialfunc is note provided 
for partial aggregate");

it should be "serialfunc is not provided for partial aggregate"

Also something is wrong with the following test :

  SELECT /* aggregate <> partial aggregate */
      array_agg(c_int4array), array_agg(b),
      avg(b::int2), avg(b::int4), avg(b::int8), avg(c_interval),
      avg(b::float4), avg(b::float8),
      corr(b::float8, (b * b)::float8),
      covar_pop(b::float8, (b * b)::float8),
      covar_samp(b::float8, (b * b)::float8),
      regr_avgx((2 * b)::float8, b::float8),
.....

Its results have changed since last patch. Do they depend on daylight 
saving time?

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

26 марта, 14:33:21

Alexander Pyhalov писал(а) 2024-03-25 10:00:
> Fujii.Yuki@df.MitsubishiElectric.co.jp писал(а) 2024-03-16 05:28:
>> Hi. Mr.Pyhalov.
>>> >>
>>> >> I don't see it that way. What we would push to the foreign server
>>> >> would be something like SELECT count(a) FROM t. Then, after we get
>>> >> the results back and combine the various partial counts locally, we
>>> >> would locally evaluate the HAVING clause afterward. That is, partial
>>> >> aggregation is a barrier to pushing down HAVING clause itself, but it
>>> >> doesn't preclude pushing down the aggregation.
>>> > I have made modifications in the attached patch to ensure that when
>>> > the HAVING clause is present, the HAVING clause is executed locally
>>> > while the partial aggregations are pushed down.
>>> >
>>> >
>>> 
>>> Sorry, I don't see how it works. When we have partial aggregates and 
>>> having clause, foreign_grouping_ok() returns false and
>>> add_foreign_grouping_paths() adds no paths.
>>> I'm not saying it's necessary to fix this in the first patch version.
>> Our sincere apologies. I had attached an older version before this 
>> modification.
>> 
> 

Hi.
Found one more problem. You can fire partial aggregate over partitioned 
table, but convert_combining_aggrefs() will make non-partial copy, which 
leads to
'variable not found in subplan target list' error.

Attaching fixed version. Also I've added changes, related to HAVING 
processing.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Вложения

0001-Partial-aggregates-push-down-SQL-keyword-v3.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

29 марта, 17:46:31

Alexander Pyhalov писал(а) 2024-03-26 14:33:
> Alexander Pyhalov писал(а) 2024-03-25 10:00:
>> Fujii.Yuki@df.MitsubishiElectric.co.jp писал(а) 2024-03-16 05:28:
>>> Hi. Mr.Pyhalov.
>>>> >>
>>>> >> I don't see it that way. What we would push to the foreign server
>>>> >> would be something like SELECT count(a) FROM t. Then, after we get
>>>> >> the results back and combine the various partial counts locally, we
>>>> >> would locally evaluate the HAVING clause afterward. That is, partial
>>>> >> aggregation is a barrier to pushing down HAVING clause itself, but it
>>>> >> doesn't preclude pushing down the aggregation.
>>>> > I have made modifications in the attached patch to ensure that when
>>>> > the HAVING clause is present, the HAVING clause is executed locally
>>>> > while the partial aggregations are pushed down.
>>>> >
>>>> >
>>>> 
>>>> Sorry, I don't see how it works. When we have partial aggregates and 
>>>> having clause, foreign_grouping_ok() returns false and
>>>> add_foreign_grouping_paths() adds no paths.
>>>> I'm not saying it's necessary to fix this in the first patch 
>>>> version.
>>> Our sincere apologies. I had attached an older version before this 
>>> modification.
>>> 
>> 
> 
> Hi.
> Found one more problem. You can fire partial aggregate over partitioned 
> table, but convert_combining_aggrefs() will make non-partial copy, 
> which leads to
> 'variable not found in subplan target list' error.
> 
> Attaching fixed version. Also I've added changes, related to HAVING 
> processing.

Hi.

There was an issue in previous patch version - setGroupClausePartial() 
looked at root->parse->groupClause, not at root->processed_groupClause.
Fixed and added test to cover this.


Also denied partial agregates pushdown on server version mismatch. 
Should check_partial_aggregate_support be 'true' by default?

I'm not sure what to do with current grammar - it precludes partial 
distinct aggregates. I understand that it's currently impossible to have 
partial aggregation for distinct agregates -but does it worth to have 
such restriction at grammar level?

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Вложения

0001-Partial-aggregates-push-down-SQL-keyword-v4.patch

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

28 мая, 00:30:59

Hi Mr. Pyhalov.

Sorry for the late reply.
Thank you for your modification and detailed review.
I attach a fixed patch, have been not yet rebased.

Monday, 25 March 2024 16:01 Alexander Pyhalov <a.pyhalov@postgrespro.ru>:.
> Comment in nodeAgg.c seems to be strange:
>
> 1079 /*
> 1080 * If the agg's finalfn is provided and PARTIAL_AGGREGATE
> keyword is
> 1081 * not specified, apply the agg's finalfn.
> 1082 * If PARTIAL_AGGREGATE keyword is specified and the
> transValue type
> 1083 * is internal, apply the agg's serialfn. In this case, if
> the agg's
> 1084 * serialfn must not be invalid. Otherwise return
> transValue.
> 1085 */
>
> Likely, you mean:
>
> ... In this case the agg'ss serialfn must not be invalid...
Fixed.

> Lower, in the same file, please, correct error message:
>
> 1136 if(!OidIsValid(peragg->serialfn_oid))
> 1137  elog(ERROR, "serialfunc is note provided
> for partial aggregate");
>
> it should be "serialfunc is not provided for partial aggregate"
Fixed.

> Also something is wrong with the following test :
>
> SELECT /* aggregate <> partial aggregate */
>  array_agg(c_int4array), array_agg(b),
>  avg(b::int2), avg(b::int4), avg(b::int8), avg(c_interval),
>  avg(b::float4), avg(b::float8),
>  corr(b::float8, (b * b)::float8),
>  covar_pop(b::float8, (b * b)::float8),
>  covar_samp(b::float8, (b * b)::float8),
>  regr_avgx((2 * b)::float8, b::float8),
> .....
>
> Its results have changed since last patch. Do they depend on daylight
> saving time?
You are right. In my environment, TimeZone is set to 'PST8PDT'
with which timetz values depends on daylight saving time.
Changed TimeZone to 'UTC' in this test.

> You can see that filter is applied before append. The result is correct
> only by chance, as sum in every partition is actually < 700. If you
> lower this bound, let's say, to 200, you'll start getting wrong results
> as data is filtered prior to aggregation.
>
> It seems, however, that in partial case you should just avoid pulling
> conditions from having qual at all, all filters will be applied on upper
> level. Something like
Thank you for your modification.

> Found one more problem. You can fire partial aggregate over partitioned
> table, but convert_combining_aggrefs() will make non-partial copy, which
> leads to
> 'variable not found in subplan target list' error.
Thanks for the correction as well.
As you pointed out,
the original patch certainly had the potential to cause problems.
However, I could not actually reproduce the problem in cases such as the following.

  Settings:
    t(c1, c2) is a patitioned table whose partition key is c1.
    t1, t2 are patitions of t and are partitioned table.
    t11, t12: partitions of t1 and foreign table of postgres_fdw.
    t21, t22: partitions of t2 and foreign table of postgres_fdw.
  Query:
    select c2 / 2, sum(c1) from t group by c2 / 2 order by 1

If you have a reproducible example, I would like to add it to
the regression test.
Do you have a reproducible example?

> Also denied partial agregates pushdown on server version mismatch.
> Should check_partial_aggregate_support be 'true' by default?
Could we discuss this point after we determine how to transfer state values?
If we determine this point, we can easly determine whether check_partial_aggregate_support shold be 'true' by default.

> I'm not sure what to do with current grammar - it precludes partial
> distinct aggregates. I understand that it's currently impossible to have
> partial aggregation for distinct agregates -but does it worth to have
> such restriction at grammar level?
If partial aggregation for distinct agregates becomes possible in the future,
I see no problem with the policy of accepting new SQL keywords,
such as "PARTIL_AGGREGATE DISTINCT".

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

0001-Partial-aggregates-push-down-SQL-keyword-v5.patch

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

28 мая, 08:45:04

Fujii.Yuki@df.MitsubishiElectric.co.jp писал(а) 2024-05-28 00:30:
> Hi Mr. Pyhalov.
> 
Hi.

>> Found one more problem. You can fire partial aggregate over 
>> partitioned
>> table, but convert_combining_aggrefs() will make non-partial copy, 
>> which
>> leads to
>> 'variable not found in subplan target list' error.
> Thanks for the correction as well.
> As you pointed out,
> the original patch certainly had the potential to cause problems.
> However, I could not actually reproduce the problem in cases such as 
> the following.
> 
>   Settings:
>     t(c1, c2) is a patitioned table whose partition key is c1.
>     t1, t2 are patitions of t and are partitioned table.
>     t11, t12: partitions of t1 and foreign table of postgres_fdw.
>     t21, t22: partitions of t2 and foreign table of postgres_fdw.
>   Query:
>     select c2 / 2, sum(c1) from t group by c2 / 2 order by 1
> 
> If you have a reproducible example, I would like to add it to
> the regression test.
> Do you have a reproducible example?
> 

The fix was to set child_agg->agg_partial to orig_agg->agg_partial in 
convert_combining_aggrefs(), it's already in the patch,
as well as the example - without this fix

-- Check partial aggregate over partitioned table
EXPLAIN (VERBOSE, COSTS OFF)
SELECT avg(PARTIAL_AGGREGATE a), avg(a) FROM pagg_tab;

fails with

ERROR:  variable not found in subplan target list
-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

28 мая, 08:57:39

Fujii.Yuki@df.MitsubishiElectric.co.jp писал(а) 2024-05-28 00:30:
> Hi Mr. Pyhalov.
> 
> Sorry for the late reply.
> Thank you for your modification and detailed review.
> I attach a fixed patch, have been not yet rebased.
> 
> Monday, 25 March 2024 16:01 Alexander Pyhalov 
> <a.pyhalov@postgrespro.ru>:.
>> Comment in nodeAgg.c seems to be strange:
>> 
>> 1079 /*
>> 1080 * If the agg's finalfn is provided and PARTIAL_AGGREGATE
>> keyword is
>> 1081 * not specified, apply the agg's finalfn.
>> 1082 * If PARTIAL_AGGREGATE keyword is specified and the
>> transValue type
>> 1083 * is internal, apply the agg's serialfn. In this case, if
>> the agg's
>> 1084 * serialfn must not be invalid. Otherwise return
>> transValue.
>> 1085 */
>> 
>> Likely, you mean:
>> 
>> ... In this case the agg'ss serialfn must not be invalid...
> Fixed.
> 
>> Lower, in the same file, please, correct error message:
>> 
>> 1136 if(!OidIsValid(peragg->serialfn_oid))
>> 1137  elog(ERROR, "serialfunc is note provided
>> for partial aggregate");
>> 
>> it should be "serialfunc is not provided for partial aggregate"
> Fixed.
> 
>> Also something is wrong with the following test :
>> 
>> SELECT /* aggregate <> partial aggregate */
>>  array_agg(c_int4array), array_agg(b),
>>  avg(b::int2), avg(b::int4), avg(b::int8), avg(c_interval),
>>  avg(b::float4), avg(b::float8),
>>  corr(b::float8, (b * b)::float8),
>>  covar_pop(b::float8, (b * b)::float8),
>>  covar_samp(b::float8, (b * b)::float8),
>>  regr_avgx((2 * b)::float8, b::float8),
>> .....
>> 
>> Its results have changed since last patch. Do they depend on daylight
>> saving time?
> You are right. In my environment, TimeZone is set to 'PST8PDT'
> with which timetz values depends on daylight saving time.
> Changed TimeZone to 'UTC' in this test.
> 
>> You can see that filter is applied before append. The result is 
>> correct
>> only by chance, as sum in every partition is actually < 700. If you
>> lower this bound, let's say, to 200, you'll start getting wrong 
>> results
>> as data is filtered prior to aggregation.
>> 
>> It seems, however, that in partial case you should just avoid pulling
>> conditions from having qual at all, all filters will be applied on 
>> upper
>> level. Something like
> Thank you for your modification.
> 
>> Found one more problem. You can fire partial aggregate over 
>> partitioned
>> table, but convert_combining_aggrefs() will make non-partial copy, 
>> which
>> leads to
>> 'variable not found in subplan target list' error.
> Thanks for the correction as well.
> As you pointed out,
> the original patch certainly had the potential to cause problems.
> However, I could not actually reproduce the problem in cases such as 
> the following.
> 
>   Settings:
>     t(c1, c2) is a patitioned table whose partition key is c1.
>     t1, t2 are patitions of t and are partitioned table.
>     t11, t12: partitions of t1 and foreign table of postgres_fdw.
>     t21, t22: partitions of t2 and foreign table of postgres_fdw.
>   Query:
>     select c2 / 2, sum(c1) from t group by c2 / 2 order by 1
> 
> If you have a reproducible example, I would like to add it to
> the regression test.
> Do you have a reproducible example?
> 
>> Also denied partial agregates pushdown on server version mismatch.
>> Should check_partial_aggregate_support be 'true' by default?
> Could we discuss this point after we determine how to transfer state 
> values?
> If we determine this point, we can easly determine whether 
> check_partial_aggregate_support shold be 'true' by default.
> 
>> I'm not sure what to do with current grammar - it precludes partial
>> distinct aggregates. I understand that it's currently impossible to 
>> have
>> partial aggregation for distinct agregates -but does it worth to have
>> such restriction at grammar level?
> If partial aggregation for distinct agregates becomes possible in the 
> future,
> I see no problem with the policy of accepting new SQL keywords,
> such as "PARTIL_AGGREGATE DISTINCT".

BTW, there's I have an issue with test results in the last version of 
the patch. Attaching regression diffs.
I have partial sum over c_interval instead of sum(c_interval).


-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Вложения

regression.diffs

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

04 июня, 20:12:04

On Mon, May 27, 2024 at 09:30:59PM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> Hi Mr. Pyhalov.
> 
> Sorry for the late reply.
> Thank you for your modification and detailed review.
> I attach a fixed patch, have been not yet rebased.

I know this patch was discussed at the Vancouver conference.  What are
the open issues?  I know of several:

*  add tests that were requested by Fujii-san and now posted by
   Alexander Pyhalov
*  Where is the documentation?  I know the original patch had some, and
   I improved it, but it seems to be missing.
*  Passes unsafe binary data from the foreign server.

Can someone show me where that last item is in the patch, and why can't
we just pass back values cast to text?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

05 июня, 03:14:45

Hi. Bruce, Robert, Tom, Alexander, hackers.

Apologies for the late response.
Thanks to Bruce, I had the opportunity to have individual discussions with Robert and Tom at PGConf.dev 2024.
I'm truly grateful to Bruce, Robert, and Tom for their time.
I've attached the presentation slides used during our discussions.
Below is a summary of our conversations.
As my English is not perfect, there may be some misunderstandings.
If so, I apologize and would appreciate any corrections or comments.

[Summary]
Basically, I will make the prototype as the presentation slides.
In specialty, the prototype has the following two limits.
Limit1. The server versions of the coordinator and the worker match.
Limit2. Supported built-in aggregate functions are a few subset
(Ex. avg, sum, count).
However, there are many build-in aggregate functions in which import and export functions are not necessary.
(See p.18 in the presentation slides).
I will support these aggregate functions.

In this prototype, It is necessary for the following requirements.
Requirement1. Ensure the compatibility even if the server setting(Ex. encodings) of the coordinator and the worker
aredifferent. (with Tom)
Requirement2. Consider appropriate position of the new keyword "PARTIAL_AGGREGATE". (with Robert)
Existing patch: Before the target expression. Ex. avg(PARTIAL_AGGREGATE c1)
Ideal: Before the aggregate function. Ex. PARTIAL_AGGREGATE avg(c1)
Requirement3. Consider to avoid to make the new keyword "PARTIAL_AGGREGATE" become a reserved word. (with Robert)
In the existing patch, "PARTIAL_AGGREGATE" is a reserved word.

In addition to the summary,
I will add sufficient document and comments to the next patch as Bruce and Robert said.
And, I will fix the problem as Alexander pointed in the last week.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Вложения

PGConfDev2024_Presentation_Aggregation_Scaleout_FDW_Sharding_20240531.pdf

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

05 июня, 04:35:46

On Wed, Jun  5, 2024 at 12:14:45AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> I will add sufficient document and comments to the next patch as Bruce and Robert said.

Great, I am available to help improve the documentation.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Re: Partial aggregates pushdown

От

Alexander Pyhalov

Дата:

05 июня, 08:19:04

Bruce Momjian писал(а) 2024-06-04 20:12:
> On Mon, May 27, 2024 at 09:30:59PM +0000, 
> Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
>> Hi Mr. Pyhalov.
>> 
>> Sorry for the late reply.
>> Thank you for your modification and detailed review.
>> I attach a fixed patch, have been not yet rebased.
> 
> I know this patch was discussed at the Vancouver conference.  What are
> the open issues?  I know of several:
> 
> *  add tests that were requested by Fujii-san and now posted by
>    Alexander Pyhalov
> *  Where is the documentation?  I know the original patch had some, and
>    I improved it, but it seems to be missing.
> *  Passes unsafe binary data from the foreign server.
> 
> Can someone show me where that last item is in the patch, and why can't
> we just pass back values cast to text?

Hi.

In finalize_aggregate() when we see partial aggregate with 
peragg->aggref->aggtranstype = INTERNALOID
we call aggregate's serialization function and return it as bytea.

The issue is that this internal representation isn't guaranteed to be 
compatible between servers
of different versions (or architectures?). So, likely, we instead should 
have called some export function for aggregate
and later - some import function on postgres_fdw side. It doesn't matter 
much, what this export function
generates - text, json or some portable binary format,
1) export/import functions should just "understand" it,
2) it should be a stable representation.
-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Re: Partial aggregates pushdown

От

Bruce Momjian

Дата:

05 июня, 17:04:29

On Wed, Jun  5, 2024 at 08:19:04AM +0300, Alexander Pyhalov wrote:
> > *  Passes unsafe binary data from the foreign server.
> > 
> > Can someone show me where that last item is in the patch, and why can't
> > we just pass back values cast to text?
> 
> In finalize_aggregate() when we see partial aggregate with
> peragg->aggref->aggtranstype = INTERNALOID
> we call aggregate's serialization function and return it as bytea.
> 
> The issue is that this internal representation isn't guaranteed to be
> compatible between servers
> of different versions (or architectures?). So, likely, we instead should
> have called some export function for aggregate
> and later - some import function on postgres_fdw side. It doesn't matter
> much, what this export function
> generates - text, json or some portable binary format,
> 1) export/import functions should just "understand" it,
> 2) it should be a stable representation.

Okay, so looking at the serialization output functions already defined, I
see many zeros, which I assume means just the base data type, and eight
more:

    SELECT DISTINCT aggserialfn from pg_aggregate WHERE aggserialfn::oid != 0;
            aggserialfn
    ---------------------------
     numeric_avg_serialize
     string_agg_serialize
     array_agg_array_serialize
     numeric_serialize
     int8_avg_serialize
     array_agg_serialize
     interval_avg_serialize
     numeric_poly_serialize

I realize we need to return the sum and count for average, so that makes
sense.

So, we need import/export text representation for the partial aggregate
mode for these eight, and call the base data type text import/export
functions for the zero ones when in this mode?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

11 июня, 14:39:34

Hi. Bruce.

Sorry for the late. Thank you for comment.

> From: Bruce Momjian <bruce@momjian.us>
> Sent: Wednesday, June 5, 2024 11:04 PM
> > > *  Passes unsafe binary data from the foreign server.
> > >
> > > Can someone show me where that last item is in the patch, and why
> > > can't we just pass back values cast to text?
> >
> > In finalize_aggregate() when we see partial aggregate with
> > peragg->aggref->aggtranstype = INTERNALOID
> > we call aggregate's serialization function and return it as bytea.
> >
> > The issue is that this internal representation isn't guaranteed to be
> > compatible between servers of different versions (or architectures?).
> > So, likely, we instead should have called some export function for
> > aggregate and later - some import function on postgres_fdw side. It
> > doesn't matter much, what this export function generates - text, json
> > or some portable binary format,
> > 1) export/import functions should just "understand" it,
> > 2) it should be a stable representation.
>
> Okay, so looking at the serialization output functions already defined, I see many zeros, which I assume means just
thebase 
> data type, and eight
> more:
>
>     SELECT DISTINCT aggserialfn from pg_aggregate WHERE aggserialfn::oid != 0;
>             aggserialfn
>     ---------------------------
>      numeric_avg_serialize
>      string_agg_serialize
>      array_agg_array_serialize
>      numeric_serialize
>      int8_avg_serialize
>      array_agg_serialize
>      interval_avg_serialize
>      numeric_poly_serialize
>
> I realize we need to return the sum and count for average, so that makes sense.
>
> So, we need import/export text representation for the partial aggregate mode for these eight, and call the base data
type
> text import/export functions for the zero ones when in this mode?

I think that you are basically right.
But, I think, in a perfect world we should also add an import/export function for the following
two category.

  Category1. Validation Chek is needed for Safety.
    For example, I think a validation check is needed for avg(float4),
     whose transition type is not internal. (See p.18 in [1])
    I plan to add import functions of avg, count (See p.18, p.19 in [1]).
  Category1. Transition type is a pseudo data type.
    Aggregate functions of this category needs to accept many actual data types,
    including user-defined types. So I think that it is hard to implement import/export functions.
    Consequently, I do not plan to support these category. (See p.19 in [1])

Sincerely yours,
Yuki Fujii

--
Yuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

[1]
https://www.postgresql.org/message-id/attachment/160659/PGConfDev2024_Presentation_Aggregation_Scaleout_FDW_Sharding_20240531.pdf

Re: Partial aggregates pushdown

От

Jelte Fennema-Nio

Дата:

11 июня, 16:59:36

On Tue, 11 Jun 2024 at 13:40, Fujii.Yuki@df.MitsubishiElectric.co.jp
<Fujii.Yuki@df.mitsubishielectric.co.jp> wrote:
> > From: Bruce Momjian <bruce@momjian.us>
> > So, we need import/export text representation for the partial aggregate mode for these eight, and call the base
datatype

> > text import/export functions for the zero ones when in this mode?
>
> I think that you are basically right.
> But, I think, in a perfect world we should also add an import/export function for the following
> two category.
>
>   Category1. Validation Chek is needed for Safety.
>     For example, I think a validation check is needed for avg(float4),
>      whose transition type is not internal. (See p.18 in [1])
>     I plan to add import functions of avg, count (See p.18, p.19 in [1]).
>   Category1. Transition type is a pseudo data type.
>     Aggregate functions of this category needs to accept many actual data types,
>     including user-defined types. So I think that it is hard to implement import/export functions.
>     Consequently, I do not plan to support these category. (See p.19 in [1])

How about instead of trying to serialize the output of
serialfn/deserialfn, instead we don't use the "internal" type and
create actual types in pg_type for these transtypes? Then we can
simply use the in/out and recv/send functions of those types to
serialize the values of the partial aggregate over the network.
Instead of having to rely on serialfn/deserialfn to be network-safe
(which they probably aren't).

That indeed still leaves the pseudo types. Since non of those
pseudotypes have a working in/recv function (they always error by
definition), I agree that we can simply not support those.

Basically that would mean that any aggregate with a non-internal and
non-pseudotype as a transtype could be used in this multi-node partial
aggregate pushdown.

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

12 июня, 08:27:02

Hi Jelte,

Thank you for your comment!
> From: Jelte Fennema-Nio <postgres@jeltef.nl>
> Sent: Tuesday, June 11, 2024 11:00 PM
> How about instead of trying to serialize the output of serialfn/deserialfn, instead we don't use the "internal" type
and
> create actual types in pg_type for these transtypes? Then we can simply use the in/out and recv/send functions of
those
> types to serialize the values of the partial aggregate over the network.
> Instead of having to rely on serialfn/deserialfn to be network-safe (which they probably aren't).
> 
> That indeed still leaves the pseudo types. Since non of those pseudotypes have a working in/recv function (they
always
> error by definition), I agree that we can simply not support those.
> 
> Basically that would mean that any aggregate with a non-internal and non-pseudotype as a transtype could be used in
this
> multi-node partial aggregate pushdown.
Could you please clarify what you mean?
Are you referring to:
  Option 1: Modifying existing aggregate functions to minimize the use of internal state values.
  Option 2: Not supporting the push down of partial aggregates for functions with internal state values.
  Option 3: Something other than Option 1 and Option 2.

There are many aggregate functions with internal state values, so if we go with Option 1,
we might need to change a lot of existing code, like transition functions and finalize functions.
Also, I'm not sure how many existing aggregate functions can be modified this way.

There are also many popular functions with internal state values,
like sum(int8) and avg(int8)(see [1]), so I don't think Option2 would be acceptable.

Best regards, Yuki Fujii

--
Yuki Fujii
Information Technology R&D Center, Mitsubishi Electric Corporation

[1] https://github.com/postgres/postgres/blob/REL_16_STABLE/src/include/catalog/pg_aggregate.dat

Re: Partial aggregates pushdown

От

Jelte Fennema-Nio

Дата:

12 июня, 11:02:17

On Wed, 12 Jun 2024 at 07:27, Fujii.Yuki@df.MitsubishiElectric.co.jp
<Fujii.Yuki@df.mitsubishielectric.co.jp> wrote:
> Could you please clarify what you mean?
> Are you referring to:
>   Option 1: Modifying existing aggregate functions to minimize the use of internal state values.
>   Option 2: Not supporting the push down of partial aggregates for functions with internal state values.

Basically I mean both Option 1 and Option 2 together. i.e. once we do
option 1, supporting partial aggregate pushdown for all important
aggregates with internal state values, then supporting pushdown of
internal state values becomes unnecessary.

> There are many aggregate functions with internal state values, so if we go with Option 1,
> we might need to change a lot of existing code, like transition functions and finalize functions.
> Also, I'm not sure how many existing aggregate functions can be modified this way.

There are indeed 57 aggregate functions with internal state values:

> SELECT count(*) from pg_aggregate where aggtranstype = 'internal'::regtype;
 count
───────
    57
(1 row)

But there are only 26 different aggtransfns. And the most used 8 of
those cover 39 of those 57 aggregates.

> SELECT
    sum (count) OVER(ORDER BY count desc, aggtransfn ROWS BETWEEN
unbounded preceding and current row) AS cumulative_count
    , *
FROM (
    SELECT
        count(*),
        aggtransfn
    from pg_aggregate
    where aggtranstype = 'internal'::regtype
    group by aggtransfn
    order by count(*) desc, aggtransfn
);
 cumulative_count │ count │               aggtransfn
──────────────────┼───────┼────────────────────────────────────────
                7 │     7 │ ordered_set_transition
               13 │     6 │ numeric_accum
               19 │     6 │ int2_accum
               25 │     6 │ int4_accum
               31 │     6 │ int8_accum
               35 │     4 │ ordered_set_transition_multi
               37 │     2 │ int8_avg_accum
               39 │     2 │ numeric_avg_accum
               40 │     1 │ array_agg_transfn
               41 │     1 │ json_agg_transfn
               42 │     1 │ json_object_agg_transfn
               43 │     1 │ jsonb_agg_transfn
               44 │     1 │ jsonb_object_agg_transfn
               45 │     1 │ string_agg_transfn
               46 │     1 │ bytea_string_agg_transfn
               47 │     1 │ array_agg_array_transfn
               48 │     1 │ range_agg_transfn
               49 │     1 │ multirange_agg_transfn
               50 │     1 │ json_agg_strict_transfn
               51 │     1 │ json_object_agg_strict_transfn
               52 │     1 │ json_object_agg_unique_transfn
               53 │     1 │ json_object_agg_unique_strict_transfn
               54 │     1 │ jsonb_agg_strict_transfn
               55 │     1 │ jsonb_object_agg_strict_transfn
               56 │     1 │ jsonb_object_agg_unique_transfn
               57 │     1 │ jsonb_object_agg_unique_strict_transfn
(26 rows)

And actually most of those don't have a serialfn, so they wouldn't be
supported by your suggested approach either. Looking at the
distribution of aggserialfns instead we see the following:

> SELECT
    sum (count) OVER(ORDER BY count desc, aggserialfn ROWS BETWEEN
unbounded preceding and current row) AS cumulative_count
    , *
FROM (
    SELECT
        count(*),
        aggserialfn
    from pg_aggregate
    where
        aggtranstype = 'internal'::regtype
        AND aggserialfn != 0
    group by aggserialfn
    order by count(*) desc, aggserialfn
);
 cumulative_count │ count │        aggserialfn
──────────────────┼───────┼───────────────────────────
               12 │    12 │ numeric_serialize
               24 │    12 │ numeric_poly_serialize
               26 │     2 │ numeric_avg_serialize
               28 │     2 │ int8_avg_serialize
               30 │     2 │ string_agg_serialize
               31 │     1 │ array_agg_serialize
               32 │     1 │ array_agg_array_serialize
(7 rows)

So there are only 7 aggserialfns, and thus at most 7 new postgres
types that you would need to create to support the same aggregates as
in your current proposal. But looking at the implementations of these
serialize functions even that is an over-estimation: numeric_serialize
and numeric_avg_serialize both serialize a NumericAggState, and
numeric_poly_serialize and int8_avg_serialize both serialize a
PolyNumAggState. So probably a we could even do with only 5 types. And
to be clear: only converting PolyNumAggState and NumericAggState to
actual postgres types would already cover 28 out of the 32 aggregates.
That seems quite feasible to do.

So I agree it's probably more code than your current approach. At the
very least because you would need to implement in/out text
serialization functions for these internal types that currently don't
have them. But I do think it would be quite a feasible amount. And to
clarify, I see a few benefits of using the approach that I'm
proposing:

1. So far aggserialfn and aggdeserialfn haven't been required to be
network safe at all. In theory extensions might reference shared
memory pointers with in them, that are valid for serialization within
the same postgres process tree but not outside of it. Or they might
serialize to bytes in a way that does not work across different
bigendian/littleendian systems, thus causing wrong aggregation
results. Never sending results of serialfn over the network solves
that issue.
2. Partial aggregate pushdown across different postgres version could
be made to work by using the in/out functions instead of receive/send
functions, to use the text based serialization format (which should be
stable across versions)
3. It seems nice to be able to get the text representation of all
PARTIAL_AGGREGATE output for debugging purposes. With your approach I
think what currently happens is that it will show a bytea for when
using PARTIAL_AGGREGATE for avg(bigint) directly from psql.
4. In my experience it's easier to get patches merged if they don't
change a lot at once and are useful by themselves. This way you could
split your current patch up into multiple smaller patches, each of
which could be merged separately (appart from b relying on a).
a. Introduce PARTIAL_AGGREGATE syntax for non-internal & non-pseudo types
b. Start using PARTIAL_AGGREGATE for FDW pushdown
c. Convert NumericAggState to non-internal
d. Convert PolyNumAggState to non-internal
e. Use non-internal for string_agg_serialize aggregates
f. Use non-internal for array_agg_serialize
g. Use non-internal for array_agg_array_serialize

P.S. The problem described in benefit 1 could also be solved in your
approach by adding a boolean opt in flag to CREATE AGGREGATE. e.g.
CREATE AGGREGATE ( ..., NETWORKSAFESERIALFUNCS = true)

RE: Partial aggregates pushdown

От

"Fujii.Yuki@df.MitsubishiElectric.co.jp"

Дата:

23 июня, 11:23:32

Hi. Jelte, hackers.

Sorry for the late response.
Thank you for detailed and useful comments.

On Wed, Jun 12, 2024 at 5:02 PM Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
>
> On Wed, 12 Jun 2024 at 07:27, Fujii.Yuki@df.MitsubishiElectric.co.jp
> <Fujii.Yuki@df.mitsubishielectric.co.jp> wrote:
> > Could you please clarify what you mean?
> > Are you referring to:
> >   Option 1: Modifying existing aggregate functions to minimize the use of internal state values.
> >   Option 2: Not supporting the push down of partial aggregates for functions with internal state values.
>
> Basically I mean both Option 1 and Option 2 together. i.e. once we do
> option 1, supporting partial aggregate pushdown for all important
> aggregates with internal state values, then supporting pushdown of
> internal state values becomes unnecessary.
Understood.
However, there are points that I agree with and others that I don't agree with.

I will show my opinion using one aggregate function avg(int8), whose transtype is internal.

I agree that, in general, any remote server should transmit the state value to the local server using a format whose
datatype is a native data type and is not serialized, whenever possible. I call this format standard format along with
[1].
For avg(int8), I think that it is rational that any remote server transmit the state value to the local server using
theformat whose data type is _numeric of count and sum.

Before your advice, I plan to use the standard format whose data type is text, like "count=5 sum=13.3". But now, I
thinkthat it is rational to use _numeric.

I attached the POC patch(the second one) which supports avg(int8) whose standard format is _numeric type.

However, I do not agree that I modify the internal transtype to the native data type. The reasons are the following
three.
1. Generality
I believe we should develop a method that can theoretically apply to any aggregate function, even if we cannot
implementit immediately. However, I find it exceptionally challenging to demonstrate that any internal transtype can be
universallyconverted to a native data type for aggregate functions that are parallel-safe under the current parallel
queryfeature. Specifically, proving this for user-defined aggregate functions presents a significant difficulty in my
view.

On the other hand, I think that the usage of export and import functions can theoretically apply to any aggregate
functions.

2. Amount of codes.
It could need more codes.

3. Concern about performance
I'm concerned that altering the current internal data types could impact performance.

> So I agree it's probably more code than your current approach. At the
> very least because you would need to implement in/out text
> serialization functions for these internal types that currently don't
> have them. But I do think it would be quite a feasible amount. And to
> clarify, I see a few benefits of using the approach that I'm
> proposing:
>
> 1. So far aggserialfn and aggdeserialfn haven't been required to be
> network safe at all. In theory extensions might reference shared
> memory pointers with in them, that are valid for serialization within
> the same postgres process tree but not outside of it. Or they might
> serialize to bytes in a way that does not work across different
> bigendian/littleendian systems, thus causing wrong aggregation
> results. Never sending results of serialfn over the network solves
> that issue.
I know. In my proposal, the standard format is not seriarized data by serialfn, instead, is text or other native data
type.
Just to clarify, I'm writing this to avoid any potential misunderstanding.

> 2. Partial aggregate pushdown across different postgres version could
> be made to work by using the in/out functions instead of receive/send
> functions, to use the text based serialization format (which should be
> stable across versions)
Thank you for your advice. I agree that, as mentioned earlier, standard formats should generally use native data types
wheneverpossible.

> 3. It seems nice to be able to get the text representation of all
> PARTIAL_AGGREGATE output for debugging purposes. With your approach I
> think what currently happens is that it will show a bytea for when
> using PARTIAL_AGGREGATE for avg(bigint) directly from psql.
I may have caused some misunderstanding.
To clarify and prevent any potential misunderstanding: In my proposal, the standard format does not involve serialized
databy serialfn; rather, it utilizes text or other native data types.

> 4. In my experience it's easier to get patches merged if they don't
> change a lot at once and are useful by themselves. This way you could
> split your current patch up into multiple smaller patches, each of
> which could be merged separately (appart from b relying on a).
> a. Introduce PARTIAL_AGGREGATE syntax for non-internal & non-pseudo types
> b. Start using PARTIAL_AGGREGATE for FDW pushdown
> c. Convert NumericAggState to non-internal
> d. Convert PolyNumAggState to non-internal
> e. Use non-internal for string_agg_serialize aggregates
> f. Use non-internal for array_agg_serialize
> g. Use non-internal for array_agg_array_serialize
I understand. Thank you for advice.
Basically responding to your advice,
for now, I prepare two POC patches.
The first supports case a, currently covering only avg(int4) and other aggregate functions that do not require import
orexport functions, such as min, max, and count.

The second supports case b and commonly used functions like sum and avg. Currently, it only includes avg(int8).

Best regards, Yuki Fujii
--
Yuki Fujii
Information Technology R&D Center, Mitsubishi Electric Corporation

[1]
https://www.postgresql.org/message-id/attachment/160659/PGConfDev2024_Presentation_Aggregation_Scaleout_FDW_Sharding_20240531.pdf

Hi hackers.

On Wed, Jun 5, 2024 at 9:15?AM Fujii.Yuki@df.MitsubishiElectric.co.jp <Fujii.Yuki@df.mitsubishielectric.co.jp> wrote:
>   Requirement2. Consider appropriate position of the new keyword "PARTIAL_AGGREGATE". (with Robert)
>     Existing patch: Before the target expression. Ex. avg(PARTIAL_AGGREGATE c1)
>     Ideal: Before the aggregate function. Ex. PARTIAL_AGGREGATE avg(c1)
>   Requirement3. Consider to avoid to make the new keyword "PARTIAL_AGGREGATE" become a reserved word. (with Robert)
>     In the existing patch, "PARTIAL_AGGREGATE" is a reserved word.
I considered the above two requirement.
Based on my research, there is no way to use PARTIAL_AGGREGATE in front of a function name without making it a reserved
word.

Instead, I can make PARTIAL_AGGREGATE an unreserved word by placing it after the FILTER clause, like avg(c1) FILTER
(WHEREc2 > 0) PARTIAL_AGGREGATE, and by marking it as an ASLABEL word like FILTER.
 
I attached the patch of the method.
If there are no objections, I would like to proceed with the method described above.
I'd appreciate it if anyone comment the method.

I have addressed several comments, though not all of them.

On Mon, Jun 24, 2024 at 6:09?PM Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
> 4. Related to 3, I think it would be good to have some tests of
> PARTIAL_AGGREGATE that don't involve postgres_fdw at all. I also
> spotted some comments too that mention FDW, even though they apply to
> the "pure" PARTIAL_AGGREGATE code.
> 5. This comment now seems incorrect:
> -    * Apply the agg's finalfn if one is provided, else return transValue.
> +    * If the agg's finalfn is provided and PARTIAL_AGGREGATE keyword is
> +    * not specified, apply the agg's finalfn.
> +    * If PARTIAL_AGGREGATE keyword is specified and the transValue type
> +    * is internal, apply the agg's serialfn. In this case the agg's
> +    * serialfn must not be invalid. Otherwise return transValue.
>
> 6. These errors are not on purpose afaict (if they are a comment in
> the test would be good to explain why)
>
> +SELECT b, avg(a), max(a), count(*) FROM pagg_tab GROUP BY b ORDER BY 1;
> +ERROR:  could not connect to server "loopback"
> +DETAIL:  invalid connection option "partial_aggregate_support"
Fixed.

Best regards, Yuki Fujii
--
Yuki Fujii
Information Technology R&D Center, Mitsubishi Electric Corporation

SUMMARY OF THREAD

The design of patch 0001 is agreed upon by everyone on the thread (so
far). This adds the PARTIAL_AGGREGATE label for aggregates, which will
cause the finalfunc not to run. It also starts using PARTIAL_AGGREGATE
for pushdown of aggregates in postgres_fdw. In 0001 PARTIAL_AGGREGATE
is only supported for aggregates with a non-internal/pseudo type as
the stype.

The design for patch 0002 is still under debate. This would expand on
the functionality added by adding support for PARTIAL_AGGREGATE for
aggregates with an internal stype. This is done by returning a byte
array containing the bytes that the serialfunc of the aggregate
returns.

A competing proposal for 0002 is to instead change aggregates to not
use an internal stype anymore, and create dedicated types. The main
downside here is that infunc and outfunc would need to be added for
text serialization, in addition to the binary serialization. An open
question is: Can we change the requirements for CREATE TYPE, so that
types can be created without infunc and outfunc.

WHAT IS NEEDED?

The things needed for this patch are that docs need to be added, and
detailed codereview needs to be done.

Feedback from more people on the two competing proposals for 0002
would be very helpful in making a decision.

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Partial aggregates pushdown

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения