Thread: Re: [GENERAL] aggregate returning anyarray and 'cannot determine result data type'

Re: [GENERAL] aggregate returning anyarray and 'cannot determine result data type'

From
Tom Lane
Date:
[ redirecting to -hackers ]

Tomas Vondra <tv@fuzzy.cz> writes:
> So my plan was to do something like this:

>   sample_append(internal, anyelement, int) -> internal
>   sample_final(internal) -> anyarray

>   CREATE AGGREGATE sample_agg(anyelement, int) (
>       SFUNC = sample_append,
>       STYPE = internal,
>       FINALFUNC = sample_final
>   );

> However this leads to
>   ERROR:  cannot determine result data type
>   DETAIL:  A function returning a polymorphic type must have at least
>            one polymorphic argument

> because 'sample_final' produces anyarray but has no polymorphic
> argument.

Yeah, this is a problem with trying to use internal stype for polymorphic
aggregates.

The same problem came up in connection with the "ordered set" aggregates
that were added recently, and that patch implemented an interesting
workaround: the final function for an OSA gets additional dummy arguments
of the same type as the aggregate inputs.  They are always passed as NULLs
at runtime, and have no real value except if the aggregate is polymorphic
--- but when it is, they provide a way to resolve the result type of a
polymorphic final function, even if the state type is "internal" or
otherwise non-polymorphic.

I thought at the time that maybe we should offer this feature for regular
aggregates as well as ordered-set ones, but didn't do anything about
it because there hadn't been demand.  If we did have it, you could solve
this problem with
  sample_append(internal, anyelement, int) -> internal
  sample_final(internal, anyelement, int) -> anyarray

  CREATE AGGREGATE sample_agg(anyelement, int) (
      SFUNC = sample_append,
      STYPE = internal,
      FINALFUNC = sample_final
  );

where sample_final would have to be declared non-strict (since it'd always
be getting some NULL arguments), but that's a small price to pay.

I think it'd be a pretty small adjustment to the already-committed
code to allow this to happen.  Basically we'd just have to decouple
the extra-arguments-to-finalfn behavior from ordered-set aggregates.

One potential issue though is that if sample_final existed in both
signatures it wouldn't be very clear which one got selected for the
aggregate.  Perhaps the best fix would be to invent a different
CREATE AGGREGATE keyword for finalfns with extra arguments?  If
so, that's something we ought to do *now*, not in 9.5, because
it'll be too late to redefine how to create OSAs once 9.4 ships.

Thoughts?
        regards, tom lane



Re: [GENERAL] aggregate returning anyarray and 'cannot determine result data type'

From
Tom Lane
Date:
I wrote:
> The same problem came up in connection with the "ordered set" aggregates
> that were added recently, and that patch implemented an interesting
> workaround: the final function for an OSA gets additional dummy arguments
> of the same type as the aggregate inputs.  They are always passed as NULLs
> at runtime, and have no real value except if the aggregate is polymorphic
> --- but when it is, they provide a way to resolve the result type of a
> polymorphic final function, even if the state type is "internal" or
> otherwise non-polymorphic.

> I thought at the time that maybe we should offer this feature for regular
> aggregates as well as ordered-set ones, but didn't do anything about
> it because there hadn't been demand.

After sleeping on it, I'm convinced that this was an oversight that
we should fix before 9.4 ships.  The code changes should be pretty
minimal; the executor in particular probably needs *less* code to
do this in a uniform way.

> One potential issue though is that if sample_final existed in both
> signatures it wouldn't be very clear which one got selected for the
> aggregate.  Perhaps the best fix would be to invent a different
> CREATE AGGREGATE keyword for finalfns with extra arguments?

To be concrete: let's add a new boolean parameter with the semantics of
"final function takes extra dummy arguments" (default false).  There would
need to be one for the separate moving-aggregate final function too,
of course.

The best naming idea I've got right now is "finalfunc_extra" and
"mfinalfunc_extra", but maybe somebody can do better?
        regards, tom lane
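
For illustration, here is how the sample_agg definition from upthread might
look with such a flag.  This is only a sketch: it assumes the option is
spelled finalfunc_extra as proposed above and is written as a bare keyword,
and it assumes sample_append and sample_final are C functions living in a
hypothetical loadable module called 'sample_agg' (neither the module name
nor the function bodies appear in this thread).

```sql
-- Transition function: accumulates input values in an opaque state.
-- Assumed to be implemented in C in a hypothetical module 'sample_agg'.
CREATE FUNCTION sample_append(internal, anyelement, int)
    RETURNS internal
    AS 'sample_agg', 'sample_append'
    LANGUAGE C;

-- Final function declared with the extra dummy arguments.
-- It is left non-strict (the default) because the dummy arguments are
-- always passed as NULL; the anyelement argument exists only so that
-- the anyarray result type can be resolved.
CREATE FUNCTION sample_final(internal, anyelement, int)
    RETURNS anyarray
    AS 'sample_agg', 'sample_final'
    LANGUAGE C;

CREATE AGGREGATE sample_agg(anyelement, int) (
    SFUNC = sample_append,
    STYPE = internal,
    FINALFUNC = sample_final,
    FINALFUNC_EXTRA    -- assumed spelling of the proposed flag
);
```

The intent is that a call such as sample_agg(x, 100) over an integer column
would then resolve its result type to int[], even though the state type is
"internal".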



Re: [GENERAL] aggregate returning anyarray and 'cannot determine result data type'

From
Tomas Vondra
Date:
On 23.4.2014 16:07, Tom Lane wrote:
>
> To be concrete: let's add a new boolean parameter with the semantics
> of "final function takes extra dummy arguments" (default false).
> There would need to be one for the separate moving-aggregate final
> function too, of course.
> 
> The best naming idea I've got right now is "finalfunc_extra" and 
> "mfinalfunc_extra", but maybe somebody can do better?

Do we really need a separate parameter for this? Couldn't this be
decided simply using the signature of the final function? Either it has
a single parameter (current behavior), or it has the same parameters as
the state transition function (new behavior).

regards
Tomas



Re: [GENERAL] aggregate returning anyarray and 'cannot determine result data type'

From
Tom Lane
Date:
Tomas Vondra <tv@fuzzy.cz> writes:
> On 23.4.2014 16:07, Tom Lane wrote:
>> To be concrete: let's add a new boolean parameter with the semantics
>> of "final function takes extra dummy arguments" (default false).
>> There would need to be one for the separate moving-aggregate final
>> function too, of course.

> Do we really need a separate parameter for this? Couldn't this be
> decided simply using the signature of the final function? Either it has
> a single parameter (current behavior), or it has the same parameters as
> the state transition function (new behavior).

The problem is that the CREATE AGGREGATE syntax only specifies the name of
the final function, not its argument list, so you have to make an
assumption about the argument list in order to look up the final function
in the first place.

I did consider the idea of looking for both signatures and using whatever
we find, but that seems fairly dangerous: the same CREATE AGGREGATE
command could give different results depending on what versions of the
final function happen to exist.  This would create an ordering hazard that
pg_dump could not reliably cope with, for example.
        regards, tom lane
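
The hazard can be sketched as follows.  The function declarations are
hypothetical (a C module named 'sample_agg' is assumed, and the text[]
result type of the one-argument variant is chosen only for illustration);
the point is merely that the CREATE AGGREGATE command itself is identical
in every case.

```sql
-- Variant 1: classic final function, state argument only.
-- It cannot return anyarray, so a concrete type is used here.
CREATE FUNCTION sample_final(internal)
    RETURNS text[]
    AS 'sample_agg', 'sample_final_text'
    LANGUAGE C;

-- Variant 2: final function with the extra dummy arguments.
CREATE FUNCTION sample_final(internal, anyelement, int)
    RETURNS anyarray
    AS 'sample_agg', 'sample_final'
    LANGUAGE C;

-- The command names the final function but not its argument list.
-- Under a "use whichever signature we find" rule, the aggregate's
-- result type would depend on which of the two functions happens to
-- exist when this command runs, for instance on restore order.
CREATE AGGREGATE sample_agg(anyelement, int) (
    SFUNC = sample_append,
    STYPE = internal,
    FINALFUNC = sample_final
);
```

An explicit flag removes the guesswork: the command alone then determines
which signature is looked up.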



Re: [GENERAL] aggregate returning anyarray and 'cannot determine result data type'

From
Tomas Vondra
Date:
On 25.4.2014 23:26, Tom Lane wrote:
> Tomas Vondra <tv@fuzzy.cz> writes:
>> On 23.4.2014 16:07, Tom Lane wrote:
>>> To be concrete: let's add a new boolean parameter with the 
>>> semantics of "final function takes extra dummy arguments" 
>>> (default false). There would need to be one for the separate 
>>> moving-aggregate final function too, of course.
> 
>> Do we really need a separate parameter for this? Couldn't this be 
>> decided simply using the signature of the final function? Either
>> it has a single parameter (current behavior), or it has the same 
>> parameters as the state transition function (new behavior).
> 
> The problem is that the CREATE AGGREGATE syntax only specifies the 
> name of the final function, not its argument list, so you have to 
> make an assumption about the argument list in order to look up the 
> final function in the first place.
> 
> I did consider the idea of looking for both signatures and using 
> whatever we find, but that seems fairly dangerous: the same CREATE 
> AGGREGATE command could give different results depending on what 
> versions of the final function happen to exist. This would create an 
> ordering hazard that pg_dump could not reliably cope with, for 
> example.

Yeah. And it wouldn't be clear which function to use in case two
suitable functions (with different signatures) exist. So I guess this
actually requires a parameter.

I'd vote for "finalfunc_extra" - can't think of a better name, and I'm
not sure what the "m" in "mfinalfunc_extra" stands for.

regards
Tomas



Re: [GENERAL] aggregate returning anyarray and 'cannot determine result data type'

From
Tom Lane
Date:
Tomas Vondra <tv@fuzzy.cz> writes:
> On 25.4.2014 23:26, Tom Lane wrote:
>> The problem is that the CREATE AGGREGATE syntax only specifies the 
>> name of the final function, not its argument list, so you have to 
>> make an assumption about the argument list in order to look up the 
>> final function in the first place.

> Yeah. And it wouldn't be clear which function to use in case two
> suitable functions (with different signatures) exist. So I guess this
> actually requires a parameter.

Exactly.

> I'd vote for "finalfunc_extra" - can't think of a better name, and I'm
> not sure what the "m" in "mfinalfunc_extra" stands for.

Sorry for not being clear.  The "m" version is the alternate setting for
the moving-aggregate sub-implementation, which is new as of a couple weeks
ago:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a9d9acbf219b9e96585779cd5f99d674d4ccba74
        regards, tom lane
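
To show where the "m" setting would fit, here is a rough sketch of an
aggregate that also supplies a moving-aggregate implementation.  All of the
m-prefixed function names are hypothetical and their C bodies are assumed,
not shown; whether an inverse transition function makes sense for a
sampling aggregate is beside the point here.  The spelling of the two
*_EXTRA flags follows the names proposed upthread and is an assumption.

```sql
CREATE AGGREGATE sample_agg(anyelement, int) (
    -- regular implementation
    SFUNC = sample_append,
    STYPE = internal,
    FINALFUNC = sample_final,
    FINALFUNC_EXTRA,            -- sample_final(internal, anyelement, int)

    -- moving-aggregate implementation (hypothetical functions)
    MSFUNC = sample_mappend,    -- forward transition function
    MINVFUNC = sample_mremove,  -- inverse transition function
    MSTYPE = internal,
    MFINALFUNC = sample_mfinal,
    MFINALFUNC_EXTRA            -- sample_mfinal(internal, anyelement, int)
);
```

Each final function gets its own flag because the two implementations are
looked up separately and need not take the same argument list.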