Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)

Поиск
Список
Период
Сортировка
От Atri Sharma
Тема Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)
Дата
Msg-id CAOeZVienct7pONTeLrA9d6twcj5CDF3KAtRn5myGwre-Jr-1uA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Proposal/design feedback needed: WITHIN GROUP (sql standard ordered set aggregate functions)  (David Fetter <david@fetter.org>)
Список pgsql-hackers
On Thu, Jul 18, 2013 at 10:02 AM, David Fetter <david@fetter.org> wrote:
> On Thu, Jul 18, 2013 at 03:15:14AM +0000, Andrew Gierth wrote:
>> The spec defines two types of aggregate function classed as "ordered set
>> function", as follows:
>>
>> 1. An "inverse distribution function" taking one argument (which must be
>>    a grouped column or otherwise constant within groups) plus a sorted
>>    group with exactly one column:
>>
>>    =# SELECT (func(p) WITHIN GROUP (ORDER BY q)) from ...
>>
>>    The motivating example for this (and the only ones in the spec) are
>>    percentile_cont and percentile_disc, to return a percentile result
>>    from a continuous or discrete distribution. (Thus
>>    percentile_cont(0.5) within group (order by x) is the spec's version
>>    of a median(x) function.)
>>
>> 2. A "hypothetical set function" taking N arguments of arbitrary types
>>    (a la VARIADIC "any", rather than a fixed list) plus a sorted group
>>    with N columns of matching types:
>>
>>    =# SELECT (func(p1,p2,...) WITHIN GROUP (ORDER BY q1,q2,...)) from ...
>>
>>    (where typeof(p1)==typeof(q1) and so on, at least up to trivial
>>    conversions)
>>
>>    The motivating example here is to be able to do rank(p1,p2,...) to
>>    return the rank that the specified values would have had if they were
>>    added to the group.
>>
>> As usual, we do not want to constrain ourselves to supporting only the
>> specific cases in the spec, but would prefer a general solution.
>>
>> We (meaning myself and Atri) have an implementation that basically
>> works, though it is not yet complete, but before taking it any further
>> we need to resolve the design question of how to represent these two
>> types of function in the system catalogs. The fact that there are in
>> effect two parts to the parameter list, which are either independent
>> (for inverse distribution funcs) or closely related (for hypothetical
>> set functions), doesn't seem to point to an obvious way to represent
>> this in pg_proc/pg_aggregate.
>>
>> I'm not yet satisfied with the method used in our implementation,
>
> What is that method?

We currently represent ordered set functions with a new bool flag in
pg_aggregate. The flag is set to true for ordered set
functions(obviously) and false for all others. The currently
implemented functions i.e. percentile_disc, percentile_cont and
percentile_cont for intervals have their finalfns present in
pg_aggregate.

The aggregate functions take in two arguments, one for the percentile
value and other for the input row set. So, percentile_cont's entry in
pg_proc has float8 and float8 as its parameters and another entry of
percentile_cont (with the interval version as the finalfn) has float8
and interval as its parameter types.

As you can see, there isn't a way right now to resolve the return type
of the aggregate for polymorphic cases. This is something we wish to
resolve.

Regards,

Atri





--
Regards,

Atri
l'apprenant



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
Следующее
От: Tatsuo Ishii
Дата:
Сообщение: Re: [PATCH] pgbench --throttle (submission 7 - with lag measurement)