BUG #12228: Primary use-case of PERCENT_RANK not supported

Поиск
Список
Период
Сортировка
От jonlachlan@gmail.com
Тема BUG #12228: Primary use-case of PERCENT_RANK not supported
Дата
Msg-id 20141213221134.2699.23317@wrigleys.postgresql.org
обсуждение исходный текст
Ответы Re: BUG #12228: Primary use-case of PERCENT_RANK not supported
Список pgsql-bugs
The following bug has been logged on the website:

Bug reference:      12228
Logged by:          Jonathon Lachlan-Hache
Email address:      jonlachlan@gmail.com
PostgreSQL version: 9.4beta2
Operating system:   MacOSX
Description:

I'm using the 9.4 RC1 and trying out the WITHIN GROUP functions. I used a
percentile_disc pattern successfully, however I think I came across a
problem with using percent_rank(). Here is my SQL:

SELECT percent_rank(datavalue) WITHIN GROUP (ORDER BY datavalue) as
pct_rank, measureid
 FROM measuredata
 WHERE surveyyear=2013
 GROUP BY measureid;

I want to be able to run a query that programmatically displays the
'pct_rank' of datavalue. My table is organized into four columns:
organizationid
measureid
surveyyear
datavalue

The 'datavalue' column is the measure, whereas all the other columns are
attributes. When I run percent_rank() it needs to determine what percentile
is the 'datavalue', but within the scope of the same 'measureid' and
'surveyyear'.

The percent_rank() function does not appear to support this. The above query
returns the following:

ERROR:  column "measuredata.datavalue" must appear in the GROUP BY clause or
be used in an aggregate function
LINE 1: SELECT percent_rank(datavalue) WITHIN GROUP (ORDER BY datava...
                            ^
DETAIL:  Direct arguments of an ordered-set aggregate must use only grouped
columns.

********** Error **********

ERROR: column "measuredata.datavalue" must appear in the GROUP BY clause or
be used in an aggregate function
SQL state: 42803
Detail: Direct arguments of an ordered-set aggregate must use only grouped
columns.
Character: 21


This doesn't make sense, because if I include 'datavalue' as a GROUP BY
clause, then my group N-size is exactly 1.

Basically, because the argument of percent_rank() must also be present in
the GROUP BY clause, there is no way to use percent_rank() to
programatically determine the percent-rank of a value within a set. And this
is likely to be the primary use-case of any kind of percent-rank function,
so it would make sense to include it in the new implementation.

Thanks so much,
Jon

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Mark Kirkwood
Дата:
Сообщение: Re: regression, deadlock in high frequency single-row UPDATE
Следующее
От: Tom Lane
Дата:
Сообщение: Re: BUG #12228: Primary use-case of PERCENT_RANK not supported