Re: wip: functions median and percentile

Поиск
Список
Период
Сортировка
От Pavel Stehule
Тема Re: wip: functions median and percentile
Дата
Msg-id AANLkTi=aVqWeFh1_mqDKZxy5+4aRd6feMD-m6TOyZ_6m@mail.gmail.com
обсуждение исходный текст
Ответ на Re: wip: functions median and percentile  (Hitoshi Harada <umi.tanuki@gmail.com>)
Ответы Re: wip: functions median and percentile  (Pavel Stehule <pavel.stehule@gmail.com>)
Список pgsql-hackers
Hello

I moved a "median" function to core.

+ doc part
+ regress test

Regards

Pavel Stehule


2010/9/20 Hitoshi Harada <umi.tanuki@gmail.com>:
> 2010/8/19 Pavel Stehule <pavel.stehule@gmail.com>:
>> Hello
>>
>> I am sending a prototype implementation of functions median and
>> percentile. This implementation is very simple and I moved it to
>> contrib for this moment - it is more easy maintainable. Later I'll
>> move it to core.
>
> I've reviewed this patch.
>
> * The patch can apply cleanly and make doesn't print any errors nor
> warnings. But it doesn't touch contrib/Makefile so I had to make by
> changing dir to contrib/median.
>
> * Cosmetic coding style should be more appropriate, including trailing
> white spaces and indentation positions.
>
> * Since these two aggregates use tuplesort inside it, there're
> possible risk to cause out of memory in normal use case. I don't think
> this fact is critical, but at least some notation should be referred
> in docs.
>
> * It doesn't contain any document nor regression tests.
>
> * They should be callable in window function context; for example
>
> contrib_regression=# select median(a) over (order by a) from t1;
> ERROR:  invalid tuplesort state
>
> or at least user-friend message should be printed.
>
> * The returned type is controversy. median(int) returns float8 as the
> patch intended, but avg(int) returns numeric. AFAIK only avg(float8)
> returns float8.
>
> * percentile() is more problematic; first, the second argument for the
> aggregate takes N of "N%ile" as int, like 50 if you want 50%ile,  but
> it doesn't care about negative values or more than 100. In addition,
> the second argument is taken at the first non-NULL value of the first
> argument, but the second argument is semantically constant. For
> example, for (1.. 10) value of a in table t1,
>
> contrib_regression=# select percentile(a, a * 10 order by a) from t1;
>  percentile
> ------------
>          1
> (1 row)
>
> contrib_regression=# select percentile(a, a * 10 order by a desc) from t1;
>  percentile
> ------------
>         10
> (1 row)
>
> and if the argument comes from the subquery which doesn't contain
> ORDER BY clause, you cannot predict the result.
>
> That's all of my quick review up to now.
>
> Regards,
>
> --
> Hitoshi Harada
>

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Heikki Linnakangas
Дата:
Сообщение: Re: Configuring synchronous replication
Следующее
От: Robert Haas
Дата:
Сообщение: Re: Latch implementation