Обсуждение: wip: functions median and percentile

Поиск

Список

Период

Сортировка

wip: functions median and percentile

От

Pavel Stehule

Дата:

19 августа 2010 г., 08:00:23

Hello

I am sending a prototype implementation of functions median and
percentile. This implementation is very simple and I moved it to
contrib for this moment - it is more easy maintainable. Later I'll
move it to core.

These functions are relative simple, there are not barrier for
implementation own specific mutations of this functions - so I propose
move to core only basic and well known form of these to core.

postgres=# select median(v) from generate_series(1,10) g(v);
 median
────────
    5.5
(1 row)

Time: 1.475 ms
postgres=# select percentile(v,50) from generate_series(1,10) g(v);
 percentile
────────────
          5
(1 row)

Time: 0.626 ms

This implementation is based on tuplesort and the speed is relative
well - the result from 1000000 rows is less 1 sec.

Regards

Pavel Stehule

Вложения

median.diff

Re: wip: functions median and percentile

От

Greg Stark

Дата:

19 августа 2010 г., 11:50:57

On Thu, Aug 19, 2010 at 11:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> I am sending a prototype implementation of functions median and
> percentile. This implementation is very simple and I moved it to
> contrib for this moment - it is more easy maintainable. Later I'll
> move it to core.

So if the entire result set fits in memory it would be nice to use the
O(n) Quickselect algorithm -- which would only be a small change to
the existing Quicksort code -- instead of sorting the entire set. That
should be possible both for percentile() and median(). It would be
good to generalize the tuplesort api sufficiently so that you can
implement this as a contrib module even if we eventually decide to
integrate it in core. It's probably worth having two copies of the
sort code, one for Quickselect and one for Quicksort just for speed,
though I suppose it's worth benchmarking.

But I'm not aware of a generalization of tapesort to allow O(n)
selection with the sequential i/o properties of tapesort. It would be
especially interesting, probably more so than the in-memory case.

-- 
greg

Re: wip: functions median and percentile

От

Tom Lane

Дата:

19 августа 2010 г., 12:33:27

Greg Stark <gsstark@mit.edu> writes:
> On Thu, Aug 19, 2010 at 11:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> I am sending a prototype implementation of functions median and
>> percentile. This implementation is very simple and I moved it to
>> contrib for this moment - it is more easy maintainable. Later I'll
>> move it to core.

> So if the entire result set fits in memory it would be nice to use the
> O(n) Quickselect algorithm -- which would only be a small change to
> the existing Quicksort code -- instead of sorting the entire set.

That seems like rather a lot of added infrastructure for functions whose
popularity is at best unknown.  I think we should KISS for the first
implementation.
        regards, tom lane

Re: wip: functions median and percentile

От

Robert Haas

Дата:

19 августа 2010 г., 13:50:03

On Thu, Aug 19, 2010 at 11:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Greg Stark <gsstark@mit.edu> writes:
>> On Thu, Aug 19, 2010 at 11:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> I am sending a prototype implementation of functions median and
>>> percentile. This implementation is very simple and I moved it to
>>> contrib for this moment - it is more easy maintainable. Later I'll
>>> move it to core.
>
>> So if the entire result set fits in memory it would be nice to use the
>> O(n) Quickselect algorithm -- which would only be a small change to
>> the existing Quicksort code -- instead of sorting the entire set.
>
> That seems like rather a lot of added infrastructure for functions whose
> popularity is at best unknown.  I think we should KISS for the first
> implementation.

+1.  I think the functions are useful, but the perfect is the enemy of the good.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: wip: functions median and percentile

От

David Fetter

Дата:

19 августа 2010 г., 14:03:29

On Thu, Aug 19, 2010 at 12:49:45PM -0400, Robert Haas wrote:
> On Thu, Aug 19, 2010 at 11:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Greg Stark <gsstark@mit.edu> writes:
> >> On Thu, Aug 19, 2010 at 11:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> >>> I am sending a prototype implementation of functions median and
> >>> percentile. This implementation is very simple and I moved it to
> >>> contrib for this moment - it is more easy maintainable. Later I'll
> >>> move it to core.
> >
> >> So if the entire result set fits in memory it would be nice to use the
> >> O(n) Quickselect algorithm -- which would only be a small change to
> >> the existing Quicksort code -- instead of sorting the entire set.
> >
> > That seems like rather a lot of added infrastructure for functions whose
> > popularity is at best unknown.  I think we should KISS for the first
> > implementation.
> 
> +1.  I think the functions are useful, but the perfect is the enemy of the good.

Percentile is already there as NTILE, a windowing function.  Median
may be useful, but we pretty much can't just call it "median."
Instead, we need to call it something like "left_median" or
"arithmetic_median."

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: wip: functions median and percentile

От

"Kevin Grittner"

Дата:

19 августа 2010 г., 14:12:41

David Fetter <david@fetter.org> wrote:
> Median may be useful, but we pretty much can't just call it
> "median." Instead, we need to call it something like "left_median"
> or "arithmetic_median."
I think it would be reasonable, and perhaps preferable, to use just
"median" for the semantics described in most dictionaries -- for
example, this:
http://www.merriam-webster.com/dictionary/median
If you do a google search for "median" and poke around, you'll find
many places where this is the only definition mentioned; the others
seem to be rather infrequently used.  Why not make the commone usage
convenient?
-Kevin

Re: wip: functions median and percentile

От

David Fetter

Дата:

19 августа 2010 г., 14:15:12

On Thu, Aug 19, 2010 at 12:12:12PM -0500, Kevin Grittner wrote:
> David Fetter <david@fetter.org> wrote:
>  
> > Median may be useful, but we pretty much can't just call it
> > "median." Instead, we need to call it something like "left_median"
> > or "arithmetic_median."
>  
> I think it would be reasonable, and perhaps preferable, to use just
> "median" for the semantics described in most dictionaries -- for
> example, this:
>  
> http://www.merriam-webster.com/dictionary/median
>  
> If you do a google search for "median" and poke around, you'll find
> many places where this is the only definition mentioned; the others
> seem to be rather infrequently used.  Why not make the commone usage
> convenient?

The reason not to is the same reason that MEDIAN doesn't appear in the
SQL standard, namely that what's common in one field is wrong in
another.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: wip: functions median and percentile

От

Tom Lane

Дата:

19 августа 2010 г., 14:25:57

David Fetter <david@fetter.org> writes:
> On Thu, Aug 19, 2010 at 12:12:12PM -0500, Kevin Grittner wrote:
>> http://www.merriam-webster.com/dictionary/median
>> 
>> If you do a google search for "median" and poke around, you'll find
>> many places where this is the only definition mentioned; the others
>> seem to be rather infrequently used.  Why not make the commone usage
>> convenient?

> The reason not to is the same reason that MEDIAN doesn't appear in the
> SQL standard, namely that what's common in one field is wrong in
> another.

Hmm, do you have any *evidence* that that's why it's not in the standard?

My own take on that is that it's reasonably probable that the SQL
committee might standardize a function by that name someday.  What we
need to be worrying about is the prospect that if there are multiple
definitions for the term, they might pick a different one than we did.
A name like "arithmetic_median" seems much less likely to get blindsided
by future standardization.
        regards, tom lane

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

19 августа 2010 г., 14:39:14

2010/8/19 David Fetter <david@fetter.org>:
> On Thu, Aug 19, 2010 at 12:49:45PM -0400, Robert Haas wrote:
>> On Thu, Aug 19, 2010 at 11:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> > Greg Stark <gsstark@mit.edu> writes:
>> >> On Thu, Aug 19, 2010 at 11:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> >>> I am sending a prototype implementation of functions median and
>> >>> percentile. This implementation is very simple and I moved it to
>> >>> contrib for this moment - it is more easy maintainable. Later I'll
>> >>> move it to core.
>> >
>> >> So if the entire result set fits in memory it would be nice to use the
>> >> O(n) Quickselect algorithm -- which would only be a small change to
>> >> the existing Quicksort code -- instead of sorting the entire set.
>> >
>> > That seems like rather a lot of added infrastructure for functions whose
>> > popularity is at best unknown.  I think we should KISS for the first
>> > implementation.
>>
>> +1.  I think the functions are useful, but the perfect is the enemy of the good.
>
> Percentile is already there as NTILE, a windowing function.  Median

I don't think it. The purpose is same, but usage is different -
aggregate functions can be used as window func, but window functions
cannot be used as aggregates.


> may be useful, but we pretty much can't just call it "median."
> Instead, we need to call it something like "left_median" or
> "arithmetic_median."

I put some time to deep searching in this area - and I talked with my
kolegues mathematican and everywhere use a just median. Some longer or
different name isn't barrier for me, but people can be confused.

Regards
Pavel

>
> Cheers,
> David.
> --
> David Fetter <david@fetter.org> http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fetter@gmail.com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

19 августа 2010 г., 14:43:00

2010/8/19 David Fetter <david@fetter.org>:
> On Thu, Aug 19, 2010 at 12:12:12PM -0500, Kevin Grittner wrote:
>> David Fetter <david@fetter.org> wrote:
>>
>> > Median may be useful, but we pretty much can't just call it
>> > "median." Instead, we need to call it something like "left_median"
>> > or "arithmetic_median."
>>
>> I think it would be reasonable, and perhaps preferable, to use just
>> "median" for the semantics described in most dictionaries -- for
>> example, this:
>>
>> http://www.merriam-webster.com/dictionary/median
>>
>> If you do a google search for "median" and poke around, you'll find
>> many places where this is the only definition mentioned; the others
>> seem to be rather infrequently used.  Why not make the commone usage
>> convenient?
>
> The reason not to is the same reason that MEDIAN doesn't appear in the
> SQL standard, namely that what's common in one field is wrong in
> another.

I think some else. The reason can be more simple - implementation of
median is significantly harder  then other aggregates.

I looked there and Oracle11g use "median" in common sense.

Regards

Pavel Stehule
>
> Cheers,
> David.
> --
> David Fetter <david@fetter.org> http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fetter@gmail.com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>

Re: wip: functions median and percentile

От

David Fetter

Дата:

19 августа 2010 г., 14:48:18

On Thu, Aug 19, 2010 at 01:25:36PM -0400, Tom Lane wrote:
> David Fetter <david@fetter.org> writes:
> > On Thu, Aug 19, 2010 at 12:12:12PM -0500, Kevin Grittner wrote:
> >> http://www.merriam-webster.com/dictionary/median
> >> 
> >> If you do a google search for "median" and poke around, you'll find
> >> many places where this is the only definition mentioned; the others
> >> seem to be rather infrequently used.  Why not make the commone usage
> >> convenient?
> 
> > The reason not to is the same reason that MEDIAN doesn't appear in the
> > SQL standard, namely that what's common in one field is wrong in
> > another.
> 
> Hmm, do you have any *evidence* that that's why it's not in the standard?

Only from Joe Celko, who was on the standards committee, and has
written on the subject.  There's nothing I've found in the standard
itself for this, if that's what you're looking for.

> My own take on that is that it's reasonably probable that the SQL
> committee might standardize a function by that name someday.  What
> we need to be worrying about is the prospect that if there are
> multiple definitions for the term, they might pick a different one
> than we did.

Exactly :)

> A name like "arithmetic_median" seems much less likely to get
> blindsided by future standardization.

Yep.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: wip: functions median and percentile

От

"Kevin Grittner"

Дата:

19 августа 2010 г., 15:17:50

Pavel Stehule <pavel.stehule@gmail.com> wrote:
> I looked there and Oracle11g use "median" in common sense.
As does Sybase IQ.  FWIW, Excel spreadsheets do, too.
The chance of the SQL committee picking some other meaning for bare
MEDIAN seems vanishingly small; although I have to grant that with
only Oracle and Sybase as a precedent in the SQL world, it's not
zero.
-Kevin

Re: wip: functions median and percentile

От

Tom Lane

Дата:

19 августа 2010 г., 15:18:12

David Fetter <david@fetter.org> writes:
> On Thu, Aug 19, 2010 at 01:25:36PM -0400, Tom Lane wrote:
>> A name like "arithmetic_median" seems much less likely to get
>> blindsided by future standardization.

> Yep.

OTOH, if Pavel's right that Oracle already has an aggregate named
median(), it seems quite unlikely that the SQL committee would pick
a definition incompatible with Oracle's to standardize on.
        regards, tom lane

Re: wip: functions median and percentile

От

Hitoshi Harada

Дата:

20 сентября 2010 г., 00:47:18

2010/8/19 Pavel Stehule <pavel.stehule@gmail.com>:
> Hello
>
> I am sending a prototype implementation of functions median and
> percentile. This implementation is very simple and I moved it to
> contrib for this moment - it is more easy maintainable. Later I'll
> move it to core.

I've reviewed this patch.

* The patch can apply cleanly and make doesn't print any errors nor
warnings. But it doesn't touch contrib/Makefile so I had to make by
changing dir to contrib/median.

* Cosmetic coding style should be more appropriate, including trailing
white spaces and indentation positions.

* Since these two aggregates use tuplesort inside it, there're
possible risk to cause out of memory in normal use case. I don't think
this fact is critical, but at least some notation should be referred
in docs.

* It doesn't contain any document nor regression tests.

* They should be callable in window function context; for example

contrib_regression=# select median(a) over (order by a) from t1;
ERROR:  invalid tuplesort state

or at least user-friend message should be printed.

* The returned type is controversy. median(int) returns float8 as the
patch intended, but avg(int) returns numeric. AFAIK only avg(float8)
returns float8.

* percentile() is more problematic; first, the second argument for the
aggregate takes N of "N%ile" as int, like 50 if you want 50%ile,  but
it doesn't care about negative values or more than 100. In addition,
the second argument is taken at the first non-NULL value of the first
argument, but the second argument is semantically constant. For
example, for (1.. 10) value of a in table t1,

contrib_regression=# select percentile(a, a * 10 order by a) from t1;percentile
------------         1
(1 row)

contrib_regression=# select percentile(a, a * 10 order by a desc) from t1;percentile
------------        10
(1 row)

and if the argument comes from the subquery which doesn't contain
ORDER BY clause, you cannot predict the result.

That's all of my quick review up to now.

Regards,

-- 
Hitoshi Harada

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

21 сентября 2010 г., 17:29:07

Hello

I found probably hard problem in cooperation with window functions :(

tuplesort_begin_datum creates child context inside aggcontext. This
context is used for tuplestore. But when this function is called from
WindowAgg_Aggregates context - someone drops all child context every
iteration, and then tuplestore state is invalid. For this moment we
can block using a median function as window function, but it should be
solved better - if it is possible - I don't see inside window function
implementation.

2010/9/20 Hitoshi Harada <umi.tanuki@gmail.com>:
> 2010/8/19 Pavel Stehule <pavel.stehule@gmail.com>:
>> Hello
>>
>> I am sending a prototype implementation of functions median and
>> percentile. This implementation is very simple and I moved it to
>> contrib for this moment - it is more easy maintainable. Later I'll
>> move it to core.
>
> I've reviewed this patch.
>
> * The patch can apply cleanly and make doesn't print any errors nor
> warnings. But it doesn't touch contrib/Makefile so I had to make by
> changing dir to contrib/median.
>

fixed

> * Cosmetic coding style should be more appropriate, including trailing
> white spaces and indentation positions.
>

can you specify where, please?

> * Since these two aggregates use tuplesort inside it, there're
> possible risk to cause out of memory in normal use case. I don't think
> this fact is critical, but at least some notation should be referred
> in docs.
>

it is same as windows function, no? So the risk is same.

> * It doesn't contain any document nor regression tests.
>

yes, it is my fault, some median regress test appended, documentation
I am will sending later - resp. if you agree with current form of
patch, I'll finalize this patch and remove "median" to core. I put
these functions to contrib just for simple testing of proof concept.

> * They should be callable in window function context; for example
>
> contrib_regression=# select median(a) over (order by a) from t1;
> ERROR:  invalid tuplesort state
>
> or at least user-friend message should be printed.
>

the problem is in windows function implementation - partially fixed -
blocked from windows context

> * The returned type is controversy. median(int) returns float8 as the
> patch intended, but avg(int) returns numeric. AFAIK only avg(float8)
> returns float8.
>

fixed

> * percentile() is more problematic; first, the second argument for the
> aggregate takes N of "N%ile" as int, like 50 if you want 50%ile,  but
> it doesn't care about negative values or more than 100. In addition,
> the second argument is taken at the first non-NULL value of the first
> argument, but the second argument is semantically constant. For
> example, for (1.. 10) value of a in table t1,
>

yes, I understand - I don't have a problem to modify it. Just this has
same behave as string_agg. This is again deeper problem - we cannot
ensure so some parameter of aggregation function will be a constant. -
so it cannot be well solved now :( Have you some idea? There isn't any
tool for checking if some parameter is constant or not.

I removed percentile now.

> contrib_regression=# select percentile(a, a * 10 order by a) from t1;
>  percentile
> ------------
>          1
> (1 row)
>
> contrib_regression=# select percentile(a, a * 10 order by a desc) from t1;
>  percentile
> ------------
>         10
> (1 row)
>
> and if the argument comes from the subquery which doesn't contain
> ORDER BY clause, you cannot predict the result.
>
> That's all of my quick review up to now.
>
> Regards,
>
> --
> Hitoshi Harada
>

Вложения

median.diff

Re: wip: functions median and percentile

От

Robert Haas

Дата:

21 сентября 2010 г., 17:44:24

On Tue, Sep 21, 2010 at 4:28 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> * Cosmetic coding style should be more appropriate, including trailing
>> white spaces and indentation positions.
>
> can you specify where, please?

I noticed a lot of problems in this area when working on your \ef /
\sf patch, too.  Trailing whitespace is nearly always bad, and it's
not hard to find - just grep the diff for it.  As for indentation, try
to match the surrounding code.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

21 сентября 2010 г., 18:04:20

2010/9/21 Robert Haas <robertmhaas@gmail.com>:
> On Tue, Sep 21, 2010 at 4:28 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> * Cosmetic coding style should be more appropriate, including trailing
>>> white spaces and indentation positions.
>>
>> can you specify where, please?
>
> I noticed a lot of problems in this area when working on your \ef /
> \sf patch, too.  Trailing whitespace is nearly always bad, and it's
> not hard to find - just grep the diff for it.  As for indentation, try
> to match the surrounding code.

is updated patch better?

Pavel

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>

Вложения

median.diff

Re: wip: functions median and percentile

От

Hitoshi Harada

Дата:

21 сентября 2010 г., 23:21:09

2010/9/22 Pavel Stehule <pavel.stehule@gmail.com>:
> 2010/9/21 Robert Haas <robertmhaas@gmail.com>:
>> On Tue, Sep 21, 2010 at 4:28 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>> * Cosmetic coding style should be more appropriate, including trailing
>>>> white spaces and indentation positions.
>>>
>>> can you specify where, please?
>>
>> I noticed a lot of problems in this area when working on your \ef /
>> \sf patch, too.  Trailing whitespace is nearly always bad, and it's
>> not hard to find - just grep the diff for it.  As for indentation, try
>> to match the surrounding code.
>
> is updated patch better?

Better, but you should be more careful, for example,

+         tuplesort_performsort(aggstate->sortstate);
+         <-- 1
+         while (tuplesort_getdatum(aggstate->sortstate,
+                                 true,
+                                       &value, &isNull))

For #1, please remove horizontal tabs in empty lines.

Regards,


--
Hitoshi Harada

Re: wip: functions median and percentile

От

Hitoshi Harada

Дата:

22 сентября 2010 г., 00:07:09

2010/9/22 Pavel Stehule <pavel.stehule@gmail.com>:
> Hello
>
> I found probably hard problem in cooperation with window functions :(
>
> tuplesort_begin_datum creates child context inside aggcontext. This
> context is used for tuplestore. But when this function is called from
> WindowAgg_Aggregates context - someone drops all child context every
> iteration, and then tuplestore state is invalid. For this moment we
> can block using a median function as window function, but it should be
> solved better - if it is possible - I don't see inside window function
> implementation.

Does it happen when the window frame starts from not UNBOUNDED
PRECEDING? In those cases, nodeWindowAgg tries to discard all
aggregate contexts and to initialize the aggregate state. AFAIK the
memory context is brand-new although it was reset and its children
deleted, so if the function is well-formed and initializes its state
on NULL input, it doesn't cause a problem.


Regards,


-- 
Hitoshi Harada

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

22 сентября 2010 г., 01:37:14

2010/9/22 Hitoshi Harada <umi.tanuki@gmail.com>:
> 2010/9/22 Pavel Stehule <pavel.stehule@gmail.com>:
>> 2010/9/21 Robert Haas <robertmhaas@gmail.com>:
>>> On Tue, Sep 21, 2010 at 4:28 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>>> * Cosmetic coding style should be more appropriate, including trailing
>>>>> white spaces and indentation positions.
>>>>
>>>> can you specify where, please?
>>>
>>> I noticed a lot of problems in this area when working on your \ef /
>>> \sf patch, too.  Trailing whitespace is nearly always bad, and it's
>>> not hard to find - just grep the diff for it.  As for indentation, try
>>> to match the surrounding code.
>>
>> is updated patch better?
>
> Better, but you should be more careful, for example,
>
> +               tuplesort_performsort(aggstate->sortstate);
> +               <-- 1
> +               while (tuplesort_getdatum(aggstate->sortstate,
> +                                                               true,
> +                                                                     &value, &isNull))
>
> For #1, please remove horizontal tabs in empty lines.

ok, checked removed

Pavel
>
> Regards,
>
>
> --
> Hitoshi Harada
>

Вложения

median.diff

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

22 сентября 2010 г., 01:45:50

2010/9/22 Hitoshi Harada <umi.tanuki@gmail.com>:
> 2010/9/22 Pavel Stehule <pavel.stehule@gmail.com>:
>> Hello
>>
>> I found probably hard problem in cooperation with window functions :(
>>
>> tuplesort_begin_datum creates child context inside aggcontext. This
>> context is used for tuplestore. But when this function is called from
>> WindowAgg_Aggregates context - someone drops all child context every
>> iteration, and then tuplestore state is invalid. For this moment we
>> can block using a median function as window function, but it should be
>> solved better - if it is possible - I don't see inside window function
>> implementation.
>
> Does it happen when the window frame starts from not UNBOUNDED
> PRECEDING? In those cases, nodeWindowAgg tries to discard all
> aggregate contexts and to initialize the aggregate state. AFAIK the
> memory context is brand-new although it was reset and its children
> deleted, so if the function is well-formed and initializes its state
> on NULL input, it doesn't cause a problem.
>

It is bad premise. Inside a aggregates I don't have a control over
memory context. There missing one level of persistent memory context
where I can put a child context - but this context should be created
outside - some like group context.

Pavel

>
> Regards,
>
>
> --
> Hitoshi Harada
>

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

23 сентября 2010 г., 06:36:35

Hello

2010/9/22 Hitoshi Harada <umi.tanuki@gmail.com>:
> 2010/9/22 Pavel Stehule <pavel.stehule@gmail.com>:
>> Hello
>>
>> I found probably hard problem in cooperation with window functions :(
>>
>> tuplesort_begin_datum creates child context inside aggcontext. This
>> context is used for tuplestore. But when this function is called from
>> WindowAgg_Aggregates context - someone drops all child context every
>> iteration, and then tuplestore state is invalid. For this moment we
>> can block using a median function as window function, but it should be
>> solved better - if it is possible - I don't see inside window function
>> implementation.
>
> Does it happen when the window frame starts from not UNBOUNDED
> PRECEDING? In those cases, nodeWindowAgg tries to discard all
> aggregate contexts and to initialize the aggregate state. AFAIK the
> memory context is brand-new although it was reset and its children
> deleted, so if the function is well-formed and initializes its state
> on NULL input, it doesn't cause a problem.

maybe I was confused. I found a other possible problems.

The problem with median function is probably inside a final function
implementation. Actually we request possibility of repetitive call of
final function. But final function call tuplesort_end function and
tuplesort_performsort. These function changes a state of tuplesort.
The most basic question is "who has to call tuplesort_end function and
when?

Regards

Pavel Stehule

>
>
> Regards,
>
>
> --
> Hitoshi Harada
>

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

23 сентября 2010 г., 10:23:29

Hello

I moved a "median" function to core.

+ doc part
+ regress test

Regards

Pavel Stehule


2010/9/20 Hitoshi Harada <umi.tanuki@gmail.com>:
> 2010/8/19 Pavel Stehule <pavel.stehule@gmail.com>:
>> Hello
>>
>> I am sending a prototype implementation of functions median and
>> percentile. This implementation is very simple and I moved it to
>> contrib for this moment - it is more easy maintainable. Later I'll
>> move it to core.
>
> I've reviewed this patch.
>
> * The patch can apply cleanly and make doesn't print any errors nor
> warnings. But it doesn't touch contrib/Makefile so I had to make by
> changing dir to contrib/median.
>
> * Cosmetic coding style should be more appropriate, including trailing
> white spaces and indentation positions.
>
> * Since these two aggregates use tuplesort inside it, there're
> possible risk to cause out of memory in normal use case. I don't think
> this fact is critical, but at least some notation should be referred
> in docs.
>
> * It doesn't contain any document nor regression tests.
>
> * They should be callable in window function context; for example
>
> contrib_regression=# select median(a) over (order by a) from t1;
> ERROR:  invalid tuplesort state
>
> or at least user-friend message should be printed.
>
> * The returned type is controversy. median(int) returns float8 as the
> patch intended, but avg(int) returns numeric. AFAIK only avg(float8)
> returns float8.
>
> * percentile() is more problematic; first, the second argument for the
> aggregate takes N of "N%ile" as int, like 50 if you want 50%ile,  but
> it doesn't care about negative values or more than 100. In addition,
> the second argument is taken at the first non-NULL value of the first
> argument, but the second argument is semantically constant. For
> example, for (1.. 10) value of a in table t1,
>
> contrib_regression=# select percentile(a, a * 10 order by a) from t1;
>  percentile
> ------------
>          1
> (1 row)
>
> contrib_regression=# select percentile(a, a * 10 order by a desc) from t1;
>  percentile
> ------------
>         10
> (1 row)
>
> and if the argument comes from the subquery which doesn't contain
> ORDER BY clause, you cannot predict the result.
>
> That's all of my quick review up to now.
>
> Regards,
>
> --
> Hitoshi Harada
>

Вложения

median.diff

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

23 сентября 2010 г., 10:26:24

sorry

little bit fixed patch

Pavel

2010/9/23 Pavel Stehule <pavel.stehule@gmail.com>:
> Hello
>
> I moved a "median" function to core.
>
> + doc part
> + regress test
>
> Regards
>
> Pavel Stehule
>
>
> 2010/9/20 Hitoshi Harada <umi.tanuki@gmail.com>:
>> 2010/8/19 Pavel Stehule <pavel.stehule@gmail.com>:
>>> Hello
>>>
>>> I am sending a prototype implementation of functions median and
>>> percentile. This implementation is very simple and I moved it to
>>> contrib for this moment - it is more easy maintainable. Later I'll
>>> move it to core.
>>
>> I've reviewed this patch.
>>
>> * The patch can apply cleanly and make doesn't print any errors nor
>> warnings. But it doesn't touch contrib/Makefile so I had to make by
>> changing dir to contrib/median.
>>
>> * Cosmetic coding style should be more appropriate, including trailing
>> white spaces and indentation positions.
>>
>> * Since these two aggregates use tuplesort inside it, there're
>> possible risk to cause out of memory in normal use case. I don't think
>> this fact is critical, but at least some notation should be referred
>> in docs.
>>
>> * It doesn't contain any document nor regression tests.
>>
>> * They should be callable in window function context; for example
>>
>> contrib_regression=# select median(a) over (order by a) from t1;
>> ERROR:  invalid tuplesort state
>>
>> or at least user-friend message should be printed.
>>
>> * The returned type is controversy. median(int) returns float8 as the
>> patch intended, but avg(int) returns numeric. AFAIK only avg(float8)
>> returns float8.
>>
>> * percentile() is more problematic; first, the second argument for the
>> aggregate takes N of "N%ile" as int, like 50 if you want 50%ile,  but
>> it doesn't care about negative values or more than 100. In addition,
>> the second argument is taken at the first non-NULL value of the first
>> argument, but the second argument is semantically constant. For
>> example, for (1.. 10) value of a in table t1,
>>
>> contrib_regression=# select percentile(a, a * 10 order by a) from t1;
>>  percentile
>> ------------
>>          1
>> (1 row)
>>
>> contrib_regression=# select percentile(a, a * 10 order by a desc) from t1;
>>  percentile
>> ------------
>>         10
>> (1 row)
>>
>> and if the argument comes from the subquery which doesn't contain
>> ORDER BY clause, you cannot predict the result.
>>
>> That's all of my quick review up to now.
>>
>> Regards,
>>
>> --
>> Hitoshi Harada
>>
>

Вложения

median.diff

Re: wip: functions median and percentile

От

Hitoshi Harada

Дата:

23 сентября 2010 г., 14:45:46

2010/9/23 Pavel Stehule <pavel.stehule@gmail.com>:
> Hello
>
> 2010/9/22 Hitoshi Harada <umi.tanuki@gmail.com>:
>> 2010/9/22 Pavel Stehule <pavel.stehule@gmail.com>:
>>> Hello
>>>
>>> I found probably hard problem in cooperation with window functions :(
>
> maybe I was confused. I found a other possible problems.
>
> The problem with median function is probably inside a final function
> implementation. Actually we request possibility of repetitive call of
> final function. But final function call tuplesort_end function and
> tuplesort_performsort. These function changes a state of tuplesort.
> The most basic question is "who has to call tuplesort_end function and
> when?

Reading the comment in array_userfuncs.c, array_agg_finalfn() doesn't
clean up its internal state at all and tells it's the executor's
responsibility to clear memory. It is allowed since ArrayBuildState is
only in-memory state. In the other hand, TupleSort should be cleared
by calling tuplesort_end() if it has tapeset member (on file based
sort) to close physical files.

So 2 or 3 ways to go in my mind:

1. call tuplesort_begin_datum with INT_MAX workMem rather than the
global work_mem, to avoid it spills out sort state to files. It may
sounds dangerous, but actually memory exhausting can happen in
array_agg() as well.

2. add TupleSort an argument that tells not to use file at all. This
results in the same as #1 but more generic approach.

3. don't use tuplesort in median() but implement its original sort
management. This looks quite redundant and like maintenance problem.

#2 sounds like the best in generic and consistent way. The only point
is whether the change is worth for implementing median() as it's very
system-wide common fundamentals.

Other options?


Regards,
-- 
Hitoshi Harada

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

23 сентября 2010 г., 15:28:27

2010/9/23 Hitoshi Harada <umi.tanuki@gmail.com>:
> 2010/9/23 Pavel Stehule <pavel.stehule@gmail.com>:
>> Hello
>>
>> 2010/9/22 Hitoshi Harada <umi.tanuki@gmail.com>:
>>> 2010/9/22 Pavel Stehule <pavel.stehule@gmail.com>:
>>>> Hello
>>>>
>>>> I found probably hard problem in cooperation with window functions :(
>>
>> maybe I was confused. I found a other possible problems.
>>
>> The problem with median function is probably inside a final function
>> implementation. Actually we request possibility of repetitive call of
>> final function. But final function call tuplesort_end function and
>> tuplesort_performsort. These function changes a state of tuplesort.
>> The most basic question is "who has to call tuplesort_end function and
>> when?
>
> Reading the comment in array_userfuncs.c, array_agg_finalfn() doesn't
> clean up its internal state at all and tells it's the executor's
> responsibility to clear memory. It is allowed since ArrayBuildState is
> only in-memory state. In the other hand, TupleSort should be cleared
> by calling tuplesort_end() if it has tapeset member (on file based
> sort) to close physical files.
>
> So 2 or 3 ways to go in my mind:

it is little bit worse - we cannot to call tuplesort_performsort repetitive.

>
> 1. call tuplesort_begin_datum with INT_MAX workMem rather than the
> global work_mem, to avoid it spills out sort state to files. It may
> sounds dangerous, but actually memory exhausting can happen in
> array_agg() as well.
>
> 2. add TupleSort an argument that tells not to use file at all. This
> results in the same as #1 but more generic approach.
>
> 3. don't use tuplesort in median() but implement its original sort
> management. This looks quite redundant and like maintenance problem.
>
> #2 sounds like the best in generic and consistent way. The only point
> is whether the change is worth for implementing median() as it's very
> system-wide common fundamentals.
>
> Other options?

#4 block median under window clause

#5 use a C array instead tuplesort under window clause. It is very
unpractical to use a windows clauses with large datasets, so it should
not be a problem. More, this can be very quick, because for C array we
can use a qsort function.

Now I prefer #5 - it can be fast for using inside windows clause and
safe when window clause will not be used.

Regards

Pavel
>
>
> Regards,
> --
> Hitoshi Harada
>

Re: wip: functions median and percentile

От

David Fetter

Дата:

23 сентября 2010 г., 15:48:56

On Thu, Sep 23, 2010 at 08:27:38PM +0200, Pavel Stehule wrote:
> 2010/9/23 Hitoshi Harada <umi.tanuki@gmail.com>:
> > 2010/9/23 Pavel Stehule <pavel.stehule@gmail.com>:
> >> Hello
> >>
> >> 2010/9/22 Hitoshi Harada <umi.tanuki@gmail.com>:
> >>> 2010/9/22 Pavel Stehule <pavel.stehule@gmail.com>:
> >>>> Hello
> >>>>
> >>>> I found probably hard problem in cooperation with window functions :(
> >>
> >> maybe I was confused. I found a other possible problems.
> >>
> >> The problem with median function is probably inside a final function
> >> implementation. Actually we request possibility of repetitive call of
> >> final function. But final function call tuplesort_end function and
> >> tuplesort_performsort. These function changes a state of tuplesort.
> >> The most basic question is "who has to call tuplesort_end function and
> >> when?
> >
> > Reading the comment in array_userfuncs.c, array_agg_finalfn() doesn't
> > clean up its internal state at all and tells it's the executor's
> > responsibility to clear memory. It is allowed since ArrayBuildState is
> > only in-memory state. In the other hand, TupleSort should be cleared
> > by calling tuplesort_end() if it has tapeset member (on file based
> > sort) to close physical files.
> >
> > So 2 or 3 ways to go in my mind:
> 
> it is little bit worse - we cannot to call tuplesort_performsort repetitive.
> 
> >
> > 1. call tuplesort_begin_datum with INT_MAX workMem rather than the
> > global work_mem, to avoid it spills out sort state to files. It may
> > sounds dangerous, but actually memory exhausting can happen in
> > array_agg() as well.
> >
> > 2. add TupleSort an argument that tells not to use file at all. This
> > results in the same as #1 but more generic approach.
> >
> > 3. don't use tuplesort in median() but implement its original sort
> > management. This looks quite redundant and like maintenance problem.
> >
> > #2 sounds like the best in generic and consistent way. The only point
> > is whether the change is worth for implementing median() as it's very
> > system-wide common fundamentals.
> >
> > Other options?
> 
> #4 block median under window clause
> 
> #5 use a C array instead tuplesort under window clause. It is very
> unpractical to use a windows clauses with large datasets, so it should
> not be a problem. More, this can be very quick, because for C array we
> can use a qsort function.
> 
> Now I prefer #5 - it can be fast for using inside windows clause and
> safe when window clause will not be used.

If there's some way to do this using the same code in the windowing
and non-windowing case, that would be much, much better from an
architectural point of view.  Single Point of Truth and all that.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

23 сентября 2010 г., 15:57:41

2010/9/23 David Fetter <david@fetter.org>:
> On Thu, Sep 23, 2010 at 08:27:38PM +0200, Pavel Stehule wrote:
>> 2010/9/23 Hitoshi Harada <umi.tanuki@gmail.com>:
>> > 2010/9/23 Pavel Stehule <pavel.stehule@gmail.com>:
>> >> Hello
>> >>
>> >> 2010/9/22 Hitoshi Harada <umi.tanuki@gmail.com>:
>> >>> 2010/9/22 Pavel Stehule <pavel.stehule@gmail.com>:
>> >>>> Hello
>> >>>>
>> >>>> I found probably hard problem in cooperation with window functions :(
>> >>
>> >> maybe I was confused. I found a other possible problems.
>> >>
>> >> The problem with median function is probably inside a final function
>> >> implementation. Actually we request possibility of repetitive call of
>> >> final function. But final function call tuplesort_end function and
>> >> tuplesort_performsort. These function changes a state of tuplesort.
>> >> The most basic question is "who has to call tuplesort_end function and
>> >> when?
>> >
>> > Reading the comment in array_userfuncs.c, array_agg_finalfn() doesn't
>> > clean up its internal state at all and tells it's the executor's
>> > responsibility to clear memory. It is allowed since ArrayBuildState is
>> > only in-memory state. In the other hand, TupleSort should be cleared
>> > by calling tuplesort_end() if it has tapeset member (on file based
>> > sort) to close physical files.
>> >
>> > So 2 or 3 ways to go in my mind:
>>
>> it is little bit worse - we cannot to call tuplesort_performsort repetitive.
>>
>> >
>> > 1. call tuplesort_begin_datum with INT_MAX workMem rather than the
>> > global work_mem, to avoid it spills out sort state to files. It may
>> > sounds dangerous, but actually memory exhausting can happen in
>> > array_agg() as well.
>> >
>> > 2. add TupleSort an argument that tells not to use file at all. This
>> > results in the same as #1 but more generic approach.
>> >
>> > 3. don't use tuplesort in median() but implement its original sort
>> > management. This looks quite redundant and like maintenance problem.
>> >
>> > #2 sounds like the best in generic and consistent way. The only point
>> > is whether the change is worth for implementing median() as it's very
>> > system-wide common fundamentals.
>> >
>> > Other options?
>>
>> #4 block median under window clause
>>
>> #5 use a C array instead tuplesort under window clause. It is very
>> unpractical to use a windows clauses with large datasets, so it should
>> not be a problem. More, this can be very quick, because for C array we
>> can use a qsort function.
>>
>> Now I prefer #5 - it can be fast for using inside windows clause and
>> safe when window clause will not be used.
>
> If there's some way to do this using the same code in the windowing
> and non-windowing case, that would be much, much better from an
> architectural point of view.  Single Point of Truth and all that.

We can have a median with support a window clause, but limited to
work_mem, or we can have a unlimited median, but without window
clause. I think, I am able to minimalize a code duplicity - just to
define some envelope over  tuplesort.

The unique code isn't possible there - minimally now we have a two
variants - one for numeric result and second for double. But it is
usual - try to look how much AVG functions are in core.

Regards

Pavel

>
> Cheers,
> David.
> --
> David Fetter <david@fetter.org> http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fetter@gmail.com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

26 сентября 2010 г., 07:40:15

Hello,

there is updated version - with support of window clause. The limits
are work_mem for using inside window aggregate or unlimited when it is
used as standard query.

This patch needs a few work - can share a compare functionality with
tuplesort.c, but I would to verify a concept now.

Comments?

Regards

Pavel

Вложения

median.diff

Re: wip: functions median and percentile

От

Hitoshi Harada

Дата:

01 октября 2010 г., 05:17:28

2010/9/26 Pavel Stehule <pavel.stehule@gmail.com>:
> Hello,
>
> there is updated version - with support of window clause. The limits
> are work_mem for using inside window aggregate or unlimited when it is
> used as standard query.
>
> This patch needs a few work - can share a compare functionality with
> tuplesort.c, but I would to verify a concept now.
>
> Comments?

Sorry for delay. I read the patch and it seems the result is sane. For
window function calls, I agree that the current tuplesort is not
enough to implement median functions and the patch introduces its own
memsort mechanism, although memsort has too much copied from
tuplesort. It looks to me not so difficult to modify the existing
tuplesort to guarantee staying in memory always if an option to do so
is specified from caller. I think that option can be used by other
cases in the core code.

Regards,


-- 
Hitoshi Harada

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

01 октября 2010 г., 05:47:42

Hello

2010/10/1 Hitoshi Harada <umi.tanuki@gmail.com>:
> 2010/9/26 Pavel Stehule <pavel.stehule@gmail.com>:
>> Hello,
>>
>> there is updated version - with support of window clause. The limits
>> are work_mem for using inside window aggregate or unlimited when it is
>> used as standard query.
>>
>> This patch needs a few work - can share a compare functionality with
>> tuplesort.c, but I would to verify a concept now.
>>
>> Comments?
>
> Sorry for delay. I read the patch and it seems the result is sane. For
> window function calls, I agree that the current tuplesort is not
> enough to implement median functions and the patch introduces its own
> memsort mechanism, although memsort has too much copied from
> tuplesort. It looks to me not so difficult to modify the existing
> tuplesort to guarantee staying in memory always if an option to do so
> is specified from caller. I think that option can be used by other
> cases in the core code.

There are two factors - a) tuplesorts should to uses only memory, b)
data can be sorted again and again. I'll look on possible tuplesort
modifications on weekend.

Pavel


>
> Regards,
>
>
> --
> Hitoshi Harada
>

Re: wip: functions median and percentile

От

Tom Lane

Дата:

01 октября 2010 г., 10:05:54

Hitoshi Harada <umi.tanuki@gmail.com> writes:
> 2010/9/26 Pavel Stehule <pavel.stehule@gmail.com>:
>> This patch needs a few work - can share a compare functionality with
>> tuplesort.c, but I would to verify a concept now.

> Sorry for delay. I read the patch and it seems the result is sane. For
> window function calls, I agree that the current tuplesort is not
> enough to implement median functions and the patch introduces its own
> memsort mechanism, although memsort has too much copied from
> tuplesort. It looks to me not so difficult to modify the existing
> tuplesort to guarantee staying in memory always if an option to do so
> is specified from caller. I think that option can be used by other
> cases in the core code.

If this patch tries to force the entire sort to happen in memory,
it is not committable.  What will happen when you get a lot of data?
You need to be working on a variant that will work anyway, not working
on an unacceptable lobotomization of the main sort code.
        regards, tom lane

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

01 октября 2010 г., 10:20:45

2010/10/1 Tom Lane <tgl@sss.pgh.pa.us>:
> Hitoshi Harada <umi.tanuki@gmail.com> writes:
>> 2010/9/26 Pavel Stehule <pavel.stehule@gmail.com>:
>>> This patch needs a few work - can share a compare functionality with
>>> tuplesort.c, but I would to verify a concept now.
>
>> Sorry for delay. I read the patch and it seems the result is sane. For
>> window function calls, I agree that the current tuplesort is not
>> enough to implement median functions and the patch introduces its own
>> memsort mechanism, although memsort has too much copied from
>> tuplesort. It looks to me not so difficult to modify the existing
>> tuplesort to guarantee staying in memory always if an option to do so
>> is specified from caller. I think that option can be used by other
>> cases in the core code.
>
> If this patch tries to force the entire sort to happen in memory,
> it is not committable.  What will happen when you get a lot of data?
> You need to be working on a variant that will work anyway, not working
> on an unacceptable lobotomization of the main sort code.

The median function checking a calling context - under window
aggregate uses a just memory sort solution - and under standard
aggregate context it uses a full tuplesort. It bases on  request of
window aggregate function - the final function have to be called more
time - and it isn't possible with tuplesort. So as window aggregate it
uses just memory sort limited with work_mem. Other usage is unlimited.
Second option was a block median function under window aggregates.

It this design possible? We cannot use any complex source under window
aggregates, because there isn't any way to unlink it.

Regards

Pavel
>
>                        regards, tom lane
>

Re: wip: functions median and percentile

От

Hitoshi Harada

Дата:

01 октября 2010 г., 11:11:16

2010/10/1 Tom Lane <tgl@sss.pgh.pa.us>:
> Hitoshi Harada <umi.tanuki@gmail.com> writes:
>> 2010/9/26 Pavel Stehule <pavel.stehule@gmail.com>:
>>> This patch needs a few work - can share a compare functionality with
>>> tuplesort.c, but I would to verify a concept now.
>
>> Sorry for delay. I read the patch and it seems the result is sane. For
>> window function calls, I agree that the current tuplesort is not
>> enough to implement median functions and the patch introduces its own
>> memsort mechanism, although memsort has too much copied from
>> tuplesort. It looks to me not so difficult to modify the existing
>> tuplesort to guarantee staying in memory always if an option to do so
>> is specified from caller. I think that option can be used by other
>> cases in the core code.
>
> If this patch tries to force the entire sort to happen in memory,
> it is not committable.  What will happen when you get a lot of data?
> You need to be working on a variant that will work anyway, not working
> on an unacceptable lobotomization of the main sort code.

What about array_agg()? Doesn't it exceed memory even if the huge data come in?



--
Hitoshi Harada

Re: wip: functions median and percentile

От

Tom Lane

Дата:

01 октября 2010 г., 11:43:56

Hitoshi Harada <umi.tanuki@gmail.com> writes:
> 2010/10/1 Tom Lane <tgl@sss.pgh.pa.us>:
>> If this patch tries to force the entire sort to happen in memory,
>> it is not committable.

> What about array_agg()? Doesn't it exceed memory even if the huge data come in?

Yeah, but for array_agg the user should be expecting a result of
approximately the size of the whole input, so if he overruns memory he
hasn't got a lot of room to complain.  There is no reason for a user to
expect that median or percentile will fall over on large input, and
every reason to expect them to be more robust than that.
        regards, tom lane

Re: wip: functions median and percentile

От

Robert Haas

Дата:

01 октября 2010 г., 11:44:54

On Fri, Oct 1, 2010 at 10:11 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
> 2010/10/1 Tom Lane <tgl@sss.pgh.pa.us>:
>> Hitoshi Harada <umi.tanuki@gmail.com> writes:
>>> 2010/9/26 Pavel Stehule <pavel.stehule@gmail.com>:
>>>> This patch needs a few work - can share a compare functionality with
>>>> tuplesort.c, but I would to verify a concept now.
>>
>>> Sorry for delay. I read the patch and it seems the result is sane. For
>>> window function calls, I agree that the current tuplesort is not
>>> enough to implement median functions and the patch introduces its own
>>> memsort mechanism, although memsort has too much copied from
>>> tuplesort. It looks to me not so difficult to modify the existing
>>> tuplesort to guarantee staying in memory always if an option to do so
>>> is specified from caller. I think that option can be used by other
>>> cases in the core code.
>>
>> If this patch tries to force the entire sort to happen in memory,
>> it is not committable.  What will happen when you get a lot of data?
>> You need to be working on a variant that will work anyway, not working
>> on an unacceptable lobotomization of the main sort code.
>
> What about array_agg()? Doesn't it exceed memory even if the huge data come in?

So, if you have 512MB of RAM in the box and you build and return a 1GB
array, it's going to be a problem.  Period, full stop.  The interim
memory consumption cannot be less than the size of the final result.

If you have 512MB of RAM in the box and you want to aggregate 1GB of
data and return a 4 byte integer, it's only a problem if your
implementation is bad.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: wip: functions median and percentile

От

Hitoshi Harada

Дата:

01 октября 2010 г., 12:00:45

2010/10/1 Tom Lane <tgl@sss.pgh.pa.us>:
> Hitoshi Harada <umi.tanuki@gmail.com> writes:
>> 2010/10/1 Tom Lane <tgl@sss.pgh.pa.us>:
>>> If this patch tries to force the entire sort to happen in memory,
>>> it is not committable.
>
>> What about array_agg()? Doesn't it exceed memory even if the huge data come in?
>
> Yeah, but for array_agg the user should be expecting a result of
> approximately the size of the whole input, so if he overruns memory he
> hasn't got a lot of room to complain.  There is no reason for a user to
> expect that median or percentile will fall over on large input, and
> every reason to expect them to be more robust than that.

So it's particular problem of *median* but not the idea of
on-memory-guaranteed tuplesort.

If this way is not committable, one of alternatives is to implement
median as a window function rather than an aggregate. But the big
problem of this is that it's impossible to have two
same-input-same-name functions of aggregate and window. AFAIK they are
ambiguous at parser stage. So we have to have median() for aggregate
and something like median_w() over (). This is worse idea, I feel.

Another way is to modify nodeWindowAgg in some way, but I cannot wrap
up how to. To call some destructor in the end of partition somehow,
but this is out of the current aggregate system.

The bottom-line is to throw an error from median in window aggregate,
but personally I would like to see median in window aggregate, which
is quite smart.

Another suggestion?

Regards,

--
Hitoshi Harada

Re: wip: functions median and percentile

От

Tom Lane

Дата:

01 октября 2010 г., 12:08:10

Hitoshi Harada <umi.tanuki@gmail.com> writes:
> Another suggestion?

The implementation I would've expected to see is to do the sort and then
have two code paths for retrieving the median, depending on whether the
sort result is all in memory or not.
        regards, tom lane

Re: wip: functions median and percentile

От

"Kevin Grittner"

Дата:

01 октября 2010 г., 12:15:15

Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hitoshi Harada <umi.tanuki@gmail.com> writes:
>> Another suggestion?
> 
> The implementation I would've expected to see is to do the sort
> and then have two code paths for retrieving the median, depending
> on whether the sort result is all in memory or not.
Would it make sense to accumulate value/count pairs in a hash table,
along with a total count, as the tuples are encountered, and sort
the (potentially smaller) hash table at the end?  (Not that this
helps with the memory management questions...)  Large sets with any
significant degree of duplication in values (say the age in years of
residents of a state) would probably run significantly faster this
way.
-Kevin

Re: wip: functions median and percentile

От

Hitoshi Harada

Дата:

01 октября 2010 г., 12:16:12

2010/10/2 Tom Lane <tgl@sss.pgh.pa.us>:
> Hitoshi Harada <umi.tanuki@gmail.com> writes:
>> Another suggestion?
>
> The implementation I would've expected to see is to do the sort and then
> have two code paths for retrieving the median, depending on whether the
> sort result is all in memory or not.
>

Hm? The problem we encountered in the middle of the patch is there is
no chance to call tuplesort_end if median is called in moving frame
window aggregate because final function is called multiple times
during moving. The nearest pattern was array_agg() which uses
on-memory state and throw its responsibility to clear memory to
executor (nodeAgg / nodeWindowAgg). Am I missing something?

Regards,

-- 
Hitoshi Harada

Re: wip: functions median and percentile

От

Tom Lane

Дата:

01 октября 2010 г., 12:22:57

Hitoshi Harada <umi.tanuki@gmail.com> writes:
> 2010/10/2 Tom Lane <tgl@sss.pgh.pa.us>:
>> The implementation I would've expected to see is to do the sort and then
>> have two code paths for retrieving the median, depending on whether the
>> sort result is all in memory or not.

> Hm? The problem we encountered in the middle of the patch is there is
> no chance to call tuplesort_end if median is called in moving frame
> window aggregate because final function is called multiple times
> during moving.

Well, if you haven't got a solution for that, then this patch isn't
ready for prime time.

It's entirely possible that median as a window function is intractable.
I'd rather have it throwing error than offer an implementation that will
fall over as soon as the window gets large.
        regards, tom lane

Re: wip: functions median and percentile

От

Hitoshi Harada

Дата:

01 октября 2010 г., 12:32:13

2010/10/2 Tom Lane <tgl@sss.pgh.pa.us>:
> Hitoshi Harada <umi.tanuki@gmail.com> writes:
>> 2010/10/2 Tom Lane <tgl@sss.pgh.pa.us>:
>>> The implementation I would've expected to see is to do the sort and then
>>> have two code paths for retrieving the median, depending on whether the
>>> sort result is all in memory or not.
>
>> Hm? The problem we encountered in the middle of the patch is there is
>> no chance to call tuplesort_end if median is called in moving frame
>> window aggregate because final function is called multiple times
>> during moving.
>
> Well, if you haven't got a solution for that, then this patch isn't
> ready for prime time.
>
> It's entirely possible that median as a window function is intractable.
> I'd rather have it throwing error than offer an implementation that will
> fall over as soon as the window gets large.

Well, that sounds like the conclusion. It is a shame, but we have to
throw an error from median() in the window aggregate, if Pavel does
not have any better solution. And as an aggregate function only, the
patch is ready if the window-related parts are removed.

Regards,


-- 
Hitoshi Harada

Re: wip: functions median and percentile

От

Hitoshi Harada

Дата:

01 октября 2010 г., 12:35:12

2010/10/2 Kevin Grittner <Kevin.Grittner@wicourts.gov>:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hitoshi Harada <umi.tanuki@gmail.com> writes:
>>> Another suggestion?
>>
>> The implementation I would've expected to see is to do the sort
>> and then have two code paths for retrieving the median, depending
>> on whether the sort result is all in memory or not.
>
> Would it make sense to accumulate value/count pairs in a hash table,

Maybe, but it still has memory problem if the values vary, right? And
I'm not familiar with the algorithm of median other than what the
current patch does, so I'm not sure if hash table solution can be made
easily.

Regards,

-- 
Hitoshi Harada

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

01 октября 2010 г., 14:37:52

2010/10/1 Hitoshi Harada <umi.tanuki@gmail.com>:
> 2010/10/2 Tom Lane <tgl@sss.pgh.pa.us>:
>> Hitoshi Harada <umi.tanuki@gmail.com> writes:
>>> 2010/10/2 Tom Lane <tgl@sss.pgh.pa.us>:
>>>> The implementation I would've expected to see is to do the sort and then
>>>> have two code paths for retrieving the median, depending on whether the
>>>> sort result is all in memory or not.
>>
>>> Hm? The problem we encountered in the middle of the patch is there is
>>> no chance to call tuplesort_end if median is called in moving frame
>>> window aggregate because final function is called multiple times
>>> during moving.
>>
>> Well, if you haven't got a solution for that, then this patch isn't
>> ready for prime time.
>>
>> It's entirely possible that median as a window function is intractable.
>> I'd rather have it throwing error than offer an implementation that will
>> fall over as soon as the window gets large.
>
> Well, that sounds like the conclusion. It is a shame, but we have to
> throw an error from median() in the window aggregate, if Pavel does
> not have any better solution. And as an aggregate function only, the
> patch is ready if the window-related parts are removed.
>

I am sorry - I don't have a better solution. Classic algorithm isn't
well for window aggregate - it needs a sort after any append a new
item. Maybe we can use a separate functionality based on estimated
values for a windows. I read some articles about it. But this is work
on longer time - all articles about this topic are experimental. More
I am not mathematician - so I am not able to review these methods.
Today or tomorrow I'll send a updated patch without support a window
aggregates.

Regards

Pavel Stehule

> Regards,
>
>
> --
> Hitoshi Harada
>

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

01 октября 2010 г., 16:17:59

Hello

updated version
  * memsort removed
  * window aggregate support blocked

Regards

Pavel

2010/10/1 Pavel Stehule <pavel.stehule@gmail.com>:
> 2010/10/1 Hitoshi Harada <umi.tanuki@gmail.com>:
>> 2010/10/2 Tom Lane <tgl@sss.pgh.pa.us>:
>>> Hitoshi Harada <umi.tanuki@gmail.com> writes:
>>>> 2010/10/2 Tom Lane <tgl@sss.pgh.pa.us>:
>>>>> The implementation I would've expected to see is to do the sort and then
>>>>> have two code paths for retrieving the median, depending on whether the
>>>>> sort result is all in memory or not.
>>>
>>>> Hm? The problem we encountered in the middle of the patch is there is
>>>> no chance to call tuplesort_end if median is called in moving frame
>>>> window aggregate because final function is called multiple times
>>>> during moving.
>>>
>>> Well, if you haven't got a solution for that, then this patch isn't
>>> ready for prime time.
>>>
>>> It's entirely possible that median as a window function is intractable.
>>> I'd rather have it throwing error than offer an implementation that will
>>> fall over as soon as the window gets large.
>>
>> Well, that sounds like the conclusion. It is a shame, but we have to
>> throw an error from median() in the window aggregate, if Pavel does
>> not have any better solution. And as an aggregate function only, the
>> patch is ready if the window-related parts are removed.
>>
>
> I am sorry - I don't have a better solution. Classic algorithm isn't
> well for window aggregate - it needs a sort after any append a new
> item. Maybe we can use a separate functionality based on estimated
> values for a windows. I read some articles about it. But this is work
> on longer time - all articles about this topic are experimental. More
> I am not mathematician - so I am not able to review these methods.
> Today or tomorrow I'll send a updated patch without support a window
> aggregates.
>
> Regards
>
> Pavel Stehule
>
>> Regards,
>>
>>
>> --
>> Hitoshi Harada
>>
>

Вложения

median.diff

Re: wip: functions median and percentile

От

Hitoshi Harada

Дата:

03 октября 2010 г., 11:06:34

2010/10/2 Pavel Stehule <pavel.stehule@gmail.com>:
> Hello
>
> updated version
>  * memsort removed
>  * window aggregate support blocked

I ran this patch and it looks good for committer.
Just one thing, in src/backend/utils/adt/Makefile median.o to compile
the new file is missing.

And I'm now thinking about how to make median happen in window
aggregate. In fact we might fix parse_func.c to distinguish the same
name aggregate and window functions and median() can be implemented
separately from aggregate one. But it's another story than this patch.
For now we can commit the median aggregate only, without window
function support.

Regards,


--
Hitoshi Harada

Re: wip: functions median and percentile

От

Greg Stark

Дата:

03 октября 2010 г., 12:50:08

On Sun, Oct 3, 2010 at 7:06 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
> And I'm now thinking about how to make median happen in window
> aggregate.

If you were to do this by extending tuplesort what extra features
would tuplesort need?

Would tuplesort need the ability to insert additional records into an
already sorted array and maintain the sort?
Would it need the ability to remove records from the sorted set?
Would it need the ability to do a partial sort (the QuickSelect algorithm)?
The ability to do random access on disk sets?

How do existing windowed aggregates work if you specify an order by on
the aggregate? Do they resort for every output row?

Does the spec give a way to run an arbitrary subselect on the current
window? I wonder if we need more powerful machinery anyways to handle
these cases.

-- 
greg

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

03 октября 2010 г., 14:32:07

Hello

2010/10/3 Hitoshi Harada <umi.tanuki@gmail.com>:
> 2010/10/2 Pavel Stehule <pavel.stehule@gmail.com>:
>> Hello
>>
>> updated version
>>  * memsort removed
>>  * window aggregate support blocked
>
> I ran this patch and it looks good for committer.
> Just one thing, in src/backend/utils/adt/Makefile median.o to compile
> the new file is missing.
>
> And I'm now thinking about how to make median happen in window
> aggregate. In fact we might fix parse_func.c to distinguish the same
> name aggregate and window functions and median() can be implemented
> separately from aggregate one. But it's another story than this patch.
> For now we can commit the median aggregate only, without window
> function support.
>
> Regards,
>

Thank you very much for review

fixed patch + missing Makefile

Regards

Pavel Stehule
>
> --
> Hitoshi Harada
>

Вложения

median.diff

Re: wip: functions median and percentile

От

Hitoshi Harada

Дата:

04 октября 2010 г., 03:36:57

2010/10/4 Greg Stark <gsstark@mit.edu>:
> On Sun, Oct 3, 2010 at 7:06 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
>> And I'm now thinking about how to make median happen in window
>> aggregate.
>
> If you were to do this by extending tuplesort what extra features
> would tuplesort need?

I don't think we need to extend tuplesort when we do it. IMO this can
make happen by fetching all the values in the frame and performing
sort once, then getting current position and calculate median. The
only thing I consider is to mark and to go to the demanded position in
tuplesort like tuplestore, but actually we can manage to median()
without such facilities.

> How do existing windowed aggregates work if you specify an order by on
> the aggregate? Do they resort for every output row?

I don't know the true specification of median() nor the behavior of it
in other RDBs, but I bet median is median and ORDER BY in window
clause doesn't affect its result. ORDER BY specifies only the frame
boundary.

Regards,

-- 
Hitoshi Harada

Re: wip: functions median and percentile

От

Dean Rasheed

Дата:

04 октября 2010 г., 03:58:55

On 4 October 2010 07:36, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
> 2010/10/4 Greg Stark <gsstark@mit.edu>:
>> On Sun, Oct 3, 2010 at 7:06 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
>>> And I'm now thinking about how to make median happen in window
>>> aggregate.
>>
>> If you were to do this by extending tuplesort what extra features
>> would tuplesort need?
>
> I don't think we need to extend tuplesort when we do it. IMO this can
> make happen by fetching all the values in the frame and performing
> sort once, then getting current position and calculate median. The
> only thing I consider is to mark and to go to the demanded position in
> tuplesort like tuplestore, but actually we can manage to median()
> without such facilities.
>

I dont' think that works, because the ORDER BY in the WINDOW doesn't
necessarily match the sort order that the median function needs.
Consider for example:

select a, median(a) over(order by a desc) from generate_series(1,10) as g(a);a  |       median
----+--------------------10 |                 10 9 | 9.5000000000000000 8 |                  9 7 | 8.5000000000000000 6
|                 8 5 | 7.5000000000000000 4 |                  7 3 | 6.5000000000000000 2 |                  6 1 |
5.5000000000000000
(10 rows)

That requires a new sort for each row. I generated this with a minor
tweak to Pavel's patch to just restart the tuplesort each time (the
"quick-fix" solution). The problem is that performance really sucks,
because it is an O(n^2 log(n)) algorithm. I don't see an easy way
around that without significant new infrastructure, as Greg describes,
or a completely different algorithm.


>> How do existing windowed aggregates work if you specify an order by on
>> the aggregate? Do they resort for every output row?
>

Isn't the issue that they don't need to sort, so they all work unchanged.

Regards,
Dean

> I don't know the true specification of median() nor the behavior of it
> in other RDBs, but I bet median is median and ORDER BY in window
> clause doesn't affect its result. ORDER BY specifies only the frame
> boundary.
>
> Regards,
>
>
>
> --
> Hitoshi Harada
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

Re: wip: functions median and percentile

От

Greg Stark

Дата:

04 октября 2010 г., 13:14:41

On Sun, Oct 3, 2010 at 11:58 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> The problem is that performance really sucks,
> because it is an O(n^2 log(n)) algorithm. I don't see an easy way
> around that without significant new infrastructure, as Greg describes,
> or a completely different algorithm.

Resorting for each record would be O(n^2 log(n)). If you use
Quickselect it would be O(n^2) because each selection would be O(n).
But then we could get O(n^2) by just doing insertion-sort. The problem
with both these algorithms is that I don't see how to do it on-disk.
Maybe there would be some way to do insertion-sort on disk by keeping
a buffer in-memory holding the last n records inserted and whenever
the in-memory buffer fills do a merge against the data on disk to
spill it. But that's a lot of special-case machinery. Personally I
think if we need it to handle sorted aggregates over window functions
it's  worth it to carry it if someone feels like writing it.

-- 
greg

Re: wip: functions median and percentile

От

Robert Haas

Дата:

04 октября 2010 г., 14:22:57

On Mon, Oct 4, 2010 at 2:58 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> That requires a new sort for each row. I generated this with a minor
> tweak to Pavel's patch to just restart the tuplesort each time (the
> "quick-fix" solution). The problem is that performance really sucks,
> because it is an O(n^2 log(n)) algorithm.

Maybe that's OK.  If you're doing repeated median operations on large
data sets, perhaps you should expect that to be slow.  I bet that
people who want to use this as a window function will want one median
per group, not n medians per group; and it doesn't seem like a good
idea to say - we're not going to let you use this as a window function
AT ALL because you might decide to do something that will be really
slow.  You can always hit ^C if you get tired of waiting.  This seems
like it's very far from being the most important thing for us to
optimize, though of course it's great if we can.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: wip: functions median and percentile

От

Dean Rasheed

Дата:

04 октября 2010 г., 16:20:37

On 4 October 2010 18:22, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Oct 4, 2010 at 2:58 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>> That requires a new sort for each row. I generated this with a minor
>> tweak to Pavel's patch to just restart the tuplesort each time (the
>> "quick-fix" solution). The problem is that performance really sucks,
>> because it is an O(n^2 log(n)) algorithm.
>
> Maybe that's OK.  If you're doing repeated median operations on large
> data sets, perhaps you should expect that to be slow.  I bet that
> people who want to use this as a window function will want one median
> per group, not n medians per group; and it doesn't seem like a good
> idea to say - we're not going to let you use this as a window function
> AT ALL because you might decide to do something that will be really
> slow.  You can always hit ^C if you get tired of waiting.  This seems
> like it's very far from being the most important thing for us to
> optimize, though of course it's great if we can.
>

Well FWIW, here's the quick-fix solution to make this median function
support window queries.

It's OK if all you want is one median per partition, but otherwise
it's pretty ugly. It will copy the entire tuplesort for each row in a
running median, which is horrible, but that's actually negligible
compared to the subsequent sort that's going to happen for that row.

I'd actually be tempted to force the tuplesort to stay in memory for a
running median, on the grounds that performance will be so bad for a
large partition that you'd run out of patience waiting for it long
before it would run out of memory :-(


Regards,
Dean


> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>

Вложения

median2.diff

Re: wip: functions median and percentile

От

Hitoshi Harada

Дата:

05 октября 2010 г., 03:04:40

2010/10/5 Dean Rasheed <dean.a.rasheed@gmail.com>:
> On 4 October 2010 18:22, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Oct 4, 2010 at 2:58 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>>> That requires a new sort for each row. I generated this with a minor
>>> tweak to Pavel's patch to just restart the tuplesort each time (the
>>> "quick-fix" solution). The problem is that performance really sucks,
>>> because it is an O(n^2 log(n)) algorithm.
>>
>> Maybe that's OK.  If you're doing repeated median operations on large
>> data sets, perhaps you should expect that to be slow.  I bet that
>> people who want to use this as a window function will want one median
>> per group, not n medians per group; and it doesn't seem like a good
>> idea to say - we're not going to let you use this as a window function
>> AT ALL because you might decide to do something that will be really
>> slow.  You can always hit ^C if you get tired of waiting.  This seems
>> like it's very far from being the most important thing for us to
>> optimize, though of course it's great if we can.
>>
>
> Well FWIW, here's the quick-fix solution to make this median function
> support window queries.

Is this safe? It seems to me that the tuplesort_end isn't called in
window aggregates at the last of a partition, which results in
forgetting to close temp file if tuplesort is in state of tape.

Regards,


--
Hitoshi Harada

Re: wip: functions median and percentile

От

Dean Rasheed

Дата:

05 октября 2010 г., 05:48:01

On 5 October 2010 07:04, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
> 2010/10/5 Dean Rasheed <dean.a.rasheed@gmail.com>:
>> On 4 October 2010 18:22, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Mon, Oct 4, 2010 at 2:58 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>>>> That requires a new sort for each row. I generated this with a minor
>>>> tweak to Pavel's patch to just restart the tuplesort each time (the
>>>> "quick-fix" solution). The problem is that performance really sucks,
>>>> because it is an O(n^2 log(n)) algorithm.
>>>
>>> Maybe that's OK.  If you're doing repeated median operations on large
>>> data sets, perhaps you should expect that to be slow.  I bet that
>>> people who want to use this as a window function will want one median
>>> per group, not n medians per group; and it doesn't seem like a good
>>> idea to say - we're not going to let you use this as a window function
>>> AT ALL because you might decide to do something that will be really
>>> slow.  You can always hit ^C if you get tired of waiting.  This seems
>>> like it's very far from being the most important thing for us to
>>> optimize, though of course it's great if we can.
>>>
>>
>> Well FWIW, here's the quick-fix solution to make this median function
>> support window queries.
>
> Is this safe? It seems to me that the tuplesort_end isn't called in
> window aggregates at the last of a partition, which results in
> forgetting to close temp file if tuplesort is in state of tape.
>

Yeah, you're right. Actually I hate doing it this way, and only really
suggested it because I like it even less to add an aggregate that
arbitrarily doesn't support window queries. This approach will at
least work well for arbitrary sized partitions without an ORDER BY
clause.

With an ORDER BY clause though, even in the best-case situation
(values in the partition already sorted), this is going to be O(n^2)
for a running median, but it seems like it would be significantly more
work to come up with a better algorithm, and I'm not sure that there
is sufficient demand for this function to justify that.

Extrapolating from few quick timing tests, even in the best case, on
my machine, it would take 7 days for the running median to use up
100MB, and 8 years for it to use 2GB. So setting the tuplesort's
workMem to 2GB (only in the running median case) would actually be
safe in practice, and would prevent the temp file leak (for a few
years at least!). I feel dirty even suggesting that. Better ideas
anyone?

Regards,
Dean

Re: wip: functions median and percentile

От

Hitoshi Harada

Дата:

05 октября 2010 г., 09:14:33

2010/10/5 Dean Rasheed <dean.a.rasheed@gmail.com>:
> On 5 October 2010 07:04, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
> Extrapolating from few quick timing tests, even in the best case, on
> my machine, it would take 7 days for the running median to use up
> 100MB, and 8 years for it to use 2GB. So setting the tuplesort's
> workMem to 2GB (only in the running median case) would actually be
> safe in practice, and would prevent the temp file leak (for a few
> years at least!). I feel dirty even suggesting that. Better ideas
> anyone?

So, I suggested to implement median as a *pure* window function aside
from Pavel's aggregate function, and Greg suggested insertion
capability of tuplesort. By this approach, we keep tuplesort to hold
all the values in the current frame and can release it on the last of
a partition (it's possible by window function API.) This is
incremental addition of values and is far better than O(n^2 log(n))
although I didn't estimate the order. Only when the frame head is
moving down, we should re-initialize tuplesort and it is as slow as
calling aggregate version per each row (but I think we can solve it
somehow if looking precisely).

Regards,

-- 
Hitoshi Harada

Re: wip: functions median and percentile

От

Dean Rasheed

Дата:

05 октября 2010 г., 10:09:09

On 5 October 2010 13:14, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
> 2010/10/5 Dean Rasheed <dean.a.rasheed@gmail.com>:
>> On 5 October 2010 07:04, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
>> Extrapolating from few quick timing tests, even in the best case, on
>> my machine, it would take 7 days for the running median to use up
>> 100MB, and 8 years for it to use 2GB. So setting the tuplesort's
>> workMem to 2GB (only in the running median case) would actually be
>> safe in practice, and would prevent the temp file leak (for a few
>> years at least!). I feel dirty even suggesting that. Better ideas
>> anyone?
>
> So, I suggested to implement median as a *pure* window function aside
> from Pavel's aggregate function, and Greg suggested insertion
> capability of tuplesort. By this approach, we keep tuplesort to hold
> all the values in the current frame and can release it on the last of
> a partition (it's possible by window function API.) This is
> incremental addition of values and is far better than O(n^2 log(n))
> although I didn't estimate the order. Only when the frame head is
> moving down, we should re-initialize tuplesort and it is as slow as
> calling aggregate version per each row (but I think we can solve it
> somehow if looking precisely).
>

Possibly, but that sounds like a lot of work to get an efficient
algorithm. The 3 cases I see are:

1). Simple aggregate. Current algorithm is O(n log(n)) which is OK. It
could be better because a full sort is not strictly needed. As already
mentioned, a quickselect would be O(n).

2). Window without ORDER BY. This is actually basically the same as
(1), but called once per partition.

3). Window with ORDER BY (running median). The simplest algorithm is
O(n^2 log(n)). It could be tweaked to use an insertion sort, but that
would still be O(n^2), which is not a lot better for all the effort
that would be involved. In theory (perhaps with some kind of tree) it
ought to be possible to come up with an O(n log(n)) algorithm, but
that would be a lot of work.

In the meantime, the attached variation of the patch fixes the temp
file issue and will support all 3 cases. It gives OK performance for
(1) and (2), and poor performance for (3). That could be viewed as a
future development task, which perhaps the window function API would
help with. I think it would be a shame to drop support for (2) just
because we can't do (3) efficiently yet.

Regards,
Dean

> Regards,
>
> --
> Hitoshi Harada
>

Вложения

median3.diff

Re: wip: functions median and percentile

От

Tom Lane

Дата:

10 октября 2010 г., 18:17:22

Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> In the meantime, the attached variation of the patch fixes the temp
> file issue and will support all 3 cases. It gives OK performance for
> (1) and (2), and poor performance for (3). That could be viewed as a
> future development task, which perhaps the window function API would
> help with. I think it would be a shame to drop support for (2) just
> because we can't do (3) efficiently yet.

I started looking at this patch, and noticed that we got so caught up
in implementation issues that we forgot the unresolved problem of data
types.  The original patch defined median(anyelement), which is only
well-defined for an odd number of inputs; for an even number of inputs
you have to take the left or right item in the central pair.  I see
that this version definesmedian(float8) returns float8median(float4) returns float8median(numeric) returns
numericmedian(int8)returns numericmedian(int4) returns numericmedian(int2) returns numeric

which strikes me as possibly being overkill.

It was pointed out upthread that while median isn't presently
in the standard, Oracle defines it in terms of percentile_cont(0.5)
which *is* in the standard.  What I read in SQL:2008 is that
percentile_cont is defined for all numeric types (returning
approximate numeric with implementation-defined precision),
and for interval (returning interval), and not for any other
input type.  So it appears to me that what we ought to support
ismedian(float8) returns float8median(interval) returns interval
and nothing else --- we can rely on implicit casting to convert
any other numeric input type to float8.

BTW, as far as the implementation issues go, telling tuplesort that it
can use gigabytes of memory no matter what seems quite unacceptable.
Put this thing into a hash aggregation and you'll blow out your memory
in no time.  I don't think it's even a good idea to use work_mem there.
I wonder whether it'd be a good idea to augment AggCheckCallContext()
so that there's a way for aggregates to find out how much memory they
ought to try to use.  In a simple aggregation situation it's probably
OK to use work_mem, but in a hash aggregation you'd better use less
--- perhaps work_mem divided by the number of groups expected.

Also, I believe that the lack-of-cleanup problem for tuplesorts spilling
to disk should be fixable by using an exprcontext shutdown callback (see
RegisterExprContextCallback).

Comments?
        regards, tom lane

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

11 октября 2010 г., 03:59:43

2010/10/10 Tom Lane <tgl@sss.pgh.pa.us>:
> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
>> In the meantime, the attached variation of the patch fixes the temp
>> file issue and will support all 3 cases. It gives OK performance for
>> (1) and (2), and poor performance for (3). That could be viewed as a
>> future development task, which perhaps the window function API would
>> help with. I think it would be a shame to drop support for (2) just
>> because we can't do (3) efficiently yet.
>
> I started looking at this patch, and noticed that we got so caught up
> in implementation issues that we forgot the unresolved problem of data
> types.  The original patch defined median(anyelement), which is only
> well-defined for an odd number of inputs; for an even number of inputs
> you have to take the left or right item in the central pair.  I see
> that this version defines
>        median(float8) returns float8
>        median(float4) returns float8
>        median(numeric) returns numeric
>        median(int8) returns numeric
>        median(int4) returns numeric
>        median(int2) returns numeric
> which strikes me as possibly being overkill.
>
> It was pointed out upthread that while median isn't presently
> in the standard, Oracle defines it in terms of percentile_cont(0.5)
> which *is* in the standard.  What I read in SQL:2008 is that
> percentile_cont is defined for all numeric types (returning
> approximate numeric with implementation-defined precision),
> and for interval (returning interval), and not for any other
> input type.  So it appears to me that what we ought to support
> is
>        median(float8) returns float8
>        median(interval) returns interval
> and nothing else --- we can rely on implicit casting to convert
> any other numeric input type to float8.

+1

>
> BTW, as far as the implementation issues go, telling tuplesort that it
> can use gigabytes of memory no matter what seems quite unacceptable.
> Put this thing into a hash aggregation and you'll blow out your memory
> in no time.  I don't think it's even a good idea to use work_mem there.
> I wonder whether it'd be a good idea to augment AggCheckCallContext()
> so that there's a way for aggregates to find out how much memory they
> ought to try to use.  In a simple aggregation situation it's probably
> OK to use work_mem, but in a hash aggregation you'd better use less
> --- perhaps work_mem divided by the number of groups expected.
>
> Also, I believe that the lack-of-cleanup problem for tuplesorts spilling
> to disk should be fixable by using an exprcontext shutdown callback (see
> RegisterExprContextCallback).

this issue is only for window aggregates, and this way is block - the
problem with tuplestore is not only wit cleanup, but more in
performance. But using a MemoryContext shutdown callback is good idea.

Regards

Pavel Stehule

>
> Comments?
>
>                        regards, tom lane
>

Re: wip: functions median and percentile

От

Dean Rasheed

Дата:

11 октября 2010 г., 06:35:26

On 10 October 2010 22:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> It was pointed out upthread that while median isn't presently
> in the standard, Oracle defines it in terms of percentile_cont(0.5)
> which *is* in the standard.  What I read in SQL:2008 is that
> percentile_cont is defined for all numeric types (returning
> approximate numeric with implementation-defined precision),
> and for interval (returning interval), and not for any other
> input type.  So it appears to me that what we ought to support
> is
>        median(float8) returns float8
>        median(interval) returns interval
> and nothing else --- we can rely on implicit casting to convert
> any other numeric input type to float8.
>

Yeah that would be much simpler.

BTW, why has percentile been removed from this patch? As the more
general, and SQL standard function, that would seem to be the more
useful one to include. Upthread it was mentioned that there is already
an ntile window function, but actually that's a completely different
thing.

> BTW, as far as the implementation issues go, telling tuplesort that it
> can use gigabytes of memory no matter what seems quite unacceptable.
> Put this thing into a hash aggregation and you'll blow out your memory
> in no time.  I don't think it's even a good idea to use work_mem there.

Argh! Yes that sounds like a much more serious problem.

Interestingly I couldn't seem to produce this effect. Every effort I
make to write a query to test this with median ends up being executed
using a GroupAggregate, while the equivalent query with avg uses a
HashAggregate. I don't understand why they are being treated
differently.

> I wonder whether it'd be a good idea to augment AggCheckCallContext()
> so that there's a way for aggregates to find out how much memory they
> ought to try to use.  In a simple aggregation situation it's probably
> OK to use work_mem, but in a hash aggregation you'd better use less
> --- perhaps work_mem divided by the number of groups expected.

Wouldn't that risk not allowing any memory at all the to aggregate in
some cases? I don't have a better idea mind you, short of somehow not
allowing hash aggregation for this function.

> Also, I believe that the lack-of-cleanup problem for tuplesorts spilling
> to disk should be fixable by using an exprcontext shutdown callback (see
> RegisterExprContextCallback).

Ah! I wasn't aware of such a callback. Sounds perfect for the job.

Regards,
Dean

>
> Comments?
>
>                        regards, tom lane
>

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

11 октября 2010 г., 06:56:05

2010/10/11 Dean Rasheed <dean.a.rasheed@gmail.com>:
> On 10 October 2010 22:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It was pointed out upthread that while median isn't presently
>> in the standard, Oracle defines it in terms of percentile_cont(0.5)
>> which *is* in the standard.  What I read in SQL:2008 is that
>> percentile_cont is defined for all numeric types (returning
>> approximate numeric with implementation-defined precision),
>> and for interval (returning interval), and not for any other
>> input type.  So it appears to me that what we ought to support
>> is
>>        median(float8) returns float8
>>        median(interval) returns interval
>> and nothing else --- we can rely on implicit casting to convert
>> any other numeric input type to float8.
>>
>
> Yeah that would be much simpler.
>
> BTW, why has percentile been removed from this patch? As the more
> general, and SQL standard function, that would seem to be the more
> useful one to include. Upthread it was mentioned that there is already
> an ntile window function, but actually that's a completely different
> thing.

The reason for removing was impossibility to specify so some parameter
must by immutable - in this case p parameter should be immutable
otherwise the result is undefined.

Regards

Pavel Stehule


>
>
>> BTW, as far as the implementation issues go, telling tuplesort that it
>> can use gigabytes of memory no matter what seems quite unacceptable.
>> Put this thing into a hash aggregation and you'll blow out your memory
>> in no time.  I don't think it's even a good idea to use work_mem there.
>
> Argh! Yes that sounds like a much more serious problem.
>
> Interestingly I couldn't seem to produce this effect. Every effort I
> make to write a query to test this with median ends up being executed
> using a GroupAggregate, while the equivalent query with avg uses a
> HashAggregate. I don't understand why they are being treated
> differently.
>
>
>> I wonder whether it'd be a good idea to augment AggCheckCallContext()
>> so that there's a way for aggregates to find out how much memory they
>> ought to try to use.  In a simple aggregation situation it's probably
>> OK to use work_mem, but in a hash aggregation you'd better use less
>> --- perhaps work_mem divided by the number of groups expected.
>
> Wouldn't that risk not allowing any memory at all the to aggregate in
> some cases? I don't have a better idea mind you, short of somehow not
> allowing hash aggregation for this function.
>
>
>> Also, I believe that the lack-of-cleanup problem for tuplesorts spilling
>> to disk should be fixable by using an exprcontext shutdown callback (see
>> RegisterExprContextCallback).
>
> Ah! I wasn't aware of such a callback. Sounds perfect for the job.
>
> Regards,
> Dean
>
>
>>
>> Comments?
>>
>>                        regards, tom lane
>>
>

Re: wip: functions median and percentile

От

Dean Rasheed

Дата:

11 октября 2010 г., 07:33:19

On 11 October 2010 10:55, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> BTW, why has percentile been removed from this patch? As the more
>> general, and SQL standard function, that would seem to be the more
>> useful one to include. Upthread it was mentioned that there is already
>> an ntile window function, but actually that's a completely different
>> thing.
>
> The reason for removing was impossibility to specify so some parameter
> must by immutable - in this case p parameter should be immutable
> otherwise the result is undefined.
>

Could we not just make an arbitrary choice, like the last non-null
value, and then document that. I can't believe that this would ever be
an issue in practice.

I don't think there is any way to deduce the following from behaviour
from the documentation:

select string_agg(i::text, repeat(',',i)) from generate_series(1,10) as g(i);                           string_agg

-------------------------------------------------------------------1,,2,,,3,,,,4,,,,,5,,,,,,6,,,,,,,7,,,,,,,,8,,,,,,,,,9,,,,,,,,,,10
(1 row)

but I don't see it as a real problem for the aggregate.

Thoughts?

Regards,
Dean

Re: wip: functions median and percentile

От

Robert Haas

Дата:

11 октября 2010 г., 07:33:21

On Sun, Oct 10, 2010 at 5:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> It was pointed out upthread that while median isn't presently
> in the standard, Oracle defines it in terms of percentile_cont(0.5)
> which *is* in the standard.  What I read in SQL:2008 is that
> percentile_cont is defined for all numeric types (returning
> approximate numeric with implementation-defined precision),
> and for interval (returning interval), and not for any other
> input type.  So it appears to me that what we ought to support
> is
>        median(float8) returns float8
>        median(interval) returns interval
> and nothing else --- we can rely on implicit casting to convert
> any other numeric input type to float8.

Isn't there a possibility of a precision loss if numeric gets cast to
float8?  Should we include an explicit variant from numeric?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: wip: functions median and percentile

От

Tom Lane

Дата:

11 октября 2010 г., 11:03:52

Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> On 10 October 2010 22:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> BTW, as far as the implementation issues go, telling tuplesort that it
>> can use gigabytes of memory no matter what seems quite unacceptable.
>> Put this thing into a hash aggregation and you'll blow out your memory
>> in no time. �I don't think it's even a good idea to use work_mem there.

> Argh! Yes that sounds like a much more serious problem.

> Interestingly I couldn't seem to produce this effect. Every effort I
> make to write a query to test this with median ends up being executed
> using a GroupAggregate, while the equivalent query with avg uses a
> HashAggregate. I don't understand why they are being treated
> differently.

If you're using recent sources, there's some code in count_agg_clauses()
that assumes that an aggregate with transtype INTERNAL will use
ALLOCSET_DEFAULT_INITSIZE (ie 8K) workspace.  So that'll discourage the
planner from selecting HashAggregate except for a pretty small number of
groups.  The problem is that there's still a whole lot of daylight
between 8K and 2G, so plenty of room to go wrong.

The other approach that we could take here is to replace the
ALLOCSET_DEFAULT_INITSIZE hack (which is certainly no more than a hack)
with some way for an aggregate to declare how much space it'll eat,
or more simply to mark it as "never use in HashAgg".  This was discussed
earlier but it would require a significant amount of dogwork and no one
was real excited about doing it.  Maybe it's time to bite that bullet
though.  Reflecting on it, I think it'd be best to allow an agg to
provide an estimation function that'd be told the input data type and
expected number of rows --- even on a per-aggregate basis, a constant
estimate just isn't good enough.
        regards, tom lane

Re: wip: functions median and percentile

От

Tom Lane

Дата:

11 октября 2010 г., 11:09:03

Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, Oct 10, 2010 at 5:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It was pointed out upthread that while median isn't presently
>> in the standard, Oracle defines it in terms of percentile_cont(0.5)
>> which *is* in the standard. �What I read in SQL:2008 is that
>> percentile_cont is defined for all numeric types (returning
>> approximate numeric with implementation-defined precision),
>> and for interval (returning interval), and not for any other
>> input type. �So it appears to me that what we ought to support
>> is
>> � � � �median(float8) returns float8
>> � � � �median(interval) returns interval
>> and nothing else --- we can rely on implicit casting to convert
>> any other numeric input type to float8.

> Isn't there a possibility of a precision loss if numeric gets cast to
> float8?

So?  The standard says "implementation-defined precision".  We can
define it as giving results that are good to float8.  I find it hard to
imagine an application for median() where that's not good enough;
and what's more, the difference in comparison speed between float8 and
numeric would render median() on numeric pretty useless anyway.
        regards, tom lane

Re: wip: functions median and percentile

От

Robert Haas

Дата:

11 октября 2010 г., 11:29:48

On Mon, Oct 11, 2010 at 10:08 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sun, Oct 10, 2010 at 5:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> It was pointed out upthread that while median isn't presently
>>> in the standard, Oracle defines it in terms of percentile_cont(0.5)
>>> which *is* in the standard.  What I read in SQL:2008 is that
>>> percentile_cont is defined for all numeric types (returning
>>> approximate numeric with implementation-defined precision),
>>> and for interval (returning interval), and not for any other
>>> input type.  So it appears to me that what we ought to support
>>> is
>>>        median(float8) returns float8
>>>        median(interval) returns interval
>>> and nothing else --- we can rely on implicit casting to convert
>>> any other numeric input type to float8.
>
>> Isn't there a possibility of a precision loss if numeric gets cast to
>> float8?
>
> So?  The standard says "implementation-defined precision".  We can
> define it as giving results that are good to float8.  I find it hard to
> imagine an application for median() where that's not good enough;
> and what's more, the difference in comparison speed between float8 and
> numeric would render median() on numeric pretty useless anyway.

I suppose for most applications it won't matter; I just hate losing
precision, and have a deep skepticism of floats, as we've discussed
previously.  :-)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: wip: functions median and percentile

От

Dean Rasheed

Дата:

11 октября 2010 г., 12:31:13

On 11 October 2010 15:03, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
>> On 10 October 2010 22:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> BTW, as far as the implementation issues go, telling tuplesort that it
>>> can use gigabytes of memory no matter what seems quite unacceptable.
>>> Put this thing into a hash aggregation and you'll blow out your memory
>>> in no time.  I don't think it's even a good idea to use work_mem there.
>
>> Argh! Yes that sounds like a much more serious problem.
>
>> Interestingly I couldn't seem to produce this effect. Every effort I
>> make to write a query to test this with median ends up being executed
>> using a GroupAggregate, while the equivalent query with avg uses a
>> HashAggregate. I don't understand why they are being treated
>> differently.
>
> If you're using recent sources, there's some code in count_agg_clauses()
> that assumes that an aggregate with transtype INTERNAL will use
> ALLOCSET_DEFAULT_INITSIZE (ie 8K) workspace.  So that'll discourage the
> planner from selecting HashAggregate except for a pretty small number of
> groups.  The problem is that there's still a whole lot of daylight
> between 8K and 2G, so plenty of room to go wrong.
>
> The other approach that we could take here is to replace the
> ALLOCSET_DEFAULT_INITSIZE hack (which is certainly no more than a hack)
> with some way for an aggregate to declare how much space it'll eat,
> or more simply to mark it as "never use in HashAgg".  This was discussed
> earlier but it would require a significant amount of dogwork and no one
> was real excited about doing it.  Maybe it's time to bite that bullet
> though.  Reflecting on it, I think it'd be best to allow an agg to
> provide an estimation function that'd be told the input data type and
> expected number of rows --- even on a per-aggregate basis, a constant
> estimate just isn't good enough.

How good will that estimate of the number of rows be though? If
they're coming from a SRF it could be a huge under-estimate, and you'd
still risk eating all the memory, if you allowed a hash aggregate.

Regards,
Dean


>
>                        regards, tom lane
>

Re: wip: functions median and percentile

От

Tom Lane

Дата:

11 октября 2010 г., 12:44:18

Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> On 11 October 2010 15:03, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Reflecting on it, I think it'd be best to allow an agg to
>> provide an estimation function that'd be told the input data type and
>> expected number of rows --- even on a per-aggregate basis, a constant
>> estimate just isn't good enough.

> How good will that estimate of the number of rows be though?

It can't possibly be any worse than a hard-wired constant ;-)

> If they're coming from a SRF it could be a huge under-estimate, and you'd
> still risk eating all the memory, if you allowed a hash aggregate.

If, for a particular aggregate, you're too chicken to ever allow hash
aggregation, you could just return a very large number from the
estimation hook function.  I doubt that's a very useful behavior in the
majority of cases though.
        regards, tom lane

Re: wip: functions median and percentile

От

Dean Rasheed

Дата:

11 октября 2010 г., 14:34:59

On 11 October 2010 16:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
>> On 11 October 2010 15:03, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Reflecting on it, I think it'd be best to allow an agg to
>>> provide an estimation function that'd be told the input data type and
>>> expected number of rows --- even on a per-aggregate basis, a constant
>>> estimate just isn't good enough.
>
>> How good will that estimate of the number of rows be though?
>
> It can't possibly be any worse than a hard-wired constant ;-)
>
>> If they're coming from a SRF it could be a huge under-estimate, and you'd
>> still risk eating all the memory, if you allowed a hash aggregate.
>
> If, for a particular aggregate, you're too chicken to ever allow hash
> aggregation, you could just return a very large number from the
> estimation hook function.  I doubt that's a very useful behavior in the
> majority of cases though.
>

Yeah, for median I'd be too chicken :-)

set work_mem = '2MB';
explain analyse select median(i) from generate_series(1,40000) as g(i)
group by i;                                                         QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------HashAggregate
(cost=15.00..17.50 rows=200 width=4) (actual 
time=229.524..287.153 rows=40000 loops=1)  ->  Function Scan on generate_series g  (cost=0.00..10.00 rows=1000
width=4) (actual time=8.738..16.595 rows=40000 loops=1)Total runtime: 811.460 ms
(3 rows)

The estimate of 200 x 8K is below work_mem, so it uses a hash
aggregate. In reality, each tuplesort allocates around 30K initially,
so it very quickly uses over 1GB. A better estimate for the aggregate
wouldn't improve this situation much.

Regards,
Dean

Re: wip: functions median and percentile

От

Tom Lane

Дата:

11 октября 2010 г., 14:37:21

Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> The estimate of 200 x 8K is below work_mem, so it uses a hash
> aggregate. In reality, each tuplesort allocates around 30K initially,
> so it very quickly uses over 1GB. A better estimate for the aggregate
> wouldn't improve this situation much.

Sure it would: an estimate of 30K would keep the planner from using
hash aggregation.
        regards, tom lane

Re: wip: functions median and percentile

От

Dean Rasheed

Дата:

11 октября 2010 г., 14:42:59

On 11 October 2010 18:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
>> The estimate of 200 x 8K is below work_mem, so it uses a hash
>> aggregate. In reality, each tuplesort allocates around 30K initially,
>> so it very quickly uses over 1GB. A better estimate for the aggregate
>> wouldn't improve this situation much.
>
> Sure it would: an estimate of 30K would keep the planner from using
> hash aggregation.
>

Not if work_mem was 10MB.

Regards,
Dean

Re: wip: functions median and percentile

От

Tom Lane

Дата:

11 октября 2010 г., 14:49:07

Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> On 11 October 2010 18:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Sure it would: an estimate of 30K would keep the planner from using
>> hash aggregation.

> Not if work_mem was 10MB.

And?  If the memory requirement actually fits, you're in good shape.
        regards, tom lane

Re: wip: functions median and percentile

От

Dean Rasheed

Дата:

11 октября 2010 г., 14:55:43

On 11 October 2010 18:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
>> On 11 October 2010 18:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Sure it would: an estimate of 30K would keep the planner from using
>>> hash aggregation.
>
>> Not if work_mem was 10MB.
>
> And?  If the memory requirement actually fits, you're in good shape.
>

Yeah but the actual memory requirement, if it uses a hash aggregate,
is over 1GB, and could easily be much higher. It's also hard to kill,
because it eats up that memory so quickly.

Regards,
Dean

Re: wip: functions median and percentile

От

Tom Lane

Дата:

11 октября 2010 г., 15:05:26

Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> On 11 October 2010 18:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> And? �If the memory requirement actually fits, you're in good shape.

> Yeah but the actual memory requirement, if it uses a hash aggregate,
> is over 1GB, and could easily be much higher.

In that case the estimate of 30K per instance was wrong.
You still haven't explained why this is impossible to estimate,
or even particularly hard, as long as we can provide some code that
knows specifically about the behavior of this aggregate.  The amount
of space needed to sort X amount of data is not unpredictable.
        regards, tom lane

Re: wip: functions median and percentile

От

Dean Rasheed

Дата:

11 октября 2010 г., 15:23:34

On 11 October 2010 19:05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
>> On 11 October 2010 18:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> And?  If the memory requirement actually fits, you're in good shape.
>
>> Yeah but the actual memory requirement, if it uses a hash aggregate,
>> is over 1GB, and could easily be much higher.
>
> In that case the estimate of 30K per instance was wrong.
> You still haven't explained why this is impossible to estimate,
> or even particularly hard, as long as we can provide some code that
> knows specifically about the behavior of this aggregate.  The amount
> of space needed to sort X amount of data is not unpredictable.
>

The estimate that's wrong is the number of rows that the SRF is going
to return. If I'm reading the plan right, the planner thinks that the
aggregate is going to be called 200 times on groups of 5 rows.
Actually, it's called 40000 times on groups of 1 row.

Regards,
Dean

Re: wip: functions median and percentile

От

Greg Stark

Дата:

11 октября 2010 г., 15:27:46

On Sun, Oct 10, 2010 at 2:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> It was pointed out upthread that while median isn't presently
> in the standard, Oracle defines it in terms of percentile_cont(0.5)
> which *is* in the standard.

Uhmm, then why don't we implement that? We could provide median() as a
short-cut but percentile_cont() doesn't sound much harder to implement
than median() and more general.

-- 
greg

Re: wip: functions median and percentile

От

Josh Berkus

Дата:

11 октября 2010 г., 15:30:57

> Uhmm, then why don't we implement that? We could provide median() as a
> short-cut but percentile_cont() doesn't sound much harder to implement
> than median() and more general.

I could really use percentile_cont(0.9), actually, for query
response-time analysis.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

11 октября 2010 г., 15:47:36

Hello

2010/10/11 Greg Stark <gsstark@mit.edu>:
> On Sun, Oct 10, 2010 at 2:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It was pointed out upthread that while median isn't presently
>> in the standard, Oracle defines it in terms of percentile_cont(0.5)
>> which *is* in the standard.
>
> Uhmm, then why don't we implement that? We could provide median() as a
> short-cut but percentile_cont() doesn't sound much harder to implement
> than median() and more general.

The problem is in interface. The original patch did it, but I removed
it. We cannot to unsure immutability of some parameters now. Can we
enhance a AGGREGATE to allow some mark like IMMUTABLE parameter and
probably we should to support ANSI syntax:

PERCENTILE_CONT ( expression1 )
WITHIN GROUP ( ORDER BY expression2 [ ASC | DESC ] )

This syntax allows to divide a muttable and immutable parameters.

Regards

Pavel Stehule
>
> --
> greg
>

Re: wip: functions median and percentile

От

Hitoshi Harada

Дата:

12 октября 2010 г., 00:08:42

2010/10/12 Pavel Stehule <pavel.stehule@gmail.com>:
> Hello
>
> 2010/10/11 Greg Stark <gsstark@mit.edu>:
>> On Sun, Oct 10, 2010 at 2:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> It was pointed out upthread that while median isn't presently
>>> in the standard, Oracle defines it in terms of percentile_cont(0.5)
>>> which *is* in the standard.
>>
>> Uhmm, then why don't we implement that? We could provide median() as a
>> short-cut but percentile_cont() doesn't sound much harder to implement
>> than median() and more general.
>
> The problem is in interface. The original patch did it, but I removed
> it. We cannot to unsure immutability of some parameters now. Can we
> enhance a AGGREGATE to allow some mark like IMMUTABLE parameter and
> probably we should to support ANSI syntax:
>
> PERCENTILE_CONT ( expression1 )
> WITHIN GROUP ( ORDER BY expression2 [ ASC | DESC ] )
>
> This syntax allows to divide a muttable and immutable parameters.

If this is only a syntax sugar for mutable/immutable parameter, then I
guess it's time to take it serious to implement in our syntax,
although I'm not sure if it affects more execution model than
interface.

Regards,



-- 
Hitoshi Harada

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

12 октября 2010 г., 06:15:09

2010/10/12 Hitoshi Harada <umi.tanuki@gmail.com>:
> 2010/10/12 Pavel Stehule <pavel.stehule@gmail.com>:
>> Hello
>>
>> 2010/10/11 Greg Stark <gsstark@mit.edu>:
>>> On Sun, Oct 10, 2010 at 2:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> It was pointed out upthread that while median isn't presently
>>>> in the standard, Oracle defines it in terms of percentile_cont(0.5)
>>>> which *is* in the standard.
>>>
>>> Uhmm, then why don't we implement that? We could provide median() as a
>>> short-cut but percentile_cont() doesn't sound much harder to implement
>>> than median() and more general.
>>
>> The problem is in interface. The original patch did it, but I removed
>> it. We cannot to unsure immutability of some parameters now. Can we
>> enhance a AGGREGATE to allow some mark like IMMUTABLE parameter and
>> probably we should to support ANSI syntax:
>>
>> PERCENTILE_CONT ( expression1 )
>> WITHIN GROUP ( ORDER BY expression2 [ ASC | DESC ] )
>>
>> This syntax allows to divide a muttable and immutable parameters.
>
> If this is only a syntax sugar for mutable/immutable parameter, then I
> guess it's time to take it serious to implement in our syntax,
> although I'm not sure if it affects more execution model than
> interface.

I though about it, the question is an interface for PL languages.
There are not problem for C.

Regards

Pavel Stehule


>
> Regards,
>
>
>
> --
> Hitoshi Harada
>

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

13 октября 2010 г., 07:37:52

Hello

I am looking on SQL standard for some info about "within group"
clause. This clause is necessary for functions:

rank, dense_rank, cume_dist, percent_rank and percentile_disc and
persentile_cont. These functions needs a clause "WITHIN GROUP".

If I understand, then these functions are not simple aggregates - its
some between window functions and aggregates.

Questions:

* is clause "WITHIN GROUP" just syntactic sugar for our aggregate with
ORDER BY? I am thinking so not. There are predefined set of functions
that can be used with this clause.

* what is correct implementation of these functions?  When I am
looking on parameters, these functions are very similar to window
functions. So there are two two ways for implementation. Implement it
as special case of window functions or implement it as special case of
aggregates.

Regards

Pavel Stehule

2010/10/12 Hitoshi Harada <umi.tanuki@gmail.com>:
> 2010/10/12 Pavel Stehule <pavel.stehule@gmail.com>:
>> Hello
>>
>> 2010/10/11 Greg Stark <gsstark@mit.edu>:
>>> On Sun, Oct 10, 2010 at 2:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> It was pointed out upthread that while median isn't presently
>>>> in the standard, Oracle defines it in terms of percentile_cont(0.5)
>>>> which *is* in the standard.
>>>
>>> Uhmm, then why don't we implement that? We could provide median() as a
>>> short-cut but percentile_cont() doesn't sound much harder to implement
>>> than median() and more general.
>>
>> The problem is in interface. The original patch did it, but I removed
>> it. We cannot to unsure immutability of some parameters now. Can we
>> enhance a AGGREGATE to allow some mark like IMMUTABLE parameter and
>> probably we should to support ANSI syntax:
>>
>> PERCENTILE_CONT ( expression1 )
>> WITHIN GROUP ( ORDER BY expression2 [ ASC | DESC ] )
>>
>> This syntax allows to divide a muttable and immutable parameters.
>
> If this is only a syntax sugar for mutable/immutable parameter, then I
> guess it's time to take it serious to implement in our syntax,
> although I'm not sure if it affects more execution model than
> interface.
>
> Regards,
>
>
>
> --
> Hitoshi Harada
>

Re: wip: functions median and percentile

От

Peter Eisentraut

Дата:

13 октября 2010 г., 07:48:29

On mån, 2010-10-11 at 20:46 +0200, Pavel Stehule wrote:
> The problem is in interface. The original patch did it, but I removed
> it. We cannot to unsure immutability of some parameters now.

How about you store the "immutable" parameter in the transition state
and error out if it changes between calls?

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

13 октября 2010 г., 07:52:59

2010/10/13 Peter Eisentraut <peter_e@gmx.net>:
> On mån, 2010-10-11 at 20:46 +0200, Pavel Stehule wrote:
>> The problem is in interface. The original patch did it, but I removed
>> it. We cannot to unsure immutability of some parameters now.
>
> How about you store the "immutable" parameter in the transition state
> and error out if it changes between calls?
>

yes, it's possibility. Now I looking there and I see the as more
important problem the conformance with ANSI SQL. see my last post.
There can be a kind of aggregate functions based on tuplesort.

Regards

Pavel

>

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

13 октября 2010 г., 07:56:55

2010/10/13 Pavel Stehule <pavel.stehule@gmail.com>:
> 2010/10/13 Peter Eisentraut <peter_e@gmx.net>:
>> On mån, 2010-10-11 at 20:46 +0200, Pavel Stehule wrote:
>>> The problem is in interface. The original patch did it, but I removed
>>> it. We cannot to unsure immutability of some parameters now.
>>
>> How about you store the "immutable" parameter in the transition state
>> and error out if it changes between calls?
>>
>
> yes, it's possibility. Now I looking there and I see the as more
> important problem the conformance with ANSI SQL. see my last post.
> There can be a kind of aggregate functions based on tuplesort.

more - all these functions needs to solve same problem with planner
and hash agg. So maybe is time to add a flag ISTUPLESORTED to pg_proc
and add solve these functions together.

Pavel

>
> Regards
>
> Pavel
>
>
>>
>

Re: wip: functions median and percentile

От

Greg Stark

Дата:

14 октября 2010 г., 06:50:53

On Wed, Oct 13, 2010 at 5:54 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Or to put it more bluntly - what is the "problem with planner and hash
> agg" that all of these functions need to solve?  And why does it need
> a flag in pg_proc?  Why can't't we leave it to the individual
> functions to perform a sort of one is needed?
>

So I think that's backwards. Why is the function doing data
manipulations like sorts that we usually put in the plan? Is there
some some key meta information that should be flagged in pg_proc and
general functionality the executor should be providing so that this
general class of problems can be solved efficiently?

Otherwise I fear lots of things we would expect to be efficient such
as calculating the top, median, and last items in the same sort order
would require three separate sorts of the same data. We have a planner
capable of comparing sort orders and estimating the costs of different
plans, we should be using it.

--
greg

Re: wip: functions median and percentile

От

Robert Haas

Дата:

14 октября 2010 г., 07:02:48

On Wed, Oct 13, 2010 at 6:56 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2010/10/13 Pavel Stehule <pavel.stehule@gmail.com>:
>> 2010/10/13 Peter Eisentraut <peter_e@gmx.net>:
>>> On mån, 2010-10-11 at 20:46 +0200, Pavel Stehule wrote:
>>>> The problem is in interface. The original patch did it, but I removed
>>>> it. We cannot to unsure immutability of some parameters now.
>>>
>>> How about you store the "immutable" parameter in the transition state
>>> and error out if it changes between calls?
>>>
>>
>> yes, it's possibility. Now I looking there and I see the as more
>> important problem the conformance with ANSI SQL. see my last post.
>> There can be a kind of aggregate functions based on tuplesort.
>
> more - all these functions needs to solve same problem with planner
> and hash agg. So maybe is time to add a flag ISTUPLESORTED to pg_proc
> and add solve these functions together.

I think that the design of this patch is still sufficiently up in the
air that it is not going to be practical to get it committed during
the current CommitFest, which is nearly over, so I am going to mark it
as Returned with Feedback.  I suggest that we continue discussing it,
however, so that we can get things squared away for the next
CommitFest.  It seems that the fundamental question here is whether
median is an instance of some more general problem, or whether it's a
special case; and more importantly, if it is an instance of a more
general problem, what is the shape of that general problem?

Or to put it more bluntly - what is the "problem with planner and hash
agg" that all of these functions need to solve?  And why does it need
a flag in pg_proc?  Why can't't we leave it to the individual
functions to perform a sort of one is needed?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: wip: functions median and percentile

От

Pavel Stehule

Дата:

14 октября 2010 г., 07:20:22

2010/10/14 Robert Haas <robertmhaas@gmail.com>:
> On Wed, Oct 13, 2010 at 6:56 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> 2010/10/13 Pavel Stehule <pavel.stehule@gmail.com>:
>>> 2010/10/13 Peter Eisentraut <peter_e@gmx.net>:
>>>> On mån, 2010-10-11 at 20:46 +0200, Pavel Stehule wrote:
>>>>> The problem is in interface. The original patch did it, but I removed
>>>>> it. We cannot to unsure immutability of some parameters now.
>>>>
>>>> How about you store the "immutable" parameter in the transition state
>>>> and error out if it changes between calls?
>>>>
>>>
>>> yes, it's possibility. Now I looking there and I see the as more
>>> important problem the conformance with ANSI SQL. see my last post.
>>> There can be a kind of aggregate functions based on tuplesort.
>>
>> more - all these functions needs to solve same problem with planner
>> and hash agg. So maybe is time to add a flag ISTUPLESORTED to pg_proc
>> and add solve these functions together.
>
> I think that the design of this patch is still sufficiently up in the
> air that it is not going to be practical to get it committed during
> the current CommitFest, which is nearly over, so I am going to mark it
> as Returned with Feedback.  I suggest that we continue discussing it,
> however, so that we can get things squared away for the next
> CommitFest.  It seems that the fundamental question here is whether
> median is an instance of some more general problem, or whether it's a
> special case; and more importantly, if it is an instance of a more
> general problem, what is the shape of that general problem?

+1

Median implemented as special case of some special sort of functions
will be better. The use case of ANSI SQL functions are more general -
but it needs discussion about design. I didn't find too much
consistency in standard. These functions are defined individually -
not as some special kind of functions. All functions from standard has
a immutable parameters - but Oracle listagg function has one parameter
mutable and second immutable.

More we should better to solve using these functions together with
window clause. I didn't find any note about using combination in
standard, but minimally Oracle allows a within_group and over clauses
together.

On second hand - this work can be really useful. We can get a bigger
conformity with ANSI SQL 200x and with other db - DB2, Oracle, MSSQL,
Sybase support this feature.

>
> Or to put it more bluntly - what is the "problem with planner and hash
> agg" that all of these functions need to solve?  And why does it need
> a flag in pg_proc?  Why can't't we leave it to the individual
> functions to perform a sort of one is needed?

These functions are hungry - It takes a 30 kb (minimum tuplesort) per
group. More there is relative general pattern, that can be shared -
there can be minimaly 6 functions, that just fill tuplesort in
iterations - so these code can be shared, tuplesort can be reseted and
used respectively. And it's question if requested sort can be used in
outer optimalizations. Primary thing for solving is memory usage.

Regards

Pavel Stehule

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

Re: wip: functions median and percentile

От

Robert Haas

Дата:

14 октября 2010 г., 09:23:03

On Wed, Oct 13, 2010 at 10:37 PM, Greg Stark <gsstark@mit.edu> wrote:
> On Wed, Oct 13, 2010 at 5:54 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Or to put it more bluntly - what is the "problem with planner and hash
>> agg" that all of these functions need to solve?  And why does it need
>> a flag in pg_proc?  Why can't't we leave it to the individual
>> functions to perform a sort of one is needed?
>>
>
> So I think that's backwards. Why is the function doing data
> manipulations like sorts that we usually put in the plan? Is there
> some some key meta information that should be flagged in pg_proc and
> general functionality the executor should be providing so that this
> general class of problems can be solved efficiently?
>
> Otherwise I fear lots of things we would expect to be efficient such
> as calculating the top, median, and last items in the same sort order
> would require three separate sorts of the same data. We have a planner
> capable of comparing sort orders and estimating the costs of different
> plans, we should be using it.

Good point.  I think you're right.  I'm not sure what the best design
for it is, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: wip: functions median and percentile

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения