Re: wip: functions median and percentile

From Hitoshi Harada
Subject Re: wip: functions median and percentile
Date
Msg-id AANLkTinSbwJ9TJ_uOHR6VsQOg2yemd1cTstwqVB8096K@mail.gmail.com
In response to Re: wip: functions median and percentile  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Responses Re: wip: functions median and percentile  (Dean Rasheed <dean.a.rasheed@gmail.com>)
List pgsql-hackers
2010/10/5 Dean Rasheed <dean.a.rasheed@gmail.com>:
> On 5 October 2010 07:04, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
> Extrapolating from a few quick timing tests, even in the best case, on
> my machine, it would take 7 days for the running median to use up
> 100MB, and 8 years for it to use 2GB. So setting the tuplesort's
> workMem to 2GB (only in the running median case) would actually be
> safe in practice, and would prevent the temp file leak (for a few
> years at least!). I feel dirty even suggesting that. Better ideas
> anyone?

So, I suggested implementing median as a *pure* window function,
separate from Pavel's aggregate function, and Greg suggested adding
insertion capability to tuplesort. With this approach, we keep one
tuplesort holding all the values in the current frame and can release
it at the end of a partition (the window function API makes this
possible). Values are added incrementally, which is far better than
O(n^2 log(n)), although I haven't estimated the actual order. Only
when the frame head moves down do we have to re-initialize the
tuplesort, and that case is as slow as calling the aggregate version
for each row (but I think we can solve it somehow if we look at it
closely).
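
To make the idea concrete, here is a rough sketch in Python rather
than C. It is only an illustration under stated assumptions: a sorted
Python list with bisect.insort stands in for a tuplesort with
insertion capability (which does not exist yet), and frame_medians,
median_of_sorted and the (start, end) frame tuples are hypothetical
names, not part of the window function API. While the frame head stays
put, new values are inserted into the existing sorted state; when the
head moves, the state is rebuilt from scratch, which is the slow case.

```python
import bisect


def median_of_sorted(sorted_vals):
    """Return the median of an already sorted list, or None if it is empty."""
    n = len(sorted_vals)
    if n == 0:
        return None
    mid = n // 2
    if n % 2:
        return sorted_vals[mid]
    return (sorted_vals[mid - 1] + sorted_vals[mid]) / 2


def frame_medians(values, frames):
    """Compute one median per frame; frames are (start, end), inclusive.

    Assumption: `values` is one partition in order, and `frames` lists
    the frame of each row in row order.
    """
    sorted_frame = []              # stand-in for the per-partition tuplesort
    prev_start, prev_end = None, -1
    out = []
    for start, end in frames:
        if start != prev_start or end < prev_end:
            # Frame head moved (or the frame shrank): rebuild the sorted
            # state, as costly as the aggregate version for this row.
            sorted_frame = sorted(values[start:end + 1])
        else:
            # Frame only grew at the tail: insert just the new values.
            for v in values[prev_end + 1:end + 1]:
                bisect.insort(sorted_frame, v)
        prev_start, prev_end = start, end
        out.append(median_of_sorted(sorted_frame))
    return out


values = [5, 1, 3, 2]
# Running frame (head fixed at partition start): incremental inserts only.
running = frame_medians(values, [(0, 0), (0, 1), (0, 2), (0, 3)])
# Sliding two-row frame: head moves on every row, so every row rebuilds.
sliding = frame_medians(values, [(0, 1), (1, 2), (2, 3)])
```

Note that bisect.insort on a Python list still shifts elements on each
insert, so this sketch shows the control flow (insert versus rebuild),
not the cost a real tuplesort insertion would have.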

Regards,

-- 
Hitoshi Harada

