Re: Sorting Improvements for 8.4

Поиск
Список
Период
Сортировка
От Jeff Davis
Тема Re: Sorting Improvements for 8.4
Дата
Msg-id 1196453253.22428.416.camel@dogma.ljc.laika.com
обсуждение исходный текст
Ответ на Sorting Improvements for 8.4  (Simon Riggs <simon@2ndquadrant.com>)
Ответы Re: Sorting Improvements for 8.4  (Simon Riggs <simon@2ndquadrant.com>)
Список pgsql-hackers
On Tue, 2007-11-27 at 18:03 +0000, Simon Riggs wrote:
> 5. DYNAMIC RUN HANDLING (in Final Merge)
> 
> Another way of addressing a) is to simply make better use of memory
> itself. Let's look at that in more detail:
> 
> Number of runs that can be merged at once is currently fixed, based upon
> available memory. This has the underlying assumption that all runs will
> be concurrently active during final merging, which may not always be
> true.
> 
> If we have random data then almost all runs will overlap with all other
> runs, i.e. the min and max values are sufficiently wide that the runs do
> all overlap. In many cases, data arrives in somewhat sorted order, e.g.
> financial data is fairly regular with some late payers but not many, and
> those trail off with a fairly tight decay. In the somewhat sorted case
> we find that the actual overlap is less than total, so there are many
> later runs that don't overlap the earlier ones. In the best case we
> would find run 1 and 2 overlap, runs 2 and 3 overlap, then 3 and 4
> overlap.

I have spoken with Len Shapiro, a professor at Portland State
University, regarding sorting before.

He suggests that PostgreSQL should implement forecasting, which is
similar to what you're describing. Forecasting does not require that
entire runs are disjoint, it works by tracking the maximum values from
the last block read from every run. This allows you to know which run
you will need more blocks from the soonest.

I'm still looking into the problem to understand it better, but the
algorithm is in Knuth Vol 3.

I can look at it in more detail, but have you already looked into this
idea? Is there a reason we don't do this currently?

Regards,Jeff Davis



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Sam Mason
Дата:
Сообщение: Re: PostGreSQL and recursive queries...
Следующее
От: Andrew Dunstan
Дата:
Сообщение: Re: [GENERAL] plperl and regexps with accented characters - incompatible?