Re: [PERFORM] Releasing memory during External sorting?

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [PERFORM] Releasing memory during External sorting?
Дата
Msg-id 5441.1127499340@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: [PERFORM] Releasing memory during External sorting?  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Mark Lewis <mark.lewis@mir3.com> writes:
> operations != passes.  If you were clever, you could probably write a
> modified bubble-sort algorithm that only made 2 passes.  A pass is a
> disk scan, operations are then performed (hopefully in memory) on what
> you read from the disk.  So there's no theoretical log N lower-bound on
> the number of disk passes.

Given infinite memory that might be true, but I don't think I believe it
for limited memory.  If you have room for K tuples in memory then it's
impossible to perform more than K*N useful comparisons per pass (ie, as
each tuple comes off the disk you can compare it to all the ones
currently in memory; anything more is certainly redundant work).  So if
K < logN it's clearly not gonna work.

It's possible that you could design an algorithm that works in a fixed
number of passes if you are allowed to assume you can hold O(log N)
tuples in memory --- and in practice that would probably work fine,
if the constant factor implied by the O() isn't too big.  But it's not
really solving the general external-sort problem.

            regards, tom lane

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jeremy Drake
Дата:
Сообщение: Re: 64-bit API for large objects
Следующее
От: Martijn van Oosterhout
Дата:
Сообщение: Re: Releasing memory during External sorting?