Re: polyphase merge?

From	Tom Lane
Subject	Re: polyphase merge?
Date	4 February 2009, 11:18:21
Message-id	2245.1233760675@sss.pgh.pa.us
In response to	Re: polyphase merge? (Greg Stark <stark@enterprisedb.com>)
List	pgsql-hackers

Greg Stark <stark@enterprisedb.com> writes:
> Is this basically the same as our current algorithm but without
> multiplexing the tapes onto single files? I have been wondering
> whether we multiplex the tapes any better than filesystems can lay out
> separate files actually.

The reason for the multiplexing is so that space can get re-used
quickly.  If each tape were represented as a separate file, there would
be no way to release blocks as they're read; you could only give back
the whole file after reaching end of tape.  Which would at least double
the amount of disk space needed to sort X amount of data.  (It's
actually even worse, more like 4X, though the multiplier might depend on
the number of "tapes" --- I don't recall the details anymore.)
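A minimal sketch of the space accounting for a single merge pass, under simplifying assumptions that are not from the original message: every input tape has the same number of blocks, the merge consumes input blocks round-robin, and one output block is written per input block read. The function name `peak_blocks` and its parameters are hypothetical; this does not model PostgreSQL's actual tape code, and it covers only the "at least double" case, not the larger multiplier mentioned above.

```python
def peak_blocks(num_tapes, blocks_per_tape, recycle):
    """Peak number of disk blocks in use while merging the input tapes
    into one output tape (toy model, one merge pass only)."""
    remaining = [blocks_per_tape] * num_tapes   # unread blocks per input tape
    in_use = num_tapes * blocks_per_tape        # all input data starts on disk
    peak = in_use
    for step in range(num_tapes * blocks_per_tape):
        tape = step % num_tapes                 # assumed round-robin consumption
        remaining[tape] -= 1
        if recycle:
            in_use -= 1                         # block is released as soon as it is read
        elif remaining[tape] == 0:
            in_use -= blocks_per_tape           # whole file is released only at end of tape
        in_use += 1                             # one output block is written
        peak = max(peak, in_use)
    return peak


if __name__ == "__main__":
    print(peak_blocks(6, 100, recycle=True))    # multiplexed tapes, blocks re-used
    print(peak_blocks(6, 100, recycle=False))   # one file per tape
```

With 6 tapes of 100 blocks each (X = 600 blocks), the recycling case never exceeds 600 blocks in this model, while the file-per-tape case peaks at 1194 blocks, just under 2X, because no input file can be given back until its last block has been read.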

The penalty we pay is that in the later merge passes, the blocks
representing a single tape aren't very well ordered on disk, which
hurts sequential access.
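To see why block re-use scatters a tape's blocks, here is a toy sketch under stated assumptions that are not from the original message: input tapes start out contiguous, the merge reads them round-robin, and each output block is written into the block that was just freed. The names `output_block_order` and `count_seeks` are hypothetical, and the real allocation policy may differ.

```python
def output_block_order(num_tapes, blocks_per_tape):
    """Physical block numbers of the output tape, in logical order (toy model)."""
    # Input tape t initially occupies blocks [t*b, (t+1)*b) contiguously.
    next_block = [t * blocks_per_tape for t in range(num_tapes)]
    output = []
    for step in range(num_tapes * blocks_per_tape):
        tape = step % num_tapes          # assumed round-robin consumption
        freed = next_block[tape]         # this input block has been read and released
        next_block[tape] += 1
        output.append(freed)             # output tape re-uses the freed block
    return output


def count_seeks(blocks):
    """Number of logically adjacent blocks that are not physically adjacent."""
    return sum(1 for a, b in zip(blocks, blocks[1:]) if b != a + 1)


if __name__ == "__main__":
    order = output_block_order(3, 4)
    print(order)
    print(count_seeks(order))
```

For 3 tapes of 4 blocks, the output tape ends up on blocks 0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11, so every step to the next logical block is a jump. That output tape is an input to the next merge pass, which is where the poor ordering is felt.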

It might be interesting to think about some compromise that wastes a
little more space in order to get better sequentiality of disk access.
It'd be easy to do if we were willing to accept a 2X space penalty,
but I'm not sure if that would fly or not.  It definitely *wasn't*
acceptable to the community a few years ago when the current code was
written.  Disks have gotten bigger since then, but so have the problems
people want to solve.
        regards, tom lane
