Re: polyphase merge?

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: polyphase merge?
Дата
Msg-id 2245.1233760675@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: polyphase merge?  (Greg Stark <stark@enterprisedb.com>)
Список pgsql-hackers
Greg Stark <stark@enterprisedb.com> writes:
> Is this basically the same as our current algorithm but without
> multiplexing the tapes onto single files? I have been wondering
> whether we multiplex the tapes any better than filesystems can lay out
> separate files actually.

The reason for the multiplexing is so that space can get re-used
quickly.  If each tape were represented as a separate file, there would
be no way to release blocks as they're read; you could only give back
the whole file after reaching end of tape.  Which would at least double
the amount of disk space needed to sort X amount of data.  (It's
actually even worse, more like 4X, though the multiplier might depend on
the number of "tapes" --- I don't recall the details anymore.)

The penalty we pay is that in the later merge passes, the blocks
representing a single tape aren't very well ordered.

It might be interesting to think about some compromise that wastes a
little more space in order to get better sequentiality of disk access.
It'd be easy to do if we were willing to accept a 2X space penalty,
but I'm not sure if that would fly or not.  It definitely *wasn't*
acceptable to the community a few years ago when the current code was
written.  Disks have gotten bigger since then, but so have the problems
people want to solve.
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Greg Stark
Дата:
Сообщение: Re: add_path optimization
Следующее
От: "Jonah H. Harris"
Дата:
Сообщение: Re: add_path optimization