Re: Parallel Sort

From Peter Geoghegan
Subject Re: Parallel Sort
Date
Msg-id CAM3SWZTz3FtcNT=_gOOhX_5Qt_QRvG-5oma6x1GsuR6VC1AWzQ@mail.gmail.com
In reply to Parallel Sort  (Noah Misch <noah@leadboat.com>)
Replies Re: Parallel Sort  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On Mon, May 13, 2013 at 7:28 AM, Noah Misch <noah@leadboat.com> wrote:
> We should decide whether to actually sort in parallel based on the comparator
> cost and the data size.  The system currently has no information on comparator
> cost: bt*cmp (and indeed almost all built-in functions) all have procost=1,
> but bttextcmp is at least 1000x slower than btint4cmp.

I think that this effort could justify itself independently of any
attempt to introduce parallelism to in-memory sorting. I abandoned a
patch to introduce timsort to Postgres, because I knew that there was
no principled way to reap the benefits. Unless you introduce
parallelism, it's probably going to be virtually impossible to come up
with an algorithm that does in-memory sorting faster (in terms of the
amount of system time taken) than a highly optimized quicksort when
sorting integers. But sorting types with really expensive comparators
(even considerably more expensive than bttextcmp) for
pass-by-reference Datums (where the memory locality advantage of
quicksort doesn't really help so much) makes timsort much more
compelling. That's why it's used for Python lists.
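
To make the comparator-cost argument concrete, here is a minimal Python sketch that counts comparator calls rather than measuring time. It is only an illustration under stated assumptions: count_comparisons, timsort, and quicksort are hypothetical helper names, the quicksort is a deliberately simple textbook version (not the optimized one in Postgres), and CPython's built-in sorted() stands in for timsort.

```python
import functools
import random


def count_comparisons(sort_fn, data):
    """Sort a copy of data with sort_fn and return the number of comparator calls."""
    calls = 0

    def cmp(a, b):
        # Stand-in for an expensive comparator: we only count invocations.
        nonlocal calls
        calls += 1
        return (a > b) - (a < b)

    result = sort_fn(list(data), cmp)
    assert result == sorted(data)  # sanity check: the output really is sorted
    return calls


def timsort(items, cmp):
    # CPython's built-in sort is a timsort variant; cmp_to_key adapts the comparator.
    return sorted(items, key=functools.cmp_to_key(cmp))


def quicksort(items, cmp):
    # Simple three-way quicksort with a middle pivot (illustrative, not tuned).
    if len(items) <= 1:
        return items
    pivot = items[len(items) // 2]
    less, equal, greater = [], [], []
    for x in items:
        c = cmp(x, pivot)
        if c < 0:
            less.append(x)
        elif c > 0:
            greater.append(x)
        else:
            equal.append(x)
    return quicksort(less, cmp) + equal + quicksort(greater, cmp)


if __name__ == "__main__":
    random.seed(0)
    presorted = list(range(10_000))
    shuffled = random.sample(presorted, len(presorted))
    for label, data in (("presorted", presorted), ("shuffled", shuffled)):
        print(label,
              "timsort:", count_comparisons(timsort, data),
              "quicksort:", count_comparisons(quicksort, data))
```

When each comparison is expensive, the number of comparator calls dominates the running time, so an algorithm that needs fewer of them (as timsort does on input with existing runs) can win even if its memory access pattern is less friendly than quicksort's.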


-- 
Peter Geoghegan


