Re: PG qsort vs. Solaris

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: PG qsort vs. Solaris
Дата
Msg-id 21958.1159912552@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: PG qsort vs. Solaris  (Gregory Stark <stark@enterprisedb.com>)
Список pgsql-hackers
Gregory Stark <stark@enterprisedb.com> writes:
> Actually what I was more concerned about was things like on data structures
> with complex comparison routines. Things like sorting on arrays or ROWs.

The important point here is that blowing up the cost of the comparison
function by a factor of 3 (by switching from strcmp to strcoll) was not
sufficient to overcome the disadvantage --- which says to me that some
of the disadvantage is inbuilt and actually scales with the cost of the
comparisons.  I suspect what we are looking at here is cache effect on
the tuple accesses: quicksort has more locality of reference than
mergesort, and that applies not only to the tuple pointers that qsort
itself is manipulating, but the data they point at.

> For that matter it seems to me that sorting on a single column is a pretty
> unrealistic scenario too.

Not really; even if you are sorting on multi keys, most of the time the
first column determines the comparison result.  But since you insist,
here's a test case deliberately skewed to not do that:

postgres=# select count(*) from (select (random()*3)::int,random() from generate_series(1,1000000) order by 1,2) ss;

glibc:
LOG:  begin tuple sort: nkeys = 2, workMem = 204800, randomAccess = f
LOG:  performsort starting: CPU 0.10s/1.03u sec elapsed 1.14 sec
LOG:  performsort done: CPU 0.12s/5.83u sec elapsed 5.95 sec
LOG:  internal sort ended, 71452 KB used, 18675458 comparisons: CPU 0.12s/6.10u sec elapsed 6.22 sec

ours:
LOG:  begin tuple sort: nkeys = 2, workMem = 204800, randomAccess = f
LOG:  performsort starting: CPU 0.10s/1.01u sec elapsed 1.12 sec
LOG:  performsort done: CPU 0.10s/3.96u sec elapsed 4.06 sec
LOG:  internal sort ended, 71452 KB used, 21047424 comparisons: CPU 0.10s/4.23u sec elapsed 4.33 sec


In any case I don't see that there's anything much left to argue about:
every single test we have done says that glibc's qsort is a loser.
Speculating about how it might not lose on sufficiently unusual cases
doesn't really counter the argument that it does lose on typical
scenarios.  Between that and the other advantages of controlling our own
destiny sorting-wise, I think the decision has become pretty clear-cut.
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: workaround for buggy strtod is not necessary
Следующее
От: Mark Kirkwood
Дата:
Сообщение: Re: PG qsort vs. Solaris