Inlining comparators as a performance optimisation

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Inlining comparators as a performance optimisation
Дата
Msg-id CAEYLb_V_EDmawDtc1q1DCueTjqpw0D75f=mwWrZKW5YP96LQ+Q@mail.gmail.com
обсуждение исходный текст
Ответы Re: Inlining comparators as a performance optimisation
Re: Inlining comparators as a performance optimisation
Список pgsql-hackers
Recent discussions on the threads "Double sorting split patch" and
"CUDA sorting" raised the possibility that there could be significant
performance optimisation "low-hanging fruit" picked by having the
executor treat integers and floats as a special case during sorting,
avoiding going to the trouble of calling a comparator using the
built-in SQL function machinery, and taking advantage of inlining of
the comparator, which has been shown to have a considerable
performance advantage (at least compared to a general purpose c stdlib
qsort(), that takes a function pointer as its comparator, much like
tuplesort).

I've hacked together a sloppy POC implementation in a hurry
(basically, some code is shifted around) , which is attached - I felt
that it would be useful to determine if the community feels that this
is a worth-while undertaking in advance of a business trip that I'm
leaving on tomorrow lasting until Friday, during which I will be
mostly unavailable. The patch breaks the Postgres sorting executor
node (at least when it quicksorts) for any type other than int4. I
apologise for how rough the patch is, but the code itself isn't
important right now - the idea is. I anticipate that the value
state->datumType or something similar will be set in a near future
revision, so that tuplesort_performsort will know which type-specific
optimisation it can use for the type, while falling back on the
existing generic qsort_arg + qsort_arg_comparator, and sorting won't
actually be broken.

I've been doing some preliminary testing using the dell store 2 sample
database. I increase work_mem to '50MB', to ensure that a quicksort
will be performed for sorting (otherwise, I'm using the
postgresql.conf that initdb gave me). The query is:

explain analyze select * from orderlines order by prod_id;

Once the cache has been warmed, explain analyze very consistently
reports a runtime of 123ms for this query on master/HEAD, which varies
+/- 1 ms, with a few outliers of maybe +/- 2ms. However, when I apply
this patch, that goes down to 107ms +/- 1ms at -O0. I think that
that's a pretty good start. Funnily enough, the difference/advantage
vanishes at -O2 (I'm guessing that the higher optimisation level of
GCC 4.5 hyper-corrects away the inlining, but I don't have time to
check that right now).

I imagine the version that I actually submit for patch review will
have a macro-based infrastructure for inlining the sorting of various
built-in types, initially integers and floats. It will most likely
have some other optimisations - I haven't even used a profiler yet.

This performance patch differs from most in that it's difficult in
principle to imagine a performance regression occurring.

Thoughts?

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "David Rinaldi"
Дата:
Сообщение: Re: Grouping Sets
Следующее
От: Tom Lane
Дата:
Сообщение: Re: File not found error on creating collation