Re: Inlining comparators as a performance optimisation

From Andres Freund
Subject Re: Inlining comparators as a performance optimisation
Date
Msg-id 201111292236.27140.andres@anarazel.de
In response to Re: Inlining comparators as a performance optimisation  (Peter Geoghegan <peter@2ndquadrant.com>)
List pgsql-hackers
On Tuesday, November 29, 2011 07:48:37 PM Peter Geoghegan wrote:
> On 29 November 2011 15:31, Bruce Momjian <bruce@momjian.us> wrote:
> > These are exciting advances you are producing and I am hopeful we can
> > get this included in Postgres 9.2.
> 
> Thanks Bruce.
> 
> > I have mentioned already that I think parallelism is the next big
> > Postgres challenge, and of course, one of the first areas for
> > parallelism is sorting.
> 
> I'm not sure that sorting has that much to recommend it as an initial
> target of some new backend parallelism other than being easy to
> implement. I've observed the qsort_arg specialisations in this patch
> out-perform stock qsort_arg by as much as almost 3 times. However, the
> largest decrease in a query's time that I've observed was 45%, and
> that was for a contrived worst-case for quicksort, but about 25% is
> much more typical of queries similar to the ones I've shown, for more
> normative data distributions. While that's a respectable gain, it
> isn't a paradigm shifting one, and it makes parallelising qsort itself
> for further improvements quite a lot less attractive - there's too
> many other sources of overhead.
I think that logic is faulty.

For one, I doubt that anybody is seriously suggesting parallelism inside qsort 
itself. It seems more likely and more sensible to implement it at the merge 
sort level.
Index builds, for example, could benefit hugely from improvements at that level. 
With index builds you often get pretty non-optimal data distributions, by the way.
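
To illustrate what I mean by the merge level, here is a minimal sketch, assuming POSIX threads and plain int keys. It is not PostgreSQL's tuplesort code, and the backend does not work with threads like this; parallel_run_sort, SortRun, sort_run and merge_runs are hypothetical names made up for the example. Each run is sorted with an ordinary serial qsort, and only the split into runs and the final merge know about parallelism.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* One contiguous run of the input, sorted independently of the others. */
typedef struct {
    int   *base;
    size_t len;
} SortRun;

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Thread entry point: sort one run with plain, serial qsort. */
static void *sort_run(void *arg)
{
    SortRun *run = arg;
    qsort(run->base, run->len, sizeof(int), cmp_int);
    return NULL;
}

/* Serial two-way merge of two sorted runs into out. */
static void merge_runs(const SortRun *a, const SortRun *b, int *out)
{
    size_t i = 0, j = 0, k = 0;

    while (i < a->len && j < b->len)
        out[k++] = (b->base[j] < a->base[i]) ? b->base[j++] : a->base[i++];
    while (i < a->len)
        out[k++] = a->base[i++];
    while (j < b->len)
        out[k++] = b->base[j++];
}

/* Returns 0 on success, -1 on allocation or thread-creation failure. */
int parallel_run_sort(int *data, size_t n)
{
    SortRun   left  = { data, n / 2 };
    SortRun   right = { data + n / 2, n - n / 2 };
    pthread_t worker;
    int      *tmp;

    if (n < 2)
        return 0;
    tmp = malloc(n * sizeof(int));
    if (tmp == NULL)
        return -1;

    /* Sort the left run in a worker thread, the right run in this one. */
    if (pthread_create(&worker, NULL, sort_run, &left) != 0) {
        free(tmp);
        return -1;
    }
    sort_run(&right);
    pthread_join(worker, NULL);

    /* Both runs are sorted now; combine them serially. */
    merge_runs(&left, &right, tmp);
    memcpy(data, tmp, n * sizeof(int));
    free(tmp);
    return 0;
}
```

The two runs touch disjoint slices of the array, so the workers need no locking. The comparator here is a trivial int comparison; the hard part in the real system is that the comparator is a user-defined operator.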

I also seriously doubt that you will find any area inside pg's executor where 
parallelizing it will provide near-linear scaling without much, much more 
work.

Also, I wouldn't consider sorting the easiest target for parallelization, 
especially at the qsort level, as you constantly need to execute user-defined 
operators with multiple input tuples, which has the usual problems.
COPY parsing + inserting or the like seems to be a far easier target, for 
example. Even doing hashing + aggregation in different threads seems likely to 
be easier.

Andres


