Re: PoC: Partial sort

From Alexander Korotkov
Subject Re: PoC: Partial sort
Date
Msg-id CAPpHfdsiRPaqn8DTty2DywkuOrXJJcJBQUiNy9Ossm1LDfjXwQ@mail.gmail.com
In response to Re: PoC: Partial sort  (Andreas Karlsson <andreas@proxel.se>)
Responses Re: PoC: Partial sort  (Marti Raudsepp <marti@juffo.org>)
List pgsql-hackers
On Sun, Jan 19, 2014 at 5:57 AM, Andreas Karlsson <andreas@proxel.se> wrote:
On 01/18/2014 08:13 PM, Jeremy Harris wrote:
On 31/12/13 01:41, Andreas Karlsson wrote:
On 12/29/2013 08:24 AM, David Rowley wrote:
If it was possible to devise some way to reuse any
previous tuplesortstate, perhaps by just inventing a reset method which
clears out tuples, then we could see performance exceed the standard
seqscan -> sort. The code as it stands seems to look up the sort
functions from the syscache for each group and then allocate some sort
space, so quite a bit of time is also spent in palloc0() and pfree().

If it was not possible to do this then maybe adding a cost to the number
of sort groups would be better so that the optimization is skipped if
there are too many sort groups.
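
The reuse idea above can be illustrated outside PostgreSQL with a minimal sketch. It assumes input that is already ordered by a leading column `a` and sorts each group by `b` using one buffer that is reset, not freed, between groups. The names `Row`, `SortState`, `sort_state_reset` and `partial_sort` are hypothetical and are not taken from the patch or from tuplesort.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical row type: input arrives already ordered by "a". */
typedef struct Row { int a; int b; } Row;

/* Hypothetical reusable sort state: one buffer kept across groups. */
typedef struct SortState {
    Row   *rows;
    size_t count;
    size_t capacity;
    size_t allocations;   /* number of times the buffer was (re)allocated */
} SortState;

/* Compare two rows by "b" only; "a" is equal within a group. */
static int cmp_b(const void *x, const void *y)
{
    const Row *l = x, *r = y;
    return (l->b > r->b) - (l->b < r->b);
}

/* Forget the stored rows but keep the memory for the next group. */
static void sort_state_reset(SortState *s) { s->count = 0; }

/* Append one row, growing the buffer only when it is full. */
static void sort_state_put(SortState *s, Row r)
{
    if (s->count == s->capacity) {
        size_t newcap = s->capacity ? s->capacity * 2 : 8;
        Row *p = realloc(s->rows, newcap * sizeof(Row));
        assert(p != NULL);   /* sketch only: no real error handling */
        s->rows = p;
        s->capacity = newcap;
        s->allocations++;
    }
    s->rows[s->count++] = r;
}

/* Order in[0..n) by (a, b), given that it is already ordered by a. */
static void partial_sort(const Row *in, size_t n, Row *out, SortState *s)
{
    size_t i = 0, o = 0;
    while (i < n) {
        int key = in[i].a;
        sort_state_reset(s);                 /* reuse, do not reallocate */
        while (i < n && in[i].a == key)
            sort_state_put(s, in[i++]);
        qsort(s->rows, s->count, sizeof(Row), cmp_b);
        for (size_t k = 0; k < s->count; k++)
            out[o++] = s->rows[k];
    }
}
```

With many small groups, the `allocations` counter stays at the number of buffer growths instead of rising with the number of groups, which is the saving being discussed.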

It should be possible. I have hacked a quick proof of concept for
reusing the tuplesort state. Can you try it and see if the performance
regression is fixed by this?

One thing which has to be fixed in my patch is that we probably want
to close the tuplesort once we have returned the last tuple from
ExecSort().

I have attached my patch and the incremental patch on Alexander's patch.

How does this work in combination with randomAccess?

As far as I can tell, randomAccess was broken by the partial sort patch even before my change, since it would not iterate over multiple tuplesorts anyway.

Alexander: Is this true or am I missing something?

Yes, I decided that the Sort node shouldn't provide randomAccess when skipCols != 0. See the assert at the beginning of ExecInitSort. I decided that it would be better to add an explicit Materialize node rather than store extra tuples in the tuplesortstate each time.
I also adjusted ExecSupportsMarkRestore and ExecMaterializesOutput to make the planner believe so. I found that path->pathtype is never T_Sort. Correct me if I'm wrong.
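
To illustrate the design choice in general terms, here is a minimal sketch of a forward-only producer wrapped by a separate materializing layer that supplies rewind. It is not PostgreSQL executor code; `ForwardSource`, `Materialize`, `mat_next` and `mat_rewind` are hypothetical names, and the integer rows stand in for tuples.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical forward-only source: yields 0..n-1 once, cannot rewind. */
typedef struct ForwardSource { int next; int n; } ForwardSource;

static int source_next(ForwardSource *s, int *value)
{
    if (s->next >= s->n)
        return 0;                 /* exhausted */
    *value = s->next++;
    return 1;
}

/* Hypothetical materializing wrapper: remembers every row it has pulled. */
typedef struct Materialize {
    ForwardSource *src;
    int   *store;
    size_t stored, capacity, pos;
} Materialize;

static int mat_next(Materialize *m, int *value)
{
    /* Replay a remembered row if we are behind the source. */
    if (m->pos < m->stored) {
        *value = m->store[m->pos++];
        return 1;
    }
    /* Otherwise pull a fresh row and remember it. */
    if (!source_next(m->src, value))
        return 0;
    if (m->stored == m->capacity) {
        size_t newcap = m->capacity ? m->capacity * 2 : 8;
        int *p = realloc(m->store, newcap * sizeof(int));
        assert(p != NULL);        /* sketch only: no real error handling */
        m->store = p;
        m->capacity = newcap;
    }
    m->store[m->stored++] = *value;
    m->pos = m->stored;
    return 1;
}

/* Rewinding is possible only because the wrapper kept the rows. */
static void mat_rewind(Materialize *m) { m->pos = 0; }
```

The producer stays simple and pays nothing for storage unless a consumer actually needs to rescan, in which case the wrapper is placed on top of it.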

Other changes in this version of the patch:
1) Applied Marti Raudsepp's patch to not compare skipCols in tuplesort.
2) Adjusted the sort bound after processing buckets.
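
One reading of item 2 is that, under a LIMIT, each finished group reduces the number of rows still needed from later groups. The sketch below shows that reading only and is not the patch's code. `partial_sort_limit` and its parameters are hypothetical, and a full `qsort` per group stands in for a bounded sort.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical row type: input arrives already ordered by "a". */
typedef struct Row { int a; int b; } Row;

/* Compare two rows by "b" only; "a" is equal within a group. */
static int cmp_b(const void *x, const void *y)
{
    const Row *l = x, *r = y;
    return (l->b > r->b) - (l->b < r->b);
}

/*
 * Emit at most "limit" rows of in[0..n) in (a, b) order, given that the
 * input is already ordered by a. Returns the number of rows written to
 * out and reports in *consumed how many input rows were read.
 */
static size_t partial_sort_limit(const Row *in, size_t n, size_t limit,
                                 Row *out, size_t *consumed)
{
    Row   *buf = malloc((n ? n : 1) * sizeof(Row));
    size_t i = 0, emitted = 0;

    assert(buf != NULL);          /* sketch only: no real error handling */
    while (i < n && emitted < limit) {
        size_t bound = limit - emitted;   /* rows still needed */
        size_t cnt = 0;
        int    key = in[i].a;

        /* Collect one group of equal "a" values. */
        while (i < n && in[i].a == key)
            buf[cnt++] = in[i++];
        qsort(buf, cnt, sizeof(Row), cmp_b);

        /* Only the first "bound" rows of this group can be output. */
        for (size_t k = 0; k < cnt && k < bound; k++)
            out[emitted++] = buf[k];
    }
    *consumed = i;
    free(buf);
    return emitted;
}
```

Once the limit is reached, the remaining groups are never read, and the shrinking `bound` is what a bounded sort of each later group could exploit.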

------
With best regards,
Alexander Korotkov.