Re: [HACKERS] [PATCH] Incremental sort

Поиск
Список
Период
Сортировка
От Alexander Korotkov
Тема Re: [HACKERS] [PATCH] Incremental sort
Дата
Msg-id CAPpHfds+CL_Kwkxmk9nwG=pzAqoxYEnJJrRZFRoRNMRC6Hn+QQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] [PATCH] Incremental sort  (Thomas Munro <thomas.munro@enterprisedb.com>)
Список pgsql-hackers
On Wed, Nov 22, 2017 at 12:01 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>> I gather that you have
>> determined empirically that it's better to be able to sort groups of
>> at least MIN_GROUP_SIZE than to be able to skip the comparisons on the
>> leading attributes, but why is that the case?
>
> Right.  The issue that not only case of one tuple per group could cause
> overhead, but few tuples (like 2 or 3) is also case of overhead.  Also,
> overhead is related not only to sorting.  While investigate of regression
> case provided by Heikki [1], I've seen extra time spent mostly in extra
> copying of sample tuple and comparison with that.  In order to cope this
> overhead I've introduced MIN_GROUP_SIZE which allows to skip copying sample
> tuples too frequently.

I see.  I wonder if there could ever be a function like
ExecMoveTuple(dst, src).  Given the polymorphism involved it'd be
slightly complicated and you'd probably have a general case that just
copies the tuple to dst and clears src, but there might be a bunch of
cases where you can do something more efficient like moving a pointer
and pin ownership.  I haven't really thought that through and
there may be fundamental problems with it...

ExecMoveTuple(dst, src) would be good.  But, it would be hard to implement "moving a pointer and pin ownership" principle in our current infrastructure.  It's because source and destination can have different memory contexts.  AFAICS, we can't just move memory area between memory contexts: we have to allocate new area, then memcpy, and then deallocate old area.  
 
If you're going to push the tuples into the sorter every time, then I
guess there are some special cases that could allow future
optimisations: (1) if you noticed that every prefix was different, you
can skip the sort operation (that is, you can use the sorter as a dumb
tuplestore and just get the tuples out in the same order you put them
in; not sure if Tuplesort supports that but it presumably could),

In order to notice that every prefix is different, I have to compare every prefix.  But that may introduce an overhead.  So, there reason why I introduced MIN_GROUP_SIZE is exactly to not compare every prefix...
 
(2)
if you noticed that every prefix was the same (that is, you have only
one prefix/group in the sorter) then you could sort only on the suffix
(that is, you could somehow tell Tuplesort to ignore the leading
columns),

Yes, I did so before.  But again, after introducing MIN_GROUP_SIZE, I missed knowledge whether all the prefixes were the same or different.  This is why, I've to sort by full column list for now...

(3) as a more complicated optimisation for intermediate
group sizes 1 < n < MIN_GROUP_SIZE, you could somehow number the
groups with an integer that increments whenever you see the prefix
change, and somehow tell tuplesort.c to use that instead of the
leading columns.

That is interesting idea.  The reason we have an overhead in comparison with plain sort is that we do extra comparison (and copying), but knowledge of this comparison result is lost for sorting itself.  Thus, sorting can "reuse" prefix comparison, and overhead would be lower.  But the problem is that we have to reformat tuples before putting them into tuplesort.  I wonder if tuple reformatting could eat potential performance win...

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Korotkov
Дата:
Сообщение: Re: [HACKERS] Proposal for CSN based snapshots
Следующее
От: Alexander Korotkov
Дата:
Сообщение: Re: [HACKERS] [PATCH] Incremental sort