Re: [HACKERS] [PATCH] Incremental sort

Поиск
Список
Период
Сортировка
От Alexander Korotkov
Тема Re: [HACKERS] [PATCH] Incremental sort
Дата
Msg-id CAPpHfdtKHETXhf062CPvkjpG1wnjQ7rv4uLhZgYQ6VZjwqDYpg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] [PATCH] Incremental sort  (Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru>)
Ответы Re: [HACKERS] [PATCH] Incremental sort  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Список pgsql-hackers
On Fri, Apr 6, 2018 at 11:40 PM, Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> wrote:
On 06.04.2018 20:26, Tomas Vondra wrote:
I personally am OK with reducing the scope of the patch like this. It's
still beneficial for the common ORDER BY + LIMIT case, which is good. I
don't think it may negatively affect other cases (at least I can't think
of any).

I think we can reduce it even further. Just try incremental sort along with full sort over the cheapest path in create_ordered_paths, and don't touch anything else. This is a very minimal and a probably safe start, and then we can continue working on other, more complex cases. In the attached patch I tried to do this. We probably should also remove changes in make_sort() and create a separate function make_incremental_sort() for it, but I'm done for today.
 
I've done further unwedding of sort and incremental sort providing them separate function for plan createion.

2) Likewise, I've suggested that the claim about abbreviated keys in
nodeIncrementalsort.c is dubious. No response, and the XXX comment was
instead merged into the patch:

  * XXX The claim about abbreviated keys seems rather dubious, IMHO.

Not sure about that, maybe just use abbreviated keys for the first version? Later we can research this more closely and maybe start deciding whether to use abbrev on planning stage.

That comes from time when we're trying to make incremental sort to be always
not worse than full sort.  Now, we have separate paths for full and incremental sorts,
and some costing penalty for incremental sort.  So, incremental sort should be
selected only when it's expected to give big win.  Thus, we can give up with this
optimization at least in the initial version.

So, removed.

4) It's not clear to me why INITIAL_MEMTUPSIZE is defined the way it is.
There needs to be a comment - the intent seems to be making it large
enough to exceed ALLOCSET_SEPARATE_THRESHOLD, but it's not quite clear
why that's a good idea.

Not sure myself, let's ask the other Alexander.
 
I've added comment to INITIAL_MEMTUPSIZE.  However, to be fair it's not
invention of this patch.  Initial size of memtuples array was so previously.
What this patch did is just move it to the macro.

Also, this note hadn't been adressed yet.

On Sat, Mar 31, 2018 at 11:43 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
I'm wondering if a static MIN_GROUP_SIZE is good idea. For example, what
if the subplan is expected to return only very few tuples (say, 33), but
the query includes LIMIT 1. Now, let's assume the startup/total cost of
the subplan is 1 and 1000000. With MIN_GROUP_SIZE 32 we're bound to
execute it pretty much till the end, while we could terminate after the
first tuple (if the prefix changes).

So I think we should use a Min(limit,MIN_GROUP_SIZE) here, and perhaps
this should depend on average group size too.

I agree with that.  For bounded sort, attached patch now selects minimal group
size as Min(DEFAULT_MIN_GROUP_SIZE, bound).  That should improve
"LIMIT small_number" case.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jesper Pedersen
Дата:
Сообщение: Re: [HACKERS] Runtime Partition Pruning
Следующее
От: Tom Lane
Дата:
Сообщение: Re: WIP: a way forward on bootstrap data