Re: [HACKERS] [PATCH] Incremental sort

Поиск

Список

Период

Сортировка

От	Alexander Korotkov
Тема	Re: [HACKERS] [PATCH] Incremental sort
Дата	14 сентября 2017 г. 05:48:42
Msg-id	CAPpHfdtvx1oVJHCUe1x-R=t55Uz4uBi9i55BcoTJ7zZN_7Gxfw@mail.gmail.com обсуждение исходный текст
Ответ на	Re: [HACKERS] [PATCH] Incremental sort (Robert Haas <robertmhaas@gmail.com>)
Ответы	Re: [HACKERS] [PATCH] Incremental sort (Alexander Korotkov <a.korotkov@postgrespro.ru>) Re: [HACKERS] [PATCH] Incremental sort (Antonin Houska <ah@cybertec.at>)
Список	pgsql-hackers

Дерево обсуждения

On Mon, May 8, 2017 at 6:51 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, May 5, 2017 at 11:13 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> Incremental sort is faster in vast majority of cases. It appears to be
> slower only when whose dataset is one sort group. In this case incremental
> sort is useless, and it should be considered as misuse of incremental sort.
> Slowdown is related to the fact that we anyway have to do extra comparisons,
> unless we somehow push our comparison result into qsort itself and save some
> cpu cycles (but that would be unreasonable break of encapsulation). Thus,
> in such cases regression seems to be inevitable anyway. I think we could
> evade this regression during query planning. If we see that there would be
> only few groups, we should choose plain sort instead of incremental sort.

I'm sorry that I don't have time to review this in detail right now,
but it sounds like you are doing good work to file down cases where
this might cause regressions, which is great.

Thank you for paying attention to this patch!

Regarding the point in
the paragraph above, I'd say that it's OK for the planner to be
responsible for picking between Sort and Incremental Sort in some way.
It is, after all, the planner's job to decide between different
strategies for executing the same query and, of course, sometimes it
will be wrong, but that's OK as long as it's not wrong too often (or
by too much, hopefully).

Right, I agree.

It may be a little difficult to get this
right, though, because I'm not sure that the information you need
actually exists (or is reliable). For example, consider the case
where we need to sort 100m rows and there are 2 groups. If 1 group
contains 1 row and the other group contains all of the rest, there is
really no point in an incremental sort. On the other hand, if each
group contains 50m rows and we can get the data presorted by the
grouping column, there might be a lot of point to an incremental sort,
because two 50m-row sorts might be a lot cheaper than one 100m sort.

More generally, it's quite easy to imagine situations where the
individual groups can be quicksorted but sorting all of the rows
requires I/O, even when the number of groups isn't that big. On the
other hand, the real sweet spot for this is probably the case where
the number of groups is very large, with many single-row groups or
many groups with just a few rows each, so if we can at least get this
to work in those cases that may be good enough. On the third hand,
when costing aggregation, I think we often underestimate the number of
groups and there might well be similar problems here.

I agree with that. I need to test this patch more carefully in the case when groups have different sizes. It's likely I need to add yet another parameter to my testing script: groups count skew.

Patch rebased to current master is attached. I'm going to improve my testing script and post new results.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com

The Russian Postgres Company

Вложения

incremental-sort-8.patch

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Tom Lane
Дата: 14 сентября 2017 г., 05:48:29
Сообщение: Re: [HACKERS] psql: new help related to variables are not too readable

Следующее

От: Thomas Munro
Дата: 14 сентября 2017 г., 05:57:32
Сообщение: Re: [HACKERS] Parallel Hash take II

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: [HACKERS] [PATCH] Incremental sort

Вложения

Предыдущее

Следующее