Re: Re: Abbreviated keys for Datum tuplesort

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: Re: Abbreviated keys for Datum tuplesort
Дата
Msg-id CAM3SWZTiZt+dfvkYArHnYvjmc6ewuQyXawp4ZWfghhVaXMjP-Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Re: Abbreviated keys for Datum tuplesort  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Thu, Apr 2, 2015 at 11:21 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Apr 2, 2015 at 5:34 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> I think it's explained by the pre-check for sorted input making the
>> number of comparisons exactly n -1. As I pointed out to Tomas, if you
>> put a single, solitary unsorted element at the end, the abbreviated
>> version is then 8x faster (maybe that was in relation to a slightly
>> different case, but the pattern is the same). So this case isn't an
>> argument against datum abbreviation, or even abbreviation in general,
>> but rather an argument against using strxfrm() in general (which for
>> example the GCC docs strongly recommend for sorting lists of strings).
>> It's a bad argument, IMV. This sort is already extremely fast.
>
> OK, I see.

Wait. It's also due to the fact that some cases are spuriously
aborted, because the cost model for text needs to be fixed, I believe.
It's not clear that this is actually such a case from Tomas' remarks,
because the smaller scale tested *did* win significantly, and that
appeared to be otherwise comparable to the regressed cases. So
although what I describe here about perfectly pre-sorted input is
something I've seen (and something that we discussed on another thread
IIRC), this is not an example of it. Again, this is just an example of
why we need to fix the cost model for text.

-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: Re: Abbreviated keys for Datum tuplesort
Следующее
От: Petr Jelinek
Дата:
Сообщение: Unused variable in hashpage.c