Re: [HACKERS] The case for removing replacement selection sort

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: [HACKERS] The case for removing replacement selection sort
Дата
Msg-id CAH2-WzkoKKtqOhDhR0MuG9yhtN7vkxi5PY4gDa0DY2DTJf486w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] The case for removing replacement selection sort  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: [HACKERS] The case for removing replacement selection sort  (Greg Stark <stark@mit.edu>)
Список pgsql-hackers
On Thu, Sep 7, 2017 at 2:49 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Sep 7, 2017 at 5:38 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> Do we need/want to repeat some of that benchmarking on these patches? I
>> don't recall how much this code changed since those benchmarks were done
>> in the 9.6 cycle.
>
> +1 for some new benchmarking.  I'm all for removing this code if we
> don't need it any more, but it'd be a lot better to have more numbers
> before deciding that.

It isn't always a win. For example, if I alter the pgbench_accounts
table so that the column "aid" is of type numeric, the picture changes
for my test case -- replacement selection is still somewhat faster for
the "select count(distinct aid) from pgbench_accounts" query with
work_mem=2MB. It takes about 5.4 seconds with replacement selection,
versus 7.4 seconds for quicksorting. But, I still think we should
proceed with my patch, because:

* It's still faster with int4/int8 comparisons on modern hardware, and
I think that most ordered inputs will be of those types in practice.
The trade-off between those two properties (faster for int4/int8 when
ordered, slower for numeric) recommends killing replacement selection
entirely. It's not a slam dunk, but it makes sense on performance
grounds, IMV.

* The comparator numeric_fast_cmp() is not well optimized -- it
doesn't avoid allocating memory with each call, for example, unlike
varstrfastcmp_locale(). This could and should be improved, and might
even change the picture for this second test case.

* With the default work_mem of 8MB, we never use replacement selection
anyway. Whatever its merits, it's pretty rare to use replacement
selection simply because the default replacement_sort_tuples just
isn't that  many tuples (150,000).

* The upside of my patch is not inconsiderable: We can remove a lot of
code, which will enable further improvements in the future, which are
far more compelling (cleaner memory management, the use of memory
batches during run generation).

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Bossart, Nathan"
Дата:
Сообщение: Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands
Следующее
От: Simon Riggs
Дата:
Сообщение: Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan