Re: [HACKERS] The case for removing replacement selection sort

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: [HACKERS] The case for removing replacement selection sort
Дата
Msg-id 11546c13-f194-5078-f5cd-9e44ef8f04b1@2ndquadrant.com
обсуждение исходный текст
Ответ на Re: [HACKERS] The case for removing replacement selection sort  (Peter Geoghegan <pg@bowt.ie>)
Ответы Re: [HACKERS] The case for removing replacement selection sort  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-hackers
On 09/11/2017 02:22 AM, Peter Geoghegan wrote:
> On Sun, Sep 10, 2017 at 5:07 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> I'm currently re-running the benchmarks we did in 2016 for 9.6, but
>> those are all sorts with a single column (see the attached script). But
>> it'd be good to add a few queries testing sorts with multiple keys. We
>> can either tweak some of the existing data sets + queries, or come up
>> with entirely new tests.
> 
> I see that work_mem is set like this in the script:
> 
> "for wm in '1MB' '8MB' '32MB' '128MB' '512MB' '1GB'; do"
> 
> I suggest that we forget about values over 32MB, since the question of
> how well quicksort does there was settled by your tests in 2016. I
> would also add '4MB' to the list of wm values that you'll actually
> test.

OK, so 1MB, 4MB, 8MB, 32MB?

> 
> Any case with input that is initially in random order or DESC sort 
> order is not interesting, either. I suggest you remove those, too.
> 

OK.

> I think we're only interested in benchmarks where replacement
> selection really does get its putative best case (no merge needed in
> the end). Any (almost) sorted cases (the only cases that you are
> interesting to test now) will always manage that, once you set
> replacement_sort_tuples high enough, and provided there isn't even a
> single tuple that is completely out of order. The "before" cases here
> should have a replacement_sort_tuples of 1 billion (so that we're sure
> to not have the limit prevent the use of replacement selection in the
> first place), versus the "after" cases, which should have a
> replacement_sort_tuples of 0 to represent my proposal (to represent
> performance in a world where replacement selection is totally
> removed).
> 

Ah, so you suggest doing all the tests on current master, by only
tweaking the replacement_sort_tuples value? I've been testing master vs.
your patch, but I guess setting replacement_sort_tuples=0 should have
the same effect.

I probably won't eliminate the random/DESC data sets, though. At least
not from the two smaller data sets - I want to do a bit of benchmarking
on Heikki's polyphase merge removal patch, and for that patch those data
sets are still relevant. Also, it's useful to have a subset of results
where we know we don't expect any change.

>> For the existing queries, I should have some initial results
>> tomorrow, at least for the data sets with 100k and 1M rows. The
>> tests with 10M rows will take much more time (it takes 1-2hours for
>> a single work_mem value, and we're testing 6 of them).
> 
> I myself don't see that much value in a 10M row test.
> 

Meh, more data is probably better. And with the reduced work_mem values
and skipping of random/DESC data sets it should complete much faster.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: [HACKERS] Still another race condition in recovery TAP tests