Re: Using quicksort for every external sort run

From: Greg Stark
Subject: Re: Using quicksort for every external sort run
Date:
Msg-id: CAM-w4HM4XW3u5kVEuUrr+L+KX3WZ=5JKk0A=DJjzypkB-Hyu4w@mail.gmail.com
In reply to: Re: Using quicksort for every external sort run  (Peter Geoghegan <pg@heroku.com>)
Responses: Re: Using quicksort for every external sort run  (Greg Stark <stark@mit.edu>)
List: pgsql-hackers

So incidentally I've been running some benchmarks myself, mostly to understand the current scaling behaviour of sorting and to better judge whether Peter's analysis of where the pain points are, and why we shouldn't worry about optimizing for the multiple-merge-pass case, is on target. I haven't actually benchmarked his patch at all, just stock HEAD so far.

The really surprising result (for me) so far is that merge passes apparently spend very little time doing I/O. I had always assumed most of the time was spent waiting on I/O, and that that's why we spend so much effort ensuring sequential I/O and trying to maximize run lengths. I was expecting to see a huge step increase in the total time whenever the number of merge passes increased. However, I see hardly any increase, and sometimes even a decrease, despite the extra pass. The time generally increases as work_mem decreases, but the slope is pretty moderate and gradual, with no big steps due to extra passes.

On further analysis I'm less surprised by this than I was at first. The larger benchmarks I'm running are on a 7GB table which only actually generates 2.6GB of sort data, so even writing all of that out and then reading it all back in on a 100MB/s disk would only take an extra ~50s. That won't make a big dent when the whole sort takes about 30 minutes. Even if you assume there's a substantial amount of random I/O, it'll only be a 10% difference or so, which is more or less in line with what I'm seeing.
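As a rough sanity check, the arithmetic works out like this (a back-of-the-envelope sketch only, using the round figures above: 2.6GB of run data, 100MB/s sequential disk, ~30 minutes total):

    -- one extra merge pass writes and then reads the run data once more
    SELECT round(2.6 * 1024 * 2 / 100)                   AS extra_seq_io_seconds,  -- ~53s
           round(100 * (2.6 * 1024 * 2 / 100) / 1800, 1) AS pct_of_total;          -- ~3%

So purely sequential it's about 3% of the total, and even a healthy amount of random I/O only stretches that to roughly the 10% mentioned above.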

I haven't actually got to benchmarking Peter's patch at all, but this reinforces his argument dramatically. If the worst case for using quicksort is that the shorter runs might push us into doing an extra merge, and that might add an extra 10% to the run-time, then that cost will be easily counterbalanced by the faster quicksort; and in any case it only affects people who for some reason can't just increase work_mem enough to allow a single merge.


                        ------------------------ work_mem ------------------------
Table Size   Sort Size   128MB     64MB      32MB      16MB      8MB       4MB
6914MB       2672 MB     3392.29   3102.13   3343.53   4081.23   4727.74   5620.77
3457MB       1336 MB     1669.16   1593.85   1444.22   1654.27   2076.74   2266.84
2765MB       1069 MB     1368.92   1250.44   1117.2    1293.45   1431.64   1772.18
1383MB        535 MB      716.48    625.06    557.14    575.67    644.2     721.68
 691MB        267 MB      301.08    295.87    266.84    256.29    283.82    292.24
 346MB        134 MB      145.48    149.48    133.23    130.69    127.67    137.74
  35MB         13 MB        3.58     16.77     11.23     11.93     13.97      3.17

The colours are to give an idea of the number of merge passes. Grey is an internal sort, white is a single merge, and yellow and red are successively more merges (though the exact boundary between yellow and red may not be exactly meaningful due to my misunderstanding of polyphase merge).

The numbers here are elapsed seconds, taken from the "elapsed" figure in log lines like the one below, produced by running queries like the following with trace_sort enabled:
LOG:  external sort ended, 342138 disk blocks used: CPU 276.04s/3173.04u sec elapsed 5620.77 sec
STATEMENT:  select count(*) from (select * from n200000000 order by r offset 99999999999) AS x;
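For anyone who wants to reproduce this, each cell was generated with something along these lines (a sketch only; the work_mem value shown is just one of the columns above, and the "elapsed" figure then comes out of the server log):

    SET trace_sort = on;
    SET work_mem = '4MB';   -- varied per column, from 128MB down to 4MB
    SELECT count(*)
      FROM (SELECT * FROM n200000000 ORDER BY r OFFSET 99999999999) AS x;
    -- then grep the log for the "external sort ended ... elapsed" line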

This was run on the smallest size VM on Google Compute Engine with 600MB of virtual RAM and a 100GB virtual network block device.
