Re: [HACKERS] Detrimental performance impact of ringbuffers on performance

From: Andres Freund
Subject: Re: [HACKERS] Detrimental performance impact of ringbuffers on performance
Msg-id: 20190507201619.lnyg2nyhmpxcgeau@alap3.anarazel.de
In reply to: Detrimental performance impact of ringbuffers on performance  (Andres Freund <andres@anarazel.de>)
Responses: Re: [HACKERS] Detrimental performance impact of ringbuffers on performance  (Robert Haas <robertmhaas@gmail.com>)
           Re: [HACKERS] Detrimental performance impact of ringbuffers on performance  (Andrey Borodin <x4mmm@yandex-team.ru>)
List: pgsql-hackers
Hi,

On 2016-04-06 12:57:16 +0200, Andres Freund wrote:
> While benchmarking on hydra
> (c.f. http://archives.postgresql.org/message-id/20160406104352.5bn3ehkcsceja65c%40alap3.anarazel.de),
> which has quite slow IO, I was once more annoyed by how incredibly long
> the vacuum at the end of a pgbench -i takes.
> 
> The issue is that, even for an entirely shared_buffers resident scale,
> essentially no data is cached in shared buffers. The COPY to load data
> uses a 16MB ringbuffer. Then vacuum uses a 256KB ringbuffer. Which means
> that copy immediately writes and evicts all data. Then vacuum reads &
> writes the data in small chunks; again evicting nearly all buffers. Then
> the creation of the primary key has to read that data *again*.
> 
> That's fairly idiotic.
> 
> While it's not easy to fix this in the general case - we introduced those
> ringbuffers for a reason, after all - I think we should at least add a
> special case for loads where shared_buffers isn't fully used yet.  Why
> not skip using buffers from the ringbuffer if there are buffers on the
> freelist? If we add buffers gathered from there to the ring, we
> should have few cases that regress.
> 
> Additionally, maybe we ought to increase the ringbuffer sizes again one
> of these days? 256kb for VACUUM is pretty damn low.
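
As a rough, untested illustration of the quoted idea - skip the ring while
shared_buffers still has free buffers - the check could sit at the top of
StrategyGetBuffer() in src/backend/storage/buffer/freelist.c, which already
consults the ring before anything else. A sketch of that spot (not an actual
patch; have_free_buffer() is the lockless freelist check that freelist.c
already exposes for autoprewarm):

    /* Only fall back to the strategy ring once the freelist is empty,
     * i.e. once shared_buffers is fully in use. */
    if (strategy != NULL && !have_free_buffer())
    {
        buf = GetBufferFromRing(strategy, buf_state);
        if (buf != NULL)
            return buf;
    }

    /* ... rest of StrategyGetBuffer() unchanged: wake the bgwriter if
     * needed, take from the freelist, else run the clock sweep.  Buffers
     * handed out from the freelist could additionally be added to the
     * ring, so they get reused once the freelist runs dry. ... */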

Just to attach some numbers for this. On my laptop, with a pretty fast
disk (as in ~550MB/s read + write, limited by SATA, not the disk), I get
these results.

I initialized a cluster with pgbench -q -i -s 1000, and VACUUM FREEZEd
pgbench_accounts. I ensured that enough WAL files were pre-allocated that
neither of the test runs had to allocate WAL files.

I first benchmarked master, and then in a second run neutered
GetAccessStrategy() by returning NULL in the BAS_BULKWRITE and BAS_VACUUM
cases.
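
To be concrete about the second run: GetAccessStrategy() in
src/backend/storage/buffer/freelist.c picks a ring size per strategy type,
and returning NULL instead means "no ring", i.e. normal shared_buffers
replacement. Roughly (a sketch, not the exact diff), the switch in there
becomes:

    switch (btype)
    {
        case BAS_NORMAL:
            return NULL;    /* unchanged: no ring for normal accesses */
        case BAS_BULKREAD:
            ring_size = 256 * 1024 / BLCKSZ;    /* unchanged: 256kB ring */
            break;
        case BAS_BULKWRITE:
            return NULL;    /* neutered: normally a 16MB ring */
        case BAS_VACUUM:
            return NULL;    /* neutered: normally a 256kB ring */
        default:
            elog(ERROR, "unrecognized buffer access strategy: %d",
                 (int) btype);
            return NULL;    /* keep compiler quiet */
    }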

master:

postgres[949][1]=# CREATE TABLE pgbench_accounts_copy AS SELECT * FROM pgbench_accounts ;
SELECT 100000000
Time: 199803.198 ms (03:19.803)
postgres[949][1]=# VACUUM VERBOSE pgbench_accounts_copy;
INFO:  00000: vacuuming "public.pgbench_accounts_copy"
LOCATION:  lazy_scan_heap, vacuumlazy.c:535
INFO:  00000: "pgbench_accounts_copy": found 0 removable, 100000000 nonremovable row versions in 1639345 out of 1639345 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 4888968
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 13.31 s, system: 12.82 s, elapsed: 57.86 s.
LOCATION:  lazy_scan_heap, vacuumlazy.c:1500
VACUUM
Time: 57890.969 ms (00:57.891)
postgres[949][1]=# VACUUM FREEZE VERBOSE pgbench_accounts_copy;
INFO:  00000: aggressively vacuuming "public.pgbench_accounts_copy"
LOCATION:  lazy_scan_heap, vacuumlazy.c:530
INFO:  00000: "pgbench_accounts_copy": found 0 removable, 100000000 nonremovable row versions in 1639345 out of 1639345 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 4888968
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 25.21 s, system: 33.45 s, elapsed: 185.76 s.
LOCATION:  lazy_scan_heap, vacuumlazy.c:1500
Time: 185786.829 ms (03:05.787)

So 199803.198 + 57890.969 + 185786.829 ms, about 443.5 s in total.


no-copy/vacuum-ringbuffers:

postgres[5372][1]=# CREATE TABLE pgbench_accounts_copy AS SELECT * FROM pgbench_accounts ;
SELECT 100000000
Time: 143109.959 ms (02:23.110)
postgres[5372][1]=# VACUUM VERBOSE pgbench_accounts_copy;
INFO:  00000: vacuuming "public.pgbench_accounts_copy"
LOCATION:  lazy_scan_heap, vacuumlazy.c:535
INFO:  00000: "pgbench_accounts_copy": found 0 removable, 100000000 nonremovable row versions in 1639345 out of 1639345 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 4888971
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 8.43 s, system: 0.01 s, elapsed: 8.49 s.
LOCATION:  lazy_scan_heap, vacuumlazy.c:1500
VACUUM
Time: 8504.410 ms (00:08.504)
postgres[5372][1]=# VACUUM FREEZE VERBOSE pgbench_accounts_copy;
INFO:  00000: aggressively vacuuming "public.pgbench_accounts_copy"
LOCATION:  lazy_scan_heap, vacuumlazy.c:530
INFO:  00000: "pgbench_accounts_copy": found 0 removable, 100000000 nonremovable row versions in 1639345 out of 1639345 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 4888971
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 9.07 s, system: 0.78 s, elapsed: 14.22 s.
LOCATION:  lazy_scan_heap, vacuumlazy.c:1500
VACUUM
Time: 14235.619 ms (00:14.236)

So 143109.959 + 8504.410 + 14235.619 ms, about 165.8 s in total.


The relative improvements are:
CREATE TABLE AS: 199803.198 -> 143109.959 ms: 39% improvement
VACUUM: 57890.969 -> 8504.410 ms: 580% improvement
VACUUM FREEZE: 185786.829 -> 14235.619 ms: 1205% improvement
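
(Here the percentages are old time divided by new time, minus one; e.g. for
VACUUM FREEZE, 185786.829 / 14235.619 is about 13.05, i.e. roughly a 13x
speedup, which is the 1205% above.)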

And even if you were to argue - which I don't find entirely convincing -
that the time of the subsequent checkpoint should be added to the second
run's numbers, that's *still* *much* faster:

postgres[5372][1]=# CHECKPOINT ;
Time: 33592.877 ms (00:33.593)


We probably can't remove the ringbuffer concept from these places, but I
think we should allow users to disable them. Forcing bulk loads, vacuum, and
analytics queries to go to the OS/disk just because of a heuristic that
can't be disabled - yielding massive slowdowns - really sucks.
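
For illustration only, the simplest possible off switch could look roughly
like the sketch below; the GUC name enable_bulk_ringbuffers and its
placement are hypothetical - no such setting exists - but it shows how
little mechanism would be needed:

    /* Hypothetical GUC: when off, bulk operations use normal
     * shared_buffers replacement instead of a ring.  Would also need a
     * ConfigureNamesBool entry in guc.c. */
    bool        enable_bulk_ringbuffers = true;

    BufferAccessStrategy
    GetAccessStrategy(BufferAccessStrategyType btype)
    {
        /* With the switch off, behave as if BAS_NORMAL had been
         * requested, i.e. hand back no ring at all. */
        if (!enable_bulk_ringbuffers)
            return NULL;

        /* ... existing per-strategy ring sizing follows unchanged ... */
    }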


Small aside: it really sucks that right now we force each relation to
essentially be written twice, even leaving hint bits and freezing aside:
once when we fill it with zeroes (the smgrextend() call in
ReadBuffer_common()), and then again later with the actual contents.
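
For anybody not familiar with that path, this is roughly what the
relation-extension branch of ReadBuffer_common() in bufmgr.c does
(paraphrased and simplified, not the literal source):

    if (isExtend)
    {
        /* new buffers are zero-filled ... */
        MemSet((char *) bufBlock, 0, BLCKSZ);
        /* ... and the zero page is written out via smgrextend() just to
         * grow the file on disk; the real contents get written a second
         * time later, when the dirtied buffer is eventually flushed. */
        smgrextend(smgr, forkNum, blockNum, (char *) bufBlock, false);
    }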

Greetings,

Andres Freund


