Обсуждение: How to prioritise walsender reading from pg_wal over WAL writes?

Поиск
Список
Период
Сортировка

How to prioritise walsender reading from pg_wal over WAL writes?

От
Alexey Bashtanov
Дата:
Hello,

I've got a postgres master node that receives a lot of writes, WAL 
written at 100MB/sec or more at times.
And when these load spikes happen streaming replication starts lagging.
It looks like the lag happens on sending stage, and is limited by the 
master pg_wal partition throughput.
It's an SSD RAID-1 but it was the same when it was an HDD RAID-1.
I tried to prioritize reads using deadline scheduler, but even extreme 
values don't change the situation:
iostat shows more bytes written than read, while the device is busy, 
often 90-100%.

Writers connect via a transaction-pooling-mode pgbouncer, so I can tune 
the number of parallel connections.
Replication works fine when I limit it to 3, which is quite low.
It works for me so far, but looks really inflexible to me,
as e.g. I'll have to allocate a separate pgbouncer server pool for less 
eager apps and for humans.

I increased wal_buffers to 256MB hoping that it'll reduce the disk load, 
but it probably works only for
initial lag accumulation, once the lag is there it's not going to help.

ionice'ing walsender to best-effort -2 didn't help either

Any ideas how to prioritize walsender reads over writes from wal writer 
and backends even if there are multiple quite active ones?

The problem happens only occasionally, so if you ask for more details it 
may take some time to reply.
Sorry about this.

Best, Alex



Re: How to prioritise walsender reading from pg_wal over WAL writes?

От
Alexey Bashtanov
Дата:
If it helps, here's the details of the hardware config.
The controller is AVAGO MegaRAID SAS 9361-4i,
the SSDs are INTEL SSDSC2KG960G8 (configured as a raid1).
Current scheduler used is deadline.
Currently XFS is mounted without nobarriers, but I'm going to set that when there's high load next time to see how it affects throughput.

Properties of the array
VD6 Properties :
==============
Strip Size = 256 KB
Number of Blocks = 1874329600
VD has Emulated PD = Yes
Span Depth = 1
Number of Drives Per Span = 2
Write Cache(initial setting) = WriteBack
Disk Cache Policy = Disk's Default
Encryption = None
Data Protection = Disabled
Active Operations = None
Exposed to OS = Yes
Creation Date = 28-10-2020
Creation Time = 05:17:16 PM
Emulation type = default
Cachebypass size = Cachebypass-64k
Cachebypass Mode = Cachebypass Intelligent
Is LD Ready for OS Requests = Yes
SCSI NAA Id = 600605b00af03650272c641ca30c3196

Best, Alex

Re: How to prioritise walsender reading from pg_wal over WAL writes?

От
Laurenz Albe
Дата:
On Wed, 2020-11-18 at 11:54 +0000, Alexey Bashtanov wrote:
> If it helps, here's the details of the hardware config.
> The controller is AVAGO MegaRAID SAS 9361-4i,
> the SSDs are INTEL SSDSC2KG960G8 (configured as a raid1).
> Current scheduler used is deadline.
> Currently XFS is mounted without nobarriers, but I'm going to set that when there's high load next time to see how it
affectsthroughput.
 
> 
> Properties of the array
> VD6 Properties :
> ==============
> Strip Size = 256 KB
> Number of Blocks = 1874329600
> VD has Emulated PD = Yes
> Span Depth = 1
> Number of Drives Per Span = 2
> Write Cache(initial setting) = WriteBack
> Disk Cache Policy = Disk's Default
> Encryption = None
> Data Protection = Disabled
> Active Operations = None
> Exposed to OS = Yes
> Creation Date = 28-10-2020
> Creation Time = 05:17:16 PM
> Emulation type = default
> Cachebypass size = Cachebypass-64k
> Cachebypass Mode = Cachebypass Intelligent
> Is LD Ready for OS Requests = Yes
> SCSI NAA Id = 600605b00af03650272c641ca30c3196

Why??

WAL buffers has the most recent information, so that would result in unnecessary delay and I/O.

You'd have to hack the code, but I wonder what leads you to this interesting requirement.

Yours,
Laurenz Albe
-- 
+43-670-6056265
CYBERTEC PostgreSQL International GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com




Re: How to prioritise walsender reading from pg_wal over WAL writes?

От
Alexey Bashtanov
Дата:
Sorry Laurenz,

My reply did not get threaded appropriately.
My original question was here:
https://www.postgresql.org/message-id/a74d5732-60fd-d18b-05fd-7b2b97099f19%40imap.cc
I'd like to prioritize walsender for replication not to lag too much.
Otherwise, when I have load spikes on master, standby lags, sometimes by 
hundreds of gigabytes.

Best, Alex



Re: How to prioritise walsender reading from pg_wal over WAL writes?

От
Laurenz Albe
Дата:
On Wed, 2020-11-18 at 12:15 +0000, Alexey Bashtanov wrote:
> My reply did not get threaded appropriately.
> My original question was here:
> https://www.postgresql.org/message-id/a74d5732-60fd-d18b-05fd-7b2b97099f19%40imap.cc
> I'd like to prioritize walsender for replication not to lag too much.
> Otherwise, when I have load spikes on master, standby lags, sometimes by 
> hundreds of gigabytes.

I would first determine where the bottleneck is.

Is it really the walsender, or is it on the network or in the standby server's replay?

Check the difference between "sent_lsn", "replay_lsn" from "pg_stat_replication" and
pg_current_wal_lsn() on the primary.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: How to prioritise walsender reading from pg_wal over WAL writes?

От
Alexey Bashtanov
Дата:
> I would first determine where the bottleneck is.
>
> Is it really the walsender, or is it on the network or in the standby server's replay?
It is really the walsender, and it really is the performance of the WAL 
storage on the master.
> Check the difference between "sent_lsn", "replay_lsn" from "pg_stat_replication" and
> pg_current_wal_lsn() on the primary.
Yes I've checked these numbers, the lagging one is sent_lsn.
It doesn't look like it's hitting network capacity either.
When we moved it to an NVMe as a short-term solution it worked fine.

Best, Alex