Re: Monitoring gaps in XLogWalRcvWrite() for the WAL receiver
From | Bertrand Drouvot |
---|---|
Subject | Re: Monitoring gaps in XLogWalRcvWrite() for the WAL receiver |
Date | |
Msg-id | Z8gFnH4o3jBm5BRz@ip-10-97-1-34.eu-west-3.compute.internal |
In response to | Monitoring gaps in XLogWalRcvWrite() for the WAL receiver (Michael Paquier <michael@paquier.xyz>) |
Responses |
Re: Monitoring gaps in XLogWalRcvWrite() for the WAL receiver
Re: Monitoring gaps in XLogWalRcvWrite() for the WAL receiver |
List | pgsql-hackers |
Hi,

On Wed, Mar 05, 2025 at 12:35:26PM +0900, Michael Paquier wrote:
> Hi all,
>
> While doing some monitoring of a replication setup for a stable
> branch, I have been surprised by the fact that we have never tracked
> WAL statistics for the WAL receiver in pg_stat_wal because we have
> never bothered to update its code so as WAL stats are reported.

Nice catch!

> This
> is relevant for the write and sync counts and timings.

Also for sync? sync looks fine as issue_xlog_fsync() is being called in
XLogWalRcvFlush(), no?

> As of f4694e0f35b2, the situation is better thanks to the addition of
> a pgstat_report_wal() in the WAL receiver main loop, so we have some
> data. However, we are only able to gather the data for segment syncs
> and initializations, not the writes themselves as these are managed by
> an independent code path, XLogWalRcvWrite().
>
> A second thing that lacks in XLogWalRcvWrite() is a wait event around
> the pg_pwrite() call, which is useful as the WAL receiver is listed in
> pg_stat_activity. Note that it is possible to re-use the same wait
> event as XLogWrite() for the WAL receiver, WAL_WRITE, because the WAL
> receiver does not rely on the write and flush calls from xlog.c when
> doing its work, and both have the same meaning, aka they write WAL.
> The fsync calls use issue_xlog_fsync() and the segment inits happen in
> XLogFileInit().
>
> Perhaps there's a point in backpatching a portion of what's in the
> attached patch (the wait event?), but I am not planning to bother much
> with the stable branches based on the lack of complaints.

We're not emitting some statistics, so I think that it's hard for users to
complain about something they don't/can't see.

> If you
> have an opinion about that, please feel free.

I'm tempted to say that the wal receiver part of f4694e0f35b2 should be
backpatched as well as what you're doing here.

+        /*
+         * Measure I/O timing to write WAL data, for pg_stat_io.
+         */
+        start = pgstat_prepare_io_time(track_wal_io_timing);
+
+        pgstat_report_wait_start(WAIT_EVENT_WAL_WRITE);
         byteswritten = pg_pwrite(recvFile, buf, segbytes, (off_t) startoff);
+        pgstat_report_wait_end();

Same logic as in XLogWrite() and I don't think there is a need for a
dedicated wait event, so LGTM.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
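
For illustration only, the pattern discussed in the hunk above (start a timer, flag a wait event around the positional write, clear it, then account for the write) can be sketched outside of PostgreSQL. This is a minimal standalone sketch and not the patch itself: every demo_* name, the DemoWriteStats struct and the DemoWaitEvent enum are hypothetical stand-ins, plain POSIX pwrite() replaces pg_pwrite(), and counting only successful writes is a choice made for this sketch.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* pwrite(), clock_gettime(), fileno() */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for a per-process wait-event slot. */
typedef enum { DEMO_WAIT_NONE = 0, DEMO_WAIT_WAL_WRITE } DemoWaitEvent;

/* Hypothetical counters: number of writes, bytes written, time spent. */
typedef struct DemoWriteStats
{
    uint64_t writes;
    uint64_t bytes;
    uint64_t nanos;
} DemoWriteStats;

DemoWaitEvent demo_current_wait = DEMO_WAIT_NONE;

/* Monotonic clock reading in nanoseconds. */
uint64_t
demo_now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t) ts.tv_sec * 1000000000ULL + (uint64_t) ts.tv_nsec;
}

/*
 * Write len bytes at the given offset.  The wait event is set only for
 * the duration of the pwrite() call; counters are updated on success.
 */
ssize_t
demo_instrumented_pwrite(int fd, const void *buf, size_t len, off_t offset,
                         int track_timing, DemoWriteStats *stats)
{
    uint64_t start = 0;
    ssize_t  written;

    assert(stats != NULL);

    /* Read the clock only when timing is enabled, as clock calls cost. */
    if (track_timing)
        start = demo_now_ns();

    demo_current_wait = DEMO_WAIT_WAL_WRITE;
    written = pwrite(fd, buf, len, offset);
    demo_current_wait = DEMO_WAIT_NONE;

    if (written > 0)
    {
        stats->writes++;
        stats->bytes += (uint64_t) written;
        if (track_timing)
            stats->nanos += demo_now_ns() - start;
    }
    return written;
}

/* Usage: write 5 bytes into an anonymous temporary file. */
ssize_t
demo_write_sample(DemoWriteStats *stats)
{
    FILE    *tmp = tmpfile();
    ssize_t  written;

    if (tmp == NULL)
        return -1;
    written = demo_instrumented_pwrite(fileno(tmp), "hello", 5, 0, 1, stats);
    fclose(tmp);
    return written;
}
```

Keeping the wait event tight around the write call means a monitoring view would only show it while the process is actually blocked on I/O, which is the property the quoted hunk relies on.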