Re: [HACKERS] Measuring replay lag

From: Thomas Munro
Subject: Re: [HACKERS] Measuring replay lag
Msg-id: CAEepm=2J0VVX6wSbGtCPkdM-MenN90aa8WduNabYG6hYRP-CaQ@mail.gmail.com
In reply to: Re: [HACKERS] Measuring replay lag  (Simon Riggs <simon@2ndquadrant.com>)
Responses: Re: [HACKERS] Measuring replay lag  (Thomas Munro <thomas.munro@enterprisedb.com>)
List: pgsql-hackers

On Wed, Jan 4, 2017 at 8:58 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 3 January 2017 at 23:22, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>
>>> I don't see why that would be unacceptable. If we do it for
>>> remote_apply, why not also do it for other modes? Whatever the
>>> reasoning was for remote_apply should work for other modes. I should
>>> add it was originally designed to be that way by me, so must have been
>>> changed later.
>>
>> You can achieve that with this patch by setting
>> replication_lag_sample_interval to 0.
>
> I wonder why you ignore my mention of the bug in the correct mechanism?

I didn't have an opinion on that yet, but looking now I think there is
no bug:  I was wrong about the current reply frequency.  This comment
above XLogWalRcvSendReply confused me:

 * If 'force' is not set, the message is only sent if enough time has
 * passed since last status update to reach wal_receiver_status_interval.

Actually it's sent if 'force' is set, enough time has passed, or
either of the write or flush positions has moved.  So we're already
sending replies after every write and flush, as you said we should.
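
To make that condition concrete, here is a minimal sketch of the decision
as described above.  It is not the walreceiver.c code: ReplyState,
should_send_reply and the millisecond clock are hypothetical names chosen
for illustration, and only the three-way condition comes from the
paragraph above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real walreceiver.c types. */
typedef uint64_t XLogRecPtr;

typedef struct ReplyState
{
    XLogRecPtr  last_write;     /* write position sent in the last reply */
    XLogRecPtr  last_flush;     /* flush position sent in the last reply */
    int64_t     last_send_ms;   /* time of the last reply, in milliseconds */
} ReplyState;

/*
 * Decide whether a reply should be sent now.  A reply goes out if 'force'
 * is set, if the write or flush position has moved since the last reply,
 * or if at least status_interval_ms has passed since the last reply.
 */
bool
should_send_reply(const ReplyState *st, XLogRecPtr write, XLogRecPtr flush,
                  int64_t now_ms, int64_t status_interval_ms, bool force)
{
    if (force)
        return true;
    if (write != st->last_write || flush != st->last_flush)
        return true;
    return now_ms - st->last_send_ms >= status_interval_ms;
}
```

The point of the sketch is that the timeout only matters when nothing has
moved; any write or flush progress triggers a reply on its own.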

So perhaps I should get rid of that replication_lag_sample_interval
GUC and send back apply timestamps frequently, as you were saying.  It
would add up to a third more replies.

The effective sample rate would still be lowered when the fixed-size
buffers fill up and samples have to be dropped, and that would be more
likely without that GUC.  With the GUC, it doesn't start happening
until lag reaches XLOG_TIMESTAMP_BUFFER_SIZE *
replication_lag_sample_interval, which is about 2 hours with defaults,
whereas without rate limiting you might only need to get
XLOG_TIMESTAMP_BUFFER_SIZE 'w' messages behind before we start
dropping samples.  Maybe that's perfectly OK, I'm not sure.
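
As a rough worked example of that arithmetic: the message does not state
the defaults, so the numbers below (8192 buffer slots, a 1 second sample
interval) are assumptions picked only because they land near the 2 hour
figure.  The function name is hypothetical.

```c
#include <stdint.h>

/*
 * With rate limiting, at most one sample is stored per sample interval,
 * so a buffer with buffer_slots entries covers this many seconds of lag
 * before samples have to be dropped.
 */
int64_t
lag_capacity_seconds(int64_t buffer_slots, int64_t sample_interval_s)
{
    return buffer_slots * sample_interval_s;
}

/*
 * Assumed values, not taken from the message:
 *   lag_capacity_seconds(8192, 1) is 8192 seconds, a little over 2 hours.
 * Without rate limiting there is no time factor: the buffer is full once
 * the standby is buffer_slots 'w' messages behind, however quickly those
 * messages arrived.
 */
```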

-- 
Thomas Munro
http://www.enterprisedb.com


