Re: [HACKERS] Measuring replay lag

From: Fujii Masao
Subject: Re: [HACKERS] Measuring replay lag
Msg-id: CAHGQGwGANKWsH4jETZpucK7K0FZ8P70=9NEwgOJHPUzGxN0Z9A@mail.gmail.com
In reply to: Re: [HACKERS] Measuring replay lag  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses: Re: [HACKERS] Measuring replay lag
List: pgsql-hackers
On Mon, Dec 19, 2016 at 8:13 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Mon, Dec 19, 2016 at 4:03 PM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> On 11/22/16 4:27 AM, Thomas Munro wrote:
>>> Thanks very much for testing!  New version attached.  I will add this
>>> to the next CF.
>>
>> I don't see it there yet.
>
> Thanks for the reminder.  Added here:  https://commitfest.postgresql.org/12/920/
>
> Here's a rebased patch.

I agree that the capability to measure the remote_apply lag is very useful.
I would also like to measure the remote_write and remote_flush lags, for
example in order to diagnose the cause of replication lag.

For that, what about maintaining the pairs of send-timestamp and LSN on the
*sender side* instead of the receiver side? That is, walsender adds a pair
of send-timestamp and LSN to a buffer every sampling period.
Whenever walsender receives the write, flush and apply locations from
walreceiver, it calculates the write, flush and apply lags by comparing
the received LSNs with the stored LSNs, and the current timestamp with the
stored send-timestamps.
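
To make the idea concrete, here is a rough sketch in Python (not walsender
code, just an illustration). The class name LagTracker, the methods
record_send and on_reply, and the sample_period and max_samples parameters
are all hypothetical; LSNs are modelled as plain integers and timestamps as
seconds.

```python
import bisect


class LagTracker:
    """Sender-side buffer of (LSN, send-timestamp) samples (illustrative only)."""

    def __init__(self, sample_period=1.0, max_samples=1024):
        self.sample_period = sample_period  # hypothetical sampling interval, seconds
        self.max_samples = max_samples      # hypothetical bound on buffer size
        self.lsns = []                      # sampled LSNs, strictly increasing
        self.times = []                     # send-timestamp for each sampled LSN
        self.last_sample_time = None

    def record_send(self, lsn, now):
        """Called when WAL up to `lsn` is sent; keeps at most one sample per period."""
        if (self.last_sample_time is not None
                and now - self.last_sample_time < self.sample_period):
            return
        if self.lsns and lsn <= self.lsns[-1]:
            return  # nothing new has been sent since the last sample
        if len(self.lsns) >= self.max_samples:
            return  # buffer full: drop this sample rather than grow without bound
        self.lsns.append(lsn)
        self.times.append(now)
        self.last_sample_time = now

    def _lag(self, reported_lsn, now):
        # Newest sample whose LSN the standby has already reached.
        i = bisect.bisect_right(self.lsns, reported_lsn)
        if i == 0:
            return None  # no sample reached yet, so no measurement
        return now - self.times[i - 1]

    def on_reply(self, write_lsn, flush_lsn, apply_lsn, now):
        """Called when the standby reports its write, flush and apply locations."""
        lags = {
            "write": self._lag(write_lsn, now),
            "flush": self._lag(flush_lsn, now),
            "apply": self._lag(apply_lsn, now),
        }
        # Samples older than the one matched by apply_lsn are no longer needed,
        # because write and flush locations are never behind apply.
        i = bisect.bisect_right(self.lsns, apply_lsn)
        if i > 1:
            del self.lsns[:i - 1]
            del self.times[:i - 1]
        return lags


tracker = LagTracker(sample_period=1.0)
tracker.record_send(100, now=0.0)
tracker.record_send(150, now=0.5)   # skipped: within the sampling period
tracker.record_send(200, now=1.0)
tracker.record_send(300, now=2.0)
first = tracker.on_reply(write_lsn=300, flush_lsn=200, apply_lsn=100, now=2.5)
second = tracker.on_reply(write_lsn=300, flush_lsn=300, apply_lsn=250, now=3.0)
```

Note that in this simplified form the reported lag keeps growing while the
standby is caught up and idle, because the newest sample is reused for every
reply; a real implementation would have to handle that case separately.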

As a bonus of this approach, we don't need to add a field to the reply
message that walreceiver can send back very frequently, which might be
helpful in terms of networking overhead.

Regards,

-- 
Fujii Masao


