walsender timeout on logical replication set

From Kyotaro Horiguchi
Subject walsender timeout on logical replication set
Date
Msg-id 20210913.103107.813489310351696839.horikyota.ntt@gmail.com
Responses Re: walsender timeout on logical replication set  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Hello.

As reported in [1], it seems that walsender can hit a timeout in
certain cases.  It is not clearly confirmed, but I suspect there is a
case where LogicalRepApplyLoop keeps running its innermost loop
without receiving a keepalive packet for longer than
wal_sender_timeout (not wal_receiver_timeout).  Of course, if the
cause is an underpowered subscriber, that can be resolved by giving it
sufficient processing power.  But if it happens between servers with
equal processing power, it is reasonable to "fix" this.  Theoretically,
I think this can happen with equally powered servers if the connecting
network is sufficiently fast, because sending reordered changes is
relatively simpler and faster than applying the changes on the
subscriber.

I think we don't want to call GetCurrentTimestamp on every iteration
of the innermost loop.  Even if we call it every N iterations, I can't
come up with a proper N that fits every workload.  So one possible
solution would be to use SIGALRM.  Is it worth doing?  Or is there any
other way?
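
To make the SIGALRM idea concrete, here is a minimal standalone sketch,
not PostgreSQL code.  It assumes a POSIX system, and every name in it
(feedback_due, arm_feedback_timer, apply_changes) is hypothetical.  A
periodic timer sets a flag from the signal handler, and the loop only
tests that flag, so it never reads the clock and needs no tuned N.
Sending feedback is stood in for by a counter.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Set by the SIGALRM handler, cleared by the loop (hypothetical name). */
static volatile sig_atomic_t feedback_due = 0;

/* Only sets a flag, which is all a handler can do safely. */
void
on_alarm(int signo)
{
	(void) signo;
	feedback_due = 1;
}

/*
 * Install the handler and arm a periodic timer that fires every
 * interval_ms milliseconds.  Returns 0 on success, -1 on failure.
 */
int
arm_feedback_timer(long interval_ms)
{
	struct sigaction sa;
	struct itimerval it;

	assert(interval_ms > 0);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESTART;
	if (sigaction(SIGALRM, &sa, NULL) != 0)
		return -1;

	it.it_interval.tv_sec = interval_ms / 1000;
	it.it_interval.tv_usec = (interval_ms % 1000) * 1000;
	it.it_value = it.it_interval;
	return setitimer(ITIMER_REAL, &it, NULL);
}

/*
 * Stand-in for the innermost apply loop.  Returns how many times
 * feedback would have been sent while processing nchanges changes.
 */
long
apply_changes(long nchanges)
{
	long		feedbacks_sent = 0;

	for (long i = 0; i < nchanges; i++)
	{
		/* ... applying change i would happen here (omitted) ... */

		if (feedback_due)
		{
			feedback_due = 0;
			feedbacks_sent++;	/* placeholder for a status reply */
		}
	}
	return feedbacks_sent;
}
```

In this sketch the interval would presumably be some fraction of
wal_sender_timeout, but the subscriber does not necessarily know the
publisher's setting, so choosing the interval is still an open question.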

Even if we don't fix this, we might need to add a description of
this restriction to the documentation?

Any thoughts?

[1] https://www.postgresql.org/message-id/CAEDsCzhBtkNDLM46_fo_HirFYE2Mb3ucbZrYqG59ocWqWy7-xA%40mail.gmail.com

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


