Re: Switching XLog source from archive to streaming when primary available

From: Dilip Kumar
Subject: Re: Switching XLog source from archive to streaming when primary available
Msg-id: CAFiTN-vgSb7J6REYm2qTtZPdFQRvM-3zZ5ZxGam9kmod2D6V_g@mail.gmail.com
In response to: Switching XLog source from archive to streaming when primary available (SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com>)
List: pgsql-hackers

On Mon, Nov 29, 2021 at 1:30 AM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:
>
> Hi Hackers,
>
> When the standby couldn't connect to the primary it switches the XLog
> source from streaming to archive and continues in that state until it can
> get the WAL from the archive location. On a server with high WAL activity,
> typically getting the WAL from the archive is slower than streaming it from
> the primary and couldn't exit from that state. This not only increases the
> lag on the standby but also adversely impacts the primary as the WAL gets
> accumulated, and vacuum is not able to collect the dead tuples. DBAs as a
> mitigation can however remove/advance the slot or remove the
> restore_command on the standby but this is a manual work I am trying to
> avoid. I would like to propose the following, please let me know your
> thoughts.
>
> Automatically attempt to switch the source from archive to streaming when
> the primary_conninfo is set, after replaying 'N' WAL segments, governed by
> the GUC retry_primary_conn_after_wal_segments.
> When retry_primary_conn_after_wal_segments is set to -1, the feature is
> disabled.
> When the retry attempt fails, switch back to the archive.

I think there is another thread [1] that is trying to solve a similar
issue. Basically, in the main recovery apply loop, if the walreceiver
does not exist, that patch launches it. However, that patch does not
change the current XLog source, which I think is not a good idea,
because recovery will then restore from the archive as well as stream
from the primary; I have given that review comment on that thread as
well. One big difference is that the other patch launches the
walreceiver even if the WAL is locally available and we don't really
need more WAL, but that is controlled by a GUC.
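
For illustration only, the retry behaviour in the quoted proposal could be
sketched roughly as below. This is a standalone sketch, not code from either
patch or from PostgreSQL: the type and function names (WalSource,
RecoveryState, after_archive_segment) are hypothetical, and the
primary_reachable argument merely stands in for the outcome of a real
streaming connection attempt. Only the GUC name and the meaning of -1 come
from the proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not PostgreSQL's actual definitions. */
typedef enum { SOURCE_ARCHIVE, SOURCE_STREAM } WalSource;

typedef struct
{
    WalSource source;                /* where recovery currently reads WAL */
    int       retry_after_segments;  /* proposed GUC value; -1 disables */
    int       segments_from_archive; /* segments replayed since last attempt */
    bool      primary_conninfo_set;  /* whether a primary is configured */
} RecoveryState;

/*
 * Call after each segment restored from the archive has been replayed.
 * Returns the source to use for the next segment.
 */
static WalSource
after_archive_segment(RecoveryState *st, bool primary_reachable)
{
    assert(st != NULL);

    if (st->source != SOURCE_ARCHIVE)
        return st->source;

    st->segments_from_archive++;

    /* Feature disabled, or no primary to connect to. */
    if (st->retry_after_segments < 0 || !st->primary_conninfo_set)
        return st->source;

    /* Not enough segments replayed yet to justify another attempt. */
    if (st->segments_from_archive < st->retry_after_segments)
        return st->source;

    /* Time to retry: reset the counter whatever the outcome. */
    st->segments_from_archive = 0;

    if (primary_reachable)
        st->source = SOURCE_STREAM; /* attempt succeeded */
    /* else: attempt failed, stay on (switch back to) the archive */

    return st->source;
}
```

Note that the sketch changes the current source when the attempt succeeds,
so it never reads from the archive and the stream at the same time.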

[1] https://www.postgresql.org/message-id/CAKYtNApe05WmeRo92gTePEmhOM4myMpCK_%2BceSJtC7-AWLw1qw%40mail.gmail.com

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


