Re: Cascading Replication - Standby Recovering Faster from Archive rather than Upstream node

From: Laurenz Albe
Subject: Re: Cascading Replication - Standby Recovering Faster from Archive rather than Upstream node
Message-ID: 2095e0615451e1a5a017a14d829ddda2d59a04b5.camel@cybertec.at
In reply to: Cascading Replication - Standby Recovering Faster from Archive rather than Upstream node (Tim <timfosho@gmail.com>)
List: pgsql-admin
On Thu, 2023-02-09 at 13:41 -0500, Tim wrote:
> We have a cascading replication setup with multiple nodes 2 of which are in a different
> cloud region for DR purposes. 
>
> DRnode01 > DRnode02 
>
> They are both standby nodes, DRnode01 only recovers from the archive and does not connect
> to any upstream node for streaming. DRnode02 has the same restore_command which points to
> that WAL archive, but it is also configured with primary_conninfo to stream from DRnode01.
> I'd like it to just WAL stream instead of using the archive, but it just ends up recovering
> slightly faster from the archive than DRnode01 can send WALs so the logs end up being
> spammed with 
>
> > 2023-02-09 13:32:00 EST [1277714]: [1-1] user=,db=,app=,client= LOG:  started streaming WAL from primary at 74CC/B5000000 on timeline 33
> > 2023-02-09 13:32:00 EST [1277714]: [2-1] user=,db=,app=,client= FATAL:  could not receive data from WAL stream: ERROR: requested starting point 74CC/B5000000 is ahead of the WAL flush position of this server 74CC/B4FFFA08.
>
> pg_wal_replay_pause() does not work, it ends up in the same situation after resuming.
> Changing restore_command requires a restart and turning it off altogether is not good
> for DR.
>
> I cannot get it out of this loop and this has been a recurring issue for a while. 
>
> Is there anything I can do to force to WAL stream instead of recovering from the archive
> without removing the restore_command setting?

My idea:

- On DRnode01, set "archive_mode = always" and configure an "archive_command" that copies
  WAL segments to a directory shared with DRnode02 (e.g. via NFS)
- DRnode02 has "primary_conninfo" that connects to DRnode01 and a "restore_command" that
  copies WAL segments from the shared directory.
- DRnode02 has an "archive_cleanup_command" that removes WAL segments that it no longer
  needs from the shared directory.
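
A minimal sketch of how those three settings might look, not a tested setup.
The shared path /mnt/shared_wal, the host name in primary_conninfo and the
"replicator" user are placeholders I made up for illustration; adapt them to
your environment. On PostgreSQL 12 and later these settings go in
postgresql.conf (before v12, the standby-side ones belong in recovery.conf).
DRnode01 keeps its existing restore_command pointing at the main archive.

```
# --- DRnode01 (assumed: /mnt/shared_wal is the directory shared with DRnode02) ---
# "always" makes a standby archive the WAL it restores or receives;
# changing archive_mode requires a restart
archive_mode = always
# refuse to overwrite a segment that is already present in the shared directory
archive_command = 'test ! -f /mnt/shared_wal/%f && cp %p /mnt/shared_wal/%f'

# --- DRnode02 ---
# hypothetical connection string: stream from DRnode01
primary_conninfo = 'host=DRnode01 user=replicator'
# restore only from the shared directory, not from the main WAL archive
restore_command = 'cp /mnt/shared_wal/%f %p'
# remove segments older than the last restartpoint (%r) from the shared directory
archive_cleanup_command = 'pg_archivecleanup /mnt/shared_wal %r'
```

With this arrangement, DRnode02 can only restore segments that DRnode01 has
already processed and archived, so it should not be able to get ahead of
DRnode01's WAL flush position.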

Yours,
Laurenz Albe


