Re: [PATCH] Add archive_mode=follow_primary to prevent unarchived WAL on standby promotion
| From | Fujii Masao |
|---|---|
| Subject | Re: [PATCH] Add archive_mode=follow_primary to prevent unarchived WAL on standby promotion |
| Date | |
| Msg-id | CAHGQGwHNQcwsyLP4UqnUBoRPo4+vT=wvfe6reLX4TxwES-48qQ@mail.gmail.com |
| In reply to | [PATCH] Add archive_mode=follow_primary to prevent unarchived WAL on standby promotion (Andrey Borodin <x4mmm@yandex-team.ru>) |
| Responses | Re: [PATCH] Add archive_mode=follow_primary to prevent unarchived WAL on standby promotion |
| List | pgsql-hackers |
On Fri, Oct 24, 2025 at 1:25 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
> Hi hackers,
>
> I'd like to propose a new archive_mode setting to address a gap in WAL
> archiving for high availability streaming replication configurations.
>
> ## Problem
>
> In HA setups using streaming replication, standbys can be
> promoted when primary has failed. Some WAL segments might be not yet
> archived. This creates gaps in the WAL archive, breaking point-in-time
> recovery:
>
> 1. Primary generates WAL, streams to standby
> 2. Standby receives WAL, marks segments as .done immediately
> 3. Standby deletes WAL during checkpoints
> 4. Primary hasn't archived yet (archiver lag, network issues, etc.)
> 5. Primary vanishes
> 6. Standby gets promoted
> 7. WAL history lost from archive
>
> This is particularly problematic in synchronous replication where
> promotion might happen while the primary is still catching up on archival.
>
> Promoted standby might have some WALs from walreceiver, some from archive. In
> this case we need to archive only those WALs which were received, but not
> confirmed to be archived by primary.
>
> ## Proposed Solution
>
> Add archive_mode=follow_primary, where standbys defer WAL deletion until
> the primary confirms archival:

Can't we achieve nearly the same behavior by setting archive_mode to always
and configuring archive_command on the standby to check whether the WAL file
already exists in the shared archive area (e.g., test -f <archive directory>/%f
(probably also the WAL file size should be checked))? In this setup,
archive_command would fail until the WAL file appears in the archive,
preventing the standby from removing it while the command is failing.

Regards,

--
Fujii Masao
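
For illustration only, the check suggested above (test -f plus a size comparison) might look roughly like the shell function below. This is a sketch, not something posted in the thread: the function name wal_in_archive and its two-argument calling convention are hypothetical, and it assumes the shared archive is reachable from the standby as an ordinary directory.

```shell
# Hypothetical helper for a standby's archive_command (archive_mode = always).
# It never copies anything: it succeeds only once the primary's copy of the
# segment is visible in the shared archive, so the standby keeps its own copy
# until then.
#   $1: path to the local WAL file (what %p expands to)
#   $2: expected path of the same file in the shared archive (<archive directory>/%f)
wal_in_archive() {
    src=$1
    dst=$2

    # The local segment must exist, otherwise there is nothing to compare.
    [ -f "$src" ] || return 1

    # Not in the archive yet: fail so the archiver retries later.
    [ -f "$dst" ] || return 1

    # Also compare sizes, to avoid accepting a partially written file.
    # Arithmetic expansion strips any padding that wc may emit.
    src_size=$(( $(wc -c < "$src") ))
    dst_size=$(( $(wc -c < "$dst") ))
    [ "$src_size" -eq "$dst_size" ]
}
```

A wrapper script that ends with wal_in_archive "$@" could then be named in archive_command with %p and <archive directory>/%f as its arguments. A size match is only a weak check; comparing contents (for example with cmp -s) would be stricter, at the cost of reading both files.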