Hi Mohan,
On Tue, May 28, 2024 at 02:26:41PM -0400, Mohan NBSPS wrote:
> Dear Community,
> [...]
> ```
> FATAL: could not receive data from WAL stream: ERROR: requested WAL
> segment 000000010000004100000049 has already been removed
> FATAL: the database system is starting up
> ```
>
> from my understanding, the WAL file is streamed over the network (secondary
> pulls from primary) and creates a WAL file in the secondary.
> then it replays the copied WAL file using a different process.
>
> in order for the local WAL file to go out of sync,
>
> 1. the primary removed the WAL file, the secondary was streaming
> 2. the WAL file on the secondary got corrupted
> 3 ....
>
> Questions
>
> - what do those error messages mean ?
> - how can I prevent this from happening ?
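The first message means that the primary has already removed a WAL segment
that your replica still needs; the second only says that the replica is still
starting up and not yet accepting connections. Unless you have archived the
required WAL segments somewhere and can recover them from there, your replica
is now broken, and you will have to re-create it.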
You can prevent this by configuring streaming replication correctly: either
use replication slots (available since 9.4, so 9.5 has them; you should still
prioritize upgrading to a supported release while you are working on this
problem!), or introduce a WAL archive[0] from which replicas can retrieve WAL
that the primary has already evicted from its kept segments.
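
A minimal sketch of both options for 9.5, in case it helps. The slot name
(replica1), the archive directory (/mnt/server/archivedir), the connection
string and the numeric values are placeholders I made up, not taken from your
setup, so please adapt them and check each setting against the 9.5 docs:

```
# --- primary: postgresql.conf ---
max_replication_slots = 4    # default is 0 in 9.5; changing it needs a restart
wal_keep_segments = 64       # optional safety margin; 64 is an arbitrary example
archive_mode = on            # only for the WAL archive option; needs a restart
archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'

# --- primary: run once in psql to create the slot ---
# SELECT pg_create_physical_replication_slot('replica1');

# --- standby: recovery.conf ---
standby_mode = 'on'
primary_conninfo = 'host=primary.example.com user=replicator'   # placeholder
primary_slot_name = 'replica1'                                  # slot option
restore_command = 'cp /mnt/server/archivedir/%f %p'             # archive option
```

Keep in mind that a slot makes the primary retain WAL until the standby has
consumed it, so a standby that stays offline for a long time can fill up the
primary's disk. Keeping an eye on pg_replication_slots on the primary is a
good idea if you go that route.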
HTH!
[0]: https://www.postgresql.org/docs/9.5/runtime-config-wal.html#RUNTIME-CONFIG-WAL-ARCHIVING
--
with best regards:
- Johannes Truschnigg ( johannes@truschnigg.info )
www: https://johannes.truschnigg.info/