Re: Synchronizing slots from primary to standby
От | Bertrand Drouvot |
---|---|
Тема | Re: Synchronizing slots from primary to standby |
Дата | |
Msg-id | ZdyOlIIm3yiw5+GA@ip-10-97-1-34.eu-west-3.compute.internal обсуждение исходный текст |
Ответ на | Re: Synchronizing slots from primary to standby (Amit Kapila <amit.kapila16@gmail.com>) |
Список | pgsql-hackers |
Hi, On Mon, Feb 26, 2024 at 05:18:25PM +0530, Amit Kapila wrote: > On Mon, Feb 26, 2024 at 12:59 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > 10 === > > > > > > > > + if (slot->data.invalidated != RS_INVAL_NONE) > > > > + { > > > > + /* > > > > + * Specified physical slot have been invalidated, > > > > so no point > > > > + * in waiting for it. > > > > > > > > We discovered in [1], that if the wal_status is "unreserved" then the slot is still > > > > serving the standby. I think we should handle this case differently, thoughts? > > > > > > I think the 'invalidated' slot can still be used is a separate bug. > > > Because > > > once the slot is invalidated, it can neither protect WALs or ROWs from being > > > removed even if the restart_lsn of the slot can be moved forward after being invalidated. > > > > > > If the standby can move restart_lsn forward for invalidated slots, then > > > it should also set the 'invalidated' flag back to NONE, otherwise the slot > > > cannot serve its purpose anymore. I also reported similar bug before[1]. > > > ... > > > > My point is that I think we should behave like it's not a bug and then adapt the > > code accordingly here (until the bug gets fixed). > > > > oh, I think this doesn't sound like a good idea to me. We should fix > that bug independently rather than adding code in new features to > consider the bug as a valid behavior. Agree, but it all depends if there is a consensus of the other thread being a bug or not. I also think it is but there is this part of the code in pg_get_replication_slots() that makes me think ones could think it is not. " case WALAVAIL_REMOVED: /* * If we read the restart_lsn long enough ago, maybe that file * has been removed by now. However, the walsender could have * moved forward enough that it jumped to another file after * we looked. If checkpointer signalled the process to * termination, then it's definitely lost; but if a process is * still alive, then "unreserved" seems more appropriate. * " Anyway, I also think it is a bug so agree to keep the check as it is currenlty ( and keep an eye on the other thread outcome too). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
В списке pgsql-hackers по дате отправления: