Discussion: Question about InvalidatePossiblyObsoleteSlot()
Hi, all,
I have a question about a behavioral difference in InvalidatePossiblyObsoleteSlot() between PG15 (and earlier) and PG16 (and later):
In PG15 and earlier: while attempting to acquire a slot, if the slot's restart_lsn advanced to be greater than oldestLSN during the process, the slot would not be marked invalid.
In PG16 and later: the invalidation decision is based solely on the initial_restart_lsn captured at the start of the check; even if the slot's restart_lsn advances above oldestLSN during the process, the slot may still be marked invalid.
I wonder why the decision to mark the slot invalid is not based on the slot's current restart_lsn. If a slot's restart_lsn has already advanced sufficiently during the invalidation process, indicating it is actively being used, shouldn't we refrain from invalidating it? What is the rationale behind this design change?
Looking forward to your insights.
Best Regards,
suyu.cmj
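(To make the difference concrete, here is a minimal, self-contained C simulation of the two decision policies described above. Everything in it, the Slot type, the field names and the numbers, is a simplified stand-in, not PostgreSQL's slot.c.)

/*
 * Minimal simulation of the two policies: decide from the live value
 * vs. from a value captured when the check started.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t XLogRecPtr;

typedef struct Slot
{
    XLogRecPtr  restart_lsn;    /* can advance while the check runs */
} Slot;

/* PG15-style: consult the live value each time the check is retried */
static bool
invalidate_pg15(const Slot *s, XLogRecPtr oldestLSN)
{
    return s->restart_lsn < oldestLSN;
}

/* PG16-style: decide from the value captured when the check started */
static bool
invalidate_pg16(XLogRecPtr initial_restart_lsn, XLogRecPtr oldestLSN)
{
    return initial_restart_lsn < oldestLSN;
}

int
main(void)
{
    Slot        s = {.restart_lsn = 100};
    XLogRecPtr  oldestLSN = 200;
    XLogRecPtr  initial_restart_lsn = s.restart_lsn;    /* captured */

    /* The slot's owner advances restart_lsn past oldestLSN mid-check. */
    s.restart_lsn = 300;

    printf("PG15-style invalidates: %d\n", invalidate_pg15(&s, oldestLSN));
    printf("PG16-style invalidates: %d\n",
           invalidate_pg16(initial_restart_lsn, oldestLSN));
    return 0;
}

(Compiled and run as-is, the PG15-style check skips the slot and prints 0, while the PG16-style check still invalidates it and prints 1.)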
Hi,

On Tue, Sep 23, 2025 at 10:38:14PM +0800, suyu.cmj wrote:
> Hi, all,
> I have a question about a behavioral difference in InvalidatePossiblyObsoleteSlot() between PG15 (and earlier) and PG16 (and later):
> In PG15 and earlier: while attempting to acquire a slot, if the slot's restart_lsn advanced to be greater than oldestLSN during the process, the slot would not be marked invalid.
> In PG16 and later: the invalidation decision is based solely on the initial_restart_lsn captured at the start of the check; even if the slot's restart_lsn advances above oldestLSN during the process, the slot may still be marked invalid.
> I wonder why the decision to mark the slot invalid is not based on the slot's current restart_lsn. If a slot's restart_lsn has already advanced sufficiently during the invalidation process, indicating it is actively being used, shouldn't we refrain from invalidating it? What is the rationale behind this design change?
> Looking forward to your insights.

That comes from 818fefd8fd4. Does the wording in the commit message ([1]) and the linked thread ([2]) answer your question?

[1]: postgr.es/c/818fefd8fd4
[2]: postgr.es/m/ZaTjW2Xh+TQUCOH0@ip-10-97-1-34.eu-west-3.compute.internal

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
Thank you for the reference to commit 818fefd8fd4 and the related discussion thread. I understand the intent of introducing initial_restart_lsn was to preserve a consistent invalidation cause throughout the invalidation loop.
However, I still have a few concerns about this design change:
1. I understand the intention to keep the invalidation cause consistent, but if a slot's restart_lsn advances significantly during the invalidation check—indicating it is actively in use—shouldn't we reconsider invalidating it?
2. What potential issues arise if we refrain from invalidating slots whose restart_lsn advances during the invalidation process? Intuitively, an actively used slot that has moved its restart_lsn beyond the problematic point should not be marked invalid.
3. If the current approach is indeed correct, should we consider making PG15 and earlier consistent with this behavior? The behavioral difference across versions may lead to different operational outcomes in otherwise similar situations.
I would appreciate your insights on these points.
Best regards,
suyu.cmj

Hi,

On Thu, Oct 09, 2025 at 10:49:39AM +0800, suyu.cmj wrote:
> Hi,
> Thank you for the reference to commit 818fefd8fd4 and the related discussion thread. I understand the intent of introducing initial_restart_lsn was to preserve a consistent invalidation cause throughout the invalidation loop.
> However, I still have a few concerns about this design change:
> 1. I understand the intention to keep the invalidation cause consistent, but if a slot's restart_lsn advances significantly during the invalidation check—indicating it is actively in use—shouldn't we reconsider invalidating it?
> 2. What potential issues arise if we refrain from invalidating slots whose restart_lsn advances during the invalidation process? Intuitively, an actively used slot that has moved its restart_lsn beyond the problematic point should not be marked invalid.
> 3. If the current approach is indeed correct, should we consider making PG15 and earlier consistent with this behavior? The behavioral difference across versions may lead to different operational outcomes in otherwise similar situations.
> I would appreciate your insights on these points.

I agree that before 818fefd8fd4 the invalidation cause could move from RS_INVAL_WAL_REMOVED to RS_INVAL_NONE if the slot restart lsn has been able to advance enough between the time we release the mutex and do the next check.

With 818fefd8fd4 that's not the case anymore and we keep WAL_REMOVED as the invalidation cause (even if the slot restart lsn has been able to advance enough).

It looks safe to use the pre-818fefd8fd4 behavior for the slot restart lsn case because the WAL files have not yet been removed by the checkpointer/startup process while it is busy in InvalidatePossiblyObsoleteSlot().

I think that we could get rid of the initial_restart_lsn and just use s->data.restart_lsn here (while keeping the initial xmin ones to preserve the intent of 818fefd8fd4 for those). Something like the attached?

Your concern is only about the restart_lsn, right? (asking because I don't think it's safe not to rely on the initial_* for the xmin ones, see [1]).

[1]: https://www.postgresql.org/message-id/ZaACEPtVovKlRVUf%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments
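(The attached patch is not reproduced in the archive; the following is a rough, hypothetical sketch of the split being proposed above, reading the live restart_lsn while keeping the captured xmin values. All types are simplified stand-ins for the real ones in slot.h, and plain comparisons stand in for TransactionIdIsValid()/TransactionIdPrecedesOrEquals().)

/*
 * Hypothetical sketch (not the attached patch) of the proposed split:
 * the WAL-removal check reads the live restart_lsn, while the xmin
 * checks keep the values captured before the slot's mutex was released.
 */
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;
typedef uint32_t TransactionId;

#define InvalidXLogRecPtr ((XLogRecPtr) 0)

typedef struct ReplicationSlot
{
    struct
    {
        XLogRecPtr  restart_lsn;
    }           data;
    TransactionId effective_xmin;
    TransactionId effective_catalog_xmin;
} ReplicationSlot;

/*
 * WAL-removal case: safe to consult the live value, because the
 * checkpointer/startup process cannot remove WAL files while it is
 * itself busy in the invalidation loop.
 */
static bool
wal_removed_conflict(const ReplicationSlot *s, XLogRecPtr oldestLSN)
{
    XLogRecPtr  restart_lsn = s->data.restart_lsn;

    return restart_lsn != InvalidXLogRecPtr && restart_lsn < oldestLSN;
}

/*
 * xmin cases: keep deciding from the captured initial_* values,
 * preserving the intent of 818fefd8fd4 for those.
 */
static bool
horizon_conflict(TransactionId initial_effective_xmin,
                 TransactionId initial_catalog_effective_xmin,
                 TransactionId snapshotConflictHorizon)
{
    return (initial_effective_xmin != 0 &&
            initial_effective_xmin <= snapshotConflictHorizon) ||
           (initial_catalog_effective_xmin != 0 &&
            initial_catalog_effective_xmin <= snapshotConflictHorizon);
}

int
main(void)
{
    ReplicationSlot s = {.data = {.restart_lsn = 150}};

    /* restart_lsn 150 is behind oldestLSN 200; catalog xmin 40 <= horizon 50 */
    return (wal_removed_conflict(&s, 200) &&
            horizon_conflict(0, 40, 50)) ? 0 : 1;
}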
Hi,
Thanks for your detailed reply.
> I think that we could get rid of the initial_restart_lsn and just use
> s->data.restart_lsn here (while keeping initial xmin ones to preserve the
> intent of 818fefd8fd4 for those).
I agree with your proposed change.
> Your concern is only about the restart_lsn, right? (asking because I don't think
> it's safe not to rely on the initial_* for the xmin ones, see [1]).
My primary concern is indeed the restart_lsn issue, and thanks again for the clear explanation.
Best regards,
suyu.cmj
On Sun, Oct 12, 2025 at 11:27 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> I agree that before 818fefd8fd4 the invalidation cause could move from
> RS_INVAL_WAL_REMOVED to RS_INVAL_NONE if the slot restart lsn has been
> able to advance enough between the time we release the mutex and do the
> next check.
>
> With 818fefd8fd4 that's not the case anymore and we keep WAL_REMOVED as
> the invalidation cause (even if the slot restart lsn has been able to
> advance enough).
>
> It looks safe to use the pre-818fefd8fd4 behavior for the slot restart
> lsn case because the WAL files have not yet been removed by the
> checkpointer/startup process while it is busy in
> InvalidatePossiblyObsoleteSlot().
>
> I think that we could get rid of the initial_restart_lsn and just use
> s->data.restart_lsn here (while keeping the initial xmin ones to
> preserve the intent of 818fefd8fd4 for those).

IIUC with the proposed patch, it's possible that we report the slot
invalidation once but don't actually invalidate the slot if the slot's
restart_lsn gets advanced and becomes greater than the oldestLSN after
the report, is that right?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi,

On Tue, Oct 14, 2025 at 04:06:44PM -0700, Masahiko Sawada wrote:
> IIUC

Thanks for looking at it!

> with the proposed patch, it's possible that we report the slot
> invalidation once but don't actually invalidate the slot if the slot's
> restart_lsn gets advanced and becomes greater than the oldestLSN after
> the report, is that right?

We don't really report an "invalidation", what we report is:

LOG: terminating process 3998707 to release replication slot "logical_slot"
DETAIL: The slot's restart_lsn 0/00842480 exceeds the limit by 2874240 bytes.
HINT: You might need to increase "max_slot_wal_keep_size".

and we terminate the process:

FATAL: terminating connection due to administrator command

We are not reporting:

DETAIL: This replication slot has been invalidated due to "wal_removed".

and the slot is still valid.

That's the pre 818fefd8fd4 behavior.

Ideally, I think that we should not report anything and not terminate
the process. I did not look at it, maybe we could look at it as a
second step (first step being to restore the pre 818fefd8fd4 behavior)?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Oct 14, 2025 at 9:51 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> We don't really report an "invalidation", what we report is:
>
> LOG: terminating process 3998707 to release replication slot "logical_slot"
> DETAIL: The slot's restart_lsn 0/00842480 exceeds the limit by 2874240 bytes.
> HINT: You might need to increase "max_slot_wal_keep_size".
>
> and we terminate the process:
>
> FATAL: terminating connection due to administrator command
>
> We are not reporting:
>
> DETAIL: This replication slot has been invalidated due to "wal_removed".
>
> and the slot is still valid.
>
> That's the pre 818fefd8fd4 behavior.

Thank you for the clarification! Understood.

> Ideally, I think that we should not report anything and not terminate
> the process. I did not look at it, maybe we could look at it as a
> second step (first step being to restore the pre 818fefd8fd4 behavior)?

I find the reporting of terminating a process holding a possibly
obsolete slot fine, but reading some related threads [1][2] it seems to
me that a problem we want to avoid is that we report "terminated"
without a subsequent "obsolete" message. Does it make sense to report
explicitly that the slot's restart_lsn has recovered and we therefore
skipped invalidating it?

Regards,

[1] https://www.postgresql.org/message-id/flat/ZaTjW2Xh%2BTQUCOH0%40ip-10-97-1-34.eu-west-3.compute.internal
[2] https://www.postgresql.org/message-id/7c025095-5763-17a6-8c80-899b35bd0459@gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi,

On Wed, Oct 15, 2025 at 04:24:03PM -0700, Masahiko Sawada wrote:
> I find the reporting of terminating a process holding a possibly
> obsolete slot fine, but reading some related threads [1][2] it seems to
> me that a problem we want to avoid is that we report "terminated"
> without a subsequent "obsolete" message.

Right, the focus at that time was around the invalidations related to
the xmin conflicts (mainly because it felt unsafe to "ignore" the
invalidation for them). Then later the restart_lsn was added to the
discussion [1].

But after more thought, I do think it's safe for the restart_lsn case
(for the reasons mentioned above).

> Does it make sense to report
> explicitly that the slot's restart_lsn has recovered and we therefore
> skipped invalidating it?

I'm not sure. The existing logging shows when we invalidate a slot, so
the absence of that message indicates we skipped it. Users can also
verify the slot status in the view if needed.

Regards,

[1]: https://www.postgresql.org/message-id/CALj2ACUo_rDTonZCkqdSnsc3tT5_cFcJQHSQrsyAYyO5MLO52A%40mail.gmail.com

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Oct 16, 2025 at 1:23 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Wed, Oct 15, 2025 at 04:24:03PM -0700, Masahiko Sawada wrote:
> > On Tue, Oct 14, 2025 at 9:51 PM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > We don't really report an "invalidation", what we report is:
> > >
> > > LOG: terminating process 3998707 to release replication slot "logical_slot"
> > > DETAIL: The slot's restart_lsn 0/00842480 exceeds the limit by 2874240 bytes.
> > > HINT: You might need to increase "max_slot_wal_keep_size".
> > >
> > > and we terminate the process:
> > >
> > > FATAL: terminating connection due to administrator command
> > >
> > > We are not reporting:
> > >
> > > DETAIL: This replication slot has been invalidated due to "wal_removed".
> > >
> > > and the slot is still valid.
> > >
> > > That's the pre 818fefd8fd4 behavior.
> >
> > Thank you for the clarification! Understood.
> >
> > > Ideally, I think that we should not report anything and not terminate the
> > > process. I did not look at it, maybe we could look at it as a second step (first
> > > step being to restore the pre 818fefd8fd4 behavior)?
> >
> > I find the reporting of terminating a process holding a possibly
> > obsolete slot fine, but reading some related threads [1][2] it seems to
> > me that a problem we want to avoid is that we report "terminated"
> > without a subsequent "obsolete" message.
>
> Right, the focus at that time was around the invalidations related to the
> xmin conflicts (mainly because it felt unsafe to "ignore" the invalidation
> for them). Then later the restart_lsn was added to the discussion [1].
>
> But after more thought, I do think it's safe for the restart_lsn case
> (for the reasons mentioned above).
>
> > Does it make sense to report
> > explicitly that the slot's restart_lsn has recovered and we therefore
> > skipped invalidating it?
>
> I'm not sure. The existing logging shows when we invalidate a slot, so the
> absence of that message indicates we skipped it. Users can also verify the
> slot status in the view if needed.
>
I'm somewhat confused. As the commit message of 818fefd8fd4 says, the
invalidation of an active slot is done in two steps:
- Termination of the backend holding it, if any.
- Report that the slot is obsolete, with a conflict cause depending on
the slot's data.
Before commit 818fefd8fd4, it was possible that, if the slot's
effective_xmin, catalog_effective_xmin, or restart_lsn got advanced
between the two steps, we ended up skipping the actual invalidation of
the slot. It was not possible for the invalidation cause used when
terminating the process to change (like RS_INVAL_HORIZON to
RS_INVAL_WAL_REMOVED) between the two steps, since the possible_cause
doesn't change within InvalidatePossiblyObsoleteSlot(). Commit
818fefd8fd4 introduced the initial_xxx variables to ensure that if we
report "terminated" for a reason, we invalidate the slot for that
same reason.
What the patch proposed on this thread does is allow such pre-
818fefd8fd4 behavior only for restart_lsn. That is, we skip the actual
slot invalidation if the slot's restart_lsn gets advanced and exceeds
the oldestLSN between the two steps.
I thought the reason we moved away from the pre-818fefd8fd4 behavior
was that we wanted to avoid reporting only "terminated" and to report
both things consistently. But if we can accept the pre-818fefd8fd4
behavior for restart_lsn, why can we not do the same for
effective_xmin and catalog_effective_xmin? I guess the same is true
for effective_xmin and catalog_effective_xmin: we can skip the actual
slot invalidation if those values get advanced and exceed
snapshotConflictHorizon.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
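(To illustrate the race window between the two steps described in the message above, here is a compact, single-threaded C simulation. All names are simplified stand-ins, not slot.c; the callback models the owning process advancing restart_lsn inside the window that opens when the mutex is released. Run as-is, it prints the "terminating" step but ends with the slot still valid, which is the pre-818fefd8fd4 outcome.)

/*
 * Compact simulation of the two-step terminate-then-invalidate loop,
 * with simplified stand-in types.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t XLogRecPtr;

typedef struct Slot
{
    XLogRecPtr  restart_lsn;
    bool        invalidated;
} Slot;

static void
invalidation_loop(Slot *s, XLogRecPtr oldestLSN,
                  void (*owner_advances) (Slot *))
{
    bool        terminated = false;

    for (;;)
    {
        /* (the slot's mutex would be held at this point) */
        if (s->restart_lsn >= oldestLSN)
            break;              /* live check: the slot recovered, skip it */

        if (!terminated)
        {
            /*
             * Step 1: report and "terminate" the owning process.  The
             * mutex is released around this, which opens the window in
             * which restart_lsn can still advance.
             */
            printf("terminating owner of slot\n");
            terminated = true;
            if (owner_advances)
                owner_advances(s);      /* the race happens here */
            continue;           /* re-acquire and start over */
        }

        /* Step 2: nobody holds the slot anymore; invalidate it. */
        s->invalidated = true;
        printf("slot invalidated\n");
        break;
    }
}

static void
advance_past_limit(Slot *s)
{
    s->restart_lsn = 500;       /* owner flushes past oldestLSN before exiting */
}

int
main(void)
{
    Slot        s = {.restart_lsn = 100, .invalidated = false};

    invalidation_loop(&s, 200, advance_past_limit);
    printf("slot invalidated: %d\n", s.invalidated);    /* prints 0: skipped */
    return 0;
}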
Hi,

On Thu, Oct 16, 2025 at 11:32:03AM -0700, Masahiko Sawada wrote:
> Before commit 818fefd8fd4, it was possible that, if the slot's
> effective_xmin, catalog_effective_xmin, or restart_lsn got advanced
> between the two steps, we ended up skipping the actual invalidation of
> the slot.

If advancing far enough to reach RS_INVAL_NONE, right.

> What the patch proposed on this thread does is allow such pre-
> 818fefd8fd4 behavior only for restart_lsn. That is, we skip the actual
> slot invalidation if the slot's restart_lsn gets advanced and exceeds
> the oldestLSN between the two steps.

Exactly.

> I thought the reason we moved away from the pre-818fefd8fd4 behavior
> was that we wanted to avoid reporting only "terminated" and to report
> both things consistently. But if we can accept the pre-818fefd8fd4
> behavior for restart_lsn, why can we not do the same for
> effective_xmin and catalog_effective_xmin? I guess the same is true
> for effective_xmin and catalog_effective_xmin: we can skip the actual
> slot invalidation if those values get advanced and exceed
> snapshotConflictHorizon.

I think this is safe to do for the restart_lsn because it's the same
process (checkpointer/startup) that is executing
InvalidatePossiblyObsoleteSlot() and will remove the WALs. It cannot
remove the WALs as long as it is busy in
InvalidatePossiblyObsoleteSlot(), so I think this is safe for the
restart_lsn case.

I'm not that confident for the xmin cases; do you think it's also safe
to not invalidate the slots for effective_xmin and
catalog_effective_xmin if they advance far enough?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri, Oct 17, 2025 at 12:18 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> I'm not that confident for the xmin cases; do you think it's also safe
> to not invalidate the slots for effective_xmin and
> catalog_effective_xmin if they advance far enough?

I find the same in the xmin cases. ResolveRecoveryConflictWithSnapshot()
is called only during recovery by the startup process, and it also
tries to invalidate possibly obsolete slots. Catalog tuples are not
removed as long as the startup process calls
ResolveRecoveryConflictWithSnapshot() before actually removing the
tuples and is busy in InvalidatePossiblyObsoleteSlot(). I might be
missing something, but this is the reason why I'm confused by the
818fefd8fd4 fix and the proposed change.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi,

On Fri, Oct 17, 2025 at 03:08:07PM -0700, Masahiko Sawada wrote:
> I find the same in the xmin cases. ResolveRecoveryConflictWithSnapshot()
> is called only during recovery by the startup process, and it also
> tries to invalidate possibly obsolete slots. Catalog tuples are not
> removed as long as the startup process calls
> ResolveRecoveryConflictWithSnapshot() before actually removing the
> tuples and is busy in InvalidatePossiblyObsoleteSlot().

I looked more closely at the xmin related cases and I agree with the
above.

> I might be
> missing something, but this is the reason why I'm confused by the
> 818fefd8fd4 fix and the proposed change.

Yeah so 818fefd8fd4 is well suited for tests consistency but in some
rare cases it invalidates a slot while it would be safe not to do so.

OTOH it looks to me that the initial pre-818fefd8fd4 intent was to
invalidate the slot, as per this comment:

"
/*
 * Re-acquire lock and start over; we expect to invalidate the
 * slot next time (unless another process acquires the slot in the
 * meantime).
 */
"

The fact that it could move forward far enough before we terminate the
process holding the slot is a race condition due to the fact that we
released the mutex.

If the above looks right to you then 818fefd8fd4 is doing what was
"initially" expected, do you agree?

If so, then maybe it's fine to keep 818fefd8fd4 as is?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon, Oct 20, 2025 at 11:41 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> OTOH it looks to me that the initial pre-818fefd8fd4 intent was to
> invalidate the slot, as per this comment:
>
> "
> /*
>  * Re-acquire lock and start over; we expect to invalidate the
>  * slot next time (unless another process acquires the slot in the
>  * meantime).
>  */
> "

The comment doesn't indicate the intent that we will invalidate the
slot after re-acquiring the lock even when the new conditions don't
warrant the slot to be invalidated. The comment could be improved
though.

> The fact that it could move forward far enough before we terminate the
> process holding the slot is a race condition due to the fact that we
> released the mutex.
>
> If the above looks right to you then 818fefd8fd4 is doing what was
> "initially" expected, do you agree?

Based on the discussion and points presented in this thread, I don't
agree. I feel we should restore behavior prior to 818fefd8fd4 and fix
the test case which relies on different messages.

> If so, then maybe it's fine to keep 818fefd8fd4 as is?

I don't think so.

--
With Regards,
Amit Kapila.
Hi,

On Wed, Oct 22, 2025 at 02:18:33PM +0530, Amit Kapila wrote:
> The comment doesn't indicate the intent that we will invalidate the
> slot after re-acquiring the lock even when the new conditions don't
> warrant the slot to be invalidated. The comment could be improved
> though.

Thanks for looking at it and clarifying this point. In the attached I
try to improve the comment.

> I feel we should restore behavior prior to 818fefd8fd4

Done in the attached.

> and fix
> the test case which relies on different messages.

I think that 105b2cb3361 fixed it already or do you have something else
in mind?

[1]: https://www.postgresql.org/message-id/5d0e5bec-67f9-9164-36cb-c4ff5f95d1ed%40gmail.com

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments
On Thu, Oct 23, 2025 at 3:07 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Wed, Oct 22, 2025 at 02:18:33PM +0530, Amit Kapila wrote:
> > On Mon, Oct 20, 2025 at 11:41 AM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > Yeah so 818fefd8fd4 is well suited for tests consistency but in some rare cases
> > > it invalidates a slot while it would be safe not to do so.
> > >
> > > OTOH it looks to me that the initial pre-818fefd8fd4 intent was to invalidate
> > > the slot, as per this comment:
> > >
> > > "
> > > /*
> > > * Re-acquire lock and start over; we expect to invalidate the
> > > * slot next time (unless another process acquires the slot in the
> > > * meantime).
> > > */
> > > "
> > >
> >
> > The comment doesn't indicate the intent that we will invalidate the
> > slot after re-acquiring the lock even when the new conditions don't
> > warrant the slot to be invalidated. The comment could be improved
> > though.
>
> Thanks for looking at it and clarifying this point. In the attached I try
> to improve the comment.
>
> > > The fact that it could move forward far enough before we terminate the
> > > process holding the slot is a race condition due to the fact that we released
> > > the mutex.
> > >
> > > If the above looks right to you then 818fefd8fd4 is doing what was "initially"
> > > expected, do you agree?
> > >
> >
> > Based on the discussion and points presented in this thread, I don't
> > agree. I feel we should restore behavior prior to 818fefd8fd4
>
> Done in the attached.
+1 to change the behavior to pre-818fefd8fd4. Here are some review
comments for the v2 patch:
- if (initial_restart_lsn != InvalidXLogRecPtr &&
- initial_restart_lsn < oldestLSN)
+ XLogRecPtr restart_lsn = s->data.restart_lsn;
+
+ if (restart_lsn != InvalidXLogRecPtr &&
+ restart_lsn < oldestLSN)
I would suggest using XLogRecPtrIsInvalid() for better readability.
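(For reference, the suggested form would read roughly as below. XLogRecPtrIsInvalid() is the real macro, a comparison against InvalidXLogRecPtr; the definitions here are local stand-ins so the snippet is self-contained.)

/* Stand-in definitions for the real macros from access/xlogdefs.h. */
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

#define InvalidXLogRecPtr       ((XLogRecPtr) 0)
#define XLogRecPtrIsInvalid(r)  ((r) == InvalidXLogRecPtr)

static bool
restart_lsn_is_obsolete(XLogRecPtr restart_lsn, XLogRecPtr oldestLSN)
{
    /* reads as "the LSN is valid and falls behind the limit" */
    return !XLogRecPtrIsInvalid(restart_lsn) && restart_lsn < oldestLSN;
}

int
main(void)
{
    return restart_lsn_is_obsolete(100, 200) ? 0 : 1;
}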
---
/*
- * The slot's mutex will be released soon, and it is possible that
- * those values change since the process holding the slot has been
- * terminated (if any), so record them here to ensure that we
- * would report the correct invalidation cause.
- *
* Unlike other slot attributes, slot's inactive_since can't be
* changed until the acquired slot is released or the owning
* process is terminated. So, the inactive slot can only be
* invalidated immediately without being terminated.
*/
- if (!terminated)
- {
- initial_restart_lsn = s->data.restart_lsn;
- initial_effective_xmin = s->effective_xmin;
- initial_catalog_effective_xmin = s->effective_catalog_xmin;
- }
invalidation_cause = DetermineSlotInvalidationCause(possible_causes,
s, oldestLSN,
dboid,
After deleting the first paragraph of the comment, the comment starting
with "Unlike other slot attributes..." no longer seems to match what
this block of code does. It needs to be updated or moved to a more
appropriate place.
>
> > and fix
> > the test case which relies on different messages.
>
> I think that 105b2cb3361 fixed it already or do you have something else in
> mind?
IIUC the test instability issue was caused by the fact that a
RUNNING_XACTS record is decoded and catalog_xmin gets advanced between
terminating the slot owner process and invalidating the slot. The
issue was fixed by 105b2cb3361, which uses injection points to stop
the writing of RUNNING_XACTS. 818fefd8fd4 was meant to fix the
inconsistency that the reason for the process termination could be
different from the reason for the subsequent slot invalidation.
Therefore, the test instability issue is already cleared, but I think
we might want to do something to deal with the inconsistency that we
originally wanted to address.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi,
On Mon, Oct 27, 2025 at 10:22:32AM -0700, Masahiko Sawada wrote:
> On Thu, Oct 23, 2025 at 3:07 AM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > Done in the attached.
>
> +1 to change the behavior to pre-818fefd8fd4.
Thanks for looking at it!
> Here are some review
> comments for the v2 patch:
>
> - if (initial_restart_lsn != InvalidXLogRecPtr &&
> - initial_restart_lsn < oldestLSN)
> + XLogRecPtr restart_lsn = s->data.restart_lsn;
> +
> + if (restart_lsn != InvalidXLogRecPtr &&
> + restart_lsn < oldestLSN)
>
> I would suggest using XLogRecPtrIsInvalid() for better readability.
Yeah and I think that should be done consistently, see the proposal in [1].
> ---
> /*
> - * The slot's mutex will be released soon, and it is possible that
> - * those values change since the process holding the slot has been
> - * terminated (if any), so record them here to ensure that we
> - * would report the correct invalidation cause.
> - *
> * Unlike other slot attributes, slot's inactive_since can't be
> * changed until the acquired slot is released or the owning
> * process is terminated. So, the inactive slot can only be
> * invalidated immediately without being terminated.
> */
> - if (!terminated)
> - {
> - initial_restart_lsn = s->data.restart_lsn;
> - initial_effective_xmin = s->effective_xmin;
> - initial_catalog_effective_xmin = s->effective_catalog_xmin;
> - }
>
> invalidation_cause = DetermineSlotInvalidationCause(possible_causes,
> s, oldestLSN,
> dboid,
>
> After deleting the first paragraph of the comment, the comment starting
> with "Unlike other slot attributes..."
Yeah, it has been added in ac0e33136abc.
> no longer seems to match what this
> block of code does.
Agree.
> It needs to be updated or moved to a more
> appropriate place.
What about moving it after?
"
* If the slot can be acquired, do so and mark it invalidated
* immediately. Otherwise we'll signal the owning process, below, and
* retry."
That looks like a good place to me.
>
> >
> > > and fix
> > > the test case which relies on different messages.
> >
> > I think that 105b2cb3361 fixed it already or do you have something else in
> > mind?
>
> IIUC the test instability issue was caused by the fact that a
> RUNNING_XACTS record is decoded and catalog_xmin gets advanced between
> terminating the slot owner process and invalidating the slot. The
> issue was fixed by 105b2cb3361, which uses injection points to stop
> the writing of RUNNING_XACTS.
Right.
> 818fefd8fd4 was meant to fix the
> inconsistency that the reason for the process termination could be
> different from the reason for the subsequent slot invalidation.
Right.
> Therefore, the test instability issue is already cleared,
Yes.
> but I think
> we might want to do something to deal with the inconsistency that we
> originally wanted to address.
I see, you mean that the tests are stable now (thanks to 105b2cb3361) but
that we should still do something for "production" cases? (i.e. not making
use of injection points)
[1]: https://www.postgresql.org/message-id/flat/aQB7EvGqrbZXrMlg@ip-10-97-1-34.eu-west-3.compute.internal
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Oct 28, 2025 at 1:58 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> What about moving it after?
>
> "
> * If the slot can be acquired, do so and mark it invalidated
> * immediately. Otherwise we'll signal the owning process, below, and
> * retry."
>
> That looks like a good place to me.

+1

> > but I think
> > we might want to do something to deal with the inconsistency that we
> > originally wanted to address.
>
> I see, you mean that the tests are stable now (thanks to 105b2cb3361) but
> that we should still do something for "production" cases? (i.e. not making
> use of injection points)

Yes. While it seems we might want to review the past discussion, if
we've concluded such behavior is problematic and could confuse users,
we can do something like improving the invalidation/termination
reports. Or we can do nothing if the current reporting is fine.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Tue, Oct 28, 2025 at 11:53:26AM -0700, Masahiko Sawada wrote: > On Tue, Oct 28, 2025 at 1:58 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: Apologies for not following closely what has been happening on this thread. I have just not been paying much attention, and Sawada-san has just poked at me about its existence today, also regarding the fact that my fingerprints are all over the place (implying that I'm an owner here, even in light of 105b2cb3361 that has introduced the trick to bypass the creation of the annoying standby snapshot records that distabilized the tests). >> I see, you mean that the tests are stable now (thanks to 105b2cb3361) but >> that we should still do something for "production" cases? (i.e not making use >> of injection points). > > Yes. While it seems we might want to review the past discussion, if > we've concluded such behavior is problematic behavior and could > confuse users, we can do something like improving the > invalidation/termination reports. Or we can do nothing if the current > reporting is fine. What do you mean exactly here? An improvement in the reports done when the invalidations are kicked sounds separate to me, that may not be something to touch when it comes to the back-branches. Anyway, coming back to the patch, the argument of relying on the slot's restart_lsn and the two xmins to restore the pre-818fefd8fd4 sounds good to me in light of 105b2cb3361. Extra argument: Is the fact that RS_INVAL_WAL_REMOVED (and optionally RS_INVAL_IDLE_TIMEOUT) is used only by the checkpointer something that we may want to document in the shape of an assertion at the beginning of InvalidateObsoleteReplicationSlots() with an extra condition based on the backend type? Based on the fact that restart_lsn could be updated across two loops of the invalidation logic done by the checkpointer, the answer to my own question is "no" and we may want to trigger this reason from other contexts. Just wondering if that's worth doing, as it has been mentioned upthread. -- Michael
Attachments
Hi,

On Tue, Oct 28, 2025 at 11:53:26AM -0700, Masahiko Sawada wrote:
> On Tue, Oct 28, 2025 at 1:58 AM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > What about moving it after?
> >
> > "
> > * If the slot can be acquired, do so and mark it invalidated
> > * immediately. Otherwise we'll signal the owning process, below, and
> > * retry."
> >
> > That looks like a good place to me.
>
> +1

Done that way in v3 attached. Please note that v3 does not take into
account the XLogRecPtrIsInvalid() remark, as this will be part of a
global effort and is not directly linked to what we want to achieve
here.

> Yes. While it seems we might want to review the past discussion, if
> we've concluded such behavior is problematic and could confuse users,
> we can do something like improving the invalidation/termination
> reports. Or we can do nothing if the current reporting is fine.

It's the test instability that triggered 818fefd8fd4, not any report
from the field. I think that the pre-818fefd8fd4 behavior has been
there for a while and that hitting the inconsistency is a pathological
case. I'd vote for doing nothing unless we get complaints from the
field.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments
On Wed, Oct 29, 2025 at 12:02 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> What do you mean exactly here? An improvement in the reports done when
> the invalidations are kicked sounds separate to me; that may not be
> something to touch when it comes to the back-branches.

I thought the commit 818fefd8fd4 originally came from the complaint
that the cause we reported when terminating the process could differ
from the cause used when invalidating the slot (or we didn't even
report anything if we skipped the slot invalidation), so I proposed
reporting explicitly that the slot's restart_lsn has recovered and we
therefore skipped invalidating it [1]. But I agree it's a separate
discussion.

> Anyway, coming back to the patch, the argument of relying on the
> slot's restart_lsn and the two xmins to restore the pre-818fefd8fd4
> behavior sounds good to me in light of 105b2cb3361.

BTW, do you think restoring the pre-818fefd8fd4 behavior is going to be
backpatched?

> Extra argument: is the fact that RS_INVAL_WAL_REMOVED (and optionally
> RS_INVAL_IDLE_TIMEOUT) is used only by the checkpointer something that
> we may want to document in the shape of an assertion at the beginning
> of InvalidateObsoleteReplicationSlots(), with an extra condition based
> on the backend type? Based on the fact that restart_lsn could be
> updated across two loops of the invalidation logic done by the
> checkpointer, the answer to my own question is "no", and we may want
> to trigger this reason from other contexts. Just wondering if that's
> worth doing, as it has been mentioned upthread.

I don't think we need to put the assertion there, but it would be good
if we could mention somewhere in comments why it's safe to skip the
slot invalidation when the slot's xmin or restart_lsn gets advanced
across two loops.

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoBvL8teXPEXK67_DaUsrft16QQ6XyGvuZVCnT-o8nH2ZA%40mail.gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Wed, Oct 29, 2025 at 11:05:12AM -0700, Masahiko Sawada wrote:
> BTW, do you think restoring the pre-818fefd8fd4 behavior is going to be
> backpatched?

With 105b2cb3361 being introduced in v16, restoring the past behavior
on all the branches sounds good to me. I prefer the long-term solution
that we have in 18 and newer branches, but we don't have
IS_INJECTION_POINT_ATTACHED() in the v16 and v17 branches, so there is
not much more to do.

It would be better to move quickly on this one to make sure that we do
not make v16 and v17 unstable again by the next minor release. With
this timing in mind and an upcoming long weekend in Japan (11/3), I am
tempted to do something about this today, as I'll be able to monitor
the buildfarm for a bit more time.

> I don't think we need to put the assertion there, but it would be good
> if we could mention somewhere in comments why it's safe to skip the
> slot invalidation when the slot's xmin or restart_lsn gets advanced
> across two loops.

Okay, noted.

--
Michael
Attachments
On Wed, Oct 29, 2025 at 08:55:56AM +0000, Bertrand Drouvot wrote:
> Done that way in v3 attached. Please note that v3 does not take into
> account the XLogRecPtrIsInvalid() remark, as this will be part of a
> global effort and is not directly linked to what we want to achieve
> here.

This comment related to inactive_since that you have moved to its new
location was added by ac0e33136abc, for v18~. Agreed that it is
important to keep that documented. The new location of this comment
feels OK.

> It's the test instability that triggered 818fefd8fd4, not any report
> from the field. I think that the pre-818fefd8fd4 behavior has been
> there for a while and that hitting the inconsistency is a pathological
> case. I'd vote for doing nothing unless we get complaints from the
> field.

I am not sure that there is anything else to be done, but let's just
revert the change in v16~ for now. As far as I can see, the change is
straightforward in v16, slightly funky in v17 as "invalidation_cause"
is a rename of "conflict", while your patch captures the refactoring
of v18~ under DetermineSlotInvalidationCause(). I have run a few
hundred loops of 035_standby_logical_decoding for v16 and v17, just in
case, without seeing problems. Now, the original race was also hard to
see.

--
Michael
Attachments
Hi,

On Thu, Oct 30, 2025 at 11:25:18AM +0900, Michael Paquier wrote:
> I am not sure that there is anything else to be done, but let's just
> revert the change in v16~ for now. As far as I can see, the change is
> straightforward in v16, slightly funky in v17 as "invalidation_cause"
> is a rename of "conflict", while your patch captures the refactoring
> of v18~ under DetermineSlotInvalidationCause(). I have run a few
> hundred loops of 035_standby_logical_decoding for v16 and v17, just in
> case, without seeing problems. Now, the original race was also hard to
> see.

The tests are also stabilized on 17 thanks to 17a165d60f73 and on 16
thanks to 86392e8827d8, so I think that we should be fine on those
branches too.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Oct 30, 2025 at 04:18:26AM +0000, Bertrand Drouvot wrote:
> The tests are also stabilized on 17 thanks to 17a165d60f73 and on 16
> thanks to 86392e8827d8, so I think that we should be fine on those
> branches too.

Applied down to v16, including the comment when the LWLock is released,
for consistency across all the branches. The buildfarm will now tell a
bit more.

--
Michael