At Wed, 14 Jul 2021 19:10:26 -0400, Jeff Janes <jeff.janes@gmail.com> wrote in
> On Tue, Jul 13, 2021 at 10:12 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com>
> wrote:
> > Useless WAL files will be removed after a checkpoint runs.
> >
>
> They should be, but they are not. That is the bug. They just hang
> around, checkpoint after checkpoint. Some of them do get cleaned up, to
> make up for new ones created during that cycle. It treats
> max_slot_wal_keep_size the same way it treats wal_keep_size (but only if a
> "lost" slot is hanging around). If you drop the lost slot, only then does
> it remove all the accumulated WAL at the next checkpoint.

Thanks! I see the issue now. Some investigation showed me dubious
behavior of XLogCtl->replicationSlotMinLSN. Slot invalidation
forgets to recalculate it, and the stale value holds back the segment
horizon.
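
As an illustration only, the effect can be sketched with a toy model.
This is not PostgreSQL code: every toy_* name, the ToyState layout and
the segment numbers are made up, and the retention rule is a simplified
reading of the behavior described above. With one slot stuck at segment
10 and a limit of 20 segments, the horizon without the recalculation
stays 20 segments behind the current one at every checkpoint (80, then
180), while recalculating after invalidation lets it advance to the
redo segment (100).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_SLOTS 4

typedef uint64_t SegNo;          /* toy segment number; 0 means "none" */

typedef struct ToySlot {
    SegNo restart_seg;           /* oldest segment the slot needs (>= 1) */
    bool  valid;                 /* false once the slot is invalidated */
} ToySlot;

typedef struct ToyState {
    ToySlot slots[TOY_MAX_SLOTS];
    size_t  nslots;
    SegNo   cached_min_seg;      /* stands in for replicationSlotMinLSN */
} ToyState;

/* Recompute the cached minimum over the slots that are still valid. */
static void toy_compute_required(ToyState *st)
{
    SegNo min_seg = 0;

    assert(st->nslots <= TOY_MAX_SLOTS);
    for (size_t i = 0; i < st->nslots; i++) {
        const ToySlot *s = &st->slots[i];
        if (s->valid && (min_seg == 0 || s->restart_seg < min_seg))
            min_seg = s->restart_seg;
    }
    st->cached_min_seg = min_seg;
}

/* Oldest segment to keep: the redo segment, pulled back by the cached
 * slot minimum, but by at most max_keep segments behind cur_seg. */
static SegNo toy_keep_seg(const ToyState *st, SegNo redo_seg,
                          SegNo cur_seg, SegNo max_keep)
{
    SegNo keep = redo_seg;

    if (st->cached_min_seg != 0) {
        SegNo slot_keep = st->cached_min_seg;
        if (cur_seg > max_keep && slot_keep < cur_seg - max_keep)
            slot_keep = cur_seg - max_keep;
        if (slot_keep < keep)
            keep = slot_keep;
    }
    return keep;
}

/* Invalidate slots that fell behind keep_seg. Deliberately leaves
 * cached_min_seg untouched, which is the misbehavior being modeled. */
static void toy_invalidate_obsolete(ToyState *st, SegNo keep_seg)
{
    for (size_t i = 0; i < st->nslots; i++)
        if (st->slots[i].valid && st->slots[i].restart_seg < keep_seg)
            st->slots[i].valid = false;
}

/* One checkpoint cycle; recompute=true models the proposed change. */
static SegNo toy_checkpoint_horizon(ToyState *st, SegNo redo_seg,
                                    SegNo cur_seg, SegNo max_keep,
                                    bool recompute)
{
    SegNo keep = toy_keep_seg(st, redo_seg, cur_seg, max_keep);

    toy_invalidate_obsolete(st, keep);
    if (recompute) {
        toy_compute_required(st);
        keep = toy_keep_seg(st, redo_seg, cur_seg, max_keep);
    }
    return keep;
}
```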

So the attached patch worked for me. I'll repost a polished version
including a test.

regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index c7c928f50b..0fc0feb88e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9301,6 +9301,15 @@ CreateCheckPoint(int flags)
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
KeepLogSeg(recptr, &_logSegNo);
InvalidateObsoleteReplicationSlots(_logSegNo);
+
+ /*
+ * Some slots may have been invalidated; recalculate the segments to keep
+ * based on the remaining slots.
+ */
+ ReplicationSlotsComputeRequiredLSN();
+ XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
+ KeepLogSeg(recptr, &_logSegNo);
+
_logSegNo--;
RemoveOldXlogFiles(_logSegNo, RedoRecPtr, recptr);
@@ -9641,6 +9650,15 @@ CreateRestartPoint(int flags)
endptr = (receivePtr < replayPtr) ? replayPtr : receivePtr;
KeepLogSeg(endptr, &_logSegNo);
InvalidateObsoleteReplicationSlots(_logSegNo);
+
+ /*
+ * Some slots may have been invalidated; recalculate the segments to keep
+ * based on the remaining slots.
+ */
+ ReplicationSlotsComputeRequiredLSN();
+ XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
+ KeepLogSeg(endptr, &_logSegNo);
+
_logSegNo--;
/*