At Fri, 19 Jun 2020 08:59:48 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
> > > > Mmm. pg_walfile_name seems too specialize to
> > > > pg_stop_backup(). (pg_walfile_name_offset() returns wrong result for
> > > > segment boundaries.) I'm not willing to do that only to follow such
> > > > suspicious(?) specification, but surely it would practically be better
> > > > doing that. Please find the attached first patch.
> > > >
> > >
> > > It is a little unclear to me how this or any proposed patch will solve
> > > the original problem reported by Fujii-San? Basically, the problem
> > > arises because we don't have an interlock between when the checkpoint
> > > removes the WAL segment and the view tries to acquire the same. Am, I
> > > missing something?
> >
> > I'm not sure, but I don't get the point of blocking WAL segment
> > removal until the view is completed.
> >
>
> I am not suggesting to do that.
>
> > The said columns of the view are
> > just for monitoring, which needs an information snapshot seemingly
> > taken at a certain time. And InvalidateObsoleteReplicationSlots kills
> > walsenders using lastRemovedSegNo of a different time. The two are
> > independent each other.
> >
> > Also the patch changes min_safe_lsn to show an LSN at segment boundary
> > + 1.
> >
>
> But aren't we doing last_removed_seg+1 even without the patch? See code below
>
> - {
> - XLogRecPtr min_safe_lsn;
> -
> - XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 0,
> - wal_segment_size, min_safe_lsn);
It is at the beginning byte of the *next* segment. Fujii-san told that
it should be the next byte of it, namely
"XLogSegNoOffsetToRecPtr(last_removed_seg + 1, *1*,", and the patch
calculates as that. It adds the follows instead.
+ if (max_slot_wal_keep_size_mb >= 0 && last_removed_seg != 0)
+ XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 1,
+ wal_segment_size, min_safe_lsn);
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center