Re: vac_truncate_clog()'s bogus check leads to bogusness

Поиск
Список
Период
Сортировка
От Noah Misch
Тема Re: vac_truncate_clog()'s bogus check leads to bogusness
Дата
Msg-id 20230625171324.GA2510034@rfd.leadboat.com
обсуждение исходный текст
Ответ на Re: vac_truncate_clog()'s bogus check leads to bogusness  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On Fri, Jun 23, 2023 at 06:41:58PM -0700, Andres Freund wrote:
> On 2023-06-21 21:50:39 -0700, Noah Misch wrote:
> > On Wed, Jun 21, 2023 at 05:46:37PM -0700, Andres Freund wrote:
> > > A related issue is that as far as I can tell the determination of what is
> > > bogus is bogus.
> > > 
> > > The relevant cutoffs are determined vac_update_datfrozenxid() using:
> > > 
> > >     /*
> > >      * Identify the latest relfrozenxid and relminmxid values that we could
> > >      * validly see during the scan.  These are conservative values, but it's
> > >      * not really worth trying to be more exact.
> > >      */
> > >     lastSaneFrozenXid = ReadNextTransactionId();
> > >     lastSaneMinMulti = ReadNextMultiXactId();
> > > 
> > > but doing checks based on thos is bogus, because:
> > > 
> > > a) a concurrent create table / truncate / vacuum can update
> > >    pg_class.relfrozenxid of some relation in the current database to a newer
> > >    value, after lastSaneFrozenXid already has been determined. If that
> > >    happens, we skip updating pg_database.datfrozenxid.
> > > 
> > > b) A concurrent vacuum in another database, ending up in vac_truncate_clog(),
> > >    can compute a newer datfrozenxid. In that case the vac_truncate_clog() with
> > >    the outdated lastSaneFrozenXid will not truncate the clog (and also forget
> > >    to release WrapLimitsVacuumLock currently, as reported upthread) and not
> > >    call SetTransactionIdLimit(). The latter is particularly bad, because that
> > >    means we might not come out of "database is not accepting commands" land.
> > 
> > > I guess we could just recompute the boundaries before actually believing the
> > > catalog values are bogus?
> > 
> > That's how I'd do it.
> 
> I was looking at doing that and got confused by the current code. Am I missing
> something, or does vac_truncate_clog() have two pretty much identical attempts
> at a safety measures?
> 
> void
> vac_update_datfrozenxid(void)
> ...
>     lastSaneFrozenXid = ReadNextTransactionId();
> ...
>         vac_truncate_clog(newFrozenXid, newMinMulti,
>                           lastSaneFrozenXid, lastSaneMinMulti);
> }
> ...
> static void
> vac_truncate_clog(TransactionId frozenXID,
>                   MultiXactId minMulti,
>                   TransactionId lastSaneFrozenXid,
>                   MultiXactId lastSaneMinMulti)
> {
>     TransactionId nextXID = ReadNextTransactionId();
> ...
>         /*
>          * If things are working properly, no database should have a
>          * datfrozenxid or datminmxid that is "in the future".  However, such
>          * cases have been known to arise due to bugs in pg_upgrade.  If we
>          * see any entries that are "in the future", chicken out and don't do
>          * anything.  This ensures we won't truncate clog before those
>          * databases have been scanned and cleaned up.  (We will issue the
>          * "already wrapped" warning if appropriate, though.)
>          */
>         if (TransactionIdPrecedes(lastSaneFrozenXid, datfrozenxid) ||
>             MultiXactIdPrecedes(lastSaneMinMulti, datminmxid))
>             bogus = true;
> 
>         if (TransactionIdPrecedes(nextXID, datfrozenxid))
>             frozenAlreadyWrapped = true;
> 
> lastSaneFrozenXid is a slightly older version of ReadNextTransactionId(),
> that's the only difference afaict.

I don't think you missed anything.  nextXID and lastSaneFrozenXid are both
just caches of ReadNextTransactionId().  Each can become stale enough to make
those comparisons suggest trouble when all is fine.

> I guess this might be caused by 78db307bb23 adding the check, but using
> GetOldestXmin(NULL, true) to determine lastSaneFrozenXid. That was changed
> soon after, in 87f830e0ce03.

Yeah.  The nextXID check is from 9c54cfb (2002-04), and the newer check
converged with it in 87f830e0ce03 (2014-07).


While less important, some other things look weak in these functions:

- The only non-corruption cause to reach the "don't want to let datfrozenxid
  go backward" code is for GetOldestNonRemovableTransactionId(NULL) to go
  backward, e.g. if a walsender starts up and advertises an xmin.  One could
  eliminate that cause by replacing "newFrozenXid =
  GetOldestNonRemovableTransactionId(NULL)" with initialization from the first
  relfrozenxid, analogous to how vac_truncate_clog() initializes.
  vac_update_datfrozenxid() could then warn if the prevention code intervenes.
  Perhaps, instead of preventing the go-backwards, it should apply the
  go-backward change after warning?  (Unlike datfrozenxid, datminmxid going
  backward already implies corruption.)

- The "some databases have not been vacuumed in over 2 billion transactions"
  message is false more often than not.  More likely, something corrupted a
  frozen ID.  The message is also missing the opportunity to indicate one of
  the affected databases.

- vac_truncate_clog() bogosity checks examine XIDs only, not multis.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: DROP DATABASE is interruptible
Следующее
От: Steve Chavez
Дата:
Сообщение: Fwd: Castable Domains for different JSON representations