Re: relfrozenxid may disagree with row XIDs after 1ccc1e05ae

Поиск
Список
Период
Сортировка
От Noah Misch
Тема Re: relfrozenxid may disagree with row XIDs after 1ccc1e05ae
Дата
Msg-id 20240520204818.42.nmisch@google.com
обсуждение исходный текст
Ответ на Re: relfrozenxid may disagree with row XIDs after 1ccc1e05ae  (Melanie Plageman <melanieplageman@gmail.com>)
Ответы Re: relfrozenxid may disagree with row XIDs after 1ccc1e05ae
Список pgsql-bugs
On Mon, May 20, 2024 at 11:58:23AM -0400, Melanie Plageman wrote:
> On Sat, May 18, 2024 at 6:23 PM Noah Misch <noah@leadboat.com> wrote:
> > Are there obstacles to fixing the hang by back-patching 1ccc1e05ae instead of
> > this?  We'll need to get confident about 1ccc1e05ae before v17, and that
> > sounds potentially easier than getting confident about both 1ccc1e05ae and
> > this other approach.
> 
> I haven't tried back-patching 1ccc1e05ae yet, but I don't understand
> why we would want to use stable back branches to get comfortable with
> an approach committed to an unreleased version of Postgres.

I wouldn't say we want to use stable back branches to get comfortable with an
approach.  I wanted to say that it's less work to be confident about one new
way of doing things than two new ways of doing things.

> The small fix proposed in this thread is extremely minimal and
> straightforward. It seems much less risky as a backpatch.

Here's how I model the current and proposed code:

1. v15 VACUUM removes tuples that are HEAPTUPLE_DEAD according to VisTest.
   OldestXmin doesn't cause tuple removal, but there's a hang when OldestXmin
   rules DEAD after VisTest ruled RECENTLY_DEAD.

2. With 1ccc1e05ae, v17 VACUUM still removes tuples that are HEAPTUPLE_DEAD
   according to VisTest only.  OldestXmin doesn't come into play.

3. fix_hang_15.patch would make v15 VACUUM remove tuples that are
   HEAPTUPLE_DEAD according to _either_ VisTest or OldestXmin.

Since (3) is the only list entry that removes tuples that VisTest ruled
RECENTLY_DEAD, I find it higher risk.  For all three, the core task of
certifying the behavior is confirming that its outcome is sound when VisTest
and OldestXmin disagree.  How do you model it?

> > Regarding getting confident about 1ccc1e05ae, I think I
> > follow the upthread arguments that it does operate correctly.  As a cross
> > check, I looked at each mention of oldestxmin in vacuumlazy.c and vacuum.c.
> > Does the following heap_vacuum_rel() comment need an update?
> >
> >         /*
> >          * Get cutoffs that determine which deleted tuples are considered DEAD,
> >          * not just RECENTLY_DEAD, and which XIDs/MXIDs to freeze.  Then determine

Since 1ccc1e05ae, the cutoffs (values acquired in vacuum_get_cutoffs()) don't
decide DEAD.

> >          * the extent of the blocks that we'll scan in lazy_scan_heap.  It has to

Still accurate.

> >          * happen in this order to ensure that the OldestXmin cutoff field works
> >          * as an upper bound on the XIDs stored in the pages we'll actually scan

That sentence's problem predates 1ccc1e05ae.  (OldestXmin is a lower bound on
XIDs in pages we don't scan, not an upper bound on XIDs in pages we scan.  To
be pedantic, snapshots don't affect this bound, so the oldest running XID is
also a lower bound.)

> >          * (NewRelfrozenXid tracking must never be allowed to miss unfrozen XIDs).

Still accurate.

> >          *
> >          * Next acquire vistest, a related cutoff that's used in pruning.  We

Still accurate.

> >          * expect vistest will always make heap_page_prune_and_freeze() remove any
> >          * deleted tuple whose xmax is < OldestXmin.

Since 1ccc1e05ae, that sentence does not apply.

> Which part are you thinking needs an update? The part about vistest
> always making heap_page_prune_and_freeze() remove any deleted tuples
> whose xmax < OldestXmin?

That and one other sentence marked above.


(Long-term, I favor removal of the cutoffs->OldestXmin field.  I'm not aware
of any calculation for which it remains the optimal value.)



В списке pgsql-bugs по дате отправления:

Предыдущее
От: "Haifang Wang (Centific Technologies Inc)"
Дата:
Сообщение: RE: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: [EXTERNAL] Re: Windows Application Issues | PostgreSQL | REF # 48475607