Re: recovering from "found xmin ... from before relfrozenxid ..."

From: Andres Freund
Subject: Re: recovering from "found xmin ... from before relfrozenxid ..."
Message-ID: 20200714193126.2xxtinqnm4hogzjt@alap3.anarazel.de
In reply to: Re: recovering from "found xmin ... from before relfrozenxid ..."  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

Hi,

On 2020-07-13 21:18:10 -0400, Robert Haas wrote:
> On Mon, Jul 13, 2020 at 9:10 PM Andres Freund <andres@anarazel.de> wrote:
> > > What if clog has been truncated so that the xmin can't be looked up?
> >
> > That's possible, but probably only in cases where xmin actually
> > committed.
> 
> Isn't that the normal case? I'm imagining something like:
> 
> - Tuple gets inserted. Transaction commits.
> - VACUUM processes table.
> - Mischievous fairies mark page all-visible in the visibility map.
> - VACUUM runs lots more times, relfrozenxid advances, but without ever
> looking at the page in question, because it's all-visible.
> - clog is truncated, rendering xmin no longer accessible.
> - User runs VACUUM disabling page skipping, gets ERROR.
> - User deletes offending tuple.
> - At this point, I think the tuple is both invisible and unprunable?
> - Fairies happy, user sad.

I'm not saying it's impossible for that to happen, but the cases I did
investigate didn't look like this. If something had just gone rogue and
written to the VM, I'd expect a lot more "is not marked all-visible but
visibility map bit is set in relation" type WARNINGs, and I've not seen
many of those (they're only WARNINGs though, so maybe we wouldn't hear
about them). Presumably this wouldn't always happen only with tuples
that'd trigger an error first during hot pruning.
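
For readers following along, the quoted scenario can be sketched as a toy
model. This is only an illustration under loud assumptions, not PostgreSQL
code: `Page`, `vacuum`, `FrozenXidError` and `xid_precedes` are hypothetical
names, relfrozenxid is passed in by hand rather than computed, and freezing,
the all-frozen bit and clog are ignored entirely. The only thing it shows is
that a page wrongly marked all-visible can hide an old xmin until a vacuum
that does not skip pages finally looks at it.

```python
from dataclasses import dataclass
from typing import List


def xid_precedes(a: int, b: int) -> bool:
    """Wraparound-aware 'a is older than b' for normal 32-bit xids.

    Equivalent to checking that the signed 32-bit difference a - b is negative.
    """
    return ((a - b) & 0xFFFFFFFF) >= 0x80000000


@dataclass
class Page:
    tuple_xmins: List[int]     # xmin of each (unfrozen) tuple on the page
    all_visible: bool = False  # toy stand-in for the visibility map bit


class FrozenXidError(Exception):
    """Toy stand-in for the ERROR discussed in this thread."""


def vacuum(pages: List[Page], relfrozenxid: int,
           skip_all_visible: bool = True) -> None:
    """Visit pages and complain about any xmin older than relfrozenxid."""
    for page in pages:
        if skip_all_visible and page.all_visible:
            continue  # the page is never examined
        for xmin in page.tuple_xmins:
            if xid_precedes(xmin, relfrozenxid):
                raise FrozenXidError(
                    f"found xmin {xmin} from before relfrozenxid {relfrozenxid}"
                )


page = Page(tuple_xmins=[1000])
vacuum([page], relfrozenxid=900)    # fine: xmin is not before relfrozenxid
page.all_visible = True             # the "fairies" set the VM bit
vacuum([page], relfrozenxid=5000)   # page skipped, nothing is noticed
try:
    vacuum([page], relfrozenxid=5000, skip_all_visible=False)
except FrozenXidError as err:
    print(err)  # found xmin 1000 from before relfrozenxid 5000
```

The sketch says nothing about how the VM bit or relfrozenxid actually got
into that state, which is the open question here.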

I've definitely seen indications of both datfrozenxid and relfrozenxid
getting corrupted (in particular vac_update_datfrozenxid being racy as
hell), xid wraparound, indications of multixact problems (although it's
possible we've now fixed those) and some signs of corrupted relcache
entries for shared relations leading to vacuums being skipped.

Greetings,

Andres Freund


