Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum

From: Andres Freund
Subject: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Msg-id: 20211110192010.ckvfzz352hsba5xf@alap3.anarazel.de
In response to: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Peter Geoghegan <pg@bowt.ie>)
Responses: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Peter Geoghegan <pg@bowt.ie>)
Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Peter Geoghegan <pg@bowt.ie>)
Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Andres Freund <andres@anarazel.de>)
List: pgsql-bugs

Hi,

On 2021-11-09 15:31:37 -0800, Peter Geoghegan wrote:
> I'm not sure why this seems to have become more of a problem following
> the snapshot scalability work from Andres -- Alexander mentioned that
> commit dc7420c2 looked like it was the source of the problem here, but
> I can't see any reason why that might be true (even though I accept
> that it might well *appear* to be true). I believe Andres has some
> theory on that, but I don't know the details myself. AFAICT, this is a
> live bug on all supported versions. We simply weren't being careful
> enough about breaking the invariant that an LP_REDIRECT can only point
> to a valid heap-only tuple. The really surprising thing here is that
> it took this long for it to visibly break.

The way this definitely breaks - I have been able to reproduce this in
isolation - is when one tuple is processed twice by heap_prune_chain(), and
the result of HeapTupleSatisfiesVacuum() changes from
HEAPTUPLE_DELETE_IN_PROGRESS to HEAPTUPLE_DEAD between the two calls.

Consider a page like this:

lp 1: redirect to lp2
lp 2: deleted by xid x, not yet committed

and a sequence of events like this:

1) heap_prune_chain(rootlp = 1)
2) commit x
3) heap_prune_chain(rootlp = 2)

1) heap_prune_chain(rootlp = 1) will follow the redirect to lp 2, see
HEAPTUPLE_DELETE_IN_PROGRESS, and thus not do anything.

3) could then, with the snapshot scalability changes, get DEAD back from
HTSV (HeapTupleSatisfiesVacuum()). Due to the "fuzzy" nature of the
post-snapshot-scalability xid horizons, that is possible, because we can end
up rechecking the boundary condition and seeing that the horizon now allows
us to prune x / lp 2.

At that point lp 2 has been pruned while lp 1 still redirects to it, so we
have a redirect pointing into an unused slot. That is "illegal", because
something independent can be inserted into that slot.
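
The sequence above can be replayed with a toy model. This is only an
illustrative sketch in Python, not PostgreSQL code: the dict-based page, the
string results of htsv(), the prune_chain() and replay() helpers, and the
concrete xid and horizon values are all assumptions made for the example. It
models just two cases: a redirect root whose target is checked, and a
heap-only tuple visited as its own root, which is marked unused when dead.

```python
from enum import Enum


class LP(Enum):
    UNUSED = 0
    NORMAL = 1
    REDIRECT = 2
    DEAD = 3


def htsv(xmax, committed, horizon):
    """Toy stand-in for HeapTupleSatisfiesVacuum() on a deleted tuple."""
    if xmax not in committed:
        return "DELETE_IN_PROGRESS"  # no horizon test on this path
    return "DEAD" if xmax < horizon else "RECENTLY_DEAD"


def prune_chain(page, root, committed, horizon):
    """Toy stand-in for heap_prune_chain(); only two cases are modelled."""
    item = page[root]
    if item["kind"] is LP.REDIRECT:
        target = item["to"]
        if htsv(page[target]["xmax"], committed, horizon) == "DEAD":
            # Whole chain is dead: root becomes a dead stub, member is freed.
            page[root] = {"kind": LP.DEAD}
            page[target] = {"kind": LP.UNUSED}
    elif item["kind"] is LP.NORMAL and item["heap_only"]:
        if htsv(item["xmax"], committed, horizon) == "DEAD":
            # Heap-only tuple handled as its own root: freed directly.
            page[root] = {"kind": LP.UNUSED}


def replay(x=100):
    page = {
        1: {"kind": LP.REDIRECT, "to": 2},
        2: {"kind": LP.NORMAL, "heap_only": True, "xmax": x},
    }
    committed = set()
    prune_chain(page, 1, committed, horizon=x)      # 1) x still in progress
    committed.add(x)                                # 2) commit x
    prune_chain(page, 2, committed, horizon=x + 1)  # 3) horizon rechecked
    return page


page = replay()
dangling = (page[1]["kind"] is LP.REDIRECT
            and page[page[1]["to"]]["kind"] is LP.UNUSED)
print("dangling redirect:", dangling)
```

In this model lp 1 ends up as a redirect to an lp 2 that is unused, which is
the broken state described above. Had step 1 run after the commit, the
redirect branch would have handled both items together.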


What made this hard to understand (and likely hard to hit) is that we don't
recompute the xid horizons more than once per HOT pruning ([1]). At first I
concluded that a change from RECENTLY_DEAD to DEAD could thus not happen - and
it doesn't: We go from HEAPTUPLE_DELETE_IN_PROGRESS to DEAD, which is possible
because there was no horizon test for HEAPTUPLE_DELETE_IN_PROGRESS.
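
A small sketch of that distinction, again as a toy Python model rather than
the real code. The Horizon class, its single-recompute budget, the string
results, and the xid values 90/100/101 are assumptions chosen for
illustration. A tuple whose deleter had already committed uses up the one
recomputation on the first check, so the second check sees the same boundary.
A tuple whose deleter was still in progress never touches the horizon on the
first check, so the recomputation is still available on the second.

```python
class Horizon:
    """Toy xid horizon that is recomputed at most once per pruning pass."""

    def __init__(self, cached, compute):
        self.value = cached        # possibly stale cached boundary
        self._compute = compute    # callable returning the accurate boundary
        self.recomputed = False

    def removable(self, xid):
        if xid < self.value:
            return True
        if not self.recomputed:
            # Boundary case: pay for one accurate recomputation, then recheck.
            self.value = self._compute()
            self.recomputed = True
        return xid < self.value


def htsv(xmax, committed, horizon):
    if xmax not in committed:
        return "DELETE_IN_PROGRESS"  # returns before any horizon test
    return "DEAD" if horizon.removable(xmax) else "RECENTLY_DEAD"


def two_checks(committed_at_start):
    accurate = {"value": 100}
    horizon = Horizon(cached=90, compute=lambda: accurate["value"])
    committed = {100} if committed_at_start else set()
    first = htsv(100, committed, horizon)
    committed.add(100)         # deleter is committed by the second check
    accurate["value"] = 101    # the accurate horizon has moved past xid 100
    second = htsv(100, committed, horizon)
    return first, second


print(two_checks(True))   # ('RECENTLY_DEAD', 'RECENTLY_DEAD')
print(two_checks(False))  # ('DELETE_IN_PROGRESS', 'DEAD')
```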



Note that there are several paths in versions before 14 that cause HTSV()'s
answer to change for the same xid. E.g. when the transaction inserting a tuple
version aborts, we go from HEAPTUPLE_INSERT_IN_PROGRESS to DEAD. But I haven't
quite found a path to trigger problems with that, because there won't be
redirects to a tuple version that is HEAPTUPLE_INSERT_IN_PROGRESS (but there
can be redirects to a HEAPTUPLE_DELETE_IN_PROGRESS or RECENTLY_DEAD one).



I hit a crash once in 13 with a slightly evolved version of the test (many
connections creating / dropping the partitions as in the original scenario,
using :client_id to target different tables). It's possible that my
instrumentation was the cause of that. Unfortunately it took quite a few hours
to hit the problem in 13...



Greetings,

Andres Freund

[1] It's a bit more complicated than that: we only recompute the horizon when
a) we've not done it before in the current xact, b) RecentXmin changed during
a snapshot computation. Recomputing the horizon is expensive-ish, so we don't
want to do it constantly.


