Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum

From: Peter Geoghegan
Subject: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date:
Msg-id: CAH2-WzmxQMHs9e61Qg0b7admeQc0y+ne_xxAxTLtvmHiJ=FQiA@mail.gmail.com
In reply to: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Andres Freund <andres@anarazel.de>)
Responses: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Andres Freund <andres@anarazel.de>)
Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Dmitry Dolgov <9erthalion6@gmail.com>)
List: pgsql-bugs
On Fri, Nov 12, 2021 at 2:29 PM Andres Freund <andres@anarazel.de> wrote:
> With subtransactions abort is a bit more complicated than with plain
> transactions. I'm not at all sure a problematic scenario exists, but I
> wouldn't want to rely on it.

What would it actually mean to rely on it, or to not rely on it?

As I've pointed out many times already, a disconnected heap tuple
cannot be accessed from an index scan -- this is something that you
*can* rely on, because we've performed exactly the same steps as
heap_hot_search_buffer() would in making that determination. When you
talk about what HTSV thinks of the tuple, you're merely talking about
how to behave in the event of a specific form of HOT chain corruption
(a theoretical background risk for HOT chains that's nothing new).
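
To make the reachability argument concrete, here is a minimal toy model in Python. It is not PostgreSQL code: ToyTuple, reachable_offsets() and disconnected_offsets() are hypothetical names, and the model leaves out LP_REDIRECT items, visibility checks, and locking. It keeps only the two rules that matter here: an index entry points at a chain root, and the walk continues to the next version only when that version's xmin matches the previous version's xmax (the same xmin/xmax comparison heap_hot_search_buffer() makes). Any heap-only tuple the walk never reaches is "disconnected".

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class ToyTuple:
    """Hypothetical, heavily simplified stand-in for a heap tuple."""
    xmin: int                        # inserting transaction id
    xmax: Optional[int] = None       # updating/deleting xid, None if none
    next_off: Optional[int] = None   # successor's offset (models t_ctid)
    heap_only: bool = False          # True: no index entry points at it


def reachable_offsets(page: Dict[int, ToyTuple]) -> Set[int]:
    """Offsets an index scan could visit by walking chains from their roots."""
    reached: Set[int] = set()
    for root_off, root in page.items():
        if root.heap_only:
            continue  # index entries only ever point at chain roots
        prior_xmax: Optional[int] = None
        cur: Optional[int] = root_off
        while cur is not None and cur in page and cur not in reached:
            tup = page[cur]
            # The chain is broken unless xmin matches the predecessor's xmax.
            if prior_xmax is not None and tup.xmin != prior_xmax:
                break
            reached.add(cur)
            if tup.xmax is None or tup.next_off is None:
                break  # end of chain
            prior_xmax = tup.xmax
            cur = tup.next_off
    return reached


def disconnected_offsets(page: Dict[int, ToyTuple]) -> Set[int]:
    """Heap-only tuples that no chain walk can reach."""
    heap_only = {off for off, tup in page.items() if tup.heap_only}
    return heap_only - reachable_offsets(page)


# Assumed scenario: xid 101 updated the root and aborted, leaving offset 2
# behind; xid 102 then updated the root again, so the root points at offset 3.
example_page = {
    1: ToyTuple(xmin=100, xmax=102, next_off=3),
    2: ToyTuple(xmin=101, heap_only=True),
    3: ToyTuple(xmin=102, heap_only=True),
}
print(disconnected_offsets(example_page))
```

In this sketch, offset 2 stays unreachable whatever a separate visibility routine would say about it, which is the sense in which the determination can be relied on independently of HTSV.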

This is a question of trade-offs around adding defensive checks and so
on. It is not a question of making the corruption itself any less
likely (unless early detection allows the user to prevent further
corruption). I'm a bit confused here, because it sounds like you might
not agree with that.

> > Naturally, I also went through the exercise of trying to find a
> > counterexample, where pruning doesn't see a disconnected tuple as DEAD
> > in its HTSV. I could not get the assertion to fail with Alexander's
> > test case, nor with make check-world.
>
> I don't think that provides meaningful coverage. Alexander's test has a
> quite limited set of operations (which e.g. doesn't include any subxacts), and our
> own tests around subtransactions, and particularly concurrent subtransaction
> heavy work, are quite, uh, minimal.

It's a start.

We need to be pragmatic here. There is some uncertainty about what
HTSV might say about a disconnected tuple in the absence of
corruption, or there is a risk of a new problem like that coming up in
the future -- let's work within those confines, then. What do you want
to do about that? There aren't that many choices, since, to repeat,
the tuple is "morally" DEAD no matter what. Even with corruption, even
without corruption in the presence of some unanticipated corner case
with HTSV -- this is fundamental.

-- 
Peter Geoghegan


