Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum

From Andres Freund
Subject Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Date
Msg-id 20211112225725.2a32slgl5ou3dvre@alap3.anarazel.de
In response to Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-bugs
Hi,

On 2021-11-12 14:46:22 -0800, Peter Geoghegan wrote:
> On Fri, Nov 12, 2021 at 2:29 PM Andres Freund <andres@anarazel.de> wrote:
> > With subtransactions abort is a bit more complicated than with plain
> > transactions. I'm not at all sure a problematic scenario exists, but I
> > wouldn't want to rely on it.
> 
> What would it actually mean to rely on it, or to not rely on it?

That we shouldn't throw an error / assert out if we find such a tuple.


> As I've pointed out many times already, a disconnected heap tuple
> cannot be accessed from an index scan -- this is something that you
> *can* rely on, because we've performed exactly the same steps as
> heap_hot_search_buffer() would in making that determination.

Yes, it'd also not be considered visible by SatisfiesMVCC().


> When you talk about what HTSV thinks of the tuple, you're merely talking
> about how to behave in the event of a specific form of HOT chain corruption
> (a theoretical background risk for HOT chains that's nothing new).

My point is that I don't think it necessarily signals corruption. It could
instead be a very short-term transient state under heavy concurrency.


> We need to be pragmatic here. There is some uncertainty about what
> HTSV might say about a disconnected tuple in the absence of
> corruption, or there is a risk of a new problem like that coming up in
> the future -- let's work within those confines, then. What do you want
> to do about that? There aren't that many choices, since, to repeat,
> the tuple is "morally" DEAD no matter what. Even with corruption, even
> without corruption in the presence of some unanticipated corner case
> with HTSV -- this is fundamental.

I think we can assert/error out if it's visible; that's clearly
corruption. I'd personally not add assert/error checks for other states, given
that they could plausibly happen without indicating a problem. Debugging
transient errors that happen rarely, under high load, with nontrivial
workloads isn't fun.
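
To make the proposed policy concrete, here is a minimal standalone sketch in C.
It is not PostgreSQL code: the TupleState enum and the function name are
hypothetical stand-ins, and treating "visible" as a single TUPLE_LIVE state is
an assumption made only for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the states a visibility check might report
 * for a disconnected heap tuple. These names are illustrative only.
 */
typedef enum TupleState
{
    TUPLE_DEAD,
    TUPLE_RECENTLY_DEAD,
    TUPLE_INSERT_IN_PROGRESS,
    TUPLE_DELETE_IN_PROGRESS,
    TUPLE_LIVE                  /* assumed to mean "visible" */
} TupleState;

/*
 * Returns true only when a disconnected tuple should be reported as
 * corruption. Under the policy sketched here, a visible tuple is the one
 * state that warrants an assert/error; every other state is tolerated,
 * since it might be a short-lived transient state under concurrency.
 */
static bool
disconnected_tuple_is_corruption(TupleState state)
{
    return state == TUPLE_LIVE;
}
```

A caller would raise an error only when this returns true, and otherwise just
treat the disconnected tuple as dead without complaint.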

Greetings,

Andres Freund


