Re: BUG #17245: Index corruption involving deduplicated entries

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: BUG #17245: Index corruption involving deduplicated entries
Дата
Msg-id CAH2-Wzk-4_raTzawWGaiqNvkpwDXxv3y1AQhQyUeHfkU=tFCeA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BUG #17245: Index corruption involving deduplicated entries  (Andres Freund <andres@anarazel.de>)
Ответы Re: BUG #17245: Index corruption involving deduplicated entries  (Andres Freund <andres@anarazel.de>)
Список pgsql-bugs
On Sat, Oct 30, 2021 at 1:42 PM Andres Freund <andres@anarazel.de> wrote:
> I think it probably is worth adding an error check someplace that verifies
> that problems of this kind will be detected with, uh, less effort.

The draft fix is something I wanted to get out quickly to signal that
the bug is definitely going to be fixed soon. I'll need to carefully
think about this some more on Monday, to make sure that it becomes
much harder to break in roughly the same way once again.

> I think it'd also be good to add a test that specifically verifies that
> parallel vacuum doesn't have a bug around "parallel worthy" and not "parallel
> worthy" indexes. It's too easy a mistake to make, and because visible
> corruption is delayed, it's likely that we won't detect such cases.

I agree.

> ISTM that at least a basic version of this is worth doing as a check throwing
> an ERROR, rather than an assertion. It's hard to believe this'd be a
> significant portion of the cost of heap_index_delete_tuples(), and I think it
> would help catch problems a lot earlier.

I like the idea of doing that, but only on master. I think that we
could do this as an ERROR, provided we avoid adding a new inner loop
to heap_index_delete_tuples() --- we can definitely afford that. And
we can even have a nice, detailed error message that blames a
particular tuple from a specific index page, pointing to a specific
heap TID.

Separately, we should add an assertion that catches cases where a TID
in the index points to an LP_REDIRECT line pointer, which does not
point to a heap tuple with storage. That check has much less practical
value, and necessitates adding a new inner loop, which is why an
assert seems good enough to me. (The patch I've posted doesn't have
any of this LP_REDIRECT business, but my v2 revision will.)

-- 
Peter Geoghegan



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
Следующее
От: Andres Freund
Дата:
Сообщение: Re: BUG #17245: Index corruption involving deduplicated entries