Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows
Дата
Msg-id 10e8dbad-d21a-b22b-6677-c5c9df9ea6ee@iki.fi
обсуждение исходный текст
Ответ на Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows  (Peter Geoghegan <pg@bowt.ie>)
Ответы Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-bugs
On 17/07/2021 04:56, Peter Geoghegan wrote:
> On Fri, Jul 16, 2021 at 5:30 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Hmm, seems we should fix that. But could a prematurely recycled deleted
>> page cause permanent corruption?
> 
> If scans can find a page that is wholly unrelated to the expected page
> (and possibly even in the wrong high level page category), then it's
> really hard to predict what might happen. This could lead to real
> chaos.

Right. Looking at the code, I don't think that can happen. The code that 
deals with the pending list uses lock coupling, so it should never 
follow a pointer to a page that was concurrently deleted. Might 
something funny happen if a pending list page is deleted and immediately 
reused as another kind of page? I don't see a problem with that. And a 
gin btree page won't be reused prematurely as a pending list page, 
because ginNewBuffer() checks GinPageIsRecyclable(), even if the page is 
being reused as a pending list page.

> ginInsertCleanup() makes no attempt to perform basic validation
> of its assumptions about what kind of page this is, except for some
> assertions. We should have something like a "can't happen" error on
> !GinPageIsList() inside ginInsertCleanup() -- if we had that already
> then I might be able to reason about this problem. It wouldn't hurt to
> have similar checks in other code that deals with posting trees and
> entry trees, too.

+1

While playing with the trace Pawel sent, I loaded some of the index 
pages into a local instance so that I could step through the code with 
gdb. I filled the rest of the pages with zeros. I managed to get the gin 
btree descend code into an infinite loop with that. When it followed a 
pointer to an entry leaf page, but that entry leaf page was substituted 
with zeros, it interpreted the all-zeros page as a page with a 
right-link to block 0. It then stepped right to read block 0 - which is 
the metapage! - and incorrectly interpreted it as an internal page. 
Amazingly, it managed to perform the binary search on it as if it 
contained index tuples, and came up wth an another block number, which I 
had also replaced with all-zeros.

Long story short, we should sprinkle a lot more sanity checks to the 
functions in ginbtree.c. It should check that the page is of expected 
kind; no stepping from internal entry page to posting tree page. And 
seeing a pointer to block 0 at any step should be an error.

But this was just an artifact of the missing pages my test. It doesn't 
explain the original problem.

> ginInsertCleanup() is tolerant of all kinds of things. It's not just
> the lack of page-level sanity checks. It's also the basic approach to
> crash safety, which relies on the fact that GIN only does lossy index
> scans. My guess is that there could be lots of problems without it
> being obvious to users. Things really went downhill in
> ginInsertCleanup() starting in commit e956808328.

Yeah, I wouldn't be surprised if this turns out to be a bug in pending 
list cleanup. But I haven't been able to come up with a concrete 
sequence of steps that would cause it.

- Heikki



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Heikki Linnakangas
Дата:
Сообщение: Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows