Re: BUG #17245: Index corruption involving deduplicated entries

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: BUG #17245: Index corruption involving deduplicated entries
Дата
Msg-id CAH2-Wz=s5EMTPeXYO9vS9Pdtq8UkY77qAdmEQ4sL1UMODK8zEA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BUG #17245: Index corruption involving deduplicated entries  (Kamigishi Rei <iijima.yun@koumakan.jp>)
Ответы Re: BUG #17245: Index corruption involving deduplicated entries  (Peter Geoghegan <pg@bowt.ie>)
Re: BUG #17245: Index corruption involving deduplicated entries  (Andres Freund <andres@anarazel.de>)
Список pgsql-bugs
On Thu, Oct 28, 2021 at 1:04 AM Kamigishi Rei <iijima.yun@koumakan.jp> wrote:
> I forgot to mention: I have the data directory from the PostgreSQL 13.3
> instance that ran inside the same jail on the same FreeBSD system since
> May, so if that might be useful for the investigation, I can pull it
> into another VM.

I know I keep saying it's the last thing, but...I need more pages!

Can you please provide me with additional pages from the heap relation
"page", as follows: 5, 7, 12, 14, 16, 21, 22, 26

Your heap page (page 0 of the "page" heap relation) was very
interesting. I saw that the TID that is incorrectly repeated in the
index "page_len" is marked LP_UNUSED in the heap page. It's one of
only two such TIDs/line pointers on your heap/table page, out of 61
total, so I doubt that this is a coincidence. That's why I'm asking
for more heap pages -- I would like to verify that this is a
consistent thing, and not just a remarkable coincidence.

Now this is looking like a problem in VACUUM (pruning?), not a CREATE
INDEX thing. It looks like somehow an item that should be LP_DEAD ends
up being LP_UNUSED during pruning. I have CC'd Andres, to get his
thoughts on this.

This explanation fits quite well, because:

1. Even with every verification option enabled, amcheck only ever
complains about duplicate TIDs (the exact details vary in
not-very-interesting ways). I would expect to only see this if my new
hypothesis was correct, because in practice, with many workloads,
unused line pointers tend to get reused by later non-HOT updates that
create new versions without changing the value of indexes columns.
Because amcheck doesn't exhaustively verify that the index has no
duplicate TIDs, you're only going to see complaints about duplicates
that happen to be duplicated on all key values, up to and including
the heap TID itself.

2. If this theory was correct, then we'd occasionally notice TIDs that
are recycled by totally unrelated logical rows (or maybe the key
values do change). This is what we see in the original problem report,
where the index seems to think that 'Isokaze' and 'Ying_Swei' are the
same value. The TID originally *did* point to some version that had
the correct value, but since VACUUM never removed the TID from the
index (because it was never made LP_DEAD in the heap by pruning when
it should have been), the "pointed-to value" changed without the index
having the obsolete TID removed.

I am suggesting that 1 and 2 are the same thing, really. They look a
little different, but that's only due to limitations in our tooling.
In addition to the "amcheck cannot reliably detect duplicate TIDs in
an index" issue I mentioned, there are also heapallindexed limitations
to consider. We should not expect the heapallindexed stuff to detect
the case where an index has TIDs that currently point to LP_UNUSED
items in the heap -- heapallindexed verification can only detect the
absence of a needed index tuple, not the presence of a faulty/corrupt
index tuple. That's probably why heapallindexed doesn't detect one
single problem here, even though the original complaint describes
exactly the kind of scenario you might expect heapallindexed to catch.

3. Multiple indexes on the same table "independently" have very
similar corruption.

I noticed that the heap TID (21,41) is duplicated in posting lists in
both the page_redirect_namespace_len index (on block 14, posting list
tuple at page offset 13), as well as the page_len index (on block 1,
posting list tuple at page offset 15). If this was just a coincidence
(if the problem was just in nbtree) then it would be a remarkable
coincidence.

-- 
Peter Geoghegan



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Dave Cramer
Дата:
Сообщение: Re: Error in 'FROM' function
Следующее
От: "David G. Johnston"
Дата:
Сообщение: Re: Error in 'FROM' function