Re: BUG #17245: Index corruption involving deduplicated entries

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: BUG #17245: Index corruption involving deduplicated entries
Дата
Msg-id CAH2-WzkTd6wHXhxn=hZT3NYwgoKNHaC5sDUwswtUo7Ay-VdR4w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BUG #17245: Index corruption involving deduplicated entries  (Kamigishi Rei <iijima.yun@koumakan.jp>)
Ответы Re: BUG #17245: Index corruption involving deduplicated entries  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Re: BUG #17245: Index corruption involving deduplicated entries  (Kamigishi Rei <iijima.yun@koumakan.jp>)
Re: BUG #17245: Index corruption involving deduplicated entries  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-bugs
On Fri, Oct 29, 2021 at 11:36 AM Kamigishi Rei <iijima.yun@koumakan.jp> wrote:
> The newly manifested issue is caught by pg_amcheck:
>
> btree index "azurlane_wiki.mediawiki.page_main_title":
>      ERROR:  item order invariant violated for index "page_main_title"
>      DETAIL:  Lower index tid=(17,157) (points to heap tid=(540,5))
> higher index tid=(17,158) (points to heap tid=(540,5)) page lsn=2/A019DD78.

Great!

I'm not surprised to see that it's the page table, once again. It's
not particularly big, right? Are there other tables that are much
larger?

> The weird part about this is that the WAL archive does not seem to
> contain any data for 157 and 158 above (in 1663/19243/274869 blk 17).
> The last two entries are
>
> rmgr: Btree       len (rec/tot):     53/  4885, tx:    2085600, lsn:
> 2/A0195AE0, prev 2/A01943F0, desc: INSERT_LEAF off 155, blkref #0: rel
> 1663/19243/274869 blk 17 FPW
> rmgr: Btree       len (rec/tot):     72/    72, tx:    2085602, lsn:
> 2/A019DD30, prev 2/A019DCF0, desc: INSERT_LEAF off 156, blkref #0: rel
> 1663/19243/274869 blk 17
>
> The WAL file in data14/pg_wal does not have anything related to 157 and
> 158 for this filenode/blk as well.

If this was a heap relation then that would be true, because the
offset number of a heap needs to be stable, at least within a "VACUUM
cycle" (otherwise indexes will point to the wrong things, which would
of course be wrong). However, this relation is a B-Tree index, where
TIDs/page offset numbers are not stable at all.

Almost all individual index tuple inserts onto a B-Tree page put the
new index tuple "between" existing index tuples. This will "shift"
whatever index tuples are to the right of the position of the new
tuple. For example, with "INSERT_LEAF off 156", the insert atomic
operation will shift any existing index tuple at page offset 156 go to
page offset 157, plus any index tuple that was at page offset 157 is
going to go to 158. And so on.

We don't physically shift the index tuples themselves, but we do shift
the item ID/line pointer array at the start of the page, so it's not
too expensive.

-- 
Peter Geoghegan



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Kamigishi Rei
Дата:
Сообщение: Re: BUG #17245: Index corruption involving deduplicated entries
Следующее
От: "David G. Johnston"
Дата:
Сообщение: Re: BUG #17258: Unexpected results in CHAR(1) data type